Data Code Equivocation Considered Harmful

If people would keep code and data unambiguously separated, we wouldn't have email viruses and malicious web pages and all that drek.

Documents are data. It should be safe to view them. JavaScript, Word macros, etc. are evil.

Comments in source code are data. It should be safe to delete them. HotComments are evil.

Whether the code is in a Turing-complete language or not is irrelevant. What's relevant is: Can it trash my hard drive? Can it pop up annoying windows?


What about formulas in a spreadsheet? (Excel macros are turing complete. See ProductivityRant for a discussion of that.)

This raises the question of whether formula propagation, as found in spreadsheets, is TuringComplete or not. If it is not TC, I wonder if it could still pose a hazard of some kind. Just because something is not TC does not necessarily mean it is safe. SQL, for example, is not TC, but massive destruction can be done if one has access to it. TC-ness perhaps can be seen as a continuum (below final passage) such that the closer it is to TC, the more risky it is. Even pure HTML, something that would probably be low on a TC scale, can crash some browsers in certain combinations. -- top


If people kept code and data unambiguously separated we wouldn't have any code.

Wouldn't that be a good thing?

Only if you want to go through all that data by hand.


Sneaky code in data only becomes a virus if you execute it. You should be careful what you execute regardless of whether you think they should be together and/or interchangeable or not. I would note that most file systems do not make a distinction between data and executables. It is up to the apps or OS to determine. Microsoft stuff spreads viruses because by default their email browsers run all kinds of multi-media goodies. This was probably done to make it "pretty". Lesson: don't let browsers run Turing-complete code without explicit permission.

I don't mind "documents" that contain code as long as the code has restricted privileges. Remember, TuringComplete refers to what programs can compute, not what they have access to. For example, a TuringComplete language must allow programs to compute PrimeNumbers, but does not have to allow programs to see what's in your clipboard. -- JesseRuderman?


I suppose this would mean that JavaScript in web-pages is evil. Some RemoteGuiProtocols suggest making GUI's possible without client-side TuringComplete ability.
This will also tick off Lisp fans.
CategoryRant

See KeepThingsSeparate for a similar viewpoint.

See DataAndCodeAreTheSameThing for opposing viewpoints.

EditText of this page (last edited January 30, 2012) or FindPage with title or text search