Existing IDE Architecture


I'd like to briefly describe the architecture of SharpDevelop, an open-source .NET IDE (http://www.sharpdevelop.net).

It uses two different tree data structures to represent knowledge about programs: the AST and the DOM.

AST

The AST is the output of the compiler front-end: an unresolved parse tree. It keeps all information about the source code in full detail, down to primitive expressions. Each tree represents a single file and knows nothing about other files or projects. Every node is marked with its TextCoordinates in the source file. The tree can even keep information about comments and whitespace.
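To make the description concrete, here is a minimal sketch of such a node in Python. It is not SharpDevelop's actual API (SharpDevelop is written in C#); the names TextLocation, AstNode and walk are hypothetical and only illustrate a per-file tree whose nodes carry text coordinates and whose type references are still plain strings.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass(frozen=True)
class TextLocation:
    """Hypothetical 1-based (line, column) position in a source file."""
    line: int
    column: int


@dataclass
class AstNode:
    """Hypothetical unresolved parse-tree node for a single file."""
    kind: str                 # e.g. "ClassDeclaration", "TypeReference"
    start: TextLocation       # first character of the node
    end: TextLocation         # one past the last character of the node
    text: str = ""            # raw identifier or type name, still just a string
    children: List["AstNode"] = field(default_factory=list)

    def walk(self) -> Iterator["AstNode"]:
        """Yield this node and all descendants in pre-order."""
        yield self
        for child in self.children:
            yield from child.walk()


# Hand-built tree for this three-line file (no parser involved):
#   class Foo {
#       Bar b;
#   }
type_ref = AstNode("TypeReference", TextLocation(2, 5), TextLocation(2, 8), "Bar")
field_decl = AstNode("FieldDeclaration", TextLocation(2, 5), TextLocation(2, 11), "b", [type_ref])
class_decl = AstNode("ClassDeclaration", TextLocation(1, 1), TextLocation(3, 2), "Foo", [field_decl])

kinds = [node.kind for node in class_decl.walk()]
```

Note that "Bar" is only a name here. Nothing in the tree says which class it refers to, or whether that class exists at all.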

DOM

The DOM is the data structure shared between most parts of the IDE. It models the high-level code structure (not including method bodies) at the project level. The DOM is resolved - type references actually point to other DOM objects. The DOM models entire projects, and DOM objects can be referenced across projects. The DOM can be obtained from various sources - from the AST (via a resolving step, which converts strings from the AST nodes into resolved DOM object references), from compiled binaries (using reflection), etc.

I wonder, is it a good design decision to have two different tree data structures in the IDE? One is low-level, at the code level and local to single files; the other is high-level, global and shared across projects.
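The resolving step can be sketched roughly as follows. This is an illustration in Python under simplifying assumptions, not SharpDevelop's real resolver: DomProject, DomClass, DomField and resolve_project are hypothetical names, type names are assumed to be fully qualified already, and only directly referenced projects are searched.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomClass:
    """Hypothetical high-level class model; no method bodies."""
    full_name: str
    fields: List["DomField"] = field(default_factory=list)


@dataclass
class DomField:
    name: str
    type_name: str                              # string taken from the AST
    resolved_type: Optional[DomClass] = None    # filled in by resolving


@dataclass
class DomProject:
    name: str
    classes: Dict[str, DomClass] = field(default_factory=dict)
    references: List["DomProject"] = field(default_factory=list)

    def add_class(self, cls: DomClass) -> DomClass:
        self.classes[cls.full_name] = cls
        return cls

    def find_class(self, full_name: str) -> Optional[DomClass]:
        """Look in this project first, then in directly referenced projects."""
        if full_name in self.classes:
            return self.classes[full_name]
        for project in self.references:
            found = project.classes.get(full_name)
            if found is not None:
                return found
        return None


def resolve_project(project: DomProject) -> List[str]:
    """Turn type-name strings into object references; return names not found."""
    unresolved = []
    for cls in project.classes.values():
        for f in cls.fields:
            f.resolved_type = project.find_class(f.type_name)
            if f.resolved_type is None:
                unresolved.append(f.type_name)
    return unresolved


# A library project, and an application project that references it.
lib = DomProject("Lib")
bar = lib.add_class(DomClass("Lib.Bar"))

app = DomProject("App", references=[lib])
foo = app.add_class(DomClass("App.Foo", fields=[
    DomField("b", "Lib.Bar"),
    DomField("m", "Missing.Type"),
]))

unresolved_names = resolve_project(app)
```

After resolving, the field b of App.Foo points at the DomClass object that lives in the other project, which is what lets DOM objects be referenced across projects.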

My main experience with the SharpDevelop implementation was that it overuses TextCoordinates (line and column numbers) in places where tree nodes could have been accepted instead. The areas of the IDE that work with TextCoordinates and those that work with tree nodes are not clearly separated from each other.

My strong opinion is that most of the IDE should only use the primary tree data structure and should know nothing about TextCoordinates. That way we stay more flexible when integrating tools that aren't based on source text (for example, I was implementing a structured editor which edits the AST directly, so I had to completely rewrite the SharpDevelop Resolver to make it more generic).

So my message is: please, please don't pass TextCoordinates around everywhere. Try to write a clean mapping between the AST and TextCoordinates, and use AST nodes wherever possible!
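As a rough sketch of what such a mapping could look like (in Python, with hypothetical names Node, node_at and span_of, not anything taken from SharpDevelop): the text editor converts a caret position into a node once, at the boundary, and everything behind that boundary works with nodes only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Location = Tuple[int, int]   # (line, column), both 1-based


@dataclass
class Node:
    """Hypothetical AST node with a half-open source span [start, end)."""
    kind: str
    start: Location
    end: Location
    children: List["Node"] = field(default_factory=list)

    def contains(self, loc: Location) -> bool:
        # Tuples compare lexicographically: by line first, then by column.
        return self.start <= loc < self.end


def node_at(root: Node, loc: Location) -> Optional[Node]:
    """Text -> tree: the innermost node covering loc, or None if outside root."""
    if not root.contains(loc):
        return None
    for child in root.children:
        hit = node_at(child, loc)
        if hit is not None:
            return hit
    return root


def span_of(node: Node) -> Tuple[Location, Location]:
    """Tree -> text: the source span of a node, e.g. for highlighting."""
    return (node.start, node.end)


# Tree for this three-line file:
#   class Foo {
#       Bar b;
#   }
type_ref = Node("TypeReference", (2, 5), (2, 8))
field_decl = Node("FieldDeclaration", (2, 5), (2, 11), [type_ref])
class_decl = Node("ClassDeclaration", (1, 1), (3, 2), [field_decl])

on_type = node_at(class_decl, (2, 6))      # caret inside "Bar"
on_name = node_at(class_decl, (2, 9))      # caret on the field name "b"
outside = node_at(class_decl, (5, 1))      # caret past the end of the class
```

With these two functions as the only place that knows about line and column numbers, a resolver or refactoring could take a Node as input and would work the same whether the node came from a text caret or from a structured editor.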

KirillOsenkov

 

Last edited June 18, 2007