I wrote this page because I'd like to develop my own wiki code, doing "the simplest thing that might possibly work", and it wasn't clear to me from the existing discussion just what that might be without a wholesale import of Ward's code. My hope is that this will be a useful reference/tutorial for other programmers.
Wiki source code
topic leads to WardCunningham's source code (in Perl) for this site. It can be browsed with commentary explaining each section, which is a good way to get an overview of how it works. Once this is understood, it can be downloaded as straight Perl code for further review, for suggesting changes to Ward, or for installing on another Web server to create a new wiki.
It may be useful to have some comments from Ward here about how best to submit a proposed change to the wiki code.
Wiki functionality - Ward's minimal version
- WikiBase is very obsolete, but cross your fingers. -- SunirShah
Here's a discussion of the features of wiki without including the specific code. It's my hope that this discussion will be useful to wiki developers, regardless of their favorite language:
For browsing the site, the server displays a page containing:
- the standard wiki header;
- the body text of the topic; and
- the standard wiki footer.
The header includes the title of the page as the Title element inside the HTML HEAD section. It also has the standard logo and the title of the page as the topmost text of the page.
In the body text, the server converts the stored wiki markup (for example, three single quotes in a row for bold) into HTML tags for display by the client browser.
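To make the conversion step concrete, here is a minimal sketch in Python rather than Ward's Perl. It is not Ward's code. The bold rule comes from the text above; the two-quote italics rule, the link pattern, the `edit/` URL shape, and the names `render_body` and `page_exists` are assumptions made for illustration.

```python
import html
import re

# Assumed link pattern: two or more capitalized word parts run together.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def render_body(text, page_exists):
    """Convert stored wiki text to HTML.

    page_exists is a caller-supplied function: page name -> bool.
    """
    # Escape &, < and > first so stored text cannot inject raw HTML.
    # quote=False leaves single quotes alone; the markup rules need them.
    out = html.escape(text, quote=False)

    # Three single quotes for bold must be handled before two for italics.
    out = re.sub(r"'''(.+?)'''", r"<b>\1</b>", out)
    out = re.sub(r"''(.+?)''", r"<i>\1</i>", out)

    def link(match):
        name = match.group(0)
        if page_exists(name):
            return f'<a href="{name}">{name}</a>'
        # Missing page: plain name followed by a "?" link to create it.
        return f'{name}<a href="edit/{name}">?</a>'

    return WIKI_WORD.sub(link, out)
```

Doing the escape first and the markup rules afterwards keeps the rules simple, at the cost of never allowing raw HTML in a page.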
The footer has a link to the Edit page to modify the page being browsed; to the backup copy if available; and to the Find functionality.
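The header, body, and footer described above could be assembled roughly like this. It is only a sketch: the function name `render_page`, the logo path, and the `/edit/`, `/backup/`, and `/find` URLs are all made up for the example and are not taken from Ward's script.

```python
import html


def render_page(title, body_html, has_backup=False):
    """Wrap already-rendered body HTML in a standard header and footer."""
    safe_title = html.escape(title)

    # Header: title in the HEAD, then logo and title as the topmost text.
    header = (
        "<html><head><title>" + safe_title + "</title></head>\n"
        '<body><img src="/logo.gif" alt="logo">\n'
        "<h1>" + safe_title + "</h1>\n"
    )

    # Footer: Edit link, backup link only if a backup exists, Find link.
    links = ['<a href="/edit/' + safe_title + '">Edit</a>']
    if has_backup:
        links.append('<a href="/backup/' + safe_title + '">Backup</a>')
    links.append('<a href="/find">Find</a>')
    footer = "<hr>" + " | ".join(links) + "\n</body></html>\n"

    return header + body_html + "\n" + footer
```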
The edit page shows the body text of the page in a text box, allowing it to be edited. When the form is submitted, a script does some processing to the text and updates the RecentChanges list. The Thanks For Editing page is sent to the user with a link back to the as-edited page.
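A rough sketch of what the save step might look like, assuming pages are stored as flat files. The file layout, the `.bak` backup name, the RecentChanges line format, and the function name `save_page` are my assumptions; the real script's processing is not described here, so the only "processing" shown is line-ending normalization.

```python
import datetime
from pathlib import Path


def save_page(root, name, text, today=None):
    """Store an edited page, log it in RecentChanges, return a thanks page."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    page = root / name

    # Keep the previous version as the single backup copy.
    if page.exists():
        old = page.read_text(encoding="utf-8")
        (root / (name + ".bak")).write_text(old, encoding="utf-8")

    # Browsers submit CRLF line endings; store plain newlines.
    page.write_text(text.replace("\r\n", "\n"), encoding="utf-8")

    # Append one line per edit to the RecentChanges list.
    today = today or datetime.date.today()
    with open(root / "RecentChanges", "a", encoding="utf-8") as log:
        log.write(f"{today.isoformat()} {name}\n")

    # The Thanks For Editing page links back to the as-edited page.
    return f'<p>Thanks for editing. Return to <a href="{name}">{name}</a>.</p>'
```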
A discussion of the Links and Search pages should go here.
"Blue sky" discussion of other wiki features
I don't know if a code review would be of any use, but after perusing the code, I have some questions about why it works the way it does.
- How about separating content and presentation by moving the HTML parsing into a separate module?
This would make it a breeze for other wikis to change their headers & footers, e.g. maybe some would prefer the Search link to be at the top of the page.
Why not just have the edit script write a new static HTML page? (instead of dynamically generating the page at read time)
- Then cgi/wiki/Topic would return that static page. This would make the reading portion of the site much more scalable when there is no need to run a script on every hit.
- It would also make it possible to use a search engine to explore the site. ''(Huh? Search engines already do explore the site -- GoogleLovesWiki)''
- Ward has rejected HTML editing. So using static HTML pages implies either (a) both wikicode and HTML saved on disk, or (b) have 2 parsers, wikicode-to-HTML when the page is saved, and HTML-to-wikicode the next time the page is edited. (a) CacheCalculations is an OptimizationPattern. We all know the RulesOfOptimization. Besides, since disk drives are so slow, I suspect the extra disk time used (to save both the wikicode and the HTML to disk) completely wipes out the CPU time saved (to translate wikicode-to-HTML a few extra times). (b) Bijective parsers are difficult to get right.
- The edit script might run a bit slower just after a page is edited, but that's the best time to introduce a delay to the user: they're probably still basking in the glow of their wonderful editing and won't notice the delay as they sigh contentedly. That beats having the delay occur for every reader on every page.
- One catch: if a page links to a topic that doesn't exist at the time the page is edited, the link shows up with a ? after it. If someone later clicks on the ? to create that topic, the old page isn't updated to show that the link is valid now. Umm, I hope that made sense. :-) One way to handle this: whenever a new page is created, search for all pages which link to it, and then re-render those pages.
That's easily solved. Store with each page (as invisible meta info in the file or as columns in the db) the incoming and outgoing links. When you edit a page, you have to update all the pages in your incoming and outgoing links. Again, the number of pages affected should be a tiny fraction of the total. The delay in editing shouldn't be much greater than in viewing; it just happens once instead of hundreds of times. I've implemented that in my own wiki and it works great.
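Here is one way the stored-links idea could look, as a small in-memory sketch. It is not the implementation mentioned above, which isn't shown on this page. The class name `StaticWiki`, the link pattern, and the dictionaries standing in for files or db columns are all assumptions, and HTML escaping is left out to keep it short.

```python
import re

# Assumed link pattern: two or more capitalized word parts run together.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


class StaticWiki:
    def __init__(self):
        self.source = {}    # page name -> wiki text
        self.html = {}      # page name -> rendered "static" HTML
        self.incoming = {}  # page name -> set of pages that link to it

    def render(self, name):
        def link(match):
            target = match.group(0)
            if target in self.source:
                return f'<a href="{target}">{target}</a>'
            return f"{target}?"  # target page does not exist yet
        self.html[name] = WIKI_WORD.sub(link, self.source[name])

    def save(self, name, text):
        """Save a page and return the names of the pages re-rendered."""
        is_new = name not in self.source
        old_out = set(WIKI_WORD.findall(self.source.get(name, "")))
        new_out = set(WIKI_WORD.findall(text))
        self.source[name] = text

        # Keep the incoming-link sets of the target pages in step.
        for target in old_out - new_out:
            self.incoming[target].discard(name)
        for target in new_out - old_out:
            self.incoming.setdefault(target, set()).add(name)

        self.render(name)
        rerendered = [name]
        # A newly created page turns "?" links on its referrers into
        # real links, so only those referrers need re-rendering.
        if is_new:
            for referrer in sorted(self.incoming.get(name, ())):
                if referrer != name:
                    self.render(referrer)
                    rerendered.append(referrer)
        return rerendered
```

In this sketch an ordinary edit re-renders only the edited page; the extra work happens only when a page is created, which matches the "tiny fraction of the total" argument.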
Features Ward has already rejected
- HTML editing
- Non-white background colors
I am working on coding up a wiki in PythonLanguage. All but the wikicode/HTML parser-converter are done. It has a simple config file to let you define the pagebgcolor, header and footer background color, and domain name - so it can properly code the links. Although tradition MIGHT be to use white, and I respect Ward's choices for himself, I would never make such a decision on another person's behalf.
If you haven't already done so, you might want to have a look at WikiWikiSuggestions for ideas. -- FalkBruegmann
Or http://wikifeatures.wiki.taoriver.net/ [BrokenLink] for even more ideas.
How is ReverseIndex implemented? I'm writing my own wiki and, since it seems to be the most expensive operation, I'd like to explore good ways of implementing it.
My thoughts are here:
But I'd like to keep the discussion here on WardsOriginalWiki
How about updating OTHER pages' BackLinks whenever you edit one page? You'll have to run one big job to bring your existing pages up to date, but then you only have to update the pages on your outgoing links when you save edits to a page. -- SvenNeumann
It's not quite that simple, because you need to handle links disappearing. Since the new version of the page (by definition) doesn't contain those links, you either need to scan the old version of the page, or check the backlink lists of all candidate pages, or something. TrikiWiki has a cunning solution using zero-length files whose names contain a record of a link: http://triki.publication.org.uk/ThesePagesLeadHere. -- TomAnderson
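For what it's worth, here is a sketch that combines "scan the old version" with zero-length marker files. It is only loosely in the spirit of the zero-length-file idea: the `Target.Source` file naming, the link pattern, and the function names are my own assumptions, not TrikiWiki's actual format.

```python
import re
from pathlib import Path

# Assumed link pattern: two or more capitalized word parts run together.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def update_link_markers(link_dir, source, old_text, new_text):
    """Sync empty marker files named 'Target.Source' after an edit."""
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    old_links = set(WIKI_WORD.findall(old_text))
    new_links = set(WIKI_WORD.findall(new_text))

    # Links that disappeared: found by scanning the old version.
    for target in old_links - new_links:
        (link_dir / f"{target}.{source}").unlink(missing_ok=True)  # Python 3.8+

    # Links that appeared: create a zero-length marker file.
    for target in new_links - old_links:
        (link_dir / f"{target}.{source}").touch()


def backlinks(link_dir, target):
    """List pages linking to target by globbing marker file names."""
    prefix = target + "."
    return sorted(p.name[len(prefix):] for p in Path(link_dir).glob(prefix + "*"))
```

The lookup is then a directory listing rather than a scan of every page, and the per-edit cost is proportional to the number of links that changed.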
I would like to open a discussion for the implementation of diff. I've just implemented the WagnerFisherAlgorithm, using lines of text for characters. This seems to work well on showing up edits vs inserts/deletes and unchanged lines. Has anybody else used this algorithm for diff and what weighting for the ops did you use? If this is not an appropriate algorithm for this job, I'd be delighted to hear what others used. -- SvenNeumann
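For anyone who wants something to poke at, here is a small sketch of the same idea: the Wagner-Fischer edit-distance table computed over lines instead of characters, followed by a backtrace to recover the operations. It is not the implementation described above. The function name `line_diff` and the default unit weights are assumptions; the weights are parameters precisely because the right values are the open question.

```python
def line_diff(old_lines, new_lines, sub_cost=1, indel_cost=1):
    """Return (distance, ops) where ops are (kind, old_line, new_line)."""
    n, m = len(old_lines), len(new_lines)

    # cost[i][j] = cheapest way to turn old_lines[:i] into new_lines[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * indel_cost
    for j in range(1, m + 1):
        cost[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = old_lines[i - 1] == new_lines[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + (0 if same else sub_cost),  # keep/change
                cost[i - 1][j] + indel_cost,                     # delete
                cost[i][j - 1] + indel_cost,                     # insert
            )

    # Walk back from the corner, preferring the diagonal on ties.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = old_lines[i - 1] == new_lines[j - 1]
            if cost[i][j] == cost[i - 1][j - 1] + (0 if same else sub_cost):
                kind = "keep" if same else "change"
                ops.append((kind, old_lines[i - 1], new_lines[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + indel_cost:
            ops.append(("delete", old_lines[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, new_lines[j - 1]))
            j -= 1
    ops.reverse()
    return cost[n][m], ops
```

With unit weights a changed line costs the same as a single insert or delete. Setting `sub_cost` to 2 makes a change cost exactly as much as a delete plus an insert. The table needs memory proportional to the product of the two line counts, which is fine for wiki pages but not for large files.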
The diff used by MoinMoin http://purl.net/wiki/moin/ looks nice.