Gui Machine Language

In my grand mental quest to find the best way to have a ProgrammingLanguageNeutralGui, I've been kicking around the idea of a "GUI machine language" (or perhaps CLR for GUI's). It would not be "machine language" like we know it from x86 etc., but rather a tabular list with just 4 columns:

 formID, widgetID, attribute, value

Actually an ordering is assumed such that if you were representing it in a table, a sequence-number column of some kind would also be included. Also, perhaps every row should have a sequential number per login session, such that the first command would have a sequence of 1, the 2nd "2", etc. This can serve as a kind of internal verification mechanism, similar to parity-checking. The lists sent to a server would also have a tracking number. The number is somewhat comparable to the byte offset address shown in many assembler-language listings. However, our examples will only have it implied based on the order presented.

Here's an example based on HTML-like pseudo-code.

 <form id="f1" width=600 title="foo">  // define a window
   <form id="f2" x=400 y=200 title="inner">  // embedded form
     <text id="tx1" x=30 content="ssn:">
     <textbox x=80 id="tb1" width=30>
     <button id="b1" onclick="@sendThisForm" title="Go!">

The translation into "GUI machine code" may resemble:

 formID wdgtID  attrib   value
 f1     form    width    600
 f1     form    title    foo
 f2     form    parent   f1    // represents form nesting
 f2     form    x        400
 f2     form    y        200
 f2     form    title    inner
 f2     tx1     type     text
 f2     tx1     x        30
 f2     tx1     y        200
 f2     tx1     content  ssn:
 f2     tb1     type     textbox
 f2     tb1     x        80
 f2     tb1     width    30
 f2     b1      type     button
 f2     b1      title    Go!
 f2     b1      onclick  @sendThisForm  // standard pre-defined event

Although this example shows positional coordinates, the "browser" wouldn't necessary have to use absolution positioning. (Ideally it would offer both nested positioning capabilities and coordinate positioning, but that's another topic). Also, the "list" is the main focus, not the HTML-like pseudo-code. It is merely to serve as a reference example using a more-familiar notation. (Some positions and widths are not shown for brevity. Some default would normally be assumed for such values.)

"System" is a reserved "form" (not shown) for environmental, communication, and house-cleaning chores. "form" under widget-ID simply indicated that we are referencing the form and not a widget. Widgets don't have to be explicitly declared. Mention of their ID alone declares them. However, they are usually not of much use until a "type" is assigned. There is no need to "erase" widgets, just make them hidden if not used. (Although for garbage collection, perhaps some kind of erasure can be added, but that's later-fancy-land stuff.)

Because our "language" is limited to just 4 columns, sometimes "cursors" are used to indicate the point of focus for a given attribute. This is illustrated in the following grid widget example:

  <form id="f1">
    <grid id="grid1">
      <cell row=3 col=7 content="foo">
      <cell row=4 col=2 content="bar" focus="true">

formID wdgtID attrib value ----------------------------- f1 grid1 type grid f1 grid1 rowPntr 3 f1 grid1 colPntr 7 f1 grid1 content foo f1 grid1 rowPntr 4 f1 grid1 colPntr 2 f1 grid1 content bar f1 grid1 focus true

Setting the "pointers" is not the same as setting focus, at least not on the user-side. A focus command is a separate, explicit activity. Think of the row and column pointers as being for an internal pointer, which do not by themselves affect the mouse or keyboard cursor.

A GUI browser could potentially store command lists internally such that clicking a button can perform multiple actions, such as changing text-box values and changing focus. Maybe an internal "label" can be defined for any action, almost like a "jump" or go-to lable:

 formID wdgtID   attrib   value
 system loadmode internal true     // turn on "save only" mode to not execute
 system flow     label    myLabel  // instruction "marker" label
 form1  form     focus    true
 form1  box1     content  Hi Mom!
 system flow     stop              // end of "script"
 system loadmode internal false    // turn off "save only" mode
 form2  button1  onClick  myLabel  // run script "myLabel" upon click

In mark-up pseudo-code, this would resemble:

 <action id="myLabel" runAt="client">
   <form id="form1" focus="true">  // assuming these were already defined
   <textbox id="box1" content="Hi Mom!">
 <button id="button1" onClick="myLabel">

Flow is only forward such that it's not TuringComplete (assuming local events can't somehow indirectly trigger itself without asking the server first). Whether it should be made TuringComplete, at least as an explicit option (for security considerations) is a future question.

Not addressed here is multi-threading. In typical business forms, such is usually not needed. A separate "application" can be spawned if such is needed, such as processing a long-running report "on the side", with some kind of basic messaging system between the apps. If someone wants to define multi-threading for the above, be my guest.

If we wanted to have subroutines, then we could have a "system, flow, return" instead of "system, flow, stop".


Goals --top

Here's a fairly simple way to define drag-and-drop interaction: If a widget supports being dragged (draggable=true), then when the widget is dragged on/in a widget having attribute dragtarget=true, the following message tuple is sent to the server and/or custom client script using above conventions: (form=<targetForm>, widget=<targetWidget>, attrib="onDrag", value="formX.widgetX"). If the dual thing bothers you, then split it into two result statements with attributes "dragFromForm" and "dragFromWidget". In short, the drag target sends the "path" of the drag candidate. The client then optionally waits for the server or local script to approve and potentially process the drag. An approval script/server-check is an option, not required (dragapprov=<none,client,server>).

[Much ThreadMode content has been moved to GuiMachineLanguageSecurityDiscussion, GuiMachineLanguageDiscussion. We should attempt to refactor the salient points back here.]

To Recurse or not To Recurse or not To Recurse or not To Recurse...

Form-A may have an "on-open" trigger that opens form-B, and then close itself. Form-B may in turn have a trigger that does the same to form-A. Thus, they run each other back and forth. Other stuff could be put into such triggers as to potentially have an almost infinitely loop that could make it TuringComplete. -top

[original discussion devolved because TopMind misused the word "security" and a security nut (me) got on his case about it.]

There is plenty of motivation to avoid this sort of cyclic behavior, and many good reasons to avoid a TuringComplete language for describing display elements. There are actually many techniques for avoiding infinite loops. A simple one would be to have a fixed maximum number of trigger events, sort of like the 'hop count' fields on Internet packets. One can also do a lot with a maximum number of layers in a scene - i.e. the Super Famicom had four layers, and I know at least one system in industrial use (for embedded systems) that has a maximum of eight.

Perhaps, though: if we're going for the sort of data-driven system described above, are triggers the right answer at all? Perhaps you should wrack your brain a bit and see if you can't come up with a more data-driven alternative to triggers.

Regarding Client-side Configurability

Your stated motivation for a client's ability to "turn off" certain features was security (and it might actually provide some if the feature in question was insecure, but in those cases - e.g. granting the application access to the webcam - a PowerBox design with explicit 'opt-in' per app, with expirations and such, might be more suitable). Of some concern is that 'turning off' features, will lower the LowestCommonDenominator: basically, server-side developers wishing to provide an application won't be able to assume the features exist. And negotiating to provide the same service with many different features is painful (which is why LeastFlexibleProtocolWins). For true 'insecure' features such as webcam access, I would suggest a PowerBox approach, which would give the user fine-grained control over the remote system's authority. But for the others, I would recommend attempting to provide a standard environment that the application developers would target. If you're going to have a maximum number of trigger-cycles or layers, make it the same for all clients.

Client-side Abstraction Features

From a bandwidth perspective, at the very least the application developers are likely going to want to send some sort of code over to the clients. Abstraction is a form of compression: create a few abstractions, then reuse them as needed; DontRepeatYourself by sending the whole abstraction each time. Every MachineLanguage, except this one (apparently), supports development of abstractions. If you want fast-termination properties, that is manageable: cap the number of recursions or loops in the otherwise potentially non-terminating abstractions... or perhaps favor a basis such as DataLog or TotalFunctionalProgramming or labeled TermRewriting systems that can be guaranteed to terminate. You don't need to give up the advantages of non-termination in order to support abstraction.

Without the ability for the server to send an abstraction, GML will not be technically competitive (against HtmlDomJsCss, MicrosoftSilverlight, DotNet/CommonLanguageRuntime, CurlLanguage, Tk, and others). Even if it did succeed by some non-technical merit, chances are someone would come hack a scripting language to run atop it, which would defeat GML whatever purpose you had in dodging that bullet in the first place. Another way of thinking about that is you're rolling time back thirty years just so you can spend the next decade repeating the mistakes that HtmlDomJsCss already made. So I recommend hammering out support for abstractions that will fit well with the data-driven GML style.

Regarding the Niche for GML (from GuiMachineLanguageSecurityDiscussion)

A new protocol cannot jam every possible wish-list item into it. The wish-list items given attention versus those ignored in the first releases need to be weighed based on need and target audience. Roughly, I see 3 possible divisions:

It may be good to separate these rather than gum up the works.

I don't see three divisions. I see one continuous range of demands to support all of them in a high-performance manner.

Yes, that's the ideal, but generally humans must partition things to manage the scope and work-load effectively.

Is there a reason you believe web standards are insufficient for business-oriented forms and apps? I do my taxes online. I do trading online. For which business forms and apps is it insufficient?

Part of this topic is a hypothetical question: how would you do GUI engines if starting from scratch? Asking such questions sometimes helps people to design better even with different or established tools. It's almost like learning Lisp or ProLog: you likely won't use them in practice, but they give you viewpoints you may not have had otherwise. It's optional MentalMasturbation that may or may not make one think. Ignore it if you wish. Maybe some bored geek snowed-in will produce a new GUI browser that catches on. -top
I'm sure it took a lot of time to get those to work right. It's possible, but difficult and expensive to build reliable biz GUIs, and I bet when a new browser version comes out, they have to retune a lot. Things that took a few hours in a 90's desk-top GUI design tool take weeks and months using HTML+DOM+JS+CSS.

I bet when a new OS version comes out, they have to retune a lot ;-). I honestly can't imagine any useful desk-top GUI design tool that took hours in the 90s, so I suspect there is a lot of "in the good old days" exaggeration there.

I found that the 90's tools required tweaking around with existing wheels while the web requires reinvention to get a desk-top-ish feel (which is what the customer/user often wants). I rarely had to sample actual X,Y mouse coordinates and do hover time-outs loops. Even if it didn't do exactly what was originally envisioned, there was usually a work-around using a different existing widget that was just a few clicks away.
See Also: GuiMarkupProposal, HtmlDomJsCss, WebBrowserMissingWidgetWorkArounds, CrudScreen, DisplayPostscript, XwindowProtocol, DeclarativeGui, CurlLanguage


View edit of December 17, 2011 or FindPage with title or text search