Explicit Management Of Implicit Context

Explicit Management of Implicit Context (EMIC) is a potential language feature. I'm describing it here to help solidify its concept and to run it up the flagpole and see who salutes or flames - all comments are welcome.

EMIC shifts what is essentially a ContextObject (or ValueObject) into the implicit evaluation/execution context of a language, such that it needn't be explicitly threaded or forwarded via parameter passing from one method to another. However, it is intended to be a fully language-supported feature... requiring a more generic (and likely more difficult) implementation than the humble ContextObject.

Chances are EMIC is nothing new and could be implemented in many languages (esp. the dynamic ones) without much effort. However, I consider it worth describing as a feature.

Working definitions: It should be clear that most languages you know probably don't have EMIC. Of course I wouldn't be surprised if some of the more dynamic languages did have something like it (Lisp certainly does with its SpecialVariables, and Haskell monads are very close, albeit both of these are somewhat opaque and difficult to get or set in whole chunks). I expect that EMIC, or a good approximation, could readily be implemented in Ruby and Python (if they don't already provide it via some library I've never heard about)... though it would be non-standard and some of the advantages would not apply.

As noted above, EMIC, in a language that supported it, would effectively replace ThreadLocalStorage with something far more syntactically efficient (though its underlying implementation may sit atop TLS), and would ideally wrap or replace EnvironmentVariables except, perhaps, at those rare places where one must interact with the OperatingSystem in order to spawn a new process (because most operating systems limit EnvironmentVariables to plain-ol-strings).

So... what does EMIC buy us?

Above and Beyond the ContextObject: In Common with the ContextObject:

Does your idea of an EMIC contain any default behavior of an EMIC? I'm thinking of

As mentioned above, EMIC should take over for most global variables, services, and singletons. Ideally these could just be placed into the 'initial' EMIC context... (e.g. you say 'cerr' and you're actually getting 'context.cerr'). I can't say for certain that this would include ALL globals due to module-based encapsulation being what it is. Such a conclusion would require considerably more analysis. But the more EMIC takes over, and the less the need for supporting two distinct sorts of language contexts, the better - consistency is a good thing, as is the modularity and flexibility that comes with being able to swap out the global context. EMIC tracking the active thread (and anything that would normally enter ThreadLocalStorage), the environment variables (including various system properties), time and 'java.System' (or any equivalent in any language) and memory allocators and filesystem and GC (these all being "services") would all be part of this default behavior.

As far as "Reflection" - explicit access to implicit context may be useful for reflection (and its kissing cousin: debugging), but I'd have imagined it being more that a debugger could readily display context right alongside the parameters to whichever function you're in (most debuggers I use fail to display relevant globals and singletons in the default watch windows). But I can't help but think that creating or allowing logical dependence on "who called me" is a BadThing - it breaks abstraction and refactoring and a variety of optimizations and isn't useful for security (ConfusedDeputyProblem) and I wouldn't want the language I'm designing to encourage or provide special facility to support it (not that it's readily capable: all functions and procedures and types (which are all structural) are anonymous and are considered distinct from the names by which programmers handle them, so 'Who Am I' questions are out even before one gets recursive about it). But it can be done easily enough in a language with named methods/functions and nominative types/classes (NominativeAndStructuralTyping)... e.g. using aspect-oriented techniques to make the context track the complete stack and all parameters at every call (and thus taking over some of the debugger's traditional role), and I can see the value of such for embedding debugging and profiling. I'd prefer aspect-oriented techniques because I dislike introducing explicit dependencies on services that won't be available in the context at the time of a release build.

As far as "facility to create an exception": creating exceptions ideally shouldn't be markedly distinct from creating other values or objects in the language. Such distinctions are problematic when it comes to integrating language features. If the runtime 'throwing' mechanism wishes to couple exceptions with ActivationRecords at the time you 'throw' them - or after it is proven that no one will 'catch' it - that would be more acceptable to me.

I wonder how to type this EMIC.

What do you mean? Do you mean 'type' as in TypeSystem and TypeSafety or 'type' as in clacking away between chair and monitor to implement EMIC in a language? As the latter is really part of more general language implementation issues for which I have no good answers (but see TheSimplestPossibleCompiler and RethinkingCompilerDesign), I'm going to imagine you intend to ask about the former. It is worth noting, however, that implementation of a very 'basic' EMIC isn't all that difficult... it is only when you start integrating it with the other features and services and taking over parts of global space (by convention - e.g. shifting 'cerr' to 'context.cerr' - if not as a requirement) that it becomes 'complex'. And I actually believe it simplifies many of these things. Based on my own work so far, EMIC seems to be one of those features that looks complex but makes a bunch of other things more simple... indicating that working without EMIC is 'simplistic' (violating the EinsteinPrinciple's second clause) and that EMIC may be a KeyLanguageFeature.

Anyhow, for languages providing DynamicTyping the solution to 'type' EMIC is extremely easy: dynamic DuckTyping. Don't bother checking ahead... just throw an exception or whatever the language normally does when you ask for an element of an object or record that does not exist or is of the wrong type. Trust the programmer to get it mostly right and run any necessary unit-tests and such to get it working the rest of the way.

For StaticTyping, the problem might more generally be approached in much the same manner as Haskell's Monads. You can use a great deal of TypeInference (ImplicitTyping) to determine which features will be required on the execution/evaluation paths, taking into account modifications to the context as you move along it. You may additionally support mechanisms for manifestly annotating required context components (because 'dumb' comments are kinda pointless and if you're going to comment on required context items, you might as well make the compiler able to check for it). One thing to note, however, is that if you're using StaticTyping you'll need to also couple continuations and thunks more tightly with context because the programmer can't be permitted to change these out arbitrarily.

For DesignByContract implemented via StaticAssert, one requires that certain proofs over the context and its contents be possible. This goes a bit beyond TypeSafety (though it is equivalent if the system supports DependentTyping and the ability to force static TypeSafety for certain methods). It allows code to be safely blocked unless contracts are fulfilled. DesignByContract implemented via dynamic (runtime) assertions is no different (and no less expensive) than it would be normally.

Finally, for SoftTyping, you combine the two mechanisms... essentially, you emit errors at compile-time if you can prove that something is missing or of the wrong type, and you emit warnings (as well as the code to check at runtime) if you cannot prove it to be safe (these warnings to be disabled with a compiler flag or pragma). This is what you always do for SoftTyping, of course, so there is nothing new here.

As far as optimizations go: the context equivalent of DeadCodeElimination is obvious (no need to maintain unreferenced context) but how much you can eliminate depends on how 'modular' the code happens to be (i.e. if your application can load or create modules dynamically, the context must remain). Additionally, you can remove duplicate efforts... e.g. if a context contains a value (not a reference to a mutable-state service, but an honest-to-math immutable value), at one point, you can often prove it will contain the same value at a later point... and thus avoid the duplicate access or duplicate calculation. Also, PartialEvaluation can occur wherever you can prove a context will contain a certain value or reference at a given point in the code, which can then be used to compile the code specifically to its context. As with EMIC for StaticTyping, it helps a lot if continuations and thunks are tightly coupled to the context. If only the continuation can change its own context then you can ensure it won't suddenly become unsafe or invalidate the optimizations... and you can easily parameterize the continuations should you require a TypeSafe way to replace all or part of the context.

For both PartialEvaluation optimizations and StaticTyping it is likely useful that there be a TypeSafe mechanism for checking whether or not a record or object contains a particular entry prior to accessing it. For example, code might read "context.cerr.print(x) if (context.cerr exists)". Not only would this represent a more flexible path-dependent model for StaticTyping in general (allowing a TypeSafe FeatureBuffetModel of process operations); it would also allow (given PartialEvaluation) the implicit context to take over even more of the role of preprocessor-based conditional compilation (essentially equivalent to '#ifdef') based simply on the initial context components provided by 'main'.

I indeed meant the former. :-) I would have elaborated, but I was interrupted just when typing the question.

How to do DynamicTyping is indeed obvious. I was more concerned with StaticTyping. The only clean way I see is indeed with Monads (there may be other ways though). For 'normal' languages this is not an option I fear, but maybe I'm wrong.

Other less controlled ways to do it would be to declare that certain parts of the program use certain EMICs. For methods this is like a parameter type only without the parameter itself because it is 'passed' automatically. This is only a small improvement over the explicit 'context parameter pattern'.

Other scopes where the declaration is placed could be class or module scope. This saves more effort - the context is inherited by all methods/classes respectively - but is less specific of course.

On the method level this EMIC type seems to be comparable to EffectTyping - only the other way around. It is not the effects coming out of the method that are typed, but the effects (data) required by the method. I think this could nicely complement the existing results on EffectTyping.

-- GunnarZarncke

Security Context

A powerful, largely offline SecurityModel based on modified SimplePublicKeyInfrastructure (SPKI, a PasswordCapabilityModel) can readily be embedded into EMIC as a Security Context (in a TypeSafe, optimizable way... esp. with StaticAssert).

In essence, one checks that the necessary combination of signed certificates (values saying you have certain rights) is available in the context (e.g. located in a collection under context.security). One checks that these are either signed directly by the right people (with fixpoint "right people" = (a) a particular public key, (b) a URI or reference to a service to find such a key (ideally cacheable), or (c) the public key of someone who has provable signature-authority from the "right people"). With the optimizations above, you can ensure that most security checks pass without actually performing them at runtime (e.g. a process that creates a memory cell starts with access rights to it, so no need to check at runtime). StaticAssert can force checks into a GateKeeper so processing doesn't even start before all security is proven, but this doesn't prevent the GateKeeper itself from proving such checks statically.

Making it work in an environment where eavesdropping is presumed possible - and thus capabilities can be stolen but not forged - can be done by having these values specify 'extents' such as expiration dates, a statement that the value is only valid so long as one is involved in a particular (reversible) transaction, or only so long as some other capability is granted (dependent capability), or only so long as communications on a certain port aren't cut, or only so long as a request to XYZ says it's valid (which might come back with a response cacheable for so many seconds... e.g. 'yeah, you can use that certificate for the next 30 seconds'). Etc. These limitations in extent are honored by the code checking for the certificates (if you can compromise the security checking code accessed by the primary services you must already have full control of the system).

The ability to limit forwarding of signature authority (sigauth) also offers all the normal features of AccessList based security. This is considered one 'extent' but feeds back into the "right people" calculation, and thereby allows you to limit 'who' can directly access a service. Proponents of the CapabilitySecurityModel think it a good thing it doesn't attempt to prevent the impossible (i.e. you can act 'directly' on someone else's behalf), but I believe it good to be able to provide 'buck stops here' support when it comes to dealing with responsibility for actions (i.e. AccessList provides 'responsibility', and CapabilitySecurityModel provides 'capability'). I also figured there wasn't a reason to fight myself over CapabilitySecurityModel vs. AccessList model when there is an easy way to obtain both.

I'm wondering about the relation of ExplicitManagementOfImplicitContext and DynamicScoping. It seems that EMIC depends on and builds on or simply generalizes DynamicScoping. One key point again seems to be typing. DynamicScoping is mostly found in dynamic or weakly typed languages.

EMIC with dynamic scoping would essentially look like this:

 print(point) { stream << point.x << "," << point.y; }
 print(a, b) { print a; print b; }
  stream = environment.stdout;
  print new Point(3,4), new Point(4,5);

How can this be typed? A fully typed version might look like this:

 @requires(Stream stream)
 void print(Point point) { stream << point.x << "," << point.y; }
 @requires(Stream stream)
 void print(Point a, Point b) { print a; print b; }

(here using some kind of annotation to add the type of the implicitly used stream). Note that the stream type must be declared for each method it passes thru. This is exactly what we want to avoid: chaining the context thru all intermediates where this gains nothing.

Type inference could infer and check these types (just treat all free variables as extra parameters, recursively) - but only if all outermost callers are available for checking. This breaks modularity and separate compilation. But we could enforce the rule that modules must make their EMIC requirements explicit (i.e. no free variables leaving ModuleScope).
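
The "treat free variables as extra parameters, recursively" step can be sketched as a fixpoint over the call graph. This Python sketch is an assumption about how such an inference might be organised, not a description of any existing checker. The function `infer_requirements` and its three input tables are hypothetical, and the two `print` overloads from the example are given distinct names. It tracks only names, not types, and assumes every function appears in both `direct_uses` and `calls`.

```python
def infer_requirements(direct_uses, calls, provides):
    """Compute, per function, the context entries it still needs from its caller.

    direct_uses: function -> set of context names it reads itself
    calls:       function -> set of functions it calls
    provides:    function -> set of context names it binds for its callees
    """
    required = {f: set(uses) for f, uses in direct_uses.items()}
    changed = True
    while changed:  # iterate to a fixpoint so recursion and cycles are handled
        changed = False
        for f, callees in calls.items():
            for g in callees:
                # What g needs, minus what f supplies, minus what f already requires.
                missing = required[g] - provides.get(f, set()) - required[f]
                if missing:
                    required[f] |= missing
                    changed = True
    return required

direct_uses = {"print_point": {"stream"}, "print_pair": set(), "main": set()}
calls = {"print_point": set(), "print_pair": {"print_point"}, "main": {"print_pair"}}
provides = {"main": {"stream"}}   # main binds 'stream' before calling print_pair

requirements = infer_requirements(direct_uses, calls, provides)
```

Here `print_pair` inherits the requirement on `stream` without declaring it, and `main` ends up requiring nothing because it supplies the binding. A module-boundary rule like the one proposed above would amount to publishing the computed set for each exported function.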

-- .gz

PS. Additional note: The introduction of variables in dynamic scope should happen in a structured way, i.e. such that the 'scope' of the uses is clear and in particular clearly different syntactically from 'normal' variables. I think something like:

  with(stream = environment.stdout)
    print new Point(3,4), new Point(4,5);

Type inference is the way to go here when the aim is to verify safety within a collectively compiled process. Manifest typing of Java exceptions was a huge mistake, and manifest typing of EMIC would destroy its utility.

There does not need to be a violation of "modularity and separate compilation"; in the context of such, one leaves some of the type-checking to the linker - something that needs to occur anyway when working with open functions and open datatypes, so if you already have similar cross-module features this isn't a loss at all. That said, we can (and, I say, should) also trade separate compilation as it is currently performed for other approaches to composition, efficient builds, and obfuscation that are more amenable to whole-program optimizations, EMIC, open datatypes and open functions, other features that cross module boundaries, and checks. It doesn't take much thinking to do much better.

I'm still thinking about how I favor introduction of variables to a context, but I would note that in a real system with EMIC and Java/C/C++-style process model, 'environment' itself would simply be the context passed to main(). That is, it is unlikely to exist as a global value. As such, there is no need for "stdout", since nothing about "out" needs to be "standard". With that, one could do something like this for the structured approach (changing the context for a limited region):
  using(environment with out = <my Stream>, x=A, y=B)
   print new Point(3,4), new Point(4,5)

Alternatively, changes to the environment could simply last until end-of-block. While one could save and restore the environment, most of the time one would simply structure code so one doesn't need to do so:
 savedEnv = environment;                                          
 environment = environment with out=<my Stream>, x=A, y=B
 print new Point(3,4), new Point(4,5)
 environment = savedEnv
   environment = environment with out=<my Stream>, x=A, y=B
   print new Point(3,4), new Point(4,5)

Procedures whose main task is to set up the environment would need to 'return environment' or 'return (value,environment)' more explicitly.

Admittedly, I think I'd prefer to use the word 'context' (which nicely pairs with 'content' when discussing messages) to the 10-character 'environment', and possibly treat access to it as a primitive procedure rather than a primitive-supported variable. Syntactic details like that aren't on my immediate schedule for resolving.

ExplicitManagementOfImplicitContext and DataflowProgramming

In a recent discussion in the EventualSideEffects topic (discussing delayed SideEffects via FlowBasedProgramming and ComplexEventProcessing as the subject), I mentioned ExplicitManagementOfImplicitContext. I think the mention is worth a little explanation.

There are many issues for PublishSubscribeModel, MultiCaster, ObserverPattern, DataflowProgramming, etc. but some of the significant ones are: EMIC helps with the demand-side properties in the presence of secure composition (i.e. where a MultiCaster intermediary is acting primarily as a switching-network that composes dataflows) without sacrificing demand-driven publish or effective disruption handling. For sharing of dataflows, EMIC both hurts and helps - a property I'll discuss below.

Integrating EMIC with a dataflow is surprisingly simple:

In addition, and rather trivially, you may want some way to manipulate the context of the "return" messages. This might be done similar to transform functions and filter-functions, but it might not make much sense to keep the context of the original source; I'd suggest simply making the reply-context part of the hook (subscription).

I'm assuming a panoply of 'dataflow primitives' (such as dataflows from state, 'fallback' dataflows, a 'failure' dataflow, an 'indirection' dataflow (i.e. redirect: takes a dataflow-of-dataflow-of-X and returns a dataflow-of-X), a 'functional-transform' dataflow, various 'combinator' dataflows (which take state from other dataflows), as well as whatever 'initial dataflows' might be introduced or published by the application layer). Also, the cxmod and cxmux apply just as effectively to event-flows (the main difference between an event-flow and a dataflow is that you can always ask for the 'current state' of a dataflow, and you can get from ).

Anyhow, I said earlier that EMIC both helps and hurts sharing. Well, there is a complexity to that. Basically, using EMIC means that (cx,df) can share 'df' among many users. Where this really helps sharing is MultiCaster, as it means that subscribers do not need to 'specialize' an 'abstract df' based on mutable demands. Since the MultiCaster domains become less specialized, dataflow 'splits' due to switching can be avoided. This, in turn, helps sharing. It's an incredibly indirect, emergent effect, but amounts to: 'cxmux' reduces necessity for higher-layer mux, and thus keeps more of the 'dataflow' defined in the lower layers where the language and runtimes can make more assumptions and perform more optimizations.

And at the outer edges of the system, context may also be used to decide policy (aka demand-driven policy). E.g. assume there are dozens of dataflows indicating the current 'image' of a webcamera, but the webcamera can only handle certain modes (e.g. cannot be in both IR mode and visible-light mode, and cannot be in both low-band mode and high-band mode). One could control such modes via stateful commands but, just as one could control whether the camera is on or off by explicit state, it is often more convenient and robust (to changing needs) to allow such features to be demand-driven. One could use a priority or voting-scheme to decide bandwidth and IR in the absence of a directive from a control node. And, in the absence of connections, the webcam can power down. (Usefully, one could also use SPKI certificates as described in Security Context to ensure that secure policy-fields like 'priority' work in a system with different levels of true authority.)

Similarly, if 'polling' is used at the edges of the system to keep data up-to-date, the demand-driven policy can easily be used to set the polling rate (i.e. if nobody needs updates faster than once a second, just poll at once a second). One of the advantages of DataflowProgramming is to push the evils of 'polling' to the very edge of the system where it will consume the fewest bandwidth and CPU resources.
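
The demand-driven polling rate reduces to a small calculation over the contexts attached to the current subscriptions. In this Python sketch the `Subscription` record, its `max_interval` field, and the function `polling_interval` are hypothetical, as is the convention that no subscribers means the device may power down.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Subscription:
    subscriber: str
    max_interval: float  # hypothetical context field: seconds the subscriber can wait between updates

def polling_interval(subscriptions):
    """Poll only as fast as the most demanding subscriber needs.

    Returns None when nobody is subscribed, meaning polling can stop entirely.
    """
    if not subscriptions:
        return None
    return min(s.max_interval for s in subscriptions)
```

For example, with one subscriber content with updates every 5 seconds and another needing them every second, the edge polls once a second; when the second subscriber leaves, the rate relaxes to once every 5 seconds without any explicit command being sent.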

I might end up not using EMIC for the context, but I suspect the primary result of so doing would be to require a more explicit context. Upon reviewing primitives, there are a lot of places context would be useful, and the above access to context is quite limited. It might be worthwhile to still support the implicit context but to ensure simple access to it in 'select' and 'switch' statements rather than adding new primitives like cxmux and cxmod.

In addition to forward context, there is also some value in manipulating and supporting a couple more implicit contexts

Recent work on the ConfigurationProblem has produced another solution for HaskellLanguage. See Functional Pearl: Implicit Configurations by Oleg Kiselyov and Chung-chieh Shan. http://www.cs.rutgers.edu/~ccshan/prepose/prepose.pdf (PageAnchor TypeClass)

This uses typeclasses to introduce configuration-parameters, and manages many of the above critical features, especially regarding the ability to maintain multiple configurations at once, not mix them up, manipulate configuration contexts at runtime, etc.

Very nice solution. Works basically like the C++ template magic^H^H^H^Heta programming (see e.g. TemplateMetaprogrammingTechniques). The obvious disadvantages of this kind of techniques are So this is very useful but only for very specific areas and not the general audience. -- GunnarZarncke

If it's a choice between me and the compiler, I'd rather burden the compiler. In any case, TemplateMetaprogramming with concepts (which were recently redacted from C++0x (now C++1x) for lack of confidence that all the kinks were worked out) would have supported much better compiler errors. I suspect Haskell can do reasonably well with its typeclasses.

What I am wondering is: Can a separate 'type language' be avoided? I mean all this type/meta/macro programming usually happens in a separate language mostly for historical reasons: The associated language evolved out of more primitive structure of the 'host' language. Types were initially only keywords. But now types are as complex as the expressions they type. Macros were initially only simple replacement expressions. Now they have binders and scopes and libraries of their own. Same for templates. Why do I have to learn another language when large parts - at least the structural part - are the same for both? Couldn't these DSLs be the same - albeit with restricted domain/expressivity/types? For example to make the type-language sub-set computable we could restrict the loop expressions and types of its expressions. It is a kind of boot-strapping. We'd only need a means to connect the host language with this DSL. And that's where I'm not clear. -- GunnarZarncke

That question is off-topic with regards to EMIC, but the answer is: yes, easily. Indeed, in more dynamic languages (like SmalltalkLanguage), types themselves are often plain-old-objects. See FirstClassTypes.

Based on your earlier inquiry, though, I suspect you are more interested in the StaticTyping side of things. Attempting to achieve this statically, even given sufficient support for PartialEvaluation or staged evaluation (like CompileTimeResolution or ThirdFutamuraProjection), gives rise to issues of regression: one is using language expressions to describe type of language expressions. You need a base case. One could break the regression by falling back on DynamicTyping at some point, especially for a SideEffect-free (and ideally terminating) language subset used at compile-time (that is, one may take advantage that a DynamicTyping-error at CompileTime may be elevated to a CompileTime error, and it ain't a serious problem in the absence of SideEffects, especially if one can guarantee termination). One could also use third-party analysis to help validate the expressions describing types.

The Implicits or ImplicitParameters of ScalaLanguage provide something comparable to an EMIC. They are as powerful as TypeClasses (as explained in the article below, see PageAnchor TypeClass above) and could be the base of an implementation of EMIC.

Contributors: GunnarZarncke

See Also: ContextObject, ContextObjectsAreEvil, KeyLanguageFeature

Last edited November 12, 2014