# Black Box Computation

Moved here from EncryptedComputation

I thought about this concept 10 years ago too. I have a 'solution', which I called BlackBoxComputation (note, that this seems to be the standard term for such computations recently). But it has two main disadvantages:
• It is inherently slow (depending on decryption for every step of the computation).
• The steps actually carried out can be traced (which is possible for any scheme, see below).
The idea is not to encrypt the complete program (which was the cause of the misunderstanding above I guess), but to encrypt every step according to the following scheme (rather the 'encrypted language' proposed above):
• Each step of the program is encrypted with a separate key.
• Each step consists of
• output data (could be a specification of a variable)
• specification of requested input data (could be a variable too),
• a primitive operation (e.g. comparison) on the input
• one or more encryption keys and locations of successor steps (chosen depending on primitive operation result).

I remember, that I had a multiple key scheme, to prevent the discovery of multiple instances of the same step (thus e.g. preventing of reliable discovery of loops), but cannot remember the details.

Obvious improvements on this scheme could be that variables are implemented with input/output as sketched above, but with homomorphic cryptography and a cryptographic store (if such a thing exists).

There are proofs, that EncryptedComputation is impossible (sorry, no source, heard it during study), that go roughly as follows:
• It is assumed, that the 'evil' CPU has access to the encrypted program (i.e. the complete program resides in 'evil' memory.
• The computation consists of separate steps (this must be so for serial input/output to be possible).
• The CPU executes the program step by step.
• At every step the step might output some data and/or request input to be incorporated into the next-step-decryption.
• The CPU will thus learn, that a sequence of input leads to a sequence of output but no branching yet.
• To learn the branching structure of the program the CPU can either backtrack or breadth-first-search the program by providing all possible input to the current step and look at the resulting steps.

This works even if the steps are not known beforehand (i.e. not the whole program is decrypted, but only step-by-step during execution). But this process is of course very time-consuming.

If no input on the CPU is required and only one large computation (say factorization) is required, one could conceive transforming the complete algorithm into an encrypted parallel form, where is there are 'random' processing elements, that pass encrypted information back-and forth and the result is disguised in only a few of these. Imagine a big weather simulation, where millions of elements. If you provide a special initial 'weather' and the 'physics' is turing complete, I think you could hide your information there (no proof).

-- GunnarZarncke

I understand why you consider it mockery, but it shouldn't be called that. How about SerialEncryptedComputation??

Okay, I admit that I probably got oversensitive about being mocked. Feel free to rename or move back the contents of this page.

EncryptedComputation can actually accomplish stuff. It may or may not be possible but if it's possible, there's a whole bunch of really useful applications it would impact. One of the things it would impact indirectly is the design of secure distributed operating systems. And if it's not possible, then the proof that it isn't possible, and why it isn't possible, still matters a hell of a lot.

The proposal here simply doesn't work. But worse than not working, it fails in a very uninteresting manner. I mean, you're basically trying to turn computation into storage, and that just doesn't work. You can turn storage into computation easily using lambdas but turning computation into storage? I don't even see how that can be done. The difference between the two is interesting and I'd love to understand it better ....

In the last paragraph, you talk about steganography. And yes, this is possible, or at least easily conceivable. But steganography isn't the same thing as cryptography.

-- rk

I know which kinds of applications it enables, that's why I though about it 10 years ago. But my result was (at least for serial computations): If it is possible it is too slow which makes it essentially useless for most applications.

And why do you think it doesn't work? I agree, that the actual steps can be followed, but that doesn't make it useless, because you do not gain the complete program, only traces. If the processed data can be stored cryptographically (i.e. store multiple values and retrieve them without the observer being able to discover which one is which), then you effectively gain nothing from the trace (or am I missing something).

As for the weather metaphor: Yes, steganography is a nice application too, but what I meant was creating a uniformity of the ongoing processes, that the actual calculation is not only hidden, but provably cannot be followed. I think about a process with the property, that the 'random process' can be potentially interpreted as any computation (though I don't know how this could be realized).

--.gz

• Isn't that how ReverseEngineering works? Following traces to find out the relevant information. If EncryptedComputation would be really possible and feasible, closed source application distributors would have started deploying it a long time ago. -- AntonioTaylor?

For encrypted computation, I assume that all the memory and CPUs used to execute the program are under the control of an untrusted party with the resources of the entire universe. They can reverse engineer the CPUs at will.

The big problem with your scheme, which I couldn't express earlier, is that it's a fuzzy security scheme and not a well-defined security scheme. For example, it's not clear what interlocking the sequence of instructions actually accomplishes beyond making the security scheme more difficult to understand.

It's also not clear what security is provided for the first instructions even though there is a well-defined first instruction and the first instruction seems to have weaker security than every other instruction.

When you speak of hiding a process within a uniform bunch of other processes, this is steganography of computation (as opposed to steganography of storage). But once a part of the process is found, and assuming near unlimited resources, it's difficult to conceive of the entire process not being followed.

I don't understand what you mean by "I think about a process with the property, that the 'random' can be potentially interpreted as any computation".

-- rk

• I thought that (informally) steganography is about hiding, that there is a message (resp. computation) at all, whereas cryptography is about not being able to decipher it (once known that it is there). Applied to my 'weather' scheme this means, that you know, that there is a complex process (our computation) going on, but you cannot discern what it does. You could call that hiding the process, but that is not the intention.
• I didn't intend a real 'weather simulation' (better I drop that term from now on and talk about a complex process). Ok, imagine a complex process, where all combinations of primitive operations are potentially going on at a a sufficiently large number of processing elements.
• "A process with the property, that the 'random process' can be potentially interpreted as any computation": Normal reverse engineering has clues about the process because
• it is sequential (you are right, that if there is a first instruction, from which on a computation can be traced, this is a weakness).
• it reads input and produces output, which can be put into relation.
• there usually is the 'big picture': blocks, that communicate with each other, modules into which the problem can be broken down.
• The idea (though I guess, that it cannot be realized due to the disproofs mentioned at the top) is to design a process, that is so uniform, that it provably lacks these properties. It does so, if the process could potentially be interpreted as any possible computation, i.e. each interpretation is equally likely.
• Your point about sufficient/unlimited resources: That's no argument. All public key schemes are based on the fact, that some operations are harder (but not impossible) than others. For me it is enough if the complexity measure for breaking the code is sufficiently high (say longer than the life cycle of the product protected).

-- .gz

Okay, I think I understand. What you want is something similar to compiling a normal program into a large logic array. Now the thing about steganography is this, is most of the resulting logic array computing the cipherprogram or is it just computing junk to hide the structure of the cipherprogram? Or better yet, how much of what the logic array computes is directly dependent on the plainprogram? As I understand it, steganography isn't just meant to hide the presence of data but also the structure of that data. -- RK

Sorry, I cant answer that. I don't know how much 'computing junk' is needed or if 'junk' is necessary at all (the idea was, that rather few junk is needed). A large logic array with feedback might do it. The scheme is as follows: We want to compute f(x), and what we actually do is d(c1(f)(c2(x)). Where c1(f) and c2(x) are sent to the evil cpu. Choose appropriate c1 and c2 with matching d, such that f(x) = d(c1(f)(c2(x)). Extremely simple example: f(x1, x2) := (x1 and x2, x1 xor x2)
• c1(f(x1, x2)) = !f(!x2, !x1)
• c2(x1, x2) = (!x2, !x1)
• d(y) = !y
No junk used here. The evil CPU sees the operations, but will not know whether and, or, nor, ==, =>, halfadd of 2 bits or most of the other possible 2^16 operations is actually intended.

I do not know if this can be generalized to looping constructs (which are obviously the interesting part, otherwise I could have done the operation locally because the complexity of c1 is as large as f).

I conjecture, that for no junk to be needed the actual computation must at least be reversible.

Addition to 'each process equally likely' above: Each process of the same complexity should be equally likely.

-- .gz

EditText of this page (last edited January 3, 2010) or FindPage with title or text search