Some information theory today.

Today we look at pairs of random variables distributed between two parties, Alice and Bob. (Alice gets $X$ and Bob gets $Y$, where $(X, Y)$ is drawn from some joint distribution.) Such correlated random variables, distributed among different parties, are a powerful cryptographic resource. From encryption (think one-time pads) to oblivious transfer, a whole gamut of cryptographic tasks can be realized information-theoretically (i.e., without relying on the adversary's computational limitations) once the appropriate random variables are generated and given to the parties.

To make things a little more concrete, here are a few examples of functionalities and reductions. We shall consider four 2-party functionalities: COIN, EXCH, COMMIT and OR.

How do we measure or compare the complexity of the “cryptographic nature” of functionalities in a meaningful way? [1]

One could consider a functionality to be cryptographically trivial (i.e., of little complexity) if the access control it is enforcing could be enforced by individual parties themselves (without trusting each other to behave honestly).

[1] There are other kinds of complexity of a functionality that one may be interested in, like the amount of communication or computation that the functionality carries out. This is not what we intend to capture.

Last time, we said that we are interested in studying the “cryptographic complexity” of various systems. What exactly is a system and how do we specify it?

In a very broad sense, cryptography is all about controlled access to information. Now, with something that vague you will be justified in asking “what do you mean by access?” and “what kind of control?” and indeed, “what is information?” We’ll get more specific shortly, but continuing cryptosophically, there are at least two parts to “access to information” — learning information, and influencing information. You could think of that as read-access and write-access, if you will. But in fact, it gets much more involved than that. For instance, when information is not statically sitting around, but is the input to and output of computation, by choosing an input to a computation, I may gain access to information which corresponds to some complex combination of learning and influencing.

As a simple example, consider an auction system which accepts bids from several users and announces the largest bid. By choosing to bid really high, I can determine the outcome of this process (because I’ll win), or by choosing to bid really low, I can learn someone else’s bid (because someone else will win and their bid will be revealed). The computation involved in this system is simply a comparison operation to find the maximum, which is then output. This already makes it impossible to consider read-access and write-access separately. The more complex the overall system is, the more intricate the kinds of access to information that can manifest. This is the kind of complexity — what we shall call cryptographic complexity — that we are interested in.
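A toy rendering of that example (a sketch in Python, with made-up bids) shows how one input choice amounts to write-access and another to read-access:

```python
def auction(bids):
    """The whole 'computation': announce the largest bid.
    (A toy model only; a real auction would also name the winner.)"""
    return max(bids)

others = [40, 55, 62]   # the other users' bids, hidden from me

# "Write access": by bidding very high, I determine the outcome.
assert auction(others + [10**6]) == 10**6

# "Read access": by bidding very low, the announced maximum
# reveals the largest of the other bids.
assert auction(others + [0]) == 62
```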

Can we understand all the various possible kinds of cryptographic complexity? Can we quantify the extent of cryptographic complexity in a system, using just a few numbers, maybe? The plan for the next several weeks (if I can keep my promise, that is) is to outline some of our research over the last couple of years which formalizes these questions and uncovers some (what I find) fascinating structures.

There is one more very fundamental question that we can ask here. And that has to do with computational complexity. It is well recognized that many cryptographic goals can be realized only if we exploit the fact that there are tasks which cannot be carried out with a realistic amount of computational resources. In other words, for realizing many cryptographic goals, we need to rely on some computational intractability. Given that different cryptographic goals have different levels of cryptographic complexity, a natural question is whether and how the computational intractability required for these goals differs. We have a somewhat surprising answer here: there are just a few fundamental kinds of computational intractability that are necessary and (when appropriately exploited) sufficient for a great variety of cryptographic goals. Not surprisingly, this setting is much more challenging than understanding cryptographic content with no reference to computational power. We shall discuss our more recent developments on this front as well.


The idea for this little blogging project owes its existence to those who found the topic interesting, but couldn’t follow my badly executed talks :-) and suggested that I write up a summary. I am not planning to discuss all the technical results here, but for those who are interested, here’s a little bibliography of the papers in question. There are a couple more in the works, and hopefully the manuscripts will all be available before too long.

(I wrote a long post, and it turns out my login session had expired. On trying to save, I just lost the whole post. Just so that you know the hazards of leaving your browser tabs open for ages.)

How can a program running on a desktop/laptop get “true” random bits? From searching around a bit, the “good” answer seems to lie in the firmware hub RNG. (Also see this (pdf).) Since the raw bits produced by the circuitry are not unbiased, they use the von Neumann extractor: this is indeed sufficient, as long as the bits are i.i.d. (independent and identically distributed) even if they are not uniform (i.e., as long as they are independent tosses of the same coin, even if the coin is not a fair coin).
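For concreteness, here is a minimal sketch of von Neumann debiasing in Python (the 0.9 bias below is made up for illustration):

```python
import random

def von_neumann_extract(bits):
    """Von Neumann debiasing: read raw bits in pairs and output the first
    bit of each unequal pair, discarding (0,0) and (1,1) pairs. If the
    input bits are i.i.d. flips with Pr[1] = p, the pairs (0,1) and (1,0)
    each occur with probability p*(1-p), so the output bits are unbiased."""
    out = []
    for b1, b2 in zip(bits[::2], bits[1::2]):
        if b1 != b2:
            out.append(b1)
    return out

# A heavily biased but independent source still yields unbiased output bits.
raw = [1 if random.random() < 0.9 else 0 for _ in range(10000)]
extracted = von_neumann_extract(raw)
print(len(extracted), sum(extracted) / len(extracted))  # ~900 bits, mean ~0.5
```

Note the price of debiasing: with bias 0.9, only about 2 × 0.9 × 0.1 = 18% of the pairs survive, so roughly 900 output bits from 10,000 raw bits.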

Apparently you can access this RNG via /dev/hw_random. You may need root access for this. A word of caution: on the one Linux box that I have access to, cat’ing /dev/hw_random produces a rather questionable bit sequence of all 1s. Maybe I don’t have the firmware hub in my box?

Anyway, it seems the more familiar source of randomness used by a program is /dev/random. It takes a different philosophy of randomness:

The random number generator gathers environmental noise from device drivers and other sources into an entropy pool. The generator also keeps an estimate of the number of bits of noise in the entropy pool. From this entropy pool random numbers are created.

This sounds quite fine, as long as the raw randomness from this entropy pool is “refined” before use (or is already clean because it came from /dev/hw_random — well, not mine, I hope). I don’t know what there is in the Linux kernel to refine raw randomness, but I’ll hazard a guess that it probably doesn’t have one of the shiny new randomness extractors (deterministic or seeded). (If anyone knows better, please do comment.) Randomness extractors are precisely the tools you want here: as long as there are some minimal guarantees of entropy and independence in the raw source, an extractor can be used to efficiently extract much of this randomness into a (near-)uniform output, ready to be used as a one-time pad or whatnot. It would also be nice to have a /dev/pseudorandom which takes such a seed from /dev/random and stretches it using a “provably secure” PRG.
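To illustrate that last point, here is a minimal sketch of the /dev/pseudorandom idea in Python, using SHAKE-256 as a stand-in PRG (a “provably secure” PRG would instead come with a reduction to a hardness assumption; the 32-byte seed size is my assumption):

```python
import hashlib

def stretch(seed: bytes, nbytes: int) -> bytes:
    """Stretch a short, (assumed) uniformly random seed into a long
    pseudorandom string. SHAKE-256 in its extendable-output mode
    serves as the stand-in PRG here."""
    return hashlib.shake_256(seed).digest(nbytes)

# Hypothetical usage: seed from the kernel's entropy pool, then stretch.
with open("/dev/random", "rb") as f:
    seed = f.read(32)
pad = stretch(seed, 1 << 20)  # a megabyte of pseudorandomness from 32 bytes
```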

Aside: Even the hardware RNGs have their share of trouble. See this for instance. Again, as long as the hardware can provide some partial guarantee in the face of such attacks, a randomness extractor will prove quite handy.

Course objective

As some of you in the course might have observed, we have opted to omit several details regarding the cryptographic constructions: e.g., the construction of a PRF from a PRG, the inner workings of any block cipher, the Chinese Remainder Theorem, etc. So how would you be able to construct your own encryption scheme?

The objective of this course is not to teach you how to build low-level cryptographic primitives. (We simply won’t have the time to fit everything into a one-semester course.) Instead, we aim to learn what these primitives are and how to use them in a larger system. In fact, simply using the various cryptographic primitives already calls for some expertise: just too many of them resemble each other (e.g., those which take an input and a key, and produce an output that looks like gibberish). It is crucial to understand what security property we need for the specific task at hand.

One objective of this course is to introduce you to some of the most important cryptographic primitives used in theory and practice, and provide a good understanding of the security guarantees, the assumptions behind them, the efficiency bottlenecks, and possible trade-offs, so that when you encounter them in a system, you can ask yourself if they are used in a meaningful manner. You can also use these primitives correctly in building higher level schemes. But should you want to build lower level primitives, all that this course can do is to tell you that you’d better make sure you know what you are doing.

(I will have more to say regarding course objectives, later on, but this is the primary objective, in my view.)

Today was my first full lecture this Fall. You’d think by now I’d be quite used to lecturing. But halfway through, I was already feeling tired — maybe I should have had a better breakfast :-) More likely, it is just all the summer idleness taking its toll.

Another possible reason for being exhausted is that I was packing material from some 3-5 lectures of past courses into a single lecture. Admittedly it was a little brisk, but hey, it wasn’t that unreasonable since we didn’t attempt any proofs in class. Nevertheless, a little bit of an explanatory text will be useful, since the textbooks don’t do it the same way we did it. So here you go.

In keeping with the idea of teaching as few ideas as possible, we have essentially only one security definition for a variety of tasks and settings. These are definitions in the “real/ideal paradigm,” or simulation-based security. I discussed the basic philosophy here and here. A little more follows.

The security definition presents an “ideal world” where the primitive (message transmission, in the case of encryption) is modeled as an ideal functionality (a channel which the adversary can’t read from). It also presents a “real world” model, where the encryption scheme (whose security we are defining) is deployed; this real world should model all the power the adversary could possibly have. Roughly, the security definition says that whatever an adversary can make happen in the real world, an adversary could have made happen in the ideal world too. Both these worlds come with the following cast of characters: the parties using the primitive (i.e., Alice and Bob who are participating in the message transmission), the adversary (Eve) and the environment. The Environment is perhaps the key novelty here.
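In symbols (notation along the lines of the UC framework; the symbols $\pi$, $\mathcal{F}$, $\mathcal{A}$, $\mathcal{S}$, $\mathcal{Z}$ are my shorthand here, not something we fixed in class):

\[
\forall \mathcal{A}\ \exists \mathcal{S}\ \forall \mathcal{Z}:\quad
\mathrm{EXEC}^{\mathrm{real}}_{\pi,\mathcal{A},\mathcal{Z}}
\;\approx\;
\mathrm{EXEC}^{\mathrm{ideal}}_{\mathcal{F},\mathcal{S},\mathcal{Z}},
\]

where $\pi$ is the scheme, $\mathcal{F}$ the ideal functionality, $\mathcal{A}$ the real-world adversary, $\mathcal{S}$ its ideal-world counterpart (the simulator), $\mathcal{Z}$ the environment, and $\approx$ denotes closeness of the environment’s output distributions in the two worlds.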

The Environment. It models where the inputs to the parties (the message to be encrypted by Alice) come from, and where the outputs from the parties (the message obtained by Bob by decrypting) go to. Note that Alice and Bob stand merely for the encryption/decryption (plus key-generation) algorithms; so Alice must get the message from outside, and Bob must output the message to the outside. That outside is part of the environment. But there is more to the environment. The adversary might have arbitrary influence over, or arbitrary information about, the inputs to Alice. So we let the environment interact with the adversary throughout its lifetime.

The security requirement is that in the real world and in the ideal world, the environment behaves (almost) identically. What do we mean by behaves? The environment is modeled as producing an output at the end of its life. But this could in fact be the entire history of the environment’s life — all the messages it ever received and the outcomes of all the coins it ever flipped (which together determine the rest of its behavior). Then the behavior of the environment in an experiment — the real world or the ideal world — is defined by the probability distribution over the various output strings that it can produce. (The probability is over the coin tosses of all the entities in the world — the parties, the adversary and the environment itself.) But it turns out that, without loss of generality, we can consider only environments which output a single bit. (Why? Here is a quick argument.)

Let $S$ be the set of possible outputs of an environment $\mathcal{Z}$. For each $T \subseteq S$, consider modifying $\mathcal{Z}$ into a new environment $\mathcal{Z}_T$ which outputs 1 iff the output of $\mathcal{Z}$ is in $T$. Now, if the output distributions of $\mathcal{Z}$ in the two experiments (i.e., the real and ideal worlds) are not identical, then there is a non-empty set $T$ of outputs whose probability in the real world is higher than in the ideal world. Then the probability of $\mathcal{Z}_T$ outputting 1 in the two experiments differs too. Hence, if some environment with a long output behaves differently in the real and ideal worlds, then some environment which outputs a single bit behaves differently too.
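In symbols, writing $\mathrm{out}(\mathcal{Z})$ for the output of $\mathcal{Z}$ (my notation here, just to record the step):

\[
\Pr\big[\mathcal{Z}_T = 1 \mid \mathrm{real}\big]
= \Pr\big[\mathrm{out}(\mathcal{Z}) \in T \mid \mathrm{real}\big]
\;\neq\;
\Pr\big[\mathrm{out}(\mathcal{Z}) \in T \mid \mathrm{ideal}\big]
= \Pr\big[\mathcal{Z}_T = 1 \mid \mathrm{ideal}\big].
\]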

Absolute vs. Relative Security. Note that if we are considering arbitrary environments interacting arbitrarily with the adversary, we can’t provide any meaningful absolute guarantees. For instance, we can’t say that the adversary will not learn the first bit of the message, because the environment might just give it away. Instead, we are interested in providing relative security guarantees. (This indeed was the point of this earlier note.) The simulation based definitions make this quite explicit.

Technical vs. Convincing definitions. All the SIM-* definitions we saw in the class have counterparts among IND-* definitions. (And this will be so when we get to public-key encryption as well.) The IND-* definitions, namely IND-CPA and IND-CCA, are more “standard.” The IND definitions are remarkably insightful and delightfully simple (if you ask a connoisseur, that is), but can be quite cryptic (no pun intended) in that it is not immediately clear why that security definition is the “right” one. On the other hand a SIM definition is some sort of a catch-all definition: it doesn’t necessarily get to the technical heart of the matter, but makes clear what security guarantee is being provided (namely, “you can imagine you are in the ideal world”).

When an IND definition (plus a correctness requirement) turns out to be equivalent to a SIM definition, one can think of the former as having distilled out the technical essence of the SIM definition. Typically, an IND definition describes a very specific environment — a single environment/experiment that is capable of testing the security of a scheme. Then, if you want to prove that a scheme is secure, you’d probably want to work with an IND definition, because writing a direct proof for the SIM definition can often be quite tedious. On the other hand, the security guarantee of a SIM definition would typically be easier to exploit directly, if you want to use your scheme in a complex environment (say, inside a larger protocol).
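To make the “single environment/experiment” point concrete, here is a schematic of the IND-CPA experiment in Python. This is the shape of the definition, not a usable test harness; keygen, enc, and the adversary’s choose/guess interface are hypothetical placeholders for the scheme and adversary under consideration:

```python
import secrets

def ind_cpa_experiment(keygen, enc, adversary):
    """One run of the IND-CPA game. A scheme is IND-CPA secure if every
    efficient adversary wins with probability only negligibly more than 1/2."""
    key = keygen()
    oracle = lambda m: enc(key, m)             # chosen-plaintext (oracle) access
    m0, m1, state = adversary.choose(oracle)   # two equal-length messages
    assert len(m0) == len(m1)
    b = secrets.randbits(1)                    # the hidden challenge bit
    challenge = enc(key, m1 if b else m0)
    guess = adversary.guess(oracle, challenge, state)  # oracle access continues
    return guess == b                          # did the adversary win?
```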

As we kick off the Applied Cryptography course, that is a good question to ask.

The answer, of course, is that it is an elephant! The goal of the course is to try and get a clearer picture of this animal (though we probably won’t be able to see the whole herd). We’ll start off from the theory, with definitions of some basic notions of information and secrecy.

At this point, it doesn’t hurt to reiterate a point we briefly touched upon in the introductory lecture, especially if it lets me recycle an old blog post. The very notion of information (from which stems the notion of secrecy) is based on a probabilistic model of the world. Pretty much all the cryptographic schemes we will encounter will require flipping coins in order to create uncertainty for an adversary. We will see this in action right away, with our first topic, namely secret-sharing. Typically, cryptography courses start off with encryption, but I think secret-sharing is perhaps a simpler context in which (perfect) secrecy can be understood. More on that later.
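As a quick preview (a sketch I’m adding here; details in class), here is 2-out-of-2 XOR secret-sharing in Python. Note the coin flipping: it is exactly what makes each share, on its own, a uniformly random string that reveals nothing about the secret:

```python
import secrets

def share(secret: bytes):
    """Split a secret into two shares. Each share by itself is uniformly
    distributed, independent of the secret (perfect secrecy); both shares
    together determine the secret."""
    share1 = secrets.token_bytes(len(secret))              # fresh coin flips
    share2 = bytes(a ^ b for a, b in zip(secret, share1))
    return share1, share2

def reconstruct(share1: bytes, share2: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share1, share2))

s1, s2 = share(b"attack at dawn")
assert reconstruct(s1, s2) == b"attack at dawn"
```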

The new course

This Fall, Nikita and I will be teaching a new course called Applied Cryptography. The plan is to cover the basic theory of “modern” cryptography and also look at several applications where these concepts are (or should be) used. One of the goals of the course, in my view, would be to teach as little as possible :-) Or rather, to teach a few fundamental concepts which manifest in different ways (and unfortunately under different names) in various applications.

We don’t have a public webpage for the course yet, but we’ll get to it after both of us are in town, before the classes start.

Since Nikita and I will be taking turns lecturing this time, I’ll hopefully get around to posting more here, on this blog. To get things started, what’s better than a video on YouTube? Here’s a “Google Tech Talk” bemoaning some of the pitfalls in applying cryptography. (It may or may not make sense to you, but the bottom line is that you need to take this course before using any crypto stuff :-) )

