Learning Crypto

You are currently browsing the archive for the Learning Crypto category.

The Quest for True Randomness

(I wrote a long post, and it turns out my login session had expired. On trying to save, I just lost the whole post. Just so that you know the hazards of leaving your browser tabs open for ages.)

How can a program running on a desktop/laptop get “true” random bits? From searching around a bit, the “good” answer seems to lie in the firmware hub RNG. (Also see this (pdf).) Since the raw bits produced by the circuitry are not unbiased, they use the von Neumann extractor: this is indeed sufficient, as long as the bits are i.i.d (independent and identically distributed) even if they are not uniform (i.e., as long as they are independent tosses using the same coin, even if the coin is not a fair coin).

Apparently you can access this RNG via /dev/hw_random. You may need root access for this. A word of caution: in the one Linux box that I have access to, cat’ing /dev/hw_random produces a rather questionable bit sequence of all 1’s. Maybe I don’t have the firmware hub in my box?

Anyway, it seems the more familiar source of randomness used by a program is /dev/random. It has a different philosophy to randomness:

The random number generator gathers environmental noise from device drivers and other sources into an entropy pool. The generator also keeps an estimate of the number of bits of noise in the entropy pool. From this entropy pool random numbers are created.

This sounds quite fine, as long as the raw randomness from this entropy pool is “refined” before use (or is already clean because it came from /dev/hw_random — well, not mine I hope). I don’t know what is there in the Linux kernel to refine raw randomness, but I’ll hazard a guess that it doesn’t probably have one of the shiny new randomness extractors (deterministic or seeded). (If anyone knows better, please do comment.) The randomness extractors are precisely the tools you want here: as long as there are some minimal guarantees of entropy and independence in the raw source, an extractor can be used to efficiently extract much of this randomness into a uniform source, ready to be used as a one-time pad or what not. It will be also nice to have a /dev/pseudorandom which can take such a seed from /dev/random and stretch it using a “provably secure” PRG.

Aside: Even the hardware RNGs have their share of trouble. See this for instance. Again, as long as the hardware can provide some partial guarantee in the face of such attacks, a randomness extractor will prove quite handy.

Course objective

As some of you in the course might have observed, we have opted to omit several details regarding the cryptographic constructions: e.g., construction of PRF from PRG, inner workings of any block-cipher, the Chinese Remainder Theorem, etc. So how would you be able to construct your own encryption scheme?

The objective of this course is not to teach you how to build low-level cryptographic primitives. (We simply won’t have the time to fit everything into a one-semester course.) Instead, we aim to learn what these primitives are and how to use them in a larger system.  In fact, simply using the various cryptographic primitives already calls for some expertise: just too many of them resemble each other (for e.g., those which take an input and a key, and produce an output that looks like gibberish). It is crucial to understand what security property we need for a specific task at hand.

One objective of this course is to introduce you to some of the most important cryptographic primitives used in theory and practice, and provide a good understanding of the security guarantees, the assumptions behind them, the efficiency bottlenecks, and possible trade-offs, so that when you encounter them in a system, you can ask yourself if they are used in a meaningful manner. You can also use these primitives correctly in building higher level schemes. But should you want to build lower level primitives, all that this course can do is to tell you that you’d better make sure you know what you are doing.

(I will have more to say regarding course objectives, later on, but this is the primary objective, in my view.)

Today was my first full lecture this Fall. You’d think by now I’d be quite used to lecturing. But half way through, I was already feeling tired — maybe I should have had a better breakfast :-) More likely, it is just all the summer idleness that is taking its toll.

Another possible reason for being exhausted is that I was packing material from some 3-5 lectures of past courses into a single lecture. Admittedly it was a little brisk, but hey, it wasn’t that unreasonable since we didn’t attempt any proofs in class. Nevertheless, a little bit of an explanatory text will be useful, since the textbooks don’t do it the same way we did it. So here you go.

In keeping with the idea of teaching as few ideas as possible, we have essentially only one security definition for a variety of tasks and settings. These are definitions in the “real/ideal paradigm” or simulation based security. I discussed the basic philosophy here and here. A little more follows.

The security definition presents an “ideal world” where the primitive (message transmission, in the case of encryption) is modeled as an ideal functionality (a channel which the adversary can’t read from). It also presents a “real world” model, where the encryption scheme (whose security we are defining) is deployed; this real world should model all the power the adversary could possibly have. Roughly, the security definition says that whatever an adversary can make happen in the real world, an adversary could have made happen in the ideal world too. Both these worlds come with the following cast of characters: the parties using the primitive (i.e., Alice and Bob who are participating in the message transmission), the adversary (Eve) and the environment. The Environment is perhaps the key novelty here.

The Environment. It models where the inputs to the parties (message to be encrypted by Alice) come from, and where the outputs from the parties (message obtained by Bob by decrypting) go to. Note that, Alice and Bob stand for merely the encryption/decryption (+key-generation) algorithms; so Alice must get the message from outside, and Bob must output the message to the outside. That outside is part of the environment. But there is more to the environment. The adversary might have arbitrary influence over, or have arbitrary information about, the inputs to Alice. So we let the environment interact with the adversary through out its life time.

The security requirement is that in the real world and in the ideal world, the environment behaves (almost) identically. What do we mean by behaves? The environment is modeled as producing an output at the end of its life. But this could in fact be the entire history of the environment’s life — all the messages it ever received and the outcome of all the coins it ever flipped (which together determine the rest of its behavior). Then the behavior of the environment in the experiment — the real world or the ideal world — is defined by the probability distribution over the various output strings that it can produce. (The probability is over the coin tosses of all the entities in the world — the parties, the adversary and the environment itself.) But it turns out that without loss of generality, we can consider only environments which output a single bit.

Let Z be the set of outputs produced by an environment \text{Env}. For each X \subseteq Z, consider modifying the environment \text{Env} to a new environment \text{Env}'_X which outputs 1 iff the output of \text{Env} is in X. Now, if the output distributions in the two experiments involving \text{Env} (i.e., real and ideal worlds) are not identical, then there is a non-empty set X^* of outputs for which the probability in the real world is higher than that in the ideal world. Then the probability of \text{Env}_{X^*} outputting 1 in the two experiments will be different too. Hence if some environment with a long output behaves differently in the real and ideal worlds, then some environment which outputs a single bit behaves differently too.

Absolute vs. Relative Security. Note that if we are considering arbitrary environments interacting arbitrarily with the adversary, we can’t provide any meaningful absolute guarantees. For instance, we can’t say that the adversary will not learn the first bit of the message, because the environment might just give it away. Instead, we are interested in providing relative security guarantees. (This indeed was the point of this earlier note.) The simulation based definitions make this quite explicit.

Technical vs. Convincing definitions. All the SIM-* definitions we saw in the class have counterparts among IND-* definitions. (And this will be so when we get to public-key encryption as well.) The IND-* definitions, namely IND-CPA and IND-CCA, are more “standard.” The IND definitions are remarkably insightful and delightfully simple (if you ask a connoisseur, that is), but can be quite cryptic (no pun intended) in that it is not immediately clear why that security definition is the “right” one. On the other hand a SIM definition is some sort of a catch-all definition: it doesn’t necessarily get to the technical heart of the matter, but makes clear what security guarantee is being provided (namely, “you can imagine you are in the ideal world”).

When an IND definition (plus a correctness requirement) turns out to be equivalent to a SIM definition, one can think of the former as having distilled out the technical essence of the SIM definition. Typically, an IND definition describes a very specific environment — a single environment/experiment that is capable of testing the security of a scheme. Then, if you want to prove that a scheme is secure, you’d probably want to work with an IND definition, because writing a direct proof for the SIM definition can often be quite tedious. On the other hand, the security guarantee of a SIM definition would typically be easier to exploit directly, if you want to use your scheme in a complex environment (say, inside a larger protocol).

What is cryptography?

As we kick off the Applied Cryptography course, that is a good question to ask.

The answer of course, is that it is an an elephant! The goal of the course is to try and get a clearer picture of this animal (though we probably won’t be able to see the whole herd). We’ll start off from the theory, with definitions of some basic notions of information and secrecy.

At this point, it doesn’t hurt to reiterate a point we briefly touched up on in the introductory lecture, especially if it lets me recycle an old blog post. The very notion of information (from which stems the notion of secrecy) is based on a probabilistic model of the world. Pretty much all the cryptographic schemes we will encounter will require flipping coins in order to create uncertainty for an adversary. We will see this in action right away, with our first topic — namely, secret-sharing. Typically, cryptography courses would start off with encryption, but I think secret-sharing is perhaps a simpler context in which (perfect) secrecy can be understood. More on that later.

The new course

This Fall Nikita and I would be teaching a new course called Applied Cryptography. The plan is to cover the basic theory of “modern” cryptography and also look at several applications where these concepts are/should be used. One of the goals of the course, in my view, would be to teach as little as possible :-) Well, or rather, teach a few fundamental concepts which manifest in different ways (and unfortunately under different names) in various applications.

We don’t have a public webpage for the course yet, but we’ll get to it after both of us are in town, before the classes start.

Since Nikita and I’ll be taking turns lecturing, this time, I’ll hopefully get around to posting more here, on this blog. To get things started, what’s better than a video on youtube? Here’s a “Google Tech Talk” bemoaning some of the pitfalls in applying cryptography. (It may or may not make sense to you, but the bottom line is that you need to take this course before using any crypto stuff :-) )

Essay: The Edifice of Cryptography

As the semester (as well as the new edition of the crypto course) starts, it’s time to liven up this place a bit. So here’s an essay (under preparation, mind you) based on the introductory lecture for the course. It’s titled “The Edifice of Cryptography: An Artist’s Impression.”

The artist’s impression we present is an attempt to bring out the various aspects of the field. Like all such perspectives, this by no means is the only way to comprehend the field; nevertheless, we hope such a perspective is useful in conveying the spirit of the field to students and non-specialists. (read on)

As you might notice, the essay is on cryptutor. Over the course of the semester the wiki should accumulate more material.

On defining security – II

Last time — I know, it has been a while — we argued that a typical, good definition of security should be an assurance that whatever goes wrong, it wont be the cryptographic protocol to blame. And we asked how one could give such a guarantee.

For concreteness let us consider a specific cryptographic task: of secretly sending messages over a public channel. So, Alice wants to send a message to Bob. Now, what can cause things to go wrong here. Intuitively there are at least two causes of worry: something may go wrong because an eavesdropper learned something about the communication, and something may go wrong because of the fact that the message reached the other end (maybe because Alice chose to send the wrong message). If we are devising an encryption scheme for this task, clearly it is only the former concern that we need to address. The second concern is there just because of the functionality of message transmission.

Here is a thought experiment. Suppose Alice could send her message to Bob through a channel which is hidden from everyone else. In such a world, still things could go wrong because of the fact that the message reaches Bob. However the concern of eavesdropping does not exist. Indeed such a world has everything that encryption could offer (and a little more). So this world — which we shall call the ideal world — forms the benchmark to which a world using a proposed encryption scheme would be compared. If whatever could go wrong in a world using the proposed encryption scheme (over a public channel) could go wrong in the ideal world (with private channels) as well, then we contend that it is not the encryption scheme to blame.

It still remains to formalize the guarantee that “whatever could go wrong in the real world could go wrong in the ideal world as well.” We will return to the concept in more detail at a later point in this blog. But if you are curious, take a peek at the CRYPTUTOR definitions of security of encryption: SIM-CPA and SIM-CCA (under construction, I must warn you).

A word about CRYPTUTOR: It’s a wiki tutorial for theoretical cryptography, in the works. In its infancy, I should say. Much of the material currently there was developed during a course this Fall, with the help from the students, and from a most diligent TA Mike Rosulek (who also happens to be my PhD student). If you find any of the pages there in good shape, it must be Mike’s hands behind it!

I’d use the next few posts to start a discussion on defining security. Satisfactorily defining security is one of the most important enterprises in the theory of cryptography. There’s much to say, and I’m sure there’re many possible divergent views on this. So all the usual disclaimers are in place.

What should a security definition be like? One would expect to get a reassuring guarantee that if a scheme meets the definition, then nothing can go wrong in an unexpected way, when using the scheme. Now, it would be rather presumptuous on the part of a cryptographer to give such a guarantee, for there are other parts in the system which may have nothing to do with the cryptographic protocol (like, say your trusted buddy turns hostile and decides to publish all the e-mails you sent him or her, or say the stock-market crashes and you lose all your money). You wouldn’t want to blame your misfortunes on the poor cryptographer. On the other hand, if you think that your now-hostile buddy managed to find out some of your secrets because of a flaw in a cryptographic protocol, or that you lost money because your competitors managed to cheat in an auction protocol that ought prevent cheating, then by all means, you can ask for an explanation from your cryptographer.

The security guarantee, then, should be an assurance that whatever goes wrong, it won’t be the cryptographer’s fault.

So, how does one give a guarantee that if something goes wrong it won’t be the cryptographic protocol to blame? We shall come back to that question in subsequent posts. For now I just want to leave you with this thought: traditionally (I’d imagine) one indeed used to think of giving security guarantees as giving an assurance that nothing can go wrong in an unexpected way. The major limitation of this — philosophically speaking — is that we can give such a guarantee only in relatively simple scenarios where we know what all can go wrong. For more variable/complex scenarios, we shall give up on such an absolute security guarantee for a protocol, and instead go for a don’t-blame-me/not-my-fault kind of guarantee; it would leave room for things to “go wrong,” but not because of flaws in the protocol.

(Thanks to Jonathan Katz for prompting me to blog on. Well, more accurately he suggested I should either blog more often or just delete the blog. I don’t know if he would be disappointed at the choice I made :-) )

Coin flips

Theoretical computer science is mostly nice and self-contained, except for a few rough edges where it meets real life (unless you’re happy not to meet real life). One of these intersections is where we abstract some real phenomenon into a mathematical concept (another being when we have a result in our mathematical model and then are trying to interpret its meaning in real life).

If you’re learning theoretical cryptography, the very first abstraction that you’ll have to get to grips with is the use of probability. Our definitions, constructions and proofs of security all vitally depend on probabilistic notions. There is a leap of faith in assuming that something we do (like, obtain a bit-string from some circuitry in our computer) is in fact an instance of a random process. How is it that you can look at a string at hand and say that it was generated according to some probability distribution? If you are already comfortable with the quantum mechanical view of the world, this shouldn’t bother you at all. (You thought you were going to learn computer science, and here I’m talking about physics. But rest assured, it’s just going to be math all along, except at these clumsy interfaces of the abstract and the physical.) But if probabilities and distributions are new for you, well then you just have to keep talking about it until it doesn’t sound funny any more. If it helps, think only about the probability of events in the future, and not of what could have happened, or is happening in parallel universes.

Probability is a concept for which one has to develop an intuitive understanding. But, then again, chances are you have already done it. Don’t you look at a coin which fell heads up and think that it could have been tails up, equally probably? Without being bothered about the aerodynamics of your coin, of course. That’s because you understand the notion at an intuitive level. Maybe you never realized how far this can be taken. But really, that’s all there is to it: millions of coin flips. And you know what a coin flip is.

Once you buy into that completely, you can then start worrying about whether it was alright to do so. But that can wait.

Of course, probability and randomness are not special to cryptography. You must have come across this notion in physics, chemistry, biology, computer science (randomized algorithms), information theory, and perhaps in a course on probability in the math department! That’s probably a good excuse for me to not spend a lot of time on probability in a cryptography course. But I’m making up for that here, bringing it up right at the beginning.