The synergy between programming languages and cryptography
( 30. Nov – 05. Dec, 2014 )
- Gilles Barthe (IMDEA Software - Madrid, ES)
- Michael Hicks (University of Maryland - College Park, US)
- Florian Kerschbaum (SAP SE - Karlsruhe, DE)
- Dominique Unruh (University of Tartu, EE)
- Annette Beyer (für administrative Fragen)
Increasingly, modern cryptography (crypto) has moved beyond the problem of secure communication to a broader consideration of securing computation. The past thirty years have seen a steady progression of both theoretical and practical advances in designing cryptographic protocols for problems such as secure multiparty computation, searching and computing on encrypted data, verifiable storage and computation, statistical data privacy, and more.
More recently, the programming-languages (PL) community has begun to tackle the same set of problems, but from a different perspective, focusing on issues such as language design (e.g., new features or type systems), formal methods (e.g., model checking, deductive verification, static and dynamic analysis), compiler optimizations, and analyses of side-channel attacks and information leakage.
The main objectives of the seminar are to cross-fertilize ideas between the PL and crypto communities, to exploit the synergies for advancing the development of secure computing, broadly speaking, and to foster new research directions in and across both communities. The programme will include tutorials to ensure common understanding and familiarity with existing tools, and research sessions covering the following (by no means complete) areas:
- Formal tools for provable security: Moving beyond the traditional Dolev-Yao analysis, PL researchers have recently made progress on formally reasoning about standard cryptographic (reductionist) proofs of security, using concepts such as, e.g., Hoare logic, dependent type systems, and abstract interpretation.
- Automated generation and analysis of cryptographic constructions: Program verification naturally suggests the possibility of developing automated tools for synthesizing new cryptographic algorithms and protocols.
- Composability: Composability is a well known challenge in cryptographic protocol design, and is an area that has received significant study in the PL community as well.
- Verified implementations: Work here fills the gap between the high-level descriptions of cryptographic protocols and deployed implementations.
- Domain-specific programming languages for cryptography: Such languages can make advanced cryptographic techniques and abstractions more accessible, more efficient, and less error prone.
- Information leakage: PL techniques are well suited to analyzing information flow, and understanding what amount of information leakage might be acceptable for certain applications.
The participants of this seminar will be drawn from both the PL and crypto communities, and will include researchers who have already started integrating techniques from both areas. This seminar presents an exciting opportunity to hear about the latest research and discover new connections!
The seminar schedule consisted of three components: short, two minute introduction talks (one for each participant), longer technical talks (Section 3) and open discussions on four different subjects. The first two days consisted of the introduction talks, followed by most of the technical talks. The seminar attendees had a mix of backgrounds, with one half (roughly) leaning heavily toward the PL (programming languages) side, and the other half leaning more towards the crypto side. The diversity of talks reflected this diversity of backgrounds, but there was much opportunity to meet in the middle and discuss open problems. The latter days mixed some remaining technical talks with open discussion sessions focusing on various problems and topics. In particular, participants voted to select four breakout sessions: Secure Computation Compilers, Crypto verification, Obfuscation, and Verified implementations.
This section summarizes some interesting discussions from the seminar, in three parts. First, we consider the activities involved in developing programming languages the interface with cryptography, and surveying the research of the seminar participants. Second, we explore how reasoning in PL and Crypto compare and contrast, and how ideas from one area might be relevant to the other. Finally, we survey open problems identified during the discussions.
Programming languages for cryptography
One connection emerged repeatedly in the talks and discussions: the use of programming languages to do cryptography, e.g., to implement it, optimize it, and prove it correct.
Programming languages can be compiled to cryptographic mechanisms
Programming languages can make cryptographic mechanisms easier to use. For example, the systems Sharemind, ShareMonad, CBMC-GC, and WYSTERIA are all designed to make it easier for programmers to write secure multiparty computations (SMCs).
In an SMC, we have two (or more) parties X and Y whose goal is to compute a function F of their inputs x and y, whereby each party only learns the output F(x,y), but does not "see" the inputs. Cryptographers have developed ways to compute such functions, such as garbled circuits and computing on secret shares, without need of a trusted third party. These systems shield the programmer from the workings of these mechanisms, compiling normal-looking programs to use the cryptography automatically. The languages can also provide additional benefits, such compiler-driven optimization.
This line of work is motivated by privacy- and/or integrity-preserving outsourcing of computation, e.g., as promised by The Cloud. Programming languages have been designed to compile to other kinds of crypto aside from SMC, like zero-knowledge proofs and authenticated data structures. Examples include Geppetto, SNARKs for C and LambdaAuth.
Combinations also exist, such as compiling to support Authenticated SNARKs.
Programming languages for implementing cryptography
The above languages aim to make computations secure through the use of cryptography, introduced by the language's compiler. We are also interested in implementing the cryptographic algorithms themselves (e.g., for symmetric or public key encryption). The implementation task could be made easier, more efficient, or more secure by employing a special-purpose language. Two representatives in this space are CAO and Cryptol. Both are domain-specific, and both make it easier to connect implementations to tools for automated reasoning. The Seminar also featured work on synthesizing cryptography (block ciphers) from constraint-based specifications.
Programming languages methods to prove security of cryptographic protocols and/or their implementations
When a cryptographer defines a cryptographic protocol, she must prove it is secure. Programming languages methods can be used mechanically confirm that a proof of security is correct. Systems like ProVerif, CryptoVerif, EasyCrypt and CertiCrypt support cryptographic protocol verification, with varying kinds of assurance. These systems build on ideas developed in general verification systems like Coq or Isabelle.
Likewise, when a programmer implements some cryptography (in a language like C), she would like to formally verify that the implementation is correct (no more Heartbleed!). For example, we'd like to know that an implementation does not have side channels, it uses randomness sufficiently, it has no buffer overflows, etc. Once again, verification can be achieved using tools that are underpinned by PL methods developed in formal verification research. Frama-C and Fstar have been used to verify implementations.
Formal reasoning for PL and Crypto
Beyond using PLs as a tool for easier/safer use of Crypto, there is an opportunity for certain kinds of thinking, or reasoning, to cross over fruitfully between the PL an Crypto communities. In particular, both communities are interested in formalizing systems and proving properties about them but they often use different methods, either due to cultural differences, or because the properties and systems of interest are simply different. During the seminar we identified both analogous, similar styles of reasoning in two communities and connection points between the different styles of reasoning.
Analogies between PL and Crypto reasoning
The Ideal/Real paradigm was first proposed by Goldreich, Micali, and Widgerson in their work on Secure Multiparty Computation (SMC) [3,4], and further developed by Canetti in his universal composability (UC) framework. The basic idea is to treat a cryptographic computation among parties as if it were being carried out by a trusted third party (the "ideal"), and then prove that the actual implementation (the "real") emulates this ideal, in that the parties can learn nothing more than they would in a protocol involving a trusted party. (The paradigm also handles correctness, robustness, and other emergent properties.)
This is a classic kind of abstraction also present in formal verification: If a program P uses a module M that implements specification S, then relinking P to use M', which also implements S, should preserve the correct execution of P. One talk, by Alley Stoughton, made the interesting observation that the Real/Ideal notion might be a suitable organizing principle around which to verify software is secure, essentially by using the Ideal as a richer kind of security property than is typical in PL (which often looks at properties like information flow control), and using abstraction in key ways to show it is enforced.
In the Crypto setting, the Real-to-Ideal connection is established probabilistically, considering a diminishing likelihood that a computationally bounded adversary would be able to tell the difference between the Real and Ideal. In the PL setting, the specification-implementation connection is established using methods of formal reasoning and logic, and usually without considering an adversary.
However, a notion of adversary does arise in PL-style reasoning. In particular, an adversary can be expressed as a context C[.] into which we place a computation e of interest that is subject to that adversary; the composition of the two is written C[e]. One PL property in this setup with a Crypto connection is contextual equivalence, which states that e and e' are equivalent iff for all contexts C the outcome of running C[e] is the same as running C[e'] - e.g., both diverge or evaluate to the same result. In a PL setting this property is often of interest when proving that two different implementation of the same abstract data type have the same semantics (in all contexts). In a security setting we can view the contexts as adversaries, and e and e' as the Real and Ideal.
Another useful property is full abstraction. This property was originally introduced to connect an operational semantics to a denotational semantics - the former defines a kind of abstract machine that explains how programs compute, while the latter denotes the meaning of a program directly, in terms of another mathematical formalism (like complete partial orders). Both styles of semantics have different strengths, and full abstraction connects them: it requires that e and e' are observationally equivalent (according to the operational semantics) if an only if they have the same denotation (according to the denotational semantics).
Connections between PL and Crypto
The seminar also brought out ways that PL-style reasoning can be connected to Crypto-style reasoning for stronger end-to-end assurance of security. One connection point was at the Real/Ideal boundary. In particular, for privacy-preserving computation (or computation preserving some other security property), Crypto-style reasoning can first be used to establish that the Real emulates the Ideal, and then PL-style reasoning can consider the security of the Ideal itself.
For example, consider the setting of SMC. Here, we have two (or more) parties X and Y that wish to compute a function F of their inputs x and y, whereby each party only learns the output F(x,y), but does not "see" the inputs. That is, the security goal is to establish that the Real computation of F(x,y) is indistinguishable from the Ideal model of executing F at a trusted third party. While Crypto can establish that a technique like garbled circuits effectively emulates a trusted third party, it does not establish that the output of F, even when computed by the Ideal, does not reveal too much information. For example, if F(x,y) = y then X learns Y's value y directly. More subtly, if F(x,y) = x > y, then if x = 1, an output of TRUE tells X that Y's value y = 0. PL-style reasoning can be applied to functions F to establish whether they are sufficiently private, e.g., by using ideas like knowledge-based reasoning or type systems for differential privacy. PL-style reasoning about knowledge can also be used to optimize SMCs by identifying places where a transformation would not affect security (e.g., no more is learned by an adversary observing the transformed program), but could improve performance. Another way to connect PL to Crypto is to factor security-sensitive computations into general-purpose and cryptographic parts. Then PL-style methods can be used to specify the overall computation with the Crypto parts carefully abstracted out. The proof of security then follows a PL approach, assuming guarantees provided by the Crypto parts, which are separately proved using Crypto techniques. In a sense we can think of the PL techniques as employing syntactic/symbolic reasoning, and the Crypto techniques employing computational/probabilistic reasoning.
This is the approach taken in LambdaAuth, a language extension for programming authenticated data structures (in the style of Merkle trees), in which the key idea involving the use of cryptographic hashes was abstracted into a language feature, and the proof of security combined a standard PL soundness proof along with a proof of the assumption that hash collisions are computationally difficult to produce. Recent work by Chong and Tromer on proof-carrying data similarly considers a language-level problem and proves useful guarantees by appealing to abstracted cryptographic mechanisms. Likewise, work on Memory Trace Obliviousness reasons about Oblivious RAM abstractly/symbolically in a PL setting to prove that the address trace of a particular program leaks no information.
Beyond work that is being done, one goal of the seminar was to identify possible collaborations on future work. PL researchers and cryptographers work on common problems from different points of view, so one obvious next step is to collaborate on these problems.
One relevant problem is side channels. Cryptographers are concerned with side channels in their implementations, e.g., to make sure the time, space, or power consumption during an encryption/decryption operation does not reveal anything about the key. Likewise, PL folk care about side channels expressed at the language level, e.g. work by Andrew Myers' group on timing channels. Both groups bring a useful perspective.
Another common problem is code obfuscation. It was cryptographers that proved that virtual black box (VBB) obfuscation is impossible, and proposed an alternative indistinguishability-based definition. PL researchers, on the other hand, have looked at language-oriented views of obfuscation effectiveness, e.g., based on abstract interpretation. Just as the halting problem is undecidable, but practical tools exist that prove termination. I believe that there is an opportunity here to find something useful, if not perfect.
Finally, the question of composability comes up in both Crypto and PL: Can we take two modules that provide certain guarantees and compose them to create a larger system while still ensuring properties proved about each module individually? Each community has notions for composability that are slightly different, though analogous, as discussed above. Can we make precise connections so as to bring over results from one community to the other? Crypto currencies, exemplified by BitCoin, are an area of exploding interest. An interesting feature about these currencies is that they provide a foundation for fair, secure multiparty computation, as demonstrated by Andrychowicz, Dziembowski, Malinowski, and Mazurek in their best paper at IEEE Security and Privacy 2014 [1, 2]. Could PL-style reasoning be applied to strengthen the guarantees provided by such computations? Cryptographic properties are often proved by making probabilistic statements about a system subject to a computationally bounded adversary. Could program analyses be designed to give probabilistic guarantees, drawing on the connection between adversary and context mentioned above, to thus speak more quantitatively about the chances that a property is true, or not, given the judgment of an analysis? How might random testing, which has proved highly useful in security settings, be reasoned about in a similar way?
We would like to thank Jonathan Katz for his initial involvement in organizing the seminar and Matthew Hammer for his help in preparing this report.
- Marcin Andrychowicz, Stefan Dziembowski, Daniel Malinowski, and Lukasz Mazurek. Secure Multiparty Computations on Bitcoin. 2014 IEEE Symposium on Security and Privacy, SP 2014, Berkeley, CA, USA, May 18–21, 2014, pp. 443–458, IEEE CS. http://dx.doi.org/10.1109/SP.2014.35
- Marcin Andrychowicz, Stefan Dziembowski, Daniel Malinowski, and Łukasz Mazurek. Secure Multiparty Computations on Bitcoin. Cryptology ePrint Archive, Report 2013/784, 2013. https://eprint.iacr.org/2013/784
- Oded Goldrcich and Ronen Vainish. How to solve any protocol problem – an efficiency improvement (extended abstract). In Carl Pomerance, editor, Advances in Cryptology – CRYPTO’87, volume 293 of Lecture Notes in Computer Science, pages 73–86. Springer Berlin Heidelberg, 1988.
- Oded Goldreich, Silvio Micali, and Avi Wigderson. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems. J. ACM, 38(3):690–728, July 1991.
- Joseph Ayo Akinyele (Johns Hopkins University - Baltimore, US) [dblp]
- David Archer (Galois - Portland, US) [dblp]
- Manuel Barbosa (University of Minho - Braga, PT) [dblp]
- Gilles Barthe (IMDEA Software - Madrid, ES) [dblp]
- Karthikeyan Bhargavan (INRIA - Paris, FR) [dblp]
- Bruno Blanchet (INRIA - Paris, FR) [dblp]
- Dan Bogdanov (Cybernetica AS - Tartu, EE) [dblp]
- Niklas Büscher (TU Darmstadt, DE) [dblp]
- Stephen Chong (Harvard University - Cambridge, US) [dblp]
- Véronique Cortier (LORIA - Nancy, FR) [dblp]
- Francois Dupressoir (IMDEA Software - Madrid, ES) [dblp]
- Cédric Fournet (Microsoft Research UK - Cambridge, GB) [dblp]
- Matthew A. Hammer (University of Maryland - College Park, US) [dblp]
- Michael Hicks (University of Maryland - College Park, US) [dblp]
- Catalin Hritcu (INRIA - Paris, FR) [dblp]
- Stefan Katzenbeisser (TU Darmstadt, DE) [dblp]
- Florian Kerschbaum (SAP SE - Karlsruhe, DE) [dblp]
- Markulf Kohlweiss (Microsoft Research UK - Cambridge, GB) [dblp]
- Boris Köpf (IMDEA Software - Madrid, ES) [dblp]
- Sven Laur (University of Tartu, EE) [dblp]
- Alex Malozemoff (University of Maryland - College Park, US) [dblp]
- Sarah Meiklejohn (University College London, GB) [dblp]
- Esfandiar Mohammadi (Universität des Saarlandes, DE) [dblp]
- Axel Schröpfer (SAP SE - Walldorf, DE) [dblp]
- Alley Stoughton (MIT Lincoln Laboratory - Lexington, US) [dblp]
- Eran Tromer (Tel Aviv University, IL) [dblp]
- Dominique Unruh (University of Tartu, EE) [dblp]
- Santiago Zanella-Béguelin (INRIA - Paris, FR) [dblp]
- programming languages / compiler
- security / cryptology
- semantics / formal methods
- Secure multiparty computation
- Authenticated data structures
- Privacy-preserving computation
- Garbled circuits
- Zero-knowledge proofs
- Mechanized verification
- Program analysis
- Domain-specific languages