https://www.dagstuhl.de/14492

### 30. November – 05. Dezember 2014, Dagstuhl-Seminar 14492

# The synergy between programming languages and cryptography

## Organisatoren

Gilles Barthe (IMDEA Software – Madrid, ES)

Michael Hicks (University of Maryland – College Park, US)

Florian Kerschbaum (SAP SE – Karlsruhe, DE)

Dominique Unruh (University of Tartu, EE)

## Auskunft zu diesem Dagstuhl-Seminar erteilt

## Dokumente

Dagstuhl Report, Volume 4, Issue 12

Motivationstext

Teilnehmerliste

## Summary

The seminar schedule consisted of three components: short, two minute introduction talks (one for each participant), longer technical talks (Section 3) and open discussions on four different subjects. The first two days consisted of the introduction talks, followed by most of the technical talks. The seminar attendees had a mix of backgrounds, with one half (roughly) leaning heavily toward the PL (programming languages) side, and the other half leaning more towards the crypto side. The diversity of talks reflected this diversity of backgrounds, but there was much opportunity to meet in the middle and discuss open problems. The latter days mixed some remaining technical talks with open discussion sessions focusing on various problems and topics. In particular, participants voted to select four breakout sessions: Secure Computation Compilers, Crypto verification, Obfuscation, and Verified implementations.

This section summarizes some interesting discussions from the seminar, in three parts. First, we consider the activities involved in developing programming languages the interface with cryptography, and surveying the research of the seminar participants. Second, we explore how reasoning in PL and Crypto compare and contrast, and how ideas from one area might be relevant to the other. Finally, we survey open problems identified during the discussions.

### Programming languages for cryptography

One connection emerged repeatedly in the talks and discussions: the use of programming languages to do cryptography, e.g., to implement it, optimize it, and prove it correct.

#### Programming languages can be compiled to cryptographic mechanisms

Programming languages can make cryptographic mechanisms easier to use. For example, the systems Sharemind, ShareMonad, CBMC-GC, and WYSTERIA are all designed to make it easier for programmers to write secure multiparty computations (SMCs).

In an SMC, we have two (or more) parties X and Y whose goal is to compute a function F of their inputs x and y, whereby each party only learns the output F(x,y), but does not "see" the inputs. Cryptographers have developed ways to compute such functions, such as garbled circuits and computing on secret shares, without need of a trusted third party. These systems shield the programmer from the workings of these mechanisms, compiling normal-looking programs to use the cryptography automatically. The languages can also provide additional benefits, such compiler-driven optimization.

This line of work is motivated by privacy- and/or integrity-preserving outsourcing of computation, e.g., as promised by The Cloud. Programming languages have been designed to compile to other kinds of crypto aside from SMC, like zero-knowledge proofs and authenticated data structures. Examples include Geppetto, SNARKs for C and LambdaAuth.

Combinations also exist, such as compiling to support Authenticated SNARKs.

#### Programming languages for implementing cryptography

The above languages aim to make computations secure through the use of cryptography, introduced by the language's compiler. We are also interested in implementing the cryptographic algorithms themselves (e.g., for symmetric or public key encryption). The implementation task could be made easier, more efficient, or more secure by employing a special-purpose language. Two representatives in this space are CAO and Cryptol. Both are domain-specific, and both make it easier to connect implementations to tools for automated reasoning. The Seminar also featured work on synthesizing cryptography (block ciphers) from constraint-based specifications.

#### Programming languages methods to prove security of cryptographic protocols and/or their implementations

When a cryptographer defines a cryptographic protocol, she must prove it is secure. Programming languages methods can be used mechanically confirm that a proof of security is correct. Systems like ProVerif, CryptoVerif, EasyCrypt and CertiCrypt support cryptographic protocol verification, with varying kinds of assurance. These systems build on ideas developed in general verification systems like Coq or Isabelle.

Likewise, when a programmer implements some cryptography (in a language like C), she would like to formally verify that the implementation is correct (no more Heartbleed!). For example, we'd like to know that an implementation does not have side channels, it uses randomness sufficiently, it has no buffer overflows, etc. Once again, verification can be achieved using tools that are underpinned by PL methods developed in formal verification research. Frama-C and Fstar have been used to verify implementations.

### Formal reasoning for PL and Crypto

Beyond using PLs as a tool for easier/safer use of Crypto, there is an
opportunity for certain kinds of thinking, or *reasoning*, to
cross over fruitfully between the PL an Crypto communities. In
particular, both communities are interested in formalizing systems and
proving properties about them but they often use different methods,
either due to cultural differences, or because the properties and
systems of interest are simply different. During the seminar we
identified both analogous, similar styles of reasoning in two
communities and connection points between the different styles of
reasoning.

#### Analogies between PL and Crypto reasoning

The Ideal/Real paradigm was first proposed by Goldreich, Micali, and Widgerson in their work on Secure Multiparty Computation (SMC) [3,4], and further developed by Canetti in his universal composability (UC) framework. The basic idea is to treat a cryptographic computation among parties as if it were being carried out by a trusted third party (the "ideal"), and then prove that the actual implementation (the "real") emulates this ideal, in that the parties can learn nothing more than they would in a protocol involving a trusted party. (The paradigm also handles correctness, robustness, and other emergent properties.)

This is a classic kind of abstraction also present in formal verification: If a program P uses a module M that implements specification S, then relinking P to use M', which also implements S, should preserve the correct execution of P. One talk, by Alley Stoughton, made the interesting observation that the Real/Ideal notion might be a suitable organizing principle around which to verify software is secure, essentially by using the Ideal as a richer kind of security property than is typical in PL (which often looks at properties like information flow control), and using abstraction in key ways to show it is enforced.

In the Crypto setting, the Real-to-Ideal connection is established probabilistically, considering a diminishing likelihood that a computationally bounded adversary would be able to tell the difference between the Real and Ideal. In the PL setting, the specification-implementation connection is established using methods of formal reasoning and logic, and usually without considering an adversary.

However, a notion of adversary does arise in PL-style reasoning. In particular, an adversary can be expressed as a context C[.] into which we place a computation e of interest that is subject to that adversary; the composition of the two is written C[e]. One PL property in this setup with a Crypto connection is contextual equivalence, which states that e and e' are equivalent iff for all contexts C the outcome of running C[e] is the same as running C[e'] - e.g., both diverge or evaluate to the same result. In a PL setting this property is often of interest when proving that two different implementation of the same abstract data type have the same semantics (in all contexts). In a security setting we can view the contexts as adversaries, and e and e' as the Real and Ideal.

Another useful property is *full abstraction*. This property was originally introduced to connect an operational semantics to a denotational semantics - the former defines a kind of abstract machine that explains how programs compute, while the latter denotes the meaning of a program directly, in terms of another mathematical formalism (like complete partial orders). Both styles of semantics
have different strengths, and full abstraction connects them: it requires that e and e' are observationally equivalent (according to the operational semantics) if an only if they have the same denotation (according to the denotational semantics).

In a Crypto setting, we might view the operational semantics as the Ideal and the denotational semantics as the Real, and full abstraction then states that despite the added observational power of the Real setting, an adversary cannot distinguish any more programs (i.e., learn any additional information) than he could in the Ideal setting. As a recent example of its use, Abadi and Plotkin used full abstraction to reason about the effectiveness of address space randomization. Another recent result is a fully abstract compiler from a type-safe high-level language to Javascript; the compiler effectively defines the denotational semantics, and the fact that it is fully abstract means that the added adversarial power that Javascript provides cannot violate the source language's semantics.

#### Connections between PL and Crypto

The seminar also brought out ways that PL-style reasoning can be connected to Crypto-style reasoning for stronger end-to-end assurance of security. One connection point was at the Real/Ideal boundary. In particular, for privacy-preserving computation (or computation preserving some other security property), Crypto-style reasoning can first be used to establish that the Real emulates the Ideal, and then PL-style reasoning can consider the security of the Ideal itself.

For example, consider the setting of SMC. Here, we have two (or more) parties X and Y that wish to compute a function F of their inputs x and y, whereby each party only learns the output F(x,y), but does not "see" the inputs. That is, the security goal is to establish that the Real computation of F(x,y) is indistinguishable from the Ideal model of executing F at a trusted third party. While Crypto can establish that a technique like garbled circuits effectively emulates a trusted third party, it does not establish that the output of F, even when computed by the Ideal, does not reveal too much information. For example, if F(x,y) = y then X learns Y's value y directly. More subtly, if F(x,y) = x > y, then if x = 1, an output of TRUE tells X that Y's value y = 0. PL-style reasoning can be applied to functions F to establish whether they are sufficiently private, e.g., by using ideas like knowledge-based reasoning or type systems for differential privacy. PL-style reasoning about knowledge can also be used to optimize SMCs by identifying places where a transformation would not affect security (e.g., no more is learned by an adversary observing the transformed program), but could improve performance. Another way to connect PL to Crypto is to factor security-sensitive computations into general-purpose and cryptographic parts. Then PL-style methods can be used to specify the overall computation with the Crypto parts carefully abstracted out. The proof of security then follows a PL approach, assuming guarantees provided by the Crypto parts, which are separately proved using Crypto techniques. In a sense we can think of the PL techniques as employing syntactic/symbolic reasoning, and the Crypto techniques employing computational/probabilistic reasoning.

This is the approach taken in LambdaAuth, a language extension for programming *authenticated data structures* (in the style of Merkle trees), in which the key idea involving the use of cryptographic hashes was abstracted into a
language feature, and the proof of security combined a standard PL soundness proof along with a proof of the assumption that hash collisions are computationally difficult to produce. Recent work by Chong and Tromer on proof-carrying data similarly considers a language-level problem and proves useful guarantees by appealing to abstracted cryptographic mechanisms. Likewise, work on Memory Trace
Obliviousness reasons about Oblivious RAM abstractly/symbolically in a PL setting to prove that the address trace of a particular program leaks no information.

### Open problems

Beyond work that is being done, one goal of the seminar was to identify possible collaborations on future work. PL researchers and cryptographers work on common problems from different points of view, so one obvious next step is to collaborate on these problems.

One relevant problem is *side channels*. Cryptographers are concerned
with side channels in their implementations, e.g., to make sure the time, space, or power consumption during an encryption/decryption operation does not reveal anything about the key. Likewise, PL folk care about side channels expressed at the language level, e.g. work by Andrew Myers' group on timing channels. Both groups bring a useful perspective.

Another common problem is *code obfuscation*. It was cryptographers that
proved that virtual black box (VBB) obfuscation is impossible, and proposed an alternative indistinguishability-based definition. PL researchers, on the other hand, have looked at language-oriented views of obfuscation effectiveness, e.g., based on abstract interpretation. Just as the halting problem is undecidable, but practical tools exist that prove termination. I believe that there is an opportunity here to find something useful, if not perfect.

Finally, the question of *composability* comes up in both Crypto and PL: Can we take two modules that provide certain guarantees and compose them to create a larger system while still ensuring properties proved about each module individually? Each community has notions for composability that are slightly different, though analogous, as discussed above. Can we make precise connections so as to bring over results from one community to the other? Crypto currencies,
exemplified by BitCoin, are an area of exploding interest. An interesting feature about these currencies is that they provide a foundation for fair, secure multiparty computation, as demonstrated by Andrychowicz, Dziembowski, Malinowski, and Mazurek in their best paper at IEEE Security and Privacy 2014 [1, 2]. Could PL-style reasoning be applied to strengthen the guarantees provided by such computations? Cryptographic properties are often proved by making probabilistic
statements about a system subject to a computationally bounded adversary. Could program analyses be designed to give probabilistic guarantees, drawing on the connection between adversary and context mentioned above, to thus speak more quantitatively about the chances that a property is true, or not, given the judgment of an analysis? How might random testing, which has proved highly useful in security settings, be reasoned about in a similar way?

#### Aknowledgements

We would like to thank Jonathan Katz for his initial involvement in organizing the seminar and Matthew Hammer for his help in preparing this report.

### References

- Marcin Andrychowicz, Stefan Dziembowski, Daniel Malinowski, and Lukasz Mazurek. Secure Multiparty Computations on Bitcoin. 2014 IEEE Symposium on Security and Privacy, SP 2014, Berkeley, CA, USA, May 18–21, 2014, pp. 443–458, IEEE CS. http://dx.doi.org/10.1109/SP.2014.35
- Marcin Andrychowicz, Stefan Dziembowski, Daniel Malinowski, and Łukasz Mazurek. Secure Multiparty Computations on Bitcoin. Cryptology ePrint Archive, Report 2013/784, 2013. https://eprint.iacr.org/2013/784
- Oded Goldrcich and Ronen Vainish. How to solve any protocol problem – an efficiency improvement (extended abstract). In Carl Pomerance, editor, Advances in Cryptology – CRYPTO’87, volume 293 of Lecture Notes in Computer Science, pages 73–86. Springer Berlin Heidelberg, 1988.
- Oded Goldreich, Silvio Micali, and Avi Wigderson. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems. J. ACM, 38(3):690–728, July 1991.

**Summary text license**

Creative Commons BY 3.0 Unported license

Gilles Barthe, Michael Hicks, Florian Kerschbaum, and Dominique Unruh

## Classification

- Programming Languages / Compiler
- Security / Cryptology
- Semantics / Formal Methods

## Keywords

- Secure multiparty computation
- Authenticated data structures
- Privacy-preserving computation
- Garbled circuits
- Optimization
- Zero-knowledge proofs
- Mechanized verification
- Program analysis
- Compilation
- Domain-specific languages