The PDF version of this article can be found here.
We report on work performed in the Mobile Resource Guarantees (MRG) project [3] on developing proof-carrying code architectures for resource-related properties. The emergence of a global computing environment, where programs are routinely exchanged between principals that do not know or trust each other, has brought the issue of code verification to the forefront of attention: security-aware code consumers are simply not willing to execute downloaded code unless they can convince themselves that locally defined policies will be respected throughout the execution. Complementing mechanisms such as cryptographic techniques or code signatures that assert the authorship or the integrity of transmitted code, the proof-carrying code paradigm (PCC, [13]) has emerged as a powerful mean to guarantee code-inherent properties such as type-correct behaviour or the adherence to security policies. Contrary to earlier efforts in program verification, PCC and related technologies such as typed assembly languages (TAL, [11]) and foundational PCC [1] usually concern properties that are easier to verify than functional correctness, and that are applied on compiled rather than source code. Driven by the position of the sceptical code recipient, this paradigm requires code producers to equip programs with evidence of well-behaviour. This evidence must be transmitted as a formal certificate in a format that allows the recipient to validate its correctness by means of a fully automatic (and trusted) verification process.
The techniques used to generate certificates do not matter from the point of view of the recipient, but the ability to derive certificates during the compilation process is crucial for the success of any PCC architecture. In order to meet this goal, certifying compilers combine program annotations with sophisticated program analysis, often formulated as type systems or abstract interpretations.
Variations of the certified code paradigm have explored different formalisms in which the result of the analysis is communicated to the recipient. While the original PCC system was based on a domain-specific logic, type systems for assembly language were used in TAL. The most radical approach is foundational PCC, where the certification logic comprises only statements that can be derived from first principles: a detailed operational model that includes the machine-level encoding of instructions is formalised in a theorem prover, and certificates are only valid if their correctness w.r.t. this operational model can be derived inside the theorem prover. In effect, this approach removes domain-specific logics (or type systems in the case of TAL) from the trusted computing base (TCB), replacing it by a single component, the proof checker.
In MRG, we followed a variation of foundational PCC where additional layers mediate between the operational model and the application type system. Conceptually, the theorem prover remains a trusted component as the justification of each layer is given by its formal derivation from more basic layers. Pragmatically, however, certificates may be formulated at more abstract levels than the operational semantics, and the choice of certificate format becomes essentially a matter of negotiation between code producer and consumer. At levels where full automation is available, the performance benefits of stand-alone proof checkers may in practise justify their inclusion in the TCB, since their application can always be replaced by unfolding a certificate to a proof at a lower level of abstraction, together with an invocation of the theorem prover: at each level, a certificate may be seen as a tactic/proof script that limits the need for proof search at the lower level.
The lowest additional layer is given by a general-purpose program logic. Its role is to eliminate the repeated verification of program units through the introduction of invariants, and to serve as a platform at which various higher level logics may be unified. The latter purpose makes logical completeness of the program logic a desirable property, which has hitherto been mostly of meta-theoretic interest. Of course, soundness remains mandatory, as the trustworthiness of any application logic defined at higher levels depends upon it.
The second level is given by a system of derived assertions that represent an interpretation of a type system, possibly with respect to a particular compilation strategy. In contrast to the general-purpose logic, these derived proof systems are not expected to be logically complete, but they should provide support for automated proof search. In the case of the logic for heap consumption described below, this is achieved by formulating a system of derived assertions whose level of granularity is roughly similar to the high-level type system. However, the rules are expressed in terms of code fragments in the low-level language. Since the side conditions of the typing rules are computationally easy to validate, automated proof search is supported by the syntax-directness of the typing rules. At points where syntax-directness fails - such as recursive program structures - the necessary invariants are provided by the type system. Depending on the property of interest, the second level may be further refined into a hierarchy of proof systems, for example if parts of the soundness argument of derived assertions can be achieved by different type systems.
In general, the described approach builds on the observation that type systems represent a syntactic method to determine subsets of programs satisfying particular properties, trading off logical completeness for decidability, while (program) logics are more expressive, but usually undecidable.
In the remainder of this note, we describe practical work carried out in the MRG project that instantiates the above approach to the application domain ``resource consumption'', and more concretely, the allocation of heap memory. Fig. 1 shows a PCC architecture built from the components developed by MRG. On the left side, the code producer generates a JVML file from a program written in our high-level programming language ``Camelot'', using an intermediate representation ``Grail''. At the same time, the type system yields a certificate that is bundled with the JVM code to obtain a certified code package. On the consumer side, the components are taken apart, the Grail code is recovered and checked against the certificate. If the check is successful, the code may be trusted and executed on a standard JVM.
While this figure does not include the dependency on specific resource policies, we are currently working toward including other notions of resource consumption and admitting more flexible resource policies to be expressed and verified.
@_.
This primitive denotes a destructive match operation, i.e. the
node is inserted into the freelist after the components have been
extracted, so that its space can be reused during the construction
of values, as is the case in the following code for insertion sort.
type L = Nil | Cons of int * L
let ins a l = match l with Nil -> Cons(a,Nil)
| Cons(x,t)@_ -> if a < x
then Cons(a,Cons(x,t))
else Cons(x, ins a t)
let sort l = match l with Nil -> Nil
| Cons(a,t)@_ -> ins a (sort t)
In order to ensure that objects that are inserted into the freelist
cannot be aliased, various type systems may be imposed on Camelot
programs. In the remainder of this paper, we assume a linear typing
regime, although more permissive disciplines may be considered as
well [4].
The analysis of [7] is formulated using an
extended notion of types such that different portions of the input
can contribute a different amount to memory consumption.
For each datatype constructor, a numeric annotation
indicates the amount of heap that is required for a single build
operation using that constructor. Judgments are of the form
where T is
the (extended) type of e with respect to the context G,
while n and m represent constants describing memory requirements.
Numeric annotations in T are interpreted relative
to the size of the output data structure, and the annotations
in G refer to the size of the input data structures.
Some of the typing rules are given in Fig. 2.
The effect of a pattern match on the freelist depends on whether the match is performed destructively or not. In both cases, the branch executed in the case of an empty list has exactly the same memory behaviour as the composite expression. In the case of a non-empty list, the freelist available for the continuation grows at least by the amount k ``stored in'' the list node that is taken apart. In case of a destructive match, the cell inhabited by this node becomes available as well, which explains the additional increment in rule DMATCH.
The inference process, called LFD, for the type system consists of two stages. First, a skeleton type derivation is constructed where the numerical annotations n, k, m,... are interpreted as (rational) variables, constrained by side conditions such as the one in rule CONS. These side conditions are collected, and in a second step handed over to a linear programming solver. Any feasible solution to the linear program corresponds to a possible typing derivation.
In the context of certificate generation, the solution inferred by the analysis (if existing) is presented as a signature that contains one typing for each Camelot function. In the case of our example program, one such signature is
The compiler translates a program into Grail following a whole-program compilation approach. The resulting code contains one (static) method for each Camelot function. For example, the Grail code for insertion sort includes the method InsSort.ins, with the following (slightly pretty-printed) body:
Destructive datatype match operations lead to the insertion of the cell into a freelist (method free), and construction of a value draws a cell from the freelist (method make). The implementation of make is such that a new object is only allocated if the freelist is empty - in fact, this is the only place at which object creation occurs in the compiled code. Consequently, we may establish in-space execution if we can verify that make will never be called on an empty freelist. The verification of this property is based on a general-purpose program logic that we describe next.
Traditionally, one could argue that soundness and completeness of program logics are mostly of meta-theoretic interest. In a proof-carrying code architecture, they are of far greater importance. This obviously applies to soundness, as the code consumer bases the decision whether to accept a transmitted program on the validation of the certificate. Completeness, on the other hand, ensures that the logic is powerful enough to support not only our current high-level type system, but any type system that has a semantic interpretation depending only on the five semantic components E, h, h', v and r. In order to achieve this confidence in our logic, we formalized the operational semantics and the program logic in the theorem prover Isabelle [16] and performed the proofs of soundness and completeness relative to the ambient logic HOL [2].
Building upon the formalisation of the general-purpose program logic, we formally derived such rules in Isabelle/HOL, including rules for the freelist management methods and destructive and non-destructive match operations. Some of them are depicted in Fig. 3.
Depending on the implementation of the certificate checker, various certificate formats can be considered. In fact, there are several aspects where logic programming techniques are playing a role in PCC: to begin with, safety properties have been expressed as deductive systems, which, as popularized by systems such as Lambda Prolog [12] and Twelf [17], can be seen as logic programs. Further, and somewhat ironically, considering the reputation that declarative programming has in certain circles, goal-oriented proof search is now being exploited to make PCC more efficient. Indeed, in the original PCC approach, certificates were full LF proof terms, so as to make the proof-checker at the consumer side as simple as possible [13]. As this yields certificates that are several orders of magnitude larger than their accompanying programs, more efficient representation of such terms were devised, where (higher-order) unification was used to reconstruct the necessary information at the client side [14]. This represented a slight shift of burden from the producer side, and it was refined once it was realized that proof checking at the consumer side could be accomplished with a very simple and trusted (non-deterministic) logic programming interpreter. In both the foundational [18] and the oracle based approach [15], the certificate consists of a goal (the safety predicate) to be executed against the logic program embodying the safety policy. A hint (such as loop annotations) or an oracle bit-stream allows the consumer to replay the proof without any backtracking.
In the MRG architecture, similarly to the foundational one, the heap logic does not need to be trusted, but its soundness proof can be verified once and for all by the consumer (most likely in the guise of off-device verification). Once installed, the syntax-oriented rules of the heap logic may be used as a logic program. In the current implementation the soundness proof is directly exploited and we designed a hand-tailored Isabelle tactic [5] which performs goal-oriented proof-search on the derived rules, while at the same time checking that any arising arithmetic and set-theoretic constraint is met. Due to both the Isabelle inductive setup (requiring judgments such as provability in the program logic to be monotonic) and the need to stay close to the JVM model, the encoding is first-order and the verification at the moment far from efficient. However, similar to the emergence of Prolog from general resolution, there is much to be gained from switching from tactical theorem proving to logic programming, not only in pragmatic terms, but in a dramatic reduction of the trusted code base. Current research is focused on providing a Twelf implementation of the heap logic, using hypothetical and parametric judgments to encode explicit contexts such as G and D, much as in [18]. We further aim to generalise our approach to an architecture with more complex resource policies that involve high-level type systems for properties different from heap consumption.
http://www.cs.cmu.edu/~twelf/guide.