Andrés A. Aristizábal, Hugo A. López, Camilo Rueda
Universidad Javeriana Cali
|Editor: Frank Valencia
The P2P protocols used in various tools have to maintain a certain amount of important properties to guarantee its well functioning. One class of the most relevant P2P protocols are those concerned to security. Properties like secrecy, anonymity and non-traceability have been studied in the literature in order to overcome security risks . Anonymity itself, is considered of the essence of any peer to peer protocol, since the participants on the network wish to establish communications and share resources without revealing their identity. Similarly, secrecy is considered crucial, since messages transmitted and managed between the distributed components in the network shall be kept as a secret for an entity outside the peer to peer group.
Despite the popularity of this kind of protocols, the importance of maintaining security matters within them and the existence of different calculi to reason about protocols, to the best of our knowledge, little has been done in modelling P2P protocols using process languages.
MUTE is a P2P tool for sharing and transmitting resources in a highly dynamic distributed network . It is based on its particular searching protocol, which claims to guarantee an anonymous way of communicating data, in a secure way, through all the P2P network. In spite of being a very well known peer to peer protocol, MUTE has only been informally described.
We shall use SPL, a specification language for security protocols developed by Winskel and Crazzolara to give MUTE a formal specification for the first time. We then use SPL reasoning techniques to verify a secrecy property againts attacks of an outsider.
SPL has strong a similarity with Concurrent Constraint Programming (CCP) . Just like CCP, SPL is operationally defined in terms of configurations containing items of information (messages) which can only increase during evolution. Such a monotonic evolution of information is akin to the notion of monotonic store central to CCP and a source of its simplicity.
One contribution of this paper is to give, to our knowledge, the first formal model of MUTE which abstracts away from details not concerned with secrecy issues. Another contribution is to bear witness of the applicability of SPL and its proof techniques for modelling and reasoning about protocols. The work in the present paper represent our first approach towards the use of SPL as formalism for specifying and verifying P2P protocols. The current work has been submitted to a workshop in P2P systems.
We shall proceed as follows: We extract a formal model directly from the implementation code. Then, using the SPL formalism along with its compositional power, we establish the formal specification of the MUTE protocol searching phase, modeling its components as a set of processes which work together to achieve the main goal of the protocol. Finally we use the proof techniques of SPL to prove a secrecy property for the messages in the network, taking into account an outsider malicious entity.
The paper is structured as follows. In the next section we present a brief summary of preliminaries, including a short introduction of the SPL calculus. In section 3, we explain the MUTE protocol, presenting an intuitive representation, as well as its formalisation on SPL. In the following section, we follow the SPL proof techniques scheme to verify the secrecy property for messages behind a passive outsider in the MUTE protocol. In section 5 we discuss some related work and in the last chapter we give out some concluding remarks, as well as future work.
This section presents a brief overview of SPL (Security Protocol language), a process calculus for security protocols proposed by Winskel and Crazzolara in . The full coverage of the calculus is given in .
SPL is a process calculus designed to model protocols and prove their security properties by means of transitions and event-based semantics. SPL is based on the Dolev-Yao Model , which states that cryptography is unbreakable and the spy is an active intruder capable of intercept, modify, replay and suppress messages. The calculus is operationally defined in terms of configurations containing items of information (messages) which can only increase during evolution, modelling the fact that, in an open network an intruder can see and remember any message that was ever in transit.
The syntactic entities SPL are described below:
The rest of the elements of SPL syntactic set are defined in Table 1(a), where Pub(v), Priv(v), and denote the generation of public, private and shared keys respectively. We use the vector notation to denote a list of elements, possibly empty, s1, s2, ..., sn.
The output process generates a set of fresh distinct names (nonces) for the variables Then it outputs the message (i.e., with each xi replaced with ni) in the store and resumes as the process . The output process binds the occurrence of the variables in and p. As an example of a typical output, can be viewed as an agent posting a message with a nonce n and its own identifier encrypted with the public key of an agent . We shall write simply as if the is empty.
The input process waits for a message in the store that matches (the pattern) the message for some instantiation of its variables , and . The process resumes as p with the chosen instantiation. The input process is the other binder in SPL binding the occurrences of in and p. As an example of a typical input, can be seen as an agent waiting for a message of the form encrypted with its public key : If the message of pA above is in the store, the chosen instantiation for matching the pattern could be for and for . When no confusion arises we will sometimes abbreviate as .
Finally, denotes the parallel composition of all . For example in the processes and above run in parallel so they can communicate. We shall use to denote an infinite number of copies of in parallel. We sometimes write to mean
The syntactic notions of free variables and closed process/message are defined in the obvious way. A variable is free in a process/message is has a non-bound occurrence in that process/message. A process/message is said to be closed if it has no free variables.
SPL has a transition semantics over configurations that represents the evolution of processes. A configuration is defined as where p is a closed process term (the process currently executing), s a subset of names (the set of nonces so-far generated), and is a subset of variable-free messages (i.e., the store of output messages).
The transitions between configurations are labelled by actions which can be input/output and maybe tagged with an index i indicating the parallel component performing the action. Actions are thus given by the syntax where is as a set of names, i as an index and a closed message.
Intuitively a transition says that by executing the process p with s and t evolves into p' with s' and t'. The new set of messages t' contains those in t since output messages are meant to be read but not removed by the input processes. The rules in Table 1(b) define the transitions between configurations. The rules are easily seen to realize the intuitive behaviour of processes given in the previous section.
Nevertheless, SPL also provides an event based semantics, where events of the protocol and their dependencies are made more explicit. This is advantageous because events and their pre and post-conditions form a Petri-net, so-called SPL nets.
Although transition semantics provide an appropriate method to show the behaviour of configurations, these are not enough to show dependencies between events, or to support typical proof techniques based on maintenance of invariants along the trace of the protocols. To do so, SPL presents an additional semantics based in events that allow to explicit protocol events and their dependencies in a concrete way.
SPL event-based semantics are strictly related to persistent Petri nets, so called SPL-nets  defining events in the way they affect conditions. The reader may find full details about Petri Nets and all the elements of a SPL-Nets in AppendixA [Editor Note: the appendix is available in the PDF version of this document] and , below we just recall some basic notions.
In the event-based semantics of SPL, conditions take an important place as they represent some form of local state. There are three kinds of conditions: control, output and name conditions (denoted by , and , respectively). -conditions includes input and output processes, possibly tagged by an index. -conditions are the only persistent conditions in SPL-nets and consists of closed messages output on network. Finally, -conditions denotes basically the set of names being used for a transition. In order to denote pre and post conditions between events, let denote the set of control, name and output preconditions, and the equivalent set of postconditions. An SPL event e is a tuple of the preconditions and postconditions of e and each event e is associated with a unique action . Figure 1 gives the general form of an SPL event. In the Appendix we will give the events for the protocol MUTE according to the SPL Event-Semantics. The exact definition of each element of the semantics can be can be found in . For space limitations,here we shall recall some and illustrate others.
To illustrate the elements of the event semantics, consider a simple output event , where are distinct names to match with the variables . The action corresponding to this event is the ouput action Conditions related with this event are:
Where stands for the initial control conditions of a closed process p: The set is defined inductively as is is an input or an output process, otherwise
|Where are distinct names|
|Where names, are keys, and are closed messages|
where, if is a set, denotes the set
The MUTE protocol works in a P2p network as a tool to communicate requests of keywords through the net, so that an specific file can be found and then received . This protocol is composed of two main phases, the searching and the routing part. We will focus directly in its first phase, since it is the most related to the security concerns proposed in our work.
This protocol aims to provide an easy and effective search while protecting the privacy of the participants involved. It is inspired in the behaviour of ants in the search for food. The analogy is accomplished representing each ant as a node of a network, files requested as food, and pheromones as traces. In this way, one of the key properties of this model is the inherent anonymity of the protocol, because, like the ants that do not know the shortest path between the food and the anthill, peers are unaware of the overall environment layout and MUTE messages must be directed through the network using only local hints.1
Since the MUTE protocol claims to have anonymous users, none of the nodes in the P2P network knows where to find a particular recipient. Each node in the MUTE network contains direct connections to other nodes in the network in order to achieve its desired search. This nodes are called "neighbours" and through these, messages are secretly passed, either as a request or as an answer, in such a way that no agent outside the peer to peer network could manage to understand any of these data. Despite anonymity being essential on this protocol, secrecy is also one its main goals, since transmitted messages along the network involve information only concerned to the ones sharing the resources and must not be revealed to the outside world.
In spite of being already implemented and used as a tool for downloading and sharing files, to our knowledge MUTE has not yet been formally specified. Part of our work consists in abstracting from the code elements that have an impact in security.
We shall represent a P2P network as an undirected graph whose nodes represent the peers and whose edges represent the direct connections among peers. We use to denote the set of all nodes in . Given a node , Let be the set of immediate neighbours of . We use the Dolev Yoe notation stating that sends a message to For Example, consider a P2P network with peers . Suppose that is the initiator of the protocol and is the responder. In this case, can be any node inside the network, with the desired file on its store. So, requests a particular file he wishes to download. For this purpose he sends the request to the network, by broadcasting it to his neighbours. This request includes a keyword which will match the desired file. Along the searching path an unknown amount of peers will forward the request until it reaches , a peer which has the correct file, st and , where means the set of all files in the network, means the set of files of the Agent or peer in the network, the set of keywords associated to the files , and the keywords associated to the peer . Then, sends its response by means of the header of the file, again by means of a broadcast through a series of forward steps, until it reaches the actual sender . Figure 2 give a representation of the above description using Dolev Yao notation .
Here X, Y are variables which represent the forwarder peers along the path that goes from the Initiator to the responder node. This intermediate process may continue, until the target is reached. In the same way, these two variables will represent the peers which will forward the answer from the responder to the initiator. This process may be repeated several times as well.
Figure 3 illustrates a particular P2P network example in which is the initiator, are the peers involved in the forwarding process and is the receptor that sends its answer via the same three forwarders.
We use the core of the MUTE protocol in order to establish some security properties. The phases that we shall consider are the ones that involve the transmission of the keyword, the response message and the keys, leaving behind the phases of connection, and the submessages that include plaintext. We assume that The formal model in Figure 4.
We assume that the topology of the net has already been established. The agent starts searching for an own keyword. This agent broadcasts the desired keyword to all its neighbours. Its neighbours receive the message and see if the keyword matches one of their files, if at least one of the neighbours have the requested keyword, it will broadcast a response message, such that eventually the one searching for the keyword will get it and understand it as an answer to its request. The message will be forwarded by all the agents until it reaches its destiny. Otherwise, if the keyword does not match any file of the agent, then it will broadcast it to its neighbours asking them for the same keyword. The choice of having or not the right file is modeled in a non-deterministic way. This model abstracts way from issues such as the search for the best path, since it has no impact in secrecy.
Here we will establish the secrecy of MUTE for a Spy outside the P2P network.
In this way, the complete protocol includes the specification of MUTE, in Figure 4, in parallel with the Spy:
Let us recall some elements. Let be the set of headers of files, which is associated to , the set directly related to , such that each header which belongs to will be associated to a unique file belonging to (See section 3.1).
To analyze secrecy of a given protocol in SPL, one considers arbirtrary runs of the protocol.
We shall use in the theorems a binary relation between messages. Intuitively means message is a subexpression of message (See Appendix B for the exact definition)
In the following theorems we shall refer to the events of MUTE which for the sake of space are explicitely given in Appendix C.
The first secrecy theorem for the MUTE protocol concerns the shared keys of neighbours. If this shared keys are not corrupted from the start and the peers behave as the protocol states then the keys will not be leaked during a protocol run. If we assume that t0, where , then at the initial state of the run there is no danger of corruption. This will help us to prove some other security properties for MUTE.
The following theorem concerns the secrecy property for the request. It states that the keyword asked by the initiator and broadcasted through the network will never be visible for a Spy outside the peer to peer group.
By proving that the keyword sent by the initiator peer as a request never appears in the cleartext during a run of the protocol, we can affirm that a Spy outside of the network will never know that keyword, so he will never recognise the file a sender is requesting.
The next theorem states that the message sent as an answer by the responder will never appear as a cleartext during a run of the MUTE protocol, and in this way nobody outside the peer to peer boundaries will understand it.
If we prove that the answer header sent by the receiver, never appears in the cleartext during a run of the protocol we can manage to guarantee that a Spy outside the peer to peer network will never know or access the file.
To our knowledge, little work has been done in formalisation of P2P protocols using Process Calculi. In particular, the project Pepito  has started efforts in verification of properties using CCS variants in static versions of P2P protocols , in particular, correctness properties. Other analysis has been made for specific P2P functionalities, like  and . However, to our knowledge, this is the first formal attempt using process calculi to model and reason about security properties in P2P protocols.
We have proved the secrecy property for a passive outsider in the MUTE protocol. It will be interesting to explore security properties for threats inside the P2P network or an outsider who can masquerade as a trusted peer. Our future work will be the verification of the secrecy property for an insider in the MUTE protocol. In the same way, we shall explore the SPL expressiveness in order to model new cutting-edge protocols, using its own reasoning techniques, or extending it in order to verify other important security properties like non-traceability and non-malleability.