An Extension of PlusCal for Modeling Distributed Algorithms - Heba Alkayed et al.

Oct 1, 2020 15:02 · 1425 words · 7 minute read processes request access model checker

Hello, my name is Heba and I’m presenting an extension of PlusCal for Modeling Distributed Algorithms. Lets start out by saying what is TLA+? TLA+ is a formal specification language. It is mainly used to model concurrent and distributed algorithms. The TLA+ Toolbox is an IDE that provides many tools such as a model checker and an interactive proof assistant for example. Since TLA+ is more of a specification formalism rather than a programming language, it can be challenging for programmers to use it initially.

00:41 - PlusCal is an algorithmic language that was created to accompany TLA+ to give the users more of a familiar syntax while maintaining the same expressiveness as TLA+. However, if we’re discussing distributed algorithms, we usually find ourselves needing to model a directive processes that into play together in order to carry out a certain task. This need is not really met naturally in PlusCal since PlusCal has more flat hierarchy when it comes to its processes, and it doesn’t really allow a nesting of processes. We propose an extension of PlusCal called distributed PlusCal that enables the users to define sub-processes to model this parallelism as well as providing communication channel constructs that can help in the communication between these sub-processes. We’re going to start out by introducing motivational example to highlight all contributions. We have chosen Lamport’s Mutex Algorithm.

01:51 - For those of you who are not aware, it’s an algorithm for mutual exclusion and distributed systems where processes request access to a critical section and each request has a logical clock value where the requests with the lower values have higher priority. A process starts out by sending a request that it needs to access the critical section to all the other processes with a logical clock value. Then it waits for the acknowledgment from all the other processes to give it a right to enter the critical section. Once all the acknowledgements are received, it enters the critical section, execute its part, and then when it leaves the critical section, it notifies all the other processes that the critical section is now released. If we’re going to model this algorithm in PlusCal, we’re going to need two process types.

02:44 - The first process type will be just a process to carry out the main algorithm. We can see here that we have a non-critical section and then you have a trystep where you multicast the message requesting the access to the critical section. Then in the following step, you wait for the acknowledgments from all the other processes. Once you get all your acknowledgements, you can move forward and enter the critical section. Eventually when you exit, you again multicast a release message for all the processes notifying them that the critical section is now released.

03:20 - We also need a helper process that can be responsible for the communication between the processes. The helper process here uses an operator that we have defined called node. Node here identifies the main process corresponding to the communicator C. Inside this block here, we’re handling requested knowledge and released messages. Because we have two processes working together, the variables must be declared globally in order for them to be accessible by both processes.

03:57 - However, if we decide to model this algorithm in distributed PlusCal, it can be done in a simpler way. For example, we will have only one process type where you have two sub-processes working together to carry out this algorithms. The first sub-process starts with this curly brace here and ends with this curly brace here. Same thing for the second one, it starts here and ends here, where the first sub-process carries out the same logic that we have discussed before you have an uncritical section and all the way to releasing the critical section and the second sub-process handles the asynchronous message reception. As for modeling channels for this specific algorithm, in PlusCal we declare a variable here.

04:49 - We give it two dimensions where p can be the sender of a message, q can be the receiver and the messages are placed within the TLA+ sequence. In distributed PlusCal however, you can simply declare a predefined type called FIFOs to represent a FIFO channel and give it the appropriate dimensions. Also for implementing an operation in PlusCal, a channel operation, you will define a macro, give it the appropriate body, and then later on call the macro. Where in distributed PlusCal you can simply use one of the predefined operators. For example here we have an operator called multicast where the first argument represents the channel name and the second argument represents an expression that specifies the intended recipients as well as the message that needs to be sent.

05:39 - Here we can see the general structure of a distributed PlusCal Algorithm. It resembles the general structure of a PlusCal Algorithm. The differences can be seen here where you can define channels to represent unordered channels. You can define FIFO’s like we have seen in our example. Here, instead of giving the process one body, you can actually give it multiple sub-processes. We have supported several operations on channels. For example, you can send a single element to a channel, you can receive as enabled when an empty channel receives a message into a variable as can be seen here in the second argument. Broadcast and multicast send messages along several channels in an array while clear empties the channel. If we take a look at unordered channels, we will see that they are declared using the keyword channel or channels. They are also based on TLA+ sets. If a channel operation is actually invoked with an ordered channel, once it is translated to TLA+, it will be using the TLA+ set operators, as we can see here and here as well.

06:58 - As for FIFO channels, they are declared using the keyword FIFO as we have seen in our example, and they are based on TLA+ sequences because we want to maintain the order between the messages within the channel itself. Once an operator is translated into TLA+, it will be using the TLA+ sequence operators such as append, head, and tail. As for the variable, it is a special variable TLA+ and PlusCal. It should present a single string equal to the label of this next statement to be executed with respect to a process. We have extended this definition to include which sub-process is involved as well.

07:45 - Here we can see the initialization of the pc variable. For example, if we are handling a process of type p1, here we’ll be giving a sequence where lbl_11 represents the first label in the first sub-process, and lbl_12 will represent the first label in the second sub-process and so on. Here is a piece of translation that was produced by our own translator. This is the same exit all set that we have discussed before when the process exits the critical section and it releases a message for all the other processes notifying them that the critical section is now released. The first thing to pay attention to is the fact that the pc variable, it also references the number 1 here, which means that we are in the first sub-process.

08:42 - These lines represent the translation of a multicasts operation when invoked with a FIFO channel. We can see here even the PC prime variable also takes into consideration in which sub-process we are. Though to conclude, we offer an extension of PlusCal called distributed PlusCal. It gives the user the opportunity to define sub-processes that you would normally see coexisting within a distributed node. We also offer a communication channel constructs that assists in the communication between these subprocesses.

09:19 - We offer backward-compatible translator that translates from PlusCal to TLA+ as well as from distributed PlusCal to TLA+. This is because we used the same translator that exists within the TLA+ Toolbox before we extended the language. For future work, we intend to introduce more types of communication channels. We are also considering defining a TLA+ standard module that the channel operations can get their definitions from. This can give the user much more flexibility to define their own operations as well as override the operations they have already introduced.

10:02 - Thank you so much for watching and feel free to take a look at our GitHub repository and try out our translator. Thanks. .