Programming Parallel Computers: Part 6C

May 17, 2020 20:32 · 842 words · 4 minute read q instead lot see know

Let us now look at another simple but effective technique for designing parallel algorithms. It is called “pointer jumping”, and it is suitable in many situations in which you would like to follow pointers in some kind of a linked data structure. It is usually easiest to think about this so that your data structure is stored in some array p. There p[x] indicates which element is the “successor” of x. For example, in this illustration, the successor of element 0 is element 2, and the successor of element 2 is element 1, and element 1 doesn’t have any successor.

00:45 - You can use this kind of an array to represent for example linked lists or rooted trees. Now let’s say we are given such an array, and you’d like to know how far element x is from the end. Meaning, how many times you need to follow links to find the end of the list or the root of the tree. There is a trivial sequential algorithm, in which you just follow pointers one by one. But there you can’t do anything in parallel.

01:16 - And if you start to think about this, it may feel like this would be an example of a problem that is inherently sequential. After all, how can you possibly follow a linked list in any other way than by taking steps one by one? But there is a very simple solution that makes it possible to solve this type of a problem very efficiently in parallel! The technique is called “pointer jumping”. And the whole idea is basically just this one line. That’s pretty much all that there is to it. Let me try to explain it more carefully. We are given some array p, where p[x] tells who is the successor of x. We compute here a new array q.

02:02 - And for all x in parallel, we set q[x] = p[p[x]]. So q[x] will tell which element is exactly 2 hops away from x. Clearly, q is easy to compute in parallel, you can find p[p[x]] independently for all elements x. OK, so what happened is that we computed an array q, which also represents a linked data structure, but 1 hop in q corresponds to 2 hops in p. So we could now make progress 2 times faster by following links in q instead of p.

02:42 - Good, but we can do more! We can simply repeat this procedure, and compute an array in which we have shortcuts of length 4. And then shortcuts of length 8. And so on. Until we have shortcuts that are so long that you could directly jump from anywhere to the very end! And this already happens after a logarithmic number of parallel steps. So this way you can quickly compute shortcuts that make it possible to “fast-forward” in a linked data structure. Let’s look at a concrete example. Let’s say we have got some kind of a linked data structure, it might represent a list, or a tree, I’ve shown both examples here. And we’d like to know for all nodes how far it is from the end.

03:31 - First, everyone sees if they are themselves the end, or their successor is the end, so we know which nodes are at distance 0 or 1 from the end. And then there are all the other nodes about which we now only know that they are at distance 2 or more. Good, so let’s do some pointer jumping. We find for all nodes who is exactly 2 steps away from it. We get these new orange pointers. And now we can also see if we learned something new about the distances. Everyone can just follow the orange pointer, and see if the distance of the orange successor is known.

04:16 - For example, if your orange successor is at distance 1 from the end, you know that you have to be at distance 3 from the end. Good. So now we have got figured out for everyone if the distance is 0, 1, 2, 3, or more, and we have these pointers that now represent 2 hops. We can just do pointer jumping again. Now the orange links represent 4 hops. And we can again follow these shortcuts and see if the other end already knows the distance. Here, for instance, the shortcut takes us to someone at distance 2, so we know we are at distance 6. And this way we now know distances already up to 7 hops.

05:03 - And we can repeat this until everything is known, at each step doubling the distance up to which we see. Quick and easy. Even though the technique is very easy to use, it is not necessarily easy to come up with this idea, as at first it might seem like the whole idea of following pointers in a linked list is inherently sequential! Surprisingly, we can speed up also such tasks a lot with the help of parallelism. .