
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
by Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, Hari Balakrishnan
Public comments
#1 posted on Mar 22 2008, 19:54 in collection CMU 15-744: Computer Networks -- Spring 08
The core operation in most peer-to-peer (P2P) systems is locating data items. Given the distributed, hierarchy-less nature of P2P systems, finding any given data item in a scalable manner is a challenging problem. This paper proposes a scalable protocol, named "Chord", for lookup in a dynamic peer-to-peer system with frequent node arrivals and departures.

Key features of Chord are 1) simplicity, 2) provable correctness, and 3) provable performance. In the steady state of an N-node system, each Chord node maintains routing information about only O(log N) other nodes, and can resolve any lookup via O(log N) messages to other nodes. In addition, no more than O(log^2 N) messages are required to handle a node joining or leaving the system.

Chord simplifies the design of P2P systems and applications built on it by addressing the problems of 1) Load Balancing, 2) Decentralized Operation, 3) Scalability, 4) Availability, and 5) Naming Flexibility.

To achieve load balancing and scalability, Chord provides fast distributed computation of a hash function, "consistent hashing", that maps keys to the nodes responsible for them. The consistent hash function assigns each node and key an m-bit identifier using a base hash function such as SHA-1. A node's identifier is chosen by hashing the node's IP address, while a key identifier is produced by hashing the key. A key k is then assigned to the first node whose identifier is equal to or follows the identifier of k in a circular identifier space. This lets nodes enter and leave the network with minimal disruption. In addition, Chord keeps the hashing scalable by avoiding the requirement that every node know about every other node.
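A minimal sketch of that assignment rule (my own illustration in Python, not the authors' code; the node IPs and the key are made up):

    import hashlib
    from bisect import bisect_left

    M = 160  # SHA-1 yields 160-bit identifiers

    def ident(s):
        # Hash a string (an IP address or a key) onto the identifier circle.
        return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** M)

    def successor(node_ids, key_id):
        # First node whose identifier equals or follows key_id, wrapping
        # around the circle when key_id exceeds every node identifier.
        node_ids = sorted(node_ids)
        i = bisect_left(node_ids, key_id)
        return node_ids[i % len(node_ids)]

    nodes = [ident(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
    print(successor(nodes, ident("some-key")))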

Only a very small amount of routing information is needed to implement consistent hashing correctly: each node need only be aware of its successor, i.e. the node that follows it on the circle. To accelerate lookups, each node also keeps a finger table containing the IP address of a node halfway around the circle, a quarter of the way, an eighth of the way, and so forth, in powers of two; a sketch of these finger targets follows.
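Concretely, entry i of node n's finger table points at successor(n + 2^(i-1)) (mod 2^m); a small sketch of the target identifiers (again my own illustration, using 0-based indices):

    M = 160  # identifier bits

    def finger_targets(n):
        # The identifiers whose successors node n keeps in its finger table:
        # points 1/2, 1/4, 1/8, ... of the way around the circle from n.
        return [(n + 2 ** i) % (2 ** M) for i in range(M)]

A lookup then repeatedly forwards to the finger that most closely precedes the key, roughly halving the remaining distance on the circle at each hop, which is where the O(log N) hop bound comes from.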

To deal with nodes joining and leaving concurrently, Chord runs a stabilization protocol that keeps successor pointers up to date. These pointers are then used to verify and correct finger-table entries, resulting in fast and correct lookups. In addition, to tolerate node failures, each node maintains a "successor list" of its r nearest successors on the identifier circle, so that if node n's successor fails, n replaces it with the first live entry in the list. (A toy version of the stabilization step appears below.)
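A runnable toy version of the stabilization step, following the paper's pseudocode but with my own simplified Node class:

    class Node:
        def __init__(self, id):
            self.id, self.successor, self.predecessor = id, self, None

    def between(x, a, b):
        # True if x lies in the open interval (a, b) on the identifier circle.
        return (a < x < b) if a < b else (x > a or x < b)

    def stabilize(n):
        # Adopt a closer successor if one has joined, then notify it.
        x = n.successor.predecessor
        if x is not None and between(x.id, n.id, n.successor.id):
            n.successor = x
        notify(n.successor, n)

    def notify(n, candidate):
        # n learns that candidate might be its predecessor.
        if n.predecessor is None or between(candidate.id, n.predecessor.id, n.id):
            n.predecessor = candidate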

Theoretical analysis, simulations, and experimental results in the paper show that Chord scales well with the number of nodes, recovers from large numbers of simultaneous node failures and joins, and answers most lookups correctly even while recovering from node failures. One caveat, however, is that most of the results show a high degree of variation.

Questions:
Since the identifier length affects how well keys are distributed across the nodes in the system, instead of using "virtual nodes" to improve load balancing, would it be possible to choose an appropriate identifier length based on parameters such as the expected numbers of nodes and keys in the system?

Regarding lookup latency: besides server selection and proximity routing, are there other ways to deal with this problem?
#2 posted on Mar 22 2008, 23:07 in collection CMU 15-744: Computer Networks -- Spring 08
The algorithm is very nice and the most intuitive among DHT algorithms. Also, the correctness constraint is very simple.
From the evaluation, I found that the load on each node can differ by orders of magnitude unless quite a few virtual nodes are introduced. Since a group of virtual nodes runs on the same physical node, there is a high probability of joint failures. This will increase simultaneous joins/leaves, but the evaluation only shows results for relatively low join/fail rates.
Also, as the paper itself points out, the partitioned-ring case is not well handled. Overall, I'm not convinced that the system is robust enough to be used in production systems.

Regarding the latency issue raised in the public review, one option is aggressive caching of routes. However, this makes the problem worse when a node leaves (with or without notice). For a query, one can then guess from the cache or the fingers, or simply make a probabilistic guess and send the query to multiple nodes; duplicate copies of a query can be suppressed by nodes along the path or at the destination.
#3 posted on Mar 23 2008, 10:37 in collection CMU 15-744: Computer Networks -- Spring 08
Chord has good overall performance: it scales very well (each node only stores information about a small set of other nodes, and lookup time increases slowly with the number of nodes), and it is simple to maintain and robust to failures. In fact, to keep consistency each Chord node only needs to know its successor, and to keep the successor pointer correct a stabilization protocol is run periodically.

As the paper states (and Figure 12 shows), the performance of Chord depends on how frequently failures occur relative to how often the stabilization protocol is executed. A stabilization protocol that can run quickly and often is therefore a critical part of the Chord algorithm.

It would be interesting to know whether one could add proximity routing (as used in CAN) to decrease latency.
#4 posted on Mar 23 2008, 10:42 in collection CMU 15-744: Computer Networks -- Spring 08

#5 posted on Mar 23 2008, 13:09 in collection CMU 15-744: Computer Networks -- Spring 08
Chord provides a very intuitive and highly scalable lookup service, and the experimental results support the authors' arguments. However, I don't understand why they provided the lookup latency analysis only for very low failure/join rates (0 to 0.1 failures/joins per second). That makes sense under the assumption that no malicious user tries to join and leave the network at an extremely high frequency. I'd also like to see how many messages must be exchanged to stabilize the routing tables under such a malicious attack.
#6 posted on Mar 23 2008, 13:10 in collection CMU 15-744: Computer Networks -- Spring 08
I like the use of a hashed IP address as the node identifier. This makes it cryptographically hard for an attacker to become responsible for some keys they want to censor and then drop them. (This is mentioned in the future work section.) Similarly, it is very hard for an attacker to obtain several consecutive positions on the ring [in order to fail simultaneously and disrupt the ring topology].

Using virtual nodes could compromise this guarantee, unless the "randomly chosen" virtual node identifiers are simply the IP address hashed with some known salt.
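Something like this, for instance (my own sketch; the "#i" salt scheme is just an assumption):

    import hashlib

    def virtual_ids(ip, count):
        # Derive each virtual-node identifier from the IP plus a public
        # index, so the IDs stay verifiable rather than freely chosen.
        return [int(hashlib.sha1(f"{ip}#{i}".encode()).hexdigest(), 16)
                for i in range(count)]

    print(virtual_ids("10.0.0.1", 3))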
#7 posted on Mar 23 2008, 16:01 in collection CMU 15-744: Computer Networks -- Spring 08
I'd like to comment on Wenjie's insightful comment.

One thing to note is that you can expect networks to expand in a way that minimizes distance; it's much more likely that you'll get close-together nodes than nodes spread uniformly around the world.

We know that cost/complexity must grow to infinity as n grows to infinity, and among the functions with this property, O(log n) is pretty much the slowest-growing one you can practically achieve.

I don't know about the transitivity issue, but some other comments have mentioned good real-world performance, so maybe it's not that big of a deal.
#8 posted on Mar 23 2008, 16:09 in collection CMU 15-744: Computer Networks -- Spring 08
I agree with Wenjie. Although the finger table cuts the number of hops to log(N), not all hops are equal. Perhaps the finger table could also include path metrics, with a policy that tries to optimize the metric while maximizing the number of identifiers skipped; a toy version of such a policy is sketched below.
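Something along these lines, maybe (entirely my own formulation, not from the paper; the RTT field is a hypothetical addition to the finger table):

    RING = 2 ** 160

    def best_finger(fingers, n_id, key_id):
        # fingers: (finger_id, rtt_ms) pairs. Consider only fingers that do
        # not overshoot the key, then prefer the one giving the most
        # identifier progress per unit of measured latency.
        def dist(a, b):
            return (b - a) % RING
        usable = [f for f in fingers if dist(n_id, f[0]) <= dist(n_id, key_id)]
        return max(usable, key=lambda f: dist(n_id, f[0]) / f[1]) if usable else None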
#9 posted on Mar 23 2008, 16:39 in collection CMU 15-744: Computer Networks -- Spring 08
Chord provides a nice solution to distributed hashing. I would be interested in reading about other aspects of distributed computing: what other kinds of data-structure functionality can be distributed this nicely? How about actually computing in a distributed way? You would like to be able to run a computation on the Internet using the idle cycles of computers on the network, but in a way that tolerates frequent node loss (i.e. when users wake their computers from screen savers, those computers should be able to leave the network computation immediately and devote their full attention to their owner).
#10 posted on Mar 23 2008, 16:43 in collection CMU 15-744: Computer Networks -- Spring 08
Great paper, and I especially liked the detailed experimental results, which strongly suggest the efficiency and scalability of Chord in practice. In particular, section 6.4 looks at an interesting result: simultaneous node failures. Unlike most of the previous papers we have read, they actually use rigorous statistical methods and give the 95% confidence interval on the graph. The results suggest that Chord is robust to multiple simultaneous node failures.
#11 posted on Mar 23 2008, 16:54 in collection CMU 15-744: Computer Networks -- Spring 08
Overall a very solid paper that presents Chord, a simple, efficient, and scalable distributed lookup protocol. The Chord overlay network resembles an augmented linked list. First, a few extra pointers (the successor list) are added for robustness and resilience to failures. Second, a few more pointers (the finger-table entries) are added for performance. These extra pointers allow Chord to perform fast lookups using a technique that resembles binary search.

An issue with Chord that seems important to me and received relatively little attention is the mapping between the P2P overlay nodes and the nodes of the underlying network. O(log N) lookup hops might sound good if the traversed nodes lie along a short physical path, but imagine the case where each successive overlay hop corresponds to a very large number of hops in the underlying physical network. As Miguel (comment #4) pointed out, it would be interesting to see Chord incorporate some kind of proximity-routing heuristic like other P2P systems do (e.g. CAN, Kademlia, Pastry). According to the other assigned reading, this paper presents a new version of Chord that takes proximity routing into consideration:
Finding Nearest Neighbors in Growth-restricted Metrics
#12 posted on Mar 23 2008, 16:56 in collection CMU 15-744: Computer Networks -- Spring 08
This paper briefly addresses the issue of replicating data so that it is not lost when its host dies. As I understand it, you would derive n different keys for each datum (e.g., by using n different salts when hashing the original key), so that the datum is mapped to n different hosts. I think this solves the problem in a nice, modular way.
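A sketch of that salting idea (my illustration; the paper leaves the exact replication scheme to the application, and the "/i" salt is made up):

    import hashlib

    def replica_ids(key, n):
        # Derive n identifiers from one key, so the datum is stored at n
        # independently placed nodes on the ring.
        return [int(hashlib.sha1(f"{key}/{i}".encode()).hexdigest(), 16)
                for i in range(n)]

A read can then look up each replica identifier in turn until one of the hosts answers.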

Without virtual nodes, the network can fail catastrophically due to badly unbalanced load. When a node fails, its predecessor becomes responsible for all of its keys. Thus, if a node is assigned too many keys and fails because it lacks the bandwidth to serve them, its predecessor inherits all of those keys. That node may then fail as well, passing the keys to its own predecessor, and so on, leading to a cascade of failures that eventually brings down the network. Virtual nodes avoid this possibility, because when a node leaves, its keys are distributed among multiple other nodes instead of a single one.

However, it seems that Chord still does not load-balance well when one key is much more popular than the others. If more than one host is needed to handle all the requests for such a popular key, Chord will fail, because it does not allow a single key to be load-balanced across several nodes.
#13 posted on Mar 23 2008, 17:24 in collection CMU 15-744: Computer Networks -- Spring 08
I also like the simplicity of the protocol. In theory, SHA-1 should distribute node identifiers and key identifiers uniformly when the number of nodes is large. I am not sure how the protocol would deal with non-uniform node identities in practice; for example, many nodes behind a NAT or firewall might share the same IP address.
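One workaround (my suggestion, not something the paper specifies) would be to include the port, or some other distinguishing token, in the hashed identity:

    import hashlib

    # NATed hosts sharing 128.2.0.1 still get distinct identifiers
    # if the (made-up) port is hashed along with the address.
    node_id = int(hashlib.sha1(b"128.2.0.1:7000").hexdigest(), 16)
    print(node_id)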
#14 posted on Mar 23 2008, 18:27 in collection CMU 15-744: Computer Networks -- Spring 08
One additional comment about O(log n) hops. I think Srini can talk about this in more detail, but O(log n) is not a great guarantee for time-sensitive operations. Imagine implementing a massively multiplayer online game on Chord, where finding who holds any given data item can take O(log n) hops. Given that an average application-level hop in the wide area costs somewhere between 20 and 100 ms, anything more than a few hops makes the data stale before you even receive it; a quick back-of-the-envelope calculation is below.
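Using the paper's average lookup path length of (1/2) log2 N and the hop costs above (N = 10^6 is my arbitrary choice):

    import math

    N = 10 ** 6                          # a million nodes
    hops = 0.5 * math.log2(N)            # ~10 expected hops (paper, Sec. 6)
    print(hops * 20, "to", hops * 100)   # ~200 ms to ~1 s per lookup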
#15 posted on Mar 23 2008, 18:29 in collection CMU 15-744: Computer Networks -- Spring 08
For the issue of load-balancing popular items, you may want to take a look at this relatively recent work on a P2P DNS system that exploits the Zipf distribution of accesses to DNS names:

http://www.cs.cornell.edu/people/egs/beehive/codons.php
#16 posted on Mar 24 2008, 13:04 in collection CMU 15-744: Computer Networks -- Spring 08
Chord provides an algorithm for maintaining a topology in which each node has degree O(log N) and the graph diameter is O(log N). Since Chord, many other algorithms have been presented, some with better bounds on degree, diameter, or both. But I agree with the discussion above that the simplicity of Chord makes it the most impactful.

Question: Chord provides an exact key-location service. If we only need the correct key location with high probability, can we simplify the protocol further or reduce the bounds using randomized techniques, e.g., Bloom filters?