On Power-Law Relationships of the Internet Topology
by Michalis Faloutsos, Petros Faloutsos, Christos Faloutsos
url show details
Details
type: | misc | booktitle: | SIGCOMM | year: | 1999 | month: | jun # "~17 | journal: | Computer Communication Review | annote: | Michalis Faloutsos (U.C. Riverside; Dept . of Comp . Science); Petros Faloutsos (U. of Toronto; Dept . of Comp . Science); Christos Faloutsos (Carnegie Mellon Univ.; Dept . of Comp . Science); | volume: | 29 | publisher: | ACM SIGCOMM | pages: | 251--262 | abstract: | Despite the apparent randomness of the Internet, we discover some surprisingly simple power-laws of the Internet topology. These power-laws hold for three snapshots of the Internet, between November 1997 and December 1998, despite a 45\% growth of its size during that period. We show that our power-laws fit the real data very well resulting in correlation coefficients of 96\% or higher. Our observations provide a novel perspective of the structure of the Internet. The power-laws describe concisely skewed distributions of graph properties such as the node outdegree. In addition, these power-laws can be used to estimate important parameters such as the average neighborhood size, and facilitate the design and the performance analysis of protocols. Furthermore, we can use them to generate and select realistic topologies for simulation purposes. | number: | 4 | address: | New York |
|
|
You need to log in to add tags and post comments.
Knowing the properties of Internet topology is very essential while designing some network protocols, such as those employing broadcast and flooding techniques. It can be further helpful to generate network simulations which are closer to real world. In addition, we can use it in estimating future network development.
The authors studied the Internet topology in this paper and had some big findings on fundamental properties of the network. They discovered three power laws, which are fundamental breakthroughs, by studying three snapshots of the Internet between 1997 and 1998 (each pair of adjacent time points are about half year apart):
1. The out-degree, d, of a node is proportional to the rank of the node (in descending order of the out-degree), r, to the power of a constant, R.
2. The frequency, f, of an out-degree, d, is proportional to the out-degree to the power of a constant, O.
3. The eigen-values of a graph (in descending order) are proportional to the order to the power of a constant.
They used correlation coefficiency to verify the findings, and showed that they are statistically significant.
Other contributions of the paper were applications on important graph metrics. They gave an estimation of average neighborhood size which greatly outperformed previous estimation formula. Besides, a formula was given to predict the number of edges in a network.
The overall impact of the paper was huge, because some of the findings broke some simple intuitions that people may have at that time. It was one of the highly cited papers in 1999.
Issues:
1. The close relation between the first and second power law was under-studied. The frequency, f, is essentially (almost) the first derivative of rank, r, with respect to out-degree, d.
d proportional to r^R => r proportional to d^(1/R) => r’ proportional to d^(1/R-1)*d’ => f proportional to d^(1/R-1)*d’ => f proportional to d^(1/R-1) (because d’=1)
Therefore, the exponent in Power-Law 2, O, is about 1/R-1, as long as both of the two laws are fitted well on the dataset. It is verified by the exponents, O’s and R’s, presented in the paper.
In short, the first two power laws are practically the same.
2. I also doubt that big eigen-values could have something to do with large out-degree nodes, though it could be hard to prove that. Besides, there is no information provided that could tell us the potential usage of Law 3.
3. No explanation has been given on the values of exponents in the power laws so far, otherwise it would be perfect.
4. (This is a minor point.) In today’s perspective, the value of average neighborhood size is limited, because broadcast or flooding is very rare and usually limited within a small region due to practical and financial concerns.
Although running an experiment with the real system is crucial, a simulated model is important as well. Before I read this paper, the Internet seems to be impossible to model correctly in my opinion. So, I quite enjoyed this paper. However, I feel like 3 snapshots are a little bit too small. Does anyone try to do a similar study with a larger data set?
Also, I heard people try to do similar studies on properties of social network such as MSN and Facebook.
This is an very intersting and inspiring paper that studies the graph properties of real networking.
Since this paper was written 10 years ago, I am wondering if the result in paper still applies well now?
This paper observes "power laws" in graph degree distributions in data collected on the internet. They empirically fit the parameter of this distribution to the data, and observe a high correlation coefficient. This seems to be the only contribution of this paper -- no serious explanation for this phenomenon is presented.
Of course, this is an interesting property of the internet, and it is important that the observation be made. This paper records the results of looking at the data, and will help others make more realistic modelling assumptions when simulating network protocols.
How do the graph-theoretic quantities observed here characterize the performance (e.g. throughput, robustness, etc) of a network? The authors mention such characterization may be possible in the discussion section, but has anyone actually bothered?
For Wittawat's comment, Jure Leskovec and Eric Horvitz have a recent paper in WWW'08 analyzing network arising from MSN. The link is here: http://www.cs.cmu.edu/~jure/pubs/msn-www08.pdf.
Besides power law distribution, log-normal, double-pareto, etc. are also quite interesting.
Actually, numerous systems in nature and even artificial structures are found to have the network property of power laws. Although this paper focuses on the interconnections of domain level networks, I guess that lots of application level end-to-end overlay networks also should have this power law properties.
What are the other possible application of this property in inter-domain level? And, also, have these findings helped improve some network protocols so far?
I thought this was a great paper. The main contribution is finding three power laws that model the data well with high correlation coefficients. I am interested in how follow-up work was able to take into account these power laws in designing/improving existing protocols. The paper suggests that this would be one of the main contributions: coming up with rules that could shape new protocols. I would be interested to see what sort of efficiency gains could be made by designing new protocols based on the findings of this paper, at both levels (router level and intra-domain level).
I think this paper would be useful to model the Internet topology, especially for inter-domain-level topology. However, as Yi Wu mentioned, the data used in the paper is from 1997-1998, is there any study that studies more updated data? Also, as we see from the other paper that this power law is not quite realistic for highly engineered system like router-level Internet topology, would this power-law concept be realistic for inter-domain-level topology? For me, inter-domain-level topology seems to be less engineered; but anyone know the answer?
This paper studies three snapshots of the internet and identifies relatively simple power-laws that seem to correlate well with some internet-topology metrics that the authors introduce. The presented results can lead to more efficient protocol design, more accurate artificial simulation models of the internet, as well as good first-order estimates about the evolution of the topology of the internet. An additional contribution of this paper is the introduction of new graph metrics that are well-suited for studying not only the internet, but other natural networks as well (e.g. automobile networks). Like others, I also wonder if there has been any follow-up work in this area? I'm also curious if other papers that use simple models to capture the internet topology ever verified their models against the power-laws presented in this paper.
I found this to be a very interesting paper. The good thing about the paper I felt is the clear mention of its limitations. It seems like the scenario might be different in todays internet topology. Also I felt that three snapshots is a small dataset.
Three interesting power laws are observed in this paper. I am thinking that: is there any relationship between "small world" property and the current Internet topology?
Also the data used in this paper is a kind of "old". What does the current Internet topology look like?
Generated this based on a CAIDA AS graph from December 31, 2007 (3 months ago).
http://www.cs.cmu.edu/~vrv/degreeDist.png
So yeah, I think it still holds.
#11: This data was generated from the AS-level graphs (inter-domain graph), so this particularly applies to inter-domain topologies. It might be interesting to see if the lack of peer link inference has an effect on these results, though I suspect not. If I have time I'll try to run the analysis on the NSDI2007 graph on unearthing the peering links.
Update: here are the graphs:
CAIDA Dec 2007:
Riverside (May 2005 data, lots of peering links):
As many people have mentioned, this paper was very influential and spurred a lot of research on such power-laws.
However, there has been some later work which suggest that some of these laws are caused due to sampling bias. For example see, On the bias of Traceroute Sampling, STOC 2005(http://www-rcf.usc.edu/~dkempe/publications/bfs.pdf) and earlier work in Infocomm 2003 by Lakhina et al(Sampling biases in IP topology measurements).