http://www.nature.com/nature/journal/v4 ... 378a0.html
Many complex systems display a surprising degree of tolerance against errors. For example, relatively simple organisms grow, persist and reproduce despite drastic pharmaceutical or environmental interventions, an error tolerance attributed to the robustness of the underlying metabolic network [1]. Complex communication networks [2] display a surprising degree of robustness: although key components regularly malfunction, local failures rarely lead to the loss of the global information-carrying ability of the network. The stability of these and other complex systems is often attributed to the redundant wiring of the functional web defined by the systems' components. Here we demonstrate that error tolerance is not shared by all redundant systems: it is displayed only by a class of inhomogeneously wired networks, called scale-free networks, which include the World-Wide Web [3-5], the Internet [6], social networks [7] and cells [8]. We find that such networks display an unexpected degree of robustness, the ability of their nodes to communicate being unaffected even by unrealistically high failure rates. However, error tolerance comes at a high price in that these networks are extremely vulnerable to attacks (that is, to the selection and removal of a few nodes that play a vital role in maintaining the network's connectivity). Such error tolerance and attack vulnerability are generic properties of communication networks.
Here is the critical difference between random failure and a directed attack: an attack grows more effective over time because it prunes the highest-degree nodes first, regardless of the network type, so the damage is severe even when only a very small fraction of the nodes is removed.
a, Comparison between the exponential (E) and scale-free (SF) network models, each containing N = 10,000 nodes and 20,000 links (that is, ⟨k⟩ = 4). The blue symbols correspond to the diameter of the exponential (triangles) and the scale-free (squares) networks when a fraction f of the nodes are removed randomly (error tolerance). Red symbols show the response of the exponential (diamonds) and the scale-free (circles) networks to attacks, when the most connected nodes are removed. We determined the f dependence of the diameter for different system sizes (N = 1,000; 5,000; 20,000) and found that the obtained curves, apart from a logarithmic size correction, overlap with those shown in a, indicating that the results are independent of the size of the system. We note that the diameter of the unperturbed (f = 0) scale-free network is smaller than that of the exponential network, indicating that scale-free networks use the links available to them more efficiently, generating a more interconnected web. b, The changes in the diameter of the Internet under random failures (squares) or attacks (circles). We used the topological map of the Internet, containing 6,209 nodes and 12,200 links (⟨k⟩ = 3.4), collected by the National Laboratory for Applied Network Research, http://moat.nlanr.net/Routing/rawdata/. c, Error (squares) and attack (circles) survivability of the World-Wide Web, measured on a sample containing 325,729 nodes and 1,498,353 links [3], such that ⟨k⟩ = 4.59.
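As a rough, scaled-down illustration of the experiment in panel a (a sketch of my own, not the paper's code), one can build an Erdős–Rényi graph as a stand-in for the exponential model and a Barabási–Albert graph for the scale-free model, then remove nodes either at random or in decreasing order of degree and track the diameter of the surviving giant component:

```python
# A minimal sketch (not the paper's code) of error vs. attack tolerance,
# assuming networkx is available. The ER graph stands in for the paper's
# "exponential" model; sizes are scaled down from N = 10,000 for speed.
import random
import networkx as nx

def giant_component_diameter(G):
    """Diameter of the largest connected component."""
    comp = max(nx.connected_components(G), key=len)
    return nx.diameter(G.subgraph(comp))

def remove_fraction(G, f, attack=False):
    """Remove a fraction f of nodes, randomly or highest-degree first."""
    H = G.copy()
    k = int(f * H.number_of_nodes())
    if attack:
        victims = [n for n, _ in sorted(H.degree, key=lambda x: x[1],
                                        reverse=True)[:k]]
    else:
        victims = random.sample(list(H.nodes), k)
    H.remove_nodes_from(victims)
    return H

N, M = 1000, 2000                                  # ~<k> = 4
exp_net = nx.gnm_random_graph(N, M, seed=1)        # "exponential" stand-in
sf_net = nx.barabasi_albert_graph(N, 2, seed=1)    # scale-free (PA)

for f in (0.0, 0.01, 0.02, 0.05):
    for name, G in (("E", exp_net), ("SF", sf_net)):
        d_err = giant_component_diameter(remove_fraction(G, f, attack=False))
        d_att = giant_component_diameter(remove_fraction(G, f, attack=True))
        print(f"{name} f={f:.2f}  error d={d_err}  attack d={d_att}")
```

Even at these reduced sizes, the attack curve should climb visibly faster for the scale-free graph than for the exponential one, mirroring the figure.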
Note that this applies to a generic set of hubs, not the entire Internet. The error of the Nature paper above is that it generalizes instead of taking into account the performance-optimization qualities of routers. It tries to view the Internet as a natural system instead of an engineered one.
http://scenic.princeton.edu/network20q/ ... s'_heel%3F
There is no complete, accurate map of the Internet. To estimate the topology of the graph of routers, researchers use algorithms such as trace-route. However, trace-route measurements are prone to biased sampling. Furthermore, Internet exchange points and shortcuts are not detected by trace-route, so its resulting graphs may miss as much as 50% of links. Since the Achilles' Heel argument depends upon a specific topology, the shortcomings of trace-route seriously challenge the argument.
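A toy illustration of this bias (my own sketch, not from the cited article): on a synthetic graph whose true topology is known, traceroute-like shortest-path probes from a few monitors recover only the edges that happen to lie on those paths, so the inferred map can miss a large share of the links.

```python
# A toy sketch of traceroute-like sampling bias (illustrative only):
# shortest-path probes from a few monitors see only a subset of the
# true edge set, so the inferred map undercounts links.
import networkx as nx

true_net = nx.barabasi_albert_graph(500, 3, seed=7)   # stand-in topology
monitors = [0, 1, 2]                                  # assumed probe sources

seen_edges = set()
for src in monitors:
    # nx.shortest_path with only a source returns one shortest path per
    # destination, roughly mimicking one traceroute per (src, dst) pair.
    for path in nx.shortest_path(true_net, source=src).values():
        seen_edges.update(frozenset(e) for e in zip(path, path[1:]))

total = true_net.number_of_edges()
print(f"recovered {len(seen_edges)}/{total} edges "
      f"({100 * len(seen_edges) / total:.0f}%)")
```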
Moreover, as elaborated below, the Preferential Attachment model of the Internet is inaccurate. The Constrained Optimization model is more accurate, and it does not have an Achilles' heel.
The Achilles' heel argument also oversimplifies the issue by focusing solely on topology. In reality, if certain routers were destroyed, the surviving routers would use routing protocols to detect the failures and route around them; they would not continue to helplessly send traffic to destroyed routers. The Achilles' heel argument ignores this functionality of routers and does not address the shortcoming.
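A minimal sketch of this point, assuming a link-state-style recomputation loosely modeled on protocols such as OSPF (illustrative code, not any vendor's implementation): when a router on the current path fails, the survivors simply compute a new path around it.

```python
# A minimal sketch of link-state-style rerouting (assumed behavior,
# loosely modeled on OSPF recomputation, not any router's actual code).
import networkx as nx

net = nx.barabasi_albert_graph(200, 2, seed=3)
src, dst = 150, 180                      # hypothetical endpoints

route = nx.shortest_path(net, src, dst)
print("route before failure:", route)

intermediates = route[1:-1]
if intermediates:
    net.remove_node(intermediates[0])    # destroy a router on the path

# Surviving routers recompute their paths around the failure.
if nx.has_path(net, src, dst):
    print("route after failure: ", nx.shortest_path(net, src, dst))
else:
    print("no route survives; the failure partitioned the network")
```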
Beyond topology and routing, there are also economic and technological considerations that shape the Internet into a Constrained Optimization graph.
The setup and running costs of building and connecting a router increase with the distance of its connections. This is especially true of routers in the backbone, where the cost of the links can dominate the cost structure, so the Internet must also optimize the physical distance between routers. The Preferential Attachment graph is therefore an inaccurate model, because these high costs prevent central routers from maintaining many long-distance connections; it is more economical to follow the Constrained Optimization graph.
Moreover, the cost structure encourages networks to minimize both the length and the number of links. A graph that is well connected at all levels, rather than only at a few core nodes, more accurately reflects these economic considerations (see the toy cost comparison below).
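To make the cost argument concrete, here is a toy comparison with made-up cost units (my own illustration): under a link cost that grows with physical distance, a single hub wired across a region is far more expensive than mostly short local links.

```python
# A toy cost comparison (illustrative, with hypothetical cost units):
# link cost grows with Euclidean distance, so a hub wired across the
# plane costs far more than nearest-neighbor local links.
import math
import random

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(50)]

def link_cost(a, b, fixed=1.0, per_unit=5.0):
    """Hypothetical cost model: fixed setup cost plus distance charge."""
    return fixed + per_unit * math.dist(pts[a], pts[b])

# Star topology: every node connects to one central hub (node 0).
star = [(0, i) for i in range(1, len(pts))]

# Local topology: each node connects to its nearest already-placed node,
# a crude stand-in for distance-minimizing incremental growth.
local = []
for i in range(1, len(pts)):
    nearest = min(range(i), key=lambda j: math.dist(pts[i], pts[j]))
    local.append((nearest, i))

for name, edges in (("star (hub)", star), ("local", local)):
    total = sum(link_cost(a, b) for a, b in edges)
    print(f"{name}: total cost = {total:.1f}")
```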
There are also technological limitations on the Preferential Attachment graph. The number of packets a router can handle is fundamentally limited by the technology of the day. As such, the highly centralized routers of the Preferential Attachment model do not exist, because no real router could handle the aggregate bandwidth required to support this model.
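A back-of-the-envelope check makes the point; all of the figures below are hypothetical, chosen only to illustrate why a PA-style mega-hub is infeasible:

```python
# Back-of-the-envelope feasibility check (every figure here is
# hypothetical, chosen only to illustrate the aggregate-bandwidth
# argument, not measured from any real router).
hub_degree = 10_000            # links a PA-style "crucial hub" would carry
per_link_gbps = 10             # assumed per-link speed
router_capacity_gbps = 25_000  # assumed high-end backplane capacity

required = hub_degree * per_link_gbps
print(f"required: {required} Gbps, available: {router_capacity_gbps} Gbps")
print("feasible" if required <= router_capacity_gbps else "infeasible")
```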
A model that more closely follows such technological and economic goals is the HOT (highly optimized tradeoffs) model. HOT models can maximize multiple objective functions subject to multiple constraints, and they also include an uncertainty component. The uncertainty component is important because it more accurately reflects the distributed nature of building the Internet. A centralized growth model (such as Preferential Attachment) is therefore less accurate than the HOT model. In certain special cases, the HOT model can actually generate a power-law degree distribution for the nodes.
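One simple member of this family is the trade-off growth model of Fabrikant, Koutsoupias and Papadimitriou, in which each new node attaches to the existing node that minimizes a weighted sum of geometric distance and hop distance to the root. A minimal sketch (alpha and the sizes are illustrative, not fitted to anything):

```python
# A minimal sketch of a heuristically-optimized trade-off growth model
# (after Fabrikant, Koutsoupias & Papadimitriou); alpha and the node
# count are illustrative parameters, not fitted to real data.
import math
import random

random.seed(42)
alpha = 10.0                 # weight on geometric distance vs. centrality
pts = [(0.5, 0.5)]           # root node
parent = {}
hops = {0: 0}                # hop distance to the root

for i in range(1, 300):
    p = (random.random(), random.random())
    # Attach where alpha * distance + hops-to-root is smallest.
    best = min(range(i), key=lambda j: alpha * math.dist(p, pts[j]) + hops[j])
    pts.append(p)
    parent[i] = best
    hops[i] = hops[best] + 1

degree = {}
for child, par in parent.items():
    degree[par] = degree.get(par, 0) + 1
print("top degrees:", sorted(degree.values(), reverse=True)[:5])
```

Depending on alpha, such heuristically optimized growth can yield heavy-tailed degrees without any preferential attachment, which is the "special cases" point above.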
The HOT model can incorporate technological constraints such as maximum router connectivity and the prohibitive cost of very long router connections. It also considers the economic costs of building routers in general. The HOT model is also robust when highly connected nodes are removed; it does not collapse if only a few routers are taken out. Thus, there is no Achilles' Heel if the HOT model is accurate, and theoretical and empirical evidence suggests that it is indeed more accurate.
The topology of the CO graph more closely resembles the Internet. It is scale-free, but its high-degree nodes sit at the edge of the network, while the central nodes tend to have small degrees, so it does not suffer from the Achilles' Heel problem of the PA graph. The CO graph is the result of deliberate human engineering: a randomly constructed graph is unlikely to become a CO graph, whereas a PA graph is far more likely to arise at random. The Achilles' Heel argument implicitly assumes the Internet is a PA graph because that is what random construction tends to produce. However, the Internet was not randomly constructed; it was deliberately designed to maximize total throughput, making it a CO graph.
A popular case study for complex networks has been the Internet, with a central issue being the extent to which its design and evolution have made it “robust yet fragile” (RYF), that is, unaffected by random component failures but vulnerable to targeted attacks on its key components. One line of research portrays the Internet as “scale-free” (SF) with a “hub-like” core structure that makes the network simultaneously robust to random losses of nodes yet fragile to targeted attacks on the highly connected nodes or “hubs” (1–3). The resulting error tolerance with attack vulnerability has been proposed as a previously overlooked “Achilles' heel” of the Internet. The appeal of such a surprising discovery is understandable, because SF methods are quite general and do not depend on any details of Internet technology, economics, or engineering (4, 5).
One purpose of this article is to explore how this SF depiction compares with the real Internet and explain the nature and origin of some important discrepancies. Another purpose is to suggest that a more coherent perspective on the Internet as a complex network, and in particular its RYF nature, is possible in a way that is fully consistent with Internet technology, economics, and engineering. A complete exposition relies on the mathematics of random graphs and statistical physics (6), which underlie the SF theory, as well as on the very details of the Internet ignored in the SF formulation (7). Nevertheless, we aim to show here that the essential issues can be readily understood, if not rigorously proven, by using less technical detail, and the lessons learned are relevant well beyond either the Internet or SF-network models (8–10).
The most significant SF claims for the Internet are that the router graph has power-law degree sequences that give rise to hubs, which by SF definition are highly connected vertices that are crucial to the global connectivity of the network and through which most traffic must pass (3). The SF assertion (later formalized in ref. 12) is that such hubs hold the network together, giving it "error tolerance" to random vertex failures, because most vertices have low connectivity (i.e., are nonhubs), but also "attack vulnerability" to targeted hub removal, a previously overlooked Achilles' heel. The rationale for this claim can be illustrated by using the toy networks shown in Fig. 1, all of which have the identical scaling-degree sequence D shown in Fig. 1e. Fig. 1a shows a graph (size issues notwithstanding) that is representative of the type of structure typically found in graphs generated by SF models, in this case preferential attachment (PA). This graph is drawn in two ways: the left and right visualizations emphasize the growth process and Internet properties, respectively. Clearly, the highest-degree nodes are essential for graph connectivity, and this feature can be seen even more clearly for the more idealized SF graph shown in Fig. 1b. Thus, the SF claims would certainly hold if the Internet looked at all like Figs. 1a and b. As we will see, the Internet looks nothing like these graphs and is much closer to Fig. 1d, which has the same degree sequence D but is otherwise completely different, with high-degree vertices at the periphery of the network, where their removal would have only local effects. Thus, although scaling-degree sequences imply the presence of high-degree vertices, they do not imply that such nodes necessarily form "crucial hubs" in the SF sense.
Diversity among graphs having the same degree sequence D. (a) RNDnet: a network consistent with construction by PA. The two networks represent the same graph, but the figure on the right is redrawn to emphasize the role that high-degree hubs play in overall network connectivity. (b) SFnet: a graph having the most preferential connectivity, again drawn both as an incremental growth type of network and in a form that emphasizes the importance of high-degree nodes. (c) BADNet: a poorly designed network with overall connectivity constructed from a chain of vertices. (d) HOTnet: a graph constructed to be a simplified version of the Abilene network shown in Fig. 2. (e) Power-law degree sequence D for the networks shown in a-d. Only d_i > 1 is shown.
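One way to quantify how differently two graphs with an identical degree sequence D can be wired is the s-metric from the related scale-free-graphs literature, s(g) = sum of d_i * d_j over all edges (i, j): it is large when high-degree nodes link to each other (SFnet-like cores) and smaller when they sit at the periphery (HOTnet-like edges). A hedged sketch using degree-preserving rewiring:

```python
# A sketch of the s-metric, s(g) = sum of d_i * d_j over all edges,
# which can distinguish hub-centered graphs from differently wired
# graphs that share the exact same degree sequence.
import networkx as nx

def s_metric(G):
    return sum(G.degree[u] * G.degree[v] for u, v in G.edges())

pa = nx.barabasi_albert_graph(100, 2, seed=5)     # hub-centered growth
rewired = pa.copy()
# Degree-preserving rewiring: same degree sequence D, different wiring.
nx.double_edge_swap(rewired, nswap=500, max_tries=10_000, seed=5)

same_D = (sorted(d for _, d in pa.degree) ==
          sorted(d for _, d in rewired.degree))
print("same degree sequence:", same_D)
print("s(PA)      =", s_metric(pa))
print("s(rewired) =", s_metric(rewired))
```

The printout confirms that the two graphs share D yet typically differ in s, which is exactly why a degree sequence alone cannot certify the presence of crucial hubs.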