Clustering in Networks

Clustering in networks is a phenomenon that we all understand intuitively. Paraphrasing Barabási (Page 46), two of your friends are much more likely to know each other than a “gondolier from Venice and an Eskimo fisherman.” In other words, it is reasonable to expect that your friends would “cluster” into a few tightly connected groups.

This notion of clustering is made precise using the network measure called clustering coefficient. Roughly speaking the clustering coefficient of a node v is the fraction of pairs of friends of v that are friends of each other. The clustering coefficient of a network is simply the average of the clustering coefficients of the nodes in the network. Here are slides in which I define the notion of clustering coefficient of a node as well as clustering coefficient of a network.

Once we start focusing on clustering in social networks, it is easy to see the inadequacy of Erdös-Rényi random graphs as models of social networks. Going back to the earlier example, in an Erdös-Rényi random graph the gondolier from Venice and the Eskimo fisherman have the same probability of being friends as two of my friends. This problem with the Erdös-Rényi model was pointed out in a widely cited paper titled “Collective dynamics of `small-world’ networks” by Watts and Strogatz (Nature, Vol 393, June 1998). The paper considers three empirical examples of small-world networks: (i) Films actors network, (ii) Power grid network, and (iii) C.elegans network and shows that in each of these networks the clustering coefficient is several orders of magnitude larger than the clustering coefficient of the corresponding (i.e., with the same size and expected number of edges) Erdös-Rényi graphs. Also, worth noting is that Erdös-Rényi graphs do get the average path length right, it is just that they significantly underestimate the clustering coefficient.

Homework for next class (9/25): The main contribution of the Watts-Strogatz 1998 Nature paper is a simple random graph model, which we will call the Watts-Strogatz model, that has both the “small world” property and the high clustering property. Your homework for next week is to understand this model by:

  • reading its description in the Watts-Strogatz Nature paper; see Figure 1 and its caption,
  • reading Chapter 4 on “Small Worlds” from the Barabási book. Section 4 in this chapter provides a somewhat informal description of the Watts-Strogatz model.

An additional resource is this article at Scholarpedia.

Leave a comment