karapto

Posted on Jan 21, 2021

Graph Analytics with Python -Graph Partition and Community Extraction-

#python #datascience #graph #networkx

By extracting groups and factions from a graph, it is possible to partition it into multiple subgraphs. There are two types of partitioning, graph partitioning and community extraction, and in this article, we will explain these two types with code. First, load the Networkx library to be used in this article.

import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
from networkx.algorithms.community import kernighan_lin_bisection
from networkx.algorithms.community import greedy_modularity_communities
from networkx.algorithms.community import label_propagation_communities

Graph Partitioning

One of the basic algorithms for graph partitioning is the Kernighan-Lin algorithm, which divides a given graph into two groups and selects two vertices from each group to exchange the groups to which they belong, such that the number of edges between the groups is The algorithm exchanges two vertices such that the number of edges between the groups decreases more. For detailed explanation, please refer to the following papers and articles.

Original Paper: An efficient heuristic procedure for partitioning graphs

Slide about Kernighan-Lin Algorithm: A Fundamental Bi-partition Algorithm of Kernighan-Lin

G = nx.karate_club_graph()
colors = ["#00C98D", "#5030C0", "#50F0F0"]
pos = nx.spring_layout(G)

init_nodes = np.array_split(G.nodes(), 2)
init_partition = [set(init_nodes[0]), set(init_nodes[1])]
print(init_partition)

color_map_i = ["black"] * nx.number_of_nodes(G)
counter = 0
for c in init_partition:
  for n in c:
    color_map_i[n] = colors[counter]
  counter = counter +1
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_nodes(G, pos, node_color=color_map_i)
nx.draw_networkx_labels(G, pos)
plt.axis("off")
plt.show()

lst_b = kernighan_lin_bisection(G, partition=init_partition)
color_map_b = ["black"] * nx.number_of_nodes(G)
counter = 0
for c in lst_b:
  for n in c:
    color_map_b[n] = colors[counter]
  counter = counter + 1
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_nodes(G, pos, node_color=color_map_b)
nx.draw_networkx_labels(G, pos)
plt.axis("off")
plt.show()

Here, we plot both the initial and final states of a graph modeled after Zachary's karate club network, partitioned by the Kernighan-Lin algorithm.

Community Extraction

In contrast to graph partitioning, community extraction is an approach to extract dense parts of a graph without determining the number of partitions or the size of the partitions. In this paper, we discuss the Louvain method and label propagation as representative methods of community extraction.

Louvain Method

The Louvain method is a method for extracting communities by rapidly computing Modularity using the greedy method. As an additional explanation, Modularity is one of the most popular metrics in recent years as a method for fast community extraction from large graph structures. Modularity is a metric that becomes better the more the cut subgraph differs from the random graph. The higher the Modularity of the extracted subgraph, the better the community is extracted. If the extracted community set is C, the number of edges connected from community i to community j is eij, and the total number of edges in the entire graph structure is m, then Modularity is defined as follows.

Modularity = \frac{1}{2m}\sum_{ij}(A_{ij}-\frac{k_ik_j}{2m})\delta(c_i,c_j)

It may be difficult to understand due to the redundant explanations in the formulas, but it is very easy to implement since it only uses the greedy method to extract the community.
Networkx provides a greedy_modularity_community to implement the greedy method, so we just need to use it.

G = nx.karate_club_graph()
colors = ["#00C98D", "#5030C0", "#50F0F0"]
pos = nx.spring_layout(G)

lst_m = greedy_modularity_communities(G)
color_map_b = ["black"] * nx.number_of_nodes(G)
counter = 0
for c in lst_m:
  for n in c:
    color_map_b[n] = colors[counter]
  counter = counter + 1
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_nodes(G, pos, node_color=color_map_b)
nx.draw_networkx_labels(G, pos)
plt.axis("off")
plt.show()

Label Propagation

The next step is community extraction by label propagation. Label propagation assumes that all vertices have different community labels as the initial state, and each vertex repeatedly changes its own community according to the surrounding vertices to determine the community. Label propagation is fast, but the quality of the resulting communities is not high. It is easy to implement by using the label_propagation_communities provided in Networkx.

lst_l = label_propagation_communities(G)
color_map_b = ["black"] * nx.number_of_nodes(G)
counter = 0
for c in lst_l:
  for n in c:
    color_map_b[n] = colors[counter]
  counter = counter + 1
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_nodes(G, pos, node_color=color_map_b)
nx.draw_networkx_labels(G, pos)
plt.axis("off")
plt.show()

Conclusion

This article describes graph partitioning and community extraction, and implements them using Networkx. Graphs can be applied to a variety of fields, and there is a lot of research on such partitioning and extraction techniques.

・Streaming Graph Partitioning: An Experimental Study

DEV Community

Graph Analytics with Python -Graph Partition and Community Extraction-

Graph Partitioning

Community Extraction

Louvain Method

Label Propagation

Conclusion

Top comments (0)

Read next

How to Define AI Agents with Cloudformation and SAM: A Builder's Guide

What I'd do differently in Bootcamp. (spoiler: Everything)

Optimise AWS Costs: Automate Unused EBS Snapshot Cleanup with Lambda

World's Largest Telegram Dataset Reveals How Information Spreads Across 120,000+ Channels