| Title: | "Node+Edge Betweenness Community Detection Algorithm for 'igraph' Objects" |
|---|---|
| Description: | Implements the "Node + Edge Betweenness" community detection algorithm for network analysis using 'igraph' objects. This algorithm combines node degree and betweenness centrality measures to identify communities within networks, with a gradient evident in social partitioning. The package provides functions for implementation of the algorithm, visualization, and analysis of the resulting community structure along with the original dataset used in the publication. Methods are based on results from Smith, Pittman and Xu (2024) <doi:10.48550/arXiv.2411.01394>. |
| Authors: | Benjamin Smith [aut, cre] (ORCID: <https://orcid.org/0009-0007-2206-0177>), Tyler Pittman [aut] (ORCID: <https://orcid.org/0000-0002-5013-6980>), Wei Xu [aut] (ORCID: <https://orcid.org/0000-0002-0257-8856>) |
| Maintainer: | Benjamin Smith <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-05-17 05:17:38 UTC |
| Source: | https://github.com/benyamindsmith/ig.degree.betweenness |
Referred to as the "Smith-Pittman" algorithm in Smith et al (2024). This algorithm detects communities by calculating the degree centrality measures of nodes and edge betweenness.
cluster_degree_betweenness(graph)cluster_degree_betweenness(graph)
graph |
The graph to analyze |
This can be thought of as an alternative version of igraph::cluster_edge_betweeness().
The function iteratively removes edges based on their betweenness centrality and the degree of their adjacent nodes. At each iteration, it identifies the edge with the highest betweenness centrality among those connected to nodes with the highest degree.It then removes that edge and recalculates the modularity of the resulting graph. The process continues until all edges have been assessed or until no further subgraph can be created with the optimal number of communites being chosen based on maximization of modularity.
An igraph "communities" object with detected communities via the Smith-Pittman algorithm.
Smith et al (2024) "Centrality in Collaboration: A Novel Algorithm for Social Partitioning Gradients in Community Detection for Multiple Oncology Clinical Trial Enrollments", https://doi.org/10.48550/arXiv.2411.01394
library(igraphdata) data("karate") ndb <- cluster_degree_betweenness(karate) plot( ndb, karate, main= "Degree-Betweenness Clustering" ) ndb data("UKfaculty") # Making graph undirected so it looks nicer when its plotted uk_faculty <- UKfaculty |> igraph::as.undirected() ndb <- cluster_degree_betweenness(uk_faculty) plot( ndb, uk_faculty, main= "Smith-Pittman Clustering for UK Faculty" )library(igraphdata) data("karate") ndb <- cluster_degree_betweenness(karate) plot( ndb, karate, main= "Degree-Betweenness Clustering" ) ndb data("UKfaculty") # Making graph undirected so it looks nicer when its plotted uk_faculty <- UKfaculty |> igraph::as.undirected() ndb <- cluster_degree_betweenness(uk_faculty) plot( ndb, uk_faculty, main= "Smith-Pittman Clustering for UK Faculty" )
Computes and summarizes vertex degree distributions for an igraph
object. For directed graphs, the function reports in-degree and out-degree
statistics, including their correlation and mean values. For undirected
graphs, only overall degree statistics are reported.
degree_balance(g)degree_balance(g)
g |
An |
If the graph is directed (see igraph::is_directed()), the function:
Computes in-degree and out-degree for each vertex.
Prints summary statistics for both distributions.
Reports the correlation between in-degree and out-degree.
Reports mean in-degree and mean out-degree.
If the graph is undirected, the function:
Computes the degree for each vertex.
Prints summary statistics of the degree distribution.
A list containing:
out_degree Numeric vector of out-degrees (directed graphs only).
in_degree Numeric vector of in-degrees (directed graphs only).
degree Numeric vector of degrees (undirected graphs only).
library(igraph) library(ig.degree.betweenness) # Directed graph example g_directed <- make_ring(10, directed = TRUE) degree_balance(g_directed) # Undirected graph example g_undirected <- make_ring(10) degree_balance(g_undirected)library(igraph) library(ig.degree.betweenness) # Directed graph example g_directed <- make_ring(10, directed = TRUE) degree_balance(g_directed) # Undirected graph example g_undirected <- make_ring(10) degree_balance(g_undirected)
A family of functions to generate networks based on the Linearized Chord Diagram (LCD) model using preferential attachment. The suite includes the standard LCD model as well as variants that introduce bidirectional edges, alternating direction attachment, and mixed degree preferences.
lcd_graph(n, m = 1, directed = TRUE) lcd_graph_bidirectional(n, m = 1, directed = TRUE, bidirectional_prob = 0.5) lcd_graph_alternating(n, m = 1, directed = TRUE) lcd_graph_mixed(n, m = 1, directed = TRUE, out_weight = 0.5)lcd_graph(n, m = 1, directed = TRUE) lcd_graph_bidirectional(n, m = 1, directed = TRUE, bidirectional_prob = 0.5) lcd_graph_alternating(n, m = 1, directed = TRUE) lcd_graph_mixed(n, m = 1, directed = TRUE, out_weight = 0.5)
n |
Integer. The total number of vertices (nodes) in the generated graph. |
m |
Integer. The number of edges each new node adds during its time step. Defaults to 1. |
directed |
Logical. Whether the generated graph should be directed.
Defaults to |
bidirectional_prob |
Numeric. The probability (between 0 and 1) of adding
a reverse edge (target -> source) when generating bidirectional graphs.
Used only in |
out_weight |
Numeric. The probability weight (between 0 and 1) given to
the out-degree pool versus the in-degree pool. Used only in |
The standard lcd_graph function builds a network step-by-step. At each time step
t from 1 to n, a new node is added and generates m edges. The target of each
edge is chosen preferentially based on the degree of the existing nodes.
The package also provides three variations:
lcd_graph_bidirectional: Adds a primary edge and, with a specified
probability, simultaneously adds a reverse edge.
lcd_graph_alternating: Alternates between using out-degree and in-degree
pools for preferential attachment targets.
lcd_graph_mixed: Chooses between out-degree and in-degree pools for
attachment based on a weighted probability.
An igraph object representing the generated LCD graph.
Barabási, A.-L. (2016). Network Science. Cambridge University Press. Chapter 5: The Barabási-Albert Model. https://networksciencebook.com/chapter/5
# Generate a standard directed LCD graph with 100 nodes and 2 edges per step g1 <- lcd_graph(n = 100, m = 2) # Generate a graph where reverse edges appear 30% of the time g2 <- lcd_graph_bidirectional(n = 100, m = 1, bidirectional_prob = 0.3) # Generate a graph alternating between in-degree and out-degree attachment g3 <- lcd_graph_alternating(n = 100, m = 3) # Generate a mixed graph favoring out-degree attachment 70% of the time g4 <- lcd_graph_mixed(n = 100, m = 2, out_weight = 0.7)# Generate a standard directed LCD graph with 100 nodes and 2 edges per step g1 <- lcd_graph(n = 100, m = 2) # Generate a graph where reverse edges appear 30% of the time g2 <- lcd_graph_bidirectional(n = 100, m = 1, bidirectional_prob = 0.3) # Generate a graph alternating between in-degree and out-degree attachment g3 <- lcd_graph_alternating(n = 100, m = 3) # Generate a mixed graph favoring out-degree attachment 70% of the time g4 <- lcd_graph_mixed(n = 100, m = 2, out_weight = 0.7)
A simulated oncology clinical trial referral network from a major research hospital. For the purpose of identifying collaboration networks between oncologists, this dataset only includes referrals of patients who were enrolled in more than one clinical trial. This includes 389 patients enrolled in 288 clinical trials.
oncology_networkoncology_network
A igraph object (e.g. representing the oncology clinical trial referral network. The structure includes:
Oncologists or clinical trials (depending on network structure).
Referral links between nodes, based on shared patient enrollment across trials.
Clinical trials are categorized by intervention type, including targeted therapies (prefixed with "T:") and immunotherapies (prefixed with "I:"). There are 16 distinct intervention types (nodes) and 470 patient referrals (edges) in the network.
Simulated data based on oncology clinical trial enrollment patterns.
Generates a horizontal bar‐style plot of node degrees for an igraph network.
For undirected graphs, it shows each node’s total degree.
For directed graphs, it displays in‐degrees (as negative bars) alongside out‐degrees.
plot_node_degrees(graph)plot_node_degrees(graph)
graph |
An |
This function computes:
Number of edges incident on each node (for undirected graphs).
Number of incoming edges per node (for directed graphs).
Number of outgoing edges per node (for directed graphs).
For directed graphs, in‐degrees are negated so that bars extend leftward, providing an immediate visual comparison to out‐degrees.
Internally, it uses:
igraph::degree() to compute degrees,
dplyr and tidyr for reshaping the data,
ggplot2 for plotting.
A ggplot object:
Undirected graphs: A bar for each node showing its total degree.
Directed graphs: Split bars per node with negative values for in‐degree (pointing left) and positive values for out‐degree (pointing right).
You can modify the returned ggplot with additional layers, themes, or labels.
For example, to add a title or change colors:
plot_node_degrees(g) +
ggtitle("Degree Distribution") +
scale_fill_manual(values = c("in_degree" = "steelblue", "out_degree" = "salmon"))
library(ig.degree.betweenness) library(igraphdata) data("karate") data("oncology_network") plot_node_degrees(oncology_network) plot_node_degrees(karate)library(ig.degree.betweenness) library(igraphdata) data("karate") data("oncology_network") plot_node_degrees(oncology_network) plot_node_degrees(karate)
This function generates a simplified edge plot of an igraph object, optionally highlighting communities if provided.
plot_simplified_edgeplot(graph, communities = NULL, edge.arrow.size = 0.2, ...)plot_simplified_edgeplot(graph, communities = NULL, edge.arrow.size = 0.2, ...)
graph |
igraph object |
communities |
optional; A communities object |
edge.arrow.size |
edge.arrow size arg. See ?igraph::plot.igraph for more details |
... |
other arguments to be passed to the |
This function is ideally for networks with a low number of nodes having varying numbers of connection and self loops. See the example for a better visual understanding.
No return value, called for side effects.
# Load the igraph package library(igraph) library(ig.degree.betweenness) # Set parameters num_nodes <- 15 # Number of nodes (adjust as needed) initial_edges <- 1 # Starting edges for preferential attachment # Create a directed, scale-free network using the Barabási-Albert model g <- sample_pa(n = num_nodes, m = initial_edges, directed = TRUE) # Introduce additional edges to high-degree nodes to accentuate popularity differences num_extra_edges <- 350 # Additional edges to create more popular nodes set.seed(123) # For reproducibility for (i in 1:num_extra_edges) { # Sample nodes with probability proportional to their degree (to reinforce popularity) from <- sample(V(g), 1, prob = degree(g, mode = "in") + 1) # +1 to avoid zero probabilities to <- sample(V(g), 1) # Ensure we don't add the same edge repeatedly unless intended, allowing self-loops g <- add_edges(g, c(from, to)) } # Add self-loops to a subset of nodes num_self_loops <- 5 for (i in 1:num_self_loops) { node <- sample(V(g), 1) g <- add_edges(g, c(node, node)) } ig.degree.betweenness::plot_simplified_edgeplot(g,main="Simulated Data")# Load the igraph package library(igraph) library(ig.degree.betweenness) # Set parameters num_nodes <- 15 # Number of nodes (adjust as needed) initial_edges <- 1 # Starting edges for preferential attachment # Create a directed, scale-free network using the Barabási-Albert model g <- sample_pa(n = num_nodes, m = initial_edges, directed = TRUE) # Introduce additional edges to high-degree nodes to accentuate popularity differences num_extra_edges <- 350 # Additional edges to create more popular nodes set.seed(123) # For reproducibility for (i in 1:num_extra_edges) { # Sample nodes with probability proportional to their degree (to reinforce popularity) from <- sample(V(g), 1, prob = degree(g, mode = "in") + 1) # +1 to avoid zero probabilities to <- sample(V(g), 1) # Ensure we don't add the same edge repeatedly unless intended, allowing self-loops g <- add_edges(g, c(from, to)) } # Add self-loops to a subset of nodes num_self_loops <- 5 for (i in 1:num_self_loops) { node <- sample(V(g), 1) g <- add_edges(g, c(node, node)) } ig.degree.betweenness::plot_simplified_edgeplot(g,main="Simulated Data")
Presently,
cluster_degree_betweenness() function only works with labeled graphs. prep_unlabeled_graph() is a utility function that gives an unlabeled graph labels which are string values of their vertices.
prep_unlabeled_graph(graph)prep_unlabeled_graph(graph)
graph |
an unlabeled graph. |
An "igraph" object with named vertices.
cluster_degree_betweenness() which this function aids.
library(igraph) library(igraphdata) library(ig.degree.betweenness) data("UKfaculty") # Making graph undirected so it looks nicer when its plotted uk_faculty <- prep_unlabeled_graph(UKfaculty) |> as.undirected() ndb <- cluster_degree_betweenness(uk_faculty) plot( ndb, uk_faculty, main= "Node Degree Clustering" ) ndblibrary(igraph) library(igraphdata) library(ig.degree.betweenness) data("UKfaculty") # Making graph undirected so it looks nicer when its plotted uk_faculty <- prep_unlabeled_graph(UKfaculty) |> as.undirected() ndb <- cluster_degree_betweenness(uk_faculty) plot( ndb, uk_faculty, main= "Node Degree Clustering" ) ndb