Networks are part of your daily life, and because you have been logging your friends and colleagues on services like Facebook or LinkedIn, a large amount of network data is now available. There is also a large body of research on networks that tackles tough questions, such as the analytical difference between homophily (connecting to those like us) and contagion (becoming like our connections). We will stick to description and simple measures of influence.

There are several software options for network analysis, like Gephi, Pajek or VOSON (for hyperlink networks). We will stay in R and use the `sna` and `igraph` libraries, which use different but compatible formats to store network data. Some examples will be taken from Baptiste Coulmont's graphs of small cliques.

We'll start by simulating a random network of \(n = 30\) individuals (`ego`), for which we simulate a bidirectional friendship relationship: if individual 'Ego' is a friend of individual 'Alter', then the reverse is also true. Each individual can associate with any other individual in the network, resulting in a table of \(30^2 = 900\) rows. These 900 rows include one row per individual that pairs it with itself (\(n-n\)), which will be ignored when generating relationships. The result is the `rnet` dataset.

```
# Set network size.
n = 30
# Create n series of n.
ego = rep(1:n, each = n)
# Create n sequences of n.
alter = rep(1:n, times = n)
# Default to no friendship between ego and alter.
friendship = 0
# Assemble dataset.
rnet = data.frame(ego, alter, friendship)
# First rows.
head(rnet)
```

```
ego alter friendship
1 1 1 0
2 1 2 0
3 1 3 0
4 1 4 0
5 1 5 0
6 1 6 0
```
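As a quick check on the layout described above, the same table of ordered pairs can be sketched with `expand.grid`. The name `pairs` is ours, not part of the original code, and this is only an illustration of how the 900 rows break down.

```r
# Hypothetical alternative construction of the ego/alter pairs.
n <- 30
# expand.grid varies its first argument fastest, so alter cycles within each ego.
pairs <- expand.grid(alter = 1:n, ego = 1:n)[, c("ego", "alter")]
# All ordered pairs, including self-pairs.
nrow(pairs)
# Self-pairs (ego == alter), which never receive a tie.
sum(pairs$ego == pairs$alter)
# Unordered pairs of distinct individuals, i.e. the possible friendships.
sum(pairs$ego < pairs$alter)
```

The three counts correspond to \(n^2\), \(n\) and \(n(n-1)/2\): only the last one matters when ties are drawn, since each friendship is stored in both directions.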

To generate random relationships, we draw from a binomial distribution where the probability of a friendship is artificially set to \(\Pr(\text{friendship}) = 0.15\). On average, about 15% of all possible pairs of distinct individuals end up with a `friendship` tie in the `rnet` dataset.

```
# Probability of friendship tie.
conDen <- 0.15
# Assign ties to random pairs of distinct nodes (i < ii), in both directions.
# The outer loop stops at n - 1 so that (i + 1):n never counts backwards.
for (i in 1:(n - 1)) for (ii in (i + 1):n) if (rbinom(1, 1, conDen) == 1) {
  rnet$friendship[(rnet$ego == i & rnet$alter == ii)] = 1
  rnet$friendship[(rnet$ego == ii & rnet$alter == i)] = 1
}
# Inspect random network ties.
summary(rnet)
```

```
ego alter friendship
Min. : 1.0 Min. : 1.0 Min. :0.000
1st Qu.: 8.0 1st Qu.: 8.0 1st Qu.:0.000
Median :15.5 Median :15.5 Median :0.000
Mean :15.5 Mean :15.5 Mean :0.124
3rd Qu.:23.0 3rd Qu.:23.0 3rd Qu.:0.000
Max. :30.0 Max. :30.0 Max. :1.000
```
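The same simulation can also be sketched without loops by drawing the upper triangle of an adjacency matrix and mirroring it. This is an alternative to the code above, not the method used on this page; the names `adj`, `upper` and `density`, and the seed value, are arbitrary choices of ours.

```r
# Arbitrary seed, only to make this sketch reproducible.
set.seed(42)
n <- 30
p <- 0.15
# Empty n-by-n adjacency matrix (rows are ego, columns are alter).
adj <- matrix(0, n, n)
# Logical mask of the cells above the diagonal: one cell per unordered pair.
upper <- upper.tri(adj)
# One Bernoulli draw per unordered pair.
adj[upper] <- rbinom(sum(upper), 1, p)
# Mirror the upper triangle so that every tie is reciprocal.
adj <- adj + t(adj)
# The matrix is symmetric and has no self-ties.
stopifnot(isSymmetric(adj), all(diag(adj) == 0))
# Share of tied pairs among the n(n-1)/2 possible pairs.
density <- sum(adj[upper]) / sum(upper)
density
# Mean over all n^2 cells, comparable to the mean of rnet$friendship.
mean(adj)
```

This also shows why the mean of `friendship` over all 900 rows sits a little below the share of tied pairs: the 30 self-pair rows are always 0 and dilute the average.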

The network is drawn with the `ggnet` function. It plots a network object built from the subset of the `rnet` data frame for which the `friendship` variable indicates that there is a relationship to draw. The ties are undirected: there are no arrows between the nodes because the friendship ties are strictly reciprocal.

```
# Form network object.
net = network(rnet[rnet$friendship == 1, ], directed = FALSE)
net
```

```
Network attributes:
vertices = 30
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 112
missing edges= 0
non-missing edges= 112
Vertex attribute names:
vertex.names
No edge attributes
```
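The edge count printed above matches the number of rows where `friendship` equals 1 (0.124 × 900 ≈ 112), and those rows list each reciprocal tie twice, once per direction. A minimal base R sketch, using a small made-up edge list in the same two-column layout (the `ties` data below are hypothetical, not taken from `rnet`), shows how to count unique ties and each node's degree, a simple descriptive measure:

```r
# Hypothetical toy edge list: four reciprocal ties, each listed in both directions.
ties <- data.frame(
  ego   = c(1, 2, 1, 3, 2, 3, 3, 4),
  alter = c(2, 1, 3, 1, 3, 2, 4, 3)
)
# Keep one row per unordered pair by requiring ego < alter.
unique_ties <- ties[ties$ego < ties$alter, ]
# Number of distinct friendships: half the number of rows in ties.
nrow(unique_ties)
# Degree: number of friends per node. Because both directions are stored,
# counting appearances in the ego column is enough.
degree <- table(ties$ego)
degree
```

Here node 3 has the highest degree, with three friends, while node 4 has only one.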

```
# Plot random network.
ggnet(net,
      label = TRUE,
      color = "white")
```

This function is used in the next pages to plot a few social networks. You can practice by plotting fictional networks, like the one below using the *Grey's Anatomy* network by Gary Weissman, or turn to Solomon Messing's analysis of U.S. student affiliations for a real-world example of network data.

```
# Locate data.
link = "http://www.babelgraph.org/data/ga_edgelist.csv"
file = "data/ga.network.csv"
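# Download data if not already present; download() is assumed to come from
# the downloader package, and the data/ folder is assumed to exist.
if(!file.exists(file)) download(link, file, mode = "wb")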
# Create network.
net = network(read.csv(file), directed = FALSE)
# Plot network.
ggnet(net,
      label = TRUE,
      color = "white",
      top8 = TRUE,
      size = 18,
      legend.position = "none")
```

The next pages make more use of the `ggnet` function with Twitter data and word associations plotted as network ties.
