Commit a65a3ec3 authored by Mike Lees

Commit before Winter School 2018 (networkx 2.0)

A few updates to notebook 3 to improve the fail experiments.
Mainly fixes to make the code compatible with NetworkX v2.0. This code is no longer compatible with NetworkX <2.0.

Users must also install python-louvain and must not install (or must uninstall) the package community.
parent d5b856d9
......@@ -207,9 +207,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.10"
"version": "2.7.14"
}
},
"nbformat": 4,
"nbformat_minor": 0
"nbformat_minor": 2
}
%% Cell type:markdown id: tags:
# Start here by creating a Graph
%% Cell type:markdown id: tags:
Create an empty graph with no nodes and no edges.
%% Cell type:code id: tags:
``` python
import networkx as nx
%matplotlib inline
import matplotlib
#import matplotlib.pyplot
import matplotlib.pyplot as plt
```
%% Cell type:markdown id: tags:
By definition, a Graph is a collection of nodes (vertices) along with identified pairs of nodes (called edges, links,
etc). In NetworkX, nodes can be any hashable object e.g. a text string, an image, an XML object, another Graph,
a customized node object, etc. (Note: Python’s None object should not be used as a node as it determines whether
optional function arguments have been assigned in many functions.)
%% Cell type:markdown id: tags:
Graph : Undirected simple (allows self loops)
DiGraph : Directed simple (allows self loops)
MultiGraph : Undirected with parallel edges
MultiDiGraph : Directed with parallel edges
A graph g can be converted to undirected with g.to_undirected()
and to directed with g.to_directed().
To construct a graph, we use standard Python syntax:
%% Cell type:code id: tags:
``` python
G=nx.Graph()
D=nx.DiGraph()
M=nx.MultiGraph()
H=nx.MultiDiGraph()
```
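%% Cell type:markdown id: tags:
As a rough illustration of how the four classes differ, the sketch below adds the same edge twice to a simple graph and to a multigraph, and checks the reverse direction of an edge in a directed graph. The variable names are chosen for this sketch only and are not used elsewhere in the notebook.
%% Cell type:code id: tags:
```python
import networkx as nx

# Simple undirected graph: adding the same edge twice keeps a single edge.
simple = nx.Graph()
simple.add_edge(1, 2)
simple.add_edge(1, 2)

# Multigraph: parallel edges between the same pair of nodes are kept.
multi = nx.MultiGraph()
multi.add_edge(1, 2)
multi.add_edge(1, 2)

# Directed graph: the edge 1 -> 2 does not imply 2 -> 1.
directed = nx.DiGraph()
directed.add_edge(1, 2)

# Converting to an undirected graph drops the direction.
undirected = directed.to_undirected()

print(simple.number_of_edges(), multi.number_of_edges())
print(directed.has_edge(2, 1), undirected.has_edge(2, 1))
```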
%% Cell type:markdown id: tags:
# Nodes
%% Cell type:markdown id: tags:
The graph G can be grown in several ways. NetworkX includes many graph generator functions and facilities to read and write graphs in many formats. To get started though we’ll look at simple manipulations. You can add one node at a time,
%% Cell type:code id: tags:
``` python
G.add_node(1)
#add a list of nodes,
G.add_nodes_from([2,3,4,5,6,7,8,9,10])
```
%% Cell type:markdown id: tags:
or add any nbunch of nodes. An nbunch is any iterable container of nodes that is not itself a node in the graph (e.g. a
list, set, graph, file, etc.).
%% Cell type:code id: tags:
``` python
H=nx.path_graph(10)
G.add_nodes_from(H)
```
%% Cell type:markdown id: tags:
Note that G now contains the nodes of H as nodes of G. In contrast, you could use the graph H as a node in G.
%% Cell type:code id: tags:
``` python
G.add_node(H)
print(G.nodes())
```
%% Cell type:markdown id: tags:
The graph G now contains H as a node. This flexibility is very powerful as it allows graphs of graphs, graphs of files,
graphs of functions and much more. It is worth thinking about how to structure your application so that the nodes
are useful entities. Of course you can always use a unique identifier in G and have a separate dictionary keyed by
identifier to the node information if you prefer. (Note: You should not change the node object if the hash depends on
its contents.)
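%% Cell type:markdown id: tags:
The alternative mentioned above, plain identifiers as nodes plus a separate dictionary of node information, could look roughly like this. The `records` dictionary and its contents are hypothetical and exist only for this sketch.
%% Cell type:code id: tags:
```python
import networkx as nx

# Hypothetical records, keyed by a unique identifier.
records = {
    "a": {"label": "first item"},
    "b": {"label": "second item"},
    "c": {"label": "third item"},
}

id_graph = nx.Graph()
id_graph.add_nodes_from(records)  # iterating a dict yields its keys
id_graph.add_edges_from([("a", "b"), ("a", "c")])

# Look up the full record for each neighbor of "a" in the separate dictionary.
neighbor_labels = sorted(records[n]["label"] for n in id_graph.neighbors("a"))
print(neighbor_labels)
```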
%% Cell type:code id: tags:
``` python
nx.draw(G)
plt.show()
```
%% Cell type:markdown id: tags:
# Edges
%% Cell type:markdown id: tags:
G can also be grown by adding one edge at a time,
%% Cell type:code id: tags:
``` python
G.add_edge(1,2)
```
%% Cell type:markdown id: tags:
by adding a list of edges,
%% Cell type:code id: tags:
``` python
G.add_edges_from([(1,2),(1,3)])
```
%% Cell type:markdown id: tags:
or by adding any ebunch of edges. An ebunch is any iterable container of edge-tuples. An edge-tuple can be a 2-tuple of nodes or a 3-tuple with 2 nodes followed by an edge attribute dictionary, e.g. (2,3,{'weight':3.1415}). Edge attributes are discussed further below.
%% Cell type:code id: tags:
``` python
G.add_edges_from(H.edges())
```
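%% Cell type:markdown id: tags:
A small sketch of an ebunch that mixes both kinds of edge-tuple; the graph name `ebunch_demo` is made up for this example. The 2-tuple produces an edge with an empty attribute dictionary, while the 3-tuple carries its attributes along.
%% Cell type:code id: tags:
```python
import networkx as nx

ebunch_demo = nx.Graph()

# One plain 2-tuple and one 3-tuple with an attribute dictionary.
ebunch = [(1, 2), (2, 3, {"weight": 3.1415})]
ebunch_demo.add_edges_from(ebunch)

print(ebunch_demo[1][2])  # attributes of edge (1, 2)
print(ebunch_demo[2][3])  # attributes of edge (2, 3)
```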
%% Cell type:markdown id: tags:
Adding an edge between nodes that don’t exist will automatically add those nodes.
%% Cell type:code id: tags:
``` python
G.add_edges_from([(11,12),(12,13)])
print(G.nodes())
```
%% Cell type:code id: tags:
``` python
print(G.edges())
```
%% Cell type:code id: tags:
``` python
nx.draw(G)
plt.show()
```
%% Cell type:markdown id: tags:
One can demolish the graph in a similar fashion, using Graph.remove_node(), Graph.remove_nodes_from(), Graph.remove_edge() and Graph.remove_edges_from(), e.g.
%% Cell type:code id: tags:
``` python
G.remove_node(H)
G.nodes()
```
%% Cell type:markdown id: tags:
There are no complaints when adding existing nodes or edges. For example, after removing all nodes and edges,
%% Cell type:code id: tags:
``` python
G.clear()
G.nodes()
```
%% Cell type:markdown id: tags:
we add new nodes/edges and NetworkX quietly ignores any that are already present.
%% Cell type:code id: tags:
``` python
G.add_edges_from([(1,2),(1,3)])
G.add_node(1)
G.add_edge(1,2)
G.add_node("spam") # adds node "spam"
G.add_nodes_from("spam") # adds 4 nodes: 's', 'p', 'a', 'm'
```
%% Cell type:markdown id: tags:
At this stage the graph G consists of 8 nodes and 2 edges, as can be seen by:
%% Cell type:code id: tags:
``` python
G.number_of_nodes()
```
%% Cell type:code id: tags:
``` python
G.number_of_edges()
```
%% Cell type:code id: tags:
``` python
nx.draw(G)
plt.show()
```
%% Cell type:markdown id: tags:
We can examine them with
%% Cell type:code id: tags:
``` python
G.nodes() # show nodes
```
%% Cell type:code id: tags:
``` python
G.edges() # show edges
```
%% Cell type:markdown id: tags:
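Neighbors: G.neighbors(n) returns an iterator over the neighbors of n, which is efficient when you only need to loop over them; wrap it in list() to display them.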
%% Cell type:code id: tags:
``` python
list(G.neighbors(1)) # show neighbors
```
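%% Cell type:markdown id: tags:
To make the iterator behaviour concrete, here is a minimal sketch on a throwaway triangle graph (`nbr_demo` is a name invented for this example). It assumes NetworkX 2.0 or later, where neighbors() yields nodes lazily.
%% Cell type:code id: tags:
```python
import networkx as nx

nbr_demo = nx.Graph([(1, 2), (1, 3), (2, 3)])

# Materialise the iterator when the neighbors need to be displayed or reused.
nbr_list = list(nbr_demo.neighbors(1))

# Loop over the iterator directly when each neighbor is only needed once.
neighbor_count = sum(1 for _ in nbr_demo.neighbors(1))

print(sorted(nbr_list), neighbor_count)
```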
%% Cell type:markdown id: tags:
Removing nodes or edges has similar syntax to adding:
%% Cell type:code id: tags:
``` python
G.remove_nodes_from("spam")
G.nodes()
```
%% Cell type:code id: tags:
``` python
G.remove_edge(1,3)
G.edges()
```
%% Cell type:code id: tags:
``` python
nx.draw(G)
plt.show()
```
%% Cell type:markdown id: tags:
When creating a graph structure (by instantiating one of the graph classes), you can specify data in several formats.
%% Cell type:code id: tags:
``` python
H=nx.DiGraph(G) # create a DiGraph using the connections from G
H.edges()
```
%% Cell type:code id: tags:
``` python
edgelist=[(0,1),(1,2),(2,3),(0,3)]
H=nx.Graph(edgelist)
H.nodes()
```
%% Cell type:code id: tags:
``` python
H.edges()
```
%% Cell type:code id: tags:
``` python
nx.draw(H)
plt.show()
```
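%% Cell type:markdown id: tags:
Besides another graph or an edge list, the constructors also accept a dictionary of adjacency lists. The sketch below uses made-up names (`adjacency_lists`, `from_dict`, `as_directed`) and shows that building a DiGraph from an undirected graph yields one directed edge per direction.
%% Cell type:code id: tags:
```python
import networkx as nx

# Each key is a node; each value lists the nodes it is connected to.
adjacency_lists = {0: [1, 2], 1: [2], 2: []}
from_dict = nx.Graph(adjacency_lists)

# Every undirected edge becomes two directed edges.
as_directed = nx.DiGraph(from_dict)

print(from_dict.number_of_edges(), as_directed.number_of_edges())
```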
%% Cell type:markdown id: tags:
# What to use as nodes and edges
%% Cell type:markdown id: tags:
You might notice that nodes and edges are not specified as NetworkX objects. This leaves you free to use meaningful items as nodes and edges. The most common choices are numbers or strings, but a node can be any hashable object (except None), and an edge can be associated with any object x using G.add_edge(n1,n2,object=x).
As an example, n1 and n2 could be protein objects from the RCSB Protein Data Bank, and x could refer to an XML record of publications detailing experimental observations of their interaction.
We have found this power quite useful, but its abuse can lead to surprises unless one is familiar with Python. If in doubt, consider using convert_node_labels_to_integers() to obtain a more traditional graph with integer labels.
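%% Cell type:markdown id: tags:
A minimal sketch of convert_node_labels_to_integers() on a small string-labelled graph. The graph and the attribute name "old_label" are chosen for this example; the label_attribute argument stores each original label on the relabelled node so the mapping is not lost.
%% Cell type:code id: tags:
```python
import networkx as nx

labelled = nx.Graph()
labelled.add_edges_from([("spam", "eggs"), ("eggs", "ham")])

# Relabel nodes as 0, 1, 2 and keep the original label as a node attribute.
relabelled = nx.convert_node_labels_to_integers(labelled, label_attribute="old_label")
old_labels = nx.get_node_attributes(relabelled, "old_label")

print(sorted(relabelled.nodes()))
print(old_labels)
```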
%% Cell type:markdown id: tags:
# Accessing edges
%% Cell type:markdown id: tags:
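In NetworkX 2.0, Graph.nodes() and Graph.edges() return views and Graph.neighbors() returns an iterator, so the separate iterator versions from earlier releases (e.g. Graph.edges_iter()) are no longer needed and have been removed; none of these calls builds a large list when you are just going to iterate through the result anyway.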
Fast direct access to the graph data structure is also possible using subscript notation.
%% Cell type:code id: tags:
``` python
G[1] # Warning: do not change the resulting dict
```
%% Cell type:code id: tags:
``` python
G[1][2]
```
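%% Cell type:markdown id: tags:
The following sketch, which assumes NetworkX 2.0 or later, shows that an edge view is a live window onto the graph rather than a copy. The names `view_demo` and `edge_view` are invented for the example.
%% Cell type:code id: tags:
```python
import networkx as nx

view_demo = nx.path_graph(3)   # edges (0, 1) and (1, 2)
edge_view = view_demo.edges    # a view, not a copy

count_before = len(edge_view)
view_demo.add_edge(2, 3)
count_after = len(edge_view)   # the same view now includes the new edge

# Take a list when a fixed snapshot (with attribute dicts) is needed.
edge_snapshot = list(view_demo.edges(data=True))

print(count_before, count_after)
```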
%% Cell type:markdown id: tags:
You can safely set the attributes of an edge using subscript notation if the edge already exists.
%% Cell type:code id: tags:
``` python
G.add_edge(1,3)
G[1][3]['color']='blue'
```
%% Cell type:markdown id: tags:
Fast examination of all edges is achieved using adjacency iterators. Note that for undirected graphs this actually looks at each edge twice.
%% Cell type:code id: tags:
``` python
FG=nx.Graph()
FG.add_weighted_edges_from([(1,2,0.125),(1,3,0.75),(2,4,1.2),(3,4,0.375)])
```
%% Cell type:code id: tags:
``` python
# NetworkX 2.0 renamed adjacency_iter() to adjacency()
for n, nbrs in FG.adjacency():
    for nbr, eattr in nbrs.items():
        data = eattr['weight']
        if data < 0.5:
            print('(%d, %d, %.3f)' % (n, nbr, data))
```
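%% Cell type:markdown id: tags:
To see the "each edge twice" effect, the sketch below rebuilds the same weighted graph under a different name (`adj_demo`, chosen for this example) and compares adjacency iteration with iteration over the edges, which reports each undirected edge once.
%% Cell type:code id: tags:
```python
import networkx as nx

adj_demo = nx.Graph()
adj_demo.add_weighted_edges_from([(1, 2, 0.125), (1, 3, 0.75), (2, 4, 1.2), (3, 4, 0.375)])

# Adjacency iteration reports each undirected edge from both endpoints.
light_twice = [(n, nbr, attrs["weight"])
               for n, nbrs in adj_demo.adjacency()
               for nbr, attrs in nbrs.items()
               if attrs["weight"] < 0.5]

# Iterating over the edges reports each undirected edge once.
light_once = [(u, v, w) for u, v, w in adj_demo.edges(data="weight") if w < 0.5]

print(len(light_twice), len(light_once))
```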
%% Cell type:markdown id: tags:
# Adding attributes to graphs, nodes, and edges
%% Cell type:markdown id: tags:
Attributes such as weights, labels, colors, or whatever Python object you like, can be attached to graphs, nodes, or edges.
Each graph, node, and edge can hold key/value attribute pairs in an associated attribute dictionary (the keys must be hashable). By default these are empty, but attributes can be added or changed using add_edge, add_node or direct manipulation of the attribute dictionaries reached through G.graph, G.nodes and G.edges for a graph G (NetworkX 2.0 removed the older G.edge).
%% Cell type:markdown id: tags:
# Graph attributes
%% Cell type:markdown id: tags:
Assign graph attributes when creating a new graph
%% Cell type:code id: tags:
``` python
G = nx.Graph(day="Friday")
G.graph
```
%% Cell type:markdown id: tags:
Or you can modify attributes later
%% Cell type:code id: tags:
``` python
G.graph['day']='Monday'
G.graph
```
%% Cell type:markdown id: tags:
# Node attributes
%% Cell type:markdown id: tags:
Add node attributes using add_node(), add_nodes_from() or G.nodes
%% Cell type:code id: tags:
``` python
G.add_node(1, time='5pm')
G.add_nodes_from([3], time='2pm')
G.nodes[1]
```
%% Cell type:code id: tags:
``` python
G.nodes[1]['room'] = 714
G.nodes(data=True)
```
%% Cell type:markdown id: tags:
Note that writing to the node attribute dictionary does not add a node to the graph; use G.add_node() to add new nodes.
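%% Cell type:markdown id: tags:
Once attributes are set, nx.get_node_attributes() collects one attribute across all nodes that have it. This is a small sketch on a separate graph (`attr_demo`, a name invented here) so the notebook's G is left untouched.
%% Cell type:code id: tags:
```python
import networkx as nx

attr_demo = nx.Graph()
attr_demo.add_node(1, time="5pm")
attr_demo.add_nodes_from([3], time="2pm")
attr_demo.nodes[1]["room"] = 714

# Dictionaries keyed by node; nodes without the attribute are left out.
times = nx.get_node_attributes(attr_demo, "time")
rooms = nx.get_node_attributes(attr_demo, "room")

print(times)
print(rooms)
```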
%% Cell type:markdown id: tags:
# Edge Attributes
%% Cell type:markdown id: tags:
Add edge attributes using add_edge(), add_edges_from(), subscript notation, or G.edges.
%% Cell type:code id: tags:
``` python
G.add_edge(1, 2, weight=4.7)
G.add_edges_from([(3,4),(4,5)], color='red')
G.add_edges_from([(1,2,{'color':'blue'}), (2,3,{'weight':8})])
G[1][2]['weight'] = 4.7
G.edges[1,2]['weight'] = 4  # NetworkX <2.0 used G.edge[1][2]['weight'] = 4
```
%% Cell type:code id: tags:
``` python
G.nodes()
```
%% Cell type:code id: tags:
``` python
G.edges()
```
%% Cell type:markdown id: tags:
The special attribute ‘weight’ should be numeric and holds values used by algorithms requiring weighted edges.
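%% Cell type:markdown id: tags:
As one example of an algorithm that reads the weight attribute, the degree of a node can be computed with or without edge weights. The graph `weighted_demo` and its weights are made up for this sketch.
%% Cell type:code id: tags:
```python
import networkx as nx

weighted_demo = nx.Graph()
weighted_demo.add_weighted_edges_from([(0, 1, 1.5), (0, 2, 4.5)])

# Without a weight, every edge counts as 1.
plain_degree = weighted_demo.degree(0)

# With weight="weight", the degree is the sum of the incident edge weights.
weighted_degree = weighted_demo.degree(0, weight="weight")

print(plain_degree, weighted_degree)
```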
%% Cell type:markdown id: tags:
# Simple Properties
%% Cell type:markdown id: tags:
Number of nodes : Return the number of nodes in the graph.
%% Cell type:code id: tags:
``` python
len(G)
G.order()
```
%% Cell type:code id: tags:
``` python
G.number_of_nodes()
```
%% Cell type:markdown id: tags:
Number of edges : Return the number of edges in the graph (or, if two nodes u and v are given, the number of edges between them).
%% Cell type:code id: tags:
``` python
G.number_of_edges()
```
%% Cell type:markdown id: tags:
Check node membership : Return True if the graph contains the node n.
%% Cell type:code id: tags:
``` python
G.has_node(1)
```
%% Cell type:markdown id: tags:
Check edge presence : Return True if the edge (u,v) is in the graph.
%% Cell type:code id: tags:
``` python
G.has_edge(0,1)
```
%% Cell type:markdown id: tags:
Degrees: Return the degree of a node or nodes.
The node degree is the number of edges adjacent to that node.
Parameters :
nbunch : iterable container, optional (default=all nodes)
A container of nodes. The container will be iterated through once.
weight : string or None, optional (default=None)
The edge attribute that holds the numerical value used as a weight. If None, then each edge has weight 1. The degree is the sum of the edge weights adjacent to the node.
Returns :
nd : DegreeView, or number
A DegreeView of (node, degree) pairs that also supports lookup by node, or a number if a single node is specified. (Before NetworkX 2.0 this was a dictionary.)
%% Cell type:code id: tags:
``` python
nx.add_path(G, [0,1,2,3])  # function form of the older G.add_path([0,1,2,3])
G.degree(0)
```
%% Cell type:code id: tags:
``` python
G.degree([0 ,1])
```
%% Cell type:code id: tags:
``` python
G.degree()
```
%% Cell type:code id: tags:
``` python
dict(G.degree()).values() # useful for degreedist; NetworkX <2.0 used G.degree().values()
```
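%% Cell type:markdown id: tags:
One possible way to turn the degree values into a degree distribution is sketched below on a separate path graph (`deg_demo`, a name chosen for this example). It counts how many nodes have each degree, once with collections.Counter and once with nx.degree_histogram(), whose list index is the degree.
%% Cell type:code id: tags:
```python
import collections
import networkx as nx

deg_demo = nx.path_graph(4)  # degrees: 1, 2, 2, 1

# Map each degree to the number of nodes that have it.
degree_counts = collections.Counter(d for _, d in deg_demo.degree())

# The same information as a list: histogram[k] is the number of nodes of degree k.
histogram = nx.degree_histogram(deg_demo)

print(dict(degree_counts))
print(histogram)
```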
%% Cell type:markdown id: tags:
# A Few Useful Functions
%% Cell type:markdown id: tags:
As subgraphs : Generate connected components as subgraphs.
Parameters:
G : NetworkX graph
An undirected graph.
copy: bool (default=True)
If True make a copy of the graph attributes
Returns:
comp : generator
A generator of graphs, one for each connected component of G.
%% Cell type:code id: tags:
``` python
nx.connected_component_subgraphs(G)
```
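%% Cell type:markdown id: tags:
connected_component_subgraphs() returns a generator and was removed in later NetworkX releases. A sketch that does not depend on it builds the subgraphs from nx.connected_components(), which yields one set of nodes per component. The graph `cc_demo` is invented for this example.
%% Cell type:code id: tags:
```python
import networkx as nx

cc_demo = nx.Graph([(0, 1), (1, 2), (3, 4)])

# One independent subgraph copy per connected component.
components = [cc_demo.subgraph(nodes).copy()
              for nodes in nx.connected_components(cc_demo)]

sizes = sorted(len(c) for c in components)
print(sizes)
```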
%% Cell type:markdown id: tags:
k-cores : Return the core number for each vertex.
A k-core is a maximal subgraph that contains nodes of degree k or more.
The core number of a node is the largest value k of a k-core containing
that node.
Parameters
----------
G : NetworkX graph
%% Cell type:code id: tags:
``` python
nx.core_number(G)  # find_cores was an alias for core_number
```
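%% Cell type:markdown id: tags:
A small worked case may help with the definition: in a triangle with one extra node hanging off it, the triangle nodes each have degree at least 2 within the triangle, so their core number is 2, while the pendant node only belongs to the 1-core. The graph `core_demo` is made up for this sketch.
%% Cell type:code id: tags:
```python
import networkx as nx

# Triangle 0-1-2 with a pendant node 3 attached to node 2.
core_demo = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3)])

# Dictionary mapping each node to its core number.
cores = nx.core_number(core_demo)
print(cores)
```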
%% Cell type:markdown id: tags:
shortest path : Compute shortest paths in the graph.
Parameters:
G : NetworkX graph
source : node, optional
Starting node for path. If not specified, compute shortest paths using all nodes as source nodes.
target : node, optional
Ending node for path. If not specified, compute shortest paths using all nodes as target nodes.
weight : None or string, optional (default = None)
If None, every edge has weight/distance/cost 1. If a string, use this edge attribute as the edge weight. Any edge attribute not present defaults to 1.
Returns:
path: list or dictionary
All returned paths include both the source and target in the path.
If the source and target are both specified, return a single list of nodes in a shortest path from the source to the target.
If only the source is specified, return a dictionary keyed by targets with a list of nodes in a shortest path from the source to one of the targets.
If only the target is specified, return a dictionary keyed by sources with a list of nodes in a shortest path from one of the sources to the target.
If neither the source nor target are specified return a dictionary of dictionaries with path[source][target]=[list of nodes in path].
%% Cell type:code id: tags:
``` python
nx.shortest_path(G,source=0,target=4)
```
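%% Cell type:markdown id: tags:
The effect of the weight and target parameters can be sketched on a small weighted triangle (`sp_demo`, with weights chosen for this example). The direct edge 0-2 is the shortest path by hop count, but the detour through node 1 is cheaper once weights are used.
%% Cell type:code id: tags:
```python
import networkx as nx

sp_demo = nx.Graph()
sp_demo.add_weighted_edges_from([(0, 1, 1.0), (1, 2, 1.0), (0, 2, 5.0)])

# Unweighted: every edge costs 1, so the direct edge wins.
fewest_hops = nx.shortest_path(sp_demo, source=0, target=2)

# Weighted: the path through node 1 costs 2.0, less than the direct 5.0.
cheapest = nx.shortest_path(sp_demo, source=0, target=2, weight="weight")

# Only the source given: one path per reachable target.
from_source = nx.shortest_path(sp_demo, source=0)

print(fewest_hops, cheapest)
```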
%% Cell type:code id: tags:
``` python
nx.betweenness_centrality(G)
```
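%% Cell type:markdown id: tags:
As a quick sanity check of betweenness centrality, consider a three-node path (`bc_demo`, a name invented here): the only shortest path between the two end nodes passes through the middle node, so with the default normalization the middle node scores 1 and the ends score 0.
%% Cell type:code id: tags:
```python
import networkx as nx

bc_demo = nx.path_graph(3)  # 0 - 1 - 2

# Dictionary mapping each node to its (normalized) betweenness centrality.
centrality = nx.betweenness_centrality(bc_demo)
print(centrality)
```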