Modeling networks as graphs in C++

After playing around with some graph algorithm code as implemented by Sedgewick and others, I have come to the conclusion that while these are efficient and concise, improvements could be made to their usability.

After studying Sedgewick’s books and online material I have concluded that following his material is no easy task. Many times things seem missing or are left unexplained, leaving it to the user to fill in the gaps. A bit like my college professors of old, in fact. Nothing wrong with this, though there is every possibility that the user (me) will misunderstand and/or misinterpret the purpose of the function. I would like to offer what I consider to be a simpler data structure for modeling graphs and networks, that retains an efficiency similar to Sedgewick’s.

There is also the Boost Graph library, though I consider these to be difficult to understand and difficult to use, often presenting the user with virtually indecipherable compiler errors.

Graph modelling via Sedgewick

This example implementation of the Graph class I have taken from Sedgewick’s C++ code that is available online, specifically, the Graph class seen here implements the DenseGRAPH class found at that link. There is also an alternative SparseMultiGRAPH class dealing with sparse networks which uses an adjacency list as a means of recording the topology.

The code below is more or less the same as the DenseGRAPH implementation at Sedgewick’s site, bar some necessary includes and tweaks for better readability. Sedgewick’s Graph classes typically track the number of edges and vertices, have methods for inserting and removing edges, and use an adjacency matrix or adjacency list to track of the interconnections between vertices:

#include <vector>
#include <iostream>

using namespace std;

struct Edge 
{
	int v;
	int w;
    Edge(int v = -1, int w = -1) : 
		v(v), w(w) { }
};

class Graph
{ 
	int Vcnt;
	int Ecnt; 
	bool digraph; 
	vector <vector <bool> > adj;

public:
  
	Graph(int V, bool digraph = false) :
		adj(V), Vcnt(V), Ecnt(0), digraph(digraph)
		{ 
		  for (int i = 0; i < V; i++) 
			adj[i].assign(V, false);
		}

	int V() const { return Vcnt; }
	int E() const { return Ecnt; }
	bool directed() const { return digraph; }
  
	void insert(Edge e)
    { 
		int v = e.v, w = e.w;
		if (adj[v][w] == false) Ecnt++;
		adj[v][w] = true;
		if (!digraph) adj[w][v] = true; 
    } 

	void remove(Edge e)
    {
		int v = e.v, w = e.w;      
		if (adj[v][w] == true) Ecnt--;
		adj[v][w] = false;
		if (!digraph) adj[w][v] = false; 
    } 
  
	bool edge(int v, int w) const 
    { 
		return adj[v][w]; 
	}	

  
	class adjIterator;
	friend class adjIterator;
};

class Graph::adjIterator
{
	const Graph &G;
	int i, v;

public:
  
	adjIterator(const Graph &G, int v) : 
		G(G), v(v), i(-1) { }
	int beg()
    { 
		i = -1; return nxt();
	}
 
	int nxt()
    {
		for (i++; i < G.V(); i++)
			if (G.adj[v][i] == true) return i;
		 return -1;
    }
	
	bool end()
    { 
		return i >= G.V(); 
	}
};

It is a straightforward matter to create representations of network topologies by creating an instance of Graph and adding sets of edges:

#include "Graph.h"

int main()
{
	Graph* G = new Graph( 5, false );

	G->insert( Edge( 0, 1 ) );
	G->insert( Edge( 1, 2 ) );
	G->insert( Edge( 2, 3 ) );
	G->insert( Edge( 3, 4 ) );
	G->insert( Edge( 0, 4 ) );

	return 0;
}

Problems with Sedgewick’s C++ code

I do however find other aspects of his code tricky if not downright obfuscatory, particularly the graph algorithms. Consider the example for finding minimal spanning trees:

template <class Graph, class Edge> 
class MST
{ 
	const Graph &G;
	vector<double> wt;
	vector<Edge *> fr, mst;

public:
  MST(const Graph &G) : 
	  G(G), mst(G.V()), wt(G.V(), G.V()), fr(G.V())
	{ 
		int min = -1; 

		for (int v = 0; min != 0; v = min) 
		{ 
			min = 0;
			for (int w = 1; w < G.V(); w++) 
				if (mst[w] == 0) 
				{ 
					double P; 
                    Edge* e = G.edge(v, w);
					if (e)
					if ((P = e->wt()) < wt[w])
					{ 
						wt[w] = P; 
						fr[w] = e; 
					}

					if (wt[w] < wt[min]) min = w;
				}
			if (min) mst[min] = fr[min];
		}
	}

	void show() 
	{ 
		for (int v = 1; v < G.V(); v++) 
		if (mst[v])
			mst[v]->show();
	} 
};

Where is the Graph method to return an Edge pointer, for example?

Edge* e = G.edge(v, w);

In this case, the edge method returns a bool, not an Edge pointer, so this is probably a mistake or an omission. The onus would be on the user to come up with a fix, though this seems less than straightfoward to figure this out, given the scant information.

There is another C++ example for finding minimal spanning trees:

template <class Graph, class Edge, class EdgePtr> 
class MST
{ 
	const Graph &G;
	vector<EdgePtr> a, mst;
	UF uf;

public:

	MST(Graph &G) : G(G), uf(G.V()), mst(G.V())
	{ 
		int V = G.V(), E = G.E();
		a = edges<Graph, Edge, EdgePtr>(G);
		sort<EdgePtr>(a, 0, E-1);
		for (int i = 0, k = 1; i < E && k < V; i++)
			if (!uf.find(a[i]->v, a[i]->w)) 
			{ 
				uf.unite(a[i]->v, a[i]->w);
				mst[k++] = a[i]; 
			}
	}
};

But this time no mention of what EdgePtr is supposed to represent. After spending some time wrestling with these and other sources of confusion, I gave up. His C code seems to make far more sense. Sedgewick seems to miss this inescapable point: code is worthless if it can’t be easily used and maintained.

My old college tutor’s sermonizing springs to mind, sage advice garnered no doubt from years of supervising student projects: “If you were to drop dead, would somebody else be able to read this and carry on where you left off?”.

The area where I would like to digress is that of modelling the interconnections between network nodes. For dense, highly-connected networks this has been traditionally achieved by means of an adjacency matrix – a two dimensional array representing the presence or absence of links (edges) between nodes (vertices). Or for more sparsely connected networks the alternative are adjacency lists – arrays of linked lists representing the set of nodes connected to each node.

An alternative Graph class

As an alternative to using adjacency matrices, adjacency lists etc I would like to propose the use of an STL map instead, using it to create a one-to-one mapping between a node pair, and the weight value of the edge connecting that node pair, and can be applied to directed or non-directed graphs:

#include <map>

typedef std::map< std::pair<int,int>, double>::const_iterator clm_it;

class Graph
{

private:
    int nNodes;
    int nEdges;
    bool isDirected;

    std::map< std::pair<int,int>, double > link_map;

public:
    Graph( const int& nodes, const bool& directed );
    ~Graph();

    int V() const;
    int E() const;
    bool directed() const;
    void insert( const int& s, const int& t, const double& weight ); 
    void remove( const int& s, const int& t );
    bool edge( const int& s, const int& t);

    clm_it getEdge( const int& s, const int& t);    
    void show() const;    
};

For ease of presentation I have kept the method names consistent with Sedgewick’s Graph class as far as possible. The main difference is the use of the std::map to provide a reasonably fast means of looking up interconnections between nodes, while encapsulating edge weight information within the same structure:

std::map< std::pair<int,int>, double > link_map;

The method implementations are as follows:

#include <iostream>
#include <algorithm>
#include "Graph.h"

Graph::Graph( const int& nodes, const bool& directed )
{
    nNodes = nodes;
    nEdges = 0;
    isDirected = directed;
}

Graph::~Graph()
{}

clm_it Graph::getEdge( const int& s, const int&t )
{
  std::pair<int,int> p( s, t );
  clm_it mit = link_map.find( p );

  return mit;
}

// Check if link exists between end nodes
bool Graph::edge( const int& s, const int& t)
{
  return getEdge( s, t ) != link_map.end();
}

// Add a link: map node pair with weight
void Graph::insert( const int& s, const int& t, const double& weight )
{
    std::pair<int,int> p1( s, t );
    std::pair<int,int> p2( t, s );
    link_map[ p1 ] = weight;

    if ( !isDirected )
    {        
        link_map[ p2 ] = weight;
    }

    nEdges++;
}

// Remove selected link - remove both directions
// if graph is not directed
void Graph::remove( const int& s, const int& t )
{
    std::pair<int,int> p1( s, t );
    std::map< std::pair<int,int>, double>::iterator mit = link_map.find( p1 );
    link_map.erase( mit );

    if ( !isDirected )
    {
	std::pair<int,int> p2( t, s );
        mit = link_map.find( p2 );
        link_map.erase( mit );
    }

    nEdges--;
}

// Print the network topology
void Graph::show() const
{
    clm_it mit = link_map.begin();

    for ( ; mit != link_map.end(); ++ mit )
    {
      std::pair<int,int> p = (*mit).first;
      std::cout << p.first  << " " <<
	           p.second << " " << 
	           (*mit).second << std::endl;
    }
}

// Return true if graph is directed; false otherwose
bool Graph::directed() const
{
    return isDirected;
}

// Return number of network nodes
int Graph::V() const
{
    return nNodes;
}

// Return number of edges
int Graph::E() const
{
    return nEdges;
}

And example usage below:

#include <iostream>
#include "Graph.h"

int main()
{
    Graph G( 5, true );
    G.insert( 0, 1, 2.5 );
    G.insert( 1, 2, 1.1 );
    G.insert( 2, 3, 1.2 );
    G.insert( 3, 4, 3.4 );
    G.insert( 0, 4, 1.5 );
    G.insert( 4, 3, 1.5 );
    G.show();

    std::cout << "removing 3-4" << std::endl;
    G.remove( 3, 4 );
    G.show();

    std::cout << "No. edges = " << G.E() << std::endl;

    std::cout << "adding 3-4" << std::endl;
    G.insert( 3, 4, 3.4 );
    G.show();

    std::cout << "No. edges = " << G.E() << std::endl;
    
    return 0;  
}