SNAP: Network analysis and graph mining library for massive networks

Stanford Network Analysis Package (SNAP) is a general-purpose network analysis and graph mining library that scales easily to massive networks, is efficient, and is easily extensible.

It naturally supports rich networks with complex data types associated with the nodes and edges of the network. SNAP was developed in the course of my PhD studies and was built on top of GLib, a general-purpose STL (Standard Template Library)-like library that was developed at the Jozef Stefan Institute.

SNAP distinguishes between graphs and networks. Graphs describe raw topologies, that is, nodes with unique integer ids and directed, undirected, or multiple edges between them. Networks are graphs with data on the nodes and/or edges. The data types that reside on nodes and edges are simply passed as template parameters, which provides a fast and convenient way to implement various kinds of networks with rich data on nodes and edges.
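The graph/network distinction can be sketched in a few lines of C++. This is only an illustration of the idea of passing node and edge data types as template parameters; the class names SimpleGraph and SimpleNetwork and all of their methods are hypothetical and are not SNAP's actual API.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical illustration only; these are NOT SNAP's classes.
// A graph stores raw topology: integer node ids and directed edges.
class SimpleGraph {
public:
    void AddNode(int id) { adj_[id]; }  // creates an empty adjacency set if absent
    void AddEdge(int src, int dst) {
        AddNode(src);
        AddNode(dst);
        adj_[src].insert(dst);
    }
    bool IsNode(int id) const { return adj_.count(id) > 0; }
    bool IsEdge(int src, int dst) const {
        auto it = adj_.find(src);
        return it != adj_.end() && it->second.count(dst) > 0;
    }
    int Nodes() const { return static_cast<int>(adj_.size()); }

private:
    std::map<int, std::set<int>> adj_;
};

// A network is a graph plus data. The data types carried by nodes and
// edges are supplied as template parameters.
template <typename NodeData, typename EdgeData>
class SimpleNetwork : public SimpleGraph {
public:
    void AddNode(int id, const NodeData& data) {
        SimpleGraph::AddNode(id);
        node_data_[id] = data;
    }
    void AddEdge(int src, int dst, const EdgeData& data) {
        // In this sketch both endpoints must already exist (with data).
        assert(IsNode(src) && IsNode(dst));
        SimpleGraph::AddEdge(src, dst);
        edge_data_[std::make_pair(src, dst)] = data;
    }
    const NodeData& GetNodeData(int id) const { return node_data_.at(id); }
    const EdgeData& GetEdgeData(int src, int dst) const {
        return edge_data_.at(std::make_pair(src, dst));
    }

private:
    std::map<int, NodeData> node_data_;
    std::map<std::pair<int, int>, EdgeData> edge_data_;
};
```

With these definitions, SimpleNetwork<std::string, double> would be a network whose nodes carry a name and whose edges carry a weight. Changing the kind of data only requires changing the template arguments.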

SNAP is based on node and edge iterators, which allow efficient implementation of algorithms that work on networks regardless of their type (directed, undirected, graphs, networks) and specific implementation.
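To make the iterator idea concrete, here is a minimal sketch in which one templated algorithm runs unchanged on two graph types with different internal storage. The types DirectedGraph, UndirectedGraph and NodeView and the function MaxDegreeNode are assumptions made up for this example; SNAP's real iterator interface is described in its documentation and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical types for illustration; NOT SNAP's actual API.
// What a node iterator yields: the node id and its (out-)degree.
struct NodeView {
    int id;
    std::size_t deg;
};

// Directed graph stored as a map from node id to out-neighbours.
class DirectedGraph {
    using Map = std::map<int, std::vector<int>>;

public:
    void AddEdge(int src, int dst) {
        out_[src].push_back(dst);
        (void)out_[dst];  // make sure the destination node exists
    }
    class NodeIter {
    public:
        explicit NodeIter(Map::const_iterator it) : it_(it) {}
        NodeView operator*() const { return {it_->first, it_->second.size()}; }
        NodeIter& operator++() { ++it_; return *this; }
        bool operator!=(const NodeIter& o) const { return it_ != o.it_; }
    private:
        Map::const_iterator it_;
    };
    NodeIter begin() const { return NodeIter(out_.begin()); }
    NodeIter end() const { return NodeIter(out_.end()); }

private:
    Map out_;
};

// Undirected graph stored as a dense adjacency vector (ids 0..n-1).
class UndirectedGraph {
    using Adj = std::vector<std::vector<int>>;

public:
    explicit UndirectedGraph(std::size_t n) : adj_(n) {}
    void AddEdge(int a, int b) {
        assert(a >= 0 && b >= 0);
        const std::size_t ua = static_cast<std::size_t>(a);
        const std::size_t ub = static_cast<std::size_t>(b);
        assert(ua < adj_.size() && ub < adj_.size());
        adj_[ua].push_back(b);
        if (a != b) adj_[ub].push_back(a);
    }
    class NodeIter {
    public:
        NodeIter(const Adj* adj, std::size_t i) : adj_(adj), i_(i) {}
        NodeView operator*() const {
            return {static_cast<int>(i_), (*adj_)[i_].size()};
        }
        NodeIter& operator++() { ++i_; return *this; }
        bool operator!=(const NodeIter& o) const { return i_ != o.i_; }
    private:
        const Adj* adj_;
        std::size_t i_;
    };
    NodeIter begin() const { return NodeIter(&adj_, 0); }
    NodeIter end() const { return NodeIter(&adj_, adj_.size()); }

private:
    Adj adj_;
};

// The algorithm touches the graph only through its node iterators, so it
// works for any type that provides them. Returns -1 for an empty graph;
// on ties the first node encountered wins.
template <typename Graph>
int MaxDegreeNode(const Graph& g) {
    int best_id = -1;
    std::size_t best_deg = 0;
    for (NodeView n : g) {
        if (best_id == -1 || n.deg > best_deg) {
            best_id = n.id;
            best_deg = n.deg;
        }
    }
    return best_id;
}
```

The design point is that MaxDegreeNode never sees how either graph stores its edges. Adding a third graph type only requires giving it a compatible node iterator.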

For more information on SNAP, refer to the documentation.


476 Million Twitter tweets

The 476 million Twitter tweets cover a 7-month period from June 2009 to December 31, 2009. The SNAP team estimated that this was about 20-30% of all public tweets published on Twitter during that time frame.

Take a look at them here.


MemeTracker is an approach for extracting short textual phrases from web documents (news articles and blog posts) and then tracking how such phrases spread over the Web and how they change and evolve as they spread.

Microsoft Instant Messenger network dataset

The dataset contains summary properties of 30 billion conversations among 240 million people. From the data, SNAP constructed a communication graph with 180 million nodes and 1.3 billion undirected edges, creating the largest social network constructed and analyzed to date.

Read full report here

Some papers published by SNAP:

Supervised Random Walks: Predicting and Recommending Links in Social Networks
L. Backstrom, J. Leskovec.
ACM International Conference on Web Search and Data Mining (WSDM), 2011.

Correcting for Missing Data in Information Cascades
E. Sadikov, M. Medina, J. Leskovec, H. Garcia-Molina.
ACM International Conference on Web Search and Data Mining (WSDM), 2011.

Patterns of Temporal Variation in Online Media
J. Yang, J. Leskovec.
ACM International Conference on Web Search and Data Mining (WSDM), 2011.

The latest news from SNAP:

  • Aug 21 2011: Tutorial on Social Media Analytics. More info.
  • Apr 17 2011: SNAP now computes k-core decomposition and analyzes information cascades.
  • Oct 21 2010: New datasets: 476 million tweets, memetracker data, Amazon product data.
  • Oct 1 2010: SNAP now also computes network motifs and overlapping network communities.
  • Jul 10 2010: SNAP networks are now part of the University of Florida Sparse Matrix collection.
  • April 25 2010: SNAP is now available through NodeXL, a graphical front-end that integrates network analysis and SNAP into Microsoft Office and Excel. Using NodeXL, users without programming skills can make use of key elements of the SNAP library.
  • November 25 2009: more datasets added, new version of the library.
  • June 28 2009: website launched with 36 network datasets and SNAP network analysis library.

Go to their website