Visualizing a provenance graph with Gephi

Gephi is a free, open-source tool for visualizing and exploring graphs and networks. It can open graphs stored in the GEXF format.

To download and set up Gephi, follow the instructions in the External tools for provenance visualization part of the Alpaca Installation section.

Loading the GEXF file

To visualize the graph, use the Open command:

  1. Go to File->Open…;

  2. Select the GEXF file that you saved with the visualize_prov.py script.

You will end up with the graph loaded in the main screen of Gephi:

../_images/screenshot_1.png
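
If the graph shown in Gephi looks unexpectedly small or empty, it can help to check the GEXF file outside Gephi. The sketch below is only an illustration: it parses GEXF text with the Python standard library and counts the nodes and edges. The `SAMPLE_GEXF` content and the helper names are hypothetical stand-ins, not output of visualize_prov.py; for a real file, you would read its text from disk first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written GEXF content standing in for the file saved by
# visualize_prov.py. A real file contains many more nodes and attributes.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph mode="static" defaultedgetype="directed">
    <nodes>
      <node id="0" label="isi_times"/>
      <node id="1" label="isi_histogram"/>
      <node id="2" label="isi_counts"/>
    </nodes>
    <edges>
      <edge id="0" source="0" target="1"/>
      <edge id="1" source="1" target="2"/>
    </edges>
  </graph>
</gexf>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def count_nodes_and_edges(gexf_text):
    """Return (number of nodes, number of edges) found in GEXF text."""
    root = ET.fromstring(gexf_text)
    n_nodes = 0
    n_edges = 0
    # Walk every element so the check does not depend on the GEXF version
    # (and therefore on the exact XML namespace) used by the writer.
    for element in root.iter():
        name = local_name(element.tag)
        if name == "node":
            n_nodes += 1
        elif name == "edge":
            n_edges += 1
    return n_nodes, n_edges


n_nodes, n_edges = count_nodes_and_edges(SAMPLE_GEXF)
print(f"{n_nodes} nodes, {n_edges} edges")
```

The counts can be compared with the node and edge totals that Gephi reports after loading the file.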

Adjusting the graph layout

The graph can be arranged using one of Gephi's layout algorithms. We recommend Force Atlas 2:

  1. Select Force Atlas 2 from the drop-down list in the Layout panel at the bottom left;

  2. Check the options Prevent Overlap and Dissuade Hubs in the Behavior Alternatives section;

  3. Press Run. After a few seconds, once the nodes have settled into place, press Stop.

../_images/screenshot_2.png

Adjusting the graph visualization

You can zoom in/out using the mouse wheel. You can drag the graph by pressing and holding the right button while moving the mouse.

../_images/screenshot_3.png

You can tweak the graph appearance by displaying labels and using different colors for the nodes, according to their type:

  1. Enable labels using the button in the bottom toolbar;

  2. Adjust the font size with the slider until the labels are readable.

../_images/screenshot_4.png

Color the nodes using the partition function:

  1. In the Appearance panel at the top left, select the Partition tab inside the Nodes panel.

  2. From the drop-down list, select the type property, and click Apply at the bottom of the panel.

../_images/screenshot_5.png

You will see the sequence of functions called (orange nodes) and the Python objects (purple) or files (green) used/generated by each function.

../_images/screenshot_6.png
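
The partition by type can also be approximated outside Gephi by counting how many nodes carry each value of the type attribute. The following is a minimal sketch using only the Python standard library. The sample GEXF text, the type values `function`, `object` and `file`, the node labels, and the helper names are all assumptions made for illustration; the values in a real file written by visualize_prov.py may differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical GEXF content. The attribute values ("function", "object",
# "file") and the node labels are assumptions chosen for this sketch.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph mode="static" defaultedgetype="directed">
    <attributes class="node" mode="static">
      <attribute id="0" title="type" type="string"/>
    </attributes>
    <nodes>
      <node id="0" label="isi_histogram">
        <attvalues><attvalue for="0" value="function"/></attvalues>
      </node>
      <node id="1" label="isi_times">
        <attvalues><attvalue for="0" value="object"/></attvalues>
      </node>
      <node id="2" label="isi_counts">
        <attvalues><attvalue for="0" value="object"/></attvalues>
      </node>
      <node id="3" label="isi_edges">
        <attvalues><attvalue for="0" value="object"/></attvalues>
      </node>
      <node id="4" label="input_file">
        <attvalues><attvalue for="0" value="file"/></attvalues>
      </node>
    </nodes>
  </graph>
</gexf>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def count_node_types(gexf_text, attribute_title="type"):
    """Count nodes per value of the node attribute named attribute_title."""
    root = ET.fromstring(gexf_text)

    # GEXF declares each node attribute once and refers to it by id, so
    # first look up the id declared for the requested title.
    attribute_id = None
    for element in root.iter():
        if local_name(element.tag) == "attributes" and element.get("class") == "node":
            for declaration in element:
                if declaration.get("title") == attribute_title:
                    attribute_id = declaration.get("id")

    counts = Counter()
    if attribute_id is None:
        return counts  # the file declares no such node attribute

    # Then collect the value stored under that id for every node.
    for element in root.iter():
        if local_name(element.tag) != "node":
            continue
        for child in element.iter():
            if local_name(child.tag) == "attvalue" and child.get("for") == attribute_id:
                counts[child.get("value")] += 1
    return counts


type_counts = count_node_types(SAMPLE_GEXF)
for node_type, count in sorted(type_counts.items()):
    print(f"{node_type}: {count}")
```

Each distinct value printed here corresponds to one color group in the Partition tab.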

Inspecting provenance information

You can use the Edit - Edit Node attributes tool to display the properties of each node in the Edit panel at the top left.

../_images/screenshot_7.png

Understanding provenance

Let’s take one of the function calls in run_basic.py:

isi_counts, isi_edges = isi_histogram(isi_times)

This is represented in the graph by the isi_histogram node:

../_images/screenshot_8.png

Inspecting the isi_histogram node, we can see the parameters used (in this call, the default values):

../_images/screenshot_9.png

We can also inspect the two result objects:

1. The ndarray object contains the counts (i.e., an integer array with shape (249,));

../_images/screenshot_10.png

2. The Quantity object contains the histogram edges (float array with shape (250,)).

../_images/screenshot_11.png

Since 250 edges delimit 249 intervals, the generated histogram has 249 bins, which matches the 249 values in the counts array.
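
The relation between edges and bins can be illustrated with a small, self-contained sketch. The `histogram` function below is a hypothetical pure-Python stand-in written for this example; it is not the isi_histogram function from run_basic.py, which may be implemented differently, and the input values are made up.

```python
def histogram(values, n_bins):
    """Count values into n_bins equal-width bins; return (counts, edges)."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    # n_bins intervals need n_bins + 1 boundaries (edges).
    edges = [low + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for value in values:
        # The maximum value falls on the last edge; keep it in the last bin.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return counts, edges


# Hypothetical data: 1000 evenly spaced values.
values = [i * 0.01 for i in range(1000)]
counts, edges = histogram(values, 249)
print(len(counts), len(edges))  # 249 250
```

The number of edges is always one more than the number of counts, which is the same relation seen between the shapes (250,) and (249,) of the two result objects.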