*****************************************
Visualizing a provenance graph with Gephi
*****************************************

`Gephi <https://gephi.org>`_ is a free, open-source tool for visualizing and
exploring graphs and networks. It supports graphs stored in the GEXF format.

See the :ref:`visualization` instructions in the Alpaca `Installation` section
for how to download and set up Gephi.

Loading the GEXF file
---------------------

To visualize the graph, use the `Open` command:

1. Go to `File->Open...`;
2. Select the GEXF file that you saved with the **visualize_prov.py** script.

The graph is then loaded in the main screen of Gephi:

.. figure:: ../_static/images/gephi/screenshot_1.png

Adjusting the graph layout
--------------------------

The graph can be arranged using one of the layout algorithms. We recommend
`Force Atlas 2`:

1. Select `Force Atlas 2` from the drop-down list in the `Layout` panel on the
   bottom left;
2. Check the options `Prevent Overlap` and `Dissuade Hubs` in the
   `Behavior Alternatives` section;
3. Press `Run`.

After a few seconds, the graph will be arranged. Press `Stop`.

.. figure:: ../_static/images/gephi/screenshot_2.png

Adjusting the graph visualization
---------------------------------

You can zoom in and out using the mouse wheel. To drag the graph, press and
hold the right mouse button while moving the mouse.

.. figure:: ../_static/images/gephi/screenshot_3.png

You can tweak the graph appearance by displaying labels and by coloring the
nodes according to their type:

1. Enable labels using the button in the bottom toolbar;
2. Select a suitable font size using the slider.

.. figure:: ../_static/images/gephi/screenshot_4.png

Color the nodes using the partition function:

1. In the `Appearance` panel on the top left, select the `Partition` tab inside
   the `Nodes` panel.
2. From the drop-down list, select the `type` property, and click `Apply` at
   the bottom of the panel.

.. figure:: ../_static/images/gephi/screenshot_5.png

You will see the sequence of functions called (orange nodes) and the Python
objects (purple) or files (green) used or generated by each function.

.. figure:: ../_static/images/gephi/screenshot_6.png

Inspecting provenance information
---------------------------------

You can use the `Edit - Edit Node attributes` tool to display the properties of
each node in the `Edit` panel at the top left.

.. figure:: ../_static/images/gephi/screenshot_7.png

Understanding provenance
------------------------

Let's take one of the function calls in **run_basic.py**:

.. code-block:: Python

    isi_counts, isi_edges = isi_histogram(isi_times)

This is represented in the graph by the `isi_histogram` node:

.. figure:: ../_static/images/gephi/screenshot_8.png

Inspecting `isi_histogram`, we can see the parameters used (the function
defaults, since the call does not override them):

.. figure:: ../_static/images/gephi/screenshot_9.png

We can also inspect the two result objects:

1. The `ndarray` object contains the counts (i.e., an integer array with shape
   `(249,)`);

.. figure:: ../_static/images/gephi/screenshot_10.png

2. The `Quantity` object contains the histogram edges (i.e., a float array with
   shape `(250,)`).

.. figure:: ../_static/images/gephi/screenshot_11.png

Therefore, the generated histogram has 249 bins, delimited by 250 edges.
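
The relation between the number of edges and the number of bins can be checked
with a minimal sketch. The `histogram` helper and the integer edge values below
are hypothetical, written only for illustration; they are not the
`isi_histogram` implementation used in **run_basic.py**. The sketch only shows
that N + 1 edges delimit N bins, which is why an edges array of shape `(250,)`
goes together with a counts array of shape `(249,)`:

```python
# Hypothetical helper for illustration only; not part of Alpaca or run_basic.py.
def histogram(values, edges):
    """Count how many values fall between each pair of consecutive edges."""
    counts = [0] * (len(edges) - 1)  # one bin per pair of consecutive edges
    for value in values:
        for i in range(len(counts)):
            is_last_bin = i == len(counts) - 1
            # Bins are half-open [left, right); the last bin also includes
            # its right edge.
            in_bin = edges[i] <= value < edges[i + 1]
            on_final_edge = is_last_bin and value == edges[i + 1]
            if in_bin or on_final_edge:
                counts[i] += 1
                break
    return counts


# 250 made-up, evenly spaced edges: 0, 1, ..., 249
edges = list(range(250))
# Made-up sample values
counts = histogram([0.5, 1.5, 1.7, 249], edges)

print(len(edges), len(counts))
```

The two printed lengths differ by one, matching the shapes shown in the node
attributes of the two result objects.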