Installation

The easiest way to install Alpaca is to create a conda environment and then install the package into it with pip. Both steps are described below.

Prerequisites

Alpaca requires Python 3.8 or higher.

  1. Create your conda environment (e.g., alpaca):

    conda create --name alpaca python=3.11
    
  2. Activate your environment:

    conda activate alpaca
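
Once the environment is active, the Python requirement stated above can be checked from within the interpreter. The following is a minimal sketch using only the standard library; the helper name python_is_supported and the constant MIN_PYTHON are illustrative and not part of Alpaca.

```python
import sys

# Minimum version taken from the prerequisite stated above (Python 3.8).
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`.

    Hypothetical helper for illustration; not provided by Alpaca.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {current} meets the minimum requirement.")
    else:
        print(f"Python {current} is older than the required 3.8.")
```

Running this inside the activated environment should report the version selected in step 1 (3.11 in the example command).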
    

Installation

With the environment active, install Alpaca using pip:

    pip install alpaca-prov

To upgrade to a newer release, use the --upgrade flag:

    pip install --upgrade alpaca-prov

If you do not have permission to install software system-wide, you can install into your user directory with the --user flag:

    pip install --user alpaca-prov

External tools for provenance visualization

To visualize the provenance data saved by Alpaca, graph visualization software is needed. Currently, any application that supports the GEXF or GraphML format can be used.

Gephi is the recommended tool:

  1. Download the Gephi installation bundle for your system from the Gephi website.

  2. Follow the installation instructions for your system provided on the Gephi website.

Dependencies

Alpaca relies on the following packages (automatically installed when you run pip install alpaca-prov):

  • rdflib - working with RDF

  • networkx - representation and manipulation of graphs in Python

  • numpy - fast arrays for scientific computing

  • joblib - tools for pipelining in Python, including hashing

  • dill - extension to Python’s pickle module for serializing and de-serializing objects
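
To confirm that the packages listed above are importable in the active environment, a short check such as the following can be used. It is a minimal sketch built on the standard library only; the helper name missing_packages is illustrative and not part of Alpaca, and the check assumes each package is imported under the same name used in the list.

```python
from importlib.util import find_spec

# Import names of the dependencies listed above (assumed to match the
# package names given in the list).
DEPENDENCIES = ("rdflib", "networkx", "numpy", "joblib", "dill")


def missing_packages(names=DEPENDENCIES):
    """Return the names from `names` that cannot be found for import.

    Hypothetical helper for illustration; not provided by Alpaca.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

If any package is reported as missing, re-running pip install alpaca-prov in the same environment is the first thing to try.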