Access provenance data in Alpaca PROV files

This class provides functionality to serialize/deserialize the provenance using an ontology based on the W3C Provenance Ontology (PROV-O). The Alpaca ontology is used to serialize the provenance information captured by Alpaca as RDF files.

class alpaca.AlpacaProvDocument

Generates a file using the Alpaca ontology (based on W3C PROV-O) from the history records captured during the execution of a Python script, or reads a serialized file into an RDF graph object.

Notes

For convenience, you can serialize the active history with save_provenance() (see Interface functions). Use this class only if you need to access the data as an RDF graph or to control the serialization manually.

Attributes:
graph : rdflib.Graph

Provenance data represented as an RDF graph, using the Alpaca ontology based on PROV-O.
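
Because graph is an rdflib.Graph, iterating over it yields (subject, predicate, object) triples. The following is a minimal sketch of how such a graph could be summarized. count_predicates is a hypothetical helper, not part of Alpaca, and the example triples are stand-ins that only assume the argument iterates as triples.

```python
from collections import Counter


def count_predicates(graph):
    """Count how often each predicate occurs in an iterable of triples.

    Hypothetical helper, not part of Alpaca. `graph` is assumed to yield
    (subject, predicate, object) triples, as an rdflib.Graph does.
    """
    return Counter(str(predicate) for _, predicate, _ in graph)


# Stand-in triples for illustration only; with Alpaca installed, one would
# pass the `graph` attribute of an AlpacaProvDocument instead.
example_triples = [
    ("urn:example:data1", "http://www.w3.org/ns/prov#wasGeneratedBy", "urn:example:run1"),
    ("urn:example:data2", "http://www.w3.org/ns/prov#wasGeneratedBy", "urn:example:run1"),
    ("urn:example:run1", "http://www.w3.org/ns/prov#used", "urn:example:data0"),
]
predicate_counts = count_predicates(example_triples)
```

Counting predicates is only one possible summary; the same loop pattern applies to filtering subjects or objects.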

add_history(script_info, session_id, history, show_progress=False)

Adds a history of FunctionExecution records captured by Alpaca to an RDF document using the Alpaca PROV ontology. The script is added as a ScriptAgent agent.

Parameters:
script_info : alpaca_types.File

Named tuple with the information on the script being tracked (hash and file path).

session_id : str

Unique identifier for this script execution.

history : list of FunctionExecution

Provenance history to be serialized as RDF using PROV.

show_progress : bool, optional

If True, show the progress of the provenance history serialization. Default: False

read_records(file_name, file_format='turtle')

Reads PROV data that was previously serialized as RDF.

Parameters:
file_name : str or Path-like

Location of the file with PROV data to be read.

file_format : {‘json-ld’, ‘n3’, ‘nt’, ‘turtle’, ‘xml’, ‘ttl’, ‘rdf’, ‘json’}

Format used to serialize the file that is being read. If None, the format will be inferred from the extension. The formats are the ones accepted by RDFLib. Some shortcuts are defined for common file extensions:

  • ‘ttl’: Turtle

  • ‘rdf’: RDF-XML

  • ‘json’: JSON-LD

Raises:
ValueError

If file_format is None and the format cannot be inferred from file_name (for example, because it has no extension), or if the format is invalid.
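
The format resolution described above can be sketched as follows. This is an illustration under stated assumptions, not Alpaca's actual implementation: infer_rdf_format, EXTENSION_SHORTCUTS, and READ_FORMATS are hypothetical names, and the mapping only mirrors the shortcuts and formats listed for file_format.

```python
from pathlib import Path

# Shortcut extensions listed above, mapped to RDFLib format names.
EXTENSION_SHORTCUTS = {"ttl": "turtle", "rdf": "xml", "json": "json-ld"}
# Format names listed for `file_format` once shortcuts are resolved.
READ_FORMATS = {"json-ld", "n3", "nt", "turtle", "xml"}


def infer_rdf_format(file_name, file_format=None):
    """Resolve the RDF format name (hypothetical helper, not Alpaca's code)."""
    if file_format is None:
        # Fall back to the file extension, without the leading dot.
        extension = Path(file_name).suffix.lstrip(".").lower()
        if not extension:
            raise ValueError(f"Cannot infer format: {file_name!r} has no extension")
        file_format = extension
    # Translate shortcuts such as 'ttl' into the full format name.
    file_format = EXTENSION_SHORTCUTS.get(file_format, file_format)
    if file_format not in READ_FORMATS:
        raise ValueError(f"Invalid or unsupported format: {file_format!r}")
    return file_format
```

In this sketch an explicit file_format takes precedence over the extension, so a file named with an unusual extension can still be read by naming its format.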

serialize(file_name, file_format='turtle')

Writes PROV data to a file or gets an in-memory string.

Parameters:
file_name : str or Path-like

Location of the file where the PROV data will be written.

file_format : {‘json-ld’, ‘n3’, ‘nt’, ‘hext’, ‘pretty-xml’, ‘trig’, ‘turtle’, ‘longturtle’, ‘xml’}

Format used in the file that is being written. The format strings are the ones supported by RDFLib. If None, the format will be inferred from the extension. Default: ‘turtle’
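
Combining read_records and serialize allows a stored PROV file to be converted to another RDF format. The following is a minimal sketch: convert_provenance is a hypothetical helper, not part of Alpaca, and it assumes only the two method signatures and the graph attribute documented above.

```python
def convert_provenance(document, source_file, target_file, target_format="json-ld"):
    """Read a PROV file and write it again in another RDF format.

    Hypothetical helper, not part of Alpaca. `document` is assumed to be an
    AlpacaProvDocument (or any object with the same `read_records`,
    `serialize`, and `graph` members).
    """
    # file_format=None asks read_records to infer the format from the extension.
    document.read_records(source_file, file_format=None)
    # Write the loaded graph using the requested target format.
    document.serialize(target_file, file_format=target_format)
    return document.graph
```

With Alpaca installed, a call could look like convert_provenance(AlpacaProvDocument(), "run.ttl", "run.jsonld"), where both file names are placeholders.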