Access provenance data in Alpaca PROV files
This class provides functionality to serialize/deserialize the provenance using an ontology based on the W3C Provenance Ontology (PROV-O). The Alpaca ontology is used to serialize the provenance information captured by Alpaca as RDF files.
- class alpaca.AlpacaProvDocument
Generates a file using the Alpaca ontology (based on W3C PROV-O) from the history records captured during the execution of a Python script, or reads a serialized file into an RDF graph object.
Notes
For convenience, you can serialize the active history with the save_provenance() function described in Interface functions. Use this class only if you want to access the data as an RDF graph or to control the serialization manually.
- Attributes:
- graph : rdflib.Graph
Provenance data represented as an RDF graph, using the Alpaca ontology based on PROV-O.
- add_history(script_info, session_id, history, show_progress=False)
Adds a history of FunctionExecution records captured by Alpaca to an RDF document using the Alpaca PROV ontology. The script is added as a ScriptAgent agent.
- Parameters:
- script_info : alpaca_types.File
Named tuple with the information on the script being tracked (hash and file path).
- session_id : str
Unique identifier for this script execution.
- history : list of FunctionExecution
Provenance history to be serialized as RDF using PROV.
- show_progress : bool, optional
If True, show the progress of the provenance history serialization. Default: False
- read_records(file_name, file_format='turtle')
Reads PROV data that was previously serialized as RDF.
- Parameters:
- file_name : str or Path-like
Location of the file with PROV data to be read.
- file_format : {‘json-ld’, ‘n3’, ‘nt’, ‘turtle’, ‘xml’, ‘ttl’, ‘rdf’, ‘json’}
Format used to serialize the file that is being read. If None, the format will be inferred from the file extension. The formats are the ones accepted by RDFLib. Some shortcuts are defined for common file extensions. Default: ‘turtle’
‘ttl’: Turtle
‘rdf’: RDF-XML
‘json’: JSON-LD
- Raises:
- ValueError
If file_format is None and the format cannot be inferred from file_name (for example, because it has no extension), or if the format is invalid.
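The inference rule described for read_records can be sketched in plain Python. This is an illustration only, not Alpaca's actual implementation: the helper name `infer_format` and both lookup tables are hypothetical, and the mapping of ‘rdf’ to ‘xml’ assumes RDFLib's usual name for RDF-XML.

```python
from pathlib import Path

# Shortcuts listed above: file extension -> RDFLib format name.
# Hypothetical table; 'xml' is assumed to be the RDFLib name for RDF-XML.
EXTENSION_SHORTCUTS = {"ttl": "turtle", "rdf": "xml", "json": "json-ld"}

# Format names documented above as readable.
READ_FORMATS = {"json-ld", "n3", "nt", "turtle", "xml"}


def infer_format(file_name, file_format=None):
    """Return a format name, inferring it from the extension if needed."""
    if file_format is None:
        suffix = Path(file_name).suffix
        if not suffix:
            raise ValueError(
                f"Cannot infer the format: {file_name!r} has no extension")
        # Drop the leading dot and normalize the case.
        file_format = suffix[1:].lower()

    # Expand shortcuts such as 'ttl'; other names pass through unchanged.
    file_format = EXTENSION_SHORTCUTS.get(file_format, file_format)

    if file_format not in READ_FORMATS:
        raise ValueError(f"Invalid or unsupported format: {file_format!r}")
    return file_format
```

With this sketch, a name such as `prov.ttl` resolves to ‘turtle’, while a name without an extension and no explicit format raises ValueError, mirroring the documented behavior.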
- serialize(file_name, file_format='turtle')
Writes PROV data to a file or gets an in-memory string.
- Parameters:
- file_name : str or Path-like
Location of the file where the PROV data will be written.
- file_format : {‘json-ld’, ‘n3’, ‘nt’, ‘hext’, ‘pretty-xml’, ‘trig’, ‘turtle’, ‘longturtle’, ‘xml’}
Format used in the file that is being written. The format strings are the ones supported by RDFLib. If None, the format will be inferred from the extension. Default: ‘turtle’
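As a usage illustration, the following minimal sketch reads a Turtle file and writes it back as JSON-LD using only the methods documented above. The helper name `convert_prov` and the file paths are hypothetical, and the sketch assumes that AlpacaProvDocument can be constructed without arguments.

```python
def convert_prov(source, destination):
    """Read a Turtle PROV file and write it back as JSON-LD.

    Returns the number of triples in the loaded graph.
    """
    # Imported lazily so the function can be defined even where
    # Alpaca is not installed; calling it still requires Alpaca.
    from alpaca import AlpacaProvDocument

    # Assumption: the constructor takes no arguments.
    document = AlpacaProvDocument()
    document.read_records(source, file_format="turtle")

    # `graph` is an rdflib.Graph, so len() counts its triples.
    n_triples = len(document.graph)

    document.serialize(destination, file_format="json-ld")
    return n_triples
```

A call such as `convert_prov("run.ttl", "run.jsonld")` would then produce the converted file. Passing file_format explicitly avoids relying on extension inference for the output name.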