Title: | Compare Provenance Collections to Explain Changed Script Outputs |
---|---|
Description: | Inspects provenance collected by the 'rdt' or 'rdtLite' packages, or by other tools that produce compatible PROV JSON output from the execution of a script, and finds differences between two provenance collections. Factors under examination include the hardware and software used to execute the script, versions of attached libraries, use of global variables, modified inputs and outputs, and changes in main and sourced scripts. Based on detected changes, 'provExplainR' can be used to study how these factors affect the behavior of the script and generate a promising diagnosis of the causes of different script results. More information about 'rdtLite' and associated tools is available at <https://github.com/End-to-end-provenance/> and Barbara Lerner, Emery Boose, and Luis Perez (2018), Using Introspection to Collect Provenance in R, Informatics, <doi:10.3390/informatics5010012>. |
Authors: | Barbara Lerner [cre], Emery Boose [aut], Khanh Ngo [aut] |
Maintainer: | Barbara Lerner <[email protected]> |
License: | GPL-3 | file LICENSE |
Version: | 1.1.1 |
Built: | 2024-11-02 10:15:15 UTC |
Source: | https://github.com/end-to-end-provenance/provexplainr |
prov.explain reads two provenance collections and finds differences between these two versions.
prov.diff.script visualizes the differences between two versions of a script that were previously executed.
prov.explain(dir1, dir2, save = FALSE)
prov.diff.script(dir1, dir2, first.script = NULL, second.script = NULL)
dir1 | path to the first provenance directory |
dir2 | path to the second provenance directory |
save | if TRUE, saves the report to the file prov-explain.txt in the first directory |
first.script | name of the first script. If no value is passed in, the main script is used |
second.script | name of the second script. If both first.script and second.script are NULL, the main script from the second directory is used. If second.script is NULL but first.script is not, the first script name is used. |
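The defaulting rules for first.script and second.script can be easier to follow as code. The sketch below is a hypothetical helper, not part of provExplainR; it only mirrors the rules stated in the argument descriptions. The parameters main1 and main2 are assumptions standing in for the main script names recorded in the first and second provenance directories.

```r
# Hypothetical helper, not part of provExplainR: picks the two script names
# to compare, following the documented defaults for first.script and
# second.script. main1 and main2 are assumed to be the main script names
# recorded in the first and second provenance directories.
resolve_script_names <- function(first.script = NULL, second.script = NULL,
                                 main1, main2) {
  if (is.null(first.script) && is.null(second.script)) {
    # Neither name given: compare the main script of each directory.
    return(list(first = main1, second = main2))
  }
  if (is.null(second.script)) {
    # Only the first name given: look for the same name in both directories.
    return(list(first = first.script, second = first.script))
  }
  # Second name given: the first falls back to the main script if omitted.
  first <- if (is.null(first.script)) main1 else first.script
  list(first = first, second = second.script)
}

# Illustrative script names only.
resolve_script_names(main1 = "analysis.R", main2 = "analysis_v2.R")
resolve_script_names(first.script = "helpers.R",
                     main1 = "analysis.R", main2 = "analysis_v2.R")
```

How the package itself resolves these names internally is not described here; the helper only restates the documented behavior.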
prov.explain and prov.diff.script are intended to help a user determine what has changed if multiple executions of a script lead to different results. prov.explain does this by comparing provenance collected using the rdtLite or rdt packages. prov.diff.script compares copies of the R scripts saved in provenance directories at the time that the scripts were executed.
The types of differences that prov.explain can find include:
Environmental information identifying when the scripts were executed, the version of R, the computing systems, the tool and version used to collect the provenance, the location of the provenance file, and the hash algorithm used to hash data files.
Versions of libraries loaded
Versions of provenance tools
Contents and names of main and sourced scripts
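To illustrate one item in the list above, the following sketch compares library versions from two executions using base R only. It is not how prov.explain is implemented; the function compare_library_versions, the data frame layout (columns name and version), and the sample package versions are all assumptions made for illustration.

```r
# Hypothetical illustration, not provExplainR's implementation: report
# libraries whose versions differ between two executions. Each input is
# assumed to be a data frame with character columns `name` and `version`.
compare_library_versions <- function(libs1, libs2) {
  both <- merge(libs1, libs2, by = "name", all = TRUE,
                suffixes = c(".first", ".second"))
  both$status <- ifelse(
    is.na(both$version.first), "added",
    ifelse(is.na(both$version.second), "removed",
           ifelse(both$version.first != both$version.second,
                  "changed", "same"))
  )
  changes <- both[both$status != "same", ]
  changes[order(changes$name), ]
}

# Made-up sample data for the two executions.
libs1 <- data.frame(name = c("ggplot2", "dplyr", "stringr"),
                    version = c("3.3.0", "1.0.0", "1.4.0"),
                    stringsAsFactors = FALSE)
libs2 <- data.frame(name = c("ggplot2", "dplyr", "tidyr"),
                    version = c("3.3.0", "1.0.2", "1.1.0"),
                    stringsAsFactors = FALSE)

print(compare_library_versions(libs1, libs2))
```

A full outer merge keeps libraries that appear in only one execution, so additions and removals are reported alongside version changes.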
prov.diff.script compares two versions of a script. Users must specify the name of the first script, the provenance directory associated with the first execution of the script, and the provenance directory associated with the second execution of the script. The name of the second script is optional. If it is omitted, the same script name is looked for in the second provenance directory.
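A call to prov.diff.script with an explicit first script name might look like the sketch below. The wrapper diff_scripts_if_available is a hypothetical convenience, not part of provExplainR, and the directory and script names are placeholders. It only calls prov.diff.script when both directories exist and the package is installed.

```r
# Hypothetical wrapper, not part of provExplainR: calls prov.diff.script only
# when both provenance directories exist and the package is installed.
diff_scripts_if_available <- function(dir1, dir2,
                                      first.script = NULL,
                                      second.script = NULL) {
  ready <- dir.exists(dir1) && dir.exists(dir2) &&
    requireNamespace("provExplainR", quietly = TRUE)
  if (!ready) {
    message("Skipping: provenance directories or provExplainR not available.")
    return(invisible(FALSE))
  }
  provExplainR::prov.diff.script(dir1, dir2,
                                 first.script = first.script,
                                 second.script = second.script)
  invisible(TRUE)
}

# Placeholder names; second.script is omitted, so the same script name is
# looked for in the second provenance directory.
diff_scripts_if_available("first.test.dir", "second.test.dir",
                          first.script = "helpers.R")
```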
## Not run:
prov.explain("first.test.dir", "second.test.dir")
## End(Not run)