Provenance in Publication of Legislation
Stephen Cresswell [firstname.lastname@example.org]
Linked Open Data Source
Publication of all UK legislation is a core part of the remit of Her Majesty’s Stationery Office (HMSO), part of The National Archives (TNA). The Stationery Office (TSO) holds the contract to capture, transform and disseminate legislation. The legislation.gov.uk website was created by TSO in partnership with TNA. The website contains nearly 60,000 items of legislation, both enacted and revised, available in various formats including XML, HTML and PDF, with metadata available in RDF/XML. TSO is currently updating its publishing workflows for legislation to incorporate provenance tracking.
Use Case Scenario
Four use cases are envisaged:
- (1) Drafters of legislation (e.g. government departments): Drafters need to track the progress of specific items, e.g. how is my job progressing through the publication workflow?
- (2) Management: TNA need to be able to aggregate information about timings throughout the workflow, in order to report on the efficiency of the workflow, e.g. where are the bottlenecks in the workflow?
- (3) System maintainers: The provenance information should assist in tracing the source and consequences of errors, e.g. which documents were derived from this XSLT?
- (4) Users of the published legislation: Users may need to know what processes an item of legislation has been through since it was drafted, e.g. where did this document come from?
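The questions in use cases (3) and (4) are both reachability queries over derivation links, followed in opposite directions. The following is a minimal sketch only, assuming provenance is available as simple (derived item, source item) pairs; the item names, the `DERIVED_FROM` list and the function names are hypothetical and are not taken from the TSO workflow.

```python
from collections import defaultdict, deque

# Hypothetical derivation links: (derived_item, source_item).
# These identifiers are illustrative, not real legislation.gov.uk resources.
DERIVED_FROM = [
    ("draft.xml", "submission.docx"),
    ("enacted.xml", "draft.xml"),
    ("enacted.xml", "to-enacted.xslt"),
    ("enacted.html", "enacted.xml"),
    ("enacted.html", "to-html.xslt"),
    ("enacted.pdf", "enacted.xml"),
]


def _reachable(start, neighbours):
    """Breadth-first search returning every node reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def derived_items(source, edges):
    """Use case (3): everything derived, directly or indirectly, from `source`."""
    children = defaultdict(set)
    for derived, src in edges:
        children[src].add(derived)
    return _reachable(source, children)


def origins(item, edges):
    """Use case (4): everything that `item` was derived from, at any distance."""
    parents = defaultdict(set)
    for derived, src in edges:
        parents[derived].add(src)
    return _reachable(item, parents)


if __name__ == "__main__":
    print(sorted(derived_items("to-enacted.xslt", DERIVED_FROM)))
    print(sorted(origins("enacted.pdf", DERIVED_FROM)))
```

In a deployment where the provenance is held as RDF, the same traversal would more likely be expressed as a graph query; the in-memory version here is only meant to show the shape of the question.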
In order to demonstrate that the published legislation is correct and complete, TNA require that it should be possible to re-execute the publication workflow from the provenance graph.
Problems and Limitations
Requirements for Provenance
- The requirement to be able to re-execute the provenance graph demands a high level of detail in the recorded provenance.
- We cannot re-run parts of the workflow that involve human interaction (e.g. manual editing of a document). Instead, we aim to model this situation by capturing the change as an artifact in the form of a document “diff”. The interactive change is viewed as the creation of a diff, which is then applied as a patch to the document.
- It must be possible to transform the provenance graph back into executable form.
- We need to track provenance of items even as they are passed between systems and between organizations.
- Use cases (1) and (2) require that the timing of processes should be recorded. It is also necessary to determine who has responsibility for each process. This suggests grouping the processes hierarchically into stages of the workflow, and allowing queries on the durations of stages.
- Use case (3) requires transitive reasoning across the provenance graph.
- Use case (4) requires a view of the provenance graph to make it comprehensible to a non-specialist. Authorities approving the item at each stage should be made clear.
- Generally, different use cases require different views of the provenance graph.
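The idea of treating a manual edit as the creation of a diff that is later applied as a patch can be sketched as follows. This is an illustration under stated assumptions, not the TSO implementation: documents are modelled as lists of lines, the patch format is an ad hoc list of operations built with Python's standard `difflib.SequenceMatcher`, and the names `make_patch` and `apply_patch` are hypothetical.

```python
import difflib


def make_patch(before, after):
    """Capture an edit as a list of (tag, i1, i2, new_lines) operations.

    `before` and `after` are lists of lines. Only the changed regions are
    kept, so the patch can be stored as an artifact in the provenance record.
    """
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    return [
        (tag, i1, i2, after[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(before, patch):
    """Replay a stored patch against the original lines."""
    result = []
    pos = 0
    for _tag, i1, i2, new_lines in patch:
        result.extend(before[pos:i1])  # unchanged lines before this operation
        result.extend(new_lines)       # inserted or replacement lines (empty for a delete)
        pos = i2                       # skip the lines that were replaced or deleted
    result.extend(before[pos:])        # unchanged tail of the document
    return result


if __name__ == "__main__":
    # Invented sample text standing in for a document before and after
    # a manual editing step.
    before = ["Section 1", "The Minster may make regulations.", "Section 2"]
    after = ["Section 1", "The Minister may make regulations.", "Section 2", "Section 3"]

    patch = make_patch(before, after)          # recorded when the edit happens
    replayed = apply_patch(before, patch)      # executed when the workflow is re-run
    print(replayed == after)
```

Recording the patch rather than only the edited result keeps the manual step re-executable: a replay applies the stored patch to the regenerated input and compares the output with the published document.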