
Principles of Provenance



General Information


  • Start Date: 1 April 2008
  • End Date: 15 May 2009

This Theme is being led by:

Applying to be a Visitor to this Theme

Please fill in the application, including the name of the theme (Provenance) in the proposal.

Theme Topic

Original Proposal

Recent research in a variety of settings (databases and data warehouses, geographic information systems, scientific workflows, grid computing, and the Semantic Web) has addressed the problem of keeping track of metadata about creation and modification history, influences, ownership, and other provenance or lineage information. Such metadata is essential for making informed judgments about data quality, integrity, and authenticity. In addition, ideas about provenance are now being used in several areas of computer science such as probabilistic databases, operating systems, file synchronization, and annotation propagation. Other topics, such as version control and archiving, may also benefit from a better understanding of provenance. We believe the time is ripe to develop the foundations of the topic and address questions such as:

  • What does it mean for information to be "provenance"? What is and what isn't provenance?
  • What kinds of problems does provenance address, and how does one characterize correct solutions?
  • How does one compare different models of provenance?
  • Why is provenance so hard to get right, even though it seems rather obvious?
  • Where should research efforts be focused to make the best progress?

Schedule of Events


  • Visiting speaker, May 6, 2008. MIT data quality research activities including Recent Work on Propagation of "Believability" using provenance information. Stuart Madnick, Massachusetts Institute of Technology.
  • Visiting speaker, Friday, May 30, 2008, 11am, JCMB 2511. Self-Adjusting Computation. Umut A. Acar, Toyota Technological Institute, Chicago.
  • Visiting speaker, Tuesday, May 12, 2009. A Theory of Typed Coercions and its Applications. Michael Hicks, University of Maryland.


Provenance in Databases, May 19-23 2008

Provenance in Scientific Workflows, October 13-17, 2008 (Salt Lake City)

Provenance in Software Systems, March 30-April 3, 2009

Provenance in Secure and Advanced Computer Systems, May 13-15, 2009


Workshop on Principles of Provenance (PrOPr) Edinburgh, Scotland, November 19-20, 2007.

First Workshop on Theory and Practice of Provenance, San Francisco, CA, February 23, 2009. The online proceedings are available here. Here are some of the presentations from TaPP.

Workshop on Use Cases for Provenance Edinburgh, Scotland, April 20, 2009

Second Workshop on Theory and Practice of Provenance, San Jose, CA, February 22, 2010.

Other recent provenance workshops and events

Early workshops

Data Provenance/Derivation Workshop, Chicago, 2002

Data Provenance and Annotation, eScience Institute, Edinburgh, December 1, 2003

Workshop on Provenance Aware Storage Systems, October 2005

International Provenance and Annotation Workshop

  • IPAW 2008, June 17-18, 2008, Salt Lake City, UT.

Provenance Challenge

Recent workshops

Workshop on Reputation and Provenance, OASIS, IBM. Bethesda, MD, Mar 10-11, 2009. (part of the Open Reputation Management Systems effort)

BELIEF-II/CASPAR Brainstorming Workshop on Provenance, April 6-7, 2009

Workshop on Data and process Provenance (WDPP 2009) - April 20, 2009

Semantic Web Provenance Management (SWPM 2009) - proposed workshop to be collocated with ISWC 2009, October 25-29.

Provenance in Practice Workshop 2009 (PPW '09), collocated with NBiS 2009, Indianapolis, USA, 19-21 August 2009

Provenance Bibliography

This is a (non-exhaustive) list of recent papers on provenance by Theme participants, some of which have been facilitated by the Theme.

  • Manish Kumar Anand, Shawn Bowers, Timothy M. McPhillips, Bertram Ludäscher: Efficient provenance storage over nested data collections. EDBT 2009
  • D. Archer, L. Delcambre, D. Maier, A Framework for Fine-grained Data Integration and Curation, with Provenance, in a Dataspace, TaPP 2009
  • Shawn Bowers, Timothy M. McPhillips, Sean Riddle, Manish Kumar Anand, Bertram Ludäscher: Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life. IPAW 2008: 70-77
  • Uri Braun, Avraham Shinnar, Margo I. Seltzer: Securing Provenance. HotSec 2008
  • Peter Buneman, James Cheney, Wang Chiew Tan, Stijn Vansummeren: Curated databases. Symposium on Principles of Database Systems 2008:1-12
  • Peter Buneman, James Cheney, Stijn Vansummeren: On the Expressiveness of Implicit Provenance in Query and Update Languages. ACM Transactions on Database Systems Volume 33, Issue 4 (November 2008)
  • Peter Buneman, Wang Chiew Tan: Provenance in databases. ACM International Conference on Management of Data (SIGMOD) 2007:1171-1173
  • Adriane Chapman, H. V. Jagadish: Why Not? SIGMOD 2009
  • James Cheney, Peter Buneman, Bertram Ludäscher: Report on the Principles of Provenance Workshop. ACM Special Interest Group on Management of Data (SIGMOD) Record 37(1):62-65 (2008)
  • James Cheney, Laura Chiticariu, and Wang-Chiew Tan: Why, how, and where-provenance. Invited to Foundations and Trends in Databases (accepted).
  • S. Chong, Towards Semantics for Provenance Security, TaPP 2009
  • Andrew Cirillo, Radha Jagadeesan, Corin Pitcher, James Riely: Tapido: Trust and Authorization Via Provenance and Integrity in Distributed Objects (Extended Abstract). ESOP 2008:208-223
  • Susan B. Davidson, Juliana Freire: Provenance and scientific workflows: challenges and opportunities. ACM International Conference on Management of Data (SIGMOD) 2008:1345-1350
  • V. Deolalikar, H. Laffitte, Provenance as data mining, TaPP 2009
  • M. Factor, E. Henis, D. Naor, S. Rabinovici-Cohen, P. Reshef, S. Ronen, G. Michetti, M. Guercio, Authenticity and Provenance in Long Term Digital Preservation: Modeling and Implementation in Preservation Aware Storage, TaPP 2009
  • J. Nathan Foster, Todd J. Green, Val Tannen: Annotated XML: queries and provenance. Symposium on Principles of Database Systems 2008: 271-280
  • Juliana Freire, David Koop, Luc Moreau: Provenance and Annotation of Data and Processes, Second International Provenance and Annotation Workshop, IPAW 2008, Salt Lake City, UT, USA, June 17-18, 2008. Revised Selected Papers. IPAW 2008
  • Floris Geerts and Antonella Poggi, On Database Query Languages for K-relations, Logic in Databases workshop (LID), 2008.
  • A. Gehani, M. Kim, J. Zhang, Steps Toward Managing Lineage Metadata in Grid Clusters, TaPP 2009
  • T. Gibson, K. Schuchardt, E. Stephan, Application of Named Graphs Towards Custom Provenance Views, TaPP 2009
  • Todd J. Green. Containment of Conjunctive Queries on Annotated Relations. 2009 International Conference on Database Theory
  • Jan Hidders, Natalia Kwasnikowska, Jacek Sroka, Jerzy Tyszkiewicz, Jan Van den Bussche: DFL: A dataflow language based on Petri nets and nested relational calculus. Inf. Syst. (IS) 33(3):261-284 (2008)
  • Natalia Kwasnikowska, Jan Van den Bussche: Mapping the NRC Dataflow Model to the Open Provenance Model. IPAW 2008: 3-16
  • D. Margo, M. Seltzer, The Case for Browser Provenance, TaPP 2009
  • Paolo Missier, Suzanne M. Embury, Richard Stapenhurst: Exploiting Provenance to Make Sense of Automated Decisions in Scientific Workflows. IPAW 2008:174-185
  • Luc Moreau et al. Special Issue: The First Provenance Challenge. Concurrency and Computation: Practice and Experience 20(5):409-418 (2008)
  • Luc Moreau, Juliana Freire, Joe Futrelle, Robert E. McGrath, Jim Myers, Patrick Paulson: The Open Provenance Model: An Overview. IPAW 2008:323-326
  • Luc Moreau, Paul T. Groth, Simon Miles, Javier Vázquez-Salceda, John Ibbotson, Sheng Jiang, Steve Munroe, Omer F. Rana, Andreas Schreiber, Victor Tan, László Zsolt Varga: The provenance of electronic data. Commun. ACM (CACM) 51(4):52-58 (2008)
  • K. Muniswamy-Reddy, P. Macko, M. Seltzer, Making a Cloud Provenance-Aware, TaPP 2009
  • P. Pediaditis, G. Flouris, I. Fundulaki, V. Christophides, On Explicit Provenance Management in RDF/S Graphs, TaPP 2009
  • C. Reilly, J. Naughton, Transparently Gathering Provenance with Provenance Aware Condor, TaPP 2009
  • A. Rosenthal, L. Seligman, A. Chapman, B. Blaustein, Scalable Access Controls for Lineage, TaPP 2009
  • Yogesh L. Simmhan, Beth Plale, Dennis Gannon: Karma2: Provenance Management for Data-Driven Workflows. Int. J. Web Service Res. (JWSR) 5(2):1-22 (2008)
  • Yogesh L. Simmhan, Beth Plale, Dennis Gannon: Query capabilities of the Karma provenance framework. Concurrency and Computation: Practice and Experience (CONCURRENCY) 20(5):441-451 (2008)
  • R. Spillane, R. Sears, C. Yalamanchili, S. Gaikwad, M. Chinni, E. Zadok, Story Book: An Efficient Extensible Provenance Framework, TaPP 2009
  • I. Souilah, A. Francalanza, V. Sassone, Provenance in Distributed Systems, TaPP 2009
  • Nikhil Swamy, Brian J. Corcoran, Michael Hicks: Fable: A Language for Enforcing User-defined Security Policies. IEEE Symposium on Security and Privacy 2008:369-383
  • Nikhil Swamy, Michael Hicks: Verified enforcement of stateful information release policies. PLAS 2008: 21-32
  • Simone Stumpf, Erin Sullivan, Erin Fitzhenry, Ian Oberst, Weng-Keen Wong, Margaret M. Burnett: Integrating rich user feedback into intelligent user interfaces. IUI 2008:50-59
  • Val Tannen: Provenance for Database Transformations. IPAW 2008:1
  • Curt Tilmes, Albert J. Fleig: Provenance Tracking in an Earth Science Data Processing System. IPAW 2008:221-228