SWPM discussion

From Knoesis wiki

Notes from the SWPM09 Workshop Discussions

The workshop began with Carole's keynote, followed by a lively Q&A session, and ended with a discussion on the question:

What is your single most important item in the research agenda for semantics and provenance?

Below are three lists of short answers, comments, and observations that we generated in response to the question. The first was collected from the audience (with no specific attribution!), the second is Satya's own list, and the third is Paolo's. Please email Paolo if you spot excessive inaccuracies!


Answers from the audience

  • self-disclosure/self-reporting of artifacts, or at least assistive disclosure
    • e.g., attribution of derived artifacts, such as workflows in a repository
    • licensing issues: derived licensing of derived artifacts
  • provenance-enabled truth maintenance system, to make inferences possible on the current state of information Web-wide, e.g., on quality and other properties of data
  • are we providing a precise enough notion of provenance, one that is not so broad that all metadata falls into this class?
  • “Provenance by collective witnessing”: inferring (possibly imprecise) provenance from collective observations
  • visualization: mapping provenance to various visualization metaphors
    • (note: VisTrails does well within its domain)
    • support varying levels of precision/granularity in the data
    • avoid the temptation of generic viz tools
    • can we come up with a set of operators that define a common model on which we can build multiple (user) views?
    • think end-user: reuse tools that scientists are used to:
      • avoid displaying graphs!
      • spreadsheet metaphor very well understood
      • perhaps more intriguing: align provenance traces by using a sequence alignment tool?
  • caveat: preserve openness. Let us not create a deep web of provenance. Let it live on the web as a first-class citizen
  • how to make provenance explicit:
    • in a format that is useful for reasoning about
    • at a reasonable cost
  • what PML can bring to OPM: tooling for PML can be used to retrieve provenance from Web sources
  • synthesize information from multiple sources
    • reification? clustering related information objects?
    • understanding the sources: what are today's best-practice solutions for encoding data in RDF?
  • provenance and linked data: don’t damage the linked data!
  • leveraging OPM and PML to express the provenance of classes and properties: “all values from a class C come from source S”
  • role of OWL2:
    • OWL2 allows the same name to be used both as an instance and as a class (punning)
    • annotations on annotations
  • what does it take to publish provenance on the Web according to linked data principles?
    • how to publish something like an OPM graph in RDF format? naming (which URIs), granularity (references to elements within a graph)
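
The last point, on publishing an OPM graph in RDF, can be illustrated with a minimal sketch. It assumes one URI per graph element and serializes each edge as one N-Triples line. The edge names (used, wasGeneratedBy, wasDerivedFrom) are OPM's; both namespaces and the artifact names are hypothetical and chosen only for illustration.

```python
# Hypothetical namespaces; neither is an official OPM or project vocabulary.
EX = "http://example.org/run42/"
OPM = "http://example.org/opm#"

# A tiny OPM-style graph as (subject, edge, object), using OPM edge names.
EDGES = [
    ("plot.png", "wasGeneratedBy", "render"),
    ("render", "used", "table.csv"),
    ("plot.png", "wasDerivedFrom", "table.csv"),
]


def to_ntriples(edges, base=EX, vocab=OPM):
    """Serialize edges as N-Triples lines, giving every graph element its own URI."""
    lines = []
    for subj, pred, obj in edges:
        lines.append(f"<{base}{subj}> <{vocab}{pred}> <{base}{obj}> .")
    return lines


if __name__ == "__main__":
    print("\n".join(to_ntriples(EDGES)))
```

Giving each artifact and process its own URI is one possible answer to the naming question. The granularity question (how to reference elements within a graph) is left open here, as it was in the discussion.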

Satya's own list

  • Provenance Modeling: Is there a need for a single model for provenance representation for relational databases, workflows, and domain semantics?
    • Provenance representation in the Semantic Web without using RDF reification (or named graphs)
    • Representing both provenance and data as first-class entities in the provenance model
  • Query and Analysis: How to query and analyze provenance
    • Is there a need for a uniform query mechanism for provenance?
    • Should the query mechanism be coupled with the provenance representation model?
    • What are the applications of provenance querying beyond manual verification and rerun of experiments?
    • Should provenance analysis be treated as separate from provenance querying?
  • Role of provenance in data interoperability and integration
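
To make the querying questions above concrete, here is a minimal sketch of one common provenance query: finding everything an artifact transitively derives from. It assumes provenance is stored as plain wasDerivedFrom pairs. The artifact names and the lineage function are hypothetical, and a real system would more likely express this in its own query language.

```python
from collections import defaultdict

# Hypothetical wasDerivedFrom edges: (derived artifact, source artifact).
DERIVED_FROM = [
    ("report", "plot"),
    ("plot", "table"),
    ("table", "raw_a"),
    ("table", "raw_b"),
]


def lineage(artifact, edges):
    """Return every artifact that `artifact` transitively derives from."""
    parents = defaultdict(set)
    for derived, source in edges:
        parents[derived].add(source)

    seen, stack = set(), [artifact]
    while stack:
        # Visit each direct source once, then continue from it.
        for source in parents[stack.pop()]:
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


if __name__ == "__main__":
    print(sorted(lineage("report", DERIVED_FROM)))
```

Note that the traversal depends on the edge-list representation. That dependence is one small instance of the coupling between query mechanism and representation model raised in the list.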

Paolo's own list

  • Use cases and applications:
    • What use cases are driving the injection of semantics in provenance?
    • …a killer app?
  • When, in the capture/storage/query process, does provenance become “semantic”?
    • Semantic observable events vs post-hoc annotations
    • where do the annotations to a provenance graph come from?
  • How does one choose an appropriate semantic formalism to annotate provenance?
    • OWL vs SKOS, for instance
  • How are the provenance query language and the efficiency of query answering affected?