Closed Versus Open World Questions
List of questions for discussion during the workshop "The Closed World of Databases Meets the Open World of the Semantic Web"
Semantic Web
- What is it?
- There is no single definition.
- It is an ideology.
- It is the use of any technology based on relational triples annotated in RDF, OWL, etc. (?) -> No, these technologies can also be used to implement applications that are not Semantic Web applications.
- It is not definable (?).
Data Webs
- For David Shotton:
- What about quality control in the absence of central control or submission standards?
- "There's a lot of junk out there". It depends on the parties and the data may not be of the best quality.
- What about copyright issues? Payments?
- The idea is about facilitating finding images. Nevertheless, copyrighted images will be available only if the user has the required authorization.
- What is the relationship between Data webs and Google/Yahoo?
- Semantics versus keywords. Nonetheless, Google does take some semantics into account.
- How do we link to an actual resource after the initial query of the web repository: by bringing the user to the DB's website, or by pulling data directly via the repository? What's the right approach?
- There will be enough centralised metadata to ask simple queries. Given a query, the system will retrieve a list of thumbnails and links to the original sites. Then you could go to a specific site and ask more complex queries and/or retrieve the original image there.
- Do we annotate actual rows in the DB with metadata, or do we annotate the resource (the website or DB as a whole)? For authoring useful metadata to create the "data Web", should people use wikis, RSS, RDF, HTML microformats, Flickr/del.icio.us-style tagging, XML, OWL or what? GoogleBase? Blogs?
- In fact the idea is to start with existing data/metadata. The question is what level of granularity regarding the metadata is going to be sufficient. A big mess is anticipated.
Data Integration
- For David Shotton:
- It seems that what you want to do is to put together a directory of sources rather than really integrate the data at those sources. What kind of integration would you like to achieve?
- The idea is to integrate the search capabilities of different sources, not the information itself.
- Does RDF really give you integration for free?
- With RDF you get syntactic and semantic integration.
- It is true that you get a single RDF graph, but this is not (very) meaningful (at all) without unifying semantics.
- For Matt Pocock:
- What kind of mappings could I have (LaV, GaV, GLaV/sound, complete, exact)?
- LaV/Sound.
- Is your query answering algorithm sound/complete/exact?
- Complete.
- What is the expressivity of the query language for the user and for the mappings?
- OWL concept definitions.
- What do I have to do to add/remove a new/obsolete source?
- Adding a source requires providing its mappings.
- Who is going to come up with the global view?
- The community who is interested in building such a system.
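Returning to the question of whether RDF gives integration for free, the following is a minimal sketch of the point that a single merged graph is not very meaningful without unifying semantics. The triples, the predicate names `depicts` and `shows`, and the `equivalent` mapping are all hypothetical and chosen only for illustration; triples are modelled as plain Python tuples rather than with any RDF library.

```python
# Two sources describe images with different (hypothetical) vocabularies.
source_a = {("img1", "depicts", "mitochondrion")}
source_b = {("img2", "shows", "mitochondrion")}

# Syntactic integration: the union of two triple sets is again a triple set.
merged = source_a | source_b


def subjects(graph, predicate, obj):
    """Return every subject s such that (s, predicate, obj) is in the graph."""
    return {s for s, p, o in graph if p == predicate and o == obj}


# Without shared semantics, a query in source A's vocabulary misses source B.
only_a = subjects(merged, "depicts", "mitochondrion")

# Assumed mapping stating that both predicates mean the same thing.
equivalent = {"shows": "depicts"}
unified = {(s, equivalent.get(p, p), o) for s, p, o in merged}

both = subjects(unified, "depicts", "mitochondrion")
```

Here `only_a` contains just `img1`, while `both` contains `img1` and `img2`. The merge itself costs nothing, but someone still has to supply the mapping.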
Ontologies
- From Chris Date
- ONTOLOGIES: Why are we saddled with this wildly inappropriate term? How is an ontology different from metadata? Perhaps I should ask: what exactly is an ontology?
- It is a knowledge base. A knowledge base (KB) consists of a TBox and an ABox. The TBox is a set of restrictions on concepts, while the ABox is a set of assertions about individuals and the relationships between individuals.
- The main difference between knowledge bases and databases is the type of constraints you could impose on the concepts being considered. Also typically knowledge bases work under OWA while databases work under CWA.
- It is important to differentiate ontologies from reasoners and databases from DBMSs.
- We've also heard about RDF triples. Does an ontology consist of RDF triples or are they something different? RDF triples sound a little like 6NF relvars (or tuples of such relvars), but I suspect "triples" is inadequate: we need degree 0, 1, 2, (especially) 3, and more. Note: We've heard about triple stores before too. LEAP? 1957?
- An ontology can be regarded as a set of RDF triples.
- RDF is more general than 6NF.
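The description of a knowledge base as a TBox plus an ABox, and the remark that an ontology can be regarded as a set of triples, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `KnowledgeBase` class, the predicate names `subClassOf` and `type`, and the example facts are hypothetical, and the only "reasoning" performed is following sub-concept links.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]


@dataclass
class KnowledgeBase:
    tbox: set[Triple] = field(default_factory=set)  # restrictions on concepts
    abox: set[Triple] = field(default_factory=set)  # assertions about individuals

    def as_triples(self) -> set[Triple]:
        """The whole knowledge base viewed as one flat set of triples."""
        return self.tbox | self.abox

    def instances_of(self, concept: str) -> set[str]:
        """Individuals asserted to belong to `concept` or to a sub-concept."""
        concepts = {concept}
        changed = True
        while changed:  # collect sub-concepts until nothing new is found
            changed = False
            for s, p, o in self.tbox:
                if p == "subClassOf" and o in concepts and s not in concepts:
                    concepts.add(s)
                    changed = True
        return {s for s, p, o in self.abox if p == "type" and o in concepts}


kb = KnowledgeBase()
kb.tbox.add(("Enzyme", "subClassOf", "Protein"))  # TBox: every Enzyme is a Protein
kb.abox.add(("lysozyme", "type", "Enzyme"))       # ABox: an individual
kb.abox.add(("lysozyme", "foundIn", "egg_white"))

proteins = kb.instances_of("Protein")
```

In this sketch `proteins` contains `lysozyme` even though no triple states that directly; the answer follows from combining the TBox restriction with the ABox assertion.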
Open vs Closed World
- Is the difference between RDBMS worlds and interpretations and open world reasoning simply the level and detail of explicit definition of axioms, constraints, etc.?
- Different people are using terms in different ways.
- CWA means "everything a database tells you is true and everything that it does not is false" while OWA means "everything a database tells you is true and everything that it does not is unknown".
- OWA leads you into 3VL.
- CWA systems can answer "don't know".
- In CWA every proposition is either false or true, in OWA this is not necessarily the case.
- In CWA there is only ONE model (consistent interpretation) and a proposition is either true or false. In OWA there are many models. The value "unknown" comes from the fact that a certain proposition is only true in some of them (false in some others).
- Is there a problem with systems using both the closed world assumption and the open world assumption?
- See previous discussion.
- If the OWA is at odds with the relational model does that mean we can't combine relational databases with ontologies?
- No. There are efforts trying to bring together these two worlds (cf. Boris Motik, University of Manchester).
- Reporting standards are seen as the starting point in many fields. These are often more nearly CW than OW in their style. They need only operate at submission, though users who see CW input may expect CW query behaviour. How far should their influence extend? This may relate to David Shotton's QC comments. How should OW systems address these issues? (Nigel Hardy)
- WTH?
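The "one model versus many models" reading of CWA and OWA given above can be sketched as follows. This is a toy propositional illustration, not how any particular DBMS or reasoner works; the function names and the example propositions are hypothetical, and an inconsistent set of facts is not handled.

```python
from itertools import product


def models(propositions, known_true, known_false=frozenset()):
    """Enumerate every truth assignment consistent with what is known."""
    props = sorted(propositions)
    for values in product([True, False], repeat=len(props)):
        world = dict(zip(props, values))
        if all(world[p] for p in known_true) and not any(
            world[p] for p in known_false
        ):
            yield world


def ask_owa(prop, propositions, known_true, known_false=frozenset()):
    """OWA: true or false only if all models agree, otherwise unknown."""
    answers = {w[prop] for w in models(propositions, known_true, known_false)}
    if answers == {True}:
        return "true"
    if answers == {False}:
        return "false"
    return "unknown"


def ask_cwa(prop, known_true):
    """CWA: a single model in which anything not stated is false."""
    return "true" if prop in known_true else "false"


propositions = {"alice_works_at_acme", "bob_works_at_acme"}
known = {"alice_works_at_acme"}

cwa_answer = ask_cwa("bob_works_at_acme", known)
owa_answer = ask_owa("bob_works_at_acme", propositions, known)
```

With only Alice's employment stated, `cwa_answer` is `"false"` and `owa_answer` is `"unknown"`, because under OWA Bob works at Acme in one consistent model and not in the other.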
Terminology
- Is the terminology in relational databases and ontologies the same, e.g. predicates, axioms, etc.?
- Both worlds talk about the same things. The terminologies are not precisely inconsistent but can lead to misunderstandings (axioms as restrictions or as tuples).
- From Chris Date
- OPEN VS CLOSED: I'm still not convinced we're all using these terms in the same way. Certainly (a) a requirement for "don't know" answers does NOT imply a requirement for the OWA; (b) several speakers seemed to use the term "open world" in imprecise ways; (c) the OWA is related to 3VL, including giving wrong answers. Related questions: I thought "negation as failure" and the CWA were essentially the same thing, but I'm pretty sure at least one speaker used it to mean something else - Please clarify...
- Negation as failure is used to implement CWA.
2VL or 3VL
- Do we really need more than 2VL?
- Does the OWA imply 3VL?
- Relational Databases: Representation of missingness might be solved by one or other of the methods mentioned by Chris Date. But this does not solve the manipulation problem. You need "unk" as a truth valued variable even if it isn't itself a truth value. Does this imply OWA and 3VL?
Missing Information
- How do we really deal with missing or incomplete data?
- It is an open problem.
- The use of NULL has to be taken into consideration. It can be proven that inadequate use of NULL can lead to wrong answers.
- It is useful to remove NULLs by using 6NF.
- Even when we have removed NULLs from the database, unfortunately, in some implementations, NULLs can arise in the middle of query processing (or in other situations). In other implementations, there are techniques that try to prevent NULLs from arising.
- Real problems arise when NULLs provoke wrong answers that we cannot recognize as being wrong.
- If you use NULLs, maybe the user should be warned about what kinds of operations could lead to problems.
- What difference does CWA/OWA make in SPARQL (which is often translated to SQL for execution against a triple store)? If not (a valid interpretation), why not?
- <Rob will edit this>
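The remark above that NULLs can provoke wrong answers that are hard to recognise as wrong can be illustrated with a small sketch using Python's standard `sqlite3` module. The `supplier` and `part` tables, their contents, and the query are hypothetical and chosen only for illustration; other SQL implementations may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (sno TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE part (pno TEXT PRIMARY KEY, city TEXT);
    INSERT INTO supplier VALUES ('S1', 'London');
    INSERT INTO part VALUES ('P1', NULL);  -- the part's city is missing
""")

# Pairs where the cities differ, or the part is not in Paris.
QUERY = """
    SELECT s.sno, p.pno
    FROM supplier AS s, part AS p
    WHERE s.city <> p.city OR p.city <> 'Paris'
"""


def matching_pairs(connection):
    return connection.execute(QUERY).fetchall()


# With the NULL in place, both comparisons evaluate to unknown.
with_null = matching_pairs(conn)

# Now try concrete cities that the missing value could stand for.
with_value = {}
for city in ("London", "Paris", "Oslo"):
    conn.execute("UPDATE part SET city = ? WHERE pno = 'P1'", (city,))
    with_value[city] = matching_pairs(conn)
```

In this sketch `with_null` is empty, yet every entry of `with_value` contains the pair `('S1', 'P1')`. Whatever the part's real city is, the pair satisfies the condition, so the empty result is wrong in the real world and nothing in the output signals that.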
Data Consistency
- How much semantics do we really need to capture in meta-data/ontologies?
- It depends on what you are trying to do. If you are trying to build an ontology that will be reused by many people, it is a good idea to be general.
- Identify the scope!
- Do the bit that is easy to do in order to cover your domain. You can always add things later; nevertheless, it is important to identify overlapping parts.
- Identify the core concepts and the secondary ones.
- Ontologies are typically built to be shared.
- Contradictory data is common in science. How do we represent this and query/reason over it?
- Contradictions depend on the interpretation of the data.
- There was a presentation about this. <please add more details>
- Should data stored according to one semantics be re-used according to another? - How do we know it's accurate?
- If the semantics are compatible, this is possible. Communication is crucial.
- One speaker said "of course, our data is inconsistent...", and this is my experience too. Is consistency checking, e.g. in OWL, (a) practicable, given real-world data volumes and the difficulty of reconciling terminology between KBs, and (b) sensible, given the existence of genuine inconsistencies in real-world data?
- Consistency checking is very important for discovering mistakes, especially when complex constructors are used.
- Reasoners are typically poor at handling individuals. Nevertheless, consistency checking algorithms are getting better and better.
Mappings between DB and Ontologies
- From Chris Date
- OWL CLASSES: We heard about mapping SQL tables to OWL classes. Convince me that this isn't the First Great Blunder rearing its ugly head again! (Type != table, regardless of whether "table" really means value or variable.)
- A class is a unary predicate, a set of individuals.
- A class can be regarded as a variable whose value is a set.
- If the semantics of the database are tied into the predicates (in the designer's mind), how do we represent and share these semantics to ensure that the data is used "correctly"?
- Database design is predicate design.
- The semantics of a database exist only in the users' heads.
- From Chris Date
- SEMANTIC MISMATCH: We hear about defining (possibly automatically?) OWL classes or ontologies (not sure I'm using the term correctly) over existing SQL databases. How are the semantics captured and disambiguated?
- There are tools that can help define the mappings, but they are not completely reliable.
- Ontologies that describe the contents of databases: what is (usually) implicit in database terminology, e.g. disjoint concepts, disjoint instances, functional properties, etc.? Should we write the ontology to describe what the database designer/data collector meant?
- Disjointness at least.
User issues
- What world assumptions do our end users really use?
- Both OWA and CWA.
- Does it matter if users understand the logic of the system?
- No, as long as the system answers correctly.
- What problems can arise if the user interprets the results presented with the wrong world assumption?
- The user gets wrong answers.
URIs
- For Henry Thompson:
- URIs are virtually never static, so is it sensible to use URIs as identifiers for resources?
- What's the best way of generating URIs when translating a relational database to RDF?