When developing ontologies, it is important to use a version control system. But ordinary version control systems such as Subversion, Git, Bazaar and Mercurial do not take the graph nature of ontologies into account. They track every change to the file, even a swap of two lines, which does not actually change the ontology.

Developing a separate version control system for ontologies is an interesting task, but it would be too much work (a new GUI, new protocols, integration with bug trackers). What is the easiest way to extend an existing version control system so that it takes ontology semantics into account?

We know about OWLDiff, which could be used to extend an existing VCS. Detecting changes in OWL is easier than in RDF(S) because of the blank node problem (see the article).

Using N3 looks like a good temporary solution, but what about automatic commit messages? For example:

2 new classes: Class1, Class2 (subclass of Class1)
1 class removed: Class3
1 class changed:
    Class4: new superclass: "Class1 and hasProperty value Class2"
2 new individuals: Individual1 (a Class4), Individual2 (a Class2, a Class1)
1 new fact: Individual1 hasProperty Individual2

Such commit messages could even be structured (using OMV, for example) to allow advanced search over them. The question is which existing VCS can be extended in this way.
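A summary like the one above can be derived from two versions of an ontology by comparing their sets of triples. The following is only a minimal sketch under simplifying assumptions: triples are plain `(subject, predicate, object)` string tuples, there are no blank nodes, and only named classes are reported. The function names and the `ex:` identifiers are hypothetical.

```python
RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"


def classes(triples):
    """Return the subjects that are declared as owl:Class."""
    return {s for (s, p, o) in triples if p == RDF_TYPE and o == OWL_CLASS}


def about(triples, subject):
    """Return all triples that have the given subject."""
    return {t for t in triples if t[0] == subject}


def summarize(old, new):
    """Build a short, human-readable summary of class-level changes."""
    added = sorted(classes(new) - classes(old))
    removed = sorted(classes(old) - classes(new))
    # A class counts as changed if it exists in both versions
    # but the triples describing it differ.
    changed = sorted(
        c for c in classes(old) & classes(new)
        if about(old, c) != about(new, c)
    )
    lines = []
    if added:
        lines.append(f"{len(added)} class(es) added: {', '.join(added)}")
    if removed:
        lines.append(f"{len(removed)} class(es) removed: {', '.join(removed)}")
    if changed:
        lines.append(f"{len(changed)} class(es) changed: {', '.join(changed)}")
    return "\n".join(lines)


old = {
    ("ex:Class3", RDF_TYPE, OWL_CLASS),
    ("ex:Class4", RDF_TYPE, OWL_CLASS),
}
new = {
    ("ex:Class1", RDF_TYPE, OWL_CLASS),
    ("ex:Class2", RDF_TYPE, OWL_CLASS),
    ("ex:Class2", "rdfs:subClassOf", "ex:Class1"),
    ("ex:Class4", RDF_TYPE, OWL_CLASS),
    ("ex:Class4", "rdfs:subClassOf", "ex:Class1"),
}
message = summarize(old, new)
print(message)
```

The resulting text could be passed to the VCS as a commit message. Individuals, properties and anonymous class expressions would need similar but separate handling.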


Here is my attempt to implement the features I described in this question: http://code.google.com/p/ontovcs. It can be used together with existing version control systems such as Git and Mercurial, and it can produce a summary like the one above. It is faster than OWLDiff (probably because it does not use a reasoner). It also contains a simple three-way merge tool for OWL ontologies. It does not provide any search capabilities, though.

asked 05 Mar '11, 06:37

utapyngo

edited 02 Nov '11, 09:29


We use N3 notation, post-process it to fix the ordering of statements, and then check it into Subversion.
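As an illustration of that kind of post-processing, here is a minimal sketch. It assumes the file is serialized as N-Triples (one statement per line) rather than full N3, because plain line sorting is only safe in that form, and it assumes there are no blank nodes. The function name is hypothetical, and comment lines are dropped.

```python
def normalize_ntriples(text):
    """Return the document with duplicate statements dropped and lines sorted."""
    statements = {
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    }
    return "".join(statement + "\n" for statement in sorted(statements))


SUBCLASS = "<http://www.w3.org/2000/01/rdf-schema#subClassOf>"
first = (
    f"<http://example.org/B> {SUBCLASS} <http://example.org/C> .\n"
    f"<http://example.org/A> {SUBCLASS} <http://example.org/B> .\n"
)
# The same two statements in the opposite order.
second = "".join(reversed(first.splitlines(keepends=True)))

# After normalization a line-based VCS sees no difference between them.
print(normalize_ntriples(first) == normalize_ntriples(second))
```

Running such a filter before every commit means that the VCS diff only shows statements that were actually added or removed.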

answered 05 Mar '11, 09:50

Carl

Can't answer your question directly, but this paper from ISWC 2009 might be of interest to you:

On detecting high-level changes in RDF/S KBs by Papavassiliou, V. and Flouris, G. and Fundulaki, I. and Kotzinos, D. and Christophides, V.

Edit

If you have Blank Nodes then you may want to look at the approach taken by RDFSync: efficient remote synchronization of RDF models by Tummarello, G. and Morbidoni, C. and Bachmann-Gmur, R. and Erling, O.

RDF diffs are doable when the graph contains blank nodes; it just makes them more complex to compute. Essentially, the approach is to decompose the graph into MSGs (Minimum Self-contained Graphs); you can then compare the lists of MSGs to discover differences.

Speaking from experience, implementing an RDF diff algorithm based on the approach outlined in their paper is relatively easy, though depending on the implementation you may also need a decent graph isomorphism algorithm, which is a whole other issue.
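The decomposition step can be sketched roughly as follows. This is not the RDFSync implementation, only an illustration of the idea that triples sharing a blank node belong to the same MSG while every ground triple forms an MSG of its own. Triples are assumed to be string tuples in which blank nodes start with `_:`; all names are hypothetical.

```python
from collections import defaultdict


def is_blank(term):
    return term.startswith("_:")


def decompose_msgs(triples):
    """Partition triples into groups connected through shared blank nodes."""
    triples = set(triples)
    # Index: blank node label -> triples that mention it.
    by_bnode = defaultdict(set)
    for t in triples:
        for term in (t[0], t[2]):
            if is_blank(term):
                by_bnode[term].add(t)

    seen = set()
    msgs = []
    for start in sorted(triples):
        if start in seen:
            continue
        # Collect everything reachable from `start` via blank nodes.
        component, stack = set(), [start]
        while stack:
            t = stack.pop()
            if t in component:
                continue
            component.add(t)
            for term in (t[0], t[2]):
                if is_blank(term):
                    stack.extend(by_bnode[term] - component)
        seen |= component
        msgs.append(frozenset(component))
    return msgs


graph = {
    ("ex:A", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "ex:p"),
    ("ex:B", "rdf:type", "owl:Class"),
}
msgs = decompose_msgs(graph)
print(sorted(len(m) for m in msgs))
```

Ground MSGs can be compared directly as sets. MSGs that contain blank nodes still have to be matched up to blank node renaming, which is where the isomorphism check mentioned above comes in; that part is not shown here.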

answered 05 Mar '11, 09:20

Rob Vesse

edited 06 Mar '11, 17:33

Thank you, Rob, interesting article. But it says nothing about blank nodes.

(06 Mar '11, 08:18) utapyngo

If you have blank nodes then it's going to be difficult to do, I'm afraid.

(06 Mar '11, 17:27) Rob Vesse

@eye If you have blank nodes then you probably need a more general RDF diff approach as described in the RDFSync paper I've linked

(06 Mar '11, 17:34) Rob Vesse

To identify changes in the semantics of an ontology, you could try OWL Diff. As far as I know, it does not interact with SVN or other version control systems (yet).

There is also a paper (which won the KR best paper award) about the problem of finding differences between DL-Lite ontologies.

answered 05 Mar '11, 10:04

Robert Hoehndorf

Jeremy Carroll has documented a canonical way of assigning labels to blank nodes http://www.hpl.hp.com/techreports/2003/HPL-2003-142.pdf. This works with most graphs. Combined with canonical whitespace for N-Triples, it provides a canonical string form for most graphs.

Unfortunately, adding a single blank-node-containing triple to the graph has the potential to relabel every blank node. :-(

Really for RDF diffs the only solution is to skolemize each blank node on input (i.e. assign it a URI). I believe this is what Talis does. (Their RDF store has pretty good support for diffs and versioning.)
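As a rough illustration of skolemization on input (not a description of what Talis does), each blank node label is replaced by a freshly minted URI, and the same label always maps to the same URI within one load. The base URI below is a hypothetical placeholder that follows the `.well-known/genid/` convention; triples are assumed to be string tuples in which blank nodes start with `_:`.

```python
import uuid

# Hypothetical base; a real deployment would use a domain it controls.
SKOLEM_BASE = "http://example.org/.well-known/genid/"


def skolemize(triples, base=SKOLEM_BASE):
    """Replace every blank node with a stable, freshly minted URI."""
    mapping = {}

    def convert(term):
        if term.startswith("_:"):
            if term not in mapping:
                mapping[term] = f"<{base}{uuid.uuid4().hex}>"
            return mapping[term]
        return term

    result = {(convert(s), p, convert(o)) for (s, p, o) in triples}
    return result, mapping


graph = {
    ("ex:A", "rdfs:subClassOf", "_:b0"),
    ("_:b0", "rdf:type", "owl:Restriction"),
}
skolemized, mapping = skolemize(graph)
for triple in sorted(skolemized):
    print(triple)
```

Once the minted URIs are stored with the data, later versions can be compared triple by triple, because the former blank nodes no longer change their names between saves.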

answered 06 Mar '11, 11:01

tobyink

TopBraid Composer (which is built on Eclipse and therefore has convenient access to Git, SVN or CVS) has a little-known option in the I/O Preferences that enables a sorted Turtle writer. This algorithm should work well with line-based versioning systems, as it sorts the resources and then their properties and objects alphabetically.
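The general idea of a sorted writer can be sketched as below. This is not TopBraid's algorithm, only an illustration of sorting by subject, then predicate, then object. It assumes triples are tuples of already serialized terms, that prefix declarations are written elsewhere, and that there are no blank nodes; the function name is hypothetical.

```python
from collections import defaultdict


def sorted_turtle(triples):
    """Serialize triples as Turtle-like text in a deterministic order."""
    by_subject = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        by_subject[s][p].add(o)

    blocks = []
    for s in sorted(by_subject):
        props = by_subject[s]
        lines = [
            f"    {p} {', '.join(sorted(props[p]))}"
            for p in sorted(props)
        ]
        blocks.append(s + "\n" + " ;\n".join(lines) + " .")
    return "\n\n".join(blocks) + "\n"


triples = [
    ("ex:A", "rdfs:subClassOf", "ex:C"),
    ("ex:A", "rdf:type", "owl:Class"),
    ("ex:A", "rdfs:subClassOf", "ex:B"),
]
print(sorted_turtle(triples))
```

Because the output depends only on the set of triples and not on their input order, saving the same ontology twice produces identical files, and a line-based diff stays small when a single statement changes.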

answered 15 Oct '11, 04:49

Holger Knublauch

I "discovered" that little option this weekend and it made me jump for joy!

(17 Oct '11, 10:05) Jerven

How do you deal with blank nodes? They are so evil that they can change their names every time you save.

(02 Nov '11, 09:27) utapyngo

In 99% of use cases that I typically see, there is only one reference to a given blank node. For example, OWL restrictions only have a single link to a class via rdfs:subClassOf. This means that bnode names usually never show up, but instead you get the [ ... ] syntax. For the ordering of those, e.g. if you have multiple owl:Restrictions on the same subject/predicate, then we rely on the string rendering - in this case Manchester Syntax, in other cases SPIN queries - which will also be consistent. Needless to say there are corner cases where this approach doesn't work.

(02 Nov '11, 16:29) Holger Knublauch