It's triples all the way down
… in which I backpedal quite a bit, after helpful comments from Laurens Holst on my previous post.
Clarification 1: I don’t think that namespaces are bad. They are good and important. I just think that sometimes you want to ignore them in queries.
Clarification 2: My proposal doesn’t stop anyone from using namespaces and doesn’t change the semantics of any valid SPARQL query. It only makes some currently invalid SPARQL queries valid: those which use the default prefix without declaring it.
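A sketch of the kind of query the proposal would make legal (today this is rejected by a SPARQL processor because the default prefix ":" is never declared):

```sparql
# Valid only under the proposal: ":" is used without a PREFIX declaration,
# and the processor would match :name regardless of namespace.
SELECT ?person ?name
WHERE { ?person :name ?name . }
```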
Now for the backpedaling: I guess that most people around here see SPARQL as “SQL for triple stores”, something you embed into application code. My perspective on SPARQL has become a bit different recently. Most of my recent SPARQLing was to interrogate SPARQL endpoints with unknown contents, or to explore the open Semantic Web using the SemWebClient library.
I use SPARQL interactively. I write quick one-off queries, see the result right away, and make corrections as needed. Bogus results are noticed and fixed instantly. I rarely re-run a query later on. Number of characters typed matters a lot in this scenario.
Maybe this explains why I advocate such a nutty idea, and my bad for not realizing this earlier.
So, time to shelve this idea, until more people look beyond their own data, play with the open Semantic Web, and feel the pain of having to type out the FOAF namespace for the n-th time.
Thanks for all the comments.
Posted at 16:58
Tim presented the tabulator to the W3C team today; see slides: Tabulator: AJAX Generic RDF browser.
The tabulator was sorta all over the floor when I tried to present it in Austin in September, but David Sheets put it back together in the last couple weeks. Yay David!
In particular, the support for viewing the HTTP data that you pick up by tabulating is working better than ever before. The HTTP vocabulary has URIs like http://dig.csail.mit.edu/2005/ajar/ajaw/httph#content-type. That seems like an interesting contribution to the WAI ER work on HTTP Vocabulary in RDF.
Note comments are disabled here in breadcrumbs until we figure out OpenID comment policies and Drupal, etc. The tabulator issue tracker is probably a better place to report problems anyway. We don't have OpenID working there yet either, unfortunately, but we do support email callback based account setup.
Posted at 16:40
Those of you who read my SCAI 2006 slides may have noticed that I announced a new version of Wilbur and OINK, written in Python! We are not quite ready for prime time, but eventually the source code will make its way to our repository at SourceForge.
No, this does not mean I am giving up Common Lisp. Unfortunately, Common Lisp just does not run on our phones...
Posted at 14:24
Posted at 07:30
That was my reaction when I read a recent blog entry by Richard Cyganiak where he argues - essentially - that namespaces are bad (my interpretation). He claims that using namespace prefixes in SPARQL queries is superfluous and just makes things harder. He says:
The query processor should match the QNames regardless of namespace. Thus, :name would match both foaf:name and doap:name.
Ouch! Please tell me I am missing something. We decided to use URIs to name nodes in an RDF graph for a reason: so that one wouldn't think that foaf:name and doap:name are the same, or interchangeable, or in any way related. They are, of course, both sub-properties of rdfs:label, but you have a reasoner to tell you that. Don't you?
URIs are opaque, meaning that even if two URIs end in "name", you should think of those URIs as mere identifiers with no semantics associated to their names. Now, it would be a different thing if you queried for things whose human-readable names (i.e., values of rdfs:label or some sub-property thereof) were the same. Maybe. I wouldn't, but someone else might.
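For completeness, a sketch of how one would deliberately query both properties today, keeping the URIs opaque by declaring both namespaces (the prefix URIs are the commonly published FOAF and DOAP ones):

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX doap: <http://usefulinc.com/ns/doap#>

SELECT ?x ?name
WHERE {
  { ?x foaf:name ?name . }
  UNION
  { ?x doap:name ?name . }
}
```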
Please, please tell me that I am getting old and I missed something crucial...
Posted at 20:37
Posted at 16:11
Managing multiple sets of sign-on information (username and password) is a task that many people hate. I often forget either the username or the password for a less frequently visited web site, and sometimes I forget both. OpenID is an emerging framework that attempts to solve this problem. The framework is built on a simple idea: users on the Web can be uniquely identified by a URI.
As popular web sites such as Zooomr, LiveJournal and Technorati begin to support OpenID, I wonder if we can use OpenID for identifying people and discovering semantic descriptions about them on the Semantic Web.
A user picks an OpenID Identity Provider (e.g., Verisign’s PIP or MyOpenID.com) and registers a unique URI with it. Typically this URI is an HTTP URL, e.g. http://harry.chen.myopenid.com. Once this URI is registered and authenticated, it can be used for signing on to any web site that supports the OpenID protocol. Should the user want to change the password used for authenticating his/her OpenID, the user changes it with the Identity Provider, not with the individual OpenID-enabled web sites.
An obvious application of the OpenID URI is to use it to identify instances of FOAF Person in a FOAF document. In the past, there were discussions about which URI format is best for representing a FOAF Person. Some said it should be the person's home page URL; others said it should follow a standard URI pattern convention (e.g., http://harry.hchen1.com/me). I think an OpenID URI is a better approach. Not only is it a unique URI, it also has a functional purpose: authenticating users on the Web. Why reinvent a new URI for myself if I already have one that uniquely identifies me on a dozen different web sites?
Moreover, I believe that OpenID can be exploited for publishing semantic descriptions about people. OpenID supports a special kind of authentication method called “delegating authentication”. We can exploit this method to fetch RDF descriptions (e.g., FOAF documents) about the person that an OpenID URI identifies.
This is how delegating authentication works. Sometimes users want to use a customized URI for their OpenID identification. For example, I may want to use http://harry.chen.ebiquity.umbc.edu as opposed to http://harry.chen.myopenid.com. However, users may not be able, or may not wish, to run an Identity Provider. For example, I’m not authorized to host an OpenID Identity Provider on ebiquity.umbc.edu. In this case, a user can define the configuration for delegating authentication in an HTML document that is linked from the customized URI (see an example and howto).
Since the configuration block for delegating authentication builds on HTML link elements (the rel and href attributes), we can add to this block links to a FOAF document about a person.
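As a sketch, the delegation block plus a FOAF autodiscovery link might look like this (the server endpoint and the FOAF document URL are illustrative, not prescribed by either spec):

```html
<!-- OpenID delegation: who authenticates this URI, and as which identity -->
<link rel="openid.server" href="https://www.myopenid.com/server" />
<link rel="openid.delegate" href="http://harry.chen.myopenid.com" />
<!-- FOAF autodiscovery: an RDF description of the page's owner -->
<link rel="meta" type="application/rdf+xml" title="FOAF"
      href="http://example.org/people/harry/foaf.rdf" />
```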
The benefit of this approach is that any OpenID-enabled site can automatically fetch personal profile information about its users (a photo that depicts the person, their home base location, nearby airport, a group of people that the user knows, etc.). I’ve added the above definition in my OpenID HTML page.
OpenID is an emerging web framework that attempts to simplify the management of web authentication profiles (usernames and passwords). Since URIs are used to identify users in this framework, I believe we should borrow those URIs to represent people in FOAF documents. In addition, I believe we can extend OpenID’s “delegating authentication” method to link FOAF documents from people’s OpenID URIs. Not only will this bring the benefit of single sign-on, it will also enable web sites to automatically discover people’s profiles and their social network information.
Posted at 15:14
The AgentOWL library was developed to support RDF/OWL ontology models in the JADE [Java Agent Development Framework] agent system. The library uses Jena for ontology model manipulation. Its functionality is shown in a simple example of two communicating agents. It covers functionalities such as:
----
Not sure I've blogged about this before (I did add a page to the ESW Wiki: SemanticWebAgentFramework), but I was looking at JADE not long ago for dayjob work. Basically the purpose of the system being built is to query various disparate data sources as if they were a single database. One feature that was required was to be able to get some results immediately, but for it to be possible to repeat the query (refresh the browser) soon after and see any additional results that had been collected. Very async, so the architecture is comprised of several independent chunks of functionality, communicating with each other as (fairly) autonomous agents.
Looking into suitable toolkits to support the agent approach, on Java the clear best choice was JADE. I had quite a bit of fun setting up the basic comms, and found the whole agent approach really quite inspirational. JADE is mature kit, and mostly Just Worked. However, all the particular application (being RDF/web-based) really asked of the agents was to pass around URIs. Although time constraints mean we're using JADE for now (and someone else has been handed that part of the dev), we generally agreed it was major overkill.
Without realising it at the time, much of the last big bunch of code I was playing with for myself was quite agent-like. I'd got various smallish Python scripts each with some particular behaviour, operating against a shared triplestore (see RdfStoreWhiteboard), their behaviour mostly being triggered by cron or HTTP calls. Anyhow, like I said, I really liked the idioms of JADE, but there were some aspects that just didn't seem right:
Yep, all of these are seen through Web spectacles. My conclusion was that it would be really good to make a framework that had a lot in common with JADE (agents, with pluggable behaviours, supporting the core FIPA ACL acts etc) but based directly on Semantic Web tech. No more layers than necessary. A typical agent would comprise a HTTP client and server together with an RDF model, exposing ASK/TELL interfaces based on RDF/SPARQL over HTTP. Some agents might just act as language converters (e.g. exposing an Atom Protocol interface or support GRDDL), others might look after aggregation, or query federation, or inference, or user interface, or...whatever. The framework would internally simply provide a common pluggable-behaviour programming interface to the communication and data subcomponents. Externally, it'd be another wee bit of the (Semantic) Web - the RDF/SPARQL endpoints would just be what you get with those mime types.
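To make the ASK/TELL idea above concrete, here is a minimal in-memory sketch (class and method names are hypothetical, and a set of tuples stands in for the RDF model and the HTTP plumbing):

```python
class TripleAgent:
    """Toy agent exposing TELL (assert a triple) and ASK (match a pattern)."""

    def __init__(self):
        self.store = set()  # stand-in for an RDF model

    def tell(self, s, p, o):
        """Assert one (subject, predicate, object) triple."""
        self.store.add((s, p, o))

    def ask(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None is a wildcard."""
        return [t for t in self.store
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

agent = TripleAgent()
agent.tell("http://example.org/doc", "dc:title", "Agent notes")
print(agent.ask(p="dc:title"))
```

A real agent would of course put ask/tell behind HTTP endpoints speaking SPARQL and RDF; the point here is only the shape of the interface.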
Anyhow, although I started hacking on it a while ago, I made a false start by beginning with a HTTP server from scratch (thinking of making the wiring to the triplestore fairly direct), and have been distracted since. I plan to restart soon, probably using Java/Jena/ARQ/servlets API/Jakarta Commons client. I've got some GRDDL code that should make a reasonable first demo, then I can get on with reworking the pragmatron ideas for it.
Dunno, when I saw what Frederick Giasson was talking about doing with ZitGist (and Kingsley Idehen), I couldn't help thinking that might benefit from the agent paradigm too. But that might just be "when all you have is a hammer..."
Posted at 15:11
OK, this isn’t really a Semantic Web post at all (sorry PlanetRDFers) but I thought I’d share the following odd fact…
I’m on my second week of travel going to various and sundry meetings. Monday night, I was in a Hilton Garden Inn in Albany, New York. Hilton Garden Inn brags that every room has an electronic clock radio with an MP3 player, and there was one in the room. Very nice, except for one odd thing: the clock was five minutes fast and there was no obvious way to change the time setting (I tried all combinations of button presses, looked for instructions, and even called the desk - since I was getting obsessive).
Here’s the thing - Tuesday night I stayed at a different Hilton Garden Inn in Norfolk, Virginia - about 500 miles away. You guessed it, same kind of clock, and it was also 5 minutes fast.
So this seemed odd - I Googled various strings to see if I could find someone else reporting this phenomenon, to see if this was just a weird coincidence or some kind of corporate conspiracy (or a broken central clock) - but I have yet to find anyone else complaining - so I figured I should blog this, and we’ll see if we can find out…
Posted at 13:30
Posted at 10:54
With today's upcoming release of Firefox 2.0 you may want to run multiple versions of Firefox in parallel. Or just run multiple profiles of Firefox 2.0 at once.
Here's how: Running more than one profile in parallel
SeeAlso:
Posted at 17:56
My colleague Kingsley introduced the concept of a multi-dimensional Web (by analogy with the multi-dimensional universe). He described the first four dimensions as:
Dimension 1 = Interactive Web (Visual Web of HTML based Sites aka Web 1.0)
Dimension 2 = Services Web (Presence based Web of Services; a usage pattern commonly referred to as Web 2.0)
Dimension 3 = Data Web (Presence and Open Data Access based Web of Databases aka Semantic Web layer 1)
Dimension 4 = Ontology Web (Intelligent Agent palatable Web aka Semantic Web layer 2)
...
So, the Web as we know it today would have three dimensions:
Personally I would define them as (without talking about Web 1.0 or Web 2.0 or Web X.0):
The Interactive-Web dimension is the Web of humans: documents formatted for human understanding (HTML, DOC, PDF, etc.).
The Services-Web dimension is the Web of functionalities: how humans and machines can play with functionalities of a system.
The Data-Web dimension is the Web of data presence: availability of open and meaningful data. How machines can play with the data of a system.
The Interactive-Web
The Interactive-Web is the Web of humans: a Web where all documents (HTML, PDF, DOC, etc.) are formatted for humans, with visual markers (headers, footers, bold characters, bigger fonts, etc.) to help them scan and quickly find the right information.
But the problem with the Interactive-Web is that it is intended only for humans, so machines (software agents, for example) have real difficulty analyzing and interpreting these documents.
The Services-Web
The Services-Web also exists in the current landscape of the Web: a Web where protocols exist to let people and machines (web services, software, etc.) play with the functionalities of a system.
With this Web, one can manipulate the information within a system (web service) without using the primary user interface developed for this purpose. That way, power is given back to the users, letting them manipulate (in most cases) their data using the user interface they like.
The Services-Web dimension already exists and is extensively used to publish information on the Web. Fewer web services use it to let people add, modify and delete their own data in the system.
The Data-Web
The Data-Web dimension also exists in the current Web, but it is much more marginal than the first two dimensions. This dimension belongs to the idea of the Semantic Web: developing standards to let machines (software) communicate in a meaningful way. The idea here is to publish structured data intended for machines (not humans) to help them communicate (with the communication assured by the use of standards).
A switch from Services-Web to the Data-Web
What I think will happen is that the Services-Web dimension will no longer be used to publish information from one system to another as it is today. Instead, the Services-Web will only let users trigger functionalities of a system to add, modify and delete data in the system, and the Data-Web will publish data in a meaningful way from one system to another (with the communication of the data assured by standards such as those of the Semantic Web).
So the way we use the Services-Web today is not the way we will use it tomorrow.
Final word
Yesterday I started to write a series of articles to explain the creation of ZitGist and to explain how Talk Digger and Ping the Semantic Web will evolve in the next months and years.
This article is the foundation of my explanation. This is the basic framework I’ll use to explain how Talk Digger and Ping the Semantic Web work and how they interact together and with the Web.
In the next few articles, I’ll explain how these two systems fit in this framework.
Posted at 17:33
The GRDDL Working Group has released the First Public Working Draft of GRDDL. With important applications such as connecting microformats to the Semantic Web, GRDDL is a mechanism to extract RDF statements from suitable XHTML and XML content using programs such as XSLT transformations. GRDDL is ready to deploy, allowing powerful mash-ups at very low cost. You can also read the related W3C press release.
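As a rough illustration of the mechanism, a GRDDL-enabled XHTML page points at the transformation that extracts its RDF (the profile URI is the one used in the draft; the stylesheet URL here is hypothetical):

```html
<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>A GRDDL-enabled page</title>
    <!-- the XSLT that turns this page's markup into RDF -->
    <link rel="transformation" href="http://example.org/xhtml2rdf.xsl" />
  </head>
  <body><p>Content marked up with a microformat.</p></body>
</html>
```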
Posted at 14:36
Google launched Google Co-op late last night, a service to create custom search engines. The central feature is that it prioritizes or restricts search results based on websites and pages that you specify. It also allows one to tag sources, providing a way to focus on hits from sources with a given tag from the results page. You can open up the development and maintenance of your custom search engine to others, allowing people (everyone or just those you invite) to add or exclude sites and to tag sources.
Although Elias Torres beat us to it, we’re experimenting with the service with a Co-op search engine (here) that draws on a number of sites related to the semantic web. Feel free to add to it or tag some resources.
Not surprisingly, Google’s getting lots of press for this (e.g., FT, SF Chronicle).
This is a good idea, though not novel — remember the concept of a focused search engine? The idea of letting users create their own focused search engines through a web interface is also not new. Rollyo is a Yahoo-powered service that offers the same basic capability and Swicki is another that offers some interesting wiki-inspired features. Google’s collaboration model is different though. I wish there were something between “anyone” and “invited individuals” for collaboration. The former opens the door to web spam, which will soon come in. The latter is too simple a model for building a large community. Update: After trying the collaboration a bit, I see that the “anyone” model requires the approval of the owner, so the collaboration model is reasonable, though simple.
Posted at 11:42
The (first) face to face meeting of the Semantic Web Education and Outreach Interest Group takes place between November 14 and 15, 2006, in Burlington, MA, USA, hosted by Oracle. Members of the interest group may register for the meeting using the registration form; the agenda of the meeting is also publicly available.
Posted at 08:03
Anyways, one of the good things about Google Co-op is that planets can have their own search engine too, and Venus doesn’t have to solve this problem for them, except maybe for a “hacky” Google Custom Search Engine template to help ease the pain for the planet lords (too bad you can’t just point your custom search engine at an OPML file on the web; instead you must upload one every time, oh well).
BTW, dajobe, shouldn’t we have Wing on Planet RDF?
Posted at 06:04
Dan Brickley's off to Buenos Aires in a couple of days, and to orient himself he's been doing some mashup code: Flickr groups on Google Earth, good write-up, plenty of screenshots.
What's novel is treating the Flickr API as if it were a SPARQL endpoint, by transforming the results into SPARQL results syntax (XML or JSON). Ok, it's like the results from one preset query, but using the results as an integration point is rather nifty, an easy reusable interface to code against in situations when turning the data into RDF and then querying might be overkill.
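The trick is just wrapping plain rows in the SPARQL Query Results JSON shape. A minimal stdlib sketch (the function name is hypothetical, and everything is treated as a literal, which a real converter would refine to distinguish URIs and bnodes):

```python
import json

def rows_to_sparql_json(varnames, rows):
    """Wrap rows in {"head": {"vars": ...}, "results": {"bindings": ...}}."""
    bindings = []
    for row in rows:
        # One binding object per row, keyed by variable name
        bindings.append({var: {"type": "literal", "value": value}
                         for var, value in zip(varnames, row)})
    return {"head": {"vars": list(varnames)},
            "results": {"bindings": bindings}}

doc = rows_to_sparql_json(
    ["title", "link"],
    [("Flickr groups on Google Earth", "http://example.org/post")])
print(json.dumps(doc, indent=2))
```

Anything that can consume SPARQL JSON results can then consume this output without knowing or caring that no SPARQL engine was involved.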
danbri asked me to whip up a bit of XSLT for it, although he's moving on to doing everything in Ruby. I've been stuck mostly in Java the past few months; I like XSLT but it does seem to turn painful very quickly, so the mention of a scripting language got me eager. I haven't yet started looking at Ruby (still got Lisp on my to-do list), but there was an obvious SPARQL results generator I couldn't resist trying: jaspwr.py * reads feeds using Mark Pilgrim's Universal Feed Parser and outputs JSON SPARQL results based on (much of) Atom's vocabulary. Untested, I only had time to get as far as making it look right.
* will probably keep a more up-to-date version in danbri's svn repository
See also: danbri's note to Iranian Flickr Group.
Posted at 15:39
Posted at 13:46
(Copy’n'pasted from a poster I’ve done recently – I thought it was worth re-posting here)
Public and private organizations can benefit from making data available in machine-readable formats. Such data is most valuable when it is easily accessible and can be re-used by integrating it with other data. Semantic Web technologies support this through
Data published on the Semantic Web can be queried using the SPARQL query language, can be navigated with RDF browsers like Tabulator and Piggy Bank, and is accessible to RDF-consuming Web crawlers.
Posted at 13:37
Posted at 13:20
This week I am off to give the keynote at a joint session of the 2006 Scandinavian Conference on AI and the Finnish AI Symposium. This marks the 20th anniversary of the founding of the Finnish AI Society.
I am happy to do this; last time I attended SCAI was in 1989! I will post my slides after the talk.
Posted at 13:16
Posted at 13:15
Geonames announced the release of its Geonames ontology v1.2. The new ontology has a few enhancements. It introduced the notion of linked data and made a clear distinction between URIs intended for linking documents and those intended for linking ontology concepts.
Different types of geospatial data have different spatial granularity. Data of different spatial granularity may relate to each other by the containment relation: for example, countries contain states, states contain cities, and so on. Some geospatial data are of similar spatial granularity (e.g., two cities that are near each other, or two countries that neighbor each other). To support the representation of these relationships, the ontology introduced three new properties: childrenFeatures, nearbyFeatures and neighbouringFeatures.
In the Semantic Web, both ontology concepts and physical web documents are identified by URIs. Sometimes in applications it’s useful to make clear whether a URI is intended for linking documents or for linking ontology concepts. The new Geonames ontology introduced a URI convention for identifying the intended usage of a URI. This convention also simplifies the discovery of geospatial data using Geonames web services.
Here is an example:
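As a sketch of the convention (the feature id here is illustrative): the URI ending in a slash names the concept itself, while a document URI names the RDF description of it:

```
http://sws.geonames.org/2657896/           the feature (ontology concept)
http://sws.geonames.org/2657896/about.rdf  the RDF document describing it
```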
Other interesting ontology properties include wikipediaArticle and locationMap. The former links a Feature instance to a Web article on Wikipedia, and the latter links a Feature instance to a digital map Web page.
For additional information about Geonames ontology v1.2, see Marc’s post at the Geonames blog.
Posted at 01:50
Protege remains one of the most popular editors for ontologies, including Semantic Web ontologies and RDF and OWL data. The Protege Community of Practice has a wiki that includes an interesting section on modeling tips and tricks. Many of the entries are not specific to Protege or even to OWL and RDF. It looks like a good resource to watch and also to contribute to. (Spotted in a SWIG IRC Scratchpad comment by Dan Brickley.)
Posted at 19:47
Posted at 19:16
Posted at 10:53
Posted at 16:02
Posted at 13:59
Posted at 12:29