DERI Galway
National University of Ireland, Galway   Science Foundation Ireland

Technical Reports

Integrating FLOSS repositories on the Web

Aftab Iqbal, Richard Cyganiak, Michael Hausenblas
Linked Data Research Center, 2012.

Get the document.

D03.03 Data Mining and Semantic Matching Engine

Simon Scerri, Keith Cortis, Ismael Rivera, Cristina Frà Project Consortium, 2012.

Get the document.

Theoretical And Technological Building Blocks For An Innovation Accelerator

Frank van Harmelen, George Kampis, Katy Borner, Peter van den Besselaar, Erik Schultes, Carole Goble, Paul Groth, Barend Mons, Stuart Anderson, Stefan Decker, Conor Hayes, Thierry Buecheler, Dirk Helbing
VU University Amsterdam, 2012.

Modern science is a main driver of technological innovation. The efficiency of the scientific system is of key importance to ensure the competitiveness of a nation or region. However, the scientific system that we use today was devised centuries ago and is inadequate for our current ICT-based society: the peer review system encourages conservatism, journal publications are monolithic and slow, data is often not available to other scientists, and the independent validation of results is limited. The resulting scientific process is hence slow and sloppy. Building on the Innovation Accelerator paper by Helbing and Balietti [1], this paper takes the initial global vision and reviews the theoretical and technological building blocks that can be used for implementing an innovation (in the first place: science) accelerator platform driven by re-imagining the science system. The envisioned platform would rest on four pillars: (i) Redesign the incentive scheme to reduce behavior such as conservatism, herding and hyping; (ii) Advance scientific publications by breaking up the monolithic paper unit and introducing other building blocks such as data, tools, experiment workflows, resources; (iii) Use machine readable semantics for publications, debate structures, provenance etc. in order to include the computer as a partner in the scientific process, and (iv) Build an online platform for collaboration, including a network of trust and reputation among the different types of stakeholders in the scientific system: scientists, educators, funding agencies, policy makers, students and industrial innovators among others.
Any such improvements to the scientific system must support the entire scientific process (unlike current tools that chop up the scientific process into disconnected pieces), must facilitate and encourage collaboration and interdisciplinarity (again unlike current tools), must facilitate the inclusion of intelligent computing in the scientific process, and must facilitate not only the core scientific process, but also accommodate other stakeholders such as science policy makers, industrial innovators, and the general public. We first describe the current state of the scientific system together with up to a dozen new key initiatives, including an analysis of the role of science as an innovation accelerator. Our brief survey will show that there exist many separate ideas and concepts and diverse stand-alone demonstrator systems for different components of the ecosystem, with many parts still unexplored and overall integration lacking. By analyzing a matrix of stakeholders vs. functionalities, we identify the required innovations. We (non-exhaustively) discuss a few of them: Publications that are meaningful to machines, innovative reviewing processes, data publication, workflow archiving and reuse, alternative impact metrics, tools for the detection of trends, community formation and emergence, as well as modular publications, citation objects and debate graphs. To summarize, the core idea behind the Innovation Accelerator is to develop new incentive models, rules, and interaction mechanisms to stimulate true innovation, revolutionizing the way in which we create knowledge and disseminate information.

Get the document.

SPARQL 1.1 and RDF Faceted Browsing

Fadi Maali, Nikolaos Loutas
DERI, 2012.

On the Semantic Web, faceted browsing is a popular choice to support user navigation over RDF data. While existing tools provide powerful navigation and rich user experience, they require transforming the RDF data into a specific format or pre-processing the data to build particular indices and models. We show how the faceted navigation of RDF data can be directly supported by utilising the upcoming version of the standard RDF query language, SPARQL 1.1. By directly mapping faceted navigation to SPARQL queries, we enable navigating the large number of datasets currently available behind SPARQL endpoints. Secondly, we present algorithms necessary to support browsing data distributed across several SPARQL endpoints over the web. An empirical study that provides insights into the queries involved in faceted browsing and compares the different algorithms is also presented.
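
As an illustration of how a single facet might be mapped to a SPARQL 1.1 query, the minimal Python sketch below builds an aggregate query that counts the values of one facet property under the facet values already selected. The function name `facet_count_query`, the example URIs and the query shape are assumptions made for illustration; they are not taken from the report.

```python
def facet_count_query(facet_property, selected=None, limit=20):
    """Build a SPARQL 1.1 query listing the values of one facet with counts.

    facet_property: full URI of the property used as the facet.
    selected: dict mapping property URI -> value URI for facets already chosen.
    """
    selected = selected or {}
    lines = [
        "SELECT ?value (COUNT(DISTINCT ?item) AS ?count)",
        "WHERE {",
        f"  ?item <{facet_property}> ?value .",
    ]
    # Each facet value chosen so far becomes one triple pattern on ?item.
    for prop, value in sorted(selected.items()):
        lines.append(f"  ?item <{prop}> <{value}> .")
    lines += [
        "}",
        "GROUP BY ?value",          # one row per facet value
        "ORDER BY DESC(?count)",    # most frequent values first
        f"LIMIT {int(limit)}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example: count creators among items typed as foaf:Document.
    print(facet_count_query(
        "http://purl.org/dc/terms/creator",
        {"http://www.w3.org/1999/02/22-rdf-syntax-ns#type":
         "http://xmlns.com/foaf/0.1/Document"},
        limit=5,
    ))
```

The resulting string could be sent to any SPARQL 1.1 endpoint; aggregates such as COUNT with GROUP BY are the SPARQL 1.1 features this sketch relies on.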

Get the document.

Dx - Initial Mappings for the Semantic Presence Based Ontology Definition

Maciej Dabrowski, Simon Scerri, Ismael Rivera, Myriam Leggieri
Digital Enterprise Research Institute, 2012.

This deliverable describes the initial set of mappings for a set of the XMPP Extension Protocols (XEPs) relevant for Semantic Presence Ontology definition. In particular, this document identifies the list of XEPs that can contribute to the Semantic Presence. Further, we outline the ontologies that constitute the Semantic Presence model and describe the concepts they model, which facilitate the project use cases. Finally, this document provides a set of mappings that enable the inclusion of the data described with XMPP in the Semantic Presence model.

Developing in the Cloud

Aftab Iqbal, Michael Hausenblas, Stefan Decker
Linked Data Research Centre, 2012.

With cloud computing as an infrastructure model maturing, first efforts around cloud-based software development are emerging. In this model, a cloud-based environment enables developers to edit, test and deploy Web applications through a Web browser. We consider this tool stack to be disruptive in its nature, changing the way software development is being carried out and opening up new opportunities for service providers as well as (teams of) developers. In this report we review the state of the art of cloud-based software development environments, in particular browser-based development environments, and discuss their features as well as related challenges and opportunities.

Get the document.

Library Linked Data Incubator Group Final Report

Thomas Baker, Emmanuelle Bermès, Karen Coyle, Gordon Dunsire, Antoine Isaac, Peter Murray, Michael Panzer, Jodi Schneider, Ross Singer, Ed Summers, William Waites, Jeff Young, Marcia Zeng
W3C, 2011.

Get the document.

Improving the recall of decentralised linked data querying through implicit knowledge

Jürgen Umbrich, Aidan Hogan, Axel Polleres
Digital Enterprise Research Institute, 2011.

Aside from crawling, indexing, and querying RDF data centrally, Linked Data principles allow for processing SPARQL queries on-the-fly by dereferencing URIs. Proposed link-traversal query approaches for Linked Data have the benefits of up-to-date results and decentralised (i.e., client-side) execution, but operate on incomplete knowledge available in dereferenced documents, thus affecting recall. In this paper, we investigate how implicit knowledge – specifically that found through owl:sameAs and RDFS reasoning – can improve the recall in this setting. We start with an empirical analysis of a large crawl featuring 4 million Linked Data sources and 1.1 billion quadruples: we (1) measure expected recall by only considering dereferenceable information, and (2) measure the improvement in recall given by considering rdfs:seeAlso links as previous proposals did. We further propose and measure the impact of additionally considering (3) owl:sameAs links, and (4) applying lightweight RDFS reasoning (specifically ρDF) for finding more results, relying on static schema information. We evaluate our methods for live queries over our crawl.

Get the document.


Danh Le Phuoc, Josiane Xavier Parreira, Manfred Hauswirth
DERI, 2011.

In this report we address the problem of scalable, native and adaptive query processing over Linked Stream Data integrated with Linked Data. Linked Stream Data consists of data generated by stream sources, e.g., sensors, enriched with semantic descriptions, following the standards proposed for Linked Data. This enables the integration of stream data with Linked Data collections and facilitates a wide range of novel applications. Currently available systems use a “black box” approach which delegates the processing to other engines, e.g., stream/event processing engines or SPARQL query processors, by translating to their provided languages. As the experimental results described in this paper show, the need for query translation and data transformation, as well as the lack of full control over the query execution, pose major drawbacks in terms of efficiency. To remedy these drawbacks, we present CQELS (Continuous Query Evaluation over Linked Streams), a native and adaptive query processor for unified query processing over Linked Stream Data and Linked Data. In contrast to the existing systems, CQELS uses a “white box” approach and implements the required query operators natively to avoid the overhead and limitations of a closed-system regime. CQELS provides a flexible query execution framework with the query processor dynamically adapting to the changes in the input data. During query execution, it continuously reorders the operators according to some heuristics to achieve improved query execution in terms of delay and complexity. Moreover, external disk access on large Linked Data collections is reduced with the use of data encoding and caching of intermediate query results. To demonstrate the efficiency of our approach, we present extensive experimental performance evaluations in terms of query execution time, under varied query types, dataset sizes, and number of parallel queries. These results show that CQELS outperforms related approaches by orders of magnitude.

Get the document. (668.58 kB)

Mapping between RDF and XML with XSPARQL

Stefan Bischof, Stefan Decker, Thomas Krennwallner, Nuno Lopes, Axel Polleres
Digital Enterprise Research Institute, 2011.

One of the promises of Semantic Web applications is to seamlessly deal with heterogeneous data. While the Extensible Markup Language (XML) has become widely adopted as an almost ubiquitous interchange format for data, along with transformation languages like XSLT and XQuery to translate from one XML format into another, the more recent Resource Description Framework (RDF) has become another popular standard for data representation and exchange, supported by its own powerful query language, SPARQL, which enables extraction and transformation of RDF data. Being able to work with these two languages using a common framework eliminates several steps that are currently necessary when handling both formats side by side. In this report we present the XSPARQL language, which combines XQuery and SPARQL and thus allows querying XML and RDF data within the same framework and transforming one format into the other. We focus on the semantics of this combined language and present an implementation, including a discussion of query optimisations along with a benchmark evaluation.

Get the document.

A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data

Nuno Lopes, Axel Polleres, Umberto Straccia, Antoine Zimmermann
Digital Enterprise Research Institute, 2011.

We describe a generic framework for representing and reasoning with annotated Semantic Web data, a task becoming more important with the recent increased amount of inconsistent and non-reliable meta-data on the web. We formalise the annotated language and the corresponding deductive system, and address the query answering problem. Previous contributions on specific RDF annotation domains are encompassed by our unified reasoning formalism, as we show by instantiating it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we provide a generic method for combining multiple annotation domains, making it possible to represent, e.g., temporally-annotated fuzzy RDF. Furthermore, we address the development of a query language -- AnQL -- that is inspired by SPARQL, including several features of SPARQL 1.1 (subqueries, aggregates, assignment, solution modifiers) along with the formal definitions of their semantics.
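
To make the idea of reasoning over annotated triples more concrete, the following Python sketch closes a small set of fuzzy-annotated triples under two RDFS rules. It is only a sketch under stated assumptions: the function name `infer_fuzzy`, the prefixed-name strings, the choice of `min` to combine the premises of a rule and `max` to combine alternative derivations are illustrative choices for a fuzzy domain, not the report's formal deductive system.

```python
def infer_fuzzy(triples):
    """Close fuzzy-annotated triples under two RDFS rules.

    triples: dict mapping (subject, predicate, object) -> degree in [0, 1].
    Rules: rdfs:subClassOf transitivity and rdf:type propagation.
    Premises of one rule are combined with min (assumed t-norm);
    alternative derivations of the same triple are combined with max.
    """
    facts = dict(triples)
    changed = True
    while changed:
        changed = False
        derived = {}
        for (c, p, d), deg_sub in facts.items():
            if p != "rdfs:subClassOf":
                continue
            for (s, p2, o), deg_other in facts.items():
                if o != c or p2 not in ("rdf:type", "rdfs:subClassOf"):
                    continue
                # (s type c), (c subClassOf d)        => (s type d)
                # (s subClassOf c), (c subClassOf d)  => (s subClassOf d)
                key = (s, p2, d)
                degree = min(deg_sub, deg_other)
                derived[key] = max(derived.get(key, 0.0), degree)
        # Only keep a derived degree if it improves on what is already known.
        for key, degree in derived.items():
            if degree > facts.get(key, 0.0):
                facts[key] = degree
                changed = True
    return facts


if __name__ == "__main__":
    data = {
        ("ex:tom", "rdf:type", "ex:Cat"): 0.9,
        ("ex:Cat", "rdfs:subClassOf", "ex:Mammal"): 0.8,
        ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"): 1.0,
    }
    for triple, degree in sorted(infer_fuzzy(data).items()):
        print(triple, degree)
```

Swapping `min` and `max` for other operations would be one way to experiment with a different annotation domain, provided the replacements behave like the combination operators that domain requires.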

Get the document.

An Introduction to Cloud Computing

Gerry Conway
Innovation Value Institute (IVI), 2011.

This paper describes cloud computing, its main characteristics and the models that are currently used for both deployment and delivery. It examines the benefits and business issues with using the cloud, and how they can be addressed. It describes some of the early adopters of cloud computing, together with their experiences.

Get the document.


Renaud Delbru, Stephane Campinas, Krystian Samp, Giovanni Tummarello
DERI, 2010.

The performance of Information Retrieval systems is a key issue in large web search engines. The use of inverted indexes and compression techniques is partially accountable for the current performance achievement of web search engines. In this paper, we introduce a new class of compression techniques for inverted indexes, the Adaptive Frame of Reference, that provides fast query response time, good compression ratio and also fast indexing time. We compare our approach against a number of state-of-the-art compression techniques for inverted index based on three factors: compression ratio, indexing and query processing performance. We show that significant performance improvements can be achieved.
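
For readers unfamiliar with the underlying idea, the Python sketch below shows plain Frame of Reference encoding of one block of integers: the block minimum is stored once, and every value is stored as a fixed-width offset from it. This is the basic technique only; how the adaptive variant introduced in the report chooses its frames is not described in the abstract and is not modelled here. The function names `for_encode` and `for_decode` are assumptions for illustration.

```python
def for_encode(block):
    """Encode a non-empty block of integers with Frame of Reference.

    Returns (base, width, count, packed), where every value is stored as
    a width-bit offset from base, packed into one Python integer.
    """
    base = min(block)
    # Number of bits needed for the largest offset in this block.
    width = max(v - base for v in block).bit_length()
    packed = 0
    for i, value in enumerate(block):
        packed |= (value - base) << (i * width)
    return base, width, len(block), packed


def for_decode(base, width, count, packed):
    """Invert for_encode and return the original list of integers."""
    mask = (1 << width) - 1
    return [base + ((packed >> (i * width)) & mask) for i in range(count)]


if __name__ == "__main__":
    doc_ids = [1000, 1003, 1001, 1007]
    encoded = for_encode(doc_ids)
    print(encoded[:3])            # base, bits per value, number of values
    print(for_decode(*encoded))   # round-trips to the original block
```

In this example each value needs only 3 bits for its offset instead of a full machine word, which is where the compression comes from when the values in a block lie close together.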

Get the document.

A Conceptual Architecture for Mashup Makers

Ronan Fox, Manfred Hauswirth
Digital Enterprise Research Institute, National University of Ireland, Galway, 2010.

In this report the conceptual architecture of mashup makers is described. The conceptual architecture is mapped to existing mashup makers by showing examples of how those dimensions are implemented by mashup makers in the provision of services to their user groups.

Towards Technology Structure Mining from Text by Linguistics Analysis

Behrang Qasemizadeh
DERI, 2010.

This report introduces the task of Technology-Structure Mining to support Management of Technology. We propose a linguistics-based approach for the identification of Technology Interdependence through the extraction of technology concepts and the relations between them. In addition, we introduce the Technology Structure Graph for formalizing the task. While the major challenge in technology structure mining is the lack of a benchmark dataset for evaluation and development purposes, we describe steps that we have taken towards providing such a benchmark. The proposed approach is initially evaluated and applied in the domain of Human Language Technology and preliminary results are demonstrated. We further explain research challenges and our research plan.

Get the document.

Continuous Query Optimization and Evaluation over Unified Linked Stream Data and Linked Open Data

Danh Le Phuoc, Josiane Xavier Parreira, Michael Hausenblas, Manfred Hauswirth
DERI, 2010.

In this report we address the problem of scalable query processing over Linked Stream Data integrated with Linked Open Data. Linked Stream Data consists of data generated by stream sources, e.g., sensors, enriched with semantic descriptions, following the standards proposed for Linked Data. This will enable the easy integration of sensor data with the quickly growing amount of Linked Open Data and facilitate the use of the large body of existing software along with a wide range of novel applications. However, the highly dynamic nature of sensor data requires new approaches for data management and processing which are not supported by existing systems. To remedy this, we present our Continuous Query Evaluation over Linked Streams (CQELS) approach which provides a scalable query processing model for unified Linked Stream Data and Linked Open Data. Scalability in CQELS is achieved by applying state-of-the-art techniques for efficient data storage and query pre-processing, combined with a new adaptive cost-based query optimization algorithm for dynamic data sources, such as sensor streams. In traditional Database Management Systems (DBMS), query optimizers use pre-computed selectivity values for the data to decide on the best execution plan, whereas with continuous queries over stream data the data – and consequently its selectivity values – varies over time. This means that the optimal execution plan itself can vary throughout the execution of the query. To overcome this problem, the CQELS query optimizer retains a subset of the possible execution plans and, at query time, updates their respective costs and chooses the least expensive one for executing the query at this given point in time. We have implemented CQELS and our experimental results show that CQELS can greatly reduce query response times while scaling to a realistically high number of parallel queries.
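
The plan-selection idea described in this abstract (keep a few candidate plans, refresh their costs as selectivities change, run the cheapest) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class `AdaptivePlanChooser`, the operator names and the cost model (sum of intermediate result sizes) are hypothetical and are not the optimizer implemented in CQELS.

```python
class AdaptivePlanChooser:
    """Keep a few candidate plans and pick the cheapest under current statistics."""

    def __init__(self, plans):
        # plans: dict mapping plan name -> ordered list of operator names
        self.plans = dict(plans)

    def cost(self, plan, selectivity, input_size):
        """Assumed cost model: sum of intermediate result sizes in plan order."""
        total, size = 0.0, float(input_size)
        for op in plan:
            size *= selectivity[op]   # fraction of tuples that survive op
            total += size
        return total

    def choose(self, selectivity, input_size):
        """Recompute all plan costs and return (best plan name, all costs)."""
        costs = {
            name: self.cost(plan, selectivity, input_size)
            for name, plan in self.plans.items()
        }
        best = min(costs, key=costs.get)
        return best, costs


if __name__ == "__main__":
    chooser = AdaptivePlanChooser({
        "a_first": ["a", "b"],
        "b_first": ["b", "a"],
    })
    # Selectivities observed at two different points in time (made-up values).
    print(chooser.choose({"a": 0.1, "b": 0.5}, 1000)[0])
    print(chooser.choose({"a": 0.9, "b": 0.2}, 1000)[0])
```

Calling `choose` again whenever fresh selectivity estimates arrive is what lets the preferred plan change while the continuous query keeps running.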

Get the document.

Service Protocol Replaceability Assessment

ZhangBing Zhou, Feng Gao
DERI, 2010.

Given the inherent autonomy, heterogeneity, and continuous evolution of Web services, mismatches usually exist between service protocols and mediated interactions are a common style of service interactions. Given a requestor service and an interaction to be conducted, if the provider service is found unavailable, we need to identify the most suitable provider service from a set of functionally equivalent candidates to replace the original one. Current techniques analyzing protocol replaceability can compute a replacement degree that specifies how replaceable two protocols are, but they cannot determine whether or not, and under which conditions, the effects prescribed by the requestor can be achieved. To address this challenge we propose a technique called replaceability assessment in this paper where, according to the adaptation mechanisms of a certain adapter, this technique (i) provides a set of condition pairs that determine when one protocol can be replaced by another, and (ii) computes a replacement degree. The set of condition pairs and the replacement degree are two complementary criteria to be used by the requestor for identifying the most suitable provider service.

Get the document.

Unifying Stream Data and Linked Open Data

Danh Le Phuoc, Josiane Xavier Parreira, Michael Hausenblas, Manfred Hauswirth
DERI, 2010.

With the increasing popularity of sensor networks there has been a lot of effort in lifting the contents of such networks to a semantic level. This opens doors to the integration of sensor data with the more established semantic data sets, such as the Linked Open Data initiative, facilitating a wide range of new applications. However, the distributed, heterogeneous, real-time, and large scale nature of the data collections poses a big challenge. This project aims to build a platform for integrating sensor data with web data using the linked data model. We first address the challenge of heterogeneous data integration of dynamic data sources. Then we focus on how to build an efficient query processor over linked sensor stream data. Scalability is also an issue to be considered. This report describes the current status of the work and the next steps of the project.

Get the document.

Searching and Browsing Linked Data with SWSE: the Semantic Web Search Engine

Aidan Hogan, Andreas Harth, Jürgen Umbrich, Sheila Kinsella, Axel Polleres, Stefan Decker
DERI, 2010.

In this report, we discuss the architecture and implementation of the Semantic Web Search Engine (SWSE). Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines, SWSE operates over RDF Web data (loosely also known as Linked Data) which implies unique challenges for the system design, architecture, algorithms, implementation and user interface. In particular, many challenges exist in adopting Semantic Web technologies for Web data: the unique challenges of the Web (in terms of scale, unreliability, inconsistency and noise) are largely overlooked by the current Semantic Web standards. In this report, we detail the current SWSE system, initially detailing the architecture and later elaborating upon the function, design, implementation and performance of each individual component. In so doing, we also give an insight into how current Semantic Web standards can be tailored, in a best-effort manner, for use on Web data. Throughout, we offer evaluation and complementary argumentation to support our design choices, and also offer discussion on future directions and open research questions. Later, we also provide candid discussion relating to the difficulties currently faced in bringing such a search engine into the mainstream, and lessons learnt from roughly five years working on the Semantic Web Search Engine project.

Get the document. (1.26 MB)

Business Process Model Discovery using Semantics

Gabriela Vulcu, Wassim Derguech, Sami Bhiri
DERI - Digital Enterprise Research Institute, National University of Ireland, Galway, 2010.

Business process model discovery represents a pillar technique that enables business process model reuse. In this paper we describe a method for business process model discovery, which uses semantically annotated business processes. We created an RDF vocabulary for business processes that captures functional, non functional and structural properties that is used in the annotations of basic activities. We developed a set of algorithms to automatically generate different representations of the same business process at different granularity levels. We defined a set of rules to extract the RDF meta data in the annotated business process models and to build an RDF knowledge base which then can be interrogated using SPARQL.

Technical Report: Visual Abstraction and Ordering in Faceted Browsing of Text Collections

VinhTuan Thai, Pierre-Yves Rouille, Siegfried Handschuh
Digital Enterprise Research Institute, 2010.

Faceted navigation is a proven technique for exploration and discovery of a resource collection. In this paper, we report on a visual support toward the exploration of a collection of documents based on a set of entities of interest to users, in which faceted navigation is employed for the filtering process. Our approach can be used when metadata is not available and unlike other faceted browsing work, it treats documents as content-bearing items. We propose using a multi-dimensional visualization as an alternative to the linear listing of focus items. We describe how visual abstractions based on a combination of structural equivalence and conceptual structure can be used simultaneously to deal with a large number of items, as well as visual ordering based on the importance of facet values to support prioritized, cross-facet comparison of focus items. A user study was conducted and it showed that interfaces using the proposed approach can better support users in exploratory tasks and were also well-liked by the participants of the study, with the hybrid interface combining the multi-dimensional visualization with the linear listing receiving the most favorable ratings.

Get the document.

Mapping Mashup Makers to a Design Space

Ronan Fox, Manfred Hauswirth
Digital Enterprise Research Institute, National University of Ireland, Galway, 2010.

In this report the important dimensions that bound the design space of mashup makers are described. These identified dimensions are mapped to existing mashup makers, showing examples of how those dimensions are implemented by mashup makers in the provision of services to their user groups. 13 technical dimensions were identified with 53 classes, which were subsequently mapped to 45 mashup makers.

Get the document.

Minimising RDF Graphs under Rules and Constraints Revisited

Reinhard Pichler, Axel Polleres, Sebastian Skritek, Stefan Woltran
Digital Enterprise Research Institute, National University of Ireland, Galway, 2010.

Based on practical observations on rule-based inference on RDF data, we study the problem of redundancy elimination in RDF in the presence of rules (in the form of Datalog rules) and constraints (in the form of so-called tuple-generating dependencies). To this end, we investigate the influence of several problem parameters (like restrictions on the size of the rules and/or the constraints) on the complexity of detecting redundancy. The main result of this paper is a fine-grained complexity analysis of both graph and rule minimisation in various settings.

Get the document. (654.14 kB)

A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data

Nuno Lopes, Gergely Lukácsy, Axel Polleres, Umberto Straccia, Antoine Zimmermann
Digital Enterprise Research Institute, 2010.

We describe a generic framework for representing and reasoning with annotated Semantic Web data, a task becoming more important with the recent increased amount of inconsistent and non-reliable meta-data on the web. We formalise the annotated language and the corresponding deductive system, and address the query answering problem. Our work extends previous contributions on RDF annotations by providing a unified reasoning formalism and allowing the seamless combination of different annotation domains. We demonstrate the feasibility of our method by formally describing how it can be instantiated on (i) temporal RDF, (ii) fuzzy RDF, and (iii) their combination. A prototype shows that implementing and combining new domains is easy. Furthermore, we address the development of a query language -- AnQL -- that is inspired by SPARQL, including several features of SPARQL 1.1. As a side effect we propose formal definitions of the semantics of these features (subqueries, aggregates, assignment, solution modifiers) which could serve as a basis for the ongoing work in SPARQL 1.1. We demonstrate the value of such a framework by comparing our approach to previously proposed extensions of SPARQL and show that AnQL generalises and extends them.

Get the document. (792.17 kB)

Architecture and Methodologies for Adaptive Personalisation on the Web of Data

Benjamin Heitmann
Digital Enterprise Research Institute, 2010.

Personalisation through recommendation plays an important role for the user experience of many e-commerce and social media sites. Existing services have accumulated the data which is required for good recommendations over time. In order to compete with existing services, new service providers need to acquire data and knowledge in order to provide relevant recommendations. We propose to use the Web of Data for the purpose of recommendation and personalisation. In this report we present a survey of the state of the art in the area of adaptive personalisation. The four main research challenges in relation to the design of adaptive services are: the acquisition of (1) data and (2) knowledge, (3) the selection of relevant features from an open corpus, and (4) the scalability of the architecture, algorithms and data structures used.

This report describes our contributions towards addressing these research challenges: (i) The implementation and evaluation of a collaborative filtering algorithm in the music domain, which uses the Web of Data to address the data acquisition problem. (ii) The description of a use case, of a detailed example and of available Linked Data data sources to address the knowledge acquisition problem. (iii) A detailed example and a discussion of possible approaches to address the open corpus problem on the Web of Data by using feature selection and soft case-based reasoning. (iv) An architecture for open and scalable recommender systems on the Web of Data which is derived from our reference architecture for Semantic Web applications and based on an empirical analysis of Semantic Web applications.

We also show that the Web of Data is not only well suited as data source for current recommendation algorithms, it also allows us to go beyond the capabilities of current approaches for adaptive personalisation and recommendation.

Get the document. (1.93 MB)

Modeling Hierarchical Menu Selections: Effects of Additive Factors

Krystian Samp, Stefan Decker
DERI, 2010.

In this paper we demonstrate that models of hierarchical menu selections should include a component that accounts for additive factors (AF) associated with sub-selections (selections on each menu level). Examples of such AF are dwell time, mouse clicks, or reaction time of pointing movements.

We collect empirical data from 28 participants using two considerably different hierarchical menu designs. The results show that the effects of AF associated with sub-selections are substantial and that models not accounting for them lose their value because they cannot explain the sources of the differences between tested designs.

We also use the collected data to compare performance of the designs and propose a simple measure of performance stability assessing the degree of navigation problems.

Get the document.

On Lightweight Data Summaries for Optimised Query Processing over Linked Data

Andreas Harth, Katja Hose, Marcel Karnstedt, Axel Polleres, Kai-Uwe Sattler, Jürgen Umbrich
DERI Galway, 2009.

Get the document. (714.31 kB)

Discovering Resources on the Web - A Comparison of Discovery Mechanism for the Web of Data and the Web of Documents

Jürgen Umbrich, Michael Hausenblas, Phil Archer, Eran Hammer-Lahav, Erik Wilde
Linked Data Research Centre, 2009.

Discovering information on the Web in a scalable and reliable way is an important but often underestimated task. Research on discovery itself is quite a young field. Hence, to date not many Web-compliant discovery mechanisms exist. Firstly, we introduce a layered Abstract Discovery Model and discuss its features. Then, driven by use cases and requirements, we review three promising discovery proposals in the context of the Web of Data and the Web of Documents: XRD, POWDER, and voiD.

Get the document.

Weaving the Pedantic Web

Aidan Hogan, Andreas Harth, Alexandre Passant, Stefan Decker, Axel Polleres
DERI, 2009.

In this paper we provide detailed discussion regarding common mistakes and issues relating to publishing RDF data on the Web. In particular, we discuss the cause, prevalence and possible solutions for the highlighted issues relating to accessibility, dereferenceability, protocol, usage of core vocabularies, datatypes, inconsistencies and ontology hijacking. Throughout, we provide statistics and examples from the Web to highlight the prevalence and severity of various observed anomalies in RDF Web data. Continuing, we introduce an online system which can be used by publishers to validate their RDF data with respect to our commonly observed mistakes.

Get the document. (545.73 kB)

Linked Data Applications - The Genesis and the Challenges of Using Linked Data on the Web

Michael Hausenblas
Linked Data Research Centre, 2009.

It is the year 2009. Three years after the linked data principles were formulated by Tim Berners-Lee, and two years after the grass-roots community project 'Linking Open Data' started to apply them to publicly available datasets such as Wikipedia, DBLP, and GeoNames, we are still at the very beginning of understanding how to use linked data in order to build Web applications and Web services. This memo outlines the current state of the art, highlights (research) issues and tries to anticipate some of the future developments of linked data.

Get the document.

A document engineering approach to automatic extraction of shallow metadata from scientific publications

Tudor Groza, Siegfried Handschuh, Ioana Hulpus
DERI Galway, National University of Ireland, Galway, 2009.

Semantic metadata can be considered one of the foundational blocks of the Semantic Web and Desktop. This report describes a solution for automatic metadata extraction from scientific publications published as PDF documents. The proposed algorithms follow a low-level document engineering approach, combining mining and analysis of the publications' text based on its formatting style and font information. We evaluate them and compare their performance to other similar approaches. In addition, we present a sample application that represents the use case for the metadata extraction algorithms.

Get the document.

Towards Lightweight and Robust Large Scale Emergent Knowledge Processing

Vit Novacek
DERI, NUIG, 2009.

We present a lightweight framework for processing uncertain emergent knowledge coming from multiple resources with varying relevance. The framework is essentially RDF-compatible, but it also allows for direct representation of an arbitrary number of contextual features, such as the provenance of emergent statements. We support soft integration and robust querying of the represented content based on well-founded notions of aggregation, similarity and ranking. A proof-of-concept implementation is presented and evaluated within large-scale knowledge-based search in life science articles.

Get the document.

TOPDIS: Tensor-based Ranking for Data Search and Navigation

Andreas Harth, Sheila Kinsella
DERI, 2009.

Web sources increasingly use the Resource Description Framework (RDF) as a means for general-purpose knowledge representation. Many database-backed sites are now being complemented with a structured representation of their content; for example, the content of blogs, social networking sites, and Wikipedia is being made available in RDF and can be crawled and aggregated. Searching over large corpora of structured information collected from the Web brings about new challenges for ranking. One particular problem is that searches can result in graphs with thousands of edges, too large for a user to easily absorb. To this end, we introduce TOPDIS, a set of algorithmic tools to determine prominent elements in large semantic graphs. We provide a formalisation of the method using the concepts of multilinear algebra, evaluate scalability of the algorithm and quality of the results, and show how TOPDIS can improve search over structured data in general.

Get the document. (297.48 kB)

Scalable Authoritative OWL Reasoning for the Web

Aidan Hogan, Andreas Harth, Axel Polleres
DERI, 2009.

In this paper we discuss the challenges of performing reasoning on large scale RDF datasets from the Web.

Using ter-Horst's pD* fragment of OWL as a base, we compose a rule-based framework for application to web data: we argue our decisions using observations of undesirable examples taken directly from the Web. We further temper our OWL fragment through consideration of "authoritative sources", which counteracts an observed behaviour that we term "ontology hijacking": new ontologies published on the Web re-defining the semantics of existing entities resident in other ontologies. We then present our system for performing rule-based forward-chaining reasoning, which we call SAOR: Scalable Authoritative OWL Reasoner. Based upon observed characteristics of web data and reasoning in general, we design our system to scale: our system is based upon a separation of terminological data from assertional data and consists of a lightweight in-memory index, on-disk sorts and file-scans. We evaluate our methods on a dataset in the order of a hundred million statements collected from real-world web sources and present scale-up experiments on a dataset in the order of a billion statements collected from the Web.

Get the document. (1.56 MB)

Semantic Web Publishing with Drupal

Stéphane Corlosquet, Richard Cyganiak, Stefan Decker, Axel Polleres
DERI, 2009.

Getting Semantic Web data and annotations into and out of end-user applications is one of the many challenges of making the Semantic Web fly. While linked data principles are slowly taking off, being adopted by a small number of sites and by gradually more exporters that link existing Web content to the Web of Data, we still find ourselves in the cold-start phase. While content management systems (CMS), blogging tools, etc. have made producing Web content easy for end users, the problem of enabling end users to produce Semantic Web content persists. In this short paper, we propose a framework for a one-click solution to lift the huge amounts of Web content residing in CMSs to the Web of Data. We tackle one of the most popular CMSs nowadays, Drupal, where we enable site administrators to export their site content model and data to the Web of Data without requiring extensive knowledge of Semantic Web technologies. We have developed a Drupal module that maps the inherent site structure of a typical Drupal site to a lightweight ontology that we call the Site Vocabulary, and that exposes site data directly in RDFa. As such, this simple solution would not link to existing Semantic Web data, since site vocabularies exist decoupled from the widely used vocabularies in the Linked Data cloud. To close this gap, we have incorporated an easy-to-use, fail-safe ontology import and reuse mechanism in our module that allows site administrators, with a few clicks, to link their site vocabulary, and thus their site data, to existing, widely used vocabularies on the Web. On the whole, our approach should help to bootstrap the Semantic Web by leveraging the huge amounts of data in CMSs. In approaching CMS site administrators rather than end users, we tackle the problem of adding semantics where we consider it easiest: semantics are fixed at design time based on the site structure, whereafter end users entering data produce Linked Data automatically. We have evaluated our approach in user experiments and report on deployment of our module in the Drupal community.

Get the document.

Assessment of Service Protocols Adaptability

ZhangBing Zhou, Sami Bhiri
DERI, 2009.

Protocol adaptation between services is a key functionality for ensuring successful interactions. Previous work has mainly been interested either in compatibility analysis, which targets direct service interaction, or in constructing adapters for service protocols. In this paper, we are instead interested in characterizing whether two service protocols are adaptable without constructing an adapter, quantifying their degree of adaptability, and identifying the conditions under which they can be properly adapted. We believe such an assessment is a key criterion for selecting the appropriate service among functionally equivalent candidates.

We first introduce a generic method that adapts service protocols without requiring an adapter to be constructed at design time. We then present a technique for investigating the questions raised above.

Get the document. (8.18 MB)

Towards an Efficient Knowledge-Based Publication Data Exploitation: An Oncological Literature Search Scenario

Vit Novacek
DERI, NUIG, 2009.

In this report, we present a solution for robust, scalable extraction and exploitation of knowledge from unstructured text. The robustness is achieved by applying our novel light-weight, similarity-based knowledge representation framework and the respective inference services. The scalability is ensured by the framework’s straightforward anytime implementation on top of a relational database back-end. The potential of our work is exemplified within an oncological literature search scenario, which was motivated by and evaluated with domain experts.

Get the document.

Analyzing mediated service interactions

ZhangBing Zhou, Sami Bhiri
DERI, 2009.

Studying whether service protocols are adaptable is essential for ensuring successful service interactions. Previous work has mainly been interested either in analyzing service compositions that target direct service interactions, or in building adapters that resolve mismatches between service protocols. In this paper we are instead interested in the global behavior, that is, when and under which conditions service protocols can be adapted. This knowledge is very valuable to the user for identifying and selecting a suitable service among functionally equivalent candidates.

We first introduce a generic space-based process mediator that adapts service protocols without requiring an adapter to be built at design time, and present the observation that mediated service interactions are synchronizable. We then formally model mediated service interactions and their conversations, which makes it possible to evaluate when and under which conditions service protocols are adaptable.

Get the document. (363.98 kB)

Empirical KR&R in Action: A New Framework for Emergent Knowledge

Vit Novacek
DERI, NUIG, 2009.

We introduce a framework for practical, essentially empirical exploitation of emergent, primarily automatically extracted knowledge. Efficient and meaningful machine processing of knowledge extracted, e.g., by ontology learning from natural language texts, has been largely an open problem to date. To address this gap, we propose a light-weight, similarity-based knowledge representation framework and respective simple, yet quite practical inference services. Our approach has been motivated by and applied to a life science use case.

Get the document.

ENVOY: A Platform for cooperating widgets on the Web

Ronan Fox, Manfred Hauswirth
DERI, 2009.

Widgets have the potential to improve the way in which applications are developed on the web. With HTML 5, widgets can now communicate with one another and with their hosting platforms. However, there remains the problem of how to enable this communication in a scalable, flexible, and robust manner. There is no existing framework which enables heterogeneous widgets to be used together without a priori knowledge of each widget’s interface protocol. In this paper we propose Envoy, a semantic widget engine, which supports the reuse of functionality encapsulated by intercommunicating web widgets in web applications through the use of a new design methodology which incorporates the semantic description of interfaces, publishing those interfaces, searching for them, and combining them into fully functional, context-aware web applications. As the communication paradigm enables interface reuse, this is particularly well suited to the asynchronous communication style of typical widget-based applications.

Get the document. (289.99 kB)

On Integration Issues of Site-Specific APIs into the Web of Data

Tim Berners-Lee, Richard Cyganiak, Michael Hausenblas, Joe Presbrey, Oshani Seneviratne, Oana-Elena Ureche
Linked Data Research Centre, 2009.

The current Web of Data, including linked datasets, RDFa content, and GRDDL-enabled microformats, is a read-only Web. Although this read-only Web of Data enables data integration, faceted browsing and structured queries over large datasets, we lack a general concept for a read-write Web of Data. That is, we need to understand how to create, update and delete RDF data. Starting from the experience we have gathered with Tabulator Redux, a single-triple update system based on a data Wiki, we review the components necessary to realize a read-write Web of Data. We propose a form-based editing approach for RDF graphs along with the integration of site-specific APIs. Further, we present a concept of a uniform architecture for a read-write Web of Data, including a demonstration. Finally, our work reveals issues and challenges of the proposed architecture and discusses future steps.

Get the document.

D6.2b Final NEPOMUK Architecture

Gerald Reif, Tudor Groza, Simon Scerri, Siegfried Handschuh
NEPOMUK Project Consortium, 2008.

Get the document.

NetTopo: A Framework of Simulation and Visualization for Wireless Sensor Networks

Lei Shu, Chun Wu, Manfred Hauswirth
Digital Enterprise Research Institute, 2008.

Network simulators are necessary for testing algorithms of large-scale wireless sensor networks (WSNs), but lack the accuracy of real-world deployments. Deploying a real WSN testbed provides a more realistic test environment and allows users to get more accurate test results. However, deploying a real testbed is highly constrained by the available budget when the test needs a large-scale WSN environment. By leveraging the advantages of both network simulators and real testbeds, an approach that integrates a simulation environment and a testbed can effectively address both scalability and accuracy issues. Hence, the simulation of a virtual WSN, the visualization of a real testbed, and the interaction between the simulated WSN and the testbed emerge as three key challenges. In this paper, we present an integrated framework called NetTopo that provides both simulation and visualization functions to assist the investigation of algorithms in WSNs. NetTopo provides a common virtual WSN for the purpose of interaction between sensor devices and simulated virtual nodes. Two case studies are described to demonstrate the effectiveness of NetTopo.

Get the document.

NetTopo: Beyond Simulator and Visualizor for Wireless Sensor Networks

Lei Shu, Chun Wu, Manfred Hauswirth
Digital Enterprise Research Institute, DERI, 2008.

Simulation tools are needed for testing and validating algorithms and protocols of wireless sensor networks (WSNs) in large-scale scenarios. Compared with simulators, deploying real WSN testbeds provides a more rigorous and realistic testing environment, which allows researchers to get more accurate test results. However, deploying a real testbed is highly constrained by the available budget, especially when the test needs a large-scale WSN environment. Taking the advantages of both simulation tools and real testbeds into consideration, an approach that integrates a simulation environment and a testbed can effectively address both the scalability and the accuracy issues. Hence, the simulation of virtual WSNs, the visualization of real testbeds, and the interaction between simulated WSNs and testbeds emerge as the three key challenges. In this paper, as the earliest stage of this larger vision, we present NetTopo, an integrated framework that provides both simulation and visualization functions for WSNs to assist the investigation of routing algorithms. Two primary case studies are described, which demonstrate the effectiveness of the design concept.

Get the document.

Space Based Process Mediator

ZhangBing Zhou, Sami Bhiri
DERI, 2008.

Web service interactions lie at the core of Service Oriented Architecture. Due to the inherent autonomy, heterogeneity and continuous evolution of Web services, mediators are often needed to support service interactions and to overcome possible mismatches among Web-service-based business processes. This paper introduces a space-based process mediator which considers both control flow and data flow, presents possible mismatch patterns, and discusses how they can be automatically mediated. Our process mediator can address not only all mismatch patterns prescribed by existing process mediators, but also new mismatch patterns related to data flow. In addition, our process mediator provides a uniform mechanism to perform runtime mediation without the need for design-time work, and greatly facilitates Web service interactions.

Get the document. (365.64 kB)

Enabling Networked Knowledge

Stefan Decker, Manfred Hauswirth
DERI, 2008.

Despite the enormous amounts of information the Web has made accessible, we still lack means to interconnect and link this information in a meaningful way to lift it from the level of information to the level of knowledge. Additionally, new sources of information about the physical world are becoming available through emerging sensor technologies. This information needs to be integrated with the existing information on the Web and in information systems, which requires (light-weight) semantics as a core building block. In this position paper we discuss the potential of a global knowledge space and which research and technologies are required to enable our vision of networked knowledge.

Get the document. (614.54 kB)

Real-life rating algorithm

Mateusz Marmolowski
DERI, 2008.

Get the document. (362.84 kB)

A Process Ontology for Business Intelligence

Armin Haller, Mateusz Marmolowski, Eyal Oren, Walid Gaaloul
DERI, 2008.

This paper presents oXPDL, a process interchange ontology based on the standardised XML Process Definition Language (XPDL). XPDL was introduced to allow process model exchange between information systems, most of which are based on proprietary workflow models. In its current form, XPDL allows only syntactic vendor-specific extensions without clear semantics, has only limited support for informational and organisational modelling aspects and cannot be interlinked to existing standardised knowledge bases. Our process interchange ontology oXPDL explicitly models the complete semantics of XPDL process models in a standard Web ontology language. The oXPDL ontology has a strong focus on the reuse and integration of existing standard ontologies such as PSL, RosettaNet, SUMO and eClassOWL. We present the ontology and the accompanying tool to automatically translate an XPDL process model to its corresponding oXPDL. The oXPDL process models may be used for integrated process analysis, by querying and reasoning over multiple models, each of which may originate from different information systems, in combination with business rules described in background ontologies.

Get the document. (365.65 kB)

Analysis and mining of ontological process model instances

Armin Haller, Walid Gaaloul, Mateusz Marmolowski
DERI, 2008.

The application of ontologies to Business Process Management has been introduced to improve the level of automation in the execution of processes. Ontologies aid in the description of process models as well as of their exchanged data. This paper presents an ontological model for workflow logs, namely oXPDL+. The model builds upon a process interchange ontology based on the standardised XML Process Definition Language (XPDL). oXPDL+ can be used to exchange not only process models, but also workflow logs representing instances of such process models. We present a mapping architecture and implementation to populate a knowledge base with ontologically described process models and workflow logs. By defining a set of ordering relations in the ontological model, it is possible to analyse the model based on a combination of its static and behavioural properties. The use of semantics to model processes makes it possible to interlink local information with knowledge defined in background ontologies, leading to significant enhancements in analysis and mining techniques.

Get the document. (345.95 kB)

Automated Empirical Reasoning -- General Motivations and Theory Essentials

Vit Novacek
DERI, NUIG, 2008.

We introduce a novel reasoning framework based on a respective knowledge representation theory. It aims at efficient and non-trivial inference of valuable conclusions even from relatively loosely specified knowledge in automatically extracted resources (e.g. learned ontologies or RDF extracted from databases). The noisy and flat nature of such inputs hampers appropriate processing by traditional logic-based inference engines to a large extent, which is broadly described within our motivations. Therefore we turn away from logical knowledge representation paradigms, propose a theory of empirical computational semantics and advocate its better suitability in certain application scenarios.

Get the document.

Didaskon Algorithm Specification

Jacek Jankowski, Cosmin Basca
DERI Galway, 2008.

The purpose of this document is to describe the Didaskon Algorithm. It provides a detailed specification of HTN Planning and its use in course generation.

Get the document.

XSPARQL: Traveling between the XML and RDF worlds - and avoiding the XSLT pilgrimage

Waseem Akhtar, Jacek Kopecký, Thomas Krennwallner, Axel Polleres
DERI, 2007.

With currently available tools and languages, translating between an existing XML format and RDF is a tedious and error-prone task. The importance of this problem is acknowledged by the W3C GRDDL working group, which faces the issue of extracting RDF data out of existing HTML or XML files, as well as by the Web service community around SAWSDL, which needs to perform lowering and lifting between RDF data from a semantic client and XML messages for a Web service. However, at the moment, both these groups rely solely on XSLT transformations between RDF/XML and the respective other XML format at hand. In this report we propose a more natural approach for such transformations based on merging XQuery and SPARQL into the novel language XSPARQL. We demonstrate that XSPARQL provides concise and intuitive solutions for mapping between XML and RDF in either direction, addressing both the use cases of GRDDL and SAWSDL. We also provide and describe an initial implementation of an XSPARQL engine, available for user evaluation.

Get the document.

D2.3.8v2 Report and Prototype of Dynamics in the Ontology Lifecycle

Vit Novacek, Loredana Laera, Siegfried Handschuh, Jan Zemanek, Max Volkel, Rokia Bendaoud, Mohamed Rouane Hacene, Yannick Toussaint, Bertrand Delecroix, Amedeo Napoli
DERI, NUIG, 2007.

Deliverable D2.3.8v2 (WP2.3) presents a novel ontology integration technique that explicitly takes the dynamics and data-intensiveness of many practical application domains into account.

This technique fully implements a crucial part of the dynamic ontology lifecycle scenario defined in D2.3.8v1. In particular, we tackle semi-automatic integration of ontology learning results into a manually developed ontology. This integration is based on automatic negotiation of agreed alignments, inconsistency resolution, ontology diff computation and natural language generation methods. Their combination alleviates the end-user effort in the dynamic incorporation of new knowledge to a large extent, thus conforming to the principles specified in D2.3.8v1. As such, it allows for a basic application of all the dynamic ontology lifecycle features we have proposed.

Get the document.

Semantic Web Pipes

Christian Morbidoni, Axel Polleres, Giovanni Tummarello, Danh Le Phuoc
DERI Galway, 2007.

This report presents Semantic Web pipes, a powerful paradigm to build RDF-based mashups. Semantic Web pipes work by fetching RDF models on the Web, operating on them, and producing an output which is itself accessible via a stable URL. We illustrate how Semantic Web pipes can solve use cases ranging from simple aggregation to complex collaborative editing and filtering of distributed RDF graphs. To this end, we introduce the concept of RDF revocations and describe a pipe operator that can apply such revocations. This operator enables Semantic Web pipes where agents cooperatively develop semantically structured knowledge, while still retaining and publishing their individual contributions and beliefs. We conclude with a description of two available implementations.

Get the document. (767.33 kB)

D2.3.9 Theoretical Aspects for Ontology Lifecycle

Vit Novacek, Zhisheng Huang, Alessandro Artale, Norman Foo, Enrico Franconi, Tommie Meyer, Mathieu d'Aquin, Jean Lieber, Amedeo Napoli, Giorgos Flouris, Jeff Z. Pan, Dimitris Plexousakis, Siegfried Handschuh, et al.
DERI, NUIG, 2007.

Deliverable D2.3.9 (WP2.3) presents a study on theoretical aspects of ontology lifecycle and dynamic maintenance. Several crucial topics are analysed, ranging from the logical groundwork of ontology dynamics based on belief-change theory, through the semantics of ontology diffs, to multi-version reasoning. Building on the theoretical studies, the report also discusses basic practical guidelines and a combined approach to multi-version ontology reasoning. This is meant to provide a tangible binding between the theory and the interests of practitioners.

Get the document.

MarcOnt Portal 1.0 Research Stable

Maciej Dabrowski, Szymon Pajak, Michal Nowacki, Mateusz Marmolowski, Sebastian Ryszard Kruk
DERI, 2007.

One of the key conceptions of the Semantic Web is the ontology. Ontologies define concepts and relations from the given domain, thus allowing for other tools like reasoners to infer new knowledge based on the predefined knowledge. In order to ensure their role in the vision of the Semantic Web interoperability, ontologies should be delivered based on agreement within the community. MarcOnt is an approach for collaborative ontology development and management. It introduces the notion of suggestions, negotiation and versioning to the ontology life-cycle. We present the idea that the metadata mediation information for the ontology should follow the same life-cycle pattern as the ontology itself. We also show how a collaborative ontology management life-cycle can be achieved with the MarcOnt Portal.

Get the document.

MarcOnt Portal User Guide

Mateusz Marmolowski, Maciej Dabrowski, Michal Nowacki, Szymon Pajak, Sebastian Ryszard Kruk
DERI Galway, 2007.

One of the key conceptions of the Semantic Web is the ontology. Ontologies define concepts and relations from the given domain, thus allowing for other tools like reasoners to infer new knowledge based on the predefined knowledge. In order to ensure their role in the vision of the Semantic Web interoperability, ontologies should be delivered based on agreement within the community. MarcOnt is an approach for collaborative ontology development and management. It introduces the notion of suggestions, negotiation and versioning to the ontology life-cycle. We present the idea that the metadata mediation information for the ontology should follow the same life-cycle pattern as the ontology itself. We also show how a collaborative ontology management life-cycle can be achieved with the MarcOnt Portal.

Get the document.

MarcOnt Portal Service Oriented Architecture

Maciej Dabrowski, Michal Wozniak, Mateusz Marmolowski, Michal Nowacki, Szymon Pajak
DERI, 2007.

MarcOnt is an approach for collaborative ontology development and management. It introduces the notion of suggestions, negotiation and versioning to the ontology life-cycle. We present the idea that the metadata mediation information for the ontology should follow the same life-cycle pattern as the ontology itself. This document describes the architecture of MarcOnt Portal. It introduces the REST-based Service Oriented Architecture framework that utilizes semantic descriptions of Web Services. The MarcOnt REST Framework allows new services to be built easily using the REST approach, with RDF as the information interchange format. This document discusses technical details of the implementation and the architecture of the collaborative ontology development tool, MarcOnt Portal.

Get the document.

YARS2: A Federated Repository for Searching and Querying Graph Structured Data

Andreas Harth, Jürgen Umbrich, Aidan Hogan, Stefan Decker
Digital Enterprise Research Institute, Galway, 2007.

We present the architecture of an end-to-end semantic search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web.

In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers.

We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements.

Get the document. (227.86 kB)

JeromeDL 2.0.1 User Guide

Sebastian Ryszard Kruk, Mariusz Cygan, Ewelina Kruk, Slawomir Grzonkowski, Tomasz Woroniecki
Digital Enterprise Research Institute, Galway, 2007.

JeromeDL is a Social Semantic Digital Library. As a digital library, it allows institutions to easily publish documents on the Web. It supports a variety of document formats, and it allows a rich bibliographic description of each document to be stored and queried. To find relevant documents in JeromeDL, users can use searching and browsing features. The full content of documents can be searched, as well as single fields of a document's description, such as author or publication year. Users can also find documents by browsing the content of subject categories and keywords. With JeromeDL's social and semantic services, every library user can bookmark interesting books, articles or other materials in semantically annotated directories. Users can allow others to see their bookmarks and annotations and share their knowledge within a social network. JeromeDL can also treat a single library resource as a blog post. Users can comment on the content of the resource and reply to others' comments, and in this way create new knowledge. This document will guide the user through the installation process and all administration tasks for librarians. It will also present the most important features for end users.

Get the document.

Software Specification of the InContext PCSA (Pervasive Collaboration Services Architecture)


Get the document.

D-FOAF: Role-based Access Control Standard

Slawomir Grzonkowski, Sebastian Ryszard Kruk
DERI, 2007.

This document is also available in non-normative

Get the document.

D-FOAF: Functional Description

Slawomir Grzonkowski
DERI, 2007.

This deliverable gives a functional description of the FOAFRealm/D-FOAF system.

Get the document.

Distributed-FOAFRealm (D-FOAF) APIs

Slawomir Grzonkowski, Adam Gzella
DERI, 2007.

This deliverable will describe the APIs of the D-FOAF system.

Get the document.

Distributed-FOAFRealm (D-FOAF) Architecture Description

Slawomir Grzonkowski
DERI, 2007.

This deliverable will describe the architecture of the D-FOAF system. The description concerns modules and their dependencies. Moreover, a deployment diagram will be described because of the distributed project nature.

Get the document.

Controlled Language IE Components version 2.

Adam Funk, Brian Davis, Valentin Tablan, Kalina Bontcheva, Hamish Cunningham.
University of Sheffield, UK, DERI, NUIG, 2007.

Get the document.

LOstRepository (Learning Object Repository) - Specification

Jacek Jankowski
DERI Galway, 2007.

Get the document.

SemPerKit - Semantic Personal Web Site Starter Kit

Jacek Jankowski
DERI Galway, 2007.

Get the document.

D2.3.8v1 Report and Prototype of Dynamics in the Ontology Lifecycle

Vit Novacek, Siegfried Handschuh, Diana Maynard, Loredana Laera, Sebastian Ryszard Kruk, Max Voelkel, Tudor Groza, Valentina Tamma
DERI, NUIG, 2006.

Deliverable D2.3.8v1 (WP2.3) proposes the dynamic ontology lifecycle schema and discusses its implementation. It provides concrete solutions to ontology development and evolution issues in highly dynamic and data-intensive environments. Particularly, it deals with proper placement of ontology learning, evaluation and negotiation methods and with integration of learned and collaborative ontologies in a novel way. The transfer possibilities of the framework are justified by elaborated application scenarios from the medicine domain.

Get the document.

Revisiting and Simplifying RDF

Benjamin Heitmann, Eyal Oren, Max Völkel
Digital Enterprise Research Institute, 2006.

RDF, a simple yet expressive data model, is widely recognised as the foundation for the Semantic Web. We revisit the RDF specification and analyse five problems: literals are not addressable, blank nodes are not addressable, reification is not semantically recognised, language tags are not addressable, and language tags can not be combined. We introduce SiRDF, a simplified data model with fewer modelling elements that overcomes the mentioned problems, and formally show its semantic equivalence to RDF. Based on the lessons of SiRDF we recommend a restricted usage of RDF that solves the mentioned problems while fully adhering to the original RDF specification.

Get the document.

FOAFRealm Ontology Specification

Slawomir Grzonkowski, Sebastian Ryszard Kruk, Adam Gzella, Tomasz Woroniecki
DERI, 2006.

The proposed FOAFRealm (Friend-of-a-Friend Realm) system makes it possible to take advantage of social networks and FOAF profiles in user profile management systems. To do so, the FOAF standard must be enriched with new concepts and properties, which are described in this document. The enriched version is called FOAFRealm.

Get the document.

Web Service Discovery - A Reality Check

Daniel Bachlechner, Katharina Siorpaes, Dieter Fensel, Ioan Toma
DERI, 2006.

Web services are about the integration of applications via the web. The programming effort should be minimized through the reuse of standardized components and interfaces. Semantic web services try to provide the next step by mechanizing important sub tasks within a service-oriented architecture. Otherwise, significant manual programming effort would remain as a bottleneck for this approach. One of the sub tasks in a service-oriented architecture is service discovery. While a significant number of papers have already been published in this area, most of them are more concerned with providing yet another illustration for an arbitrary logical framework than with providing a contribution that meets current constraints in given practical settings. In this paper, we first provide an empirical enumeration of existing approaches towards web service discovery. This sets the basis for analyzing the strengths and weaknesses of the existing approaches as well as for predicting potential future improvements in this area. We also identify a useful role for semantic techniques, as long as they are applied in a proper setting.

Get the document. (247.71 kB)

The Fundamental Premises of the Digital Enterprise Research Institute (DERI) International

Michael L. Brodie, Jim Browne, Dieter Fensel, Sung-Kook Han
DERI, 2005.

DERI International is a collaborative organization of research institutes worldwide that are committed to the realization of the semantic web and semantic web services through collaborative and open methods and in accordance with the five fundamental DERI Premises: first, a core technology in support of problem solving environments; second, a conceptual model for service-oriented computing (Web Service Modeling Ontology (WSMO)); third, Web Service Modeling Language (WSML) a family of languages that provides formal semantics for WSMO models; fourth, a Web Service Execution Environment (WSMX) that supports reasoning and mediation in WSMO compliant solutions; and fifth, Triple-Space Computing, a new communication paradigm for services based on the Semantic Web paradigm. This document provides DERI’s motivating vision, its five fundamental premises, and the first steps in defining DERI International.

Get the document. (162.91 kB)

Semantically Enabled Service-Oriented Architectures: A Manifesto and a Paradigm Shift in Computer Science

Michael L. Brodie, Christoph Bussler, Jos de Bruijn, Thomas Fahringer, Dieter Fensel, Martin Hepp, Holger Lausen, Dumitru Roman, Thomas Strang, Hannes Werthner, Michal Zaremba
DERI, 2005.

After four decades of rapid advances in computing, we are embarking on the greatest leap forward in computing that includes revolutionary changes at all levels of computing from the hardware through the middleware and infrastructure to applications and more importantly in intelligence. This paper outlines a comprehensive framework that integrates two complementary and revolutionary technical advances, Service-Oriented Architectures (SOA) and Semantic Web, into a single computing architecture, that we call Semantically Enabled Service Oriented Architecture (SESA). While SOA is widely acknowledged for its potential to revolutionize the world of computing, that success depends on resolving two fundamental challenges that SOA does not address: integration, and search or mediation. In a services-oriented world, billions of services must be discovered and selected based on requirements, then orchestrated and adapted or integrated. SOA depends on but does not address either search or integration. The contribution of this paper is to provide the semantics-based solution to search and integration that will enable the SOA revolution. The paper provides a vision of the future enabled by our framework that places computing and programming at the services layer and places the real goal of computing, problem solving, in the hands of end users.

Get the document. (1.44 MB)

The Scientific Role of Computer Science in the 21st Century

Dieter Fensel, Dieter Wolf
DERI, 2005.

This paper defines the future role of computer science in its scientific context. We will show that the future prospects of mathematics, physics, biology, and human brain research will greatly depend on understanding them as branches of applied computer science. Therefore, we will establish applied computer science as a unifying foundation of natural sciences.

Get the document. (246.84 kB)

The Web Service Modeling Language WSML: An Overview

Jos de Bruijn, Holger Lausen, Axel Polleres, Dieter Fensel
DERI, 2005.

The Web Service Modeling Language (WSML) is a language for the specification of different aspects of Semantic Web Services. It provides a formal language for the Web Service Modeling Ontology WSMO which is based on well-known logical formalisms, specifying one coherent language framework for the description of Semantic Web Services, starting from the intersection of Datalog and the Description Logic SHIQ. This core language is extended in the directions of Description Logics and Logic Programming in a principled manner with strict layering. WSML distinguishes between conceptual and logical modeling in order to facilitate users who are not familiar with formal logic, while not restricting the expressive power of the language for the expert user. IRIs play a central role in WSML as identifiers. Furthermore, WSML defines XML and RDF serializations for inter-operation over the Semantic Web.

Get the document. (353.52 kB)

A Minimal Triple Space Computing Architecture

Christoph Bussler
DERI, 2005.

The visionary approach of Triple Space Computing was recently introduced based on the insight that Web Services do not follow the Web paradigm of 'persistently publish and read'. Instead, Web Services currently require a synchronous connection to transmit data transparently bypassing and ignoring the power of the Web paradigm. Triple Space Computing proposes to publish communication data analogous to the publication of Web pages: persistently for anybody to read who has access to it at any point in time. This has several benefits. The provider of data can publish it at any point in time (time autonomy), independent of its internal storage (location autonomy), independent of the knowledge about potential readers (reference autonomy) and independent of its internal data schema (schema autonomy). This article introduces a minimal Internet-scalable Triple Space Computing architecture based on Semantic Web technology that implements these four types of autonomy in the simplest way possible with as minimal functionality as feasible to be useful with no or almost no impact to publishers and readers of communication data.

Get the document. (120.37 kB)

Aspects in Workflow Management

Simeon Petkov, Eyal Oren, Armin Haller
DERI, 2005.

Many different workflow systems have been developed that focus on different application domains and provide different functionality. Workflow management lacks a standardised theory that provides a theoretical background for workflows like the relational algebra provides for databases; despite efforts of standardisation bodies there is no consensus on the representation or conceptual model of workflow processes. A number of approaches attempt to address this situation. Jablonski and Bussler describe a number of essential perspectives and aspects of a comprehensive workflow management functionality; they provide a structure in the complex environment of workflows. van der Aalst et al. and Russell et al. systematically analyse available functionality in existing workflow management systems, and categorise it into a number of workflow patterns; these patterns are devoid of implementational issues and form a qualitative standard against which existing workflow management systems can be benchmarked.

Our work integrates and extends these approaches; we describe various aspects of workflow management; we indicate how these can be modelled and what quality indicators can be used to assess their support in workflow management systems.

Get the document. (358.69 kB)

Formal Frameworks for Workflow Modelling

Eyal Oren, Armin Haller
DERI, 2005.

We survey formal frameworks for workflow modelling. We summarise important aspects of workflow management and approaches to evaluate current workflow management systems. We discuss a number of formalisms for workflow modelling, namely Petri nets, Temporal Logic, and Transaction Logic. We describe how these formalisms are used specifically for workflow modelling, their possibilities and their disadvantages.

Get the document. (361.33 kB)

OWL DL vs. OWL Flight: Conceptual Modeling and Reasoning for the Semantic Web

Jos de Bruijn, Dieter Fensel, Rubén Lara, Axel Polleres
DERI, 2004.

The Semantic Web languages RDFS and OWL have been around for some time now. However, the presence of these languages has not brought the breakthrough of the Semantic Web the creators of the languages had hoped for. OWL has a number of problems in the area of interoperability and usability in the context of many practical application scenarios which impede the connection to the Software Engineering and Database communities. In this paper we present OWL Flight, which is loosely based on OWL, but the semantics is grounded in Logic Programming rather than Description Logics, and it borrows the constraint-based modeling style common in databases. This results in different types of modeling primitives and enforces a different style of ontology modeling. In this paper we analyze the modeling paradigms of OWL DL and OWL Flight, as well as reasoning tasks supported by both languages. We argue that different applications on the Semantic Web require different styles of modeling and thus both types of languages are required for the Semantic Web.

Get the document. (409.68 kB)

Online Social and Business Networking Communities

Ina O'Murchu, John Breslin, Stefan Decker
DERI, 2004.

The ability to send and retrieve information over the Web using traditional and ubiquitous computing methods has changed the way we work and live. Web portals, as content aggregators, act as gateways to pertinent and up-to-date information. Social networking portals are a recent development, allowing a user to create and maintain a network of close friends or business associates for social and/or professional reasons. The main types of social network sites will be classified, and an evaluation will be performed in terms of features and functionality.

Get the document. (252.09 kB)

Linking Semantically Enabled Online Community Sites

Andreas Harth, John Breslin, Ina O'Murchu, Stefan Decker
DERI, 2004.

Online community sites have replaced the traditional means of keeping a community informed via libraries and publishing. At present, online communities are islands that are not interlinked. We describe different types of online communities and tools that are currently used to build and support such communities. Ontologies and semantic web technologies offer an upgrade path to providing more complex services. Fusing information and inferring links between the various applications and types of information provides relevant insights that make the available information on the Internet more valuable. We present the SIOC ontology which combines terms from vocabularies that already exist with new terms needed to describe the relationships between concepts in the realm of online community sites.

Get the document. (149.26 kB)

Towards an Ontology Mapping Specification Language for the Semantic Web

Jos de Bruijn, Axel Polleres
DERI, 2004.

This paper addresses the requirements for an Ontology Mapping Specification Language for the Semantic Web. We present a set of generic technical use cases and a number of application scenarios, which are used as a basis for a list of requirements for such a mapping language. Furthermore, we discuss further steps to be taken towards a useful language fulfilling the requirements we identify in terms of flexibility and expressiveness.

Get the document. (333.38 kB)

Triple-based Computing

Dieter Fensel
DERI, 2004.

This white paper discusses possible paths in moving the web from a collection of human readable information connecting humans into the direction of a web that connects computing devices based on machine-processable semantics of data and distributed computing. We analyze current shortcomings of web service technology and propose a new paradigm for fully enabled semantic web services which we call triple-based computing.

Get the document. (193.52 kB)

A Semantic Matchmaker Service on the Grid

Andreas Harth, Yu He, Hongsuda Tangmunarunkit, Stefan Decker, Carl Kesselman
DERI, 2004.

A fundamental task on the Grid is to decide what jobs to run on what computing resources based on job or application requirements. Our previous work on ontology-based matchmaking discusses a resource matchmaking mechanism using Semantic Web technologies. We extend our previous work to provide dynamic access to such matchmaking capability by building a persistent online matchmaking service. Our implementation uses the Globus Toolkit for Grid service development. It also exploits the monitoring and discovery service in the Grid infrastructure to dynamically discover and update resource information. A schema translator is developed to translate the various formats used in a heterogeneous Grid environment into our resource ontology. We describe the architecture of our semantic matchmaker service.

Get the document. (187.27 kB)

Semantic Information Integration in the COG Project

Jos de Bruijn, Ying Ding, Sinuhé Arroyo, Dieter Fensel
DERI, 2004.

Information integration in enterprises is hindered by differences in software and hardware platforms and by syntactic and semantic differences in the schemas of the data sources. This is a well-known problem in the area of Enterprise Application Integration (EAI), where many applications have been developed for the purpose of information integration. Most current tools, however, only address the problems of (soft- and hardware) platform and syntactic heterogeneity; they fail to address semantic differences and only support one-to-one (syntactical) mappings between individual schemas. In this White Paper we present the approach to semantic information integration that was applied in the COG project. We describe the Semantic Information Management along with the Unicorn Workbench tool, part of the Unicorn System, and how they were applied in the project to solve the information integration problem. We used the Semantic Information Management Methodology and the Unicorn Workbench tool to create an Information Model (an ontology) based on data schemas taken from the automotive industry. We map these data schemas to the Information Model in order to make the meaning of the concepts in the data schemas explicit and relate them to each other, thereby creating an information architecture that provides a unified view of the data sources in the organization. We furthermore provide an extensive survey of other efforts in semantic information integration and a comparison with our approach in the COG project.

Get the document. (1.06 MB)

The Social Semantic Desktop

Stefan Decker, Martin Frank
DERI, 2004.

In this whitepaper we present a vision of a new group collaboration infrastructure, the Social Semantic Desktop, drawing from co-evolving research in the Semantic Web, Peer-to-Peer (P2P) Networks, and Online Social Networking. The Social Semantic Desktop is a novel collaboration environment, enabling the creation, sharing and deployment of data and metadata.

Get the document. (323.79 kB)

SECO: Mediation Services for Semantic Web Data

Andreas Harth
DERI, 2004.

The Semantic Web has motivated grassroots efforts to develop and publish ontology specifications in RDF. So far, the large amount of RDF data available online has not been utilized to the extent possible. SECO, the application presented in this paper, aggregates, integrates, and displays RDF data obtained from the Semantic Web. SECO collects the RDF data available in files using a crawler, and also utilizes RDF repositories as sources for data. Integration tasks over the various data sources, such as object consolidation and schema mapping, are carried out using a reasoning engine and are encapsulated in mediators to which software agents can pose queries using a remote query interface. SECO includes a user interface component that emits HTML, which allows human users to browse the integrated data set.

Get the document. (321.19 kB)

Semantic Web Portals - State of the Art Survey

Holger Lausen, Michael Stollberg, Rubén Lara, Ying Ding, Sung-Kook Han, Dieter Fensel
DERI, 2004.

Web portals are entry points for information presentation and exchange over the internet used by a community of interest. Therefore they require efficient support for communication and information sharing. Current Web technologies employed to build up these portals present serious limitations regarding facilities for searching, accessing, extracting, interpreting and processing of information. The application of Semantic Web technologies has the potential of overcoming these limitations and will lead to semantically enhanced Web portals. This paper presents the state of the art application of Semantic Web technologies in web portals and the improvements achieved by the use of such technologies.

Get the document. (387.91 kB)

Community Portal Survey

Ina O'Murchu, John Breslin, Stefan Decker
DERI, 2004.

Web-service semantic enabled implementation of Machine vs. Machine business negotiation

Laurentiu Vasiliu, Michal Zaremba, Matthew Moran, Christoph Bussler
DERI, 2004.

The ultimate business to business (B2B) integration and deployment target is complete business process automation within and between enterprises with no human intervention. From a business point of view, negotiation is the main mechanism that modern enterprises use for achieving their profit-maximising targets. This paper introduces an automated B2B negotiation solution: the implementation of a semantic-enabled machine versus machine business negotiation as a web service. It is argued in this article that the shift from human vs. human business negotiation to machine vs. machine business negotiation is facilitated by using semantic web technology and implemented as a web service. The negotiation process in the present work is designed for multiple machines (minimum of 3 computers), using a negotiation algorithm that emulates human business negotiation behaviour. Conclusions and possible development directions are brought forward.

Get the document. (174.55 kB)

FRED Whitepaper - An Agent Platform for the Semantic Web

Michael Stollberg, Holger Lausen, Sinuhé Arroyo, Reinhold Herzog, Peter Smolle, Dieter Fensel
DERI, 2004.

This paper presents the FRED system, a development environment for agent-based applications that utilize Semantic Web resources. The FRED system consists of an agent runtime environment based on ontologies as the underlying data model and offers a tool suite for application development. It uses progressive technologies for task-service-resolution that detect suitable problem solving implementations to solve tasks assigned to agents. Furthermore, FRED allows integration of external Semantic Web Services into the system, thus admitting the assimilation of key technologies which is compulsory for creating Semantic Web applications. We give an overall description of the FRED system by describing its technological solutions, explain the system functionalities and relate this to contemporary approaches. The aim of this paper is to expose requirements on agent platforms for the Semantic Web and to show how these are attained in the FRED system.

Get the document. (488.61 kB)

(C) Copyright 2004-2010 by the Digital Enterprise Research Institute (DERI). All rights reserved.