Sunday 22 January 2012

SASSY - Increment #4

This increment has not gone well. I started off by trialling SPARQL-DL
as a query language for the OWL database. At first it seemed OK - it could run
some of the simple queries that paralleled the examples in its source code.


Getting the results out involved adding in a JSON parser. The only C++ one I
could find wanted to do everything using wide characters which added to the
complexity somewhat.


When I tried some of the more complex queries that will be needed for this
project, things started to go awry. I contacted the author and he confirmed
that it was not capable of doing the queries I wanted. It could handle
queries about relationships between individuals, but not queries about the
relationships between classes.


Further research has not turned up any likely candidates for this problem,
so it appears that I will have to work up my own solution.


At about this time there were some significant changes to my personal
life, employment and so on, which were a bit distracting. I also invested
in a new computer with a bit more performance, and setting up all my
favourite software on this new machine was a significant distraction. Then
Christmas came along to further delay things. Anyway, I hope to be able to put a
lot more time and effort into this project for the next few months.


The break gave me time to reflect on the way forward. My decision was to build
a small interpreted language that could be used to query the ontology database
and construct the document. I chose a threaded interpreted language, similar to
Forth, as the basis since they are very easy to implement.


The interpreter deals with a range of objects, such as integers, strings,
and several complex structures used to return results from the database, plus
arrays of these objects. I wrapped them up and accessed them using a smart
pointer object so that they would be automatically managed and could be
placed onto a data stack.
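
To illustrate the idea, here is a minimal sketch of reference-counted values on
a data stack. It is not the actual SASSY code: the names Value, ValuePtr,
DataStack and the make* helpers are made up for this example, std::shared_ptr
stands in for the smart pointer, and the query result structures are left out.
It assumes a C++17 compiler for std::variant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical value type: an integer, a string, or an array of values.
struct Value;
using ValuePtr = std::shared_ptr<Value>;

struct Value {
    std::variant<std::int64_t, std::string, std::vector<ValuePtr>> data;
};

// Helper constructors (illustrative names only).
ValuePtr makeInt(std::int64_t n) { return std::make_shared<Value>(Value{n}); }
ValuePtr makeString(std::string s) { return std::make_shared<Value>(Value{std::move(s)}); }
ValuePtr makeArray(std::vector<ValuePtr> items) {
    return std::make_shared<Value>(Value{std::move(items)});
}

// The data stack only holds smart pointers, so pushing and popping never
// copies the underlying object, and an object is freed when the last
// pointer to it goes away.
class DataStack {
public:
    void push(ValuePtr v) { items_.push_back(std::move(v)); }

    ValuePtr pop() {
        assert(!items_.empty() && "data stack underflow");
        ValuePtr v = std::move(items_.back());
        items_.pop_back();
        return v;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<ValuePtr> items_;
};
```

Because everything on the stack is the same pointer type, the stack itself does
not need to know what kind of object it is holding; only the words that consume
a value have to inspect it.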


The interpreter uses three stacks: a return address stack for subroutine calls,
a data stack for manipulating the results of the database queries, and an
object stack on which the document is constructed.


The runtime interprets a byte code array. Each instruction entry is either an
index into an array of function objects or the location of a subroutine within
the byte code array itself. The function objects are responsible for
manipulating the stacks and for making calls to the ontology database. The only
other entries in the byte code array are integer parameters, used as jump
targets or as indexes into the data stack to load string constants.
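
A rough sketch of such a runtime loop is shown below. The details are
assumptions made for the example rather than the real implementation: the
opcode names (LIT, EMIT, JUMP, RET, HALT), the Machine struct, and in
particular the rule that any cell below PRIM_COUNT is a function index while
anything else is a subroutine address offset by PRIM_COUNT. The data stack
holds plain strings here and there are no database calls.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Machine {
    std::vector<int> code;              // the byte code array
    std::vector<std::size_t> returns;   // return address stack
    std::vector<std::string> data;      // data stack, string constants at the bottom
    std::vector<std::string> document;  // object stack on which the document is built
    std::size_t pc = 0;
    bool running = true;
};

using Primitive = std::function<void(Machine&)>;

// Hypothetical primitive indexes; cells >= PRIM_COUNT are subroutine calls.
enum : int { LIT = 0, EMIT = 1, JUMP = 2, RET = 3, HALT = 4, PRIM_COUNT = 5 };

std::vector<Primitive> makePrimitives() {
    std::vector<Primitive> p(PRIM_COUNT);
    // LIT n: copy the string constant at data[n] onto the top of the data stack.
    p[LIT] = [](Machine& m) {
        std::string s = m.data.at(static_cast<std::size_t>(m.code[m.pc++]));
        m.data.push_back(std::move(s));
    };
    // EMIT: move the top of the data stack onto the document stack.
    p[EMIT] = [](Machine& m) {
        assert(!m.data.empty());
        m.document.push_back(std::move(m.data.back()));
        m.data.pop_back();
    };
    // JUMP n: continue execution at address n.
    p[JUMP] = [](Machine& m) { m.pc = static_cast<std::size_t>(m.code[m.pc]); };
    // RET: return to the caller.
    p[RET] = [](Machine& m) {
        assert(!m.returns.empty());
        m.pc = m.returns.back();
        m.returns.pop_back();
    };
    p[HALT] = [](Machine& m) { m.running = false; };
    return p;
}

void run(Machine& m) {
    const std::vector<Primitive> prims = makePrimitives();
    while (m.running && m.pc < m.code.size()) {
        const int cell = m.code[m.pc++];
        if (cell < PRIM_COUNT) {
            prims[static_cast<std::size_t>(cell)](m);           // built-in function object
        } else {
            m.returns.push_back(m.pc);                          // remember where to come back to
            m.pc = static_cast<std::size_t>(cell - PRIM_COUNT); // enter the subroutine
        }
    }
}

// Small demonstration: call a subroutine at address 5 that emits "Title",
// then emit "Body" and stop.
std::vector<std::string> runDemo() {
    Machine m;
    m.data = {"Title", "Body"};
    m.code = {PRIM_COUNT + 5, LIT, 1, EMIT, HALT,  // main program
              LIT, 0, EMIT, RET};                  // subroutine at address 5
    run(m);
    return m.document;
}
```

The inner loop stays tiny because all the real work lives in the function
objects; adding a new word to the language means adding one more entry to the
function array.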


A very simple, single-pass parser is used to convert a textual version of the
program into the contents of the code array (and the string constants in the
data stack). It was introduced mainly because calculating the targets for jump
statements by hand was too tedious.
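
One common way to resolve jump targets in a single pass over the source is to
emit a placeholder for each jump and patch it once all the labels are known.
The sketch below shows that technique under assumed conventions, not the actual
SASSY syntax: words are separated by whitespace, "name:" defines a label,
"jump name" refers to one, and the opcode numbers in kOpcodes are invented.
String constants are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical opcode table for the example.
const std::map<std::string, int> kOpcodes = {
    {"emit", 1}, {"jump", 2}, {"ret", 3}, {"halt", 4}};

std::vector<int> assemble(const std::string& source) {
    std::vector<int> code;
    std::map<std::string, int> labels;                        // label name -> address
    std::vector<std::pair<std::size_t, std::string>> fixups;  // cell to patch -> label name

    std::istringstream in(source);
    std::string word;
    while (in >> word) {
        if (word.size() > 1 && word.back() == ':') {
            // Label definition: record the address of the next cell.
            word.pop_back();
            labels[word] = static_cast<int>(code.size());
        } else if (word == "jump") {
            std::string target;
            if (!(in >> target)) throw std::runtime_error("jump needs a label");
            code.push_back(kOpcodes.at("jump"));
            fixups.emplace_back(code.size(), target);  // remember the parameter cell
            code.push_back(-1);                        // placeholder until the label is known
        } else {
            const auto it = kOpcodes.find(word);
            if (it == kOpcodes.end()) throw std::runtime_error("unknown word: " + word);
            code.push_back(it->second);
        }
    }

    // Patch every jump parameter now that all label addresses are known.
    for (const auto& fixup : fixups) {
        const auto it = labels.find(fixup.second);
        if (it == labels.end()) throw std::runtime_error("undefined label: " + fixup.second);
        assert(fixup.first < code.size());
        code[fixup.first] = it->second;
    }
    return code;
}
```

For example, assembling "start: emit jump end emit end: halt jump start" gives
the cells 1 2 4 1 4 2 0: the forward jump to "end" is patched to address 4 and
the backward jump to "start" to address 0.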


Using this language I have been able to reproduce the architecture document, so
I am confident in using this as the basis for the remainder of the project.


The next phase will involve a significant overhaul of the ontology database for
the architecture. During my break I have been reading the "Description Logics
Handbook" and have been slowly gaining a better understanding of this subject.
The net result is that a lot of classes in the ontology need to be converted to
being individuals - so a lot of rewriting will be necessary.



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
