Thursday 28 July 2011

SASSY - Proof of Concept

The first development cycle for the project is designed to determine if the
concept is even possible. To demonstrate this we will build a small application
that takes an ontology prepared by Protege and attempts to generate LaTeX and
then PDF.

For my test data I built an ontology of quality attributes. Every term like
maintainability or performance that was mentioned in the literature was dumped
into this ontology, along with a short description of what it means. This will
later be enhanced to incorporate the various quality frameworks that
researchers have developed to try and bring some structure to this collection.

I then started by using GCJ to compile the Java OWLAPI library so that it could
be linked to my C++ code. After a bit of hacking on the Ant build script a
successful build was created. I was able to call this from C++ code and
generate some fragments of LaTeX. Adding in some calls to the dvips and ps2pdf
programs completed the process.
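
The generation step can be sketched roughly as follows. This is only an illustration, written in Python for brevity rather than the C++ and OWLAPI code used in the project; the sample attributes, the function names and the file name are all assumptions made for the example. It turns term and description pairs into a LaTeX description list, and lists the latex, dvips and ps2pdf commands that would convert the result to PDF.

```python
# Hypothetical sketch: names and sample data are illustrative, not taken from SASSY.

LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape characters that have a special meaning in LaTeX."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def attributes_to_latex(attributes):
    """Render a {term: description} mapping as a complete LaTeX document."""
    lines = [r"\documentclass{article}", r"\begin{document}", r"\begin{description}"]
    for term in sorted(attributes):
        lines.append(r"\item[%s] %s" % (escape_latex(term), escape_latex(attributes[term])))
    lines += [r"\end{description}", r"\end{document}"]
    return "\n".join(lines) + "\n"


def pdf_commands(basename):
    """Commands that turn basename.tex into basename.pdf via DVI and PostScript."""
    return [
        ["latex", basename + ".tex"],
        ["dvips", basename + ".dvi", "-o", basename + ".ps"],
        ["ps2pdf", basename + ".ps", basename + ".pdf"],
    ]


sample = {
    "maintainability": "Ease with which the system can be modified.",
    "performance": "Response time & throughput under load.",
}
document = attributes_to_latex(sample)
```

Each entry returned by pdf_commands could be passed to subprocess.run in turn, assuming the three tools are installed.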

After a bit of work it became evident that the version of OWLAPI that I was
using had a problem with one of the functions that I wanted to use. The release
notes for the subsequent version indicated that this had been resolved, so I
downloaded the latest version and recompiled it using GCJ.


At this point things started to go wrong. The program would build OK, but when
run it would disappear into a loop that steadily used up all available memory.
I tried to run it under Valgrind to determine where the memory leak was, but it
promptly reported that I was attempting to execute an illegal instruction.
Trying it under GDB was no better; it reported a segmentation fault and showed
a stack trace that made no sense whatsoever. As far as I can tell the code
generated by GCJ was not being loaded correctly by the linker. After a week or
so of trying to get to the root cause I decided that perhaps an alternative
approach was warranted.

The only other solution I can think of for combining Java and C++ is to use CORBA.
The OWLAPI library would be made into a server process and the application would
be a client. This would give the application some additional freedom since we
could now distribute it across multiple processes or machines if necessary.
However, I wanted to use open source and it became apparent that there was no
viable open source CORBA product that had bindings for both C++ and Java. There
were plenty that supported Java and a few for C++, but nothing workable for both.
I was starting to look into passing data between CORBA implementations, which seemed
to be a possibility thanks to standards employed by these products. The idea
of having to support two CORBA implementations, though, was a bit of an issue.


During my CORBA research I came across the ICE product by ZeroC. This product is
open source, has bindings for a wide range of languages and platforms and seems
to be actively supported. It works much like CORBA but without a lot of the
complexity, which I think is nearly always a good thing. After a few experiments
to become familiar with building ICE applications I redesigned the interface to
use ICE generated code and successfully completed the proof of concept.

My work with Protege to develop the quality attribute ontology highlighted a
problem with the visualisation of the ontology. I decided that it would be good
practice to develop an application that would enable its user to graphically
navigate the ontology and to display all the relationships between the entities
of the ontology. I used the GraphViz dot program to generate a diagram based on
the data extracted from the OWLAPI using the ICE generated interface. The
output of dot was set to SVG and I used Qt's SVG and SceneGraph components to
create the visual representation. The result was a program that can quickly
navigate an ontology.
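
As a rough illustration of the dot step, the sketch below builds DOT source from a list of relationships. It is in Python for brevity and is not the project's code; the triples, the function names and the file names in the comment are assumptions. In the real program the relationships came from the OWLAPI through the ICE interface.

```python
# Hypothetical sketch: the triples and function names are illustrative only.

def quote(name):
    """Wrap a string in double quotes for DOT, escaping embedded double quotes."""
    return '"' + name.replace('"', '\\"') + '"'


def triples_to_dot(triples, graph_name="ontology"):
    """Build DOT source with one labelled edge per (subject, relation, object)."""
    nodes = sorted({t[0] for t in triples} | {t[2] for t in triples})
    lines = ["digraph %s {" % quote(graph_name)]
    lines += ["  %s;" % quote(node) for node in nodes]
    for subject, relation, obj in triples:
        lines.append("  %s -> %s [label=%s];" % (quote(subject), quote(obj), quote(relation)))
    lines.append("}")
    return "\n".join(lines) + "\n"


triples = [
    ("Maintainability", "subClassOf", "QualityAttribute"),
    ("Performance", "subClassOf", "QualityAttribute"),
]
dot_source = triples_to_dot(triples)
# Assuming GraphViz is installed, the SVG could then be produced with:
#   dot -Tsvg ontology.dot -o ontology.svg
```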

The first increment of the project was thus completed successfully, albeit with
a radically different architecture than was first envisaged. We now have a basis
for designing and developing the software.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Saturday 16 July 2011

SASSY - Project Establishment

The preliminary analysis did not turn up any insurmountable issues, so the
next step is to establish a project for the development. While I do intend for this
project to eventually be a collaborative development, I wanted to get the core
of the project written in a more private setting. Once I have something usable I
will transfer it to a public development area, such as SourceForge.

Fedora Linux

Since I am most comfortable in a Linux environment, and it has all the development
tools I could possibly want readily available, the project will be established
on my personal computer which runs Fedora Linux. I chose Fedora as it is closely
related to Red Hat, which I often use for work, but tracks the latest developments. I
avoid the worst of the "bleeding edge" by remaining one release behind. This
means that all the issues will have been worked through when I upgrade.

Project Structure

The first step is to create the directory structure for the project. I created
a typical Subversion structure containing trunk, tags and branches directories.
I will be using Subversion as my version control system. Once it goes public I
may consider using Git, but that is a decision I can make later. Within the trunk
I then created bin, lib and include directories to install into, plus a docs
directory for documentation, an ontologies directory for data and a workspace (ws)
directory for the source code. The use of ws rather than src comes about since I
will be using Eclipse for the project's IDE, and it creates workspaces into which
the source projects are placed.
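
The layout described above can be sketched as a short script. This is only an illustration in Python: the directory names come from the text, while the function name and the use of a throwaway temporary directory as the root are assumptions for the example.

```python
# Hypothetical sketch: directory names follow the text; the root is a temp dir.
import tempfile
from pathlib import Path

TOP_LEVEL = ["trunk", "tags", "branches"]
TRUNK_DIRS = ["bin", "lib", "include", "docs", "ontologies", "ws"]


def create_layout(root):
    """Create the Subversion-style layout under root and return the created paths."""
    root = Path(root)
    created = [root / name for name in TOP_LEVEL]
    created += [root / "trunk" / name for name in TRUNK_DIRS]
    for path in created:
        path.mkdir(parents=True, exist_ok=True)
    return created


with tempfile.TemporaryDirectory() as tmp:
    paths = create_layout(tmp)
    # Paths relative to the root, written with forward slashes.
    relative = sorted(p.relative_to(tmp).as_posix() for p in paths)
```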


Documentation

I find it useful to document stuff (my memory isn't what it used to be), and for
this project there will be several documents, so the next step was to create a
document template for the project. The vision statement and the preliminary analysis
were then brought into the project. I will initially use OpenOffice, but expect to
move to LibreOffice in line with Fedora.

There will be various diagrams, such as class diagrams. I use Dia as my diagramming
tool since it does a reasonable job of UML, and includes a variety of other symbol sets.
The aim of SASSY is to automate the creation of the diagrams, but until it's up and
running I will need to do them by hand.


Tool Set

I like to document the various products that I use on a project. It makes it much
easier on the next project to be able to refer back to how it was done last time.
For each product I record how it was installed, any configuration that was done,
and any tips for using it.

As mentioned I will be using Eclipse for the IDE. Normally I would just run up a
few terminal sessions and use vi to do the editing. However, when working with
large, complex libraries things like auto-completion are very handy. I expect to
be doing C++ development as well as Java, so I included the CDT package from the
Eclipse repository. The CDT package also supports autotools which can simplify
development if we ever need to support a range of platforms.

For designing the user interface the Qt Designer is a suitable product, especially
since I will be using Qt for the user interface programs.

For defect tracking I am a fan of Mantis. I wanted to use PostgreSQL for the
underlying database, and after a bit of hacking I have this working satisfactorily.
For the very large projects that are the very reason for SASSY we will eventually
need to store the data in a database, as opposed to a simple XML file, so getting
PostgreSQL into the mix early on has some plusses.

Component Products

For the user interface programs I have chosen the Qt libraries. This package has
a complete user interface library in C++, is available for a variety of platforms,
and is open source. It also has an abstraction layer for the operating system which
may eventually be useful if we support multiple platforms. The only downside I
see is that it has its own implementation of templates for containers etc. rather
than using the STL. This can result in a lot of code moving data between containers.

The Protege product provides a complete front end for entering and editing an
ontology. It is under active development but is satisfactory for the initial
development. There are currently some issues with large databases and multiple
users for the OWL version 2 environment, but I am hopeful that these will be
resolved by the time we need them.

Program access to the ontology database is via a Java library called OWLAPI,
available from SourceForge. I am also looking at the owldb library which
aims to provide the database storage for large projects.

The mixture of C++ and Java presents a small problem since you cannot easily
call the Java libraries from a C++ program. My approach, initially, was to use
GCJ and compile the OWLAPI library into a form that can be linked into a C++
program. This eventually came unstuck, but that is a story for another time.

The documents will have diagrams so the GraphViz package gets a run for this
project. It allows you to input a graph in a simple format and does all the
layout for you, producing an output in a variety of formats.

The document formatting will be done with the LaTeX package. The output can
be converted to PDF using various command line tools.


Saturday 2 July 2011

SASSY - Analysis, Part 9

Interfaces


In this last blog on the preliminary analysis for the SASSY project we look at the interfaces between our software architecture system and the surrounding environment.

Input Interfaces


The inputs to the software architecture process are the vision statement, the preliminary analysis document, the data dictionary and the functional and quality requirements documents.

The software architecture process also contains a body of its own knowledge. Currently this is mostly held in the expertise of our software architects, but for this project we aim to capture as much as we can into a core software architecture ontology.

For a general project there may not be much control over the format of the inputs since they may come from a client who will simply provide the requirements etc. as a set of word processing documents. We can either accept these documents as the input, or transcribe them into the ontologies described in the previous blog. It may be that the act of transcribing them will uncover gaps which the preliminary analysis can attempt to resolve.

Output Interfaces


The outputs from the software architecture process are a set of documents and diagrams that describe the system from a range of viewpoints.

Glossary
Our vision statement says that we intend to use knowledge engineering, specifically ontologies, to store the software architecture, and that the result is intended for use when creating very large systems. One of the tasks in building a system, and an output of the preliminary analysis, is the glossary or data dictionary for the project's terminology. For a very large system it would seem logical to use an ontology to store this data dictionary.

Given that the aim of this project is to produce documentation from the contents of an ontology, it would seem reasonable to extend the project to the generation of the glossary ontology.

Requirements
It is often useful to know how all the parts of a software system are related. In particular it is often useful to know what functional or quality requirements drove various aspects of the design, code, testing, documentation, etc. If we were to relate all the components of a system together into an ontology then it might be possible to readily deduce such information and thus be in a better position to maintain the system.

Hence the requirements for the system should be drawn into the software architecture ontology, or perhaps exist as an ontology of their own which can then be referenced from the SASSY ontology.

Quality Attributes
Also from the vision statement we can expect our system to have a core Software Architecture ontology, and part of this will be a collection of quality attributes. These would be of interest when creating the Quality Requirements document, so it would appear to be a useful extension to the system to be able to print out such a list from the Software Architecture ontology.

Document Format
There are a variety of possible formats for the documents, ranging from plain text to HTML, word processing formats and PDF.

While plain text might be the easiest to generate, and it has advantages if you want to do further processing with it, it does not present very well to those who expect to see a high quality product.

HTML is fine for on-line viewing, but generally does not print well. If XHTML is used it can still be further processed if necessary.

The internal format of most word processing documents is quite complex, and sometimes not well documented. This makes generating the documents quite difficult. The other problem with word processing documents is that they can be subsequently edited. There is, therefore, a danger that the output documents will be maintained, rather than the underlying ontology. This could lead to confusion as documents get out of step.

One option that produces high quality output, in PDF, is to generate LaTeX. This is a well documented format that is easy enough to generate.

Diagram Format
The natural choice for diagrams is SVG. This is easy to generate, both manually and with various tools.

It is also easy enough to further process if necessary, since it is just XML.
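
To illustrate both points, here is a minimal sketch in Python that generates a small SVG diagram and then parses it again as ordinary XML. The shape, the sizes and the function name are arbitrary assumptions for the example; a real diagram would come from a layout tool.

```python
# Hypothetical sketch: the shape and sizes are arbitrary illustrations.
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialise SVG as the default namespace


def labelled_box(label, width=160, height=60):
    """Return SVG text for a single rectangle with a centred label."""
    svg = ET.Element("{%s}svg" % SVG_NS, {"width": str(width), "height": str(height)})
    ET.SubElement(svg, "{%s}rect" % SVG_NS, {
        "x": "1", "y": "1",
        "width": str(width - 2), "height": str(height - 2),
        "fill": "none", "stroke": "black",
    })
    text = ET.SubElement(svg, "{%s}text" % SVG_NS, {
        "x": str(width // 2), "y": str(height // 2),
        "text-anchor": "middle",
    })
    text.text = label
    return ET.tostring(svg, encoding="unicode")


svg_text = labelled_box("Maintainability")

# Because SVG is XML, the output can be parsed again for further processing.
parsed = ET.fromstring(svg_text)
labels = [el.text for el in parsed.iter("{%s}text" % SVG_NS)]
```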

While simple diagrams can be, and probably should be, embedded into the documents to which they refer, larger, more complex diagrams might be better left as separate documents.

If we allow separate documents for the diagrams it opens up some additional possibilities. We might introduce a third dimension and create a 3D model that can show more complex relationships than would be practical for a 2D diagram. Alternatively we might use animation to show how things change over time.


Friday 1 July 2011

SASSY - Analysis, Part 8

Ontologies


The papers [BER06] and [HAP06] propose four different purposes for using ontologies:
  • Ontology-driven development (ODD): ontologies are used at development time to describe the domain.
  • Ontology-enabled development (OED): ontologies are used at development time to support developers with their activities.
  • Ontology-based architectures (OBA): ontologies are used at run time as part of the system architecture.
  • Ontology-enabled architectures (OEA): ontologies are used at run time to provide support to the users.

The SASSY project aims to demonstrate how ODD and OED can be used during the architecture phase of development. Of course this means that the SASSY project itself will also use OBA and OEA.

Data Dictionary
An ontology (ODD) can be used to describe the problem domain [HAP06]. For projects beyond a certain size a simple glossary of terms ceases to be of much value. Once the project has got to the size where it becomes difficult to remember all the names of things you need something more than the ability to look up by name. An ontology with its more structured and linked view can make finding things easier.

The ontology also allows you to capture the relationships between objects, and, using data properties, it allows you to begin the object modelling.

Requirements
The requirements for a very large system can become a huge document in their own right. It is also common for there to be relationships between the requirements. Thus the requirements are another candidate for an ontology (ODD) [HAP06].

The ability of the Web Ontology Language (OWL) to import one ontology into another means that a requirements ontology can be imported into other ontologies, such as the software architecture ontology and the traceability ontology.
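
Since imports can chain, the full set of ontologies pulled in by one import can be larger than it first appears. The sketch below, in Python, computes that set for a small table of direct imports. The ontology names and the import table are assumptions based on the examples above, not an actual SASSY configuration.

```python
# Hypothetical sketch: ontology names and the import table are illustrative only.

def imports_closure(ontology, direct_imports):
    """Return the ontology plus everything it imports, directly or indirectly."""
    seen = set()
    pending = [ontology]
    while pending:
        current = pending.pop()
        if current in seen:
            continue  # already visited; also guards against import cycles
        seen.add(current)
        pending.extend(direct_imports.get(current, ()))
    return seen


direct_imports = {
    "traceability": ["architecture", "requirements"],
    "architecture": ["requirements"],
    "requirements": [],
}
closure = imports_closure("traceability", direct_imports)
```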

There are quite a large number of possible Quality Requirements that might be considered when creating the requirements document for a system. An ontology (OED) of documented candidates can be referenced during the development of the system requirements.


Software Architecture
The Software Architecture discipline has a large body of knowledge that might be more useful as an ontology (OED). The SASSY project aims to demonstrate the utility of such a collection.

There should be a static ontology that encapsulates the discipline, and a second that captures the specifics of the project.

Since OWL allows ontologies to be imported into other ontologies it might be sensible to partition the SA discipline into several smaller ones.

Traceability
For many systems it is important to know how each component depends on the system requirements. There is rarely a one-to-one mapping from requirements to components, or lines of code, so some way of capturing these relationships seems to be called for. An ontology (OED) seems like a candidate, and the SASSY project should support tracing requirements through the architectural design phase.
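
The many-to-many nature of these links can be shown with a minimal sketch in Python. It uses plain dictionaries of sets rather than an ontology, and the requirement and component names are invented for illustration. The point is only that each link has to be queryable from both directions.

```python
# Hypothetical sketch: requirement and component names are invented for illustration.
from collections import defaultdict


class TraceabilityIndex:
    """Many-to-many links between requirements and components."""

    def __init__(self):
        self._components_by_requirement = defaultdict(set)
        self._requirements_by_component = defaultdict(set)

    def link(self, requirement, component):
        """Record that a component helps satisfy a requirement."""
        self._components_by_requirement[requirement].add(component)
        self._requirements_by_component[component].add(requirement)

    def components_for(self, requirement):
        """Components that depend on the given requirement, sorted by name."""
        return sorted(self._components_by_requirement.get(requirement, ()))

    def requirements_for(self, component):
        """Requirements that drove the given component, sorted by name."""
        return sorted(self._requirements_by_component.get(component, ()))


index = TraceabilityIndex()
index.link("REQ-1 generate PDF", "LatexWriter")
index.link("REQ-1 generate PDF", "OntologyReader")
index.link("REQ-2 browse ontology", "OntologyReader")
```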

Configuration Management
When a large system is being developed by multiple teams, working at differing rates, perhaps even on completely different increment cycles, it can become “interesting” trying to keep track of which combinations of components are known to work together (or not).

An ontology, with its ability to handle a large variety of relationships, seems like a good fit to the configuration management problem.


References

BER06: Julita Bermejo Alonso, Ontology-based Software Engineering, 2006
HAP06: Hans-Jörg Happel, Stefan Seedorf, Applications of Ontologies in Software Engineering, 2006