This week I implemented a Restlet server that provides useful information in RDF when someone looks up a URI, using HTTP URIs as names for things, as stated in the first three Linked Data principles:
- Use URIs as names for things
- Use HTTP URIs
- Provide useful information in RDF when someone looks up a URI
For the basic resource types (Hackystat user, project and sensor data) I used the same URI patterns as the Sensorbase.
The so-called “useful information in RDF” follows the schema published online. The raw sensor data are dynamically declared as a sub-class of one specific sensor data type (highlighted in green within that schema). Given the Sensorbase DB structure, where the SDT associated with an SD can be of any kind, I may have to deal with an unforeseen SDT. In that situation I dynamically create a class for the SDT, with a URI that follows the same pattern as every other SDT resource.
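To make the unforeseen-SDT case concrete, here is a minimal sketch in Python (not the actual server code, which is Java). The function names, the `known_classes` set and the `/sensordatatypes/{name}` path segment are assumptions made for illustration; the only thing taken from the text above is the idea of declaring a class whose URI follows the usual SDT pattern.

```python
from urllib.parse import quote

# Full IRI of owl:Class, written out so the emitted N3 needs no @prefix line.
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"


def sdt_class_uri(base, sdt_name):
    """Build the URI of an SDT class (the path segment is an assumed pattern)."""
    return f"{base.rstrip('/')}/sensordatatypes/{quote(sdt_name, safe='')}"


def ensure_sdt_class(base, sdt_name, known_classes):
    """Return (class_uri, new_triples).

    new_triples is empty when the class is already known; otherwise it holds
    one N3 statement declaring the class, and the URI is remembered.
    """
    uri = sdt_class_uri(base, sdt_name)
    if uri in known_classes:
        return uri, []
    known_classes.add(uri)
    return uri, [f"<{uri}> a <{OWL_CLASS}> ."]
```

Calling `ensure_sdt_class` twice with the same SDT name only produces the class declaration the first time, which is the behaviour I would expect when the same unforeseen SDT shows up in several sensor data entries.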
The requested RDF data are created dynamically at run-time if they are not already in the cache (managed by Apache JCS, as done by the DPD service), and are then stored in the cache (a ‘.hackystat/cache/linkedservicedata’ folder is created). I don’t know yet whether the system will end up being too slow without a triple store, but one could easily be added at any time in the future, so I’m postponing the question.
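The build-on-miss behaviour described above can be sketched as follows. This is an illustration in Python with a plain in-memory dictionary standing in for Apache JCS; the `RdfCache` class and the `build` callback are hypothetical names, not part of the real implementation.

```python
class RdfCache:
    """Get-or-build cache: a dict stands in for the JCS-managed cache."""

    def __init__(self, build):
        self._build = build   # callback that creates the RDF for a URI
        self._store = {}      # uri -> serialized RDF
        self.builds = 0       # how many times the model was actually built

    def get(self, uri):
        # Build the RDF only on a cache miss, then keep it for later requests.
        if uri not in self._store:
            self._store[uri] = self._build(uri)
            self.builds += 1
        return self._store[uri]
```

The `builds` counter is only there to show that a second request for the same URI is served from the cache rather than rebuilt.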
During the RDF model construction I used the following external vocabularies:
@prefix owl: .
@prefix xsd: .
@prefix rdfs: .
@prefix rdf: .
@prefix foaf: .
@prefix iswc: .
@prefix doap: .
@prefix evoont: .
@prefix owls_process: .
@prefix sec: .
@prefix sioc: .
The major features still missing from the current implementation are:
- the fourth Linked Data principle, that is, ‘links to external datasets’. The server also doesn’t cover data coming from Hackystat services other than the Sensorbase, but I’m planning to realize all the Linked Data principles over Sensorbase data first, because all the other data are abstractions built on them and because I want to be sure to have something complete and working by the end of GSoC. After that I’m going to extend the implementation and the schema to cover data from as many Hackystat services as possible, starting with the DPD.
- the server doesn’t yet provide RDF data for all the existing URIs (that is, not all the URIs are dereferenceable), but I’m working on this. I have already included all the existing URIs in the unofficial published REST API specification; among them, only the URIs for resource types such as user, project, sensor data type and sensor data are currently dereferenceable.
Those are the major features that are going to be added soon.
Additionally, the server supports only one RDF serialization (N3), it lacks content negotiation to also handle requests for HTML or XML, and of course it lacks an interesting interface.
I assume that sensors may optionally (it’s not required) send the following information, in addition to the basic data, within the ‘Properties’ field of sensor data belonging to the SDTs listed below:
- Commit
key=branch ; value=repositoryUrl branchUrl
key=log ; value=logMessage
key=version ; value=versionNumber
key=fromFile ; value=fileFullPath
key=linesAdded ; value=fileFullPath lineNumber lineNumber lineNumber …
key=linesDeleted ; value=fileFullPath lineNumber lineNumber lineNumber …
key=totalLines ; value=numTotLines
key=author ; value=userEMail
- File Metric
key=fromFile ; value=fileFullPath
key=totalLines ; value=numTotLines
key=commentLines ; value=numTotCommentLines
key=codeLines ; value=numTotCodeLines
key=classCount ; value=numTotClass
key=functionCount ; value=numTotFunction
key=functionSizeList ; value=fileFullPath functionName-LOC functionName-LOC functionName-LOC …
key=classSizeList ; value=fullPathClassName-LOC fullPathClassName-LOC fullPathClassName-LOC …
key=majorClassName ; value=className
key=fileType ; value=fileTypeName
- DevEvent
key=fromFile ; value=fileFullPath
key=type ; value=devEventTypeName (e.g. Refactor or Edit or Compile or Execute or Test or Build or Debug)
- ReviewIssue
key=phase ; value=phaseId
key=module ; value=moduleName
key=line ; value=lineNumber
key=author ; value=userEMail
key=reviewerId ; value=id
key=runtimeId ; value=id
key=type ; value=issueType (e.g. Bug or Defect or Enhancement)
key=priority ; value=priorityValue
key=status ; value=statusValue
key=summary ; value=summaryMessage
- ReviewActivity
key=author ; value=userEmail
key=phase ; value=phaseId
key=phaseItems ; value=phaseId phaseId phaseId …
key=issueItems ; value=issueId issueId issueId …
key=module ; value=moduleName
- CodeIssue
key=message ; value=msg
key=priority ; value=priorityValue
key=runtimeId ; value=id
key=fromFile ; value=fileFullPath
key=line ; value=lineNumber
key=type ; value=issueType
key=project ; value=userEmail projectName
key=operatingSystem ; value=os
key=status ; value=statusValue
- Activity
key=fromFile ; value=fileFullPath
key=editedLines ; value=fileFullPath lineNumber lineNumber lineNumber …
- BufferTransition
key=fromFile ; value=fileFullPath
key=toFile ; value=fileFullPath
key=modified ; value=TrueFalse
- Dependency
key=path ; value=sourceFullPath
key=granularity ; value=package|class| swComponentFullPath | method fileFullPath methodName
key=inbound ; value=swComponentFullPath swComponentFullPath swComponentFullPath …
key=outbound ; value=swComponentFullPath swComponentFullPath swComponentFullPath …
- Command
key=path ; value=sourceFullPath
key=machine ; value=machineName
key=arguments ; value=ParameterType ParameterName=ParameterValue, ParameterType ParameterName=ParameterValue , …
key=operatingSystem ; value=os
- Build
key=path ; value=sourceFullPath
key=target ; value=sourceFullPath
key=result ; value=resultValue
key=arguments ; value=ParameterType ParameterName=ParameterValue, ParameterType ParameterName=ParameterValue , …
- UnitTest
key=fromFile ; value=fileFullPath
key=result ; value=resultValue
key=testName ; value=name
- Coverage
key=fromFile ; value=fileFullPath
key=granularity ; value=package|class| swComponentFullPath | method fileFullPath methodName
key=covered ; value=numCovered
key=uncovered ; value=numUncovered
- Perf
key=testName ; value=name
key=result ; value=resultValue
key=outputType ; value=outputTypeName
key=performanceMeasure ; value=measureName-value-unit measureName-value-unit measureName-value-unit …
- Issue
key=summary ; value=summaryMessage
key=priority ; value=priorityValue
key=runtimeId ; value=id
key=fromFile ; value=fileFullPath
key=line ; value=lineNumber
key=type ; value=issueType
key=project ; value=userEmail projectName
key=operatingSystem ; value=os
key=status ; value=statusValue
- ReviewActivity
key=fromFile ; value=fileFullPath
key=editedLines ; value=fileFullPath lineNumber lineNumber lineNumber …
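As an illustration of how the space-separated values above could be read on the server side, here is a small Python sketch. The helper names are hypothetical, and it assumes that file paths contain no spaces and that measure values are non-negative (a leading minus sign would clash with the ‘-’ separator).

```python
def parse_line_list(value):
    """Parse 'fileFullPath lineNumber lineNumber ...' (e.g. linesAdded)."""
    path, *numbers = value.split()
    return path, [int(n) for n in numbers]


def parse_measures(value):
    """Parse 'measureName-value-unit ...' (e.g. performanceMeasure)."""
    measures = []
    for token in value.split():
        # Split from the right so a measure name may itself contain '-'.
        name, number, unit = token.rsplit("-", 2)
        measures.append((name, float(number), unit))
    return measures
```

For example, `parse_line_list("/src/Foo.java 3 4 10")` yields the path plus the three line numbers as integers, and `parse_measures("latency-12.5-ms")` yields one (name, value, unit) tuple.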
You can find the code hosted on my Google Code project.
I finally decided on a name for my project: Linked Service Data, whose acronym would be “LiSeD” (not LSD ehehe XD)
During the next week:
I’m going to make all the existing URIs dereferenceable and start linking external datasets (possibly looking for an algorithm to automate the whole thing).
P.S.
I’m really sorry for the delay in posting this weekly report :p
——————UPDATE————————-
THE FIRST MILESTONE
The first milestone release will be published by the 6th of July (the mid-term evaluation date) and will consist of a Restlet server able to provide information in RDF for all the URIs identifying resources described in the REST API specification, which are all the ones coming from the Sensorbase plus some resource types created by me (since I assume the Sensorbase will deliver the additional data described above). Every URI will be dereferenceable (currently some of them can already be requested). The RDF model provided won’t yet contain links to external datasets; it will be created dynamically at run-time (if not already cached and if the logged-in user is authorized) and then cached locally on the server host. The server will be able to handle only requests for the N3 RDF serialization format.
These features are guaranteed to be done by the mid-term evaluation date. There isn’t much time left before I start linking external datasets (as required by the fourth and last Linked Data principle), so I may add this feature too by that date, but this is not guaranteed.