Posted by: iammyr on: August 24, 2009
These last days have been harder than ever. I don’t remember ever having worked so hard in my life, but finally I’m satisfied, and this is surely a priceless feeling!! ^__^ Of course I’m not saying that my application is perfect: it can be improved a lot, especially its low performance, which may impact its usability. I hope this can be fixed by replacing Derby with CouchDB, as suggested by Philip (considering that my system works mostly on sensor data properties). Moreover, a more web-oriented GUI is needed, ideally integrated with the Hackystat ProjectBrowser. But I really love my application’s goals and the way the Linked Data world is growing! While working I had the chance to read really recent papers on this topic, and this is stimulating! I met really great people in the Hackystat team. Talking on the telephone (!!!) with Philip Johnson was such a wonderful experience… he’s a really humble person despite his wisdom, and he’s always encouraged me throughout my GSoC experience, as has my university professor Filippo Lanubile ^__^ Unfortunately I’m a person who is always in special need of encouragement eheheh XD
But in addition to this I also met great developers on IRC channels, and talking to them about such a fast-growing topic as Linked Data was very stimulating too! For example, tuukkah added a relationship to his SIOCLog profiles to expose a particular piece of information I needed, so that I could use it in my application. In the same channel there were also Fred Giasson of the Umbel project and Sergio Fernández, who created RDFOhloh and lots of other useful applications! I definitely found that the Open Source community in general is stimulating, interesting, exciting and full of CV-enhancing opportunities for everyone.
My GSoC experience has been like taking a world tour, visiting places I hadn’t seen before and learning things I never knew. Firstly, I was asked to make a great effort to think about the usefulness of my application for its final users. This is a topic that hardly anyone talks about when giving you projects for your university exams, and it is indeed really difficult, but really important for making great work out of your project and for stimulating your love for it (that is the essential thing: designing applications and coding is pure creativity!).
Secondly, I’ve done lots of exams completely alone, but such importance had never been placed on documentation, and the projects were never big enough to compel me to enter a deep design phase. This time, instead, a deep and good design was a critical phase. I made mistakes, corrected them, and tried to do a realistic and clear job. I’ve learned so much from this.
Thirdly, I’ll remember the GSoC as “the first time I…” experience. For me it represents the first time I’ve used Subversion, the first time I’ve reported bugs, the first time I’ve fixed them (locally, yes), the first time I’ve used Ant and discovered how easy it can be to generate jar files and distribution packages even for big projects that include lots of external libraries in the CLASSPATH, the first time I’ve met Apache Ivy (another great tool ehehe) and the first time I’ve shared my own project publicly online with others, in perfect Open Source style. It would be wonderful if someone out there became interested in my project and asked to contribute ^__^ Ooo, I was forgetting: it’s also been the first time collaborating with such a geographically distant team! I enjoyed it, really.
Well, I want to be completely sincere, so I must talk about the stress and tiredness caused by the GSoC timeline. Here in the south of Italy we’ve had the hottest days during July and other really hot days during August. Moreover, everyone takes vacation in August (just during the last weeks of the competition, which are also the heaviest ones) and it’s not easy to concentrate on your work. BUT now, at the end of the GSoC, I can say that it’s such a valuable and especially “rewarding” experience that giving up “yet another summer of your life” is not such a high price. IT’S DEFINITELY WORTH IT!
I don’t know if it’s already clear enough, but I’m absolutely not going to abandon my project! I’ll take care of it and keep spreading the word.
My only regret is not having had the chance to get to know the other Hackystat GSoCers better. Personally, it’s just because I had so little spare time to dedicate to pure conversation.
I’ve really appreciated the kind messages sent by Rachel!
Sincere thanks to the Hackystat team, which has offered me such a great experience,
to the Collaborative Development Group at my own university, which encouraged me to apply for the GSoC this year and has supported me throughout my work (even with moral support),
and definitely thanks to Philip Johnson, whom I consider it an honor to have met ^__^
I hope to keep in contact with everyone, though, so this is not a “leave-taking” but simply a “see you soon”.
Posted by: iammyr on: August 21, 2009
Hi,
sorry for the delayed blog post, but I wanted to post only once I had finished all the interfaces, the code cleaning and the implementation in general. Well, I finished yesterday (or maybe I should say today, “early” in the morning…).
To build the user interfaces I used javax.swing and java.awt, even if they may not be the most suitable for the task, just because I know them better and I realized I didn’t have enough time to learn other libraries or languages. The interfaces implemented include:
- a panel to login,
- list/edit users + search query wizard for users
- list/edit projects + search query wizard for projects
- list/edit issues + search query wizard for issues
- management of the list of external Hackystat LiSeD servers, which constitutes the network across which all the Sparql searches are propagated and in which searches for linkable users, projects and issues are performed
- at any moment you can change the preferred RDF serialization language and view (triplify) the RDF description of a single selected item or of a list of them.
At first I thought of letting users send a ping to PingTheSemanticWeb for a selected resource, but since all the resources are accessible only to users registered with the SensorBase, PTSW would always reject such pings. So there’s a class to interact with the PTSW API (it’s very simple), but it’s not yet used anywhere.
Still, anticipating that in the near future part of the stored data will become available to users not registered with Hackystat as well, I’m also completing a sitemap.xml with the semantic web crawling extension (which I talked about in a previous post) to make it easier for search engines to retrieve the dataset. Now I’m going to document everything…
P.S.
I’ve successfully built everything, and all the JUnit test cases pass, but only if run in Eclipse and not in Ant (through junit, emma etc.)… I don’t know why; I have no time to investigate further and I hope this doesn’t matter too much.
However, the jar files, a .zip distribution and the javadoc have also been generated through the Ant scripts (they’re very useful!).
Bye
P.P.S
The Hackystat formatter file, when applied, produces a single line of code with more than 100 chars (even if I split that line with a \n before formatting), while on the other hand Checkstyle requires each line to have fewer than 100 chars. This looks like a conflict, unless I’m doing something wrong…
Posted by: iammyr on: August 10, 2009
During this past week I’ve implemented and successfully run JUnit tests for all the main resources provided by my system, which are User, Project and Issue. In those tests I’ve checked that the RDF model representing either a single requested instance or a list of instances is constructed correctly, that the Sparql endpoints return correct search results, and that the server, local and alternative caching mechanisms all work correctly (in fact I empty all the caches before running my test cases).
Then I imported Hackystat’s Eclipse formatting file, reformatted each of my classes according to this code style, and successfully ran Ant on all the *.build.xml files, with the exception of these two statements:
LinkedServiceDataTestHelper.sensorbaseServer = org.hackystat.sensorbase.server.Server.newTestInstance();
LinkedServiceDataTestHelper.telemetryServer = org.hackystat.telemetry.service.server.Server.newTestInstance();
which fail with:
org.hackystat.sensorbase.client.SensorBaseClientException: 1000: Connection refused
I’d like to report that there seems to be a mistake in the build.xml and ivy.xml files stored in the SensorBase, DPD and Telemetry Google Code repositories, because statements such as
<ivy:retrieve organisation="org.hackystat" module="hackystat-sensorbase-uh" revision="latest.integration" pattern="${lib.dir}/hackystat-sensorbase-uh/[artifact].[ext]" sync="true" inline="true" conf="default" log="download-only" transitive="false" type="jar, javadoc, source" />
<ivy:retrieve organisation="org.hackystat" module="hackystat-analysis-dailyprojectdata" revision="latest.integration" pattern="${lib.dir}/hackystat-analysis-dailyprojectdata/[artifact].[ext]" sync="true" inline="true" conf="default" log="download-only" transitive="false" type="jar, javadoc, source" />
<dependency org="org.hackystat" name="hackystat-utilities" rev="latest.integration" />
should be changed to:
<ivy:retrieve organisation="org.hackystat" module="hackystat" revision="latest.integration" pattern="${lib.dir}/hackystat-sensorbase-uh/[artifact].[ext]" sync="true" inline="true" conf="sensorbase" log="download-only" transitive="false" type="jar, javadoc, source" />
<ivy:retrieve organisation="org.hackystat" module="hackystat" revision="latest.integration" pattern="${lib.dir}/hackystat-analysis-dailyprojectdata/[artifact].[ext]" sync="true" inline="true" conf="dailyprojectdata" log="download-only" transitive="false" type="jar, javadoc, source" />
<dependency org="org.hackystat" name="hackystat" rev="latest.integration" />
because module names such as “hackystat-analysis-{name}” don’t exist.
btw I had to specify the rev="latest.integration" attribute value explicitly, because ‘latest.integration’ is not found in my own project and I don’t know where it’s actually defined within the other Hackystat projects.
However, the only library that I need but that was not included in the Ivy module repository is Jena 2.5.7 (and the Arq, Iri and IBM-Icu libraries usually shipped with it). So I used the environment variable ‘JENA_HOME’, and that was sufficient. The other libraries retrieved by Ivy are JUnit, Restlet, Xerces, Hackystat-sensorbase-uh and Hackystat-analysis-telemetry.
The LinkedServiceDataClient class, useful for interacting with my service from any external Java application, has also been completed. Moreover, my system is now able to return a response serializing RDF in any of the existing RDF serialization languages, which are N-triple, N3, RDF/XML, RDF/XML-Abbrev and Turtle, according to the media type requested by clients.
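A minimal sketch of how a requested media type could be mapped to a serialization language. This is only an illustration, not the actual LiSeD code: the class name RdfSerializationChooser, the media types in the table and the RDF/XML fallback are all assumptions, and quality values (q=...) in the Accept header are simply ignored. RDF/XML-Abbrev is left out because it has no media type of its own.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: picks an RDF serialization name from an HTTP Accept header. */
class RdfSerializationChooser {

  /** Fallback when no requested media type is recognized (an assumption). */
  static final String DEFAULT_LANG = "RDF/XML";

  /** Media type to serialization name; the media types listed here are assumptions. */
  private static final Map<String, String> LANG_BY_MEDIA_TYPE =
      new LinkedHashMap<String, String>();

  static {
    LANG_BY_MEDIA_TYPE.put("application/rdf+xml", "RDF/XML");
    LANG_BY_MEDIA_TYPE.put("text/turtle", "TURTLE");
    LANG_BY_MEDIA_TYPE.put("application/x-turtle", "TURTLE");
    LANG_BY_MEDIA_TYPE.put("text/n3", "N3");
    LANG_BY_MEDIA_TYPE.put("text/rdf+n3", "N3");
    LANG_BY_MEDIA_TYPE.put("text/plain", "N-TRIPLE");
  }

  /** Returns the serialization for the first recognized media type, in header order. */
  static String choose(String acceptHeader) {
    if (acceptHeader == null) {
      return DEFAULT_LANG;
    }
    for (String part : acceptHeader.split(",")) {
      // Drop parameters such as ";q=0.8", then normalize whitespace and case.
      int semicolon = part.indexOf(';');
      String mediaType = (semicolon >= 0 ? part.substring(0, semicolon) : part)
          .trim().toLowerCase(Locale.ROOT);
      String lang = LANG_BY_MEDIA_TYPE.get(mediaType);
      if (lang != null) {
        return lang;
      }
    }
    return DEFAULT_LANG;
  }

  public static void main(String[] args) {
    System.out.println(choose("text/turtle;q=0.9, application/rdf+xml")); // TURTLE
    System.out.println(choose("image/png")); // RDF/XML (fallback)
  }
}
```

The returned name would then be handed to whatever RDF library writes the model; with a different library the names in the table would have to be adapted.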
With regard to Issues, as I don’t yet have the Hackystat SensorBaseClient version which allows getting “only” the issue instances, I retrieve all the sensor data and browse them to find those with sensorDataType==Issue.
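The filtering step can be sketched like this. SensorDataEntry is a hypothetical stand-in for whatever the SensorBase client returns (it is not the real Hackystat class), and IssueFilter is an invented name; only the comparison against the "Issue" type comes from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a sensor data entry; not the real SensorBase class. */
class SensorDataEntry {
  final String sensorDataType;
  final String resource;

  SensorDataEntry(String sensorDataType, String resource) {
    this.sensorDataType = sensorDataType;
    this.resource = resource;
  }
}

class IssueFilter {

  /** Keeps only the entries whose sensor data type is "Issue". */
  static List<SensorDataEntry> onlyIssues(List<SensorDataEntry> all) {
    List<SensorDataEntry> issues = new ArrayList<SensorDataEntry>();
    for (SensorDataEntry entry : all) {
      // Constant first, so an entry with a null type is skipped instead of throwing.
      if ("Issue".equals(entry.sensorDataType)) {
        issues.add(entry);
      }
    }
    return issues;
  }

  public static void main(String[] args) {
    List<SensorDataEntry> all = Arrays.asList(
        new SensorDataEntry("Issue", "issue-1"),
        new SensorDataEntry("Commit", "commit-7"),
        new SensorDataEntry("Issue", "issue-2"));
    System.out.println(onlyIssues(all).size()); // 2
  }
}
```

Scanning everything costs time proportional to the whole sensor data set, which is why a client call returning only issues would be preferable once it is available.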
During the next week:
Interface, interface and interface
As Philip told me, I’ll apply my interface changes over a copy of the Hackystat Project Browser.
Posted by: iammyr on: August 4, 2009
This week I’ve done what I promised, more or less.
I’ve linked users in this way:
User data such as name, surname, nickname, homepage, weblog, are not mandatory: they’re just optionally filled in by users. For those users whose foaf profile is successfully retrieved, I’ll provide a ‘merge into your profile’ facility to merge any Hackystat-created triple into their profile’s content.
Moreover, I’ve linked issue data, but only between issues stored in different Hackystat servers, and I’m going to explain why.
I opened a discussion on Baetle’s mailing list in which I asked for news about online bug datasets that allow searches, possibly datasets using the Baetle or EvoOnt BOM ontologies. From that and other discussions I learned that there are currently two projects that have begun to publish RDF data about bugs, Helios and Pear, but neither of them provides search facilities yet. I’ve also searched and asked on IRC, unsuccessfully, for bug datasets available online which provide such search facilities. The only one is Launchpad, whose REST API is still in beta and doesn’t yet provide a way to search for similar bugs through tags or other features. Rather, it just provides the custom GET findSimilarBugs, which can only search for Launchpad bugs similar to another given Launchpad bug (not bugs coming from anywhere).
So, until Pear or Helios or some other project provides search facilities on bug datasets, Hackystat issues will be linked only to each other, whether stored in the same or in different Hackystat servers. To anticipate the future availability of new bug datasets, I perform a search using Sindice for datasets having a particular meta-description.
Finally, I downloaded the new hackystat-services binaries, ran all the *.build.xml files contained in the new hackystat-developer-example and watched the corresponding videos. I had no apparent problems, with the exception of this:
I get some ClassCastExceptions while running hudson.build.xml and verify.build.xml, although all the other builds had previously ended successfully. This problem has been reported by others, too:
I get a
Can’t find/access AST Node typecom.puppycrawl.tools.checkstyle.api.DetailAST
with a matching
0: Got an exception – java.lang.ClassCastException
for every file in the source.
Using the supplied ant task.
Elsewhere I found that this may be related to the Debian ant packages:
It seems there’s a fundamental conflict between the ANTLR Ant Task (or more
generally, having antlr.jar in the global CLASSPATH or Ant’s lib directory -
even if you aren’t actively using the ANTLR Task) and the Checkstyle Ant Task.
There’s a passing comment about it in an Ant development proposal here: http://fisheye.cenqua.com/viewrep/~raw,r=1.3/ant/proposal/mutant/docs/desc.html
Basically if I have Ant with antlr.jar in ant/lib (as is the state when the
antlr and ant debian packages are installed) and I use the checkstyle task,
including in its classpath all the checkstyle jars (including or excluding
antlr.jar, it doesn’t matter) then checkstyle will fail to run properly,
producing errors such as this:
[checkstyle] Can’t find/access AST Node typecom.puppycrawl.tools.checkstyle.api.DetailAST
(repeated several times)
[checkstyle] /home/dblaikie/work/honours/ssaburg/src/au/edu/usyd/it/ssaburg/Expression.java:0: Got an exception – java.lang.ClassCastException: antlr.CommonAST
(things like this for every source file)
due to the way Ant sets up the ClassLoader hierarchy, the Antlr classes are
loaded by the base loader which is above the loader used for the checkstyle
classes, therefore the Antlr classes cannot access the checkstyle classes (but
checkstyle can access antlr classes). Even including the antlr.jar in the
local classpath used to load the checkstyle task is insufficient as Java’s
ClassLoader delegation model still causes the base antlr.jar to be used.
This conflict with the Ant Debian package can be explained. The Ant version supplied by the apt package manager is an old one, version 1.6, so I uninstalled it completely before installing version 1.7.1, which Hackystat requires (maybe because of Ivy), downloaded from the Apache repository. But deleting all the references and folders related to Ant 1.6 has not been sufficient. I don’t know… maybe I’ll ask in some Debian forum.
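One way to check whether a stray antlr.jar is still being picked up is to ask the JVM where a class was actually loaded from. The following is only a diagnostic sketch using standard Java calls; the class name ClassOrigin is invented, and it reports what the JVM running it sees, so to observe Ant’s own loader hierarchy the same call would have to run from inside the build.

```java
import java.security.CodeSource;

/** Diagnostic sketch: reports where a class was loaded from and by which loader. */
class ClassOrigin {

  static String describe(Class<?> type) {
    CodeSource source = type.getProtectionDomain().getCodeSource();
    // Classes loaded by the bootstrap loader have no code source and a null loader.
    String location = (source == null || source.getLocation() == null)
        ? "no code source"
        : source.getLocation().toString();
    ClassLoader loader = type.getClassLoader();
    String loaderName = (loader == null) ? "bootstrap loader" : loader.getClass().getName();
    return type.getName() + " <- " + location + " (" + loaderName + ")";
  }

  public static void main(String[] args) throws ClassNotFoundException {
    System.out.println(describe(String.class));
    // Pass class names as arguments, e.g. antlr.CommonAST, if they are on the classpath.
    for (String name : args) {
      System.out.println(describe(Class.forName(name)));
    }
  }
}
```

If antlr.CommonAST turned out to come from a jar outside the Checkstyle classpath, that would be consistent with the loader conflict described in the quoted report.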
During the next week
I’ll finish writing all my *.build.xml files and I’ll write Test* classes for all the main classes. Then, if there is time left, I’ll also begin implementing interfaces with Apache Wicket. I was thinking about modifying the ProjectBrowser directly, but I’m not sure about this; I’ll ask Philip.
Posted by: iammyr on: July 28, 2009
As expected, during this past week I’ve published projects as Linked Data and built some mock-ups and a rough, very simple interface, for testing purposes only, that just allows entering a Sparql query and retrieving the projects’ URIs. Fig. 1 and Fig. 2 show examples (click to enlarge).
This simple interface has been built using javax.swing and java.awt because I have lots of experience with them and so I’m faster, but I’m going to learn Apache Wicket to conform to the rest of the Hackystat services.
Searches are performed over the local LiSeD service and over the Hackystat servers included in a dedicated list (together with admin username and password).
Project data are linked to data coming from Ohloh and from the Hackystat servers included in the dedicated list mentioned above*. Ohloh allows searches only over a project’s title, tags and description, so links of type ‘seeAlso’ to its projects are created only on the basis of the project’s tags (Hackystat stores descriptions too, but it would be too expensive to extract keywords from them), while links of type ‘sameAs’ to its projects are created only on the basis of the project’s title. To get the search results from Ohloh I invoke Ohloh’s REST API, which returns an XML file containing the IDs of the projects of interest. Then I insert each ID into a URI referring to the Linked Data representation of that project as published by RDFOhloh (I can’t invoke RDFOhloh directly because it doesn’t provide a Sparql endpoint to support my searches).
Links with projects from other Hackystat servers, instead, are created using both the tags and tools lists, and search results are obtained by invoking the Hackystat Sparql endpoint for projects (e.g. http://localhost:9875/linkedservicedata/projects/sparql/?query={query} ). In the meantime I’ve contacted the doap-interest mailing list asking for clarification about the unavailability of the doapspace.org dataset, and I’ve received answers, but they just say “I’m forwarding this request to the admin” and nothing else, so I think I should give up…
To summarize:
Link type: seeAlso – created after searching for projects having: 1) the same tags in Ohloh projects’ tags, description or title; 2) the same tags and tools in the tags and tools of projects on external Hackystat LiSeD servers
Link type: sameAs – created after searching for projects having: 1) the same title in Ohloh projects’ title, tags or description (there’s no way to specify which project feature to search over); 2) the same title in the titles of projects on external Hackystat LiSeD servers.
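The two rules can be sketched as a small decision function. This is an illustration under assumptions rather than the LiSeD implementation: the names ProjectLinkRules and decide are invented, titles are compared ignoring case and surrounding spaces, sameAs takes precedence when both rules match, and the tools lists used for Hackystat servers are left out for brevity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the link rules summarized above; not the LiSeD code. */
class ProjectLinkRules {

  enum LinkType { SAME_AS, SEE_ALSO, NONE }

  /**
   * Same title gives sameAs; otherwise at least one shared tag gives seeAlso.
   * Ignoring case and surrounding spaces in titles is an assumption.
   */
  static LinkType decide(String localTitle, Set<String> localTags,
      String remoteTitle, Set<String> remoteTags) {
    if (localTitle != null && remoteTitle != null
        && localTitle.trim().equalsIgnoreCase(remoteTitle.trim())) {
      return LinkType.SAME_AS;
    }
    Set<String> shared = new HashSet<String>(localTags);
    shared.retainAll(remoteTags); // keep only the tags present on both sides
    return shared.isEmpty() ? LinkType.NONE : LinkType.SEE_ALSO;
  }

  public static void main(String[] args) {
    Set<String> local = new HashSet<String>(Arrays.asList("rdf", "java"));
    Set<String> remote = new HashSet<String>(Arrays.asList("java", "metrics"));
    System.out.println(decide("LiSeD", local, "lised", remote)); // SAME_AS
    System.out.println(decide("LiSeD", local, "Other", remote)); // SEE_ALSO
  }
}
```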
I’ve also made mock-ups for all the user interfaces I’m going to add to the ProjectBrowser, in order to get feedback from you (I really appreciate any comment/suggestion).
Links to these mock-ups are listed below:
I’ve preferred to hide the triple structure (subject-relationship-object) from the user and to make the data presentation conform to the rest of the Hackystat GUI.
The mock-ups regarding issues were made after taking a look at Shaoxuan’s documentation and code for the new Issue SDT that he’s implementing, and thanks to his fast and useful e-mail answers I’ve clarified some aspects. He suggested that an issue can be marked automatically as a duplicate of another one by checking whether they’re associated with the same /‘system-name’/‘project-name’/‘issue id’ resource. However, my task consists in finding issues which are similar (for which I’ll need to compare tags) or which refer to the same problem, and two issues can be stored in different tracking systems, have different IDs and be associated with different projects, yet still refer to the same problem. So I think I could combine Shaoxuan’s suggestion with my own thoughts and implement a semi-automated way to create ‘sameAs’ relationships, identifying duplicates.
Please give me feedback about these mock-ups.
During the next week
I’ll do for users and (if possible) issues the same job done for projects (linking them with external datasets, building Sparql endpoints for them and testing searches through a simple GUI). If I have time left (sic!) I’ll also begin following Philip’s suggestions to update my Hackystat version to the one using Ivy and to adapt my code so that it integrates more easily with the other services.
*For debugging purposes, if the list is empty I include my localhost in it, use it only once and then clear the list (to avoid loops).
P.S.
I was forgetting to add that I’ve made the Sparql results conform to the W3C recommendation. Initially I followed the endpoint published by DBpedia, which returns Sparql results not only as XML but also as JSON, RDF and N3. But I’ve found that all the result formats other than XML have only been proposed to the W3C and not yet approved. So currently the Sparql result returned by Hackystat is an XML document. This is an example (it’s an image because I couldn’t manage to paste an XML file into WordPress due to syntax conflicts):
for the URI http://localhost:9875/linkedservicedata/projects/sparql/?query=PREFIX+doap%3A+%3Chttp%3A%2F%2Fusefulinc.com%2Fns%2Fdoap%23%3E+SELECT+%3Furi+WHERE+%7B+%3Furi+doap%3Amaintainer+%22myrpandemon%40yahoo.it%22+%7D
conforming to the W3C recommendation.
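For reference, a request URI like the one above can be produced by form-encoding the Sparql query with the standard java.net.URLEncoder. This is a minimal sketch: the class name SparqlUrlBuilder is invented, and the query text is the decoded form of the example URI.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that appends a form-encoded Sparql query to an endpoint URL. */
class SparqlUrlBuilder {

  static String build(String endpoint, String query) {
    try {
      // URLEncoder turns spaces into '+' and reserved characters into %XX escapes.
      return endpoint + "?query=" + URLEncoder.encode(query, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a Java platform, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String endpoint = "http://localhost:9875/linkedservicedata/projects/sparql/";
    String query = "PREFIX doap: <http://usefulinc.com/ns/doap#> "
        + "SELECT ?uri WHERE { ?uri doap:maintainer \"myrpandemon@yahoo.it\" }";
    System.out.println(build(endpoint, query));
  }
}
```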