Category Archives: artificial intelligence

Prof. Andreas Dengel gave a very interesting talk on Semantic Web and Machine Learning research applied to human-computer interaction and desktop application design at Google Tech Talks on September 28, 2007.

A Semantic Desktop is a means to manage all personal information across applications based on Semantic Web standards. It acts as an extended personal memory assisting users to file, relate, share, and access all digital information like documents, multimedia, and messages through a Personal Information Model (PIMO). This PIMO is built on ontological knowledge generated through user observations and interactions and may be seen as a formal and semi-formal complement of the user’s mental models. [...]

In this talk I will show how machine learning techniques may be used to support the generation of a PIMO. I will further introduce the main concepts, components, and functionalities of the Semantic Desktop, and give examples which show how the Semantic Desktop may become reality.

The talk is about an hour long and well worth watching: adapting machines’ behaviour to their users’ would be a great result, after more than a decade of users adapting to machines’ limitations. :)


I want to recap my Summer of Code before the final evaluation starts. Here’s what I have accomplished so far. It was a lot of work and great fun as well. I hope to catch your interest and get some feedback on future developments! :)

GUI CONTRIBUTIONS

[Screenshot: global_view]

  • new component submission view (drag and drop support from the repository explorer to add dependencies easily :) )
  • repository explorer view
  • preference page (set repository URL)

IMPLEMENTED FEATURES

  • submit a new component
  • usual search features (name, version, tags)
  • search components providing a set of tasks
  • search components providing all the tasks of the selected ones
  • “smart” search of components functionally equivalent to the selected one (reasoning here)
  • search components depending on the selected one
  • assert functional equivalence between components

HANDS-ON

Let’s take a test drive. I submit a new component, in this case (just as an example) the “last-gsoc-demo” one. I fill in some data, and press submit. I can just drag-and-drop dependencies from the repository explorer to the dependencies viewer.

[Screenshot: submit drag drop]

I previously submitted some sample components. Since all JDBC drivers implement the same specification, it is reasonable, to some extent, to consider them “functionally equivalent”, so I push this statement into the knowledge base.

[Screenshot: find-eq]

For the sake of brevity, I ask you to trust me without further screenshots: I simply asserted that all JDBC drivers in the repository (besides the “dummy-jdbc” one) are “functionally equivalent” to the postgresql one, and then asserted that “dummy-jdbc” is equivalent only to “mysql-jdbc”. Now I can ask the repository for the components “functionally equivalent” to the selected one (“dummy-jdbc”), just by clicking on the context menu item:

[Screenshot: assert-eq]

Here’s what I obtain:

[Screenshot: inference-rulez]

You might notice that the selected item is still there, which makes sense since everything is of course functionally equivalent to itself. ;) Furthermore, it is worthwhile to note I only said the “dummy-jdbc” was equivalent to “mysql-jdbc”, full stop! The rest is just the result of the reasoning process.
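To make the reasoning step more concrete, here is a minimal sketch in plain Python. It is not the plugin's code (the real repository reasons over the knowledge base); it only assumes that “functionally equivalent” behaves as a reflexive, symmetric and transitive relation. The function name and the “postgresql-jdbc” and “derby-jdbc” identifiers are made up for the example.

```python
from collections import defaultdict


def equivalence_class(assertions, start):
    """Return every component equivalent to `start`, treating the asserted
    pairs as a reflexive, symmetric and transitive relation."""
    neighbours = defaultdict(set)
    for left, right in assertions:
        neighbours[left].add(right)
        neighbours[right].add(left)  # symmetry: a ~ b implies b ~ a

    seen = {start}  # reflexivity: a component is equivalent to itself
    stack = [start]
    while stack:
        current = stack.pop()
        for other in neighbours[current]:
            if other not in seen:
                seen.add(other)
                stack.append(other)  # transitivity: keep following links
    return seen


# Hypothetical assertions mirroring the demo: every driver is declared
# equivalent to the postgresql one, except dummy-jdbc, which is only
# declared equivalent to mysql-jdbc.
ASSERTIONS = [
    ("derby-jdbc", "postgresql-jdbc"),
    ("mysql-jdbc", "postgresql-jdbc"),
    ("dummy-jdbc", "mysql-jdbc"),
]

print(sorted(equivalence_class(ASSERTIONS, "dummy-jdbc")))
```

Only one statement mentions “dummy-jdbc”, yet the result contains every driver, including “dummy-jdbc” itself, which matches what the screenshot shows.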

Now, I can also describe a component in terms of the “tasks” it carries out. Let’s suppose – just as an example – I have two components, one for “dom-parsing” and the other for “sax-parsing”.

[Screenshot: tasks]

Suppose now that I have been off the planet for the last few years and want to know whether a single component does both things.

[Screenshots: union-tasks, task-union-found]

I can select both of them, click on the menu item shown, and find out that xerces-j actually does both things. I might decide to use it if it fits my needs, since in most cases a single dependency is better than two.
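The “all the tasks of the selected ones” search boils down to a set-inclusion check. The following is a rough sketch in plain Python, not the plugin's actual query; the data layout, the function name and the two single-task component names are assumptions made for the example.

```python
def providers_of_all(tasks_by_component, selected):
    """Return the components (other than the selected ones) that provide
    every task provided by at least one of the selected components."""
    wanted = set()
    for name in selected:
        wanted |= tasks_by_component[name]  # union of the selected tasks

    return sorted(
        name
        for name, tasks in tasks_by_component.items()
        if name not in selected and wanted <= tasks  # superset check
    )


# Hypothetical repository content: component name -> tasks it carries out.
TASKS = {
    "dom-only-parser": {"dom-parsing"},
    "sax-only-parser": {"sax-parsing"},
    "xerces-j": {"dom-parsing", "sax-parsing"},
}

print(providers_of_all(TASKS, ["dom-only-parser", "sax-only-parser"]))
```

With this sample data only “xerces-j” covers the union of the two selected task sets, so it is the only candidate returned.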

I might also want to know whether other components rely on mine, or how many components actually use a given one, which usually means it has a good reputation. Remember the “last-gsoc-demo” component? I put “mysql-jdbc” as a dependency there. I just right-click on the component and search for the components depending on the selected one. :)

[Screenshots: search-clients, client]
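Searching for the components that depend on the selected one is a reverse lookup over the dependency declarations. Here is a small sketch in plain Python, again not the plugin's code; the dictionary layout and the function name are assumptions.

```python
def dependents_of(dependencies, target):
    """Return the components that list `target` among their dependencies."""
    return sorted(
        name for name, deps in dependencies.items() if target in deps
    )


# Hypothetical repository content: component name -> declared dependencies.
DEPENDENCIES = {
    "last-gsoc-demo": {"mysql-jdbc"},
    "mysql-jdbc": set(),
    "xerces-j": set(),
}

clients = dependents_of(DEPENDENCIES, "mysql-jdbc")
print(clients, len(clients))
```

The length of the returned list is the “how many components use this one” figure mentioned above.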

CLIENT-SERVER ARCHITECTURAL VIEW (after latest modifications)

[Figure: architecture]

KNOWN BUGS

  • trouble with SPARQL queries involving literals: searching by id and tasks works, searching by version and tags does not (yet the http://repo.url/tag/{tag} resource works fine… I had no time to investigate further before pencils were down)
  • dangling dependencies (i.e. after a delete operation) are not handled yet.
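For illustration only, this is one way dangling dependencies could be detected after a delete. It is a sketch in plain Python under the assumption that the repository content can be viewed as a mapping from component name to declared dependencies; it says nothing about how the real repository will eventually handle them, and all names are hypothetical.

```python
def dangling_dependencies(dependencies):
    """Map each component to the dependencies it declares that are no
    longer present in the repository."""
    known = set(dependencies)
    report = {}
    for name, deps in dependencies.items():
        missing = deps - known
        if missing:
            report[name] = sorted(missing)
    return report


# Hypothetical state after "mysql-jdbc" has been deleted.
AFTER_DELETE = {
    "last-gsoc-demo": {"mysql-jdbc"},
    "xerces-j": set(),
}

print(dangling_dependencies(AFTER_DELETE))
```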

COMING SOON (random order)

  • rest (not in the soa-ish meaning)
  • enable license and license-style search criteria on the plugin
  • associate a new perspective with the provided views
  • improve repository explorer (I don’t like that tree very much)
  • bundled repository exploiting the eclipse embedded jetty server
  • import existing metadata from maven POM or OSGi manifest (URL drag and drop from web browser?)
  • address repository data access layer performance issues
  • setup an update site somewhere on the globe

CHEERS
That’s all for now. I really enjoyed the work, and I am confident this mutually fruitful collaboration will go on. A lot remains to be done on this project, and I won’t abandon it after Google Summer of Code ends.

I want to thank Philippe Ombredanne for mentoring me, and all the guys @ #eclipse-soc for supporting me and other students day after day. It was an invaluable experience to work with you guys.

See you online,

cheers,

Savino Sguera.

Recap: previous status report, all about my GSoC project @ Eclipse

Most significant updates for this week:

  • Model refactoring
  • Ontology design improvement
  • Jastor classes update
  • Implemented marshalling subsystem (XStream)
  • Added persistence to Jena model (Apache Derby embedded DB)
  • Tested full stack data flow:
    Restlet’s DomRepresentation <-> Document <-> XML <-> Javabean <-> Jastor class <-> Jena statements <-> RDBMS
  • Discussed dependencies licensing issues (no problems ahead)
  • Started client’s architectural design (and assessed code reuse scenarios)
  • Added javadoc
  • Committed new code to eclipse-incub

Very next steps:

  • Discuss some modeling issues with Philippe
  • Implement REST layer
  • Get a live demo of the repository up and running
  • Start client design
  • Add a “dependencies” page to project’s wiki

Mid-term evaluation incoming, blogged about my Google Summer of Code project @ Eclipse Summer of Code Blog.

Some significant progress this week: I’m very satisfied, but a lot of work is still to be done.

Read the entry and feel free to drop a line of comment!

Second update about my Google Summer of Code @ Eclipse.

I blogged about my project’s status here @ Eclipse Summer of Code blog. ;)

Please feel free to comment on the original post, or on this one as well. :)

A few hours ago Google published the list of accepted students for the Google Summer of Code 2007.

I’m in! :P

Just received the official “Congratulations!” e-mail from Google: I have been accepted by The Eclipse Software Foundation, I’ll work on a project I proposed and I’ll be mentored by Philippe Ombredanne.

A brief high-level description of what the project will consist of is here @ my Google Summer of Code application information page. More details are coming right away on this page :) .

I really want to thank all the people who made this possible, especially guys @ Google and The Eclipse Software Foundation.

Btw, looking forward to getting the t-shirt :P !

I’m so happy to read that my second work has been accepted for publication by the Applied Ontology Journal of Ontological Analysis and Conceptual Modeling. I want to thank M. T. Pazienza and A. Stellato very much for what I learnt working with them; it was an invaluable experience. :) Great job!

You are here!

A very good post @ Good Morning Silicon Valley about privacy and security policies.

The second point is absolutely interesting: it deals with a recently disclosed project about automatically assigning every person entering the US a score, rating the person as a terrorist threat.

The score is based on a list of factors such as the “analysis of their travel records and other data, including items such as where they are from, how they paid for tickets, their motor vehicle records, past one-way travel, seating preference and what kind of meal they ordered”.

I won’t go into the privacy issues, which have been discussed in the original post. Anyway, this looks to me pretty much like the “ideal” machine learning scenario. :)

I believe they trained the system with a set of already categorized examples (yes, people… either bad or good), learning the categorization function that returns your very own score as a threat.

This is very interesting from an engineering perspective, although we all know in such problems the error rate is usually non-zero, so the question arises quite easily: what if…?

Hopefully they allowed a very narrow error margin ;) .

At a time when it seems impossible to get online without reading tons of blog posts or articles about Web 2.0, it’s nice to read an original point of view on the actual impact of AJAX on industry and innovation (if any), considering the architectural implications and debating which road we should take.

“Web 2.0 marks the dictatorship of the presentation layer, a triumph of appearance over architecture that any good computer scientist should immediately dismiss as unsustainable.”

In this post @ deal architect, Vinnie Mirchandani deals with Bill Thompson’s article @ regdeveloper.co.uk, where the Web 2.0 “madness” is harshly criticized (with a bunch of – IMHO – nice politically-motivated metaphors :P ).

To be honest, I subscribe to this point of view: first of all, I can’t quite understand where this enthusiasm about a late-1990s web development technique comes from; XMLHttpRequest has been around for years, and it’s neither rocket science nor cuttin’ edge technology.

Furthermore, I definitely agree with Thompson’s invitation not to focus so much on the presentation layer: the way we present data is absolutely important, but the actual innovation is to be pursued elsewhere. Engineering should face (and it is actually facing) much more complicated challenges than asynchronous <div> refreshing.

Anyway, I believe there’s no better conclusion for this post than Thompson’s own, so I’ll steal his:

“The time has come to stand up and be counted, and we need people who can count in hex and see beyond the Web 2.0 hype.”

Yesterday two seminars took place at University of Rome, Tor Vergata, Computer System Engineering department. The talks were held by Yorick Wilks (head of the NLP research group at University of Sheffield) and Sergei Nirenburg (University of Maryland Baltimore County). Main topics were: information retrieval and extraction, ontological semantics, Semantic Web and Natural Language Processing (NLP).

Some excellent points were made, and the talks led to a lively discussion about whether the Semantic Web will ever go mainstream, and about where the semantics in the Semantic Web actually are. What soon emerged, anyway, was the crucial role NLP is playing – and will play – in the development of a machine-understandable web.

Whatever… I’d like to see such events take place more often in my faculty. Very interesting. :)