Category Archives: information retrieval

A very interesting talk about Semantic Web and Machine Learning research applied to human-computer interaction and desktop application design was delivered by Prof. Andreas Dengel at Google Tech Talks on September 28, 2007.

A Semantic Desktop is a means to manage all personal information across applications based on Semantic Web standards. It acts as an extended personal memory assisting users to file, relate, share, and access all digital information like documents, multimedia, and messages through a Personal Information Model (PIMO). This PIMO is built on ontological knowledge generated through user observations and interactions and may be seen as a formal and semi-formal complement of the user’s mental models. [...]

In this talk I will show how machine learning techniques may be used to support the generation of a PIMO. I will further introduce the main concepts, components, and functionalities of the Semantic Desktop, and give examples which show how the Semantic Desktop may become reality.

The talk is about an hour long and well worth watching: having machines adapt their behaviour to their users would be a great result, after more than a decade of users adapting to machines’ limitations. :)


Tagging provides a more flexible and smarter approach to information search; there’s no doubt about it. Yet, so far, it has been almost exclusively a feature of web-based services.

An Austria-based team developed Tag2Find, a nice standalone app that brings to Windows XP the well-known tagging feature we all use in web 2.0 apps. I just started using it and, although it’s still in beta testing, it seems quite cool.

Windows Vista provides such a feature, even though it is quite difficult to use, as TechCrunch reports: if you right-click on a file and click Properties, and then Details, you can enter a set of tags for that file. Using the search feature of Vista you can then find those files by searching for a tag.
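To make the "tag a file, then search by tag" idea concrete, here is a minimal sketch of a tag index. It is only an illustration under my own assumptions: the `TagIndex` class and the file names are hypothetical, and this is not how Vista or Tag2Find actually store tags.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index mapping each tag to the files that carry it."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def tag(self, path, *tags):
        """Attach one or more tags to a file path (tags are case-insensitive)."""
        for t in tags:
            self._files_by_tag[t.lower()].add(path)

    def search(self, *tags):
        """Return the files that carry ALL of the given tags."""
        if not tags:
            return set()
        # .get() avoids creating empty entries for unknown tags.
        sets = [self._files_by_tag.get(t.lower(), set()) for t in tags]
        return set.intersection(*sets)


# Hypothetical file names, for illustration only.
index = TagIndex()
index.tag("report.doc", "work", "2007")
index.tag("photo.jpg", "holiday", "2007")
```

With this in place, `index.search("2007")` returns both files, while `index.search("work", "2007")` narrows the result down to `report.doc`. That is the flexibility folders alone don't give you: a file can live under as many tags as you like.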

Windows XP doesn’t support tagging though, so that’s where this tiny piece of software comes in handy.

Tag2Find can be integrated wherever you need it – in the taskbar, in Explorer, in Internet Explorer or even in a floating position.

At the moment only NTFS filesystems are supported. As I said above, it’s still a beta, so if you want to have a look you should request an invitation.

I like this app very much: it’s useful and smart, and the browser interface is very friendly. I definitely plan to test it thoroughly. ;)

Gmail just launched a very interesting feature that allows users to fetch mail from other accounts. I spotted the news via TechCrunch.

Yes folks, the mail fetcher allows users to access non-Gmail email accounts from within the Gmail interface, using POP parameters.
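For the curious, this is roughly what "fetching over POP" looks like from the client side. It is a minimal sketch using Python's standard `poplib` module, not Gmail's actual implementation; the function names `fetch_subjects` and `fetch_from_account` are my own, and the host, user name and password are whatever your other mail provider gives you.

```python
import poplib
from email import message_from_bytes


def fetch_subjects(conn, limit=10):
    """Return the Subject headers of up to `limit` most recent messages.

    `conn` is an already authenticated POP3 connection (or any object
    offering the same stat() and top() methods).
    """
    count, _size = conn.stat()
    subjects = []
    for number in range(count, max(count - limit, 0), -1):
        # top(n, 0) downloads only the headers of message n, no body lines.
        _resp, lines, _octets = conn.top(number, 0)
        msg = message_from_bytes(b"\r\n".join(lines))
        subjects.append(msg.get("Subject", "(no subject)"))
    return subjects


def fetch_from_account(host, username, password):
    """Log in to a POP3-over-SSL server (default port 995) and list subjects."""
    conn = poplib.POP3_SSL(host)
    try:
        conn.user(username)
        conn.pass_(password)
        return fetch_subjects(conn)
    finally:
        conn.quit()
```

Splitting the login from the fetching keeps the second half easy to try out against a fake connection, without touching a real mailbox.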

This effectively turns Gmail into a web-based email client, which is a very smart move from a strategic perspective too.

A big step towards the day when the Web will be the platform that apps run on.

[UPDATE] The privacy implications, in the long run, are not to be underestimated here, according to Donna Bogatin @ ZDNet Blogs.

You are here!

A very good post @ Good Morning Silicon Valley about privacy and security policies.

The second point is particularly interesting: it deals with a recently disclosed project that automatically assigns every person entering the US a score rating that person as a terrorist threat.

The score is based on a list of factors such as the “analysis of their travel records and other data, including items such as where they are from, how they paid for tickets, their motor vehicle records, past one-way travel, seating preference and what kind of meal they ordered”.

I won’t go into the privacy issues, which have been discussed in the original post. Anyway, this looks to me pretty much like the “ideal” machine learning scenario. :)

I believe they trained the system with a set of already categorized examples (yes, people… either bad or good), learning a categorization function that returns your very own score as a threat.
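Just to show what "learning a scoring function from categorized examples" means in practice, here is a toy sketch. It is pure guesswork on my side: I picked logistic regression because it is the simplest model that outputs a score between 0 and 1, and the two numeric features and all the data points are made up. Nothing here reflects the real system.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, epochs=2000, lr=0.1):
    """Fit a logistic regression model with plain stochastic gradient descent."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of the log-loss w.r.t. the linear output
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def score(model, x):
    """Return a score in (0, 1): the model's estimate that x belongs to class 1."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Made-up training set: two abstract features per example, label 0 or 1.
examples = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
            [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
labels = [0, 0, 0, 1, 1, 1]

model = train(examples, labels)
```

After training, `score(model, [0.85, 0.85])` lands above 0.5 and `score(model, [0.15, 0.15])` below it. Anything in between is where the trouble starts.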

This is very interesting from an engineering perspective, although we all know in such problems the error rate is usually non-zero, so the question arises quite easily: what if…?
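The "what if…?" gets clearer with some back-of-the-envelope arithmetic. The sketch below uses entirely invented numbers (I have no idea what the real rates are) to show how even a small error rate plays out when the thing you are looking for is extremely rare.

```python
def flagged_breakdown(population, prevalence, sensitivity, false_positive_rate):
    """Split the people a classifier flags into true and false positives.

    prevalence:          fraction of the population that is actually positive
    sensitivity:         fraction of actual positives the classifier flags
    false_positive_rate: fraction of actual negatives the classifier flags
    """
    actual_positives = population * prevalence
    true_positives = actual_positives * sensitivity
    false_positives = (population - actual_positives) * false_positive_rate
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


# Invented figures: 1,000,000 travellers, 1 in 100,000 a real threat,
# a classifier that catches 99% of them and wrongly flags 1% of everyone else.
tp, fp, precision = flagged_breakdown(1_000_000, 1e-5, 0.99, 0.01)
```

With these made-up figures the system flags about 10 real threats and roughly 10,000 innocent travellers, so fewer than 1 in 1,000 flagged people is a true positive.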

Hopefully they allowed a very narrow error margin ;) .

Yesterday two seminars took place at the University of Rome, Tor Vergata, Computer System Engineering department. The talks were given by Yorick Wilks (head of the NLP research group at the University of Sheffield) and Sergei Nirenburg (University of Maryland, Baltimore County). The main topics were: information retrieval and extraction, ontological semantics, Semantic Web and Natural Language Processing (NLP).

Some excellent points were made, and the talks sparked a lively discussion about whether the Semantic Web will ever go mainstream, and about where the semantics in the Semantic Web actually are. What soon emerged, anyway, was the crucial role NLP is playing – and will play – in the development of a machine-understandable web.

Whatever… I’d like to see such events take place more often in my faculty. Very interesting. :)

During one of the latest lessons of the Information Retrieval and Machine Learning course I’m attending, I had the chance to get to know some clustering techniques and algorithms, and an example of a “real world” application, a clustered (meta) search engine, was presented too: Vivìsimo.

“They used a mathematical algorithm and deep linguistic knowledge to find relationships between search terms and bring them to light.”

I’m always amazed by natural-language-based techniques; results are quite accurate if you think about the huge amount of unstructured data the search process operates on.
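To give a feeling for what clustering search results involves, here is a deliberately naive sketch: it groups result snippets by the words they share (Jaccard similarity on word sets, greedy single pass). This is my own simplification and certainly not Vivìsimo's algorithm; the snippets, the stop-word list and the 0.2 threshold are all made up.

```python
import re

STOP_WORDS = {"the", "and", "for", "of", "a"}


def tokens(text):
    """Lower-case a snippet and return its set of non-stop-word terms."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS}


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(snippets, threshold=0.2):
    """Greedily assign each snippet to the most similar cluster so far,
    or open a new cluster if nothing reaches the threshold."""
    clusters = []  # each cluster: {"terms": set of words, "items": [snippets]}
    for snippet in snippets:
        terms = tokens(snippet)
        best, best_sim = None, threshold
        for c in clusters:
            sim = jaccard(terms, c["terms"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"terms": set(terms), "items": [snippet]})
        else:
            best["terms"] |= terms
            best["items"].append(snippet)
    return [c["items"] for c in clusters]


# Made-up snippets for an ambiguous query.
results = [
    "U2 band tour dates and concert tickets",
    "Lockheed U2 spy plane history",
    "U2 concert tickets for the band tour",
    "U2 spy plane missions and Lockheed history",
]
groups = cluster(results)
```

On these four snippets the sketch separates the band from the spy plane, which is exactly the kind of disambiguation a clustered search engine offers on an ambiguous query. A real engine has to do this on far noisier text, and label the clusters too.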

Vivìsimo Inc. provides enterprises and government with tailored clustered search solutions and also provides a consumer-oriented web search service: Clusty.

One thing I like a lot is the Clusty cloud creator: a totally unsupervised – real time – tag cloud generator, based upon – I suppose – latent semantic analysis techniques. Is it a first step towards automated semantic tagging? Hope so… I’m quite fed up with thinking up tags myself, and I keep forgetting quite relevant tags too :P .

  • Vivìsimo search results for the query “U2” (two screenshots)
  • Clusty cloud for the query “U2” (screenshot)
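
As for the unsupervised tag cloud idea, the crudest possible version is easy to sketch: count how often each term shows up in the result snippets and map the counts to font sizes. Note that this is plain term counting, not latent semantic analysis, and not what Clusty does; the snippets, the stop-word list and the font sizes are all my own assumptions.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "and", "on", "about", "a", "of", "for"}


def tag_cloud(snippets, top=5, min_size=12, max_size=32):
    """Return {term: font_size} for the `top` most frequent terms."""
    counts = Counter(
        word
        for snippet in snippets
        for word in re.findall(r"[a-z0-9]+", snippet.lower())
        if word not in STOP_WORDS
    )
    selected = counts.most_common(top)
    if not selected:
        return {}
    hi = selected[0][1]
    lo = selected[-1][1]
    span = hi - lo
    cloud = {}
    for term, count in selected:
        # Scale linearly between min_size and max_size; if all counts
        # are equal, every term gets the maximum size.
        size = max_size if span == 0 else min_size + (count - lo) * (max_size - min_size) / span
        cloud[term] = size
    return cloud


# Made-up snippets, for illustration only.
cloud = tag_cloud([
    "U2 announce new tour",
    "Tour tickets on sale",
    "Bono talks about the tour and new album",
])
```

Here "tour" ends up as the biggest tag and "new" sits in the middle. Getting from this to tags that capture meaning rather than mere frequency is where techniques like latent semantic analysis would come in.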