Computers Dig Deeper for Meaning
Search engine technology is in a state of flux as it digs ever deeper for meaning. ‘Search’ is the gateway to the Web: it keeps internet traffic moving and provides the maps and shortcuts through the enormous tangle of the World Wide Web.
But, while there is a phenomenal amount of content, most of it is not easy to find. Sure, text content can be skimmed or scanned, but audiovisual content has to be viewed in linear time: we cannot easily search inside a film or audio recording for relevant information. This is changing, and one European project has created the first integrated platform for semantic search that can return results based on the content and context of film and audio files, as well as text.
Not the end for keywords
This is not the end of keyword search, the standard technology we use every day, but it could well be the beginning of the end. For instance, try composing a meaningful query such as “effects of military action on the civilian population”. A traditional search engine returns results for the individual keywords entered; a semantic engine like MESH first analyses the query and then returns results that match its actual meaning.
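The difference is easy to sketch in code. The snippet below is purely illustrative and is not part of MESH: the sentence-transformers library and the MiniLM embedding model are assumptions standing in for MESH’s own analysis pipeline, but they show how a semantic comparison can rank a relevant document highly even when it shares almost no words with the query.

```python
# Illustrative only: contrasts literal keyword matching with semantic matching.
# The sentence-transformers library and the MiniLM model are assumptions here,
# standing in for MESH's own analysis components.
from sentence_transformers import SentenceTransformer, util

documents = [
    "Report on casualties among residents after the air strikes",
    "Military parade marks national holiday in the capital",
    "Flooding displaces thousands of civilians in lowland areas",
]
query = "effects of military action on the civilian population"

# Keyword view: a document only matches if it shares literal terms with the query.
stopwords = {"of", "on", "in", "the", "a", "an", "and"}
query_terms = set(query.lower().split()) - stopwords
keyword_hits = [d for d in documents if query_terms & set(d.lower().split())]
print("Keyword matches:", keyword_hits)  # only the (irrelevant) parade story matches

# Semantic view: compare meanings via sentence embeddings, so the casualties
# report can score highly even though it shares almost no words with the query.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(documents, convert_to_tensor=True)
query_vec = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_vec, doc_vecs)[0]
for doc, score in sorted(zip(documents, scores), key=lambda p: float(p[1]), reverse=True):
    print(f"{float(score):.2f}  {doc}")
```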
The EU-funded MESH project set out to integrate the state of the art in semantic search technologies, together with all the tools needed to build a working platform. While the team’s achievements are impressive, there is still some way to go before the technology is ready for universal search by everyday surfers. Still, the platform proves the technology in two restricted news domains (natural disasters, and civil unrest and street violence), and it has already led to useful working applications and potential commercialisation opportunities.
“We developed a manual annotation tool to create manageable annotations for all types of media, and it is a very strong program that is easy to use,” explains Pedro Concejero, coordinator of the MESH project. This tool could become a commercial product, he predicts.
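The article does not describe MESH’s annotation format, but the general idea behind such a tool can be sketched as a simple record that ties concept labels and free text to a segment of any media type. The field names and labels below are hypothetical, not the project’s actual schema.

```python
# Hypothetical sketch of a manual media annotation record; the field names and
# concept labels are illustrative, not MESH's actual annotation schema.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Annotation:
    media_uri: str                   # the video, audio file or text document described
    media_type: str                  # "video", "audio" or "text"
    start_seconds: Optional[float]   # segment boundaries; None for whole-document notes
    end_seconds: Optional[float]
    concepts: List[str] = field(default_factory=list)  # controlled-vocabulary labels
    free_text: str = ""              # the annotator's own description

flood_clip = Annotation(
    media_uri="archive://news/flood_report.mp4",
    media_type="video",
    start_seconds=12.0,
    end_seconds=47.5,
    concepts=["natural_disaster", "flooding", "evacuation"],
    free_text="Aerial footage of flooded streets; residents evacuated by boat.",
)
print(flood_clip.concepts)
```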
The search for relevance
One project partner, Deutsche Welle, Germany’s international broadcaster, created a dossier-building tool called Full Story. The program helps a video editor find and link video, audio and text relevant to a particular topic.
The editor can then assemble these diverse elements into a dossier. For example, a dossier about flooding might assemble media outlining the mechanics of flooding, the impact of changing weather patterns, and the effect on lowland and populous areas.
TV stations produce this type of feature all the time and, typically, it can take days of sorting through media archives to gather the material for a compelling dossier.
With the Full Story program, however, an editor can do the same job in hours, and is far more likely to find compelling, visually interesting material, because the time goes into sifting relevant results rather than hunting for relevant material in a vast archive.
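As a rough illustration of that workflow (and not Deutsche Welle’s actual implementation), a dossier tool mainly has to filter and organise results that a semantic search has already scored for relevance. The data, scores and threshold below are invented.

```python
# Toy sketch of dossier assembly: filter already-scored search results and group
# them by media type for the editor. The data, scores and threshold are invented.
from collections import defaultdict

search_results = [  # pretend output of a semantic search on the topic "flooding"
    {"uri": "archive://video/flood_aerial.mp4", "type": "video", "score": 0.91,
     "summary": "Aerial footage of flooded streets"},
    {"uri": "archive://audio/hydrologist_interview.mp3", "type": "audio", "score": 0.84,
     "summary": "Interview on changing rainfall patterns"},
    {"uri": "archive://text/lowland_impact_report.html", "type": "text", "score": 0.78,
     "summary": "Report on the impact on lowland, densely populated areas"},
    {"uri": "archive://video/sports_roundup.mp4", "type": "video", "score": 0.12,
     "summary": "Weekend football round-up"},
]

def build_dossier(results, min_score=0.5):
    """Keep only relevant items and group them by media type."""
    dossier = defaultdict(list)
    for item in sorted(results, key=lambda r: r["score"], reverse=True):
        if item["score"] >= min_score:
            dossier[item["type"]].append(item)
    return dict(dossier)

for media_type, items in build_dossier(search_results).items():
    print(media_type)
    for item in items:
        print(f"  {item['score']:.2f}  {item['summary']}")
```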
“Deutsche Welle is currently evaluating the future prospects of Full Story with further extensive user testing, a comprehensive technology implementation plan and an outline concerning potential commercialisation,” notes Concejero.
Annotating user-generated content
User-generated content is another area that could benefit from the work of the MESH consortium in the short to medium term. It is a huge element of Web 2.0 applications, the material that makes sites like YouTube, Flickr, Facebook and Twitter so popular. The MESH project’s automated annotation tool was central to the platform’s success, and it could be developed to work with user-generated content.
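The article does not explain how MESH’s automated annotation works internally, but the basic step, assigning concept labels to a piece of user-generated content, can be approximated today with an off-the-shelf zero-shot classifier. The Hugging Face library, model and candidate concepts below are assumptions, not components of MESH.

```python
# Rough approximation of automated annotation for user-generated content using an
# off-the-shelf zero-shot classifier. The transformers library, the model and the
# candidate concepts are assumptions; MESH used its own analysis components.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

caption = "Huge crowd marching downtown, police blocking the main square"
candidate_concepts = ["civil unrest", "natural disaster", "sports", "music festival"]

result = classifier(caption, candidate_labels=candidate_concepts)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{score:.2f}  {label}")
```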
“Here at my company, Telefónica, we are very interested in developing semantic search and annotation for user-generated content on mobile phones, but more work would need to be done on the technology developed in MESH to make it ready for that sort of application,” reveals Concejero.
That may be the work of another project. The consortium has just put the final touches to MESH, but Concejero says some of the partners may take the work forward in a future project. In the meantime, the response from peers and industry to the MESH project’s work has been encouraging.
Above all, the MESH project demonstrates that semantic search for all media types is possible and that automation is improving rapidly. It’s not quite there yet, but the search continues.
For further information about the MESH prototype: http://mesh.tid.es:8080/MeshGUI
The MESH integrated project received funding from the ICT strand of the EU’s Sixth Framework Programme for research.
Courtesy of ICT Results