The Web has proved to be an unprecedented success in facilitating the publication, use and exchange of information at planetary scale, on virtually every topic, and representing an amazing diversity of opinions, viewpoints, mindsets and backgrounds. Its design principles and core technological components have led to remarkable growth and mass collaboration, a trend that is also finding increasing adoption in business environments. Nevertheless, the Web is confronted with fundamental challenges with respect to the purposeful access, processing and management of this sheer amount of information, whilst remaining true to its principles and leveraging the diversity that inherently unfolds through collaboration at worldwide scale.
RENDER will engage with these challenges by developing methods, techniques, software and data sets that leverage diversity as a crucial source of innovation and creativity, whilst providing enhanced support for feasibly managing data at very large scale, and for designing novel algorithms that reflect diversity in the ways information is selected, ranked, aggregated, presented and used. RENDER’s information management solution will scale to very large amounts of data and hundreds of thousands of users, but also to a plurality of points of view and opinions. This will be demonstrated using realistic data sources, including news streams covering over 5000 sources worldwide with 100,000 items per day, (micro)blog streams adding up to more than a million posts per day, a full data stream from Wikipedia, and the Linked Open Data Cloud; through open source extensions to popular collaboration and communication platforms such as MediaWiki and Drupal; and through three high-profile case studies.
RENDER will help to realize a world where information is acquired and shared in a fundamentally different manner than the consensual approach promoted by movements such as Web 2.0, and where communication and collaboration across the borders of social, cultural or professional communities are truly enabled via advanced Web technology, supporting one of the credos of European society: “United in diversity”.
RENDER will provide a comprehensive conceptual framework and technological infrastructure for enabling, supporting, managing and exploiting information diversity in Web-based environments.
Diversity is a crucial source of innovation and adaptability. It ensures the availability of alternative approaches towards solving hard problems, and provides new perspectives and insights on known situations.
Equally important, embracing diversity in information management is essential for enhancing state-of-the-art technology in this field with novel paradigms, models, methods and techniques for searching, selecting, ranking, aggregating, clustering and presenting information purposefully to users, thus alleviating critical aspects of information overload.
RENDER will develop concepts, methods, techniques and technology to:
- Collect and manage information sources which are rich in diversity so that this information is available in an effective form and can be processed efficiently in further steps. We will crawl, gather, structure and enrich various highly diverse information sources, including sources relevant for the RENDER case studies. RENDER will leverage very large amounts of content and metadata: news, blog and microblog streams, content and logs from Wikipedia, news archives, multimedia content and reader comments, discussion forums and customer feedback databases from Telefónica, altogether adding up to hundreds of millions of items, some even on a daily basis. This data will be managed by a highly scalable data management infrastructure, and enriched with machine-understandable descriptions and links referring to the Linked Open Data Cloud. The results will be published online as high-quality, self-descriptive data sets, available to the large-scale information management community worldwide for widespread use.
- Identify and extract the diversity embodied within the various information sources collected, and make the connections and references between different items and sources explicit. RENDER will: spot and assess biases, factual coverage, and the intensity of opinions expressed; identify complex events; and track topics across multiple sources, data modalities and languages. The results will be stored and managed through the data management infrastructure mentioned above, and will serve as input for higher-level processing and usage.
- Represent and process diversely expressed information so as to explicate and conceptualize the results of the mining task, and to enable the development of diversified information management algorithms and services. RENDER will develop novel, scalable techniques for reasoning about opinions and viewpoints, and for diversity-aware information selection and ranking. RENDER will also look into proper means of making diversity information accessible to the end-user by providing sophisticated metaphors, interfaces and software tools to organize, display and visualize it. Ranking algorithms will take into account the viewpoints underlying different information items (within the top hits, provided a concept such as “top hit” is still deemed appropriate). Information will be summarized in fundamentally different ways, not only making explicit the biases introduced through such processes, but also trying to minimize them. The results will be displayed to the end-user, who will then be able to navigate and discover content and topics within the diversity space.
- Use diversity as an integral concept of popular communication, collaboration and information sharing platforms, in the form of extensions to MediaWiki, Drupal, or Twitter. RENDER technology will allow users to link explicitly to items with a dissenting view, thus increasing the exposure of the wider Web audience to diversity.
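
To make the idea of diversity-aware ranking mentioned above more concrete, the following is a minimal illustrative sketch and not RENDER's actual algorithm. It assumes that each item already carries a relevance score and a viewpoint label produced by some upstream step; the `Item` class, the `diversify` function, the `penalty` parameter and the sample data are all hypothetical names and values chosen for illustration. The sketch greedily selects items and discounts those whose viewpoint is already represented among the selected ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical retrieved item; both scores are assumed inputs."""
    title: str
    relevance: float  # assumed to come from an upstream ranker, in [0, 1]
    viewpoint: str    # assumed to come from an upstream opinion-mining step


def diversify(items, k, penalty=0.5):
    """Greedily pick up to k items, discounting already represented viewpoints.

    The adjusted score of an item is relevance * penalty ** n, where n is the
    number of already selected items that share its viewpoint.
    """
    remaining = list(items)
    selected = []
    seen = {}  # viewpoint -> number of selected items with that viewpoint
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda it: it.relevance * penalty ** seen.get(it.viewpoint, 0),
        )
        selected.append(best)
        remaining.remove(best)
        seen[best.viewpoint] = seen.get(best.viewpoint, 0) + 1
    return selected


# Made-up sample data, for illustration only.
items = [
    Item("pro-1", 0.90, "pro"),
    Item("pro-2", 0.85, "pro"),
    Item("contra-1", 0.60, "contra"),
    Item("neutral-1", 0.50, "neutral"),
]

top = diversify(items, k=3)
print([it.title for it in top])  # ['pro-1', 'contra-1', 'neutral-1']
```

With `penalty=1.0` the discount disappears and the function falls back to a plain relevance ordering, which makes the effect of the diversity term easy to compare against a conventional “top hit” list.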