The English version of the DBpedia knowledge base currently describes 4.58 million things, of which 4.22 million are classified in a consistent ontology, including 1,445,000 persons, 735,000 places (including 478,000 populated places), 411,000 creative works (including 123,000 music albums, 87,000 films and 19,000 video games), 241,000 organizations (including 58,000 companies and 49,000 educational institutions), 251,000 species and 6,000 diseases.
We want to develop a generic search engine for encyclopedic knowledge that exceeds the capabilities of existing text-based information retrieval approaches and truly understands the user's intention while querying. Furthermore, we want to achieve a natural interaction between the user and the engine by allowing the user to enter keywords, phrases and natural language questions, and by providing state-of-the-art input comfort such as auto-completion. The underlying data will have the world's largest knowledge collection, Wikipedia, at its heart. We also want to extend this knowledge by using unstructured streams (e.g. Twitter), tabular data (e.g. WHO) and other data sources that are not yet structured. Finally, we want to ensure that users can use any available knowledge base regardless of its location by implementing a federation layer based on W3C standards like RDF and SPARQL 1.1.
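To make the federation idea more concrete, the following is a minimal sketch of how one graph pattern could be sent to several endpoints using the `SERVICE` keyword from SPARQL 1.1 Federated Query. It is not the DIESEL implementation: the function name `build_federated_query` and the `https://example.org/sparql` endpoint are assumptions made for illustration, and the sketch only builds the query text without executing it.

```python
def build_federated_query(endpoints, pattern, limit=10):
    """Build a SPARQL 1.1 query that sends one graph pattern to several endpoints.

    Hypothetical helper for illustration only; it returns the query text
    and does not contact any endpoint.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    # One SERVICE block per remote endpoint; UNION combines their results.
    blocks = ["  { SERVICE <%s> { %s } }" % (url, pattern) for url in endpoints]
    body = "\n  UNION\n".join(blocks)
    return "SELECT * WHERE {\n%s\n}\nLIMIT %d" % (body, limit)


query = build_federated_query(
    [
        "https://dbpedia.org/sparql",   # public DBpedia endpoint
        "https://example.org/sparql",   # assumed second endpoint (placeholder)
    ],
    "?film a <http://dbpedia.org/ontology/Film> .",
)
print(query)
```

A real federation layer would additionally have to choose which endpoints are relevant for a given query instead of sending the pattern to all of them; this sketch leaves that step out.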
ULEI is the scientific partner of the DIESEL project and delivers the following frameworks:
- Named Entity Recognition: Federated knOwledge eXtraction Framework FOX
- Named Entity Disambiguation: Agnostic Named Entity Disambiguation using Linked Data AGDISTIS
- Open Table Extraction: Automatic Property Mapping for Tabular Data TAIPAN
- Distributed Search: SPARQL Query Federation Suite QUETSAL
- Semantic Search: SESSA