Semantic search for novel information
Companies frequently face the challenge of screening the continuously increasing number of (Web) documents and assessing the contained information with respect to its relevance and novelty. For instance, technology scouts need to discover and monitor new technologies, while investors and stock brokers would like to be informed about recent acquisitions. The systems that have been developed so far for detecting novel information (semi-)automatically in text documents are often very inefficient. This is due to the fact that most approaches only consider the relevance, but not the novelty, of text documents. The few existing approaches for novel information detection do not use any semantically-structured representation of the already known and of the extracted information.

In this thesis, new approaches for detecting and extracting novel, relevant information from unstructured text documents are presented that exploit the explicit modeling of the semantics of the given and extracted information. Using semantics has the benefit of resolving ambiguities in the language and specifying the exact information need regarding relevance and novelty. The explicit modeling is performed by using Semantic Web technologies such as the Resource Description Framework (RDF). In the presented work, we assume that all knowledge that is known to the system is available in the form of an RDF knowledge graph. Hence, novelty and relevance are considered with regard to a knowledge graph.

The contributions of this thesis can be summarized as follows:
1. We assess the suitability of existing large knowledge graphs for the task of detecting novel information in text documents.
2. We present an approach by which emerging entities are predicted and recommended, respectively, for a knowledge graph.
3. We present an approach for extracting novel, relevant, semantically-structured statements from text documents.

The contributions are presented, applied, and evaluated with the help of several scenarios. The developed approaches are suitable for the recommendation of emerging entities and novel statements, respectively, for the purpose of knowledge graph population as well as for use by users who are dependent on novel information (such as journalists and technology scouts).
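
To illustrate what novelty and relevance "with regard to a knowledge graph" can mean, the following is a minimal sketch and not the method developed in the thesis. It assumes that the knowledge graph is a plain set of (subject, predicate, object) triples, that relevance is expressed as a set of predicates of interest, and that a statement counts as novel when it is not yet contained in the graph. All names, including the `ex:` identifiers, are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set


class Triple(NamedTuple):
    """An RDF-like statement; identifiers are plain strings in this sketch."""
    subject: str
    predicate: str
    obj: str


def novel_relevant_statements(
    extracted: Iterable[Triple],
    known_graph: Set[Triple],
    relevant_predicates: Set[str],
) -> List[Triple]:
    """Keep statements that are relevant (predicate of interest)
    and novel (not yet contained in the knowledge graph)."""
    result = []
    for statement in extracted:
        is_relevant = statement.predicate in relevant_predicates
        is_novel = statement not in known_graph
        if is_relevant and is_novel:
            result.append(statement)
    return result


# Hypothetical knowledge graph and hypothetical extraction output.
known_graph = {
    Triple("ex:CompanyA", "ex:acquired", "ex:CompanyB"),
    Triple("ex:CompanyA", "ex:locatedIn", "ex:CityX"),
}
extracted = [
    Triple("ex:CompanyA", "ex:acquired", "ex:CompanyB"),   # already known
    Triple("ex:CompanyA", "ex:acquired", "ex:CompanyC"),   # novel and relevant
    Triple("ex:CompanyC", "ex:locatedIn", "ex:CityY"),     # novel, not relevant
]
recommended = novel_relevant_statements(extracted, known_graph, {"ex:acquired"})
```

Under these assumptions, only the acquisition of `ex:CompanyC` would be recommended. A real system would additionally have to map ambiguous surface forms in the text to the correct entities before such a comparison is possible.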