Archaeoinformatics - Data Science

BA/MA: Text Mining and Knowledge Extraction in Marine Sciences

There are multiple open topics available that target text mining and knowledge extraction from text with applications in marine science. If you are interested in one of the topics, or have an idea about a related topic, please contact Asif Suryani, M.Sc.
You can find more suggested topics below.

BA/MA: Study and Evaluation: From NER to Network Representation of Scientific Text


Named entity recognition (NER) is the task of finding and extracting relevant entities from text. Scientific measurements and their associated values are of particular interest in this scenario, as is the automatic recognition of locations, institutions, persons, etc. Once these entities are extracted from a text, the task is to construct a network representation (e.g. a heterogeneous information network) of the document at hand. The challenge lies in predicting the appropriate relationships between the extracted entities (e.g. linking an extracted quantity 'mass' to its respective value '42' and unit 'kilograms'). The goal of this thesis is to develop and study novel techniques for NER and to link the extracted entities into a network representation of the document.
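To make the idea of linking a quantity to its value and unit concrete, here is a minimal sketch in Python. It is an illustration only, not the method to be developed in the thesis: a hand-written regular expression stands in for a real NER model, and the pattern, the short list of quantity names, the function names, and the relation labels `has_value` and `has_unit` are all assumptions made for this example.

```python
import re

# Toy stand-in for an NER model (assumption): matches phrases of the form
# "<quantity> of <value> <unit>", e.g. "mass of 42 kilograms".
MEASUREMENT = re.compile(
    r"(?P<quantity>mass|depth|temperature|salinity)\s+of\s+"
    r"(?P<value>-?\d+(?:\.\d+)?)\s+"
    r"(?P<unit>[A-Za-z]+)"
)


def extract_measurements(text):
    """Return (quantity, value, unit) triples found by the toy pattern."""
    return [
        (m["quantity"], m["value"], m["unit"])
        for m in MEASUREMENT.finditer(text)
    ]


def build_network(triples):
    """Build a small typed graph from the extracted triples.

    Nodes are (type, label) pairs, so nodes of different types can coexist,
    as in a heterogeneous information network. Edges are
    (source, relation, target) tuples with hypothetical relation names.
    """
    nodes = set()
    edges = []
    for quantity, value, unit in triples:
        q = ("quantity", quantity)
        v = ("value", value)
        u = ("unit", unit)
        nodes.update([q, v, u])
        edges.append((q, "has_value", v))
        edges.append((v, "has_unit", u))
    return nodes, edges


if __name__ == "__main__":
    sentence = (
        "The sample had a mass of 42 kilograms and was taken "
        "at a depth of 1500 meters."
    )
    nodes, edges = build_network(extract_measurements(sentence))
    for source, relation, target in edges:
        print(source, relation, target)
```

In a real setting, the regular expression would be replaced by a trained NER model, and the fixed linking rule by a learned relation predictor. That relation prediction step is the challenging part described above.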

BA: Scientific Text Parser: An Interactive and Intelligent Approach

In this bachelor's thesis, the task is to develop a framework for a parsing toolkit that reads and summarizes scientific text documents, tailored to the needs of the user. The target domain for these studies will be scientific texts from marine science.

MA: Pre-trained Language Models for Domain-driven Q/A

In this master's thesis, the objective is to leverage novel, state-of-the-art pre-trained language models (such as BERT) to facilitate automated question answering (Q/A). The target domain for these studies will be scientific texts from marine science.