Archaeoinformatics - Data Science

Ongoing Topics

This is a list of ongoing thesis topics. For further information, or if you wish to suggest your own topic, contact the responsible supervisor(s).

MA: Extraction of Scientific Measurements from Text

Figure: Named-entity recognition example with scientific measurements

The goal of this work is to automatically extract scientific measurements, such as 4 kg/m², from papers. One of the many challenges is that the same quantity is often written in different ways (metre, meter, m, etc.). A longer-term goal is to link the extracted quantities to their geospatial location, if one is given somewhere in the text.
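
As a rough illustration of the unit-variation problem, the following sketch extracts number-unit pairs with a regular expression and maps spelling variants to one canonical symbol. It is only a baseline under stated assumptions, not the method used in the thesis: the alias table `UNIT_ALIASES`, the pattern `MEASUREMENT`, and the function `extract_measurements` are hypothetical names, and exponents are assumed to be written as plain digits (for example `m2`).

```python
import re

# Hypothetical alias table: maps surface forms to one canonical unit symbol.
UNIT_ALIASES = {
    "m": "m", "meter": "m", "meters": "m", "metre": "m", "metres": "m",
    "kg": "kg", "kilogram": "kg", "kilograms": "kg",
}

# A number, optional whitespace, a unit word, and an optional "/unit"
# denominator that may carry a single-digit exponent.
MEASUREMENT = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[A-Za-z]+)"
    r"(?:\s*/\s*(?P<per>[A-Za-z]+)(?P<exp>\d)?)?"
)

def extract_measurements(text):
    """Return (value, canonical_unit) pairs for every known unit in text."""
    results = []
    for match in MEASUREMENT.finditer(text):
        unit = UNIT_ALIASES.get(match.group("unit").lower())
        if unit is None:
            continue  # a number followed by an ordinary word, not a unit
        per = match.group("per")
        if per is not None:
            per_unit = UNIT_ALIASES.get(per.lower())
            if per_unit is None:
                continue  # unknown denominator: skip rather than guess
            unit = f"{unit}/{per_unit}{match.group('exp') or ''}"
        results.append((float(match.group("value")), unit))
    return results

print(extract_measurements("a load of 4 kg/m2 over 3 metres and 2.5 meter"))
```

A hand-written pattern like this breaks down quickly on real papers (ranges, uncertainties, unit prefixes, line breaks), which is one reason a learned named-entity recognition approach is of interest here.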

For more information, please contact Asif Suryani.

BA/MA: Investigating Fitness Effects for Beta-Lactamase with Recurrent Neural Networks

Contact: Steffen Strohm, M.Sc., Christian Beth, M.Sc.

In this work, the goal is to predict the fitness of bacteria against beta-lactam-based antibiotics in silico (on the computer). The prediction is based on protein sequence data from the beta-lactamase enzyme, which is crucial for the survival of beta-lactam-resistant bacteria. The performance of the predictor is then evaluated on real-world data obtained from in vitro tests (in the lab).
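
To make the setup more concrete, here is a minimal sketch of how a recurrent network could map a protein sequence to a scalar score. It assumes a plain (Elman) RNN with one-hot encoded residues and random, untrained weights; the architecture, the names `one_hot`, `init_params`, and `predict_fitness`, and the hidden size are illustrative assumptions, not the model used in this project.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def one_hot(residue):
    """Encode one amino acid as a 20-dimensional indicator vector.

    Characters outside AMINO_ACIDS yield an all-zero vector.
    """
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]

def init_params(hidden_size, seed=0):
    """Draw small random weights; a real model would learn these from data."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": matrix(hidden_size, len(AMINO_ACIDS)),
        "W_rec": matrix(hidden_size, hidden_size),
        "w_out": [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)],
    }

def predict_fitness(sequence, params):
    """Run the RNN over the sequence and return a scalar score."""
    hidden = [0.0] * len(params["w_out"])
    for residue in sequence:
        x = one_hot(residue)
        # New hidden state: tanh(W_in x + W_rec h), computed row by row.
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(row_in, x))
                + sum(w * hi for w, hi in zip(row_rec, hidden))
            )
            for row_in, row_rec in zip(params["W_in"], params["W_rec"])
        ]
    # Linear read-out of the final hidden state.
    return sum(w * h for w, h in zip(params["w_out"], hidden))

params = init_params(hidden_size=8)
print(predict_fitness("MSIQHFRVAL", params))  # arbitrary short amino-acid string
```

With untrained weights the score carries no biological meaning; in practice the parameters would be fitted to measured fitness values, and the predictions would then be compared against the in vitro data.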

MA: Extending Expressiveness of Meta-Structures in Heterogeneous Information Networks

Contact: Christian Beth, M.Sc.

Heterogeneous information networks (HINs) are graphs in which nodes have different types and edges represent different relations between the nodes. They allow semantically rich modelling of virtually any kind of data and information, ranging from protein-protein interaction networks to bibliographical networks. A meta-path is a composite relation between nodes in an HIN. Meta-paths are an integral part of state-of-the-art similarity and relevance measures in HINs, which in turn underpin downstream data mining tasks such as clustering, classification, and link prediction. To allow for more powerful, complex, and expressive relationships, the meta-structure was developed. It flexibly combines meta-path relations with the 'and'-linkage, which allows the user to specify more precisely what she is looking for. But why stop at the 'and'-linkage? Why not consider 'or', 'not', or other logical constraints? In this thesis, you will design and study effective (meaningful) and efficient (fast, scalable) relevance measures based on more expressive meta-structures.
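
The idea of combining relations logically can be sketched on a toy bibliographic network. The example below is an illustration under assumptions, not an existing relevance measure: the edge list, the relation names, and the helpers `build_index`, `follow`, and `related_authors` are all hypothetical. It only computes which authors are reachable, whereas an actual relevance measure would also count or weight the connecting paths.

```python
import operator
from collections import defaultdict

# Hypothetical toy HIN as (source, relation, target) triples.
EDGES = [
    ("alice", "writes", "p1"), ("bob", "writes", "p1"),
    ("bob", "writes", "p2"), ("carol", "writes", "p3"),
    ("p1", "published_in", "KDD"), ("p2", "published_in", "VLDB"),
    ("p3", "published_in", "KDD"),
    ("p1", "mentions", "clustering"), ("p2", "mentions", "clustering"),
    ("p3", "mentions", "ranking"),
]

def build_index(edges):
    """Index edges by (node, relation); '~rel' denotes the inverse relation."""
    index = defaultdict(set)
    for src, rel, dst in edges:
        index[(src, rel)].add(dst)
        index[(dst, "~" + rel)].add(src)
    return index

def follow(index, start, meta_path):
    """Return all nodes reachable from start along the relation sequence."""
    frontier = {start}
    for rel in meta_path:
        frontier = set().union(*(index.get((node, rel), set()) for node in frontier))
    return frontier

def related_authors(index, author, combine):
    """Authors of papers whose venue/topic link to the author's papers
    satisfies the logical linkage given by combine."""
    result = set()
    for paper in follow(index, author, ["writes"]):
        same_venue = follow(index, paper, ["published_in", "~published_in"])
        same_topic = follow(index, paper, ["mentions", "~mentions"])
        for other in combine(same_venue, same_topic):
            result |= follow(index, other, ["~writes"])
    return result

index = build_index(EDGES)
print(related_authors(index, "alice", operator.and_))  # same venue AND same topic
print(related_authors(index, "alice", operator.or_))   # same venue OR same topic
print(related_authors(index, "alice", operator.sub))   # same venue but NOT same topic
```

Here the 'and' case corresponds to the meta-structure idea of requiring two relations at once, while the 'or' and 'not' cases stand in for the kind of extended linkages the thesis would investigate.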