Archaeoinformatics - Data Science

Ongoing Topics

This is a list of ongoing thesis topics. For further information, or if you wish to suggest your own topic, contact the responsible supervisor(s).

MA: Extraction of Scientific Measurements from Text

Named-entity recognition example with scientific measurements

The goal of this work is to automatically extract scientific measurements from papers, such as 4 kg/m². One of the many challenges is that the same quantity is often written in different ways (metre, meter, m, etc.). A longer-term goal is to also link the extracted quantities to their geospatial location, if it is given somewhere in the text.
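To make the task concrete, here is a minimal sketch of a rule-based baseline: a regular expression finds a number followed by a unit-like token, and a small alias table maps spelling variants to one canonical symbol. The alias table, the pattern, and the function name `extract_measurements` are illustrative assumptions, not part of the thesis topic; a real solution would more likely rely on a trained named-entity recognition model.

```python
import re

# Hypothetical alias table: spelling variants -> canonical unit symbol.
UNIT_ALIASES = {
    "m": "m", "meter": "m", "meters": "m", "metre": "m", "metres": "m",
    "kg": "kg", "kilogram": "kg", "kilograms": "kg",
    "kg/m2": "kg/m^2", "kg/m^2": "kg/m^2", "kg/m²": "kg/m^2",
}

# A number, optional whitespace, then a unit-like token such as "kg/m2".
PATTERN = re.compile(
    r"(?<![\w.])(\d+(?:\.\d+)?)\s*"
    r"([A-Za-z]+(?:/[A-Za-z]+)?(?:\^?[0-9²³])?)"
)

def extract_measurements(text):
    """Return (value, canonical_unit) pairs for every known unit in text."""
    results = []
    for match in PATTERN.finditer(text):
        value, unit = match.group(1), match.group(2)
        # Lower-casing is a simplification; it would conflate case-sensitive
        # symbols (e.g. "M" vs "m") in a real system.
        canonical = UNIT_ALIASES.get(unit.lower())
        if canonical is not None:  # skip tokens that are not known units
            results.append((float(value), canonical))
    return results

print(extract_measurements("The load was 4 kg/m2 over 3.5 metres."))
```

A lookup table like this does not scale to the variety of units found in real papers, which is one reason the problem is treated as a learning task.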

For more information, please contact Asif Suryani.

MA: Extending Expressiveness of Meta-Structures in Heterogeneous Information Networks

Contact: Christian Beth, M.Sc.

Heterogeneous information networks (HINs) are graphs in which nodes have different types and edges represent different relations between the nodes. They thus allow semantically rich modelling of virtually any kind of data and information, ranging from protein-protein interaction networks to bibliographical networks. A meta-path is a composite relationship between nodes in an HIN. Meta-paths are an integral part of state-of-the-art similarity/relevance measures in HINs, which in turn underpin downstream data mining tasks such as clustering, classification, or link prediction. To allow for more powerful, complex, and expressive relationships, the meta-structure was developed. It flexibly combines meta-path relations with the 'and'-linkage, which allows the user to specify more precisely what she is looking for. But why stop at the 'and'-linkage? Why not consider 'or', 'not', or other logical constraints? In this thesis you will design and study effective (meaningful) and efficient (fast, scalable) relevance measures based on more expressive meta-structures.
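The idea of logical linkages can be illustrated on a toy bibliographic HIN. The sketch below treats two meta-path relations between papers (same venue, shared topic) as boolean predicates and combines them with 'and', 'or', and 'not'. All data and names are hypothetical, and the sketch only checks whether a relation exists; it is not a relevance measure, which is what the thesis would actually have to design.

```python
# Toy HIN (hypothetical data): papers linked to a venue and to topics.
paper_venue = {"p1": "VLDB", "p2": "VLDB", "p3": "KDD"}
paper_topics = {
    "p1": {"graphs"},
    "p2": {"graphs", "mining"},
    "p3": {"mining"},
}

def same_venue(p, q):
    """Meta-path Paper-Venue-Paper: both papers appear at the same venue."""
    return paper_venue[p] == paper_venue[q]

def shared_topic(p, q):
    """Meta-path Paper-Topic-Paper: the papers share at least one topic."""
    return bool(paper_topics[p] & paper_topics[q])

def related(p, constraint):
    """All other papers q for which the given constraint holds."""
    return {q for q in paper_venue if q != p and constraint(p, q)}

# 'and'-linkage, as in a meta-structure: both relations must hold.
def and_link(p, q):
    return same_venue(p, q) and shared_topic(p, q)

# Possible extensions: 'or' and 'not' as further logical constraints.
def or_link(p, q):
    return same_venue(p, q) or shared_topic(p, q)

def topic_but_not_venue(p, q):
    return shared_topic(p, q) and not same_venue(p, q)

print(related("p2", and_link))
print(related("p2", or_link))
print(related("p2", topic_but_not_venue))
```

Even in this small example the three constraints select different sets of papers for `p2`, which hints at how much more precisely a query could be specified with richer linkages.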