Archaeoinformatics - Data Science


CIDS - Center for Interdisciplinary Data Science


  • Services
    • Project management and counselling
    • Explorative pilot projects (3-6 months)
    • Support for data management and analysis
  • Research
    • Data-driven solutions for research questions (Data Science, AI)
    • Development of novel Data Science and AI solutions (DS/AI-Toolbox)
    • Interdisciplinary data management (a science in its own right)
  • Training
    • Research data management in practice (CAU-wide courses)
    • Data Science Skills (hands-on, tools & methodologies, DB management, statistics, reporting etc.)
    • Compliance (adherence to guidelines for data privacy, data security, CAU-regulations etc.)

DAAD PPP with HKU: Motif Discovery in Heterogeneous Information Networks

HIN motifs

In collaboration with Prof. Dr. Reynold Cheng, HKU (The University of Hong Kong).

The main goal of this project is to discover interesting relationships (or “motifs”) between nodes in a large heterogeneous information network (HIN). HINs, which represent complex relationships and interactions among real-world entities, are ubiquitous: examples include bibliographical networks, communication networks, the World Wide Web, and social networks. Methods for discovering the vast amount of knowledge contained in HINs have recently gained attention in computer science, social science, physics, and biology. An HIN not only provides a general, natural, and rich representation of relationships between objects, but also follows a schema that describes important information about it. Specifically, an HIN is a typed graph whose nodes and edges are tagged with “type labels” to indicate their meanings. A motif, essentially a subgraph of an HIN that connects two entities defined on the HIN schema, reveals interesting relationship information about entities and is important to many applications.
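
The notion of a typed graph and of a relationship pattern defined on its schema can be illustrated with a minimal sketch. It uses a linear meta-path (Author, Paper, Author) as the simplest kind of motif. The class `HIN`, the function `metapath_instances`, and all node names and type labels are invented for illustration and are not the project's algorithms.

```python
from collections import defaultdict

class HIN:
    """A tiny typed graph: every node and every edge carries a type label."""

    def __init__(self):
        self.node_type = {}           # node id -> type label
        self.adj = defaultdict(list)  # node id -> [(edge type, neighbour id)]

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, u, etype, v):
        # Stored in both directions so that a pattern can be walked either way.
        self.adj[u].append((etype, v))
        self.adj[v].append((etype, u))

def metapath_instances(hin, start, node_types, edge_types):
    """Return all simple paths from `start` whose types follow the pattern."""
    if hin.node_type.get(start) != node_types[0]:
        return []
    paths = [[start]]
    for etype, ntype in zip(edge_types, node_types[1:]):
        paths = [
            path + [nbr]
            for path in paths
            for (e, nbr) in hin.adj[path[-1]]
            if e == etype and hin.node_type[nbr] == ntype and nbr not in path
        ]
    return paths

# Hypothetical bibliographic network.
hin = HIN()
for name, ntype in [("alice", "Author"), ("bob", "Author"), ("carol", "Author"),
                    ("p1", "Paper"), ("p2", "Paper")]:
    hin.add_node(name, ntype)
hin.add_edge("alice", "writes", "p1")
hin.add_edge("bob", "writes", "p1")
hin.add_edge("bob", "writes", "p2")
hin.add_edge("carol", "writes", "p2")

# Co-authorship expressed as the pattern Author -writes- Paper -writes- Author.
coauthor_paths = metapath_instances(
    hin, "alice", ["Author", "Paper", "Author"], ["writes", "writes"])
```

Here the pattern is checked against type labels only, which is what distinguishes a schema-level motif from a search for specific nodes. General motifs may branch, which this sketch does not cover.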

Research Objectives:

  • Develop effective motif discovery algorithms designed for heterogeneous information networks (HINs).
  • Design online and real-time solutions for motif discovery in HINs.

For more details, please contact Prof. Dr. Matthias Renz or Christian Beth, M.Sc.

DAAD PPP with UIC: Heterogeneous Information Network Management and Analysis


In collaboration with Philip S. Yu, UIC (University of Illinois at Chicago).

Most real-world systems consist of multi-typed entities (objects) with a variety of relationships between interacting or associated objects. These interactions and relationships are naturally represented as information network graphs. Information networks are ubiquitous and well established, with diverse application fields such as publication networks, communication networks, the World Wide Web, and social networks. Today, such networks range in size from hundreds of nodes to millions or even billions. With the rise of data integration, information networks are receiving increasing attention in academia and industry. This development brings new challenges: instead of homogeneous data, i.e. networks with only one type of object or relationship, we are faced with heterogeneous data derived from a variety of sources.

Information networks are often organized as Resource Description Framework (RDF) data. RDF provides a simple way of expressing facts across linked data and is well suited to representing information networks and knowledge graphs. Several distributed and federated RDF systems have emerged to handle the massive amounts of RDF data now available. However, the weak (or missing) structure of information networks and knowledge graphs makes it difficult to develop methods for organizing this data efficiently.

In this project, we consider methods for the efficient organization of Heterogeneous Information Networks (HINs). Compared with traditional (homogeneous) networks, HINs not only provide a more general, natural, and rich representation of relationships between objects and of semantic information, but also follow a graph schema that provides important information about the structure of the graph. Consequently, the problem of understanding the vast amount of information modeled in HINs has received a lot of interest. Although many approaches have been developed to handle RDF data efficiently in the context of general knowledge graphs, standard methods for efficiently organizing knowledge structured as HINs, together with scalable methods for relationship queries, remain subject to future work.
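
As a rough illustration of how RDF expresses facts and how a graph schema can be read off typed data, the following sketch stores facts as subject-predicate-object triples in plain Python. The helper names `match` and `schema` and all identifiers in the data are assumptions made for this example. A real deployment would use an RDF store rather than a list.

```python
# Facts as (subject, predicate, object) triples, the basic shape of RDF data.
# All identifiers below are invented for illustration.
triples = [
    ("alice", "rdf:type", "Author"),
    ("bob", "rdf:type", "Author"),
    ("p1", "rdf:type", "Paper"),
    ("v1", "rdf:type", "Venue"),
    ("alice", "writes", "p1"),
    ("bob", "writes", "p1"),
    ("p1", "publishedIn", "v1"),
]

def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def schema(triples):
    """Derive the type-level schema: which node types each predicate connects."""
    type_of = {s: o for s, p, o in triples if p == "rdf:type"}
    return sorted({
        (type_of[s], p, type_of[o])
        for s, p, o in triples
        if p != "rdf:type" and s in type_of and o in type_of
    })

# A simple relationship query: who wrote p1?
authors_of_p1 = sorted(s for s, _, _ in match(triples, p="writes", o="p1"))

# The schema that a HIN makes explicit and that plain triples leave implicit.
network_schema = schema(triples)
```

The derived schema lists each relationship type once together with the node types it connects, which is the structural information that relationship queries on HINs can exploit.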

Research Objectives:

  • Investigation of methods for the efficient management of Heterogeneous Information Networks.
  • Investigation of scalable methods for meta-structure-based relationship queries and relationship pattern analysis.

For more details, please contact Prof. Dr. Matthias Renz or Christian Beth, M.Sc.

Cross-Domain Marine Data Fusion: Discover Knowledge on Open Ocean Oxygen Minimum Zones

Interdisciplinary project with Prof. Dr. Martin Visbeck, GEOMAR Helmholtz Centre for Ocean Research Kiel and University of Kiel, Ocean Circulation and Climate Dynamics

For questions, please contact Carola Trahms.

From Geological Text to Knowledge (GeoTeK): Investigation of scalable text mining methods for discovering and managing marine geological data

Interdisciplinary project with Prof. Dr. Klaus Wallmann, GEOMAR, FB Marine Biogeochemistry

For more information, please contact Asif Suryani, M.Sc.

Big Exchange: Searching for Knowledge, Social Inequality and Politics behind exceptional Large-Scale Networks in European Prehistory

Interdisciplinary project with Dr. Johanna Hilpert and Dr. Tim Kerig, ROOTS Subcluster Social Inequalities.

This project aims to collect finds of selected materials (e.g. flint, copper, jade) and to present and link them with respect to dating and geographic location. This data provides the basis for discussing the reconstruction of exchange routes for certain materials, tools, and items, or for collections of materials where applicable.

For more details, please contact Steffen Strohm, M.Sc.

CRC 1266 Z2: Data Integration and Development of the CRC's Landscape Archaeology Geoportal (Landman)

Find details about CRC 1266 subproject Z2 here.

For more details, please contact Steffen Strohm, M.Sc.

NeoBzHouses: Merging Databases of Archaeological House Data

Over the course of several years a variety of research projects have been conducted answering questions related to housing in the past. Naturally, these projects focussed on different aspects and created data about houses from varying perspectives.

This project supports ongoing and future research in this area by creating an overview of the data collected so far (e.g. geographical distribution, dating, and house features at several levels of granularity). The overview provides a basis for further research and reduces the risk of redundant data acquisition when a new research project starts.

The data integration and processing effort in this project, which involves data from about nine former and ongoing projects, is part of the larger complex around the Landman Portal project (Z2) in CRC 1266.
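
One common step in this kind of integration is mapping each source's field names onto a shared schema while recording where every record came from. The sketch below shows that step only. The source names, field names, and sample records are hypothetical and do not reflect the actual project databases.

```python
# Hypothetical field mappings: source field name -> common field name.
FIELD_MAPS = {
    "project_a": {"site": "site_name", "lat": "latitude",
                  "lon": "longitude", "len_m": "length_m"},
    "project_b": {"Fundort": "site_name", "y": "latitude",
                  "x": "longitude", "length": "length_m"},
}
COMMON_FIELDS = ["site_name", "latitude", "longitude", "length_m"]

def harmonize(record, source):
    """Translate one source record into the common schema."""
    mapping = FIELD_MAPS[source]
    # Fields a source does not provide stay None instead of being guessed.
    out = {field: None for field in COMMON_FIELDS}
    for key, value in record.items():
        if key in mapping:
            out[mapping[key]] = value
    out["source"] = source  # keep provenance for later checks
    return out

# Invented sample records.
records_a = [{"site": "Site 1", "lat": 54.3, "lon": 10.1, "len_m": 12.5}]
records_b = [{"Fundort": "Site 2", "y": 53.9, "x": 9.8}]

overview = ([harmonize(r, "project_a") for r in records_a]
            + [harmonize(r, "project_b") for r in records_b])
```

Keeping missing values explicit and storing the source with each record makes gaps and overlaps between projects visible in the combined overview.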


For more details, please contact Steffen Strohm, M.Sc.

RadoN+B+SP: Merging Databases of Archaeological Radiocarbon Data

The data integration and processing effort in this project brings together two larger databases storing radiocarbon dates of the Neolithic (RADON) and the Bronze Age (RADON-B) respectively.

This project is part of the larger complex around the Landman Portal project (Z2) in CRC 1266 and also includes the integration of smaller sets of similar data.
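
When several collections of dates are combined, the same measurement may appear in more than one of them. The following sketch merges records on a normalized key and notes every source that contained the record. It assumes that each record carries a laboratory code usable as a key. The field names and all sample values are invented for illustration.

```python
def merge_dates(*sources):
    """Merge (source_name, records) pairs, keeping one entry per lab code."""
    merged = {}
    for source_name, records in sources:
        for rec in records:
            # Normalize the key so that spacing and case do not hide duplicates.
            key = rec["lab_code"].strip().upper()
            if key in merged:
                merged[key]["sources"].append(source_name)
            else:
                merged[key] = {**rec, "lab_code": key, "sources": [source_name]}
    return merged

# Invented sample records; "bp" and "std" stand for an age and its uncertainty.
neolithic = [
    {"lab_code": "XX-1001", "bp": 5200, "std": 40},
    {"lab_code": "XX-1002", "bp": 4100, "std": 35},
]
bronze_age = [
    {"lab_code": "xx-1002 ", "bp": 4100, "std": 35},
    {"lab_code": "XX-2001", "bp": 3300, "std": 30},
]

merged = merge_dates(("RADON", neolithic), ("RADON-B", bronze_age))
```

The first occurrence of a record is kept and later occurrences only extend its source list, so conflicting values between databases would still need a separate review step.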


For more details, please contact Steffen Strohm, M.Sc.

Cost-efficient Ship Routing utilizing Ocean Currents

Details will follow.


For more details, please contact Niko Amann, M.Sc.