Archaeoinformatics - Data Science

BA/MA: Scalable Co-Location Mining in Large Protein Databases

contact: Prof. Matthias Renz

Given is a large set of genomes, each covering a set of genes, where genes can belong to specific families (according to their function).

The question at issue is which genes co-occur significantly often in genomes. This question relates to comparing genes among species with similar or different ecological or physiological properties.

In the context of this question, the aim of this thesis is to develop algorithms and methods that efficiently support the identification of co-occurrence patterns in gene/genome datasets.
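
To make the co-occurrence question concrete, the following is a minimal sketch of one possible baseline, not the method the thesis is expected to deliver. It assumes each genome is given as a set of gene-family labels and uses a one-sided hypergeometric test to ask whether two families share more genomes than expected by chance. The function names (hypergeom_tail, cooccurring_pairs), the threshold alpha, and the toy data are all hypothetical. The sketch enumerates all pairs and applies no multiple-testing correction, so it would not scale to large databases as written.

```python
from itertools import combinations
from math import comb


def hypergeom_tail(n_total, n_a, n_b, n_both):
    """P(X >= n_both) if the n_b genomes holding family B were drawn at
    random from n_total genomes, n_a of which hold family A."""
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return favourable / comb(n_total, n_b)


def cooccurring_pairs(genomes, alpha=0.05):
    """Return (family_a, family_b, shared_genomes, p_value) for every pair
    whose p-value is at most alpha, sorted by p-value."""
    n_total = len(genomes)
    # Invert the input: for each family, the set of genomes that contain it.
    support = {}
    for genome_id, families in genomes.items():
        for family in families:
            support.setdefault(family, set()).add(genome_id)

    results = []
    # Naive enumeration of all pairs (quadratic in the number of families).
    for a, b in combinations(sorted(support), 2):
        n_both = len(support[a] & support[b])
        p = hypergeom_tail(n_total, len(support[a]), len(support[b]), n_both)
        if p <= alpha:
            results.append((a, b, n_both, p))
    return sorted(results, key=lambda r: r[3])


# Hypothetical toy data: genome id -> set of gene-family labels.
toy_genomes = {
    "g1": {"famA", "famB", "famC"},
    "g2": {"famA", "famB"},
    "g3": {"famA", "famB"},
    "g4": {"famA", "famB"},
    "g5": {"famC"},
    "g6": {"famD"},
    "g7": {"famD"},
    "g8": {"famD"},
}

pairs = cooccurring_pairs(toy_genomes)
for a, b, shared, p in pairs:
    print(f"{a} and {b} co-occur in {shared} genomes (p = {p:.4f})")
```

On real data, the quadratic pair enumeration and the missing correction for multiple tests are exactly the points where more scalable and statistically careful methods would be needed.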