Data management platform for biomarker discovery and research

Industry

Biotechnology & Healthcare

Technologies

React, Python, PostgreSQL, Docker, Redis

Country

Germany

Client Overview

Centogene is a biotechnology company specializing in genetic diagnostics for rare diseases, biomarker discovery, and clinical trial support. It provides genetic testing services, identifies and validates biomarkers, and helps recruit patients with specific rare genetic conditions for clinical trials. The platform helps researchers manage mass spectrometry-based metabolomics experiments: it organizes experimental data hierarchically, supports quality control, and applies data processing techniques such as drift correction, peak mapping, and statistical analysis to support biomarker discovery.

Client Needs

Clear Data Visualization

Quality Control Integration

Custom Machine Learning Algorithms

Statistical Analysis Tools

Centogene needed a system to visualize, manage, and analyze mass spectrometry-based metabolomics data. It had to handle complex experimental data structures, support quality control, and run the data processing techniques that biomarker discovery needs.

Services Provided

Data Processing Pipeline: Implemented algorithms for drift correction, peak mapping, and data normalization.

Quality Control Integration: Enabled tracking and management of QC samples and flagged data anomalies.

Custom Machine Learning Algorithms: Developed specialized ML algorithms for clustering and feature selection in metabolomics data.

Statistical Analysis Tools: Integrated tools for calculating RSD, detection rates, and other key metrics.
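
As an illustration of the metrics named above, here is a minimal sketch of how RSD (relative standard deviation) and a detection rate could be computed for one metabolite feature. The function names, the use of the sample standard deviation, and the "present and above a threshold" rule for detection are assumptions made for this example, not details of the delivered platform.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def rsd_percent(values: Sequence[float]) -> float:
    """Relative standard deviation: sample SD divided by the mean, in percent."""
    if len(values) < 2:
        raise ValueError("RSD needs at least two values")
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return stdev(values) / abs(m) * 100.0


def detection_rate(
    intensities: Sequence[Optional[float]], threshold: float = 0.0
) -> float:
    """Fraction of samples where the feature is present and above the threshold.

    A missing reading is represented as None (an assumption of this sketch).
    """
    if not intensities:
        raise ValueError("detection rate needs at least one sample")
    detected = sum(1 for v in intensities if v is not None and v > threshold)
    return detected / len(intensities)


# Hypothetical QC intensities for a single feature.
qc_values = [90.0, 100.0, 110.0]
print(round(rsd_percent(qc_values), 2))

# Hypothetical study-sample readings, with one missing value and one zero.
print(detection_rate([1.0, None, 2.0, 0.0]))
```

A low RSD across repeated QC injections suggests the feature is measured reproducibly, and a detection rate gives a simple basis for filtering features that are rarely observed.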

Scope of Work

  1. Designed a hierarchical data model to represent experimental structures including batches, measurements, and samples.

  2. Implemented quality control mechanisms to track and manage QC samples and data anomalies.

  3. Developed data processing algorithms for drift correction, peak mapping, and normalization.

  4. Developed custom machine learning algorithms for clustering and feature selection on metabolomics data.

  5. Implemented statistical analysis tools to calculate key metrics such as RSD and detection rates.
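
The hierarchical data model from the first item can be pictured with a small sketch. The exact nesting is an assumption made here for illustration: a batch holds measurements in injection order, and each measurement refers to a sample that may be flagged as a QC sample. All class and field names are hypothetical and do not reflect the actual PostgreSQL schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Sample:
    """A biological or QC sample (hypothetical fields)."""
    sample_id: str
    is_qc: bool = False


@dataclass
class Measurement:
    """One instrument injection of a sample within a batch."""
    sample: Sample
    injection_order: int
    # Feature name -> measured intensity.
    intensities: Dict[str, float] = field(default_factory=dict)


@dataclass
class Batch:
    """A run of measurements acquired together."""
    batch_id: str
    measurements: List[Measurement] = field(default_factory=list)

    def qc_measurements(self) -> List[Measurement]:
        """Return QC measurements sorted by injection order."""
        qc = [m for m in self.measurements if m.sample.is_qc]
        return sorted(qc, key=lambda m: m.injection_order)


# Build a tiny example batch: the pooled QC sample is injected twice.
pooled_qc = Sample("QC-pool", is_qc=True)
patient = Sample("S-001")
batch = Batch(
    "B-01",
    [
        Measurement(pooled_qc, 3, {"feature_a": 105.0}),
        Measurement(patient, 2, {"feature_a": 80.0}),
        Measurement(pooled_qc, 1, {"feature_a": 100.0}),
    ],
)
print([m.injection_order for m in batch.qc_measurements()])
```

Keeping the QC flag on the sample rather than on the measurement means that repeated injections of the same pooled QC are grouped naturally, which is convenient for the quality control and drift correction steps.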

Technologies Used

React: Used to build the frontend user interface, giving researchers an interface that is easy to navigate and interact with.

Python: Used to develop the data processing algorithms and the system backend.

PostgreSQL: Used for storing and managing structured experimental data.

Docker: Containerized applications for consistent deployment across environments.

Redis: Used as a message broker for background tasks and as a cache.

Development Process

The project commenced with an in-depth analysis of Centogene's requirements for managing mass spectrometry data. We designed a hierarchical data model to represent batches, measurements, and samples. The development focused on integrating quality control processes, implementing data processing algorithms, and ensuring secure user access through API tokens. Custom machine learning algorithms were developed to enhance biomarker discovery capabilities.
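
To give a sense of what a drift correction step involves, the sketch below fits a straight line to QC intensities over injection order and rescales every measurement so that the QC trend becomes flat. The linear fit is a deliberately simple stand-in chosen for this example (QC-based correction in practice often uses smoother curves such as LOESS or splines); the method used on the project is not specified here, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_drift(
    orders: Sequence[float],
    intensities: Sequence[float],
    qc_mask: Sequence[bool],
) -> List[float]:
    """Divide each intensity by the QC trend and rescale to the QC mean."""
    qc_x = [o for o, is_qc in zip(orders, qc_mask) if is_qc]
    qc_y = [v for v, is_qc in zip(intensities, qc_mask) if is_qc]
    slope, intercept = fit_line(qc_x, qc_y)
    target = sum(qc_y) / len(qc_y)
    corrected = []
    for order, value in zip(orders, intensities):
        trend = slope * order + intercept
        if trend <= 0:
            raise ValueError("fitted trend must stay positive")
        corrected.append(value * target / trend)
    return corrected


# Hypothetical feature whose signal rises over the run; QC at orders 1, 3, 5.
orders = [1, 2, 3, 4, 5]
raw = [100.0, 55.0, 120.0, 65.0, 140.0]
is_qc = [True, False, True, False, True]
print([round(v, 2) for v in correct_drift(orders, raw, is_qc)])
```

After correction, the QC injections in this toy example share one intensity and the two study samples, which differed only because of drift, agree with each other.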
