Georgi Tancev, PhD

Clinical Data Science

Searching for risk factors of mortality.

Summary

Motivation

In this project, conducted in the research group of Julia Vogt at the University Hospital Basel, I analysed clinical data from children with end-stage renal disease receiving chronic haemodialysis. The goal was to identify predictors of mortality and to assess their relevance using statistical and machine-learning approaches.

The work is directly linked to the publication "Identifying key predictors of mortality in young patients on chronic haemodialysis – a machine learning approach" (NDT, 2021).[1]

My contribution

I was responsible for the full data-science pipeline:

Data preparation and feature engineering

  • Cleaning, structuring, and enhancing the clinical database (demographics, labs, treatment parameters), including derivation of additional features.

Exploratory data analysis

  • Examination of potential risk factors, initial hypothesis generation, and visual pattern analysis.

Machine-learning modelling

  • Application of modern ML methods (e.g., Random Forests, Gradient Boosting) to identify key predictors of mortality and quantify their importance robustly.

Interpretation in clinical context

  • Extraction of the most relevant factors (e.g., albumin, inflammatory markers, dialysis parameters) and contextualisation of results together with clinical collaborators.
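
To make the modelling step more concrete, here is a minimal sketch of how predictors can be ranked with a Random Forest and permutation importance in scikit-learn. It is an illustration only, not the code or the exact method used in the publication: the cohort is synthetic, the feature names (`albumin`, `crp`, `noise`), their distributions, and the effect sizes are all assumptions made up for this example, and the helper functions `make_synthetic_cohort` and `rank_predictors` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def make_synthetic_cohort(n_patients=600, seed=0):
    """Generate a purely synthetic cohort (no real patient data).

    Assumption for illustration: mortality risk rises with low albumin
    and high CRP; the third feature is pure noise.
    """
    rng = np.random.default_rng(seed)
    albumin = rng.normal(38.0, 5.0, n_patients)   # hypothetical values
    crp = rng.lognormal(1.0, 0.8, n_patients)     # hypothetical values
    noise = rng.normal(0.0, 1.0, n_patients)      # unrelated to outcome

    # Logistic risk model built by construction, not estimated from data.
    logit = -0.35 * (albumin - 38.0) + 0.8 * (np.log(crp) - 1.0) - 1.0
    probability = 1.0 / (1.0 + np.exp(-logit))
    died = (rng.random(n_patients) < probability).astype(int)

    X = np.column_stack([albumin, crp, noise])
    return X, died, ["albumin", "crp", "noise"]


def rank_predictors(X, y, names, seed=0):
    """Fit a Random Forest and rank features by permutation importance.

    Importance is the mean drop in held-out ROC AUC when a feature's
    values are shuffled, so it is measured on data the model has not seen.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)

    result = permutation_importance(
        model, X_test, y_test,
        scoring="roc_auc", n_repeats=20, random_state=seed,
    )
    order = np.argsort(result.importances_mean)[::-1]
    return [(names[i], float(result.importances_mean[i])) for i in order]


if __name__ == "__main__":
    X, y, names = make_synthetic_cohort()
    for name, importance in rank_predictors(X, y, names):
        print(f"{name:>8s}  mean AUC drop: {importance:.3f}")
```

Permutation importance on a held-out split is chosen here because it is model-agnostic and less biased towards high-cardinality features than impurity-based importances. In a real clinical setting, repeated cross-validation and a check for correlated predictors would be sensible additions.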

Outcome

The analysis confirmed established risk markers and highlighted additional variables that had been underappreciated. Overall, the ML workflow helped refine clinical risk assessment and identify vulnerable patient subgroups more objectively.

This project illustrates how modern data-science methods can be applied in sensitive clinical environments by combining rigorous modelling, interpretability, and interdisciplinary collaboration between clinical medicine and machine learning.

  1. These results were also presented at a conference.