Master's Thesis

Temporal Phenotyping of ALS Patients using Machine Learning

André Filipe Fernandes Esteves

Friday, 4 July 2025, 12:00 to 14:00
Online

Password: 561232

Amyotrophic Lateral Sclerosis (ALS) is a fatal neurodegenerative disease characterized by highly heterogeneous and rapidly progressing motor decline. Despite extensive research, predicting ALS progression remains a major clinical challenge, limiting timely interventions such as Non-Invasive Ventilation (NIV) and Percutaneous Endoscopic Gastrostomy (PEG). This thesis addresses this challenge by applying machine learning (ML)-based temporal phenotyping to characterize evolving patient profiles from static and longitudinal clinical data, with the objective of uncovering distinct progression trajectories and improving prognosis prediction.

Specifically, the T-Phenotype algorithm was adapted to the Lisbon ALS Clinic dataset, leveraging phenotypic predictive clustering to group patients by both outcome and disease trajectory similarities. Comprehensive preprocessing strategies were developed to handle irregular clinical event timing, missing data, variable-length time series, and class imbalance. The algorithm identified clinically meaningful phenotypes associated with the need for NIV (endpoint C1) and PEG (endpoint C3). Performance was best for shorter observation windows (6–12 months), and the resulting phenotypes revealed distinct risk profiles aligned with respiratory and bulbar function decline.

Combining the endpoints into a multiclass task (C1+C3) exposed challenges with class imbalance and reduced prediction performance, although it still produced interpretable phenotypes. Temporal analysis of cluster transitions illustrated the model's ability to capture diverse ALS progression patterns dynamically, emphasizing the disease's heterogeneity and the limitations of static classifications. Although the T-Phenotype model achieved satisfactory predictive accuracy and clustering interpretability for the binary endpoints (Hprc metric around 80%), its performance declined for the combined endpoint (Hprc metric around 65%). Additionally, it lacked sensitivity to clinical improvements post-intervention, underscoring the need for richer datasets and methodological rigor.

This work demonstrates the potential of ML-based temporal phenotyping as a valuable tool for understanding ALS progression and supporting personalized prognosis and intervention planning. By integrating temporal dynamics into patient stratification, this approach advances data-driven ALS monitoring and highlights critical considerations for future research in this domain.