Pushing the boundaries of AI-based techniques for clinical data anonymisation and synthetic data generation in rare hematological diseases
Under a global lens, the impact of hematological diseases is staggering.
Unfortunately, as is common for rare diseases, the scarcity and fragmentation of available data prevent researchers from reaching the critical mass required to push forward basic and clinical research, and heavily impact health authorities’ capacity for effective health planning.
Moreover, the underrepresentation of these diseases in coding systems makes patient pathways difficult to trace within healthcare systems, which limits the long-term sustainability of existing and new patient registries established at national and European levels.
This is exactly why precision medicine is key: when we shift our focus to the individual, all diseases become unique.
The European Health Data Space (EHDS) is one of the central building blocks of a strong European Health Union. It aims to create a strong legal framework for the use of health data for research, innovation, public health, policy-making and regulatory purposes. Under strict conditions, researchers, innovators, public institutions or industry will have access to large amounts of high-quality health data, which is crucial for developing life-saving treatments, vaccines or medical devices and for ensuring better access to healthcare and more resilient health systems.
SYNTHEMA shares the vision of the EHDS to create a consistent, trustworthy and efficient framework to use health data to push the boundaries of clinical research, while ensuring full compliance with the EU’s high data protection standards.
SYNTHEMA aims to establish a cross-border health data hub for rare hematological diseases: a space to develop and validate innovative AI-based techniques for clinical data anonymisation and synthetic data generation (SDG). The ultimate ambition is to address the issues around data scarcity and fragmentation to effectively widen the basis for meaningful, GDPR-compliant research in this disease space.
The SYNTHEMA platform will be based on a privacy-preserving federated learning (FL) network, equipped with secure multi-party computation (SMPC) protocols and differential privacy (DP). It will connect health data and academic research centres, industries and SMEs working in rare hematological diseases to advance translational and clinical research and care.
To generate virtual patients that keep patterns and features of real-world data
To facilitate the collaborative training of AI models with no sharing of raw data
For lower-risk, privacy-preserving model aggregation in FL schemes
To set strict limits on the private data each clinical site is allowed to disclose
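To make the interplay of FL, SMPC-style aggregation and DP more concrete, the following is a minimal, illustrative sketch of one federated round. It is not the SYNTHEMA implementation. All function names and parameters (`clip`, `pairwise_masks`, `federated_round`, `max_norm`, `sigma`) are hypothetical. The masking step is a toy stand-in for a real SMPC protocol: production schemes derive masks from pairwise shared secrets and work in modular integer arithmetic, whereas here they come from a single seeded generator for readability. The noise scale `sigma` is not calibrated to any particular privacy budget.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def add_gaussian_noise(update, sigma, rng):
    """DP-style step: perturb each coordinate with Gaussian noise."""
    return [x + rng.gauss(0.0, sigma) for x in update]


def pairwise_masks(n_sites, dim, seed):
    """Toy masks: site i adds a random value that site j subtracts,
    so every mask cancels once all masked updates are summed."""
    rng = random.Random(seed)
    masks = [[0.0] * dim for _ in range(n_sites)]
    for i in range(n_sites):
        for j in range(i + 1, n_sites):
            for k in range(dim):
                m = rng.uniform(-100.0, 100.0)
                masks[i][k] += m
                masks[j][k] -= m
    return masks


def federated_round(site_updates, max_norm=1.0, sigma=0.0, seed=0):
    """Average the sites' model updates without exposing any single one.

    Returns the averaged update and the masked updates the server sees.
    """
    rng = random.Random(seed)
    n_sites = len(site_updates)
    dim = len(site_updates[0])
    masks = pairwise_masks(n_sites, dim, seed + 1)

    masked = []
    for update, mask in zip(site_updates, masks):
        # Each site clips and noises its update locally, then masks it.
        private = add_gaussian_noise(clip(update, max_norm), sigma, rng)
        masked.append([p + m for p, m in zip(private, mask)])

    # The server only sums masked updates; the masks cancel in the total.
    total = [sum(column) for column in zip(*masked)]
    return [t / n_sites for t in total], masked


if __name__ == "__main__":
    updates = [[0.3, 0.4], [0.1, 0.2], [0.5, 0.0]]
    average, seen_by_server = federated_round(updates, sigma=0.0)
    print("averaged update:", average)
    print("what the server sees from site 0:", seen_by_server[0])
```

In this sketch, raw patient data never leaves a site: only clipped, noised and masked model updates are sent, and the server recovers the average alone. Raising `sigma` strengthens the privacy guarantee at the cost of a noisier aggregate.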