Synthema

Barcelona, Vall d’Hebron Institut de Recerca, 16-17 January 2024

In a significant step forward for medical research in rare hematological diseases, the SYNTHEMA consortium convened for its M14 meeting at the Vall d’Hebron Institut de Recerca in Barcelona last week. This two-day event, held on the 16th and 17th of January 2024, brought together leading experts and researchers to discuss progress in establishing a cross-border hub for the development and validation of Artificial Intelligence (AI) techniques focused on anonymization and synthetic data generation.

The consortium, a collaboration of various European entities, aims to leverage AI in combating rare hematological diseases. Its primary objectives are to anonymize patient data securely and generate synthetic datasets, addressing the challenge of data scarcity in rare disease research.

Overview of Technical and Clinical Work Packages Progress

Each Work Package (WP) group presented their latest achievements and developments, effectively illustrating the consortium’s comprehensive and multidisciplinary approach. These presentations highlighted the diverse range of expertise and collaborative efforts within the consortium, from advancements in AI technology and data management strategies to pioneering methods for ensuring privacy and ethical compliance.

Work Package 1 (WP1) – Data collection, harmonisation and interoperability

  • Data Strategy and Harmonization: Established a comprehensive data strategy for Sickle Cell Disease (SCD) and Acute Myeloid Leukemia (AML), ensuring GDPR compliance and ethical standards.
  • Clinical Use Case Development: Developed AI tools for MRI analysis in SCD and strategies for patient categorization using Synthetic Data Generation (SDG).
  • Advancements in AML Research: Enhanced dataset quality, especially for elderly patients, and developed AI solutions for treatment outcome forecasting.

Work Package 2 (WP2) – Federated learning platform development

  • Federated Learning Infrastructure: Focused on building the federated learning (FL) infrastructure, with clinical centers acting as distributed nodes, enabling model training across multiple locations.
  • Privacy-Preserving Protocols: Enhanced Secure Multi-Party Computation (SMPC) and Differential Privacy (DP) protocols for model aggregation and for protecting computational outputs.
  • Research and Development: Conducted research on SMPC and DP, investigating platforms such as OpenMined’s PySyft and SyMPC.
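
To illustrate the kind of aggregation step a federated learning platform performs, the following is a minimal sketch in plain Python, not the consortium's implementation. It assumes each clinical node sends a model update as a list of numbers, averages the updates weighted by local sample counts, and adds Gaussian noise after clipping in the spirit of differential privacy. The function names and the `max_norm` and `noise_std` parameters are hypothetical, the noise is not calibrated to any formal privacy budget, and no PySyft or SyMPC API is used.

```python
import random

def clip(update, max_norm):
    """Scale an update vector so its L2 norm is at most max_norm."""
    norm = sum(x * x for x in update) ** 0.5
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]

def federated_average(updates, sizes):
    """Weighted mean of per-node updates (weights = local sample counts)."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total
            for i in range(dim)]

def noisy_aggregate(updates, sizes, max_norm, noise_std, rng):
    """Clip each node's update, average, then add Gaussian noise.

    Assumption: noise_std is chosen by the caller; this sketch does not
    derive it from an (epsilon, delta) privacy budget.
    """
    clipped = [clip(u, max_norm) for u in updates]
    avg = federated_average(clipped, sizes)
    return [x + rng.gauss(0.0, noise_std) for x in avg]

# Hypothetical example: two nodes with 1 and 3 local samples.
node_updates = [[1.0, 2.0], [3.0, 4.0]]
node_sizes = [1, 3]
aggregated = noisy_aggregate(node_updates, node_sizes,
                             max_norm=10.0, noise_std=0.1,
                             rng=random.Random(0))
```

In this sketch only the clipped, averaged, and noised vector leaves the aggregation step. A secure multi-party computation protocol would additionally hide the individual node updates from the aggregator, which is not shown here.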

Work Package 3 (WP3) – Data anonymisation and synthetic data generation pipelines

  • Pipeline Development: Created a pipeline for data asset generation, focusing on data anonymization and SDG approaches.
  • Anonymization and SDG Proof of Concepts: Developed workflows for anonymization and tested concepts for SDG engines.
  • In-silico Modelling: Produced early-stage implementations of centralized and distributed causal inference models.
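
The article does not say which anonymization techniques the WP3 workflows use. As one common illustration, the sketch below checks k-anonymity, meaning that every combination of quasi-identifier values is shared by at least k records, before and after generalizing exact ages into bands. The record fields, the `generalize_age` helper, and the choice of k-anonymity itself are assumptions made for the example.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same
    quasi-identifier values (0 for an empty dataset)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0

def generalize_age(record, width=10):
    """Return a copy of the record with the exact age replaced by a band,
    for example 67 becomes '60-69' when width is 10."""
    low = (record["age"] // width) * width
    out = dict(record)
    out["age"] = f"{low}-{low + width - 1}"
    return out

# Hypothetical patient records; field names are illustrative only.
patients = [
    {"age": 61, "sex": "F"},
    {"age": 67, "sex": "F"},
    {"age": 72, "sex": "M"},
    {"age": 75, "sex": "M"},
]

k_before = k_anonymity(patients, ["age", "sex"])
k_after = k_anonymity([generalize_age(p) for p in patients], ["age", "sex"])
```

With exact ages every record is unique, so the smallest group has one member. After banding, each group contains two records, at the cost of coarser age information.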

Work Package 4 (WP4) – Clinical validation and statistical utility assessment 

  • Validation Objective: Focused on validating the clinical and statistical utility of synthetic data.
  • Validation Pipelines: Developed validation pipelines for imaging and genomics based on feature extraction and clinical endpoint prediction.

Work Package 5 (WP5) – Data Protection and Privacy Assessment

  • Legal and Regulatory Compliance: Addressed key compliance aspects, including data flow diagrams and data management plans.
  • Privacy Risk Assessment: Implemented a qualitative risk classification framework and identified privacy risk metrics for synthetic data.
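
The specific privacy risk metrics identified by WP5 are not listed in this report. One metric often used for synthetic tabular data is the distance to closest record (DCR), which measures how near each synthetic row lies to any real row. The sketch below is a plain-Python illustration under the assumption that rows are numeric vectors on comparable scales; the function names and the tolerance are hypothetical.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def distance_to_closest_record(synthetic, real):
    """For each synthetic row, the distance to its nearest real row."""
    return [min(euclidean(s, r) for r in real) for s in synthetic]

def exact_copy_rate(synthetic, real, tol=1e-9):
    """Fraction of synthetic rows that coincide with a real row
    (distance within tol), a simple signal of memorized records."""
    dcr = distance_to_closest_record(synthetic, real)
    return sum(d <= tol for d in dcr) / len(dcr)

# Hypothetical data: the first synthetic row duplicates a real row.
real_rows = [[0.0, 0.0], [3.0, 4.0]]
synthetic_rows = [[0.0, 0.0], [3.0, 0.0]]

dcr_values = distance_to_closest_record(synthetic_rows, real_rows)
copy_rate = exact_copy_rate(synthetic_rows, real_rows)
```

Very small distances suggest that a generator may be reproducing real patients rather than producing new records, which is why such a metric can feed into a qualitative risk classification.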

The SYNTHEMA consortium’s M14 meeting concluded with a forward-looking joint session with representatives from the sister project Genomed4All. The collaboration is strategic: Genomed4All, which is due to conclude within a year, shares several objectives, as well as many consortium members, with SYNTHEMA. The joint session centered on the transition of operations, networks, and knowledge from Genomed4All to SYNTHEMA. A key focus of this transition is the educational program, which has been a cornerstone of Genomed4All’s success. The two consortia discussed strategies for transferring and adapting this educational framework to SYNTHEMA’s context, ensuring the continuity and further development of this resource. This collaborative effort highlights the consortium’s commitment to sustainability and to the ongoing advancement of research and education in the field of rare hematological diseases.