The Synthema glossary is a document that collects the clinical and technical terms frequently used in the project.
As a European-funded project and part of the Horizon Europe program, Synthema is backed by 16 partners from 10 countries who have joined forces to fight rare hematological diseases.
This glossary aims to provide a comprehensive resource that helps our partners understand the clinical and technical language used internally, and helps the general audience understand our work.
Definition
Synthetic data is artificially generated information that mimics real health data, used for research, testing, and privacy protection purposes in healthcare.
Explanation
Synthetic health data refers to artificially generated data that mimics real health data. It is generated through a combination of algorithms, statistical models, and other mathematical techniques, and is used for a variety of purposes, including research and development, testing and validation of health information systems, and as a means of preserving the privacy and confidentiality of actual patient data. Synthetic health data enables the simulation of real-world scenarios and conditions in a controlled, ethical, and safe environment, providing valuable insights and enabling innovation in the field of healthcare and medical research.
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Lymphoid organs are a group of organs that make up the human body’s immune system.
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Coagulation factors are proteins in the blood that are critical for the process of blood clotting.
Explanation
Coagulation factors are proteins in the blood that are critical for the process of blood clotting, which is essential for preventing excessive bleeding after injury or surgery. Understanding coagulation factors is crucial for managing bleeding disorders such as hemophilia, and for developing treatments for conditions such as deep vein thrombosis.
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Hematological malignancies are a group of blood cancers.
Explanation
Hematological malignancies, also known as blood cancers, are a group of diseases that affect the blood, bone marrow, and lymphatic system. Types of hematological malignancies include leukemia, lymphoma, and multiple myeloma, among others. While they can be challenging to diagnose and treat, advancements in medicine are helping more people with these conditions live longer, healthier lives.
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Cytogenetics is the branch of genetics that studies the structure and function of chromosomes, which are the structures that carry our genetic information.
Explanation
By analyzing chromosomal abnormalities and variations, cytogenetics plays a critical role in diagnosing and treating a wide range of genetic disorders and diseases, including cancer. With advances in technology and techniques, cytogenetics continues to be a rapidly evolving and exciting field at the forefront of genetic research and personalized medicine.
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Genomic data refers to information about the genome, the complete set of genetic material, or DNA, that makes up an organism.
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Hormonal postulating is the process of proposing a theory or hypothesis about the role of hormones in physiological processes.
Explanation
This field of research is critical for understanding how hormones impact human health and behavior, and can lead to the development of new treatments for hormonal imbalances and related disorders.
Definition
Sickle-cell disease (SCD) is an inherited chronic life-threatening disorder caused by the presence of abnormal adult haemoglobin S.
Explanation
In patients with SCD, the red blood cells become rigid and sickle-shaped, breaking down or blocking normal blood circulation. This results in haemolytic anaemia and in recurrent occlusion of the small vessels with ischemia, known as a vaso-occlusive event (VOE), with consequent acute and chronic pain and a wide range of progressive organ-specific clinical complications.
Reference
Definition
Acute myeloid leukemia (AML) is a cancer of the blood and bone marrow.
Explanation
Acute myeloid leukemia (AML) is a type of blood cancer that starts from young white blood cells in the bone marrow. These cells, called granulocytes or monocytes, grow and divide too fast, so the bone marrow produces them too quickly.
Reference
Definition
Multiple myeloma, also known as myeloma, is a type of bone marrow cancer.
Explanation
Multiple myeloma (MM) is a type of cancer that affects the plasma cells, which are a type of white blood cell that produces antibodies to help fight infections. In multiple myeloma, the plasma cells become abnormal and grow uncontrollably, leading to the formation of tumors in the bone marrow and other parts of the body. This can cause a number of symptoms, such as bone pain, fatigue, and an increased risk of infections. Multiple myeloma is a relatively rare form of cancer, but it is considered to be incurable and can be life-threatening if not treated promptly and effectively.
Reference
Definition
Federated learning is a way to train AI models without anyone seeing or touching your data, offering a way to unlock information to feed new AI applications.
Explanation
Federated Learning (FL) is a machine learning technique in which the model training is performed on decentralized data sources, such as multiple edge devices or cloud-based systems, instead of a centralized data center. In this method, each device trains a local model on its own data, and then sends updates to a central server to be combined into a global model. This process helps maintain data privacy and security as the raw data never leaves the device and only model parameters are exchanged. Federated Learning can be useful in scenarios where data is distributed across multiple sources, as it enables the creation of a single, accurate model without the need for centralizing sensitive or large amounts of data.
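The round-based process described above can be sketched in a few lines. This is a minimal illustration, assuming a weighted-average aggregation rule (often called federated averaging) and a toy linear model; the function names, the learning rate, and the three simulated clients are all hypothetical and not taken from the Synthema platform.

```python
import random

def local_update(weights, data, lr=0.1, epochs=20):
    """One client's training: fit y = w*x + b by gradient descent on its own data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Server step: average the client parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

rng = random.Random(0)

def make_client(n):
    # Hypothetical local dataset following y = 2x + 1 plus a little noise.
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.05)) for x in xs]

clients = [make_client(30), make_client(50), make_client(20)]

global_weights = (0.0, 0.0)
for _ in range(10):
    # Only parameters travel to the server; each client's raw data stays local.
    updates = [local_update(global_weights, data) for data in clients]
    global_weights = federated_average(updates, [len(d) for d in clients])

print(global_weights)
```

After a few rounds the global parameters should approach the slope and intercept used to simulate the data, even though the server never sees an individual record.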
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Probabilistic machine learning (PML) refers to computational methods that use randomness or probability to find solutions to complex problems.
Explanation
Probabilistic machine learning (PML) refers to computational methods that use randomness or probability to find solutions to complex problems. These algorithms are used in many fields, including computer science, finance, and engineering, where they help us optimize processes and make predictions in the face of uncertainty. However, probabilistic machine learning methods can also pose challenges, as they require careful calibration and validation to ensure that the results are accurate and reliable.
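One small instance of reasoning under uncertainty is a Bayesian update of a rate. The sketch below assumes a Beta prior and binomial observations (a standard textbook pairing, chosen here only for illustration); the counts are hypothetical.

```python
def beta_posterior(alpha, beta, successes, failures):
    """Update a Beta(alpha, beta) prior with observed successes and failures."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Expected value of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution; smaller means more certainty."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))

prior = (1.0, 1.0)  # uniform prior: every rate between 0 and 1 equally plausible
posterior = beta_posterior(*prior, successes=7, failures=3)  # hypothetical counts

print(posterior, beta_mean(*posterior), beta_variance(*posterior))
```

The output is a distribution rather than a single number, which is what lets the method report how uncertain its estimate still is.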
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Spatial statistics is the study of data that has both a geographical location and numerical values attached to it.
Explanation
It involves analyzing and interpreting data in a spatial context, using techniques like spatial autocorrelation, kriging, and spatial regression. With the rise of GIS technology and big data, spatial statistics is becoming increasingly important in fields like urban planning, environmental management, and public health.
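Spatial autocorrelation, one of the techniques named above, is commonly measured with Moran's I. The sketch below assumes a binary neighbour matrix and a hypothetical row of six adjacent regions; values near +1 indicate that similar values cluster together, values near -1 that neighbours tend to differ.

```python
def morans_i(values, weights):
    """Moran's I for values observed at n locations, given an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_total) * cross / sum(d * d for d in dev)

def chain_weights(n):
    """Hypothetical layout: regions in a row, each one a neighbour of the next."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

w = chain_weights(6)
smooth_gradient = [1, 2, 3, 4, 5, 6]   # neighbours have similar values
checkerboard = [1, 6, 1, 6, 1, 6]      # neighbours have opposite values

print(morans_i(smooth_gradient, w))
print(morans_i(checkerboard, w))
```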
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Generative models are a class of machine learning models that are designed to generate new data that is similar to the training data.
Explanation
These models are used in a variety of applications, such as image and speech recognition, language translation, and music composition. The goal of a generative model is to learn the underlying probability distribution of the training data and then use this knowledge to generate new data that is statistically similar to the original data. Generative models can be trained using a variety of techniques, including neural networks, Gaussian mixture models, and variational autoencoders. One of the key advantages of generative models is that they can be used to create synthetic data, which can be useful in situations where real data is scarce or difficult to obtain.
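The idea of learning a distribution and then sampling from it can be shown with the simplest possible generative model, a single Gaussian. This is only a sketch under that assumption; the measurements are hypothetical, and practical synthetic data generators use far richer models.

```python
import random
import statistics

# Hypothetical real measurements (for example, haemoglobin in g/dL).
observed = [13.2, 14.1, 12.8, 15.0, 13.7, 14.4, 12.5, 13.9]

# "Training": estimate the parameters of the assumed distribution.
mu = statistics.mean(observed)
sigma = statistics.stdev(observed)

# "Generation": draw new values from the learned distribution.
rng = random.Random(42)
synthetic = [rng.gauss(mu, sigma) for _ in range(1000)]

print(round(mu, 2), round(sigma, 2))
print(round(statistics.mean(synthetic), 2), round(statistics.stdev(synthetic), 2))
```

The synthetic values are statistically similar to the observed ones without being copies of any individual measurement.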
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Differential privacy (DP) is a concept in data privacy that aims to protect sensitive information in a dataset while still allowing useful analysis to be performed on the data.
Explanation
It is achieved by adding random noise to the data in a way that preserves the overall statistical properties of the dataset, but prevents individual data points from being linked to specific individuals. The idea is that if a user can’t tell whether a particular individual’s data is included in the dataset or not, then the privacy of that individual is protected. Differential privacy has become increasingly important with the rise of big data and machine learning, where the risk of sensitive information being leaked is high. It is often used in settings such as health care, finance, and government, where privacy is critical.
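A common way to add such noise is the Laplace mechanism applied to a query result. The sketch below assumes a counting query (sensitivity 1) and a hypothetical privacy budget `epsilon`; the ages are made up, and the noise is drawn as the difference of two exponential samples, which follows a Laplace distribution.

```python
import random

def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of records matching predicate (Laplace mechanism)."""
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1  # adding or removing one person changes a count by at most 1
    scale = sensitivity / epsilon
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise

ages = [34, 61, 47, 72, 58, 66, 29]  # hypothetical patient ages
rng = random.Random(7)
noisy = private_count(ages, lambda a: a >= 60, epsilon=0.5, rng=rng)

print(noisy)
```

A smaller `epsilon` means more noise and stronger privacy; a larger one gives more accurate answers with weaker protection.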
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Secure multi-party computation (SMPC) is a cryptographic technique that enables multiple parties to jointly compute a function on their private inputs without revealing their inputs to each other.
Explanation
Secure multi-party computation (SMPC) is a cryptographic technique that enables multiple parties to jointly compute a function on their private inputs without revealing their inputs to each other. In other words, each party can input their private data into a computation, and the computation can be executed in a way that ensures privacy for each party. The computation results are then made available to all parties, while keeping each party’s input private. This technique is often used in scenarios where data privacy is crucial, such as in financial or health care settings, where multiple parties need to collaborate on a computation while keeping their inputs confidential. Secure MPC is achieved through the use of cryptographic protocols, such as secret sharing and homomorphic encryption, which enable the parties to securely compute the function without compromising data privacy.
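Secret sharing, mentioned above, can be illustrated with additive shares. In this minimal sketch three hypothetical hospitals compute the sum of their private counts: each splits its value into random shares, and only sums of shares are ever revealed. The modulus and party count are illustrative assumptions, and a real protocol would also need secure channels and protection against dishonest parties.

```python
import secrets

PRIME = 2**61 - 1  # all arithmetic is done modulo this (illustrative choice)

def share(value, n_parties):
    """Split value into n random-looking shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine shares; any incomplete subset reveals nothing about the value."""
    return sum(shares) % PRIME

inputs = [12, 7, 30]  # hypothetical private counts, one per hospital
all_shares = [share(v, 3) for v in inputs]

# Party j receives the j-th share from every hospital and adds them locally.
partial_sums = [sum(s[j] for s in all_shares) % PRIME for j in range(3)]

# Publishing only the partial sums reveals the total, not the individual inputs.
total = reconstruct(partial_sums)
print(total)
```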
Reference
https://edps.europa.eu/press-publications/publications/techsonar/synthetic-data_en
Definition
Privacy-preserving federated learning is a machine learning technique that allows multiple parties to collaborate and train a machine learning model without sharing their raw data with each other.
Explanation
Privacy-preserving federated learning is a machine learning technique that allows multiple parties to collaborate and train a machine learning model without sharing their raw data with each other. The process involves training a model on a subset of each party’s data and aggregating the model updates while keeping the raw data private. This approach provides a privacy-preserving way to improve the accuracy of machine learning models without compromising the privacy of the individual data sources.
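One way to aggregate updates without exposing any single party's contribution is pairwise masking: each pair of parties agrees on a random mask that one adds and the other subtracts, so the masks cancel in the sum. The sketch below is a simplification under stated assumptions; the update vectors are hypothetical, and real secure aggregation protocols derive masks from shared keys, use modular arithmetic, and handle parties that drop out.

```python
import random

def masked_updates(updates, rng):
    """Hide each party's update behind pairwise masks that cancel in the sum."""
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.uniform(-100, 100) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] += mask[k]  # party i adds the shared mask
                masked[j][k] -= mask[k]  # party j subtracts the same mask
    return masked

def aggregate(masked):
    """Server averages the masked updates; the masks cancel out."""
    n = len(masked)
    return [sum(column) / n for column in zip(*masked)]

updates = [[0.2, -0.1], [0.4, 0.3], [0.0, 0.1]]  # hypothetical model updates
masked = masked_updates(updates, random.Random(3))
average = aggregate(masked)

print(masked[0])  # looks like noise to the server
print(average)    # close to the true average [0.2, 0.1]
```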
Reference
Definition
Data standardization is the critical process of bringing data into a common format that allows for collaborative research, large-scale analytics, and sharing of sophisticated tools and methodologies.
Explanation
Data standardization is the critical process of bringing data into a common format that allows for collaborative research, large-scale analytics, and sharing of sophisticated tools and methodologies.
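As a small illustration, the sketch below brings records from two hypothetical sites into one format: haemoglobin in g/dL and dates in ISO 8601. The field names and the accepted input formats are assumptions made for the example.

```python
from datetime import datetime

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # input formats assumed for this example

def standardize(record):
    """Convert one source record to a common unit and date format."""
    value, unit = record["hb"], record["unit"]
    if unit == "g/L":
        value = value / 10  # 10 g/L equals 1 g/dL
    elif unit != "g/dL":
        raise ValueError(f"unknown unit: {unit}")

    for fmt in DATE_FORMATS:
        try:
            date = datetime.strptime(record["date"], fmt).date()
            break
        except ValueError:
            continue
    else:
        raise ValueError(f"unknown date format: {record['date']}")

    return {"hb_g_dl": round(value, 1), "date": date.isoformat()}

site_a = {"hb": 135, "unit": "g/L", "date": "03/02/2023"}
site_b = {"hb": 13.5, "unit": "g/dL", "date": "2023-02-03"}

print(standardize(site_a))
print(standardize(site_b))
```

Once both records share a format, they can be pooled and analyzed with the same tools.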
Reference
https://www.sciencedirect.com/topics/computer-science/data-standardization
Definition
The FHIR (Fast Healthcare Interoperability Resources) specification is a standard for exchanging healthcare information electronically.
Explanation
Healthcare records are increasingly becoming digitized. As patients move around the healthcare ecosystem, their electronic health records must be available, discoverable, and understandable. Further, to support automated clinical decision support and other machine-based processing, the data must also be structured and standardized. FHIR aims to simplify implementation without sacrificing information integrity. It leverages existing logical and theoretical models to provide a consistent, easy to implement, and rigorous mechanism for exchanging data between healthcare applications. FHIR has built-in mechanisms for traceability to the HL7 RIM and other important content models.
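In FHIR, data is exchanged as structured resources, commonly serialized as JSON. The sketch below builds a minimal Patient resource and reads it back; the element names follow the FHIR Patient resource, while the identifier, name, and dates are hypothetical, and the `summarize` helper is only an illustration, not part of any FHIR library.

```python
import json

# A minimal Patient resource; all values are made up for illustration.
patient = {
    "resourceType": "Patient",
    "id": "example-001",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "gender": "female",
    "birthDate": "1980-04-12",
}

payload = json.dumps(patient, indent=2)  # what one application would send

def summarize(resource_json):
    """Receiving side: parse the resource and read its standardized fields."""
    resource = json.loads(resource_json)
    if resource.get("resourceType") != "Patient":
        raise ValueError("expected a Patient resource")
    name = resource["name"][0]
    full_name = " ".join(name["given"]) + " " + name["family"]
    return f"{full_name} ({resource['gender']}, born {resource['birthDate']})"

summary = summarize(payload)
print(summary)
```

Because both sides agree on the structure, the receiver can interpret the record without knowing anything about the sender's internal database.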
Reference
Definition
The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is an open community data standard, designed to standardize the structure and content of observational data and to enable efficient analyses that can produce reliable evidence.
Explanation
The concept behind this approach is to transform data contained within disparate observational databases into a common format (data model) as well as a common representation (terminologies, vocabularies, coding schemes), and then perform systematic analyses using a library of standard analytic routines that have been written based on the common format.
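The two steps, transforming to a common format and then running one shared analysis, might look like the sketch below. The source layouts are hypothetical; the target column names follow the OMOP PERSON table, and the gender concept identifiers shown should be checked against the vocabulary release in use.

```python
from collections import Counter

# Standard concept identifiers for gender (verify against your vocabulary version).
GENDER_CONCEPTS = {"M": 8507, "F": 8532}
NO_MATCHING_CONCEPT = 0

def to_omop_person(person_id, sex_code, birth_date_iso):
    """Map one source record to a row shaped like the OMOP PERSON table."""
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(sex_code, NO_MATCHING_CONCEPT),
        "year_of_birth": int(birth_date_iso[:4]),
        "gender_source_value": sex_code,
    }

# Two hypothetical source databases with different layouts.
hospital_a = [{"sex": "F", "dob": "1975-06-01"}, {"sex": "M", "dob": "1982-11-20"}]
hospital_b = [("female", 1990), ("female", 1968)]

persons = [to_omop_person(i + 1, r["sex"], r["dob"])
           for i, r in enumerate(hospital_a)]
persons += [to_omop_person(len(persons) + i + 1, sex[0].upper(), f"{year}-01-01")
            for i, (sex, year) in enumerate(hospital_b)]

def count_by_gender(rows):
    """A 'standard analytic routine': written once against the common format."""
    return Counter(row["gender_concept_id"] for row in rows)

print(count_by_gender(persons))
```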
Reference
Definition
Morphological analysis is the study of the form and structure of objects, organisms, and systems.
Explanation
Morphological analysis is the study of the form and structure of objects, organisms, and systems. It involves analyzing their shape, size, and other physical characteristics, as well as their function and behavior. Morphological analysis is used in a variety of fields, including biology, linguistics, and engineering. By understanding the morphology of a system or object, we can better understand how it works and how it can be improved.
Reference
https://www.techtarget.com/whatis/definition/morphological-analysis
Definition
A bioinformatician is a scientist who applies computer science, statistics, and other quantitative methods to analyze and interpret biological data.
Explanation
A bioinformatician is a scientist who applies computer science, statistics, and other quantitative methods to analyze and interpret biological data. Bioinformaticians work with large sets of genetic and molecular data to identify patterns and extract meaningful insights. They play a crucial role in fields like genomics, drug discovery, and personalized medicine. With the increasing importance of data-driven research, the demand for bioinformaticians continues to grow.
Reference
Definition
Tabular data is a type of data that is structured in a tabular format, with rows and columns.
Explanation
Each row represents an observation or data point, while each column represents a variable or feature. Tabular data is commonly found in spreadsheets, databases, and CSV files, and is used in a variety of applications, such as data analysis, machine learning, and business intelligence. Examples of tabular data include sales data, customer information, financial records, and scientific measurements.
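The row and column structure can be seen by reading a small CSV table. The sketch below uses Python's standard `csv` module; the column names and values are hypothetical.

```python
import csv
import io

# A small CSV table: the first line names the columns, each later line is one row.
raw = (
    "patient_id,age,hemoglobin_g_dl\n"
    "P001,54,13.2\n"
    "P002,61,11.8\n"
    "P003,47,14.5\n"
)

rows = list(csv.DictReader(io.StringIO(raw)))  # one dict per observation
columns = list(rows[0].keys())                 # one entry per variable

# Column-wise analysis: average a single variable across all rows.
mean_age = sum(int(row["age"]) for row in rows) / len(rows)

print(columns)
print(len(rows), mean_age)
```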
Reference
What is Tabular Data? (Definition & Example)
Definition
A DICOM (Digital Imaging and Communications in Medicine) image is a medical image in the standard format used for storing, exchanging, and transmitting medical images and related information.
Explanation
DICOM (Digital Imaging and Communications in Medicine) is a standard format used for storing, exchanging, and transmitting medical images and related information. DICOM images are typically generated by medical imaging equipment, such as X-ray machines, MRI scanners, and CT scanners. They can hold various types of imaging data, such as 2D or 3D images and ultrasound scans. DICOM images are used in a variety of medical applications, such as diagnosis, treatment planning, and medical research.
Reference
Definition
MRI (Magnetic Resonance Imaging) is a non-invasive medical imaging technique that uses a strong magnetic field and radio waves to create detailed images of the internal organs, tissues, and bones of the body.
Explanation
MRI (Magnetic Resonance Imaging) is a non-invasive medical imaging technique that uses a strong magnetic field and radio waves to create detailed images of the internal organs, tissues, and bones of the body. MRI machines generate high-resolution images that can be used to diagnose a wide range of medical conditions, including injuries, infections, and diseases such as cancer. MRI scans are commonly used in neurology, orthopedics, cardiology, and oncology, among other medical fields.
Reference
https://www.nibib.nih.gov/science-education/science-topics/magnetic-resonance-imaging-mri
Definition
Histopathology imaging is a medical imaging technique that involves the examination of biological tissues and cells at a microscopic level.
Explanation
Histopathology imaging is a medical imaging technique that involves the examination of biological tissues and cells at a microscopic level. It is typically used in the diagnosis and treatment of various diseases, such as cancer. The technique involves staining thin slices of tissue or cells with various dyes to enhance their visibility under a microscope. The resulting images can then be examined by a pathologist or other medical professional to identify abnormalities and diagnose diseases. Histopathology imaging is widely used in research, diagnosis, and treatment planning in various medical fields, including oncology, dermatology, and gastroenterology, among others.
Reference
Definition
CI/CD refers to Continuous Integration and Continuous Deployment, which are practices for frequently integrating code changes and automatically deploying them to production environments.
Explanation
CI/CD stands for Continuous Integration and Continuous Deployment. Continuous Integration (CI) is the practice of frequently merging code changes from multiple contributors into a central repository, usually multiple times a day. This process often involves automated testing to ensure that new code integrations don’t introduce defects. Continuous Deployment (CD), on the other hand, is the practice of automatically deploying integrated code changes directly to the production environment, ensuring that software releases happen rapidly and reliably. Together, CI/CD practices aim to streamline the software development process, reduce manual interventions, minimize errors, and accelerate the delivery of new software features and fixes to end-users.
Reference
Definition
An API, or Application Programming Interface, is a set of rules and protocols that allows different software entities to communicate with each other.
Explanation
An Application Programming Interface (API) serves as an intermediary that enables two distinct software applications to communicate and interact. It provides a predefined set of functions, methods, and protocols that developers can use to request and exchange information between systems without needing to understand the intricate details of the other system’s internal workings. APIs are fundamental in enabling the integration of systems, allowing different software solutions to work together cohesively. They can be found in various forms, such as web APIs that enable web-based applications to interact, or library-based APIs that help in using features of a software library.
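A library-based API can be as small as a class with a few public methods. In the hypothetical sketch below, callers use only `add` and `lookup`; how entries are stored and normalized stays an internal detail that could change without affecting them.

```python
class Glossary:
    """Public interface: add() and lookup(). Storage details are internal."""

    def __init__(self):
        self._entries = {}  # internal detail, not part of the API

    def add(self, term, definition):
        """Register a definition for a term."""
        self._entries[self._key(term)] = definition

    def lookup(self, term):
        """Return the definition for a term, or None if it is unknown."""
        return self._entries.get(self._key(term))

    @staticmethod
    def _key(term):
        # Internal normalization so lookups ignore case and surrounding spaces.
        return term.strip().lower()

glossary = Glossary()
glossary.add("API", "A set of rules that lets software components communicate.")

print(glossary.lookup("  api "))
print(glossary.lookup("unknown term"))
```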
Reference
https://medium.com/@perrysetgo/what-exactly-is-an-api-69f36968a41f