The objective of this project is to create an annotated database for AI training on CTG pattern recognition. We aim to build a large, thoroughly reviewed database with a high level of annotation.
The aim of such AI could be:
This project does not aim to evaluate inter-rater variability due to the potential selection bias and the unusual annotation method.
Furthermore, it does not intend to create a dataset for predicting birth outcomes (like pH, or clinical neonatal criteria). This would require a larger database, with all clinical information but not necessarily annotations.
The AI methods will take as input the two FHR channels, the MHR, the TOCO, the delivery stage (antepartum, first stage or second stage), and a few clinical variables (gestational age, sub-Saharan origin). They should predict a set of classes for each time sample. The classes are:
| TOCO | MHR | FHR1 | FHR2 | Morphological (only if FHR1 or FHR2 is true Sig) | Variability (only if true Sig and baseline) |
|---|---|---|---|---|---|
If MHR, Doppler and scalp ECG are available, six classes (each with multiple modalities) are expected as output per sample. In principle there are n1 x n2 x n3 x n4 x n5 x n6 combinations, but many of them are impossible. The AI problem therefore becomes a classification problem with 208 modalities (17 possibilities for the last two classes, 17*3+1=52 for the last four classes, and 2*2*52=208 possibilities in total). The optimized criterion will be the cross-entropy. Annotations can be partial, in which case they are treated as the union of the possible modalities. For instance, an annotator may have annotated the FHR but not the TOCO, and the cross-entropy for TOCO will then not be counted for the samples concerned.
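The count of 208 and the handling of partial annotations can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the text: TOCO and MHR are treated as binary, the factor 3 is read as "at least one of FHR1/FHR2 is true signal", and the 17 valid morphological/variability pairs are represented by plain indices. The function names are hypothetical. The partial loss is one common way to implement a "union of possible modalities": the negative log of the total probability given to every joint class compatible with the annotation.

```python
from itertools import product

import numpy as np

N_MORPH_VAR = 17  # valid (morphological, variability) pairs, as stated in the text


def enumerate_joint_classes():
    """List joint classes as (toco, mhr, fhr1_true, fhr2_true, morph_var)."""
    joint = []
    for toco, mhr in product(range(2), repeat=2):  # assumed binary: 2 * 2
        for fhr1, fhr2 in product((False, True), repeat=2):
            if fhr1 or fhr2:
                # At least one FHR channel is true signal: 17 sub-cases each.
                joint.extend((toco, mhr, fhr1, fhr2, mv) for mv in range(N_MORPH_VAR))
            else:
                # No true signal: morphology and variability are undefined.
                joint.append((toco, mhr, fhr1, fhr2, None))
    return joint


def logsumexp(x):
    """Numerically stable log(sum(exp(x))) for a non-empty 1-D array."""
    m = x.max()
    return m + np.log(np.exp(x - m).sum())


def partial_cross_entropy(logits, allowed):
    """Negative log-probability of the set of allowed joint classes.

    `allowed` is a boolean mask with at least one True entry. With a single
    True entry this reduces to the usual cross-entropy.
    """
    logits = np.asarray(logits, dtype=float)
    allowed = np.asarray(allowed, dtype=bool)
    return logsumexp(logits) - logsumexp(logits[allowed])


CLASSES = enumerate_joint_classes()

# Example: everything was annotated except the TOCO, so both TOCO values
# remain possible for this sample (illustrative target, not real data).
target = (0, 1, True, False, 5)
allowed = np.array([c[1:] == target[1:] for c in CLASSES])
loss = partial_cross_entropy(np.zeros(len(CLASSES)), allowed)
```

With uniform logits, two compatible classes out of 208 give a loss of log(104), and a fully annotated sample leaves a single allowed class, which recovers the standard cross-entropy.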
Recordings are selected mainly for their originality, so that the database covers all possible FHR patterns. The initial database comprises both FHRMA public datasets, the one for Morphological Analysis and the one for False Signals analysis. These datasets are extracted from the Bien Naître database (RNIPH…).
A dedicated page indicates which patterns are missing from the database. Participants are encouraged to find and add recordings containing these patterns. A recording may be analyzed by two different individuals; the two annotations are then processed as if they were two different recordings, but they are always placed in the same train, validation or test set. Some clinical information is also requested to help the annotator establish a more accurate ground truth. For example, if variability is low for 40 minutes and the fetus is known to have been in acidosis, this helps identify pathologically reduced variability rather than deep sleep.
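The rule that two annotations of the same recording stay in the same subset can be sketched as follows. This is only one possible approach, not the project's actual procedure: the subset is derived from a hash of the recording identifier, so every annotation of a given recording gets the same answer. The function name, the identifiers and the 10 % validation and test fractions are all hypothetical.

```python
import hashlib


def assign_split(recording_id: str, val_frac: float = 0.1, test_frac: float = 0.1) -> str:
    """Map a recording ID to 'train', 'val' or 'test' deterministically."""
    digest = hashlib.sha256(recording_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


# Two annotations of the same (hypothetical) recording by different annotators.
annotations = [("rec-001", "annotator_a"), ("rec-001", "annotator_b"), ("rec-002", "annotator_a")]
splits = [assign_split(rec_id) for rec_id, _ in annotations]
```

Because the split depends only on the recording identifier, adding a second annotation later never moves the recording to a different subset.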
If you have experience in FHR analysis, we invite you to participate in annotating the database. You must first complete a 20-minute tutorial to familiarize yourself with the interface and to ensure agreement on pattern definitions. The annotation is based on the FIGO 2015 guidelines, with some slight nuances explained in the tutorial. You can then start annotating recordings, which are presented in random order, and you can correct your annotations afterward.
You can either:
To thank those who have participated actively (more than 100 hours of annotated periods), we will add your name to the publication of this dataset if you wish (after you have reviewed the paper).
You can post a recording on the dedicated page or through our API. The statistics page shows which patterns are underrepresented in our database, and we hope posters will prioritize such recordings. Posters can analyze their own recordings if they wish.
Please do not post any information that could identify a patient. Please provide the study number assigned by the ethics committee.
To thank those who have posted a significant number of recordings, we will add your name to the publication of this dataset if you wish (after you have reviewed the paper).
A paper presenting this dataset will be submitted for publication. Regardless, the train/validation data will be made public. We aim to organize a competition (on Kaggle or PhysioNet) during the year following the publication and will contact journals to publish the winning method(s).
Those who have contributed to the dataset can access the train/validation dataset as soon as data collection is closed. The test dataset will be made public after the competition. During the competition year, we will try to build the FHRMA-DB 2025 dataset, which should be much larger and correct the errors of this dataset.