CLinkaRT @ Evalita 2023

Linking a Lab Result to its Test Event in the Clinical Domain

CLinkaRT at EVALITA 2023 is a relation extraction task based on clinical cases taken from the E3C corpus, i.e. documents written in Italian that report on clinical practice (including, for example, the reasons for a clinical visit, the physical exams undertaken, the assessment of the patient’s diagnosis and subsequent treatments).

The task consists of identifying test results and measurements and linking them to the textual mentions of the laboratory tests and measurements from which they were obtained.

Main

Organization

Bernardo Magnini - FBK
Begoña Altuna - University of the Basque Country
Alberto Lavelli - FBK
Manuela Speranza - FBK
Roberto Zanoli - FBK
Goutham Venkatesh Karunakaran - University of the Basque Country

Contact and Submission

clinkart@fbk.eu

Important dates

Available here

For participants

Test dataset with annotations now available - December 28th
Submit your intention to participate
Download training dataset
Download the evaluation kit

Task Description

Laboratory tests are a common step in diagnosing diseases and disorders and, as such, they are documented in clinical narratives. The main goal of the CLinkaRT task is to find test results and measurements and link them to the textual mentions of the laboratory tests and measurements from which they were obtained. To do this, both elements must be identified in the text and the relation between them must then be established.

Laboratory tests and measurements and their results provide valuable information on the patient’s status at a given point in the development of the disorder, but they have received little attention in recent years. The task also brings a new perspective on data treatment, since the extraction of lab values and vital signs cannot be handled as a named entity recognition task: it requires interpreting numeric values and ranges.

In the following example, the test event is marked in blue and the result is marked in red:

Both the test events and the results are defined as text strings. All test events are single-token strings (i.e. only the syntactic head is considered, following the THYME framework), while results may be represented by multi-token strings (i.e. a whole syntactic chunk). There are no discontinuous spans or nested entities.

All relations are of the type PERTAINS (see the red arrows in the examples) and can be one-to-one, one-to-many and many-to-one.

The annotation of pertains-to relations is strictly based on tokens (the target, in particular, must consist of exactly one token, while the source consists of one or more tokens) and on sentences (the source and target of a relation must belong to the same sentence).

To reduce the noise that mismatched pre-processing might introduce, participants in the CLinkaRT task are strongly encouraged to use the same (automatic) tokenization and sentence splitting that was used when the data were (manually) annotated; it is distributed together with the annotated data and the test data.
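The token and sentence constraints above can be checked mechanically before submitting a run. The following is only a minimal sketch: it assumes the distributed tokenization and sentence splitting have already been loaded as lists of (start, end) character spans, and the names `check_relation`, `token_spans` and `sentence_spans` are hypothetical, not part of the task kit.

```python
def covering_index(spans, start, end):
    """Return the index of the first span that fully contains [start, end), or None."""
    for i, (s, e) in enumerate(spans):
        if s <= start and end <= e:
            return i
    return None


def check_relation(rml, event, token_spans, sentence_spans):
    """Return a list of constraint violations for one (result, event) pair.

    rml and event are (start, end) character spans; token_spans and
    sentence_spans are assumed to come from the distributed pre-processing.
    """
    problems = []

    # The target (test event) must be exactly one token.
    if event not in token_spans:
        problems.append("event is not exactly one token")

    # The source (result) may span several tokens but must start and end
    # on token boundaries.
    starts = {s for s, _ in token_spans}
    ends = {e for _, e in token_spans}
    if rml[0] not in starts or rml[1] not in ends:
        problems.append("result is not aligned to token boundaries")

    # Source and target must belong to the same sentence.
    rml_sentence = covering_index(sentence_spans, *rml)
    event_sentence = covering_index(sentence_spans, *event)
    if rml_sentence is None or rml_sentence != event_sentence:
        problems.append("result and event are not in the same sentence")

    return problems
```

An empty list means the pair satisfies all three constraints; anything else points at a likely mismatch between a system's own pre-processing and the distributed one.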

Data

Training dataset

Language: Italian

Documents: 83

Tokens: 28,856

Annotated Relations: 658

Source: E3C corpus

Download

Test dataset

Language: Italian

Documents: 80

Source: E3C corpus

Download raw texts - May 12th

Download annotated texts - New

Data format

All annotation data are made available to participants in a format largely based on the PubTator format. The annotated data consist of a straightforward tab-delimited text file:

DOCID|t|TEXT

DOCID REL RML_START-RML_END EVENT_START-EVENT_END RML_TEXT EVENT_TEXT

Where:

Each document in the dataset starts on a new line, and a blank line is used as a document separator.
- DOCID: document id
- t: marker to identify the lines that contain the text of the documents
- TEXT: text of the document
Every annotated relation is on a separate line and is represented as an ordered pair of entity mentions (i.e. RML, event). Each entity mention in the relation is expressed by its start and end character offsets. The mention text span may be included but is not mandatory.
- DOCID: document id
- REL: marker to identify the lines that contain the relations of the given document
- RML_START: start character offset of the RML entity mention in the document
- RML_END: end character offset of the RML entity mention in the document
- EVENT_START: start character offset of the EVENT entity mention in the document
- EVENT_END: end character offset of the EVENT entity mention in the document
- RML_TEXT [optional]: text span of the RML entity mention
- EVENT_TEXT [optional]: text span of the EVENT entity mention

For example:

100509|t|Donna, 87 anni, ipertiroidismo subclinico, artrosi, osteoporosi (fratture T10-T11), ipovisus, AH in terapia steroidea cronica; ipertensione; scompenso cardiaco diastolico; un ricovero per EPA. Recente embolia polmonare, da allora in TAO. Recentemente agitazione e dolore resistente a paracetamolo. All’ECG RS 66 bpm, deviazione assiale sinistra, BBD incompleto. Chest Pain Score e Wells Score bassi. All’ecocardiogramma FE 55%. PA 160/90 mmHg. Giordano positivo, dolore paravertebrale bilateralmente. Dopo caduta accidentale vivo dolore a livello dorsale. Al quadro rx crolli vertebrali da T6 a T8 con pregresso crollo di T12. Procrastinata la chifoplastica e prescritto un busto, iniziava cauta fisioterapia. Videat oculistico e continuazione della terapia steroidea. Prescrizione di teriparatide e vitamina D.

100509 REL 309-315 306-308 66 bpm RS

100509 REL 393-398 387-392 bassi Score

100509 REL 393-398 373-378 bassi Score

100509 REL 423-426 420-422 55% FE

100509 REL 431-442 428-430 160/90 mmHg PA

100509 REL 453-461 444-452 positivo Giordano
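
A file in this format can be read with a few lines of code. The sketch below is illustrative only: it assumes relation fields are separated by tab characters (as the description states, although the samples on this page are rendered with spaces), and the names `Document`, `Relation` and `parse_clinkart` are hypothetical. It also treats offsets as zero-based with an exclusive end, which is an assumption; it is at least consistent with the span lengths in the sample (e.g. 306-308 for the two-character mention RS).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    rml_span: tuple[int, int]    # (start, end) offsets of the result mention
    event_span: tuple[int, int]  # (start, end) offsets of the test event mention


@dataclass
class Document:
    doc_id: str
    text: str
    relations: list[Relation] = field(default_factory=list)


def parse_span(value: str) -> tuple[int, int]:
    """Turn 'START-END' into a pair of integers."""
    start, end = value.split("-")
    return int(start), int(end)


def parse_clinkart(lines) -> dict[str, Document]:
    """Parse text lines (DOCID|t|TEXT) and relation lines (DOCID, REL, spans)."""
    docs: dict[str, Document] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # blank lines only separate documents

        # Text line: DOCID|t|TEXT (the text itself may contain '|').
        head = line.split("|", 2)
        if len(head) == 3 and head[1] == "t":
            docs[head[0]] = Document(doc_id=head[0], text=head[2])
            continue

        # Relation line: the optional mention texts (fields 5 and 6) are ignored.
        fields = line.split("\t")
        if len(fields) >= 4 and fields[1] == "REL":
            # Assumes the text line of a document precedes its relations.
            docs[fields[0]].relations.append(
                Relation(parse_span(fields[2]), parse_span(fields[3]))
            )
    return docs
```

Under the offset assumption above, `doc.text[start:end]` recovers the mention string for a span, which is a convenient sanity check when the optional text columns are present.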

Use of data for model training

The CLinkaRT training data are part of the E3C corpus, which is released under a CC-BY-NC-4.0 licence.

There are no restrictions on the use of additional data for model training. The training data released for the TESTLINK twin task can be used, as well as any other datasets participants might consider useful. Nonetheless, all data used for training purposes must be specified in the system report.

Evaluation

Evaluation procedure

The task has been defined as a relation extraction (RE) task in which the elements taking part in the relation as well as the directionality of the relation are considered. Participating systems are provided with raw text from clinical cases as input and asked to return a list of entity mention pairs for which a relationship exists in the text. For example, given the document reported below:

100854|t|Un maschio di 25 anni dopo una partita di calcetto si è presentato in PS per paralisi degli AAII. Gli ematochimici evidenziavano marcata ipokaliemia: <1.5 Mmol/L; dopo esecuzione di supplementazione di potassio il pz veniva dimesso. Pochi giorni più tardi il pz si ripresentava in PS per una recidiva e veniva ricoverato. Veniva eseguito TSH con riscontro di tireotossicosi senza sintomi di ipertiroidismo: TSH <0.005 mUI/L, FT4 44.9 ng/L. Veniva impostata terapia tireostatica e somministrato K+ con risposta clinica. Sono stati effettuati RMN, rachicentesi e dosaggio enzimi muscolari per ricerca di diagnosi differenziali. È stato proposto anche test genetico.

The system output includes both the input document and the annotated relations, as follows:

100854|t|Un maschio di 25 anni dopo una partita di calcetto si è presentato in PS per paralisi degli AAII. Gli ematochimici evidenziavano marcata ipokaliemia: <1.5 Mmol/L; dopo esecuzione di supplementazione di potassio il pz veniva dimesso. Pochi giorni più tardi il pz si ripresentava in PS per una recidiva e veniva ricoverato. Veniva eseguito TSH con riscontro di tireotossicosi senza sintomi di ipertiroidismo: TSH <0.005 mUI/L, FT4 44.9 ng/L. Veniva impostata terapia tireostatica e somministrato K+ con risposta clinica. Sono stati effettuati RMN, rachicentesi e dosaggio enzimi muscolari per ricerca di diagnosi differenziali. È stato proposto anche test genetico.

100854 REL 150-161 137-148 <1.5 Mmol/L ipokaliemia

100854 REL 411-423 407-410 <0.005 mUI/L TSH

100854 REL 429-438 425-428 44.9 ng/L FT4

In the annotated relations, the mention text span (e.g., <1.5 Mmol/L, ipokaliemia) may be included, but it is not used for evaluation.
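
Producing output in this shape is mostly string formatting. The sketch below is one possible way to do it, not a required implementation: the function names `format_document` and `format_run` are hypothetical, fields are joined with tab characters on the assumption that the file is tab-delimited as described, and offsets are assumed to be zero-based with an exclusive end so that the optional mention text can be sliced from the document.

```python
def format_document(doc_id, text, relations, include_text=True):
    """Return the output lines for one document.

    relations is an iterable of (rml_start, rml_end, event_start, event_end).
    """
    doc_id = str(doc_id)
    lines = [f"{doc_id}|t|{text}"]
    for rml_start, rml_end, event_start, event_end in relations:
        fields = [
            doc_id,
            "REL",
            f"{rml_start}-{rml_end}",
            f"{event_start}-{event_end}",
        ]
        if include_text:
            # Optional columns; they are not used for evaluation.
            fields.append(text[rml_start:rml_end])
            fields.append(text[event_start:event_end])
        lines.append("\t".join(fields))
    return lines


def format_run(documents):
    """Join several (doc_id, text, relations) triples, separated by a blank line."""
    blocks = ["\n".join(format_document(*doc)) for doc in documents]
    return "\n\n".join(blocks) + "\n"
```

Slicing the mention text from the document rather than storing it separately keeps the optional columns consistent with the offsets by construction.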

Participants are allowed to submit up to two different runs.

We measure RE performance with standard Precision, Recall and F1, where a predicted relation is considered correct if the start and end character offsets of both related entity mentions, as well as the order of the two mentions in the relation, are correct. The evaluation is performed with the scorer of the BioCreative V CDR task, i.e.

> eval_relation.sh PubTator gold_file prediction_file

Where:

PubTator: input files format

gold_file: contains the gold standard annotations

prediction_file: contains the predicted annotations
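
For quick checks during development, the matching criterion described above can be approximated in a few lines. This is not the BioCreative V CDR scorer, only a sketch of exact-match scoring: the function name `score_relations` is hypothetical, each relation is represented as a (doc_id, rml_start, rml_end, event_start, event_end) tuple, and counts are pooled over all documents (micro-averaging), which is an assumption about how the official scorer aggregates.

```python
def score_relations(gold, predicted):
    """Exact-match precision, recall and F1 over relation tuples.

    Each relation is (doc_id, rml_start, rml_end, event_start, event_end).
    Because the tuple is ordered, swapping the result and the event
    yields a different relation and therefore counts as an error.
    """
    gold_set = set(gold)
    pred_set = set(predicted)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Official results should always be computed with the evaluation kit; a local function like this is only useful for fast iteration.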