Concept Sets
New users of this software can create concept sets for testing in any way that is appropriate to their use case. Here, we describe how we generated concepts sets in case it is useful for others.
Medical Subject Headings (MeSH)
We manually reviewed MeSH entries using the MeSH Browser and chose identifiers of entries that describe a group of related medical concepts. We selected 105 concepts in this way, and the following table shows the first four of them (the entire list can be found in the file mesh_target_ids.tsv).
mesh.id | label |
---|---|
D001145 | Arrhythmias, Cardiac |
D007674 | Kidney Diseases |
D009202 | Cardiomyopathies |
D001523 | Mental Disorders |
The Python script meshImporter.py
was used to retrieve all of the descendant terms of each of these identifiers.
For instance, Arrhythmias, Cardiac has 40 descendant terms including Adams-Stokes Syndrome (D000219) and
Arrhythmia, Sinus (D001146). The meshImporter.py
script retrieves these terms and writes them in a file with the following structure
Arrhythmias, Cardiac | D001145 | meshd016170;meshd000219; (...) |
---|---|---|
Kidney Diseases | D007674 | meshd016263;meshd000141;meshd058186; (...) |
Cardiomyopathies | D009202 | meshd000092183;meshd019571; (...) |
Mental Disorders | D001523 | meshd015526;meshd000275; (...) |
For convenience, the output of this script is stored in the file mesh_sets.tsv and does not need to be recreated to run the other scripts.