Map Gene IDs
This tutorial shows how to call the GeneWeaver API directly to map gene IDs within a species. In this example, we map mouse gene symbols to Ensembl gene IDs.
import requests
Initialize Gene IDs
First, initialize a list of the gene IDs you want to map. Here, we load the identifiers from a file of comma-separated, double-quoted symbols.
By the end of this step, you should have a list of gene IDs like the one shown below.
with open('gene_names.txt', 'r') as file:
    file_content = file.read().strip()
ids = [gene_id.strip('"') for gene_id in file_content.split(',')]
len(ids)
32285
ids[:5]
['Xkr4', 'Gm1992', 'Gm19938', 'Gm37381', 'Rp1']
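If the file's formatting is less regular (for example, spaces after the commas or the list wrapped over several lines), Python's csv module can take over the parsing. The following is a minimal sketch, assuming the file holds comma-separated, optionally double-quoted symbols; the helper name parse_gene_ids and the sample string are hypothetical and not part of GeneWeaver.

```python
import csv
import io


def parse_gene_ids(text):
    """Parse comma-separated, optionally double-quoted IDs into a list."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    # Flatten all rows, trim whitespace, and drop empty fields
    # (a trailing comma at the end of a line produces one).
    return [field.strip() for row in reader for field in row if field.strip()]


# Illustrative input only; replace it with the contents of your own file.
sample = '"Xkr4", "Gm1992",\n"Gm19938","Rp1"'
print(parse_gene_ids(sample))
```

With a real file, the same helper could be called as parse_gene_ids(file.read()) inside the with block.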
Call the GeneWeaver REST API
First, we'll construct the payload for our REST call.
payload = {
    "source_ids": ids,
    "target_gene_id_type": "Ensemble Gene",
    "species": "Mus Musculus"
}
The GeneWeaver API uses JSON, so let's specify that in our request headers.
headers = {
    'accept': 'application/json',
    'Content-Type': 'application/json'
}
response = requests.post('https://geneweaver.jax.org/api/genes/mapping', json=payload, headers=headers)
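The call above does not check whether the request succeeded. One way to make failures visible is to wrap the request in a small function that sets a timeout and raises on HTTP error codes. This is a sketch under assumptions: the function name map_gene_ids, the post parameter (which lets a stand-in be supplied for testing), and the 60-second timeout are illustrative choices, not part of the GeneWeaver API. The URL and payload keys are the ones used in this tutorial.

```python
def map_gene_ids(ids, target_type="Ensemble Gene", species="Mus Musculus",
                 post=None, timeout=60):
    """POST a mapping request and return the decoded JSON body."""
    if post is None:
        # Imported lazily so a stand-in 'post' callable can be passed instead.
        import requests
        post = requests.post
    payload = {
        "source_ids": list(ids),
        "target_gene_id_type": target_type,
        "species": species,
    }
    headers = {
        "accept": "application/json",
        "Content-Type": "application/json",
    }
    response = post(
        "https://geneweaver.jax.org/api/genes/mapping",
        json=payload,
        headers=headers,
        timeout=timeout,  # assumed value; tune it for large ID lists
    )
    # Raises an exception for 4xx/5xx responses instead of failing later.
    response.raise_for_status()
    return response.json()
```

Calling map_gene_ids(ids) would then return the same dictionary that response.json() yields in the next section.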
Process Results
The mapping endpoint returns a dictionary whose gene_ids_map key holds a list of results. Let's process that into a dictionary with the original IDs as keys and the new IDs as values.
mapping = {
    r["original_ref_id"]: r["mapped_ref_id"]
    for r in response.json()['gene_ids_map']
}
We can now easily access a list of our mapped IDs.
mapped_ids = list(mapping.values())
len(mapped_ids)
32050
mapped_ids[:5]
['ENSMUSG00000066586', 'ENSMUSG00000027596', 'ENSMUSG00000030359', 'ENSMUSG00000027597', 'ENSMUSG00000019986']
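Note that a dictionary comprehension keeps only the last value seen for each key. If the response ever contained more than one row for the same original ID, the earlier mapped IDs would be silently dropped. The sketch below groups every mapped ID per original ID instead; the helper name group_mappings and the placeholder rows are hypothetical and only mimic the shape of the gene_ids_map entries.

```python
from collections import defaultdict


def group_mappings(rows):
    """Collect every mapped ID for each original ID, preserving order."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["original_ref_id"]].append(row["mapped_ref_id"])
    return dict(grouped)


# Placeholder rows shaped like the entries of 'gene_ids_map'.
rows = [
    {"original_ref_id": "GeneA", "mapped_ref_id": "ID1"},
    {"original_ref_id": "GeneA", "mapped_ref_id": "ID2"},
    {"original_ref_id": "GeneB", "mapped_ref_id": "ID3"},
]

# Keep only the original IDs that map to more than one target ID.
multi = {k: v for k, v in group_mappings(rows).items() if len(v) > 1}
print(multi)  # {'GeneA': ['ID1', 'ID2']}
```

With real data, group_mappings(response.json()['gene_ids_map']) would show whether any symbol maps to several target IDs.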
And we can see which IDs couldn't be mapped.
unmapped_ids = [
    _id for _id in ids if _id not in mapping
]
len(unmapped_ids)
235
unmapped_ids[:5]
['Gm28653', 'Ptp4a1.1', 'Arhgef4.1', 'Asdurf', 'AC169382.1']
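As a quick consistency check, the counts shown above add up: 32050 mapped plus 235 unmapped equals the 32285 input IDs. A mismatch would suggest duplicate input IDs or returned keys that differ from the inputs (for example, in letter case). A minimal sketch of that check follows; the helper name accounted_for and the placeholder data are hypothetical.

```python
def accounted_for(ids, mapping, unmapped_ids):
    """Return True when mapped and unmapped counts add up to the input count."""
    return len(mapping) + len(unmapped_ids) == len(ids)


# The counts reported in this tutorial.
print(32050 + 235 == 32285)  # True

# Placeholder data: two of three inputs are mapped, one is not.
example_ids = ["GeneA", "GeneB", "GeneC"]
example_mapping = {"GeneA": "ID1", "GeneB": "ID2"}
example_unmapped = [i for i in example_ids if i not in example_mapping]
print(accounted_for(example_ids, example_mapping, example_unmapped))  # True
```

With the variables from this tutorial, the call would be accounted_for(ids, mapping, unmapped_ids).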