Molecule Tutorials - Herong's Tutorial Examples - v1.26, by Herong Yang
Synchronization with HGNC Database
Provides suggestions on how to maintain a local table of Gene Symbols and Names and keep it synchronized with the HGNC database.
If you want to maintain a local copy of Gene Symbols and Names and keep it synchronized with the HGNC database, you may consider the following suggestions:
1. Get Gene Counts - Run the /info request to get the total number of genes:
http://rest.genenames.org/info

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="lastModified">2021-05-09T02:43:51.575Z</str>
      <int name="numDoc">44334</int>
      <arr name="searchableFields">
        ...
      </arr>
      <arr name="storedFields">
        ...
      </arr>
    </doc>
  </result>
</response>
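As a minimal sketch, the gene count can be read from the /info response with Python's standard XML parser. The sample text below is the response shown above with the elided field lists left out. The function name parse_info is hypothetical, and comparing lastModified with a previously stored value to decide whether a sync is needed is an assumption about how that field behaves, not something HGNC documents here.

```python
import xml.etree.ElementTree as ET

# Sample /info response copied from the tutorial (field lists omitted).
SAMPLE_INFO = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="lastModified">2021-05-09T02:43:51.575Z</str>
      <int name="numDoc">44334</int>
    </doc>
  </result>
</response>"""


def parse_info(xml_text):
    """Return (numDoc, lastModified) from an /info response body."""
    root = ET.fromstring(xml_text)
    doc = root.find("./result/doc")
    num_doc = int(doc.find("./int[@name='numDoc']").text)
    last_modified = doc.find("./str[@name='lastModified']").text
    return num_doc, last_modified


if __name__ == "__main__":
    count, modified = parse_info(SAMPLE_INFO)
    print(count, modified)
```

If the stored lastModified value from the previous run equals the new one, the remaining steps could possibly be skipped; treat that shortcut as unverified.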
2. Detect Changed Genes - Unfortunately, HGNC does not let you use the modified date as a comparison condition in search requests. So you have to use "Search" or "Fetch" requests to retrieve all genes and compare them with your local copy to figure out what changed.
3. Use "Search" Requests to Figure Out Changes - If you are interested in new gene symbols only, you can use "Search" requests with symbol patterns to retrieve all genes, with their symbols only, in chunks. Below is the suggested algorithm:
- Run one "Search" request for each gene symbol pattern in a loop:

http://rest.genenames.org/search/symbol/A*
http://rest.genenames.org/search/symbol/B*
...
http://rest.genenames.org/search/symbol/Z*

- Scan each response with about 1700 genes for any new gene symbol and store it locally:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
  </lst>
  <result name="response" numFound="1127" start="0" maxScore="1.0">
    <doc>
      <str name="hgnc_id">HGNC:29504</str>
      <str name="symbol">ZACN</str>
      <float name="score">1.0</float>
    </doc>
    <doc>
      <str name="hgnc_id">HGNC:51915</str>
      <str name="symbol">ZACNP1</str>
      <float name="score">1.0</float>
    </doc>
    <doc>
      <str name="hgnc_id">HGNC:28697</str>
      <str name="symbol">ZADH2</str>
      <float name="score">1.0</float>
    </doc>
    ...
  </result>
</response>
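The "Search" algorithm above can be sketched in Python as follows. The helper names (search_urls, parse_search, new_genes) and the dictionary used as the local store are hypothetical. The sample is the first three genes of the Z* response shown above, and the actual HTTP download is left out so the sketch runs offline.

```python
import string
import xml.etree.ElementTree as ET

BASE_URL = "http://rest.genenames.org/search/symbol/"

# First three <doc> entries of the Z* response shown in the tutorial.
SAMPLE_SEARCH = """<response>
  <result name="response" numFound="1127" start="0" maxScore="1.0">
    <doc><str name="hgnc_id">HGNC:29504</str><str name="symbol">ZACN</str></doc>
    <doc><str name="hgnc_id">HGNC:51915</str><str name="symbol">ZACNP1</str></doc>
    <doc><str name="hgnc_id">HGNC:28697</str><str name="symbol">ZADH2</str></doc>
  </result>
</response>"""


def search_urls():
    """Build one search URL per symbol pattern, A* through Z*."""
    return [BASE_URL + letter + "*" for letter in string.ascii_uppercase]


def parse_search(xml_text):
    """Map each HGNC ID in a search response to its gene symbol."""
    root = ET.fromstring(xml_text)
    genes = {}
    for doc in root.iterfind("./result/doc"):
        hgnc_id = doc.find("./str[@name='hgnc_id']").text
        genes[hgnc_id] = doc.find("./str[@name='symbol']").text
    return genes


def new_genes(local, remote):
    """Return the remote entries whose HGNC ID is not stored locally."""
    return {k: v for k, v in remote.items() if k not in local}


if __name__ == "__main__":
    local_table = {"HGNC:29504": "ZACN"}  # hypothetical local copy
    added = new_genes(local_table, parse_search(SAMPLE_SEARCH))
    local_table.update(added)
    print(added)
```

Keying the local table by HGNC ID rather than by symbol is a design choice of this sketch: it lets a symbol that has been renamed show up as a change to an existing record instead of as a new gene.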
4. Use "Fetch" Requests to Figure Out Changes - If you are interested in changes to gene symbol status, aliases, and names, you have to use "Fetch" requests with HGNC IDs to get gene details one by one. Below is the suggested algorithm:
- Run one "Fetch" request for each HGNC ID:

http://rest.genenames.org/fetch/hgnc_id/1
http://rest.genenames.org/fetch/hgnc_id/2
...
http://rest.genenames.org/fetch/hgnc_id/n

- Analyze each response to insert or update gene details locally:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="hgnc_id">HGNC:13199</str>
      <str name="symbol">ZXDB</str>
      <str name="name">zinc finger X-linked duplicated B</str>
      <str name="status">Approved</str>
      <str name="locus_type">gene with protein product</str>
      <arr name="alias_symbol">
        <str>ZNF905</str>
      </arr>
      ...
    </doc>
  </result>
</response>
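The insert-or-update step of the "Fetch" algorithm above might look like the following sketch. The function names parse_fetch and upsert are hypothetical, the local store is a plain dictionary standing in for a real table, and the sample is the ZXDB response shown above without its elided fields. Returning None when the response contains no gene is an assumption about how an unused ID is reported.

```python
import xml.etree.ElementTree as ET

# The ZXDB response from the tutorial, without the elided fields.
SAMPLE_FETCH = """<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="hgnc_id">HGNC:13199</str>
      <str name="symbol">ZXDB</str>
      <str name="name">zinc finger X-linked duplicated B</str>
      <str name="status">Approved</str>
      <str name="locus_type">gene with protein product</str>
      <arr name="alias_symbol">
        <str>ZNF905</str>
      </arr>
    </doc>
  </result>
</response>"""


def parse_fetch(xml_text):
    """Convert the <doc> of a fetch response into a dict, or None if absent."""
    doc = ET.fromstring(xml_text).find("./result/doc")
    if doc is None:
        return None
    gene = {}
    for field in doc:
        if field.tag == "arr":
            # Multi-valued fields such as alias_symbol become lists.
            gene[field.get("name")] = [item.text for item in field]
        else:
            gene[field.get("name")] = field.text
    return gene


def upsert(local, gene):
    """Insert or update one gene; report what happened."""
    key = gene["hgnc_id"]
    if key not in local:
        local[key] = gene
        return "inserted"
    if local[key] != gene:
        local[key] = gene
        return "updated"
    return "unchanged"


if __name__ == "__main__":
    table = {}
    print(upsert(table, parse_fetch(SAMPLE_FETCH)))
    print(upsert(table, parse_fetch(SAMPLE_FETCH)))
```

Comparing whole records keeps the change check simple; a real table might compare only the fields of interest, such as symbol, status, alias_symbol, and name.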