
Indirect Gene-Disease Association via Medical Subject Term Annotation of Literature Evidence

Warren A Cheung, BF Francis Ouellette, Wyeth W Wasserman
Bioinformatics Program, University of British Columbia

Presented as a poster at CSCBC 2009


Computational analysis of the interconnections between annotated biomedical literature and gene databases allows the prediction of novel linkages. Our project focuses on human genes that play a previously unknown functional role in the pathology of one or more diseases. We connect the 38 000 human genes in Entrez Gene to PubMed articles annotated with disease-related Medical Subject Headings (MeSH) terms. To link these data sources, we use both manually and automatically annotated linkages, such as the reviewed, user-submitted Gene Reference into Function (GeneRIF) entries and the Gene2PubMed annotations in Entrez Gene.
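As an illustration of the linkage step, the following minimal sketch joins gene-to-article links with article-to-MeSH annotations to obtain a per-gene profile of term counts. The records, the gene names and the helper `build_gene_profiles` are hypothetical and invented for this example; the real Gene2PubMed, GeneRIF and MeSH annotation files have their own formats and are not parsed here.

```python
from collections import Counter, defaultdict

# Made-up toy records (assumption): (gene, PubMed ID) links and the MeSH terms
# annotated on each article. These only mimic the shape of the real data.
gene2pubmed = [("GENE_A", 101), ("GENE_A", 102), ("GENE_B", 102), ("GENE_B", 103)]
pubmed2mesh = {
    101: ["Neoplasms"],
    102: ["Neoplasms", "Inflammation"],
    103: ["Inflammation"],
}

def build_gene_profiles(gene_article_pairs, article_terms):
    """Count, for each gene, how many linked articles carry each MeSH term."""
    profiles = defaultdict(Counter)
    for gene, pmid in gene_article_pairs:
        # Articles without any annotation contribute nothing to the profile.
        for term in article_terms.get(pmid, []):
            profiles[gene][term] += 1
    return profiles

profiles = build_gene_profiles(gene2pubmed, pubmed2mesh)
for gene in sorted(profiles):
    print(gene, dict(profiles[gene]))
```

Counting articles per term is only one possible way to represent a direct association; weighting or filtering of terms would be layered on top of such a table.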

By distilling this interconnected network of relationships into an integrated database, we provide a framework to identify known, direct relationships between genes and medical subject terms. We then compare the direct association profiles of genes with the profiles of diseases, and uncover novel relationships by assessing the similarity of their subject term profiles. We evaluate a variety of scoring methodologies, such as over-representation analysis, both to assess putative links between genes and diseases and to validate the effectiveness of our scoring functions. Predictions have also been verified against curated sources of gene-related diseases, and the method has successfully predicted newly discovered gene-disease associations.
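The two scoring ideas mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scoring used in the project: over-representation is modelled here with a one-sided hypergeometric test, and profile similarity with cosine similarity, both of which are common choices that the abstract does not specify. The function names and the toy numbers are hypothetical.

```python
from math import comb, sqrt

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric: N articles in total, K of them
    annotated with the term, n articles linked to the gene, k of those
    carrying the term. Small values suggest over-representation."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two term -> weight dictionaries."""
    dot = sum(w * profile_b.get(term, 0) for term, w in profile_a.items())
    norm_a = sqrt(sum(w * w for w in profile_a.values()))
    norm_b = sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Toy numbers (assumption): 10 articles overall, 3 annotated with a term,
# 4 linked to the gene, 2 of which carry the term.
p_value = hypergeom_upper_tail(k=2, K=3, n=4, N=10)

# Toy gene and disease profiles over hypothetical MeSH terms.
gene_profile = {"Neoplasms": 2, "Inflammation": 1}
disease_profile = {"Neoplasms": 4, "Inflammation": 2}
similarity = cosine_similarity(gene_profile, disease_profile)

print(p_value, similarity)
```

In this sketch a low p-value flags a term that appears on a gene's articles more often than chance would suggest, while a high similarity between a gene profile and a disease profile marks a candidate indirect association to be checked against curated sources.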
