  • Presented at the Genetics and Bioinformatics Retreat, December 5th, 2007.
  • Presented at Advancing Interdisciplinarity (CFIS Symposium) November 30th, 2007.
  • Presented at the CFRI Student Research Forum, June 21st, 2007.
  • Presented on May 31st, 2007, at the first Michael Smith Laboratories Poster Event (May 30-June 1st, 2007).
  • Originally designed to fulfill the Call for Poster Abstracts for the Canadian Genetic Disease Network, Annual Scientific Meeting (Due date extended to March 26, 2007)

Mining Brain-Related Transcription Factor-Disease Relationships for Novel Linkages

Latest Version of the Poster (with corrections): CGDN-Poster-final-postrevised.ppt
LaTeX'ed PDF version of the abstract: cgdn.pdf

Warren Cheung1,2,3, BF Francis Ouellette2, Wyeth W Wasserman3
Email: wac AT dnahelix DOT org, francis AT bioinformatics DOT ca, wyeth AT cmmt DOT ubc DOT ca

Integrated approaches to the computational analysis of diverse data collections offer the possibility to predict links between genes and diseases. We focus on the analysis of biomedical literature for the identification of genes encoding DNA binding transcription factors which play a previously unknown functional role in the pathology of one or more neurological diseases. Existing databases enumerating human transcription factors, online repositories of abstracts from the biomedical literature and organized ontologies and vocabularies for both gene and disease annotation will be integrated. For example, over one thousand human genes in Entrez Gene are labeled as transcription factors via Gene Ontology (GO) terms. Using Medical Subject Heading (MeSH) terms, over half a million articles are identified as relevant to brain diseases in PubMed. To connect these data sources, we use both manually and automatically annotated linkages, such as the reviewed user-submitted Gene Reference into Function (GeneRIF) annotations in Entrez Gene and the computationally generated Related Articles from PubMed.
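As a rough illustration of how such linkages might be joined, the sketch below connects genes to diseases through shared articles. It assumes two hypothetical lookup tables: gene-to-article links (as GeneRIF annotations might supply) and article-to-disease links (as MeSH terms might supply). The function name, gene labels, disease labels and article IDs are all invented for the example and do not come from the poster.

```python
from collections import defaultdict

def direct_links(gene_to_pmids, pmid_to_diseases):
    """Map each gene to the diseases annotated on articles linked to that gene.

    Both arguments are hypothetical in-memory tables:
      gene_to_pmids    -- gene label -> set of article IDs (e.g. from GeneRIFs)
      pmid_to_diseases -- article ID -> set of disease terms (e.g. from MeSH)
    """
    links = defaultdict(lambda: defaultdict(set))
    for gene, pmids in gene_to_pmids.items():
        for pmid in pmids:
            # Articles with no disease annotation contribute nothing.
            for disease in pmid_to_diseases.get(pmid, ()):
                links[gene][disease].add(pmid)
    # Convert to plain dicts with sorted article lists for stable output.
    return {g: {d: sorted(p) for d, p in ds.items()} for g, ds in links.items()}

# Toy data (invented for illustration only).
gene_to_pmids = {"TF_A": {101, 102}, "TF_B": {103}}
pmid_to_diseases = {
    101: {"Disease X"},
    102: {"Disease X", "Disease Y"},
    104: {"Disease Z"},
}
links = direct_links(gene_to_pmids, pmid_to_diseases)
```

Keeping the supporting article IDs alongside each gene-disease pair means every reported link can be traced back to the literature that produced it.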

By distilling this interconnected network of relationships into an integrated database, we will provide a framework to identify known, direct relationships between transcription factors and brain diseases. This will also allow us to experiment with predicting novel relationships by the study of indirect linkages (e.g. transcription factor-characteristic and disease-characteristic intersections). Statistical scoring methodologies, such as over-representation analysis, will be developed to assess putative links between transcription factors and disease. Predictions will be verified using curated sources on gene-related diseases such as the Online Mendelian Inheritance in Man.
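One common form of over-representation analysis is a one-sided hypergeometric test. The sketch below is a minimal example under that assumption; it is not necessarily the scoring method used in this project. It asks how surprising it is that k of the n articles linked to a transcription factor are also among the K disease-relevant articles in a corpus of N. The function and parameter names are illustrative.

```python
from math import comb

def overrepresentation_pvalue(k, N, K, n):
    """One-sided hypergeometric p-value P(X >= k).

    N -- total articles in the corpus
    K -- articles annotated with the disease
    n -- articles linked to the transcription factor
    k -- articles in both sets (observed overlap)
    """
    total = comb(N, n)
    upper = min(K, n)  # the overlap cannot exceed either set size
    # Sum the probabilities of every overlap at least as large as k.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy numbers (invented): corpus of 10 articles, 3 about the disease,
# 4 linked to the transcription factor, 2 in common.
p = overrepresentation_pvalue(2, 10, 3, 4)
```

With many transcription factor-disease pairs tested at once, the resulting p-values would normally need a multiple-testing correction before being ranked.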

(244 words)

1 Bioinformatics Program, University of British Columbia, Vancouver, BC, Canada.
2 UBC Bioinformatics Centre, Department of Medical Genetics, University of British Columbia.
3 Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia.

Post-Conference Revisions:

  • Embarrassingly enough, there is a numerical error in the printed poster: the number of human TFs being considered is currently about 1200, NOT 8000. This makes all the numbers look much more reasonable. The 8000ish number is actually the number of TFs.
  • The right side of the UMLS box was cut off by Kinko's - the rightmost line of the box didn't get printed. I've tightened that up a bit.