Exploratory Research Notes


  • genes
  • disease
    • genetic basis for disease
  • transcription factor
    • gene regulation
  • prediction of transcriptional gene regulation in disease

Data Integration

Text Data Extraction

Related Difficulties

  • Gene name extraction
    • obsolete identifiers
    • differing databases (ID mapping)
  • ontology term extraction
    • GO
    • Medline terms
    • mapping between ontologies
  • text matching vs. context vs. understanding/semantic
  • co-occurrence of terms vs. text parsing
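
The gene name difficulties above (obsolete identifiers, ID mapping between databases) can be illustrated with a minimal sketch of plain text matching against an alias table. Everything named here is an assumption made for illustration: the identifiers (GENEA, OLDA, DB2:0001, GENEB), the `ALIASES` table, and the `extract_genes` helper are hypothetical, and a real table would have to be built from a curated gene nomenclature resource.

```python
import re

# Hypothetical alias table (illustration only): every known identifier,
# current or obsolete, maps to one canonical symbol.
ALIASES = {
    "GENEA": "GENEA",
    "OLDA": "GENEA",      # obsolete identifier for GENEA
    "DB2:0001": "GENEA",  # identifier from a second database
    "GENEB": "GENEB",
}

# Tokens are runs of letters, digits, and colons (so "DB2:0001" stays whole).
TOKEN = re.compile(r"[A-Za-z0-9:]+")


def extract_genes(text, aliases=ALIASES):
    """Return the set of canonical gene symbols mentioned in text."""
    found = set()
    for token in TOKEN.findall(text):
        # Case-insensitive lookup; unknown tokens are ignored.
        canonical = aliases.get(token.upper())
        if canonical is not None:
            found.add(canonical)
    return found


if __name__ == "__main__":
    sentence = "Expression of OldA and db2:0001 was reduced; GENEB unchanged."
    print(sorted(extract_genes(sentence)))
```

This is the "text matching" end of the spectrum noted above: it ignores context, so a short alias that is also an ordinary word would be matched wrongly.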


  • Raf is also working on text analysis


Transcription/Transcription Factor


Other Ideas

  • Database Search and Evaluation System
  • simple relations
    • gene families/similarity
    • structural similarity
  • GOToolbox GO term overrepresentation
    • overrepresentation from ontology term associations
  • example test system to demo
    • some small subset of data to try out

Overrepresentation Analysis

  • Current research from Fall 2006 Term
    • Purkinje Cell Specific Promoter Elements
    • Literature Search for Purkinje Cell-Specific References to Gene Expression
    • Extract Promoter Regions of these Genes
    • Look for Overrepresented Common Elements
  • Overrepresentation in Literature
    • Look for Transcription Factors, Gene names
    • References to Cell Type (Purkinje Cells), Region (Cerebellum), disease
    • PubMed
      • look for occurrence vs. background occurrence in abstracts
    • OMIM
      • links to PubMed
    • co-occurrence (à la oPossum2)
      • which things tend to occur together
      • clusters of related occurrences
  • Motif Overrepresentation
    • already being worked on by Shannon and others
  • Quantitative Information
    • SAGE data
    • weight by SAGE tag
    • or use SAGE Categories
  • Relationships
    • Interaction Databases
    • ChIP-chip data
    • phosphorylation, other related biological changes
    • related species - human, mouse
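
The "occurrence vs. background occurrence in abstracts" idea above can be sketched as a one-sided hypergeometric test. This is only one possible choice of statistic, not necessarily the one these notes have in mind. The function names `count_mentions` and `hypergeom_pvalue` are hypothetical, and the case-insensitive substring matching is a deliberate simplification.

```python
from math import comb


def count_mentions(abstracts, term):
    """Number of abstracts containing term (case-insensitive substring match)."""
    needle = term.lower()
    return sum(1 for text in abstracts if needle in text.lower())


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: abstracts in the background set
    K: background abstracts mentioning the term
    n: abstracts in the selected set (e.g. Purkinje cell references)
    k: selected abstracts mentioning the term
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the probability of seeing k, k+1, ..., upper mentions by chance.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Toy numbers, made up for illustration only.
    print(hypergeom_pvalue(k=3, n=4, K=3, N=10))
```

A small p-value suggests the term appears in the selected abstracts more often than the background rate would predict. With many terms tested at once, some multiple-testing correction would be needed before reading much into any single result.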


  • Requires one FoGS member to chair (Supervisor)
    • Wyeth and Francis
  • two other members
    • desirable that one member be from outside the program's faculty
    • recommended that members be at least Assistant Professors
  • Statistics
  • Medical Genetics
  • Steve Jones - will likely chair qualifying exam committee
  • CS
    • Kevin Murphy (has shown interest in bioinformatics; check with other machine learning contacts)
    • Holger Hoos (on sabbatical, out of town until Feb)
    • Anne Condon (bioinformatics, but not really related)

Check Faculty of Graduate Studies Relevant Pages?
