IBM Journal of Research and Development

Systems Biology   Volume 50, Number 6, 2006

Machine learning methods for transcription data integration - References

by D. T. Holloway,
M. A. Kon,
and C. DeLisi
References

  1. E. M. Conlon, X. S. Liu, J. D. Lieb, and J. S. Liu, “Integrating Regulatory Motif Discovery and Genome-Wide Expression Analysis,” Proc. Natl. Acad. Sci. 100, No. 6, 3339–3344 (2003).
  2. S. Keles, M. J. van der Laan, and C. Vulpe, “Regulatory Motif Finding by Logic Regression,” Bioinformatics 20, No. 16, 2799–2811 (2004).
  3. W. Wang, J. M. Cherry, D. Botstein, and H. Li, “A Systematic Approach to Reconstructing Transcription Networks in Saccharomyces cerevisiae,” Proc. Natl. Acad. Sci. 99, No. 26, 16893–16898 (2002).
  4. H. Bussemaker, H. Li, and E. Siggia, “Regulatory Element Detection Using Correlation with Expression,” Nature Genetics 27, No. 2, 167–171 (2001).
  5. K. Birnbaum, P. N. Benfey, and D. E. Shasha, “cis Element/Transcription Factor Analysis (cis/TF): A Method for Discovering Transcription Factor/cis Element Relationships,” Genome Res. 11, No. 9, 1567–1573 (2001).
  6. Z. Zhu, Y. Pilpel, and G. Church, “Computational Identification of Transcription Factor Binding Sites via a Transcription-Factor-Centric-Clustering (TFCC) Algorithm,” J. Molec. Biol. 318, No. 2, 71–81 (2002).
  7. M. Pritsker, Y.-C. Liu, M. A. Beer, and S. Tavazoie, “Whole-Genome Discovery of Transcription Factor Binding Sites by Network-Level Conservation,” Genome Res. 14, No. 1, 99–108 (2004).
  8. O. Elemento and S. Tavazoie, “Fast and Systematic Genome-Wide Discovery of Conserved Regulatory Elements Using a Non-Alignment Based Approach,” Genome Biol. 6, No. 2, R18 (2005).
  9. M. Tompa, N. Li, T. L. Bailey, G. M. Church, B. De Moor, E. Eskin, A. V. Favorov, M. C. Frith, Y. Fu, W. J. Kent, V. J. Makeev, A. A. Mironov, W. S. Noble, G. Pavesi, G. Pesole, M. Regnier, N. Simonis, S. Sinha, G. Thijs, J. van Helden, M. Vandenbogaert, Z. Weng, C. Workman, C. Ye, and Z. Zhu, “Assessing Computational Tools for the Discovery of Transcription Factor Binding Sites,” Nature Biotechnol. 23, No. 1, 137–144 (2005).
  10. G. D. Stormo, “DNA Binding Sites: Representation and Discovery,” Bioinformatics 16, No. 1, 16–23 (2000).
  11. C. T. Workman and G. D. Stormo, “ANN-Spec: A Method for Discovering Transcription Factor Binding Sites with Improved Specificity,” Proceedings of the Pacific Symposium on Biocomputing, 2000, pp. 467–478.
  12. T. D. Schneider, G. D. Stormo, L. Gold, and A. Ehrenfeucht, “Information Content of Binding Sites on Nucleotide Sequences,” J. Molec. Biol. 188, No. 3, 415–431 (1986).
  13. T. Schneider and R. Stephens, “Sequence Logos: A New Way to Display Consensus Sequences,” Nucl. Acids Res. 18, No. 20, 6097–6100 (1990).
  14. M. C. Frith, M. C. Li, and Z. Weng, “Cluster-Buster: Finding Dense Clusters of Motifs in DNA Sequences,” Nucl. Acids Res. 31, No. 13, 3666–3668 (2003).
  15. B. P. Berman, Y. Nibu, B. D. Pfeiffer, P. Tomancak, S. E. Celniker, M. Levine, G. M. Rubin, and M. B. Eisen, “Exploiting Transcription Factor Binding Site Clustering to Identify Cis-Regulatory Modules Involved in Pattern Formation in the Drosophila Genome,” Proc. Natl. Acad. Sci. 99, No. 2, 757–762 (2002).
  16. D. Dinakarpandian, V. Raheja, S. Mehta, E. Schuetz, and P. Rogan, “Tandem Machine Learning for the Identification of Genes Regulated by Transcription Factors,” BMC Bioinformatics 6, No. 1, 204 (2005).
  17. M. Rebeiz, N. L. Reeves, and J. W. Posakony, “SCORE: A Computational Approach to the Identification of Cis-Regulatory Modules and Target Genes in Whole-Genome Sequence Data,” Proc. Natl. Acad. Sci. 99, No. 15, 9888–9893 (2002).
  18. V. Matys, O. V. Kel-Margoulis, E. Fricke, I. Liebich, S. Land, A. Barre-Dirrie, I. Reuter, D. Chekmenev, M. Krull, K. Hornischer, N. Voss, P. Stegmaier, B. Lewicki-Potapov, H. Saxel, A. E. Kel, and E. Wingender, “TRANSFAC® and Its Module TRANSCompel®: Transcriptional Gene Regulation in Eukaryotes,” Nucl. Acids Res. 34, No. 1, D108–D110 (2006).
  19. C. Lemer, E. Antezana, F. Couche, F. Fays, X. Santolaria, R. S. Janky, Y. Deville, J. Richelle, and S. J. Wodak, “The aMAZE LightBench: A Web Interface to a Relational Database of Cellular Processes,” Nucl. Acids Res. 32, D443–D448 (2004).
  20. C. T. Harbison, D. B. Gordon, T. I. Lee, N. J. Rinaldi, K. D. Macisaac, T. W. Danford, N. M. Hannett, J.-B. Tagne, D. B. Reynolds, J. Yoo, E. G. Jennings, J. Zeitlinger, D. K. Pokholok, M. Kellis, P. A. Rolfe, K. T. Takusagawa, E. S. Lander, D. K. Gifford, E. Fraenkel, and R. A. Young, “Transcriptional Regulatory Code of a Eukaryotic Genome,” Nature 431, No. 7004, 99–104 (2004).
  21. B. Ren, F. Robert, J. J. Wyrick, O. Aparicio, E. G. Jennings, I. Simon, J. Zeitlinger, J. Schreiber, N. Hannett, E. Kanin, T. L. Volkert, C. J. Wilson, S. P. Bell, and R. A. Young, “Genome-Wide Location and Function of DNA Binding Proteins,” Science 290, No. 5500, 2306–2309 (2000).
  22. J. Qian, J. Lin, N. M. Luscombe, H. Yu, and M. Gerstein, “Prediction of Regulatory Networks: Genome-Wide Identification of Transcription Factor Targets from Gene Expression Data,” Bioinformatics 19, No. 15, 1917–1926 (2003).
  23. M. A. Beer and S. Tavazoie, “Predicting Gene Expression from Sequence,” Cell 117, No. 2, 185–198 (2004).
  24. D. Holloway, M. Kon, and C. DeLisi, “Integrating Genomic Data to Predict Transcription Factor Binding,” Proc. Workshop Genome Informatics 16, No. 1, 83–94 (2005).
  25. T. Jaakkola, M. Diekhans, and D. Haussler, “Using the Fisher Kernel Method to Detect Remote Protein Homologies,” Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology, August 6–10, 1999, pp. 149–158.
  26. S. Hua and Z. Sun, “A Novel Method of Protein Secondary Structure Prediction with High Segment Overlap Measure: Support Vector Machine Approach,” J. Molec. Biol. 308, No. 2, 397–407 (2001).
  27. S. Hua and Z. Sun, “Support Vector Machine Approach for Protein Subcellular Localization Prediction,” Bioinformatics 17, No. 8, 721–728 (2001).
  28. M. Wang, J. Yang, and K.-C. Chou, “Using String Kernel to Predict Signal Peptide Cleavage Site Based on Subsite Coupling Model,” Amino Acids 28, No. 4, 395–402 (2005).
  29. T. S. Furey, N. Cristianini, N. Duffy, D. W. Bednarski, M. Schummer, and D. Haussler, “Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data,” Bioinformatics 16, No. 10, 906–914 (2000).
  30. P. Pavlidis and W. S. Noble, “Gene Functional Classification from Heterogeneous Data,” RECOMB Conference Proceedings, 2001, pp. 249–255.
  31. A. Zien, G. Rätsch, S. Mika, B. Schölkopf, T. Lengauer, and K.-R. Müller, “Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites,” Bioinformatics 16, No. 9, 799–807 (2000).
  32. G. Lanckriet, N. Cristianini, M. Jordan, and W. S. Noble, “A Statistical Framework for Genomic Data Fusion,” Bioinformatics 20, No. 16, 2626–2635 (2004).
  33. B. Schölkopf and A. J. Smola, Learning with Kernels, MIT Press, Cambridge, MA, 2002.
  34. P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Addison-Wesley Publishing Co., Boston, MA, 2005.
  35. D. Holloway, M. Kon, and C. DeLisi, “Machine Learning and Data Combination for Regulatory Pathway Prediction,” Synthetic & Syst. Biol. (2006), submitted.
  36. T. I. Lee, N. J. Rinaldi, F. Robert, D. T. Odom, Z. Bar-Joseph, G. K. Gerber, N. M. Hannett, C. T. Harbison, C. M. Thompson, I. Simon, J. Zeitlinger, E. G. Jennings, H. L. Murray, D. B. Gordon, B. Ren, J. J. Wyrick, J.-B. Tagne, T. L. Volkert, E. Fraenkel, D. K. Gifford, and R. A. Young, “Transcriptional Regulatory Networks in Saccharomyces cerevisiae,” Science 298, No. 5594, 799–804 (2002).
  37. P. Hodges, A. McKee, B. Davis, W. Payne, and J. Garrels, “The Yeast Proteome Database (YPD): A Model for the Organization and Presentation of Genome-Wide Functional Data,” Nucl. Acids Res. 27, No. 1, 69–73 (1999).
  38. R. Young, “Transcriptional Regulatory Network”; see http://staffa.wi.mit.edu/cgi-bin/young_public/navframe.cgi?s=17&f=evidence.
  39. M. Kellis et al., “Yeast Comparative Genomics”; see http://www.broad.mit.edu/annotation/fungi/comp_yeasts/ (2003).
  40. J. van Helden, “Regulatory Sequence Analysis Tools,” Nucl. Acids Res. 31, No. 13, 3593–3596 (2003).
  41. E. Birney, T. D. Andrews, P. Bevan, M. Caccamo, Y. Chen, L. Clarke, G. Coates, J. Cuff, V. Curwen, T. Cutts, T. Down, E. Eyras, X. M. Fernandez-Suarez, P. Gane, B. Gibbins, J. Gilbert, M. Hammond, H.-R. Hotz, V. Iyer, K. Jekosch, A. Kahari, A. Kasprzyk, D. Keefe, S. Keenan, H. Lehvaslaiho, G. McVicker, C. Melsopp, P. Meidl, E. Mongin, R. Pettett, S. Potter, G. Proctor, M. Rae, S. Searle, G. Slater, D. Smedley, J. Smith, W. Spooner, A. Stabenau, J. Stalker, R. Storey, A. Ureta-Vidal, K. C. Woodwark, G. Cameron, R. Durbin, A. Cox, T. Hubbard, and M. Clamp, “An Overview of Ensembl,” Genome Res. 14, No. 5, 925–928 (2004).
  42. R. L. Tatusov and D. J. Lipman, “National Center for Biotechnology Information, NCBI Toolkit”; see http://www.ncbi.nlm.nih.gov/.
  43. A. Smit, R. Hubley, and P. Green, “Repeatmasker Open 3.0”; see http://repeatmasker.org.
  44. S. Aerts, G. Thijs, B. Coessens, M. Staes, Y. Moreau, and B. De Moor, “Toucan: Deciphering the Cis-Regulatory Logic of Coregulated Genes,” Nucl. Acids Res. 31, No. 6, 1753–1764 (2003).
  45. C. Harbison, E. Fraenkel, and R. Young, “Matrices for Motifs”; see http://jura.wi.mit.edu/fraenkel/download/release_v24/final_set/Final_InTableS2_v24.motifs.
  46. E. Birney, D. Andrews, M. Caccamo, Y. Chen, L. Clarke, G. Coates, T. Cox, F. Cunningham, V. Curwen, T. Cutts, T. Down, R. Durbin, X. M. Fernandez-Suarez, P. Flicek, S. Graf, M. Hammond, J. Herrero, K. Howe, V. Iyer, K. Jekosch, A. Kahari, A. Kasprzyk, D. Keefe, F. Kokocinski, E. Kulesha, D. London, I. Longden, C. Melsopp, P. Meidl, B. Overduin, A. Parker, G. Proctor, A. Prlic, M. Rae, D. Rios, S. Redmond, M. Schuster, I. Sealy, S. Searle, J. Severin, G. Slater, D. Smedley, J. Smith, A. Stabenau, J. Stalker, S. Trevanion, A. Ureta-Vidal, J. Vogel, S. White, C. Woodwark, and T. J. P. Hubbard, “Ensembl 2006,” Nucl. Acids Res. 34, No. 1, D556–D561 (2006).
  47. J. E. Galagan, S. E. Calvo, K. A. Borkovich, E. U. Selker, N. D. Read, D. Jaffe, W. FitzHugh, L.-J. Ma, S. Smirnov, S. Purcell, B. Rehman, T. Elkins, R. Engels, S. Wang, C. B. Nielsen, J. Butler, M. Endrizzi, D. Qui, P. Ianakiev, D. Bell-Pedersen, M. A. Nelson, M. Werner-Washburne, C. P. Selitrennikoff, J. A. Kinsey, E. L. Braun, A. Zelter, U. Schulte, G. O. Kothe, G. Jedd, W. Mewes, C. Staben, E. Marcotte, D. Greenberg, A. Roy, K. Foley, J. Naylor, N. Stange-Thomann, R. Barrett, S. Gnerre, M. Kamal, M. Kamvysselis, E. Mauceli, C. Bielke, S. Rudd, D. Frishman, S. Krystofova, C. Rasmussen, R. L. Metzenberg, D. D. Perkins, S. Kroken, C. Cogoni, G. Macino, D. Catcheside, W. Li, R. J. Pratt, S. A. Osmani, C. P. C. DeSouza, L. Glass, M. J. Orbach, J. A. Berglund, R. Voelker, O. Yarden, M. Plamann, S. Seiler, J. Dunlap, A. Radford, R. Aramayo, D. O. Natvig, L. A. Alex, G. Mannhaupt, D. J. Ebbole, M. Freitag, I. Paulsen, M. S. Sachs, E. S. Lander, C. Nusbaum, and B. Birren, “The Genome Sequence of the Filamentous Fungus Neurospora crassa,” Nature 422, No. 6934, 859–868 (2003).
  48. R. Dean, “Fungal Genomics Laboratory at North Carolina State University, Broad Institute of MIT and Harvard”; see http://www.fungalgenomics.ncsu.edu and http://www.broad.mit.edu.
  49. P. Cliften, P. Sudarsanam, A. Desikan, L. Fulton, B. Fulton, J. Majors, R. Waterston, B. A. Cohen, and M. Johnston, “Finding Functional Features in Saccharomyces Genomes by Phylogenetic Footprinting,” Science 301, No. 5629, 71–76 (2003).
  50. M. Kellis, N. Patterson, M. Endrizzi, B. Birren, and E. S. Lander, “Sequencing and Comparison of Yeast Species to Identify Genes and Regulatory Elements,” Nature 423, No. 6937, 241–254 (2003).
  51. A. Halees, D. Leyfer, and Z. Weng, “PromoSer: A Large-Scale Mammalian Promoter and Transcription Start Site Identification Service,” Nucl. Acids Res. 31, No. 13, 3554–3559 (2003).
  52. P. Pavlidis, I. Wapinski, and W. S. Noble, “Support Vector Machine Classification on the Web,” Bioinformatics 20, No. 4, 586–587 (2004).
  53. J. Ihmels, S. Bergman, and N. Barkai, “Naama Barkai Group”; see http://barkai-serv.weizmann.ac.il/GroupPage/.
  54. The Mathworks, “MATLAB: MATrix LABoratory”; see http://www.mathworks.com/.
  55. J. Weston, A. Elisseeff, G. Bakir, and F. Sinz, “SPIDER: Object Oriented Machine Learning Library”; see http://www.kyb.tuebingen.mpg.de/bs/people/spider/.
  56. J. C. Platt, “Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods,” in Advances in Large Margin Classifiers, P. Bartlett, B. Schölkopf, D. Schuurmans, and A. Smola, Eds., MIT Press, Cambridge, MA, 2000.
  57. N. Simonis, S. J. Wodak, G. N. Cohen, and J. van Helden, “Combining Pattern Discovery and Discriminant Analysis to Predict Gene Co-Regulation,” Bioinformatics 20, No. 15, 2370–2379 (2004).
  58. F. Gao, B. Foat, and H. Bussemaker, “Defining Transcriptional Networks Through Integrative Modeling of mRNA Expression and Transcription Factor Binding Data,” BMC Bioinformatics 5, No. 1, 31 (2004).
  59. D. Goodsell and R. Dickerson, “Bending and Curvature Calculations in B-DNA,” Nucl. Acids Res. 22, No. 24, 5497–5503 (1994).
  60. S. Parker, J. Greenbaum, G. Benson, and T. D. Tullius, “Structure-Based DNA Sequence Alignment,” poster presented at the 5th International Workshop in Bioinformatics and Systems Biology, Berlin, Germany, August 2005.
  61. B. Balasubramanian, W. K. Pogozelski, and T. D. Tullius, “DNA Strand Breaking by the Hydroxyl Radical Is Governed by the Accessible Surface Areas of the Hydrogen Atoms of the DNA Backbone,” Proc. Natl. Acad. Sci. 95, No. 17, 9738–9743 (1998).

