B. Robson and R. Mushlin
(2004), "Genomic Messaging System for Information-Based Personalized
Medicine with Clinical and Proteome Research Applications",
J. Proteome Res. In Press.
Abstract
The convergence of clinical medicine and the Life Sciences,
commencing with opportunities in clinical trials and clinically
linked medical research, presents many novel challenges. The Genomic
Messaging System (GMS) described here was originally developed as
a tool for assembling clinical genomic records of individual and
collective patients, and was then generalized to become a flexible
workflow component that will link clinical records to a variety
of computational biology research tools, for research and ultimately
for a more personalized, focused, and preventative healthcare system.
Prominent among the applications linked are protein science applications,
including the rapid automated modeling of patient proteins with
their individual structural polymorphisms. In an initial study,
GMS formed the basis of a fully automated system for modeling patient
proteins with structural polymorphisms as a basis for drug selection
and ultimately design on an individual patient basis.
B. Robson and R. Mushlin (2004), "The
Dragon on the Gold: Myths and Realities for Data Mining in Biotechnology
using Digital and Molecular Libraries", J. Proteome Res. In Press.
Abstract
To develop bioscience and personalized medicine in the
post-genomic era, the biggest problem may be how to extract knowledge
from the rich libraries of biomedical data. A particular dragon
protects the gold therein: the dragon is the “curse of dimensionality”
and its formidable fire weapon, which is burning researchers, is
the “combinatorial explosion”. This arises because many genomic,
proteomic, clinical, and lifestyle factors may interact that cannot
necessarily be considered on a simple pairwise or additive basis.
A suggested theoretical solution, or at least “road map”, that ameliorates
management of these problems borrows from several disciplines. It
is undertaken also in the hope that it might lead to research with
broader impact on several unresolved issues in biotechnology: conversely,
mathematical understanding of processes involving molecular libraries,
such as cDNA libraries and DNA in the living cell itself, may open
opportunities to use biotechnology to construct nanotechnological
storage and query systems.
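The "combinatorial explosion" named in this abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it counts how many distinct factor combinations exist once interactions beyond the pairwise level are allowed. The function names and the choice of 20 factors are assumptions made for the example.

```python
from math import comb


def interaction_counts(n_factors: int, max_order: int) -> dict[int, int]:
    """Count the distinct factor combinations of each order 1..max_order."""
    return {k: comb(n_factors, k) for k in range(1, max_order + 1)}


def all_nonempty_subsets(n_factors: int) -> int:
    """Count every non-empty combination of factors, of any order."""
    return 2 ** n_factors - 1


# Hypothetical study with 20 genomic, clinical, and lifestyle factors.
counts = interaction_counts(20, 4)
total = all_nonempty_subsets(20)
print(counts)  # pairs number 190, but quadruples already number 4845
print(total)   # 1048575 combinations when every order is considered
```

Restricting attention to pairwise terms keeps the count quadratic in the number of factors, whereas admitting every order of interaction doubles the count with each added factor.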
S. Weiss and N. Indurkhya (2000), "Lightweight
rule induction", Proceedings of the Seventeenth International
Conference on Machine Learning, pp. 1135-1142.
Abstract
A lightweight rule induction method is described that generates
compact Disjunctive Normal Form (DNF) rules. Each class has an equal
number of unweighted rules. A new example is classified by applying
all rules and assigning the example to the class with the most satisfied
rules. The induction method attempts to minimize the training error
with no pruning. An overall design is specified by setting limits
on the size and number of rules. During training, cases are adaptively
weighted using a simple cumulative error method. The induction method
is nearly linear in time relative to an increase in the number of
induced rules or the number of cases. Experimental results on large
benchmark data sets demonstrate that predictive performance can
rival the best reported results in the literature.
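The classification step described in this abstract (apply all rules, then assign the class with the most satisfied rules) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rule representation, the feature names "a" and "b", and the tie-breaking choice are all assumptions, and the induction and case-weighting steps are not shown.

```python
from typing import Callable, Dict, List

Example = Dict[str, float]
Condition = Callable[[Example], bool]
Conjunction = List[Condition]  # satisfied when every condition holds
DNFRule = List[Conjunction]    # satisfied when any conjunction holds


def rule_satisfied(rule: DNFRule, x: Example) -> bool:
    """Evaluate one DNF rule: a disjunction of conjunctions."""
    return any(all(cond(x) for cond in conj) for conj in rule)


def classify(rules_by_class: Dict[str, List[DNFRule]], x: Example) -> str:
    """Return the class whose unweighted rules are satisfied most often."""
    votes = {
        label: sum(rule_satisfied(rule, x) for rule in rules)
        for label, rules in rules_by_class.items()
    }
    best = max(votes.values())
    # Assumption: ties go to the first class listed; the abstract does not say.
    return next(label for label in rules_by_class if votes[label] == best)


# Hypothetical rule set with two rules per class (equal numbers per class).
rules_by_class = {
    "pos": [
        [[lambda e: e["a"] > 5]],
        [[lambda e: e["b"] < 2, lambda e: e["a"] > 3]],
    ],
    "neg": [
        [[lambda e: e["a"] <= 5]],
        [[lambda e: e["b"] >= 2]],
    ],
}

print(classify(rules_by_class, {"a": 6, "b": 1}))  # pos
print(classify(rules_by_class, {"a": 4, "b": 3}))  # neg
```

Because every rule carries the same weight, prediction reduces to counting satisfied rules per class, which is what keeps the method "lightweight" at classification time.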