The above title is from a recent report from the National Academies that attempts to identify the opportunities and challenges presented by the rapid convergence of knowledge in mathematics and the life sciences.[1] If the interested reader can get past the rather stiff and formal language of the Executive Summary (and someone badly needs to inform the authors that this is usually done in less than a single page!), then the rest of the document is rather good reading: a nice review of biological progress, with suggestions for making maximally efficient use of mathematics without turning biologists into mathematicians. The document is far more than a 150-page review of modern biology, however, and buried within its pages are a number of highly important points.
Among the most important of the conclusions and recommendations are those elaborated in the last chapter of the report, Crosscutting Themes. The two ideas the authors consider of prime importance are that:
1) the coming interaction between mathematics and biology must be driven by biological considerations, and
2) we must not neglect the areas of mathematical research that “cut across levels of biological organization, emerging and re-emerging in diverse biological contexts.”
Because much space is given to the small n/large P problem, especially in the context of genetics in general and microarray studies in particular, I naturally segue to this month’s rant.
During the past 15 or so years, high-throughput methodologies have been developed to address problems such as finding patterns in gene expression data. The current technique involves microarray technology, whereby genetic material is interrogated by biochemical probes to detect changes in the expression levels of genes. These changes will hopefully mirror overt changes seen in cells due to the actions of the proteins coded for by the genetic material.
What the researcher is ultimately presented with is a mountain of data derived from few samples (the ‘n’ above) being described by many parameters (the ‘P’ above, in this case the genes). To handle this data, researchers originally wrote bits of computer code to grossly digest the data and do a few routine analyses.
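To make the small n/large P difficulty concrete, here is a minimal sketch in Python. It is not taken from the report; the function name, the group size of three, the 5,000 simulated genes, and the 0.05 threshold are all illustrative assumptions. It generates pure noise for two small groups of samples, runs an ordinary pooled-variance t-test on each gene, and counts how many genes look "significant" purely by chance.

```python
import random
import statistics


def count_chance_hits(n_per_group=3, n_genes=5000, t_crit=2.776, seed=0):
    """Count genes that pass a two-sample t-test on pure noise.

    All names and defaults are hypothetical. t_crit = 2.776 is the
    two-sided 0.05 critical value for 4 degrees of freedom, so it only
    matches the default of 3 samples per group.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        # No real effect: both groups come from the same distribution.
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        # Pooled-variance t statistic for two groups of equal size.
        pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2.0
        std_err = (pooled_var * 2.0 / n_per_group) ** 0.5
        t_stat = (statistics.mean(a) - statistics.mean(b)) / std_err
        if abs(t_stat) > t_crit:
            hits += 1
    return hits


hits = count_chance_hits()
print(f"{hits} of 5000 noise-only genes pass at the 0.05 level")
```

With no true differences anywhere in the data, about 5% of the genes (on the order of 250 here) should still clear the threshold, which is why uncorrected gene-by-gene testing on a handful of samples is so treacherous.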
Enter the statisticians!
The need for a careful examination of the data, standardization techniques, and modeling of error every step of the way led to numerous interesting ‘interactions’ between the biologists and the statisticians. True, geneticists and statisticians have had a long and profitable history of collaboration, but that was with large n/small P datasets.
Enter bioinformatics.
It looks like we know more than we really do! Recent papers are too complex, and the data are too hard to model. Math must model highly complex biology!
Lest the tone of the present dissertation be taken as too negative, we will end with the upbeat and enlightening conclusion from the National Research Council report.
“As this and many other stories emphasize, applications of the mathematical sciences to the biosciences span an immense conceptual range, even when one considers only one facet of the biological enterprise. No one scientist, mathematical or biological specialty, research program, or funding agency can span the entire range. Instead, the integration of diverse skills and perspectives must be the overriding goal. In this report, the committee seeks to encourage such integration by putting forward a set of broad principles that it regards as essential to the health of one of the most exciting and promising interdisciplinary frontiers in 21st century science.”
Reference
1. Mathematics and 21st Century Biology. The National Academies Press, 500 Fifth Street, N.W., Lockbox 285, Washington, DC 20055 (2005). www.nap.edu.
John Wass is a statistician with GPRD Pharmacogenetics, Abbott Laboratories. He can be reached at [email protected].