The University of Pennsylvania’s Abramson Family Cancer Research Institute has adopted SPSS’s integrated data mining and text mining solution. The institute will use Clementine and LexiQuest Mine to analyze disease processes, with the objective of improving patient outcomes.
“The goal of our research is to understand how the interaction of genetic and environmental factors turns a genetic predisposition into active disease,” said Dr. Michael N. Liebman, the institute’s director of Computational Biology and BioMedical Informatics. “Based on our findings, we want to develop a methodology to improve patient care and gain a better understanding of the process of developing new diagnostics and therapies.”
Advances in analytical software have redefined research methodologies for the life sciences. As scientists have leveraged new technologies for capturing and evaluating large volumes of data, the speed and accuracy of their research have increased. Dr. Liebman selected SPSS’s data mining and text mining technology because the ability to analyze structured data and text together provides the best chance for advances in research.
Dr. Liebman and his colleagues are using the Clementine data mining workbench to help identify early risk and clinical factors that could lead to more effective diagnosis and treatment of cancer. The research focuses on discovering the relationships between the factors that determine the onset of a disease.
By using computational models of these relationships, the institute can assess the impact of different therapeutic interventions. As Dr. Liebman’s team discovers successful processes for treating cancer, it will convert these methods into generalized applications that other researchers can use. “We want to build an infrastructure that will enable us to automate the process and apply it to other diseases,” said Dr. Liebman.
“The combination of Clementine’s modeling techniques and its outstanding flexibility for integrating different types of data — biochemical, genetic, clinical, and family history data, among others — makes it the perfect solution for scientific research, where predictive analysis can be utilized for pattern discovery and recognition,” said Dr. Petra Scheffer, senior marketing manager for SPSS Inc.’s science practice.
SPSS Inc.’s text mining solution, LexiQuest Mine, is being used to quickly extract and analyze important concepts contained in thousands of scientific articles and patents in order to better understand the relationships between diseases. Because LexiQuest Mine presents its results graphically, it makes it easier to discover obscure or emerging patterns hidden within massive amounts of information. Dr. Liebman’s long-term goal is to use LexiQuest Mine to build an ontology of disease progression.
“Previously, we could not process all existing disease research literature in a uniform manner,” explained Dr. Liebman. “Through its visual-map interface, LexiQuest Mine provides us with insight that goes beyond standard search tools. By leveraging text mining, we’re confident that we will discover previously overlooked relationships and make observations about issues that haven’t been fully explored in the past.”
With the integration of Clementine and LexiQuest Mine, Dr. Liebman and his researchers can access text mining from the Clementine user interface and incorporate text data into the regular flow of the data mining process. The combination will help them uncover unknown patterns and associations and make better predictions based on all of the available information. According to Dr. Liebman, “Major research advances won’t be coming from just genomics; they’ll arise from the clinical side, as well.”