Geographic Pattern Visualization May Predict Disease Spread
Disease statistics buried within patient records or detailed in newspaper clippings can be sorted with a new toolkit and organized to depict geographic patterns, allowing the discovery of trends that were previously overlooked. The GeoViz Toolkit combines text mining with geographical mapping, allowing users to search publicly available data to identify and visualize patterns relevant to their own interests or concerns. The software allows someone with no programming experience to navigate the application, while also providing different components and analytical tools for experienced analysts.
“The use of interactive maps and graphs, combined with word search interfaces, can lead to greater insight into complex events such as the spread of Swine flu,” said Frank Hardisty, research associate, Penn State GeoVISTA Center. “Potential applications range from research in public health — infectious disease dynamics, cancer etiology, surveillance and control — through analysis of socioeconomic and demographic data, to exploration of patterns of incidents related to terrorism or crime.”
Many sources for disease and crime statistics, newspaper articles for example, are in a semi-structured format that does not clearly present the data in a table or graph, but rather buries it within the text of the document. To obtain high-quality, relevant information from these documents, researchers use “text analytics” or “text mining,” allowing them to retrieve only applicable information, such as the date and description of a disease-related death, from the flood of information typically included in a newspaper clipping.
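As a rough illustration of this kind of extraction, the Python sketch below pulls a date and the surrounding sentence out of free text whenever the sentence mentions a death. It is not the GeoViz Toolkit's own method. The sample clipping, the keyword list, and the function name `extract_events` are all assumptions made up for this example, and real text-mining pipelines are considerably more sophisticated.

```python
import re
from datetime import datetime

# Hypothetical clipping text, invented for illustration only.
CLIPPING = (
    "County health officials said on March 3, 2010 that a local resident "
    "died of complications from influenza. The weather was mild that week."
)

# Matches dates written like "March 3, 2010".
DATE_PATTERN = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|September|"
    r"October|November|December) \d{1,2}, \d{4}\b"
)

# Assumed keyword list; a real system would use a richer vocabulary.
DEATH_TERMS = ("died", "death", "fatal")


def extract_events(text):
    """Return (date, sentence) pairs for sentences that mention a death and a date."""
    events = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        if not any(term in lowered for term in DEATH_TERMS):
            continue  # skip sentences unrelated to a death
        match = DATE_PATTERN.search(sentence)
        if match:
            date = datetime.strptime(match.group(0), "%B %d, %Y").date()
            events.append((date, sentence))
    return events


if __name__ == "__main__":
    for date, sentence in extract_events(CLIPPING):
        print(date.isoformat(), "|", sentence)
```

Only the first sentence of the sample clipping is kept, because it is the only one that contains both a death-related keyword and a recognizable date; the sentence about the weather is discarded.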
“An example would be searching a database of H1N1 flu reports for ‘child’ or ‘children’ and seeing if there is spatial clustering in the relative frequency of those reports,” Hardisty told attendees at the 2010 Association of American Geographers Annual Meeting in Washington, D.C.
H1N1 data provided by RhizaLabs was used in a GeoViz query. Reports containing “child” or similar terms were mapped, and areas with a high frequency of such reports were highlighted. In general, areas with low population density exhibited a higher proportion of reports containing the search term.
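The relative-frequency calculation behind such a query can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GeoViz Toolkit's implementation: the reports, region names, and the function `term_share_by_region` are hypothetical, and the sketch stops at per-region proportions rather than testing for spatial clustering.

```python
import re
from collections import defaultdict

# Hypothetical reports as (region, text) pairs, invented for illustration only.
REPORTS = [
    ("Region A", "A child was hospitalized with flu symptoms."),
    ("Region A", "Two adults tested positive."),
    ("Region B", "Several children at one school fell ill."),
    ("Region B", "A child and a parent were treated."),
    ("Region B", "An adult case was confirmed."),
    ("Region C", "A traveler tested positive."),
]


def term_share_by_region(reports, terms):
    """Return {region: share of that region's reports mentioning any term}."""
    # Whole-word, case-insensitive match on any of the search terms.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    totals = defaultdict(int)
    hits = defaultdict(int)
    for region, text in reports:
        totals[region] += 1
        if pattern.search(text):
            hits[region] += 1
    return {region: hits[region] / totals[region] for region in totals}


if __name__ == "__main__":
    shares = term_share_by_region(REPORTS, ["child", "children"])
    for region, share in sorted(shares.items()):
        print(f"{region}: {share:.2f}")
```

Dividing by each region's total report count, rather than mapping raw counts, keeps heavily reported areas from dominating the map simply because they have more reports overall.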
“The hypothesis that this evokes is that rural states have proportionally more transmissions via children, while more densely populated places are more likely to experience other vectors of transmission,” said Hardisty.
GeoViz allows users to change the time period and location displayed, as well as how the data is viewed. The user can thus visualize how a disease spreads and determine how quickly it progresses from one area to the next.
Visual geographic analysis can identify locations that are more or less susceptible to certain disease, crime, or weather patterns, and researchers might link these occurrences with a cause or trigger. Using the GeoViz Toolkit could inform how people respond to or prevent these incidents.
“First, GeoViz methods can help first responders gain better situational awareness. Second, a better retroactive understanding of clustered patterns like disease incidence and public security incidents will lead to the development of effective control measures,” concluded Hardisty.
The Department of Homeland Security’s VACCINE initiative and the Gates Foundation Vaccine Modeling Initiative supported this work. The GeoViz Toolkit was developed under the leadership of Alan MacEachren, Director, GeoVISTA Center, Penn State.