Story posted August 10, 2007
Imagine plotting the surface of Earth to within 100 feet of detail. That's exactly what NASA did in 2000 when the agency recorded digital images of 80 percent of the planet's surface with a resolution of 98.4 feet. With the aid of recent technologies, we now have the potential to obtain real-time images of the entire Earth down to a resolution of 3.3 feet.
Once you get beyond the "Gee whiz!" factor of using Google Earth to find a satellite image of your old pick-up truck parked in your driveway, it's almost a burden to have this much information at your fingertips. It requires the power of a supercomputer to meaningfully store, manipulate or analyze that much data. Most of the Geographic Information Systems (GIS) used to plot maps are not up to the task.
With help from some Bowdoin student researchers, Assistant Professor of Computer Science Laura Toma and her colleagues from other universities have been creating algorithms that manage geographic data to reduce computation time, essentially giving a desktop the power of a supercomputer.
"Our work deals with real problems on real data sets that practitioners face," she says. "It tries to solve them in the most efficient ways, which will have impact on other disciplines. This is not something you publish that only theoreticians care about. The practical work is really important to justify the theory."
A computer stores data in two places: on its hard drive and in its RAM (random access memory). The hard drive is much larger (easily 100 gigabytes) and much cheaper, so that's the best place to dump huge amounts of data. The RAM is critical, however, because it's where you take the data to work with it.
A computer's "virtual memory manager" chooses blocks of data to move back and forth between the hard drive and the RAM — even if you need just one piece of data, a whole block goes with it. Transferring these data blocks takes time, and creates a bottleneck in the program's calculating speed. The best way to speed things up is to store and use data in an efficient order to minimize the number of blocks you need to transfer, and the number of times you need to transfer each one.
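To make the cost of block transfers concrete, here is a minimal sketch in Python. It is an illustration only, not code from Toma's research: the grid size, block size, cache size and the least-recently-used eviction rule are all assumptions chosen for the example. It counts how many blocks would have to be loaded when the same grid of data is visited in two different orders.

```python
from collections import OrderedDict


def count_block_transfers(accesses, block_size, cache_blocks):
    """Count block loads for a sequence of item indices.

    Assumes a simple least-recently-used (LRU) cache that can hold
    `cache_blocks` blocks in RAM at once (a hypothetical model, not a
    real virtual memory manager).
    """
    cache = OrderedDict()  # block number -> True, ordered by recency
    transfers = 0
    for index in accesses:
        block = index // block_size  # the block that holds this item
        if block in cache:
            cache.move_to_end(block)  # mark as most recently used
        else:
            transfers += 1  # the whole block must be fetched from disk
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return transfers


# Hypothetical sizes: a 64 x 64 grid stored row by row,
# 64 items per block, and room for 8 blocks in RAM.
ROWS, COLS = 64, 64
BLOCK_SIZE = 64
CACHE_BLOCKS = 8

# Visit every cell once, either along rows or down columns.
row_order = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_order = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

row_transfers = count_block_transfers(row_order, BLOCK_SIZE, CACHE_BLOCKS)
col_transfers = count_block_transfers(col_order, BLOCK_SIZE, CACHE_BLOCKS)

print("row order:   ", row_transfers, "block transfers")
print("column order:", col_transfers, "block transfers")
```

Under these assumptions, visiting the data in the order it is stored loads each block once, while visiting it column by column reloads a block for every single item. Both orders do the same amount of calculation; only the number of transfers differs.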
After early development as mapping technology, GIS has since driven ever more complex systems, such as the Global Positioning System (GPS), a worldwide satellite-based navigation system. The layers of information created by these systems can quickly create dizzying computing challenges for those who wish to store, analyze or manipulate massive terrain data.
To explain the scope of the computing challenge, Toma refers to an image that was taken of parts of Washington State. At a resolution of 33 feet, the image generated one billion points of data, each requiring at least 16 bytes of RAM (random access memory) to process. That's a total of 16 gigabytes of RAM. Even a state-of-the-art desktop computer provides only two gigabytes of RAM.
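The arithmetic behind that comparison can be checked in a few lines of Python. The figures come from the paragraph above; the only assumption added here is that a gigabyte is counted as one billion bytes.

```python
POINTS = 1_000_000_000   # data points in the Washington State image
BYTES_PER_POINT = 16     # minimum RAM needed to process each point
DESKTOP_RAM_GB = 2       # RAM in a state-of-the-art desktop (2007)
GB = 10**9               # assumption: decimal gigabytes

required_gb = POINTS * BYTES_PER_POINT / GB
shortfall = required_gb / DESKTOP_RAM_GB

print(f"RAM required: {required_gb:.0f} GB")
print(f"That is {shortfall:.0f} times what the desktop provides")
```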
Toma is developing approaches that optimize a desktop's input/output — in addition to its central processing unit (CPU) — to give it the performance improvements needed for applications that handle such large data sets. (See box, right.)
The first program she developed, called TerraFlow, tracks water flow and flood patterns. On an image containing a mere 200 million points of data, it reduced the computation time from two weeks to three hours.
"Every GIS has its own software for handling water flow data," Toma said. "Ours is the only system that can handle very large data sets."
Toma then developed a program called IOviewshed, working with Bowdoin student researcher Yi Zhuang '08. The program uses elevation data to calculate which parts of a terrain are visible from any given point. Zhuang's work was included in Toma's presentation at a professional conference and is part of a professional paper.
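For readers curious what a visibility calculation involves, the following Python sketch shows one simple way to do it on a small elevation grid. It is not the IOviewshed algorithm, which is designed for data too large to fit in RAM; this version assumes the whole grid sits in memory, and the function names, the sampling scheme and the sample elevations are invented for illustration.

```python
def is_visible(grid, viewer, target, viewer_height=0.0):
    """Return True if the target cell can be seen from the viewer cell.

    Walks the straight line between the two cells and checks whether
    any terrain in between rises above the line of sight.
    """
    (r0, c0), (r1, c1) = viewer, target
    if viewer == target:
        return True
    eye = grid[r0][c0] + viewer_height
    steps = max(abs(r1 - r0), abs(c1 - c0))
    # Slope of the sight line from the eye to the target.
    target_slope = (grid[r1][c1] - eye) / steps
    for i in range(1, steps):
        # Nearest grid cell to the i-th sample along the line.
        r = r0 + round(i * (r1 - r0) / steps)
        c = c0 + round(i * (c1 - c0) / steps)
        if (grid[r][c] - eye) / i > target_slope:
            return False  # terrain here blocks the view
    return True


def viewshed(grid, viewer, viewer_height=0.0):
    """Return a grid of booleans marking the cells visible from viewer."""
    return [
        [is_visible(grid, viewer, (r, c), viewer_height)
         for c in range(len(grid[0]))]
        for r in range(len(grid))
    ]


# A made-up elevation profile: one row of terrain heights.
profile = [[0, 1, 5, 2, 12, 1]]
print(viewshed(profile, viewer=(0, 0)))
# prints [[True, True, True, False, True, False]]
```

In the sample profile, the dip behind the first ridge is hidden from a viewer at the left end, while the taller peak beyond it remains in view.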
This summer, computer-science major Zhuang is working with Toma on her current project, undertaken with colleagues at the University of Eindhoven in the Netherlands. They are developing a theoretical algorithm to analyze map overlays — calculating each point where map layers, such as streams and streets, intersect.
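The basic operation in a map overlay, finding where a line in one layer crosses a line in another, can be sketched in Python as follows. This is a brute-force illustration that compares every pair of segments in memory; it is not the algorithm Toma and her colleagues are developing, and the function names and the sample stream and street coordinates are hypothetical.

```python
def segment_intersection(p1, p2, p3, p4):
    """Return the point where segment p1-p2 crosses segment p3-p4.

    Returns None if the segments do not meet. Parallel and collinear
    segments are also reported as None in this simplified sketch.
    """
    (x1, y1), (x2, y2) = p1, p2
    (x3, y3), (x4, y4) = p3, p4
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None  # parallel or collinear
    # t and u give the position of the crossing along each segment.
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def overlay(layer_a, layer_b):
    """Return every crossing point between two layers of segments."""
    crossings = []
    for a_start, a_end in layer_a:
        for b_start, b_end in layer_b:
            point = segment_intersection(a_start, a_end, b_start, b_end)
            if point is not None:
                crossings.append(point)
    return crossings


# Made-up layers: one stream and two streets.
streams = [((0, 0), (4, 4))]
streets = [((0, 4), (4, 0)), ((5, 0), (5, 4))]
print(overlay(streams, streets))
# prints [(2.0, 2.0)]
```

Comparing every pair is manageable for a handful of segments, but the number of comparisons grows with the product of the two layer sizes, which is one reason large overlays call for more careful algorithms.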
It is a time-intensive process of programming, says Zhuang, who is testing the theory. "The algorithm is only a skeleton," he says. "I'm putting flesh on it so the body can move."
If he is successful, he says, the work will double as his honors thesis and he hopes to publish it as a professional paper.
"On the last paper we published [IOviewshed], I wrote the experimental results section," he said. "This time, my professor wants me to write the whole paper."
"Yi is definitely very smart, very motivated and driven," says Toma. "And it's rare that a student works on two projects during undergraduate school. I'm very happy that he wanted to start working with me after his sophomore year."