The study of Geographic Information Systems (GIS) has been gaining popularity and attention in the computer science community. This is due in part to the increase in availability of data and the need for more efficient computations on this digitized data. Massive amounts of data are available from many sources like the USGS web-site and the NASA EOS project. The availability of this data is largely due to remote sensing projects like NASA's shuttle radar topography mission (SRTM) which mapped 80% of the Earth's land mass at 30 meter resolution amounting to roughly 10 terabytes of data. SRTM, USGS and other sites provide free access to terabytes worth of data.
Much work has been done in GIS with fast computations on moderately sized datasets, however, as terrain becomes more available at higher resolutions we find that many of these algorithms do not scale. This is because while they make efficient use of the CPU, they typically make the assumption that all data being processed can fit into memory; this assumption is obviously not true for large data sets. Research in the area of I/O-efficient algorithms has shown that designing algorithms that make efficient use of the memory/disk transfers can significantly reduce the running time on massive data sets.
Jon's project investigates the use of triangulations (TINs) to represent and compute on very large terrains in GIS. While grid-based terrains are highly researched, both in theory and in practice, the exploration of triangulated terrains has been mostly theoretical. We study the problem of simplifying a terrain into a TIN that approximates it within a specified error threshold. Our main contribution is an I/O-efficient terrain simplification algorithm that converts grids into TINs. We present an implementation of our algorithm and analyze its performance and scalability to large datasets. We port the algorithm into the open source GIS package GRASS for use and experimentation by the GIS community. This is the first grid simplification system that incorporates theoretical and practical applications on triangulated terrains while maintaining scalability. Finally, we show how to extend our algorithm to simplify arbitrary point-data, and explain how this will be useful for processing LIDAR data.