- Tuesday 11/15: decide on a project
- Thursday 11/17: 5-minute presentation
- Thursday 12/8 (last class): preliminary project demo
- December 18th (final exam date in polaris), 2-5pm: final demo and paper.

You can work alone or with a partner. Naturally, the expectation is that the amount of work scales linearly with the size of the team: a team of 2 is expected to spend double the number of hours on the project.

I am willing to allow teams of 3 people, though based on my experience it is harder to coordinate --- you will have to convince me that you have a plan.

There are many LaTeX guides online, should you need anything beyond the template. Here is a suggested outline for the paper:

- [Section 1] Introduction and background: A short description of the problem you solve, how it is defined, why it is important and how it is used. General level and relatively brief, at least for this report.
- [Related work]: In a research paper you would have a section in which you would discuss the related work, and how your proposed work addresses some of the questions left open in the previous work. This probably will not be the case for this project report. UPDATE: SINCE WE DID NOT DISCUSS PAPERS IN CLASS, YOUR PAPER NOW HAS TO INCLUDE A SECTION ON PREVIOUS WORK.
- [Section 2] The algorithm: Here you describe, at a general level, the approach.
- [Section 3] Implementation: Describe in detail your implementation: its main steps, and how they are implemented.
- [Section 4] Experiments: what sort of experiments you did, what data, what platform.
- [Section 6] Conclusion and future work: Describe what you learnt in this project, what went well and what went wrong, what you wish you had done differently, any related stories that you want to share, any questions you ran into, any insights (and of course why you liked this class).

Below is a list of project options:

The basic problem is the following: Given a grid terrain (part of which is the sea) and a sea-level rise (e.g. 3ft), model flooding of the terrain as the sea rises to the given level.

Unless you watch Fox News, you probably believe that climate change is happening. As temperatures continue to rise, ice will melt and the sea level will rise. Scientists predict significant sea-level rises (between 3 and 5 feet) in the next 100 years. Below is one of the recent pictures in the news (NYC flooded):

In this project you will produce a similar flooding simulation and you can show it to your family and friends to raise awareness. Even more, if your code is efficient on large data and if you implement the extension that I suggest below, your code will be useful to local agencies that are looking at impacts of flooding.
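As a starting point, the basic problem can be sketched as a breadth-first search that spreads water from the sea cells. This is a minimal sketch under stated assumptions, not the required design: the function name `flood`, the separate `is_sea` mask, the convention that the current sea level is 0, and the use of 8-neighbor connectivity are all choices made here for illustration.

```python
from collections import deque

def flood(elev, is_sea, rise):
    """Return a grid of booleans: True where a cell is sea or gets flooded.

    elev   -- 2D list of elevations (same units as rise)
    is_sea -- 2D list of booleans marking the cells that are sea today (assumed input)
    rise   -- sea-level rise, measured from the current sea level (assumed to be 0)
    """
    rows, cols = len(elev), len(elev[0])
    flooded = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Start the search from every sea cell at once.
    for i in range(rows):
        for j in range(cols):
            if is_sea[i][j]:
                flooded[i][j] = True
                queue.append((i, j))
    # Water spreads to any 8-neighbor whose elevation is at or below the new level.
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (0 <= ni < rows and 0 <= nj < cols
                        and not flooded[ni][nj] and elev[ni][nj] <= rise):
                    flooded[ni][nj] = True
                    queue.append((ni, nj))
    return flooded
```

Note that a low-lying cell is flooded only if water can actually reach it from the sea; a basin behind a ridge stays dry. Each cell is visited once, so the running time is linear in the size of the grid, but the whole grid is assumed to fit in memory.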

**Performance:** An important goal for this project is
performance. The DEM folder on
`dover:/mnt/research/gis/DATA/DEM` contains a grid for Lincoln
county at 2m resolution, about 900 million points. We also have a
grid for Knox county at 2m resolution (CHECK SIZE). Both
grids have been generated from high-resolution Lidar data using
ArcGIS.
Generally speaking, we have 0.5 TB of 1m resolution Lidar data for Maine
and we could generate grids for other counties; we could also generate grids
covering more than one county. Ideally we would have a grid covering
the whole Maine coast, but no one dares work with such large datasets,
simply because there is no software that can handle them (the exception
is LAStools, which can handle large data and can generate very large
grids from Lidar data, but the modules that do this are not open
source).
All in all, this is an opportunity to develop an efficient algorithm,
customize it and parallelize it. The problem of flooding is not
immediately parallelizable (the way the viewshed problem is); however, some
simple data-partitioning strategies can be explored, and coming
up with an approach will be a great problem.

**SLR+BFE:** An additional issue is to consider the
impact of storm waves in addition to SLR flooding. The flooding due
to waves is given as a BFE grid: this is a grid of the same size as
the elevation grid, whose value at point (i,j) is the height of waves at
point (i,j). The BFE grid is zero at sea and inland, and has non-zero
values only along the coast (see picture below). The BFE grid gives
the extent of the current flooding (note: do we know anything about
how it is calculated? is it based on historical flood data?), without
taking SLR into account. A picture of the BFE grid for Southport is
below (you can find it in the folder with SLR papers and data).

The goal is to "add" BFE to SLR and model flooding with both---this will give the flood zones in the future, when in addition to storm waves there will be sea-level rise.

If you want to work on this problem, I see several possibilities:

- Compute SLR flooding.
- Parallelize SLR flooding.
- Compute SLR + BFE flooding.
- Parallelize SLR+BFE flooding.

**Test data:** the island Southport in Maine, provided
by Eileen Johnson. Data is here.

**Relevant links:** SLR papers

The problem is the following: Given a flow direction (FD) grid, compute a (recursive) watershed partition of the terrain.

It is important that you work with a complete FD grid, that is, one which routes flow on flat areas and routes flow out of sinks. The process of finding the sinks in the terrain and simulating flooding is tedious to implement, so you can skip this and use FD grids generated by GRASS GIS. If you want to run GRASS GIS, it is available on dover; I generated a couple of test FD grids and uploaded them here.

Given the FD grid, the process of computing a watershed partition is the following:

- Find the mouths of the rivers and walk upstream, finding the 4 biggest tributaries. Call these 2,4,6 and 8.
- Compute the watersheds of these tributaries and call them W2, W4, W6 and W8.
- Compute the watersheds in between these, and call them W1, W3, W5, W7 and W9.

Doing this process once will give a partition into 9 watersheds. The output should be a grid the same size as the terrain, where each point in the grid is labeled with a number 1 through 9, corresponding to the watershed it is in.
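To rank tributaries you need some measure of how big each one is. One plausible reading, and it is only an assumption here, is to measure a tributary by the number of cells that drain through it (its flow accumulation). The sketch below computes that count for every cell. The direction codes in `OFFSETS` are hypothetical (0 to 7 index a neighbor offset, a negative value means the cell has no outflow); FD grids produced by GRASS use their own encoding, which you would need to translate.

```python
from collections import deque

# Hypothetical direction codes: fd[i][j] indexes the (row, col) step to the
# downstream neighbor. A negative code marks a cell with no outflow.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def flow_accumulation(fd):
    """Return acc, where acc[i][j] counts the cells draining through (i, j), itself included.

    Assumes the FD grid is complete and has no cycles.
    """
    rows, cols = len(fd), len(fd[0])

    def downstream(i, j):
        if fd[i][j] < 0:
            return None
        di, dj = OFFSETS[fd[i][j]]
        ni, nj = i + di, j + dj
        return (ni, nj) if 0 <= ni < rows and 0 <= nj < cols else None

    # Count how many neighbors flow directly into each cell.
    inflow = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            d = downstream(i, j)
            if d is not None:
                inflow[d[0]][d[1]] += 1

    # Process cells from the ridges down: a cell is ready once all of its
    # upstream neighbors have passed their counts along.
    acc = [[1] * cols for _ in range(rows)]
    queue = deque((i, j) for i in range(rows) for j in range(cols)
                  if inflow[i][j] == 0)
    while queue:
        i, j = queue.popleft()
        d = downstream(i, j)
        if d is not None:
            acc[d[0]][d[1]] += acc[i][j]
            inflow[d[0]][d[1]] -= 1
            if inflow[d[0]][d[1]] == 0:
                queue.append(d)
    return acc
```

With this grid in hand, walking upstream from a river mouth amounts to following, at each cell, the inflowing neighbor with the largest count; the other inflowing neighbors are the candidate tributaries.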

The basic function in this project will be a function to determine the watershed of an arbitrary point p. As a reminder, the watershed of a point p is all the grid points that flow to p.
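A minimal sketch of such a function is below: starting at p, it repeatedly pulls in every neighbor whose flow direction points at a cell already in the watershed. The direction codes in `OFFSETS` are hypothetical (0 to 7 index a neighbor offset, a negative value means no outflow) and would have to be mapped from whatever encoding your FD grid actually uses.

```python
from collections import deque

# Hypothetical direction codes: fd[i][j] indexes the (row, col) step to the
# downstream neighbor. A negative code marks a cell with no outflow.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def watershed(fd, p):
    """Return a boolean grid marking every cell whose flow path reaches p."""
    rows, cols = len(fd), len(fd[0])
    in_ws = [[False] * cols for _ in range(rows)]
    in_ws[p[0]][p[1]] = True
    queue = deque([p])
    while queue:
        i, j = queue.popleft()
        for di, dj in OFFSETS:
            ni, nj = i + di, j + dj          # candidate upstream neighbor
            if not (0 <= ni < rows and 0 <= nj < cols) or in_ws[ni][nj]:
                continue
            code = fd[ni][nj]
            if code < 0:
                continue
            oi, oj = OFFSETS[code]
            # The neighbor belongs to the watershed if it drains into (i, j).
            if (ni + oi, nj + oj) == (i, j):
                in_ws[ni][nj] = True
                queue.append((ni, nj))
    return in_ws
```

The search is iterative on purpose: a recursive version is shorter but can overflow the call stack on large grids.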

Computing these 9 watersheds will be a good project.

As a refinement, you can repeat the process inside each watershed. This way you can find 9 sub-watersheds inside each of the 9 watersheds. For example, the sub-watersheds of watershed 1 will be numbered 11, 12, ..., 19. The sub-watersheds of watershed 2 will be numbered 21, 22, 23, ..., 29; and so on.

**Test data:** You will find some test FD grids here. The DEMs are stored in the standard location on dover: `dover:/mnt/research/gis/DATA/DEM/`.

**Relevant links:** watershed papers

The problem is the following: Given an elevation grid, compute the total viewshed grid. This is a grid where the value of point (i,j) is the size (that is, number of visible points) of the viewshed of (i,j).
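To make the definition concrete, here is a brute-force sketch that tests every pair of cells. It is an illustration under simplifying assumptions, not the algorithm you are expected to use: the line of sight is sampled at the nearest grid cell at each step rather than interpolated, the viewer stands at ground level unless `eye` is set, and a point exactly on the sight line does not block it. The function names are made up for this example.

```python
def is_visible(elev, vi, vj, ti, tj, eye=0.0):
    """Approximate line-of-sight test from viewpoint (vi, vj) to target (ti, tj)."""
    steps = max(abs(ti - vi), abs(tj - vj))
    if steps == 0:
        return True                      # a point always sees itself
    z0 = elev[vi][vj] + eye
    dz = elev[ti][tj] - z0               # height change along the whole sight line
    for s in range(1, steps):
        t = s / steps
        # Nearest grid cell to the point at fraction t along the segment.
        i = round(vi + t * (ti - vi))
        j = round(vj + t * (tj - vj))
        # The target is hidden if the terrain rises above the sight line here.
        if elev[i][j] > z0 + t * dz:
            return False
    return True

def total_viewshed(elev):
    """Return a grid whose value at (i, j) is the number of cells visible from (i, j)."""
    rows, cols = len(elev), len(elev[0])
    cells = [(i, j) for i in range(rows) for j in range(cols)]
    return [[sum(is_visible(elev, vi, vj, ti, tj) for ti, tj in cells)
             for vj in range(cols)]
            for vi in range(rows)]
```

Each viewpoint is independent of the others, which is what makes the outer loops natural candidates for parallelization.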

Thus computing the total viewshed entails computing a viewshed for each single point in the terrain as a viewpoint. This process is computationally intensive: for example, on the Kaweah dataset (1 million points) it took 42 hours with one core. Here is a picture of the total viewshed for Kaweah:

The goal in this project is to parallelize this computation and perform a detailed and careful investigation of its performance. You will run experiments with various numbers of threads and measure the speedup; and you will attempt to explain why the speedup flattens out by finding a way to measure the running time of each thread, and thus the overall load balancing. You will experiment with ways to improve the load balancing among threads. All projects need to be accompanied by reports, but for this project in particular, since the bulk of the work is in the experimental evaluation, I expect a report that could be submitted to a conference.
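Once you have per-thread timings, a few small helpers make the analysis easier. The sketch below uses common textbook definitions (speedup as serial time over parallel time, load imbalance as the slowest thread's time over the mean), which are one choice among several. The two partition functions are hypothetical helpers that let you compare, on paper, how a contiguous split and a round-robin split of rows would distribute a given list of per-row costs.

```python
def speedup(t_serial, t_parallel):
    """Speedup of the parallel run relative to the one-core run."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, num_threads):
    """Fraction of the ideal (linear) speedup that was achieved."""
    return speedup(t_serial, t_parallel) / num_threads

def load_imbalance(thread_times):
    """Slowest thread's time over the mean; 1.0 means perfectly balanced."""
    mean = sum(thread_times) / len(thread_times)
    return max(thread_times) / mean

def block_partition(costs, num_threads):
    """Per-thread total cost when rows are split into contiguous blocks."""
    size = -(-len(costs) // num_threads)          # ceiling division
    return [sum(costs[k * size:(k + 1) * size]) for k in range(num_threads)]

def cyclic_partition(costs, num_threads):
    """Per-thread total cost when rows are dealt out round-robin."""
    return [sum(costs[k::num_threads]) for k in range(num_threads)]
```

For example, if the rows in one half of the terrain are much more expensive than the rest, a block split leaves one thread with most of the work, while a cyclic split spreads the expensive rows across threads.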

**Test data:** See DEMs in `dover:/mnt/research/gis/DATA/DEM/`.

Parallel total viewshed:

- Calculating the inherent visual structure of a landscape (total viewshed) using high-throughput computing (Llobera et al, 2003)
- Simultaneous computation of total viewshed on large high resolution grids (Tabik, Zapata, Romero; International Journal of Geographical Information Science; 2012)
- Efficient data structure and highly scalable algorithm for total viewshed computation (Tabik, Cervilla, Zapata, Romero; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; 2014)

`lasgrid` is a tool that reads LIDAR from LAS/LAZ/ASCII and grids them onto a raster. The most important parameter ‘-step n’ specifies the n x n area of LiDAR points that are gridded on one raster cell (or pixel). The output is either in BIL, ASC, IMG, TIF, PNG, JPG, XYZ, FLT, or DTM format. The tool can raster the ‘-elevation’ or the ‘-intensity’ of each point and stores the ‘-lowest’ or the ‘-highest’, the ‘-average’, or the standard deviation ‘-stddev’. Other gridding options are ‘-scan_angle_abs’, ‘-counter’, ‘-counter_16bit’, ‘-counter_32bit’, ‘-user_data’, ‘-point_source’, and others. For more details and other options not mentioned here see the README file. Here's the more detailed README for

`lasgrid` has many options for the output; your output should
be an ASC grid, the same format as the one you've used so far.
Assume you start with a Lidar dataset in ASC form (as generated by
`las2txt`).
Allow the user to raster the lowest, or highest, or average point
(and perhaps others).
By default the grid should cover the entire extent of the bounding box
of the lidar dataset. Experiment with various step sizes,
especially when the resolution of the grid gets close to the
resolution of the Lidar data. If there are grid cells that don't
contain any lidar points, then the resolution requested for the output
grid is too high. You will investigate what the highest-resolution
grid is for a particular dataset.
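The core of the gridding step can be sketched as follows. This is an in-memory illustration only, with names chosen for this example (`grid_points`, `mode`); it assumes the points have already been parsed into (x, y, z) tuples, puts row 0 at the northern edge of the bounding box, and reports empty cells as `None` so that you can count them.

```python
import math

def grid_points(points, step, mode="lowest"):
    """Rasterize (x, y, z) points onto a grid with square cells of side `step`.

    Returns (grid, empty): grid[r][c] is the lowest, highest, or average z of
    the points that fall in that cell, or None if the cell is empty; `empty`
    is the number of cells that received no points.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, ymax = min(xs), max(ys)
    # The grid covers the full bounding box of the data.
    cols = max(1, math.ceil((max(xs) - xmin) / step))
    rows = max(1, math.ceil((ymax - min(ys)) / step))
    cells = [[[] for _ in range(cols)] for _ in range(rows)]
    for x, y, z in points:
        # Points on the far edges are clamped into the last column or row.
        c = min(int((x - xmin) / step), cols - 1)
        r = min(int((ymax - y) / step), rows - 1)
        cells[r][c].append(z)
    pick = {"lowest": min, "highest": max,
            "average": lambda zs: sum(zs) / len(zs)}[mode]
    grid = [[pick(zs) if zs else None for zs in row] for row in cells]
    empty = sum(v is None for row in grid for v in row)
    return grid, empty
```

Running this for decreasing values of `step` and watching the `empty` count is one simple way to find where the requested resolution outruns the data. Keeping every z value per cell is wasteful; for lowest, highest, and average it is enough to keep a running minimum, maximum, or sum and count.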

A nice challenge here is to make your algorithm work on very large Lidar data where neither the Lidar data nor the grid actually fit in memory.

**Test data:** Check out Lidar data in
`dover:/mnt/research/gis/DATA/LIDAR/`. There's also Lidar data
that comes with LAStools.

**Papers/links:**

- GRASS GIS r.in.lidar

For example, your code could take on the command line the name of the lidar dataset to be simplified, the desired error threshold, and the name of the output file where you'll write the resulting TIN. Your code should simplify the lidar data, time it, print a summary, and then render the TIN. The function that simplifies should be separate from the rendering and timed on its own. The summary should include how many points are left in the TIN (both in absolute value and as a percentage of the number of points in the input grid), and the total time for simplification. The time for simplification should not include the time to read the lidar data into memory, or to write the TIN to a file. In summary, one should expect to see the following on the screen when running your program:

    ./simplify lidar.txt 10 lidar.10.tin
    reading lidar.txt in memory...done. total xxx seconds.
    ---------
    starting simplification n=184552.
    ...
    done. n'=2019 (1.09% of 184552)
    total time xx seconds
    ---------
    writing TIN to file lidar.10.tin

The error epsilon on the command line should be interpreted as an absolute value. Example: the command above produces a TIN that is within a distance of 10 (units) from the lidar data. The units here are the same as the units used for height in lidar.txt.

When run with epsilon=0, your program should eliminate all the flat areas. If there is no flat area then running with epsilon=0 will not eliminate any points.
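The timing requirement above can be met with a thin wrapper around the simplification call, sketched here. The wrapper name `run_simplification` and the `simplify` argument are placeholders for this example; `simplify` stands for whatever function you write, assumed to take the points and epsilon and return the points kept in the TIN.

```python
import time

def run_simplification(points, epsilon, simplify):
    """Time `simplify(points, epsilon)` on its own and build the summary line.

    Reading the input and writing the TIN happen outside this function,
    so they are not included in the measured time.
    """
    n = len(points)
    start = time.perf_counter()
    kept = simplify(points, epsilon)          # only this call is timed
    elapsed = time.perf_counter() - start
    percent = 100.0 * len(kept) / n
    summary = f"n'={len(kept)} ({percent:.2f}% of {n})"
    return kept, elapsed, summary
```

Passing the simplification function in as an argument keeps the timing code independent of the algorithm, which makes it easy to compare several variants under identical measurement.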

**Test data:** Check out Lidar data in `dover:/mnt/research/gis/DATA/LIDAR/`. There's also Lidar data
that comes with LAStools.

Relevant links:

- The classic: Fast polygonal approximations of terrain and height fields (Garland and Heckbert 1995)
- Garland's Terra, Scape, QSlim
- Jonathan Shewchuk's website
- Streaming computation of Delaunay triangulations (SIGGRAPH 2006)
- Generating Raster DEM from Mass Points via TIN Streaming (Isenburg, Liu, Shewchuk, Snoeyink, Thirion; GIScience 2006)

Obviously, doing all three will be too ambitious for a term
project. Just pick a part that looks interesting to you.
For example, you could work on finding the ground. Or, you could run
`lasground` to find the ground, and focus on finding the
vegetation, or the buildings.
This is worth exploring if you are interested in vision, because
finding roofs entails computing some sort of planarity estimate for a
point and its neighborhood. If you had an image instead of a grid,
the starting point would be to compute the Sobel operator that
estimates first derivatives and thus identifies sharp edges. Here you
have a Lidar dataset that stores actual heights not colors, so the
approach is different, but the spirit the same: identify how the
terrain looks around a point in order to see if it's part of a planar
roof. This assumes that roofs are planar, but hey you have to start
somewhere and make some assumptions. Roofs that are not planar will
need to be identified in a different way.
If you want to choose this project, we have some datasets from Eileen
where you can test it; check `dover:/mnt/research/gis/DATA/LIDAR/Boothbay_Harbor`.
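One possible planarity estimate, offered as a starting sketch rather than the method to use, is to fit a least-squares plane z = ax + by + c to a point and its neighbors and look at the root-mean-square residual: small values suggest a planar patch such as a roof face. The function names and the choice of RMS residual as the score are assumptions of this example, and how you collect the neighborhood is left open.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def plane_rms(points):
    """RMS vertical distance of (x, y, z) points from their least-squares plane.

    Returns None if the points do not determine a plane (for example, if
    they are collinear in x, y).
    """
    n = len(points)
    # Center x and y to keep the sums well conditioned.
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    pts = [(x - cx, y - cy, z) for x, y, z in points]
    sx = sum(x for x, _, _ in pts)
    sy = sum(y for _, y, _ in pts)
    sz = sum(z for _, _, z in pts)
    sxx = sum(x * x for x, _, _ in pts)
    syy = sum(y * y for _, y, _ in pts)
    sxy = sum(x * y for x, y, _ in pts)
    sxz = sum(x * z for x, _, z in pts)
    syz = sum(y * z for _, y, z in pts)
    # Normal equations for z = a*x + b*y + c, solved with Cramer's rule.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(A)
    if abs(d) < 1e-12:
        return None
    coeffs = []
    for k in range(3):
        m = [row[:] for row in A]
        for r in range(3):
            m[r][k] = rhs[r]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    return math.sqrt(sum((a * x + b * y + c - z) ** 2 for x, y, z in pts) / n)
```

A sloped roof and flat ground both score near zero, so the residual alone does not separate them; the fitted slope and the height above the ground are natural additional features.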

**Test data:** Check out Lidar data in
`dover:/mnt/research/gis/DATA/LIDAR/`. You will see a folder
`Lidar_for_Northeast` that contains Lidar data for Maine.

**Links:**

- LAStools `lasground` README
- Lidar classification papers
- More paper links:
- Automatic detection of residential buildings using LIDAR data and multispectral imagery ISPRS 2010
- Extracting general-purpose features from LIDAR data, 2010
- Tree Detection in Aerial LiDAR and Image Data, 2006.
- Morphology-based Building Detection from Airborne Lidar Data
- Building detection and line extraction from Lidar data, ISPRS 2008

- Report: ISPRS comparison of filters (2003)
- GRASS GIS: GRASS GIS for the distinction of vegetation from buildings using Lidar data (Sanchez, Bovelli, 2008) | v.lidar.edgedetection | v.lidar.growing
- Automatic detection of residential buildings using LIDAR data and multispectral imagery (Awrangjeb, Revanbaksh, Fraser, ISPRS, 2010)