## Project 1: Implementation and experimental evaluation for a suite of matrix multiplication algorithms

Implement the three algorithms for matrix multiplication:
• standard
• blocked
• divide-and-conquer
For the standard algorithm and the divide-and-conquer algorithm: run a suite of experiments with various values of n, and plot the running time of the matrix multiplication as a function of n.
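One way to make divide-and-conquer work for any n is to recurse on index ranges rather than on separately allocated quadrants. The sketch below is only an illustration under several assumptions: the names `rec_range` and `REC_CUTOFF` and the cutoff value 32 are made up here, the matrices are n x n arrays of doubles in row-major order, and instead of the textbook split into eight half-size sub-problems it halves whichever dimension is currently largest (for square inputs, three such splits in a row give the same eight sub-problems). Odd sizes are handled by splitting into a floor half and a ceiling half.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff: sub-problems this small use plain loops. */
#define REC_CUTOFF 32

/* Computes C[i0.., j0..] += A[i0.., k0..] * B[k0.., j0..], where the
   sub-problem is (m x q) times (q x p). All three matrices are n x n,
   stored row-major in flat arrays. */
static void rec_range(const double *a, const double *b, double *c, size_t n,
                      size_t i0, size_t j0, size_t k0,
                      size_t m, size_t p, size_t q)
{
    if (m <= REC_CUTOFF && p <= REC_CUTOFF && q <= REC_CUTOFF) {
        for (size_t i = i0; i < i0 + m; i++)
            for (size_t k = k0; k < k0 + q; k++) {
                double aik = a[i * n + k];
                for (size_t j = j0; j < j0 + p; j++)
                    c[i * n + j] += aik * b[k * n + j];
            }
        return;
    }
    if (m >= p && m >= q) {            /* split the rows of A and C */
        size_t h = m / 2;
        rec_range(a, b, c, n, i0, j0, k0, h, p, q);
        rec_range(a, b, c, n, i0 + h, j0, k0, m - h, p, q);
    } else if (p >= q) {               /* split the columns of B and C */
        size_t h = p / 2;
        rec_range(a, b, c, n, i0, j0, k0, m, h, q);
        rec_range(a, b, c, n, i0, j0 + h, k0, m, p - h, q);
    } else {                           /* split the shared dimension */
        size_t h = q / 2;
        rec_range(a, b, c, n, i0, j0, k0, m, p, h);
        rec_range(a, b, c, n, i0, j0, k0 + h, m, p, q - h);
    }
}

/* c = a * b for n x n matrices; n does not have to be a power of 2. */
void rec_mmult(const double *a, const double *b, double *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;
    rec_range(a, b, c, n, 0, 0, 0, n, n, n);
}
```

The cutoff is itself a parameter worth experimenting with: a cutoff of 1 gives the pure recursive algorithm, while a larger one trades recursion overhead for loop work.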

For the blocked algorithm: First, run a suite of experiments for a large value of n with different values of the block size r, and plot the running time for that value of n as a function of r. Pick a value of n that's large enough that a row of the matrix does not fit in cache (if possible). The optimal value of r is the one that minimizes the running time. Experiment with a few different values of n to see whether the optimal block size changes. Second, once you have found the optimal block size for your platform, run a suite of experiments keeping r fixed, with various values of n, and plot the running time as a function of n.
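To make the role of r concrete, here is a rough sketch of a blocked multiplication together with a loop that times it for several block sizes. Treat it as one possible shape, not the required one: passing r as a fifth argument, the helper names `min_size` and `sweep_block_sizes`, the row-major layout, and the choice of r = 4, 8, 16, ... are all assumptions made for illustration. Edge blocks are clipped, so r does not have to divide n. As a sizing example, a row of n doubles takes 8n bytes, so with a hypothetical 32 KiB L1 data cache a row no longer fits once n > 4096.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/* c = a * b for n x n row-major matrices, processed in r x r blocks. */
void blocked_mmult(const double *a, const double *b, double *c,
                   size_t n, size_t r)
{
    assert(r > 0);
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;
    for (size_t ii = 0; ii < n; ii += r)
        for (size_t kk = 0; kk < n; kk += r)
            for (size_t jj = 0; jj < n; jj += r) {
                /* Clip the block at the matrix edge. */
                size_t iend = min_size(ii + r, n);
                size_t kend = min_size(kk + r, n);
                size_t jend = min_size(jj + r, n);
                for (size_t i = ii; i < iend; i++)
                    for (size_t k = kk; k < kend; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < jend; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}

/* Times one call for each block size r = 4, 8, 16, ... up to n and
   writes one "r,seconds" line per block size to out. */
void sweep_block_sizes(const double *a, const double *b, double *c,
                       size_t n, FILE *out)
{
    for (size_t r = 4; r <= n; r *= 2) {
        clock_t start = clock();
        blocked_mmult(a, b, c, n, r);
        clock_t end = clock();
        fprintf(out, "%zu,%.6f\n", r,
                (double)(end - start) / CLOCKS_PER_SEC);
    }
}
```

The comma-separated output can be redirected to a file and loaded directly into whatever plotting tool you use. `clock()` reports CPU time, which is reasonable for a single-threaded run.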

Structure/tips:

• Write each matrix multiplication algorithm in its own separate file (like mmult.c, blocked_mmult.c, rec_mmult.c). Create one Makefile that compiles all three modules.
• All modules should read the value of n on the command line.
• All modules should work for any value of n, not only powers of 2 (added 9/17)
• All modules should start by allocating and randomly initializing the matrix. This should not be timed.
• The matrices should be stored as arrays of doubles.
• Run all experiments on the same platform: your laptop or a desktop in one of the labs (all machines in the lab are the same platform).
• The Makefile should compile with the -O3 flag.
• Include unit testing (added 9/17; see below)
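The tips above could translate into a module skeleton along these lines. This is a minimal sketch, not a required layout: the helper names `parse_n`, `alloc_matrix`, `random_matrix` and `run_module`, the fixed seed, the "n,seconds" output format, and the use of `clock()` are all assumptions chosen for illustration, and the matrices are assumed to be flat row-major arrays of doubles. The point it illustrates is that allocation and initialization happen before the timer starts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Standard triple-loop multiplication: c = a * b, n x n, row-major. */
void mmult(const double *a, const double *b, double *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

/* Parses a positive matrix size; returns 0 if s is not a valid one. */
size_t parse_n(const char *s)
{
    char *end;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || v <= 0)
        return 0;
    return (size_t)v;
}

/* Allocates an uninitialized n x n matrix; NULL on failure or overflow. */
double *alloc_matrix(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(double) / n)
        return NULL;
    return malloc(n * n * sizeof(double));
}

/* Allocates an n x n matrix with entries in [0, 1]; NULL on failure. */
double *random_matrix(size_t n)
{
    double *m = alloc_matrix(n);
    if (m == NULL)
        return NULL;
    for (size_t i = 0; i < n * n; i++)
        m[i] = (double)rand() / RAND_MAX;
    return m;
}

/* What a module's main(argc, argv) could do; returns an exit status. */
int run_module(int argc, char **argv)
{
    size_t n = (argc == 2) ? parse_n(argv[1]) : 0;
    if (n == 0) {
        fprintf(stderr, "usage: %s n\n", argc > 0 ? argv[0] : "mmult");
        return EXIT_FAILURE;
    }
    /* Setup, not timed. The fixed seed is an assumption for repeatability. */
    srand(1);
    double *a = random_matrix(n);
    double *b = random_matrix(n);
    double *c = alloc_matrix(n);   /* mmult overwrites every entry */
    if (a == NULL || b == NULL || c == NULL) {
        fprintf(stderr, "out of memory\n");
        free(a); free(b); free(c);
        return EXIT_FAILURE;
    }
    /* Only the multiplication itself is timed. */
    clock_t start = clock();
    mmult(a, b, c, n);
    clock_t end = clock();
    printf("%zu,%.6f\n", n, (double)(end - start) / CLOCKS_PER_SEC);
    free(a); free(b); free(c);
    return EXIT_SUCCESS;
}
```

A module's `main` would then reduce to `return run_module(argc, argv);`, and the blocked and recursive modules could reuse the same skeleton with a different multiplication call between the two `clock()` calls.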

### Testing

Testing is a fundamental part of good programming, so for all your projects you will need to write test functions. Having unit tests speeds up debugging, because they help localize the bug. For matrix multiplication in particular, testing is fairly straightforward: you'll need to check that the outputs of your blocked and recursive algorithms are correct by comparing them with the output of the straightforward MM. It could look like this:
```
//file that implements blocked_mmult()

main(..) {

//a,b are the matrices, c is the result
a = calloc(...)
b = calloc(...)
c = calloc(...)

//initialize a,b with random values
...

//call the blocked mmult
//start timer
blocked_mmult(a,b,c,n)
//end timer

//test it; note, this is not timed
test(a,b,c, n)
}

//input: a,b are matrices and c=a*b computed by  blocked_mmult
//this function computes a*b by straightforward matrix multiplication and tests whether c is the same
//if it finds an error, it prints something useful and exits
//if all good, it prints that the test was passed
void test(a,b,c,n) {

//compute d = a*b with straightforward mmult
d = calloc(...)
for i
for j
for k

//now compare c with d element by element; if you find an element where they're different,
//print an error message and exit()

//if you got to the end of the loop, all elements match
printf("congratulations: test passed\n");
}

```
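A runnable version of the test idea might look like the sketch below. It departs from the outline above in a few ways, all of which are assumptions rather than requirements: the function is called `check_product`, it returns a status instead of calling exit(), it computes each reference entry on the fly instead of allocating d, and it compares with a relative tolerance `tol`. The tolerance matters because the blocked and recursive algorithms may add the same terms in a different order, and floating-point sums can then differ in the last few bits even when the code is correct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Checks c against a*b computed by the straightforward algorithm.
   All matrices are n x n, row-major. Returns 0 if every entry agrees
   within the relative tolerance tol, or 1 at the first mismatch. */
int check_product(const double *a, const double *b, const double *c,
                  size_t n, double tol)
{
    assert(tol >= 0.0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            /* Reference value for entry (i, j). */
            double d = 0.0;
            for (size_t k = 0; k < n; k++)
                d += a[i * n + k] * b[k * n + j];

            double diff = c[i * n + j] - d;
            if (diff < 0.0)
                diff = -diff;
            double scale = d < 0.0 ? -d : d;
            if (scale < 1.0)
                scale = 1.0;

            /* Written as !(<=) so that a NaN in c also fails. */
            if (!(diff <= tol * scale)) {
                fprintf(stderr,
                        "mismatch at (%zu,%zu): got %g, expected %g\n",
                        i, j, c[i * n + j], d);
                return 1;
            }
        }
    printf("test passed\n");
    return 0;
}
```

A call such as `check_product(a, b, c, n, 1e-9)` after the timed section keeps the test out of the measurement; the value 1e-9 is only a starting point and may need loosening for large n.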

The programming part of this project is pretty straightforward. The core of the project is running experiments, so plan to spend most of your time on that. You will need to record running times and plot them. You may want to look into writing scripts to run the experiments. If you are interested, ask me and I'll post an example script (I'll probably post it anyway).

Finally, doing this via GitHub means I have to specify a deadline, and I have no control over what happens if you push after the deadline, so you need to meet the deadline. Guidelines for this assignment (and any assignment): start early and plan accordingly.

### What to turn in

in GitHub:
• three source files and a Makefile
• 3 plots, one for each algorithm, with different values of n
• one additional plot showing the running time as a function of the block size r
• other supporting files and data
in class:
• the plots (hardcopy)
• A cover page containing
• the URL of the repository, and
• whether you worked alone or with a partner (if you worked with a partner please specify whose repository includes the team's work).
• the L1 cache size of the machine on which you ran the experiments

## Comments, in retrospect (for next time)

• Each team needs to give me the GitHub address of their repository (searching is slow), preferably on a sheet of paper (so that I don't have to search my email)
• Make it clear that they have to handle all values of n, not only powers of 2
• Want the three algorithms on the same plot so that we can compare