- standard
- blocked
- divide-and-conquer

For the blocked algorithm: First, run a suite of experiments for a large value of n with different values of the block size r, and plot the running time for that n as a function of r. Pick a value of n large enough that a row of the matrix does not fit in cache (if possible). The optimal value of r is the one that minimizes the running time. Experiment with a few different values of n to see whether the optimal block size changes. Second, once you have found the optimal block size for that platform, run a suite of experiments keeping r fixed and varying n. Plot the running time as a function of n.
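To make the role of the block size r concrete, here is a minimal sketch of a blocked multiplication. It assumes the matrices are stored row-major in flat arrays of doubles and that c starts out zeroed (as calloc guarantees). Passing r as a run-time parameter and the helper name `min_size` are choices made for this illustration only; they are not prescribed by the assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Smaller of two sizes; used to clip a block at the matrix edge. */
static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/*
 * c += a * b for n x n matrices, processed in r x r blocks.
 * Assumptions (not from the handout): row-major flat arrays, so
 * element (i,j) lives at index i*n + j, and c is zeroed by the caller.
 * Blocks are clipped at the edges, so n need not be a multiple of r.
 */
void blocked_mmult(const double *a, const double *b, double *c,
                   size_t n, size_t r)
{
    assert(r > 0);
    for (size_t ii = 0; ii < n; ii += r) {
        size_t i_end = min_size(ii + r, n);
        for (size_t kk = 0; kk < n; kk += r) {
            size_t k_end = min_size(kk + r, n);
            for (size_t jj = 0; jj < n; jj += r) {
                size_t j_end = min_size(jj + r, n);
                /* Multiply one block of a by one block of b. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

Calling this in a loop over several values of r for one fixed, large n, and timing each call, would produce the data for the running-time-versus-r plot.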

Structure/tips:

- Write each matrix multiplication algorithm in its own separate file (like mmult.c, blocked_mmult.c, rec_mmult.c). Create one Makefile that compiles all three modules.
- All modules should read the value of n on the command line.
- Should work for any value of `n`, not only powers of 2 (added 9/17)
- All modules should start by allocating and randomly initializing the matrix. This should not be timed.
- The matrices should be stored as arrays of doubles.
- Run all experiments on the same platform: your laptop or a desktop in one of the labs (all machines in the lab are the same platform).
- The Makefile should compile with the -O3 flag.
- Include unit testing (added 9/17; see below)
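The requirement to handle any n, not only powers of 2, matters most for the divide-and-conquer version. One possible way to meet it, sketched below, is to split each dimension into a floor half and a ceiling half and recurse on rectangular sub-blocks. The cutoff `BASE`, the helper `rec_block`, and the row-major layout with a shared row length `ld` are assumptions made for this sketch, not requirements of the assignment.

```c
#include <assert.h>
#include <stddef.h>

#define BASE 16  /* hypothetical recursion cutoff; tune it experimentally */

/*
 * c (m x q) += a (m x p) * b (p x q). All three blocks live inside
 * row-major arrays whose rows are ld elements long.
 */
static void rec_block(const double *a, const double *b, double *c,
                      size_t m, size_t p, size_t q, size_t ld)
{
    if (m <= BASE || p <= BASE || q <= BASE) {
        /* Base case: straightforward triple loop on a small block. */
        for (size_t i = 0; i < m; i++)
            for (size_t k = 0; k < p; k++) {
                double aik = a[i * ld + k];
                for (size_t j = 0; j < q; j++)
                    c[i * ld + j] += aik * b[k * ld + j];
            }
        return;
    }

    /* Uneven halves: works for odd sizes as well as powers of 2. */
    size_t m1 = m / 2, p1 = p / 2, q1 = q / 2;
    size_t m2 = m - m1, p2 = p - p1, q2 = q - q1;

    const double *a11 = a, *a12 = a + p1;
    const double *a21 = a + m1 * ld, *a22 = a + m1 * ld + p1;
    const double *b11 = b, *b12 = b + q1;
    const double *b21 = b + p1 * ld, *b22 = b + p1 * ld + q1;
    double *c11 = c, *c12 = c + q1;
    double *c21 = c + m1 * ld, *c22 = c + m1 * ld + q1;

    /* Eight sub-products: Cij += Ai1 * B1j + Ai2 * B2j. */
    rec_block(a11, b11, c11, m1, p1, q1, ld);
    rec_block(a12, b21, c11, m1, p2, q1, ld);
    rec_block(a11, b12, c12, m1, p1, q2, ld);
    rec_block(a12, b22, c12, m1, p2, q2, ld);
    rec_block(a21, b11, c21, m2, p1, q1, ld);
    rec_block(a22, b21, c21, m2, p2, q1, ld);
    rec_block(a21, b12, c22, m2, p1, q2, ld);
    rec_block(a22, b22, c22, m2, p2, q2, ld);
}

/* Entry point: c += a * b for n x n matrices; c must start zeroed. */
void rec_mmult(const double *a, const double *b, double *c, size_t n)
{
    assert(a && b && c);
    rec_block(a, b, c, n, n, n, n);
}
```

Because the sub-blocks are addressed in place through `ld`, no temporary matrices are allocated. Padding the matrices up to the next power of 2 is another option, at the cost of extra memory and work.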

    //file that implements the blocked_mmult()
    main(..) {
        //a,b are the matrices, c is the result
        a = calloc(...)
        b = calloc(...)
        c = calloc(...)

        //initialize a,b with random values
        ...

        //call the blocked mmult
        //start timer
        blocked_mmult(a,b,c,n)
        //end timer

        //test it; note, this is not timed
        test(a,b,c, n)
    }

    //input: a,b are matrices and c=a*b computed by blocked_mmult
    //this function computes a*b by straightforward matrix multiplication and tests whether c is the same
    //if it finds an error, it prints something useful and exits
    //if all good, it prints that the test was passed
    void test(a,b,c,n) {
        //compute d = a*b with straightforward mmult
        d = calloc(...)
        for i
            for j
                for k

        //now compare c with d element by element; if you find an element where they differ,
        //print an error message and exit()

        //if you got to the end of the loop, all elements match
        printf("congratulations: test passed\n");
    }
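The test and timing steps outlined above could be filled in roughly as follows. This is a sketch under stated assumptions, not the required implementation: the names `check_mmult` and `cpu_seconds`, the row-major layout, and the tolerance of 1e-9 are all illustrative. It differs from the outline in two ways. It returns 0 or 1 and leaves the call to exit() to the caller, and it compares with a tolerance instead of exact equality, because different algorithms add the products in different orders and floating-point results can differ in the last bits.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Processor time in seconds. clock() is standard C; it measures CPU
 * time, which tracks elapsed time closely for a single-threaded run. */
double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

static double abs_d(double x) { return x < 0 ? -x : x; }

/*
 * Recomputes d = a * b with the straightforward algorithm and compares
 * it with c element by element. Returns 1 if all elements match within
 * the tolerance; otherwise prints the first mismatch and returns 0.
 */
int check_mmult(const double *a, const double *b, const double *c, size_t n)
{
    const double tol = 1e-9;  /* assumed tolerance, not from the handout */
    assert(n > 0);

    double *d = calloc(n * n, sizeof *d);
    if (d == NULL) {
        fprintf(stderr, "check_mmult: out of memory\n");
        exit(EXIT_FAILURE);
    }

    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++)
            for (size_t j = 0; j < n; j++)
                d[i * n + j] += a[i * n + k] * b[k * n + j];

    for (size_t idx = 0; idx < n * n; idx++) {
        /* Relative error for large entries, absolute for small ones. */
        double scale = abs_d(d[idx]) > 1.0 ? abs_d(d[idx]) : 1.0;
        if (abs_d(c[idx] - d[idx]) > tol * scale) {
            printf("test FAILED at (%zu,%zu): got %g, expected %g\n",
                   idx / n, idx % n, c[idx], d[idx]);
            free(d);
            return 0;
        }
    }

    free(d);
    printf("congratulations: test passed\n");
    return 1;
}
```

In main, the timed region would then be bracketed as `t0 = cpu_seconds();`, the multiplication call, and `elapsed = cpu_seconds() - t0;`, with `check_mmult(a, b, c, n)` called afterwards, outside the timed region.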

Finally, doing this via GitHub means I have to specify a deadline, and I have no control over what happens if you push after the deadline, so you need to meet the deadline. Guideline for this assignment (and any assignment): start early and plan accordingly.

- three source files and a Makefile
- 3 plots, one for each algorithm, with different values of n
- one additional plot showing the running time as a function of the block size r
- other supporting files and data

- the plots (hardcopy)
- A cover page containing
  - your name
  - your username in GitHub
  - the URL of the repository, and
  - whether you worked alone or with a partner (if you worked with a partner, please specify whose repository includes the team's work).
- the L1 cache size of the machine on which you ran the experiments

- Each team needs to email me the GitHub address of their repository (searching is slow), preferably on a sheet of paper (so that I don't have to search my email)
- Make it clear that they have to handle all values of n, not only powers of 2
- Want the three algorithms on the same plot so that we can compare