## CS107 - Lab 3

### Introduction to algorithmic problem solving (part 3: lists)

The goal of this lab is to put together everything you've seen so far in this class: basic instructions, conditionals, loops and list variables. Start by reviewing the problems we did in class and reading the lecture notes, then start on the problems below. Work individually, and call me if you need help. You are encouraged to discuss ideas and techniques broadly with other class members, but not specifics. Discussions should be limited to questions that can be asked and answered without using any written medium (e.g. pencil and paper, or email). At no point should you look at a colleague's screen.

1. Write a search algorithm that finds every occurrence of a target value in the list and prints out the location of each match and the total number of occurrences found. If the target is not found at all, it should print out "Sorry, this number is not in the list." You can assume the size of the list is 1000. For example:
```
Enter 1000 numbers:
3 4 2 6 2 8 7 2 ...
Enter target: 2

Target occurs in the list at positions: 3, 5, 8, ...
Total 23 times.
```
or
```
Enter 1000 numbers:
3 4 2 6 2 8 7 2 ...
Enter target: 5

Target occurs in the list at positions:
Sorry, this number is not in the list.
```
Extra points: How would you change the algorithm so that it prints "Target occurs in the list at positions: " only when the target occurs in the list at least once?

2. Consider the following algorithm for finding the largest number in a list.
```
Algorithm FindLargest:
Variables: list a of size 100, i, largest, location

print "Enter 100 values:"
i = 1
while (i <= 100)
    get ai
    i = i+1
set largest = a1
set location = 1

set i = 2
while (i <= 100)
    if (largest < ai) then
        set largest = ai
        set location = i
    set i to i+1
print "The largest value is " largest " at location " location
```
If the numbers in our list were not unique, so that the largest could occur more than once, would the algorithm find the first occurrence? The last occurrence? Every occurrence? Explain.
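
If you want to experiment with FindLargest on small lists, the sketch below is one possible Python transcription. It is not part of the lab: the function name `find_largest` is made up here, the list is passed in rather than read from the user, and the size is taken from the list instead of being fixed at 100. Locations are 1-based, as in the pseudocode.

```python
def find_largest(a):
    """Transcription of FindLargest; returns (largest, location), location 1-based."""
    largest = a[0]          # a1 in the pseudocode (Python lists start at index 0)
    location = 1
    i = 2
    while i <= len(a):      # the pseudocode uses the fixed size 100 here
        if largest < a[i - 1]:
            largest = a[i - 1]
            location = i
        i = i + 1
    return largest, location


if __name__ == "__main__":
    values = [3, 4, 2, 6, 1]   # sample data; try your own lists
    largest, location = find_largest(values)
    print("The largest value is", largest, "at location", location)
```

Changing the `while` or `if` line in this sketch and rerunning it is a quick way to check your written answers to the questions that follow.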

3. In the algorithm FindLargest above there is an instruction that reads:
```
while (i <= 100)
    if (largest < ai)
    ...
```
In each of the following cases, explain exactly what would happen if this instruction were changed to:
• (a) while (i == 100)
• (b) while (i < 101)
• (c) while (i < 100)
• (d) while (i == 1)

4. In the algorithm FindLargest above consider the instruction
` if (largest < ai) then ...`
In each of the following cases, explain exactly what would happen if this instruction were changed to:
• (a) if (largest <= ai) then ...
• (b) if (largest > ai) then ...

5. Take a look at the following search algorithm discussed in class. We'll refer to it as binary search (as opposed to the sequential search in problem 1):
```
BinarySearch
Variables: list a of size 100, target, start, end, middle, found, i, n

i = 1
n = 100
print "Enter " n " numbers: "
while (i <= 100)
    get ai
    i = i + 1
print "Enter target:"
get target
start = 1
end = n
found = 0
while (... ?? ...)
    //compute the middle between start and end
    middle = ceiling( (start + end)/2)
    if (target == amiddle) then
        print "Target found at position " middle
        found = 1
    if (target < amiddle) then
        end = middle - 1
    if (target > amiddle) then
        start = middle + 1
if (found == 0) then
    print "Target not in the list"
```
• (a) Under what assumption is binary search correct in general? (Can we use it to search in any list of numbers?)
• (b) Using this algorithm, search for target t=3 in the list 2,4,5,7,9. At some point the values of the variables start and end will be equal. What happens if we let the loop execute when start=end? What will the values of start and end be after this last iteration? With this insight, how would you write the condition to stop the loop (so that it executes one last time when start=end)?
• (c) Assume you use the binary search algorithm to decide whether 35 is in the following list: 3, 6, 7, 9, 12, 14, 18, 21, 22, 31, 43. Which numbers are compared to 35?

In the examples above, assume the algorithm is modified so that the size of the list (which in the algorithm above is 100), is replaced with the size of the actual list in that example (5 for (b), and 11 for (c)).
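
To trace BinarySearch by machine rather than by hand, one option is a small Python harness like the sketch below. It is an illustration only: the name `binary_search_trace`, the `keep_going` parameter and the `max_steps` safety limit are all assumptions introduced here, not part of the algorithm from class. The loop condition is deliberately left for you to supply, so you can try out different candidates.

```python
def binary_search_trace(a, target, keep_going, max_steps=50):
    """Run the BinarySearch loop with a caller-supplied loop condition.

    a is a Python list; positions are 1-based, as in the pseudocode.
    keep_going(start, end, found) is the candidate loop condition.
    Returns (position or None, list of values compared to target).
    """
    start, end = 1, len(a)
    found = 0
    position = None
    compared = []
    steps = 0
    # max_steps guards against candidate conditions that never stop.
    while keep_going(start, end, found) and steps < max_steps:
        steps += 1
        middle = (start + end + 1) // 2      # ceiling((start + end)/2)
        if middle < 1 or middle > len(a):
            break                            # candidate condition left the list
        value = a[middle - 1]
        compared.append(value)
        if target == value:
            position = middle
            found = 1
        if target < value:
            end = middle - 1
        if target > value:
            start = middle + 1
        print("start =", start, "end =", end)
    return position, compared


if __name__ == "__main__":
    # One candidate condition to start from; it may or may not be the right one.
    candidate = lambda start, end, found: start < end and found == 0
    print(binary_search_trace([2, 4, 5, 7, 9], 3, candidate))
```

Replacing `candidate` with your own condition and comparing the printed values of start and end against your hand trace is a reasonable way to check parts (b) and (c).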

6. Imagine that by 2008 you have become a successful Bowdoin CS graduate. One of the many job offers you have is to work for Dumbbell Corporation, in charge of programming the voting machine that is supposed to count votes in the next US presidential election. For the sake of this problem, despite the voting machine controversy, let's imagine you take the job.

(a) Your assignment is to write the algorithm that counts the number of votes and decides the president. To do this, you will have to read a list of votes, one by one, from the voting machine. Each vote will represent the selection of one person voting in the booth. A vote can be one of two names, "Tom" and "Jerry" (no connection with reality, this is just an exercise); any other name is considered invalid, and discarded. Assume that when the vote day is over and the last person has voted, an appointed officer comes in and sends the vote "alea iacta est".

Thus your program should read in votes from the machine until reading the vote "alea iacta est", which marks the end. For the sake of this problem assume that Dumbbell has forbidden you from counterfeiting votes. Write an algorithm that displays the following information, in this order:

• the total number of voters
• the total number of invalid votes, in absolute value and percentage
• the number of votes for Tom, in absolute value, and in percentage of the total number of valid votes
• the same for Jerry
• the winner, and by how many votes
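
Note that the percentages in the list above may use different denominators. The Python sketch below shows only that arithmetic, under two assumptions: the invalid percentage is taken out of all votes cast (the problem does not say), and the candidate percentages are taken out of valid votes only. The helper name `percentage` and all the tallies are hypothetical.

```python
def percentage(part, whole):
    """Return part as a percentage of whole; 0.0 when whole is zero."""
    if whole == 0:
        return 0.0          # avoids dividing by zero when nobody voted
    return 100.0 * part / whole


# Hypothetical tallies, for illustration only.
total_votes = 10
invalid_votes = 2
tom_votes = 5
jerry_votes = 3
valid_votes = total_votes - invalid_votes

print("Invalid:", invalid_votes, "votes,", percentage(invalid_votes, total_votes), "% of all votes")
print("Tom:", tom_votes, "votes,", percentage(tom_votes, valid_votes), "% of valid votes")
print("Jerry:", jerry_votes, "votes,", percentage(jerry_votes, valid_votes), "% of valid votes")
```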

(b) What is your opinion of electronic voting?

7. Write an algorithm that reads a list of letters from the user and reverses their order. You can assume 20 letters. For example:
```
Enter 20 letters: Hello world.........
Thanks.
The inverted text is: .........dlrow olleH
Goodbye.
```

8. The human genome is composed of a sequence of approximately 3.5 billion nucleotides, each of which can be one of only four different chemical compounds: Adenine, Cytosine, Thymine and Guanine. These nucleotides are usually referred to by their first letter: A, C, T and G. Thus our DNA, the basis of our life, turns out to be a very long list of letters, written in a four-letter alphabet.
```
....T A G C C A G T A A C T A A G C T...
```
Write an algorithm that reads two DNAs from the user (hmm... for the sake of this problem we'll assume a poor soul types in 3.5 billion letters, twice), and finds out whether they are the same or not. In this problem, assume that two DNAs are the same if they match letter by letter, either forwards or backwards. For example:
```
A C A A G T C     and      A C A A G T C
```
match. Also
```
A C A A G T C     and      C T G A A C A
```
match.

Your algorithm should print one of the following:

```
DNA1 matches with DNA2 forward.
DNA1 matches with DNA2 backwards.
DNA1 does not match with DNA2.
```

9. Same problem as before, except that now the two DNAs are allowed to have at most 100 letter mismatches, either forward or backward.

In the example below let's assume the length of a DNA is 6 and the number of allowed mismatches is 3.

` G T G G C A   and     A T A G C G `
match forward, with 3 mismatches (first, third and last position).
` G T G A C A   and     A C A G T C `
If we try to align them forward, we find 6 mismatches, therefore they do not match forwards. However, when we try to match them backwards, we see that they in fact match with 1 mismatch.
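
The mismatch counts in this example can be double-checked with a few lines of Python. This is only a sketch for verifying the arithmetic, not a solution to the problem: the helper name `count_mismatches` is made up here, the sequences are written as strings without spaces, and both are assumed to have the same length.

```python
def count_mismatches(dna1, dna2):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(dna1, dna2) if x != y)


# First pair, aligned forward.
print(count_mismatches("GTGGCA", "ATAGCG"))        # prints 3
# Second pair, aligned forward.
print(count_mismatches("GTGACA", "ACAGTC"))        # prints 6
# Second pair, with the second sequence read backwards.
print(count_mismatches("GTGACA", "ACAGTC"[::-1]))  # prints 1
```

Your algorithm still needs to read the input, apply the allowed number of mismatches, and decide which message to print.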

Your algorithm should print one of the following:

```
DNA1 matches with DNA2 forward with xx mismatches.
DNA1 matches with DNA2 backwards with xx mismatches.
DNA1 does not match with DNA2.
```