Fall, 2009
COMP/MATH 452
Bioinformatics
nick-peterss-imac:Desktop Nick$ python BranchAndBoundMedianSearch.py
aaaaaaaa : 27
aaaaaaac : 23
aaaaaaat : 22
aaaaaact : 20
aaaaatga : 19
aaaaatgc : 18
aaacaact : 14
aaagcaac : 12
aagcaact : 7
aatgcaac : 6
atgcaact : 0
atgcaact
aaaaaaaa : 22
aaaaaaac : 19
aaaaaaga : 18
aaaaaagc : 17
aaaacaac : 16
aaagatga : 15
aaagatgc : 14
aaggatgc : 13
aatgatgc : 12
atttgatg : 11
atttgatg
This shows the run of two examples given in the book on page 94. It first prints out the best match as it finds it along with the hamming distance. When it's done, it prints out the string with the lowest hamming distance. The first result is for the set of DNA sequences from figure 4.2b and the second sample is for the set of DNA sequences from Figure 4.2d. These results show the DNA sequence with the lowest hamming distance as we find them.
While working on this project I ran into a couple issues. The first issue dealt with comparing characters of different cases in my hamming distance function. Some characters were upper case while others were lower case. When these characters were compared, they were considered to be different. To fix this I simply converted all characters to lower case and then compared them.
The second issue was with the ByPass and NextVertex functions. The psuedocode given in the book didn't produce the list as given to us on page 106. To overcome this I found examples online that produced the same results given to us in the book (except the dashes are represented by 0s in my code). Since python code and the book's psuedocode are similar syntactially, it should be easy to see the difference between the two sets of code.