Vebjorn Ljosa » Software » MAP
The individual source code files are available here, in case you would like to take a look at the code before downloading.
This tarball, map-2009-12-31.tar.gz, contains a version of the source code for the sequence alignment method described in the following paper: Tamer Kahveci, Vebjorn Ljosa, and Ambuj K. Singh: “Speeding up whole-genome alignment by indexing frequency vectors,” Bioinformatics, 20(13), p. 2122–2134, 2004, doi:10.1093/bioinformatics/bth212. I write "a version" because there is some confusion about which version of the code is the right one. We submitted the paper on 2003-06-21. In our eagerness to start our next project (a sequence assembly approach that used some of the same ideas as MAP and building on its codebase), we did not properly use tags and branches to segregate the projects in CVS. What is packaged up here is functionally identical to what was current in CVS on 2003-07-16. It appears that I gave this version to someone who asked for it, so I presume that it works. The code is unchanged since then, but I have modified the build system slightly in order to make the code build with current versions of automake, autoconf, et al. You should be able to compile it as follows: ./configure make The program `mkmt` creates a match table, and the program `findex` creates an F-index. Although there is not much documentation, the `-h` options to `mkmt` and `findex` are decent starting points: map$ ~/tmp/mapinst/bin/mkmt -h Usage: /Users/ljosa/tmp/mapinst/bin/mkmt [OPTION]... INDEX Construct the match table for an INDEX and QUERY -h Display this help. -V Display version number. -a Output match table in ASCII format. -b Output match table in binary format. -B Use boxes for both strings. -t Use B+-trees. (Cannot be used with -n.) -n Nested-loop join. (Cannot be used with -t.) -p NUMBER Partition the vector space NUMBER ways in each dimension. -e NUMBER Use error rate NUMBER (default: 0.010000). -i FILENAME Read query string from FILENAME instead of standard input. -o FILENAME Write matchtable to FILENAME instead of standard output. -q Be quiet, don't print statistics to standard output. map$ ~/tmp/mapinst/bin/findex -h Usage: /Users/ljosa/tmp/mapinst/bin/findex [OPTION]... Construct an F-index -h Display this help. -V Display version number. -w INTEGER Window size (default: 4096). -c ARGUMENT Box capacity (default: 1000). A numerical argument implies static box capacity. The following arguments specify that adaptive techniques should be used: volume - Fixed volume. Requires -v. density - Fixed density. Requires -d, can use -l. mhistv - Minimal total volume. Requires -n. mhistd - Maximal total density. Requires -n. -v NUMBER Maximum volume for boxes. For use with -c volume. -d NUMBER Minimum density for boxes. For use with -c density. -l Use lookahead. For use with -c density. -n INTEGER Desired number of boxes. For use with -c mhistv and -c mhistd. -i FILENAME Read input from FILENAME instead of standard input. -o FILENAME Write index to FILENAME instead of standard output. -q Be quiet, don't print statistics to standard output. -D Debug; turn on verbose output. The code to partition a match table and order the MBRs for interactive search lives inside the older, monolithic program `map`. Run it without arguments to see which arguments to pass in. Be warned that `map` is a lot less polished than `mkmt` and `findex`. For instance, it will usually just crash if its arguments or input files are not as expected. Please let me know whether the programs work for you. The code is released under the BSD license; see the file `LICENSE` for details. 2009-12-31 Vebjorn Ljosa Broad Institute of MIT and Harvard ljosa@broad.mit.edu vebjorn@ljosa.com http://www.ljosa.com/~ljosa/
Last updated: 2009-12-31