Bowtie: an Ultrafast, Lightweight Short Read Aligner Bowtie NEWS =========== Bowtie is now available for download. 0.9.0 is the first version to be released under the OSI Artistic License (see `COPYING') and freely available to the public for download. The current version is 0.12.7. Reporting Issues ================ Please report any issues using the Sourceforge bug tracker: https://sourceforge.net/tracker/?group_id=236897&atid=1101606 Announcements ============= To receive announcements (including release announcements) about Bowtie and related tools (including Crossbow, TopHat, Cufflinks, Myrna) subscribe to our mailing list: https://lists.sourceforge.net/lists/listinfo/bowtie-bio-announce Version Release History ======================= Version 0.12.7 - September 7, 2010 * Fixes the all-gap reference sequence issue that was present in Bowtie 0.12.6. Index files produced by bowtie-build 0.12.5 and earlier (back to 0.10.*) are compatible with bowtie 0.12.7. Index files produced by bowtie-build 0.12.7 are backward compatible with bowtie 0.12.5 and earlier as long as the first reference sequence is not all-gaps (or, when colorspace indexes, as long as the first reference sequence has two consecutive ACGT characters in it somewhere). * Indexes where the first sequence consists of all gaps or other non-ACGT characters are not handled properly by Bowtie versions 0.12.5 and older, but are handled properly by Bowtie 0.12.7. * REMOVED: bowtie-build's --old-reverse option; the old reverse- index scheme is again the default. The new scheme is disabled pending further refinement. September 3, 2010 * The version of bowtie-build distributed in Bowtie 0.12.6 IS BROKEN. It could not handle stretches of ambiguous reference characters compatibly with prior versions of Bowtie. Please use Bowtie 0.12.5's bowtie-build instead. Bowtie 0.12.7, due out soon, will contain a complete set of working tools. Bowtie 0.12.7 will revert to old way of building the reverse index. Sorry for the inconvenience. Version 0.12.6 - August 29, 2010 * Modified bowtie-inspect's default mode to use the bit-encoded reference portion of the index to reconstruct the reference sequence, rather than the ebwt portion. This makes bowtie-inspect much faster and uses less memory, and the output for a colorspace index will now be in nucleotide space. To get the behavior of the old default, use the new -e/--ebwt-ref option. * Fixed bug whereby SOLiD QV strings would fail to parse. * Moved to a new default way of building the reverse index. Revert to the old behavior with bowtie-build's new --old-reverse option. The new reverse index format is forward and backward compatible with `bowtie`, unless otherwise noted in a future version. * Fixed issue that would sometimes cause bowtie-build to crash when building a large index with a low --offrate. * Fixed build issue that would cause bowtie-build built on one version of Linux to die with a "floating point error" on other versions. * Fixed a bug whereby alignment cost could sometimes be miscalculated. Stratum was unaffected. * bowtie now simply skips reads with 0 characters. Previously it would print an error and exit. Version 0.12.5 - April 10, 2010 * Fixed spurious "Error while writing string output; not all characters written" errors in -S/--sam mode. Version 0.12.4 - April 5, 2010 * Periods in read sequences are now treated as Ns instead of ignored. This should help with some problems where Bowtie erroneously reports "Reads file contained a pattern with more than 1024 quality values..." for data from recent versions of the Illumina GA pipeline. * Fixed a bug whereby some error and warning messages would be printed on top of each other in -p mode. * Chunk-exhaustion warnings messages are now suppressed when --quiet is specified. * Fixed small issue in quality decoding whereby no-confidence colors would incorrectly influence decoded quality of adjacent bases. Version 0.12.3 - February 17, 2010 * Fixed a significant bug in -C/--color mode whereby quality values for SNP nucleotide positions were erroneously penalized. * Fixed a bug in -S/--sam mode whereby if whitespace occurred in the original read name, it would be printed in the QNAME field in violation of the SAM spec. Bowtie now truncates read names at the first whitespace character before printing. * When input is FASTQ and -C/--color is enabled, Bowtie is now tolerant of output from FASTQ converters that include the primer base and where the quality string is one character shorter than the sequence string (due to the primer). This fixes issues users have reported using output from BFAST's solid2fastq. * Added support for SOLiD-style _QV files, via the -Q/--quals, --Q1 and --Q2 options. These options are used in combination with -C/--color and -f to align parallel colorspace read/quality files without having to convert to FASTQ. Version 0.12.2 - February 2, 2010 * When -C/--color is enabled, the default paired-end orientation is now --ff, not --fr. The new default fits typical SOLiD output. * Fixed a bug whereby very large reads could cause bowtie to crash in --best mode * After 0.12.1, some issues remained whereby bowtie would fail to trim the primer in colorspace (e.g. in -c mode). All input modes should now have fully-functioning * Fixed a bug whereby decoded colorspace qualities could overflow and erroneously become low qualities. * Fixed a bug that could produce incorrect paired-end alignment results in -n mode when using paired-end orientation modes other than --fr. Even with the bug, reported results are reasonable; but the seed edit constraint (-n) may have been applied to the wrong end of one of the mates. * Changed --chunkmbs default up to 64 from 32. * Better error checking and reporting for some bowtie options. * Some basic testing scripts are now bundled with Bowtie (in scripts/test), which should make it easier to regression-test. Version 0.12.1 - January 8, 2010 * IMPORTANT: Fixed bug whereby bowtie would fail to remove both the primer base and the first color when parsing .csfasta files with primer bases in -C -f mode. A workaround for users of version 0.12.0 is to use "-5 1" in that situation. * Fixed bug whereby, when -M limit was exceeded for an unpaired read, the number printed in the 7th column for the random alignment was too low by 1. * Added documentation discussing a pitfall regarding SOLiD paired- end input, i.e., not all entries necessarily have corresponding mates in the other file. Version 0.12.0 - December 23, 2009 * Added missing README.markdown file * Minor documentation additions Version 0.12.0-beta1 - December 12, 2009 * Added SOLiD colorspace support * Colorspace indexes are distinct from standard letterspace indexes and must be built with a separate invocation of bowtie-build (with -C option) * Running bowtie with -C causes Bowtie to align in colorspace; both index and reads must be in colorspace * Colorspace memory requirement is the same as paired-end alignment in nucleotide-space (normal) mode. Paired-end alignment does not increase the memory requirement further in colorspace. * csfasta, csfastq, and "raw" read formats are all supported with -C; '0' means "blue" and is intechangeable with 'A', likewise '1' ('C') means "green", '2' ('G') means "orange" and '3' ('T') means "red" * Colorspace versions of pre-built indexes added (see Bowtie web site) * New manual section discussing colorspace features * Fixed a few SAM output issues * @PG line now properly uses colons instead of equals signs * Removed /1, /2 suffixes for paired-end reads in SAM mode * Added --sam-RG option that permits the user to insert set values for flags that appear on the @RG line * Fixed lingering pthreads bugs that would cause Bowtie to hang or crash toward the end of execution with -p > 1. * Fixed performance-related bug that would cause paired-end alignment to be artificially slow in many situations. * Fixed issue with random number generation that would result in non-random selection of alignments in some situations. * bowtie -f now supports fasta files with reads split across multiple lines * New --suppress option suppresses unwanted columns of output * The MANUAL file was converted to markdown format, facilitating conversion to various other formats using tools like pandoc * DEPRECATED: bowtie: --concise, bowtie-build: --big, --little * REMOVED: -z/--phased, -b/--binout, bowtie-maptool, bowtie-maqconvert Version 0.11.3 - October 12, 2009 * Fixed crashing bug in -S/--sam mode when the number of reference sequences in the index is very large. * Added --sam-nohead option to suppress output of SAM headers in -S/--sam mode. * Added --sam-nosq option to suppress output of @SQ SAM headers in -S/--sam mode. These can become a nuisance when the reference index contains a very large number of sequences. * Fixed a bug in bowtie-build's auto-configure mode that would cause it to underestimate the amount of memory required by a set of parameters. This in turn would cause the index to be corrupted. Version 0.11.2 - October 7, 2009 * Fixed issue whereby --max option was disabled. Version 0.11.1 - October 5, 2009 * SAM output: changed XS:i optional field to be named XA:i to avoid a conflict with TopHat's XS:i field. Version 0.11.0 - October 5, 2009 * Initial SAM output support with -S/--sam option. Bowtie sets all fields according to the SAM spec (Version 0.1.2-draft, 20090820). See the new "SAM Output" section of the manual for details. * Added --shmem option: --shmem is similar to --mm in that it allows concurrent 'bowtie' processes querying the same index to share a single memory image of the index. Unlike --mm, shared memory alocated by --shmem is permanent. * The alignment summary printed to stderr at the end of an alignment run is now more friendly and includes data about the number and proportion of reads that aligned, failed to align, or were suppressed via the -m option. * When too-short reads are encountered, Bowtie now always prints warnings, not errors. --quiet now suppresses those warnings. * By default, when bowtie prints a reference sequence name it now stops at the first whitespace. In 0.10.1, the default was to print the entire name, which could cause confusion when parsing Bowtie output. To revert to printing the full name, use the new --fullref option. * Bowtie now prints the command-line before exiting with an error. * Fixed mistake in the manual's "Default output" section: the offset in field 4 is 0-based, not 1-based. To obtain a 1-based offset instead, use the -B 1 option. * Various minor bug fixes. * DEPRECATED: -z/--phased, -b/--binout, bowtie-maptool, bowtie-maqconvert. These features will be removed in a future version of Bowtie. Note that -b/--binout, bowtie-maptool, and bowtie-maqconvert are largely superseded by the SAM output format (-S/--sam), BAM, and SAMtools (http://samtools.sf.net). Contact the authors if this is a problem. * REMOVED: --unfq/--unfa/--maxfq/--maxfa/--alfq/--alfa. Please use --un/--max/--al instead. Contact the authors if this is a problem. Version 0.10.1 - July 19, 2009 * Now when -3/-5 are used in combination with -I/-X, the -I/-X constraints are interpreted as applying to the original insert, not the trimmed insert. * Fixed issue whereby -I option was ignored; -I option works now. * Fixed a bug whereby some large indexes were incorrectly reported as corrupt by bowtie-build. * Fixed issue whereby negative quality values were wrongly rejected when both --integer-quals and --solexa-quals were specified. * The -l/--seedlen parameter can now be adjusted down to 5 (previously had to be >= 20). * Fixed several minor memory leaks and out-of-bounds issues. The Linux version of the bowtie aligner now receives a clean bill of health from valgrind's memcheck. * Other minor bugfixes. Version 0.10.0.2 - 6/28/09 * Second bugfix for Windows version. src and bin-win32 packages updated. Linux and Mac users are not affected. Thanks for your bug reports and patience. Version 0.10.0.1 - 6/23/09 * Fix for crashing bug in Windows version. src and bin-win32 packages updated. Linux and Mac users are not affected. Version 0.10.0 - June 12, 2009 * Major change: All alignment modes are now unstratified by default. The --nostrata option has been removed, since it is now the default. A --strata option has been added to override the default and force stratified reporting. Reporting is stratified if and only if --strata is specified. --strata now cannot be specified without also specifying --best. Please note that, because of this change, specifying the same arguments to this version of Bowtie may yield different reported results. * Replaced the --unfa/--unfq options with a single --un option, which writes unaligned reads to an output file (or pair of output files) but keeps reads in their original form. This is in contrast to the old --unfa/--unfq options, which only supported FASTA or FASTQ formats, and which would print a post-trimming and post-quality-value translation version of the read. Likewise, the --alfa/--alfq and --maxfa/--maxfq options have been replaced with --al and --max options. The old options are still present, but are deprecated and will be removed in a future version. * Added --nofw and --norc options, allowing alignment to just one reference strand or the other. * Added --mm option that causes bowtie to use memory-mapped files instead of traditional file I/O to access the reference index. This allows multiple bowtie processes running on the same computer to share a single in-memory image of a given index. This is a useful feature for parallelizing bowtie in situations where memory is limited and where -p is inappropriate or insufficient. This feature is not available in the Windows version of Bowtie. * Added a section to the manual ("Reporting Modes") clarifying and giving examples of how to use Bowtie's reporting options. * The --al and -z/--phased options previously interacted in such a way that the --al file could contain multiple entries for the same aligned read. --al and -z/--phased are now incompatible. * The --oldpmap option, deprecated in version 0.9.8, has been removed. Version 0.9.9.3 - May 12, 2009 * Fixed an issue where bowtie --best would sometimes use excessive amounts of memory to store path descriptors. There is now a per- thread 32-MB ceiling (configurable with new option --chunkmbs ) on the memory taken by path descriptors. If the ceiling is exceeded Bowtie will skip the offending read, print a warning message identifying the read, and continue. * More options are available for defining the quality-value format, including new --phred64-quals/--solexa1.3-quals options appropriate for the 64-based-Phred output of Illumina's GA Pipeline 1.3. Added option --phred33-quals to to handle the more typical 33-based-Phred scale (the default). The --solexa-quals option still handles the 64-based-Solexa scale output by GA Pipeline versions prior to 1.3. * bowtie-build now checks output files for obvious corruption due, for example, to disk exhaustion. * Specifying "-" (meaning stdin) as an input to bowtie is now supported and documented. * Fixed a bug whereby bowtie-maqconvert could fail to notice that it had exhausted memory and output a corrupt Maq map file. * Fixed a bug whereby bowtie would crash when trying to use an index built on a machine with different endianness. * Fixed several issues that prevented Bowtie from compiling on Solaris. I confirm that Bowtie builds and runs on Solaris. * Added _LARGEFILE_SOURCE _FILE_OFFSET_BITS=64 _GNU_SOURCE to the default build options in an attempt to resolve some of the large- file issues users are having. * Clarified column 7 in the manual. We received many queries from users curious about this number. * Moderate speed improvements in --best mode. Version 0.9.9.2 - April 6, 2009 * Paired-end alignment is now available in all alignment modes, including all -n modes. * --best now provides better guarantees. Reported alignments are now guaranteed to be "best" both in terms of stratum (i.e. number of mismatches, or mismatches in the seed in the case of -n mode), and in terms of the quality values at the mismatched position(s). Stratum always trumps quality when determining best alignments. Also, --best mode resolves the strand bias issue (see manual for a discussion of the issue). * Speed improvements for --best mode in most alignment modes. * Major speed improvement for the -v 3 alignment mode (except when -z is also used) * The "Reported X alignments..." message is now printed to stderr rather than stdout. Only alignments are written to stdout. * In bowtie-maqconvert, read names longer than Maq's limit (36) are now truncated to a suffix of the original name, rather than a prefix. This mimics Maq's behavior and prevents "/1" and "/2" suffixes for paired-end reads from being destroyed. * Added --alfq/--alfa options to dump aligned reads to FASTQ and/or FASTA files. * Removed many extraneous source files. Version 0.9.9.1 - March 10, 2009 * Added paired-end alignment for -v 2 and -v 3 alignment modes (-n modes coming soon). * Minor bug fixes and speed improvements for all paired-end modes. * Added -s/--skip option to skip over the first reads or pairs in the input. * --unfq/--unfa/--maxfq/--maxfa modes no longer create empty output files. * All Bowtie tools now compile under GCC 4.3.3. * Fixed bug whereby bowtie -b would sometimes write garbage into the reference offset field. * Paired-end info is now persisted in the -b format, allowing bowtie-maptool output to add "/1" and "/2" suffixes as appropriate. Version 0.9.9 - February 19, 2009 * Added some preliminary support for paired-end alignment in -v 0 and -v 1 modes. -1/-2 options to specify the paired-end files, -I/-X to specify min and max insert sizes, and --fr/--rf/--ff specify relative orientation of upstream and downstream mates. bowtie-build now builds two additional files: NAME.3.ebwt and NAME.4.ebwt. Together, these files store a bitpacked version of the reference and they are required for paired-end alignment. If your index does not include these files and you would like to perform a paired-end alignment, you will have to rebuild the index with bowtie-build version 0.9.9 or later. Paired-end alignment is not compatible with -z mode, and it incurs about a 30% greater memory overhead than single-end mode. * Pre-built indexes available from Bowtie website have been updated to include .3/.4.ebwt index files. These new pre-built indexes are no longer compatible with bowtie versions prior to 0.9.8. * New -B/--offbase option allows user to specify how bowtie numbers reference positions in its output. E.g. -B 1 causes bowtie to number leftmost char as 1. -B 0 is the default, but -B 1 will likely become the default in the 1.0 release. * Fixed a bug that caused trimming options -3 and -5 not to work properly in -r (raw input) mode. * bowtie-build now prints a friendly error message and exits if an input file doesn't exist. * Fixed a bug that caused the Win32 version of bowtie to hang just before it would normally have exited. * Fixed bug that could prevent successful read-in of very large (>1GB) .2.ebwt index files. * Removed --maxns option since it's mostly redundant with what -v and -n already do. * Removed --ntoa option. * bowtie usage message is now divided into sections for clarity. Version 0.9.8.1 - January 7, 2009 * Fixed all known problems with the --unfa/--unfq options: * They now work properly with multiple threads. * Fixed issue where sequence and quals were sometimes reversed. * Fixed other issues causing spurious omission of unaligned reads. * Added --maxfa/--maxfq options so that reads that don't align due to the -m limit can be dumped separately from reads that don't align at all. * Alignment output is now guaranteed to be "deterministic" even when multiple threads are used. I.e., given the same input reads (in any order) and the same --seed, bowtie will produce the same alignments every time it is run, though not necessarily in the same order. This does not hold across different versions of Bowtie. * Multiple other bug fixes. Version 0.9.8 - November 25, 2008 * --unfa/--unfq options cause bowtie to dump unaligned reads to FASTA and/or FASTQ files. * bowtie-build now selects its memory-efficiency parameters (--bmax, --dcv, --packed) automatically by default; this makes it far easier to build an index under memory constraints by eliminating tedious trail-and-error. New -a option disables this, yielding old behavior. * bowtie-build-packed is no longer a separate binary. Supplying the new -p/--packed argument to bowtie-build is the new equivalent. * New tool bowtie-maptool converts between Bowtie's output formats. * New tool bowtie-inspect recreates reference strings from Bowtie index. * Renamed bowtie-convert to bowtie-maqconvert for clarity. * New universal Mac binary combines i386 & x86_64 binaries. PowerPC still not supported. * Added --nomaqround option to bowtie. * Fixed memory leaks in bowtie. * Switched to a new scheme for mapping positions in "joined" reference string to positions in original strings. This changes the index format. bowtie-build's --oldpmap parameter reverts to the old format. Versions of bowtie prior to 0.9.8 cannot search indexes produced by bowtie-build 0.9.8 unless bowtie-build is run with --oldpmap. bowtie 0.9.8 can search either index format. Pre-built indexes are still in the old format, but will switch to new format when Bowtie 1.0 is released. Version 0.9.7.1 - November 11, 2008 * Fixed an issue that caused a spurious loss of sensitivity between Bowtie versions 0.9.6 and 0.9.7 in certain modes. Many thanks to Ali Mortazavi for bringing this to our attention. Version 0.9.7 - November 8, 2008 * Added new reporting option -m which suppresses all alignments for a particular read if more than reportable alignments exist for it. * Threads now buffer all alignments for a particular read/phase then output all alignments in one critical section. This guarantees that all alignments for a given read/phase appear in one consecutive block of the output, even when multiple threads are operating in parallel. * Separated the quality-conversion and parsing aspects of the old --solexa-quals argument into separate arguments: --solexa-quals (quality conversion) and --integer-quals (parsing). * bowtie-convert now handles the new (post-0.7.0) Maq alignment format. The new format allows Maq tools to handle reads up to 127 bases, whereas the old format was limited to 63 bases. Added a -o option to opt for the old Maq format. * New --refout argument sends alignments to a set of files named refXXXXX.map, where XXXXX is the 0-padded index of the reference sequence aligned to. Useful for dealing with large datasets aligned to, e.g., the assembled human genome. * Improved tutorial to use a simple simulated read set (included) to do SNP calls with Maq. * Added --nota option to bowtie-build * Fixed make_h_sapiens_asm.sh script to include mitochondrial DNA. Version 0.9.6 - October 10, 2008 * 'bowtie' now supports a host of options that allow the user to specify which and how many valid alignments to report per read. The default is still to report 1 "good" alignment, which is by far the fastest mode. See -k/-a/--best/--nostrata options described in the manual for details. * 'bowtie' now supports reads up to 1024 bases long. Note that for reads much longer than, say, 35 bases, the user must be careful to set alignment policy parameters (especially -e) appropriately. * --fast flag eliminated, double-index mode is now the default. Added the -z/--phased flag to revert to phased, half-index mode. * --concise output mode now officially supported. Now outputs one alignment per line. * Changed 'bowtie-build' default back to --bmaxdivn 4. * -h/--help now prints much more verbose help for 'bowtie' and 'bowtie-build' (verbatim from MANUAL file) * BWT-searching code streamlined; much old code eliminated Version 0.9.5 - September 27, 2008 * Last column of output now additionally reports the reference and query bases (in that order) for mismatches. E.g., old: "30,32", new: "30:C>A,32:C>T". * Eliminated spurious trailing space in first column of output. * Minor performance and sensitivity improvements. * New option '-p' spawns a user-specified number of pthreads for parallel processing of reads. For example, use '-p 4' to run 'bowtie' on 4 processor cores simultaneously. * Due to the new '-p' option, 'bowtie' needs pthreads to compile and run. To compile 'bowtie' without pthreads support (which disables the '-p' option), use 'make BOWTIE_PTHREADS=0'. * Also due to '-p' option, the Windows version of Bowtie now comes with the pthreadGC2.dll file from the pthreads for Win32 project (http://sourceware.org/pthreads-win32). This library is released under the LGPL license. * New option '--fast' causes Bowtie to load both the "forward" and "mirror" halves of the index at once, which eliminates the need for multiple phases and speeds up matching at the cost of using about twice as much memory. '--fast' also causes 'bowtie' to scale better when used in combination with '-p'. * Fixed crashing bug with -o/--offrate in 'bowtie'. * Improved error reporting. Version 0.9.4 - September 16, 2008 * New method for handling gaps and ambiguity codes in the reference. New 'bowtie-build' method handles long stretches of gaps gracefully. New 'bowtie' rejects alignments that overlap a gap or ambiguous character in the reference. * Due to above change, index file format has been changed. All pre-built indexes available on this site have been updated to the new format. To obtain indexes with the old format, contact us. * In 'bowtie' unnamed reads are now given ordinal names (rather than "default") in the alignment output. Works for all input modes. * New 'bowtie' input mode: Raw, activated with -r. Expects one read sequence per line; no quality values or names. * Fixed 'bowtie' bug whereby trimming did not work in -c mode. * Changed 'bowtie-build' default to not use blockwise mode. * Changed 'bowtie-build' to avoid certain infinite-loop and very- long-runtime scenarios. * Packaging improvements: archives now explode into subdirectories and scripts are executable. Version 0.9.3 - September 6, 2008 * Major reference-name bug fixes to bowtie-convert Version 0.9.2 - September 4, 2008 * Now allows 3-mismatches: -n and -v options accept 3 * Output format prints reference name instead of id in third column * Pre-built indexes updated to encode reference names * Ns in reads now match nothing (previously, they matched As/Ts) * Dropped -l/--linerate and -i/--linesperside arguments to bowtie- build * Fixed bug in Maq-like mode that allowed some poor alignments * Minor speed improvements Version 0.9.1 - August 25, 2008 * Integrated relevant SeqAn-1.1 sources into Bowtie source release * Now builds on Windows under MinGW (needs pthreads and zlib) * Binary releases for Linux (i386, x86_64), Windows (i386) and MacOS X (i386) Version 0.9.0 - August 18, 2008 * First stable release of Bowtie. * Includes the three core Bowtie tools: the indexer 'bowtie-build', the read aligner 'bowtie' and the converter from Bowtie's to Maq's mapping output format, 'bowtie-convert'. * Compatible pre-built indexes for many model organisms are available from http://bowtie-bio.sf.net. * FASTA, FASTQ inputs supported; tested with Solexa FASTQ * Supports Maq alignment policy (-n and -e behave as in Maq) * Supports X-mismatch policy (-v option behaves as in SOAP) * -n and -v options accept 0, 1, or 2