Releases: alekseyzimin/masurca
MaSuRCA v4.1.4
This release improves the chromosome_scaffolder.sh script, which performs reference-assisted scaffolding of assemblies. The changes include better logic for computing contig coordinates and better treatment of contained alignments.
Please install MaSuRCA from the attached archive MaSuRCA-4.1.4.tar.gz. Do not use the Source files below.
To install:
tar xzf MaSuRCA-4.1.4.tar.gz
cd MaSuRCA-4.1.4
./install.sh
MaSuRCA v4.1.3
This release contained bug fixes and improvements. The freebayes binary has been updated to resolve an incompatibility on some systems.
Please install MaSuRCA from the attached archive MaSuRCA-4.1.3.tar.gz. Do not use the Source files.
To install:
tar xzf MaSuRCA-4.1.3.tar.gz
cd MaSuRCA-4.1.3
./install.sh
MaSuRCA v4.1.2
This release contained bug fixes and improvements, primarily to chromosome scaffolder -- the component that performs reference-based scaffolding of an assembly.
Please install MaSuRCA from the attached archive MaSuRCA-4.1.2.tar.gz. Do not use the Source files.
MaSuRCA v4.1.1
This release contained bug fixes and improvements. There is a new option for running mega-reads on the grid: GRID_ENGINE=MANUAL. This option will produce a script to run mega-reads correction jobs on multiple servers manually, and provide instructions on how to execute the jobs and restart the assembly.
Please install MaSuRCA from the attached archive MaSuRCA-4.1.1.tar.gz. Do not use the Source files.
MaSuRCA 4.1.0
This release introduces multiple improvements and compatibility fixes:
- The Eugene annotation pipeline (eugene.sh), based on the Maker software, was improved significantly
- The SAMBA scaffolder's performance and accuracy were improved
- The MaSuRCA assembler code received compatibility fixes for problems that prevented it from running on some systems that do not support numactl
- close_scaffold_gaps.sh, a wrapper for the SAMBA scaffolder aimed at closing gaps in existing scaffolds, was improved
MaSuRCA 4.0.9
This release has major improvements to SAMBA scaffolder, and minor improvements to POLCA polisher and reference-based chromosome scaffolder.
Detection of misassemblies in SAMBA is improved, along with the accuracy of gap-filling consensus sequences and the structural quality of the output contigs. If scaffolds with gaps are given to SAMBA, it no longer treats gaps as misassemblies and avoids splitting at or near gaps. SAMBA runs automatically as the last step of the MaSuRCA assembler, resulting in more contiguous and correct assemblies.
The POLCA polisher now outputs the QV value. POLCA can also be used as an integrated variant calling/assembly evaluation pipeline. With the "-n" switch it makes no changes to the assembly; instead it produces a VCF file with all variant calls in the reads against the assembly, and outputs an evaluation of consensus quality.
The close_scaffold_gaps.sh wrapper to SAMBA has been improved as well, and this script can be used to effectively close gaps in scaffolds with another assembly (or a reference genome for closely related species) or additional long-read data. Usage: close_scaffold_gaps.sh -h.
Performance, stability and accuracy of the chromosome scaffolder tool (chromosome_scaffolder.sh) have been improved.
MINOR UPDATE 04/29/2022: removed deprecated sys/sysctl.h header from CA8. The header was deprecated in glibc 2.32, and its presence prevented compilation on newer systems.
MaSuRCA 4.0.8
This release fixes a bug in SAMBA that resulted in failure in nucmer alignment step on some data sets.
SAMBA can now read the scaffolding sequences from a gzipped fasta file. The sequences to be scaffolded must be in fasta format and must not be gzipped.
This release also improves usage messages.
MaSuRCA 4.0.7
This release has significant improvements to the SAMBA scaffolder in error rates, output contiguity, and consensus quality. Since SAMBA is now part of the default MaSuRCA assembly pipeline, the quality and contiguity of MaSuRCA assemblies improve as well.
I also added assembly QV computation to POLCA; the QV for the assembly is now reported in the .report file, along with the other metrics. Note that the POLCA polisher has a -n option that runs it in "evaluation" mode, where it outputs the number of errors it detects in the assembly but does not make any corrections. One can then rerun the pipeline without the -n switch to make the corrections. The -n option is also useful for efficiently producing a VCF file containing the variant calls made by freebayes.
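The evaluate-then-correct workflow described above could be scripted roughly as follows. This is a minimal sketch, not the author's recommended procedure: the input file names, the thread count, and the run_polca wrapper are hypothetical, and it assumes polca.sh takes the assembly with -a, the quoted Illumina read files with -r, and the thread count with -t, in addition to the -n switch described here.

```shell
#!/bin/sh
# Hypothetical input names; replace with your own assembly and Illumina reads.
ASSEMBLY="assembly.fasta"
READS="reads_1.fastq reads_2.fastq"
THREADS=16

# Run POLCA in evaluation mode (-n) or in correction mode (no -n).
# run_polca is an illustrative wrapper, not part of MaSuRCA.
run_polca() {
  mode="$1"
  if ! command -v polca.sh >/dev/null 2>&1; then
    # Skip quietly when MaSuRCA's bin directory is not on PATH.
    echo "polca.sh not found on PATH; skipping $mode run"
    return 0
  fi
  if [ "$mode" = "evaluate" ]; then
    polca.sh -a "$ASSEMBLY" -r "$READS" -t "$THREADS" -n
  else
    polca.sh -a "$ASSEMBLY" -r "$READS" -t "$THREADS"
  fi
}

run_polca evaluate   # count errors and compute QV without changing the assembly
run_polca correct    # rerun without -n to apply the corrections
```

Running the evaluation pass first lets you inspect the reported error count and QV before deciding whether a correction pass is needed.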
MaSuRCA 4.0.6
The 4.0.6 release introduces code cleanup and performance improvements in MaSuRCA assembly pipeline, POLCA error correction/assembly evaluation tool and SAMBA scaffolder.
In response to several issues raised by users of POLCA and the chromosome scaffolder, I recommend that users install MaSuRCA in a separate folder with the provided install.sh script, as opposed to installing it globally into /usr/local/bin. MaSuRCA is self-contained and does not require root privileges to compile, install and run. Many components of MaSuRCA depend on appropriate versions of binaries such as samtools, mummer and jellyfish, which are provided with MaSuRCA, and may produce errors if the system attempts to use different versions of these tools available on the $PATH. For these specific versions MaSuRCA will always first look to use the binaries installed under /path-to/MaSuRCA-x.x.x/bin/.
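When diagnosing a version conflict of this kind, it can help to see which copy of each tool the shell actually resolves. The sketch below is one possible check, not part of MaSuRCA: MASURCA_HOME and the check_tool helper are hypothetical names, /path-to/MaSuRCA-x.x.x remains a placeholder for your own install folder, and nucmer is used as a representative MUMmer binary.

```shell
#!/bin/sh
# Hypothetical install location; substitute the folder created by install.sh.
MASURCA_HOME="${MASURCA_HOME:-/path-to/MaSuRCA-x.x.x}"

# Report where a tool resolves on PATH and whether that is the bundled copy.
# check_tool is an illustrative helper, not part of MaSuRCA.
check_tool() {
  tool="$1"
  resolved=$(command -v "$tool" 2>/dev/null) || resolved=""
  if [ -z "$resolved" ]; then
    echo "$tool: not found on PATH"
  elif [ "$resolved" = "$MASURCA_HOME/bin/$tool" ]; then
    echo "$tool: bundled ($resolved)"
  else
    echo "$tool: external ($resolved)"
  fi
}

# Optional precaution: put the bundled binaries first for this session
# so they shadow any system-wide versions.
PATH="$MASURCA_HOME/bin:$PATH"
export PATH

for t in samtools nucmer jellyfish; do
  check_tool "$t"
done
```

A tool reported as "external" after the PATH change suggests that the bundled copy is missing from the install folder, which may be worth checking before rerunning POLCA or the chromosome scaffolder.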
MaSuRCA 4.0.5
This is a maintenance release that improves the stability of masurca scaffolder (soon to be published as SAMBA tool), and improves speed and consensus quality of the hybrid assemblies.
Major changes:
- upgraded swig headers to version 4.0.2
- fixed occasional division by zero bug in masurca scaffolder
- removed the step of k-mer size reduction for the super-reads; it is not needed with the improvements that have been made recently
- the masurca_scaffolder.sh tool has been renamed to samba.sh tool (manuscript in preparation)
- The SAMBA tool can be used to close intrascaffold gaps when invoked through close_scaffold_gaps.sh script.