wfa2lib-rs

A pure Rust port of WFA2-lib, the wavefront alignment library for pairwise sequence alignment.

This project was created with Claude Code.

Overview

wfa2lib-rs is a from-scratch Rust implementation of the Wavefront Alignment (WFA) algorithm. WFA exploits the structure of sequence similarity to achieve O(ns) time complexity — proportional to the sequence length (n) and the alignment score (s) — making it significantly faster than traditional O(n²) dynamic programming approaches for similar sequences.

This implementation covers the full WFA feature set:

Distance metrics: indel (LCS), edit (Levenshtein), gap-linear (Needleman-Wunsch), gap-affine (Smith-Waterman-Gotoh), and gap-affine 2-piece (concave penalties)
Alignment modes: end-to-end (global), ends-free (semi-global), and extension alignment
BiWFA: bidirectional WFA for O(s) memory alignment with full CIGAR output
Heuristics: WF-adaptive, X-drop, Z-drop, static banding, and adaptive banding for pruning divergent sequences
Sequence representations: ASCII byte sequences, packed 2-bit DNA encoding, and a lambda callback interface for custom sequence access
SIMD acceleration: NEON-vectorized compute kernels on aarch64 (4 diagonals/iteration), AVX2 on x86_64 (8 diagonals/iteration), with scalar fallback
Score and alignment: score-only mode for speed, or full CIGAR traceback with support for match rewards via Eizenga's formula

Performance

Benchmarked against the C reference implementation (WFA2-lib) on Apple Silicon (M-series), 50,000 sequence pairs of 1000 bp at 5% divergence:

Benchmark	Rust vs C	Notes
Edit distance (score)	0.59x	1.7× faster (NEON SIMD)
Gap-linear (score)	0.69x	1.4× faster
Gap-affine (score)	0.94x	1.06× faster
Gap-affine (CIGAR)	0.95x	1.06× faster
BiWFA (ultralow memory)	0.92x	1.09× faster
Gap-affine 2-piece (score)	1.00x	Parity
Gap-affine 2-piece (CIGAR)	1.03x

Ratio < 1.0 means Rust is faster. All modes are at parity or faster than C. Key optimizations include NEON/AVX2 SIMD vectorization, arena bump allocation matching C's mm_allocator pattern, conditional kernel dispatch (affine fallback when O2/E2 inputs are null), wavefront reuse between alignments, and blockwise 64-bit sequence extension.

Building

Requires Rust 1.91+ (edition 2024).

# Debug build
cargo build

# Optimized release build (recommended)
RUSTFLAGS="-C target-cpu=native" cargo build --release

# Run tests (228 tests: 210 unit + 18 integration)
cargo test

Usage

As a library

use wfa2lib_rs::aligner::{AlignmentScope, WavefrontAligner};
use wfa2lib_rs::penalties::{AffinePenalties, DistanceMetric, WavefrontPenalties};

// Configure gap-affine penalties (mismatch=4, gap_open=6, gap_extend=2)
let penalties = WavefrontPenalties::new_affine(AffinePenalties {
    match_: 0,
    mismatch: 4,
    gap_opening: 6,
    gap_extension: 2,
});

let mut aligner = WavefrontAligner::new(penalties);
aligner.alignment_scope = AlignmentScope::ComputeAlignment;

let pattern = b"TCTTTACTCGCGCGTTGGAGAAATACAATAGT";
let text    = b"TCTATACTGCGCGTTTGGAGAAATAAAATAGT";

let score = aligner.align_end2end(pattern, text);
println!("Score: {}", score);
println!("CIGAR: {}", aligner.cigar);

CLI benchmark tool

The included align_benchmark binary is compatible with the WFA2-lib reference interface:

# Gap-affine alignment
cargo run --release --bin align_benchmark -- \
    -a gap-affine-wfa \
    -g 0,4,6,2 \
    -i sequences.seq

# BiWFA (memory-efficient)
cargo run --release --bin align_benchmark -- \
    -a gap-affine-wfa \
    -g 0,4,6,2 \
    --wfa-memory ultralow \
    -i sequences.seq

# Score-only mode
cargo run --release --bin align_benchmark -- \
    -a gap-affine-wfa \
    -g 0,4,6,2 \
    --wfa-score-only \
    -i sequences.seq

Input files use the WFA2-lib format: one alignment pair per line, with pattern and text separated by a > character.

Project structure

src/
  lib.rs              # Public module exports
  aligner.rs          # Main WavefrontAligner struct and alignment loops
  penalties.rs        # Distance metrics and penalty configurations
  sequences.rs        # Sequence representations (ASCII, packed 2-bit, lambda)
  wavefront.rs        # Wavefront data structure
  wavefront_set.rs    # Grouped wavefront sets (M, I, D components)
  slab.rs             # Wavefront memory pool with arena allocation
  arena.rs            # Byte-oriented bump allocator for wavefront arrays
  compute/            # WFA compute kernels (edit, linear, affine, affine2p)
  extend.rs           # Wavefront extend with SIMD acceleration
  backtrace.rs        # CIGAR traceback from wavefronts
  bialign.rs          # Bidirectional WFA (BiWFA)
  cigar.rs            # CIGAR operations and scoring
  heuristic.rs        # Pruning heuristics
  offset.rs           # Wavefront offset type and coordinate helpers
  pcigar.rs           # Packed CIGAR for BiWFA backtrace
  bt_buffer.rs        # Backtrace buffer for BiWFA
  components.rs       # Wavefront component management
  termination.rs      # Alignment termination conditions
  bin/
    align_benchmark.rs  # CLI binary

License

BSD 3-Clause. See LICENSE.

Citations

If you use this software, please cite the original WFA papers:

WFA — Wavefront Alignment:

Santiago Marco-Sola, Juan Carlos Moure, Miquel Moreto, Antonio Espinosa. "Fast gap-affine pairwise alignment using the wavefront algorithm." Bioinformatics, Volume 37, Issue 4, February 2021, Pages 456–463. https://doi.org/10.1093/bioinformatics/btaa777

BiWFA — Bidirectional WFA:

Santiago Marco-Sola, Jordan M Eizenga, Andrea Guarracino, Benedict Paten, Erik Garrison, Miquel Moreto. "Optimal gap-affine alignment in O(s) space." Bioinformatics, Volume 39, Issue 2, February 2023, btad074. https://doi.org/10.1093/bioinformatics/btad074

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
.cargo		.cargo
.claude		.claude
bench		bench
src		src
tests		tests
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

wfa2lib-rs

Overview

Performance

Building

Usage

As a library

CLI benchmark tool

Project structure

License

Citations

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Languages

Folders and files

Latest commit

History

Repository files navigation

wfa2lib-rs

Overview

Performance

Building

Usage

As a library

CLI benchmark tool

Project structure

License

Citations

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Languages

Packages