In this follow-up Software Deep Dive episode, we continue our conversation with Dr. Bede Constantinides (University of Birmingham) about the design and implementation of Deacon, a fast host-read removal tool for metagenomics.
Deacon uses minimizers and k-mer set membership queries instead of alignment, allowing it to filter reads extremely quickly while balancing sensitivity and specificity. The tool is written in Rust, producing a small, fast binary and enabling very high throughput.
We also discuss benchmarking with diverse viral and bacterial datasets, why tools like Kraken2 are not always ideal for host depletion, and why host-read removal remains an unsolved problem, especially when balancing privacy, computational cost, and preservation of microbial reads.
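For listeners who want a concrete picture of alignment-free filtering, here is a minimal Python sketch of the general idea described above: build a set of minimizers from host sequence, then flag a read as host when enough of its own minimizers are members of that set. This is not Deacon's code. The function names, the hash function, and the values of `k`, `w`, and `min_hits` are all illustrative assumptions, and the sketch ignores reverse complements for brevity.

```python
import hashlib


def kmer_hash(kmer):
    """Stable 64-bit hash of a k-mer (illustrative; real tools use much faster hashes)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minimizers(seq, k=5, w=4):
    """Set of minimizer hashes: the smallest k-mer hash in each window of w consecutive k-mers."""
    hashes = [kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    if not hashes:
        return set()  # sequence shorter than k
    if len(hashes) < w:
        return {min(hashes)}  # fewer k-mers than one full window
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def build_index(host_seqs, k=5, w=4):
    """Union of minimizers over all host sequences (hypothetical in-memory index)."""
    index = set()
    for seq in host_seqs:
        index |= minimizers(seq, k, w)
    return index


def is_host(read, index, k=5, w=4, min_hits=2):
    """Call a read 'host' if at least min_hits of its distinct minimizers are in the index."""
    hits = len(minimizers(read, k, w) & index)
    return hits >= min_hits


if __name__ == "__main__":
    host = "ACGTTGCATGCCGATAGCTAGGCTTACGGATCCA"  # toy stand-in for a host genome
    index = build_index([host])
    reads = {"from_host": host[5:25], "not_host": "AAAAAAAAAAAAAAAA"}
    for name, read in reads.items():
        print(name, is_host(read, index, min_hits=1))
```

Raising `min_hits` trades sensitivity for specificity: fewer microbial reads are discarded by chance matches, at the cost of letting more divergent host reads through. Set membership lookups of this kind avoid the cost of computing alignments, which is the source of the speed advantage discussed in the episode.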
Links
Deacon
https://github.com/bede/deacon
Hostile
https://github.com/bede/hostile/
Bede Constantinides
http://bede.im/
Kraken2
https://ccb.jhu.edu/software/kraken2/