Data in Biotech

The Power of Open-Source Pipelines for Scientific Research with Harshil Patel


This week, Harshil Patel, Director of Scientific Development at Seqera, joins the Data in Biotech podcast to discuss the importance of collaborative, open-source projects in scientific research and how they support the need for reproducibility.

Harshil lifts the lid on how Nextflow has become a leading open-source workflow management tool for scientists and the benefits of using an open-source model. He talks in detail about the development of Nextflow and the wider Seqera ecosystem, the vision behind it, and the advantages and challenges of this approach to tooling.

He discusses how the nf-core community collaboratively develops and maintains over 100 pipelines using Nextflow and how the decision to constrain pipelines to one per analysis type promotes collaboration and consistency and avoids turning pipelines into the “wild west.”

We also look more practically at Nextflow adoption as Harshil delves into some of the challenges and how to overcome them.

He explores the wider Seqera ecosystem and how it helps users manage pipelines, analysis, and cloud infrastructure more efficiently, and he looks ahead to the future evolution of scientific research. 

Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.

---

Chapter Markers

[1:23] Harshil shares a quick overview of his background in bioinformatics and his route to joining Seqera.

[3:37] Harshil gives an introduction to Nextflow, including its origins, development, and the benefits of using the platform for scientists.

[9:50] Harshil expands on some of the off-the-shelf pipelines available through nf-core and how the collection is continuing to expand beyond genomics.

[12:08] Harshil explains nf-core’s open-source model, the advantages of constraining pipelines to one per analysis type, and how the Nextflow community works.

[17:43] Harshil talks about Nextflow's custom DSL and the advantages it offers users.

[20:23] Harshil explains how Nextflow fits into the broader Seqera ecosystem. 

[26:08] Ross asks Harshil about overcoming some of the challenges that arise with parallelization and optimizing pipelines.

[28:01] Harshil talks about the features of Wave, Seqera’s containerization solution. 

[32:16] Ross asks Harshil to share some of the most complex and impressive things he has seen done within the Seqera ecosystem.

[35:42] Harshil gives his take on how he expects biotech and genomics research to evolve.

---

Download our latest white paper on “Using Machine Learning to Implement Mid-Manufacture Quality Control in the Biotech Sector.”

Visit this link: https://connect.corrdyn.com/biotech-ml


Data in Biotech, by CorrDyn


