GitHub - biosustain/dsp_nf-nanoseq-preprocess: nextflow pipeline to parse data from our datalake into a format directly compatible with nf-core/nanoseq

This is the repository for the Nextflow pipeline that preprocesses GridION fastq files and prepares a samplesheet.csv for the nf-core/nanoseq DNA protocol.

The pipeline takes an in-house parquet file with metadata (sample, replicate, barcode, and more) and an experiment project directory, and generates a compatible samplesheet.csv from them.

Expected input params:

*parquetpath - path to parquet file of metadata

publishDir - location to put samplesheet.csv; merged fastq files will be in a subdirectory called fastq_merged

*fasta - path to reference fasta file

*gtf - path to reference gtf file

reference - to be implemented: string ID of the strain, used to define the URLs to the fasta and gtf files
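
The parameters listed above could be collected in a params file along the lines of the repository's example-params.yml. The following is only a minimal sketch: the key names come from the list above, while every path is a hypothetical placeholder, and the actual contents of example-params.yml may differ.

```yaml
# Hypothetical values for illustration; only the key names come from the list above.
parquetpath: /path/to/metadata.parquet   # marked with * in the list above
publishDir: /path/to/output              # samplesheet.csv goes here, merged fastq files in fastq_merged/
fasta: /path/to/reference.fasta          # marked with * in the list above
gtf: /path/to/reference.gtf              # marked with * in the list above
# reference: <strain ID>                 # listed as "to be implemented", so left commented out
```

Assuming a standard Nextflow installation, such a file can be passed with Nextflow's -params-file option, for example: nextflow run main.nf -params-file my-params.yml (my-params.yml being a hypothetical file name).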

Repository contents:

modules/local
subworkflows/local
.gitignore
LICENSE
README.md
example-params.yml
main.nf
nanoseq_pre_v2.png
