nf-core is more than a collection of state-of-the-art pipelines. It also offers many modules that can easily be included in your own workflows. This is a very quick guide on how to do that.
- Install nf-core tools and pre-commit
pip install nf-core
pip install pre-commit
- Initialize pre-commit by writing a .pre-commit-config.yaml file.
This is an example:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
Then install the git hook with pre-commit install:
pre-commit install
pre-commit installed at .git/hooks/pre-commit
and run the hooks once on all files:
pre-commit run --all-files
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...............................................................Passed
- Add a manifest section, together with env and profiles, to your nextflow.config.
manifest {
    name = 'GATK_WGS_preprocessing'
    author = 'Luca Cozzuto'
    description = 'A description of your pipeline'
    version = '2.0'
}
env {
    R_PROFILE_USER = "/.Rprofile"
    R_ENVIRON_USER = "/.Renviron"
    PYTHONNOUSERSITE = 1
}
profiles {
    myprofile {
        includeConfig 'conf/myprofile.config'
    }
}
- Search for a module
nf-core modules list remote
- Install a module
nf-core modules install fastqc
The first time you run it, the tool asks about the repository type. Indicate that you are installing within a pipeline:
nf-core modules install fastqc
                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'
nf-core/tools version 3.2.0 - https://nf-co.re
WARNING 'repository_type' not defined in .nf-core.yml
? Is this repository a pipeline or a modules repository? Pipeline
INFO To avoid this prompt in the future, add the 'repository_type' key to your .nf-core.yml file.
? Would you like me to add this config now? [y/n] (y): y
INFO Config added to '.nf-core.yml'
INFO The 'modules.json' file is not up to date. Recreating the 'modules.json' file.
? Can't find a ./modules directory. Would you like me to create one? [y/n] (y): y
INFO Creating ./modules directory in '.'
INFO Installing 'fastqc'
INFO Use the following statement to include this module:
include { FASTQC } from '../modules/nf-core/fastqc/main'
- Let's read the docs of the module. Go to https://nf-co.re/modules/fastqc/
The input is a tuple made of a Groovy map with the meta information and the read files, so I built it this way:
include { FASTQC } from "${projectDir}/modules/nf-core/fastqc"
if (params.single == "NO") {
    Channel
        .fromFilePairs( params.reads, checkIfExists: true ) // size: 2 is used by default
        .map {[ [id: it[0], single_end:false], it[1] ] }
        .set { reads }
} else {
    Channel
        .fromFilePairs( params.reads, size: 1, checkIfExists: true)
        .map {[ [id: it[0], single_end:true], it[1] ] }
        .set { reads }
}
workflow {
    FASTQC(reads)
}
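The script above reads params.reads and params.single, which need to be defined somewhere. A minimal sketch of how they could be declared in nextflow.config follows. The glob and the folder names are hypothetical placeholders, not values taken from the original pipeline:

```groovy
// nextflow.config (sketch): defaults for the parameters used in main.nf.
// All paths below are hypothetical and should be adapted to your data.
params {
    // Paired-end glob for fromFilePairs; for single-end data
    // a pattern such as "*.fastq.gz" would be needed instead
    reads  = "${projectDir}/data/*_{1,2}.fastq.gz"

    // "NO" selects the paired-end branch, any other value the single-end one
    single = "NO"

    // Base folder for results
    output = "${projectDir}/output"
}
```

Any of these values can be overridden on the command line, for instance with --reads or --single.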
- In the main.nf code of fastqc, the label indicated is process_medium, so let's define it in our Nextflow config file (myprofile.config):
process {
    withLabel: process_medium {
        cpus = 2
        memory = '12G'
    }
}
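Besides labels, a module can be configured by name. nf-core modules conventionally read extra command line options from task.ext.args and do not publish their results on their own, so a selector like the following may be useful. This is only a sketch: it assumes that the installed fastqc module follows that convention and that params.output is defined in your config. The fastqc subfolder name is arbitrary.

```groovy
// myprofile.config (sketch): per-module settings selected by process name
process {
    withName: 'FASTQC' {
        // Extra options appended to the fastqc command line by the module
        ext.args   = '--quiet'

        // Copy the module results out of the work directory
        // (assumes params.output is defined; 'fastqc' is an arbitrary folder name)
        publishDir = [ path: "${params.output}/fastqc", mode: 'copy' ]
    }
}
```

Keeping these settings in the config leaves the installed module untouched, which makes later updates with nf-core tools easier.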
- Let's now add the tool version and connect it to multiqc
nf-core modules install multiqc
and add it to the include
include { MULTIQC } from "${projectDir}/modules/nf-core/multiqc"
As we can see, the output of the fastqc module consists of three channels: html, zip and versions. The first two emit tuples with the meta map and the files, while the last one emits just a file. We can plug the zip output into multiqc this way:
workflow {
    fqc = FASTQC(reads)
    ch_versions = fqc.versions
    // keep only the zip files, dropping the meta map
    multiqc_data = fqc.zip.map { meta, zip -> zip }
    MULTIQC(multiqc_data.collect(), [], [], [], [], [])
}
The remaining inputs of MULTIQC can be left as empty lists, so they are skipped.
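Before wiring a module into MultiQC, it can help to print what one of its output channels emits. A minimal sketch, assuming the installed fastqc version exposes the zip output described above and that params.reads points to paired-end files:

```groovy
include { FASTQC } from "${projectDir}/modules/nf-core/fastqc"

workflow {
    // Build the [ meta, files ] tuples expected by the module
    reads = Channel
        .fromFilePairs(params.reads, checkIfExists: true)
        .map { id, files -> [ [id: id, single_end: false], files ] }

    FASTQC(reads)

    // Named outputs are reachable through .out; print sample id and zip files
    FASTQC.out.zip.view { meta, zip -> "${meta.id}: ${zip}" }
}
```

Seeing the meta map next to the files makes it clear why the map step is needed before passing the data to MULTIQC.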
We are still missing the versions file to pass to multiqc. To build it we need the function softwareVersionsToYAML, which is provided by the utils_nfcore_pipeline subworkflow, so let's install it:
nf-core subworkflows install utils_nfcore_pipeline
Let's include it (note the double quotes, needed for ${projectDir} to be interpolated):
include { softwareVersionsToYAML } from "${projectDir}/subworkflows/nf-core/utils_nfcore_pipeline"
and then use it in the workflow:
workflow {
    fqc = FASTQC(reads)
    ch_versions = fqc.versions
    // keep only the zip files, dropping the meta map
    multiqc_data = fqc.zip.map { meta, zip -> zip }
    // STORE VERSIONS OF TOOLS
    softwareVersionsToYAML(ch_versions)
        .collectFile(
            storeDir: "${params.output}/pipeline_info",
            name: 'nf_core_' + 'pipeline_software_' + 'mqc_' + 'versions.yml',
            sort: true,
            newLine: true
        ).set { ch_collated_versions }
    // add the collated versions file to the MultiQC inputs
    multiqc_data = multiqc_data.mix(ch_collated_versions)
    MULTIQC(multiqc_data.collect(), [], [], [], [], [])
}