
feat: allow bwa-mem to output sam format #11457

Open
piplus2 wants to merge 8 commits into nf-core:master from piplus2:feat-bwa-mem-output-sam

Conversation

@piplus2
Contributor

@piplus2 piplus2 commented May 1, 2026

PR checklist

Closes #11456

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Broadcast software version numbers to topic: versions - See version_topics
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

This patch lets BWA/MEM produce SAM output when task.ext.args2 = '--output-fmt sam'. In that case samtools view can be skipped, because bwa mem already writes SAM.
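
For reference, a minimal config sketch of how the option described above might be set in a pipeline. The selector name `BWA_MEM` is an assumption about the process name; only the `ext.args2` value comes from this PR.

```groovy
// nextflow.config (fragment)
process {
    withName: 'BWA_MEM' {               // assumed process name
        // Value taken from the PR description; requests SAM instead of BAM/CRAM
        ext.args2 = '--output-fmt sam'
    }
}
```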

@piplus2 piplus2 requested a review from maxulysse as a code owner May 1, 2026 09:18
@piplus2
Contributor Author

piplus2 commented May 1, 2026

There's a possible issue with the subworkflow FASTQ_CREATE_UMI_CONSENSUS_FGBIO.
Even with the untouched code, the CI fails:

🚀 nf-test 0.9.5
https://www.nf-test.com
Please cite: https://doi.org/10.1093/gigascience/giaf130
(c) 2021 - 2026 Lukas Forer and Sebastian Schoenherr

Load .nf-test/plugins/nft-anndata/0.4.1/nft-anndata-0.4.1.jar
Load .nf-test/plugins/nft-bam/0.6.1/nft-bam-0.6.1.jar
Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar
Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar
Load .nf-test/plugins/nft-fastq/0.1.0/nft-fastq-0.1.0.jar
Load .nf-test/plugins/nft-utils/0.0.9/nft-utils-0.0.9.jar
Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar

Test Subworkflow FASTQ_CREATE_UMI_CONSENSUS_FGBIO

  Test [517809d8] 'single_umi' FAILED (7.818s)

  Assertion failed:

  assert workflow.success
         |        |
         workflow false

  Nextflow stdout:

  ERROR ~ Error executing process > 'FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX'

  Caused by:
    Input tuple does not match tuple declaration in process `FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX` -- offending value: [[id:genome], /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta.fai, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.dict]



  Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

   -- Check '/mnt/c/Users/pingl/Documents/GitHub/nf-core-modules/.nf-test/tests/517809d8b122e70d816e16bb19a4ce62/meta/nextflow.log' file for details
  Nextflow stderr:



  Test [4c722dd0] 'duplex_umi' FAILED (7.009s)

  Assertion failed:

  assert workflow.success
         |        |
         workflow false

  Nextflow stdout:

  ERROR ~ Error executing process > 'FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX'

  Caused by:
    Input tuple does not match tuple declaration in process `FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX` -- offending value: [[id:genome], /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta.fai, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.dict]



  Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

   -- Check '/mnt/c/Users/pingl/Documents/GitHub/nf-core-modules/.nf-test/tests/4c722dd06f1e8251582b24ecc780003e/meta/nextflow.log' file for details
  Nextflow stderr:



  Test [f9597a38] 'single_umi - stub' Assertion failed:

assert workflow.success
       |        |
       workflow false

java.lang.RuntimeException: Different Snapshot:
[                                                                                                       [
    {                                                                                                       {
        "consensusbam": [                                                                                       "consensusbam": [
            [                                                                                      |
                {                                                                                  <
                    "id": "test_single",                                                           <
                    "single_end": false                                                            <
                },                                                                                 <
                "test_single_consensus_unmapped.bam:md5,d41d8cd98f00b204e9800998ecf8427e"          <
            ]                                                                                      <
        ],                                                                                                      ],
        "groupbam": [                                                                                           "groupbam": [
            [                                                                                      |
                {                                                                                  <
                    "id": "test_single",                                                           <
                    "single_end": false                                                            <
                },                                                                                 <
                "test_single_umi-grouped.bam:md5,d41d8cd98f00b204e9800998ecf8427e"                 <
            ]                                                                                      <
        ],                                                                                                      ],
        "mappedconsensusbam": [                                                                                 "mappedconsensusbam": [
            [                                                                                      |
                {                                                                                  <
                    "id": "test_single",                                                           <
                    "single_end": false                                                            <
                },                                                                                 <
                "test_single.bam:md5,d41d8cd98f00b204e9800998ecf8427e"                             <
            ]                                                                                      <
        ],                                                                                                      ],
        "ubam": [                                                                                               "ubam": [
            [                                                                                      |
                {                                                                                  <
                    "id": "test_single",                                                           <
                    "single_end": false                                                            <
                },                                                                                 <
                "test_single_unaligned.bam:md5,d41d8cd98f00b204e9800998ecf8427e"                   <
            ]                                                                                      <
        ]                                                                                                       ]
    }                                                                                                       }
]                                                                                                       ]

FAILED (6.853s)

  Assertion failed:

  2 of 2 assertions failed

  Nextflow stdout:

  ERROR ~ Error executing process > 'FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX'

  Caused by:
    Input tuple does not match tuple declaration in process `FASTQ_CREATE_UMI_CONSENSUS_FGBIO:BWAMEM1_INDEX` -- offending value: [[id:genome], /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta.fai, /nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.dict]



  Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

   -- Check '/mnt/c/Users/pingl/Documents/GitHub/nf-core-modules/.nf-test/tests/f9597a38552d027b8d67056807e8428/meta/nextflow.log' file for details
  Nextflow stderr:



  Snapshots:
    Obsolete snapshots can only be checked if all tests of a file are executed successful.


FAILURE: Executed 3 tests in 22.66s (3 failed)

@piplus2
Contributor Author

piplus2 commented May 4, 2026

I think the CI fails because of this bug: #11519

@famosab
Contributor

famosab commented May 6, 2026

We usually want to have compressed output so having sam is not the normal output we would expect. Why do you want this to directly output sam?

@piplus2
Contributor Author

piplus2 commented May 6, 2026

The current code is inconsistent: when task.ext.args2 = '--output-fmt sam', it never returns the sam file. This commit makes the code work as expected. If we want to exclude sam, then it should be forbidden in args2, or there should at least be some feedback about this behaviour.
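
To illustrate the inconsistency, here is a rough Groovy sketch, not the module's actual code. It assumes the output extension is derived from `args2` and that the outputs declared before this patch covered only bam, cram, csi and crai; `outputExtension` and `declaredBefore` are hypothetical names.

```groovy
// Hypothetical helper: map the samtools arguments in ext.args2
// to the extension of the file that would be written.
String outputExtension(String args2) {
    if (args2.contains('--output-fmt sam'))  return 'sam'
    if (args2.contains('--output-fmt cram')) return 'cram'
    return 'bam'
}

// Assumed set of extensions matched by the declared outputs before the patch.
def declaredBefore = ['bam', 'cram', 'csi', 'crai']

def ext = outputExtension('--output-fmt sam')
// If the extension is not among the declared outputs, the file is
// written in the work dir but never emitted by the process.
println "writes .${ext}; emitted before patch: ${declaredBefore.contains(ext)}"
```

Under these assumptions, the sam file would be created but silently dropped, which is the behaviour this PR changes.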

@famosab
Contributor

famosab commented May 6, 2026

That is a fair point! I think then we need to solve the other bug and add a test for this new output file and we should be good to go.

@piplus2
Contributor Author

piplus2 commented May 6, 2026

I'm working on a fix for the subworkflow that makes the CI fail.

@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch from 7404fd2 to ccf7952 Compare May 7, 2026 08:13
@piplus2
Contributor Author

piplus2 commented May 7, 2026

The CI fails because of #11548; waiting for PR #11549 to be merged.

@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch from ccf7952 to e8e3e1b Compare May 7, 2026 10:14
@piplus2
Contributor Author

piplus2 commented May 7, 2026

I'm updating the snapshot to match the modified test, which now also expects the sam output.

Contributor

Can you add a test where you actually expect the sam file in the output please? :)

Contributor Author

I've just added process.out.sam. Do you think it would be worth writing a specific test for the sam output only? I did not include one, to match the cram, csi and crai outputs.

Contributor

In theory we should test each output properly, so yes, please add it :) The way it's done for now, we can already see that sam is not created when it should not be created (which is good).

Contributor Author

Yes, I'll do it :)

Contributor Author

I've added 2 tests (single-end and paired-end reads) to check that we actually get a sam output.

@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch 2 times, most recently from 2e4c51d to c8e4f0a Compare May 7, 2026 11:38
Contributor

We want the ext.args to be present in the main.nf.test file. That makes everything more readable in one go. See the module specifications for more info.

Other than that good job 🥳
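
A sketch of what that could look like, assuming the pattern from the module specifications in which the test's `nextflow.config` reads the value from `params`. The parameter name `module_args2` is hypothetical, and `BWA_MEM` is an assumed process name.

```groovy
// tests/nextflow.config (fragment)
process {
    withName: 'BWA_MEM' {                // assumed process name
        ext.args2 = params.module_args2  // hypothetical parameter name
    }
}

// tests/main.nf.test (fragment, inside a test block)
when {
    params {
        // The argument is set here, so it is visible next to the assertions
        module_args2 = '--output-fmt sam'
    }
}
```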

Contributor Author

Oh sorry, my bad! Fixing it now!

Contributor Author

It all looks fine now.

@piplus2 piplus2 requested a review from famosab May 7, 2026 14:54
@piplus2 piplus2 force-pushed the feat-bwa-mem-output-sam branch from 0a900f8 to 17a2c56 Compare May 7, 2026 16:49
@piplus2 piplus2 enabled auto-merge May 7, 2026 16:54
@piplus2 piplus2 self-assigned this May 7, 2026
Development

Successfully merging this pull request may close these issues.

update module: BWA/MEM

2 participants