Add ARJ, ARC, and ACE archive decompression support #194

Copilot · 2026-02-06T18:11:43Z

💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

… clone requirement Co-authored-by: gfs <[email protected]>

Co-authored-by: gfs <[email protected]>

gfs · 2026-02-06T18:41:29Z

/azp run

azure-pipelines · 2026-02-06T18:41:44Z

Azure Pipelines successfully started running 1 pipeline(s).

Copilot

Pull request overview

Adds native extraction support for ARJ, ARC, and ACE archives (via SharpCompress) to RecursiveExtractor, including MiniMagic type detection and test coverage to validate detection + extraction output counts.

Changes:

Upgraded SharpCompress to enable ARJ/ARC/ACE reader support.
Added MiniMagic signatures and new ArchiveFileType enum values for ARJ/ARC/ACE, plus registered new extractors.
Added small fixture archives and expanded unit tests to cover detection and extracted-file counts.

Reviewed changes

Copilot reviewed 10 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
RecursiveExtractor/RecursiveExtractor.csproj	Bumps SharpCompress dependency to a version that supports ARJ/ARC/ACE.
RecursiveExtractor/MiniMagic.cs	Adds ARJ/ARC/ACE types and signature detection; increases header read size.
RecursiveExtractor/Extractors/ArjExtractor.cs	New ARJ extractor implementation (sync + async).
RecursiveExtractor/Extractors/ArcExtractor.cs	New ARC extractor implementation (sync + async), with post-decompression size check.
RecursiveExtractor/Extractors/AceExtractor.cs	New ACE extractor implementation (sync + async).
RecursiveExtractor/Extractor.cs	Registers ARJ/ARC/ACE extractors in `SetDefaultExtractors()`.
RecursiveExtractor.Tests/TestData/TestDataArchives/TestData.arj	Adds ARJ test fixture archive.
RecursiveExtractor.Tests/TestData/TestDataArchives/TestData.arc	Adds ARC test fixture archive.
RecursiveExtractor.Tests/TestData/TestDataArchives/TestData.ace	Adds ACE test fixture archive.
RecursiveExtractor.Tests/RecursiveExtractor.Tests.csproj	Ensures new fixture archives are copied to test output.
RecursiveExtractor.Tests/ExtractorTests/MiniMagicTests.cs	Adds MiniMagic detection tests for ARJ/ARC/ACE fixtures.
RecursiveExtractor.Tests/ExtractorTests/ExpectedNumFilesTests.cs	Adds extraction count expectations for ARJ/ARC/ACE fixtures.
.github/copilot-instructions.md	Documents deep-clone requirement for Nerdbank.GitVersioning builds.

RecursiveExtractor/MiniMagic.cs

gfs · 2026-02-06T18:56:33Z

@copilot it doesn't intuitive eem like it should be related - but with this change i'm seeing the tests for TestDataEncrypted.7z fail with 0 results found on both the lib and cli tests. Is there perhaps an issue with the changes to minimagic that are causing an incorrect header detection on that file is one theory i have.

…e each signature check on bytes actually read Co-authored-by: gfs <[email protected]>

SharpCompress 0.44.5 now throws CryptographicException at Open() time for encrypted 7z archives, leaving the stream at a non-zero position. Subsequent password retry calls to Open() then fail with EndOfStreamException. Reset fileEntry.Content.Position = 0 before each retry, matching the pattern already used by RarExtractor and ZipExtractor. Co-authored-by: gfs <[email protected]>

Copilot · 2026-02-06T19:04:02Z

@copilot it doesn't intuitive eem like it should be related - but with this change i'm seeing the tests for TestDataEncrypted.7z fail with 0 results found on both the lib and cli tes...

Good catch — it's not MiniMagic but a side effect of the SharpCompress 0.44.5 upgrade. In the old version, SevenZipArchive.Open() on encrypted archives succeeded silently and detection happened later via .IsEncrypted. In 0.44.5, Open() now throws CryptographicException immediately, which leaves the stream at a non-zero position (e.g. 1775 bytes in). The subsequent password-retry Open() calls then fail with EndOfStreamException because the stream isn't rewound.

Fixed in 9444f82 by adding fileEntry.Content.Position = 0 before each retry — matching the pattern already used by RarExtractor and ZipExtractor.

gfs · 2026-02-06T19:04:48Z

/azp run

azure-pipelines · 2026-02-06T19:05:01Z

Azure Pipelines successfully started running 1 pipeline(s).

gfs · 2026-02-06T21:22:08Z

@copilot something I wonder about these new test samples for these formats - the fallback case of extact self on fail also would result in 1 file being extracted - I think maybe our expected num files tests should check to make sure that the status isn't set to failed archive. What do you think?

Three test methods (ExtractArchiveParallel, ExtractArchiveFromStreamAsync, ExtractArchiveFromStream) only verified result count, which could mask extraction failures for single-file archives (ARJ, ARC, ACE) since ExtractSelfOnFail returns 1 result with FailedArchive status. Now all three also assert that no result has FailedArchive status. Co-authored-by: gfs <[email protected]>

Copilot · 2026-02-06T21:28:38Z

@copilot something I wonder about these new test samples for these formats - the fallback case of extact self on fail also would result in 1 file being extracted - I think maybe our ex...

Great point — with ExtractSelfOnFail=true and 1-file archives, a broken extractor still returns 1 result, so the count assertion alone can't distinguish success from failure. Fixed in ac4537f by adding FailedArchive status assertions to the three test methods that previously only checked count: ExtractArchiveParallel, ExtractArchiveFromStreamAsync, and ExtractArchiveFromStream. The other methods (ExtractArchive, ExtractArchiveAsync, ExtractArchiveNoRecursion, ExtractArchiveParallelNoRecursion, ExtractArchiveSmallBatchSize) already had status verification.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot

Pull request overview

Copilot reviewed 11 out of 14 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (2)

RecursiveExtractor/MiniMagic.cs:191

These 'if' statements can be combined.

            if (bytesRead >= 4)
            {
                if (buffer[0] == 0x50 && buffer[1] == 0x4B && buffer[2] == 0x03 && buffer[3] == 0x04)
                {
                    return ArchiveFileType.ZIP;
                }
            }

RecursiveExtractor/MiniMagic.cs:211

These 'if' statements can be combined.

            if (bytesRead >= 7)
            {
                if (buffer[0] == 0x52 && buffer[1] == 0x61 && buffer[2] == 0x72 && buffer[3] == 0x21 && buffer[4] == 0x1A && buffer[5] == 0x07 && buffer[6] == 0x00)
                {
                    return ArchiveFileType.RAR;
                }
            }

RecursiveExtractor/MiniMagic.cs

Prevents breaking changes when adding new members. All pre-existing members retain their original implicit values (UNKNOWN=0 through INVALID=18). New members ARJ/ARC/ACE get values 19-21. Co-authored-by: gfs <[email protected]>

gfs · 2026-02-10T20:27:49Z

/azp run

azure-pipelines · 2026-02-10T20:28:01Z

Azure Pipelines successfully started running 1 pipeline(s).

Copilot

Pull request overview

Copilot reviewed 11 out of 14 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

RecursiveExtractor/MiniMagic.cs:191

These 'if' statements can be combined.

            if (bytesRead >= 4)
            {
                if (buffer[0] == 0x50 && buffer[1] == 0x4B && buffer[2] == 0x03 && buffer[3] == 0x04)
                {
                    return ArchiveFileType.ZIP;
                }
            }

RecursiveExtractor/MiniMagic.cs:211

These 'if' statements can be combined.

            if (bytesRead >= 7)
            {
                if (buffer[0] == 0x52 && buffer[1] == 0x61 && buffer[2] == 0x72 && buffer[3] == 0x21 && buffer[4] == 0x1A && buffer[5] == 0x07 && buffer[6] == 0x00)
                {
                    return ArchiveFileType.RAR;
                }
            }

gfs

I think these implementations are all pretty straightforward and seem to follow the normal pattern of the wrappers around sharp compress library extractors.

I think the nits I have are around the new archive samples, which were also generated by copilot to build the test cases. We also have no nested sample with them - thus not testing the recurse functionality for these extractors, and we have no samples with more than one file, the samples only contain a single file.

@copilot can you add a sample archive of two text files in a TAR archive in an ARJ archive in an ARC archive in an ACE archive in a ZIP archive? Test it with the test number of files test, and maybe also add a test case validating that the actual content of the two files are actually properly extracted, this would give us a single test that exercises all three of the newly added extractors without having to modify the deeply nested sample we have for the extant formats. And begin down the path of building slightly more robust samples that aren't just looking for the number of files. Add the test code for both async and sync methods (using same sample file).

gfs · 2026-02-12T04:59:44Z

RecursiveExtractor/Extractors/SevenZipExtractor.cs

                    {
                        try
                        {
+                            fileEntry.Content.Position = 0;


Explanation: Though it would look unrelated to the other changes, In the migration to the newer sharpcompress this is required to reset the stream position to properly handle multiple password attempts correctly on sevenzip archives.

RecursiveExtractor/MiniMagic.cs

Create NestedFormatsTest.zip: ZIP → ACE → ARC → ARJ → TAR containing two text files. Exercises all three new extractors in a single deeply nested sample with multiple files. Add NestedFormatsContentTests with sync and async tests that verify both file count and actual extracted content (file1.txt + file2.txt). Add entries to ArchiveData (2 files) and NoRecursionData (1 file) datasets in ExpectedNumFilesTests. Move INVALID enum member next to UNKNOWN for readability (value=18 preserved). Co-authored-by: gfs <[email protected]>

Copilot · 2026-02-12T05:20:13Z

I think these implementations are all pretty straightforward and seem to follow the normal pattern of the wrappers around sharp compress library extractors.

I think the nits I have are around the ne...

Added in 753c05f. Created NestedFormatsTest.zip with this nesting structure: ZIP → ACE → ARC → ARJ → TAR → file1.txt + file2.txt. This exercises all three new extractors recursively in a single sample with multiple files.

New NestedFormatsContentTests class validates actual extracted content (both sync and async):

Asserts 2 files extracted
Asserts no FailedArchive status
Asserts file1.txt and file2.txt contents match expected strings exactly

Also added entries to ArchiveData (expects 2 with recursion) and NoRecursionData (expects 1 without recursion) in ExpectedNumFilesTests.

Copilot

Pull request overview

Copilot reviewed 12 out of 16 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

RecursiveExtractor/MiniMagic.cs:191

These 'if' statements can be combined.

            if (bytesRead >= 4)
            {
                if (buffer[0] == 0x50 && buffer[1] == 0x4B && buffer[2] == 0x03 && buffer[3] == 0x04)
                {
                    return ArchiveFileType.ZIP;
                }
            }

RecursiveExtractor/MiniMagic.cs:211

These 'if' statements can be combined.

            if (bytesRead >= 7)
            {
                if (buffer[0] == 0x52 && buffer[1] == 0x61 && buffer[2] == 0x72 && buffer[3] == 0x21 && buffer[4] == 0x1A && buffer[5] == 0x07 && buffer[6] == 0x00)
                {
                    return ArchiveFileType.RAR;
                }
            }

gfs · 2026-02-12T05:33:39Z

@copilot add the new binary test archives formats to .gitattributes as binary

Co-authored-by: gfs <[email protected]>

Copilot · 2026-02-12T05:35:10Z

@copilot add the new binary test archives formats to .gitattributes as binary

Added in 03c5e33. *.arj, *.arc, and *.ace are now marked as binary in .gitattributes to prevent EOL normalization corruption.

gfs · 2026-02-12T05:36:13Z

/azp run

azure-pipelines · 2026-02-12T05:36:25Z

Azure Pipelines successfully started running 1 pipeline(s).

gfs

This generally seems pretty straightforward to me on the implementation side. The new test case for checking test content is I think good to expand later with data rows, but this PR is already getting big so I think its better left for later.

INFO: The new test file fixtures were created by agent.

Initial plan

b3482f9

Copilot AI assigned Copilot and gfs Feb 6, 2026

Copilot started work on behalf of gfs February 6, 2026 18:12 View session

Copilot AI and others added 3 commits February 6, 2026 18:23

Add ARJ, ARC, ACE support and update agent instructions for NBGV deep…

5d90944

… clone requirement Co-authored-by: gfs <[email protected]>

Fix ACE test data file CRC and restore nuget.config

e03db48

Co-authored-by: gfs <[email protected]>

Improve ACE resource governor check to validate before decompression

4f47643

Co-authored-by: gfs <[email protected]>

Copilot AI changed the title ~~[WIP] Reimplement arj and arc changes with cleaner commit history~~ Add ARJ, ARC, and ACE archive decompression support Feb 6, 2026

Copilot finished work on behalf of gfs February 6, 2026 18:37

Copilot AI requested a review from gfs February 6, 2026 18:37

gfs requested a review from Copilot February 6, 2026 18:42

Copilot started reviewing on behalf of gfs February 6, 2026 18:42 View session

gfs mentioned this pull request Feb 6, 2026

Add ARJ and ARC decompression support #186

Closed

9 tasks

Copilot AI reviewed Feb 6, 2026

View reviewed changes

RecursiveExtractor/MiniMagic.cs Outdated Show resolved Hide resolved

Copilot started work on behalf of gfs February 6, 2026 18:53 View session

Fix DetectFileType regression: read min(length, buffer) bytes and gat…

fba41e2

…e each signature check on bytes actually read Co-authored-by: gfs <[email protected]>

Copilot finished work on behalf of gfs February 6, 2026 18:57

Copilot started work on behalf of gfs February 6, 2026 18:58 View session

Copilot finished work on behalf of gfs February 6, 2026 19:04

Copilot started work on behalf of gfs February 6, 2026 21:22 View session

Copilot AI reviewed Feb 10, 2026

View reviewed changes

gfs requested a review from Copilot February 10, 2026 19:26

Copilot started reviewing on behalf of gfs February 10, 2026 19:27 View session

Copilot AI reviewed Feb 10, 2026

View reviewed changes

RecursiveExtractor/MiniMagic.cs Outdated Show resolved Hide resolved

Copilot started work on behalf of gfs February 10, 2026 20:02 View session

Copilot finished work on behalf of gfs February 10, 2026 20:08

gfs requested a review from Copilot February 10, 2026 20:27

Copilot started reviewing on behalf of gfs February 10, 2026 20:27 View session

Copilot AI reviewed Feb 10, 2026

View reviewed changes

gfs requested changes Feb 12, 2026

View reviewed changes

Copilot started work on behalf of gfs February 12, 2026 05:10 View session

Copilot AI requested a review from gfs February 12, 2026 05:21

Copilot finished work on behalf of gfs February 12, 2026 05:21

gfs marked this pull request as ready for review February 12, 2026 05:22

gfs requested a review from Copilot February 12, 2026 05:23

Copilot started reviewing on behalf of gfs February 12, 2026 05:24 View session

Copilot AI reviewed Feb 12, 2026

View reviewed changes

Copilot started work on behalf of gfs February 12, 2026 05:34 View session

Add *.arj, *.arc, *.ace to .gitattributes as binary

03c5e33

Co-authored-by: gfs <[email protected]>

Copilot finished work on behalf of gfs February 12, 2026 05:35

gfs approved these changes Feb 12, 2026

View reviewed changes

Add ARJ, ARC, and ACE archive decompression support #194

Are you sure you want to change the base?

Add ARJ, ARC, and ACE archive decompression support #194

Conversation

Copilot AI commented Feb 6, 2026 • edited by gfs Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gfs commented Feb 6, 2026

Uh oh!

azure-pipelines bot commented Feb 6, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

gfs commented Feb 6, 2026

Uh oh!

Copilot AI commented Feb 6, 2026

Uh oh!

gfs commented Feb 6, 2026

Uh oh!

azure-pipelines bot commented Feb 6, 2026

Uh oh!

gfs commented Feb 6, 2026

Uh oh!

Copilot AI commented Feb 6, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

gfs commented Feb 10, 2026

Uh oh!

azure-pipelines bot commented Feb 10, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

gfs left a comment

Choose a reason for hiding this comment

Uh oh!

gfs Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI commented Feb 12, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

gfs commented Feb 12, 2026

Uh oh!

Copilot AI commented Feb 12, 2026

Uh oh!

gfs commented Feb 12, 2026

Uh oh!

azure-pipelines bot commented Feb 12, 2026

Uh oh!

gfs left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Feb 6, 2026 •

edited by gfs

Loading