
feat: add Azure Blob Storage support #5803

Closed

hztBUAA wants to merge 1 commit into FlowiseAI:main from hztBUAA:feat/azure-blob-storage

Conversation


@hztBUAA hztBUAA commented Feb 20, 2026

Summary

Closes #5411

  • Adds Azure Blob Storage as a fourth storage backend alongside local, S3, and GCS
  • Implements all 11 storage functions (upload, download, list, delete, stream) with Azure Blob branches
  • Adds Azure Blob Storage credential node supporting connection string or account name + key authentication
  • Adds Azure Blob File document loader node for loading documents from Azure Blob Storage
  • Adds custom multer storage engine for Azure Blob file uploads
  • Updates .env.example with Azure Blob Storage configuration variables
  • Adds @azure/storage-blob (^12.26.0) as dependency

New Environment Variables

Variable                              Description
STORAGE_TYPE=azure                    Set storage type to azure
AZURE_BLOB_STORAGE_CONNECTION_STRING  Azure connection string (option 1)
AZURE_BLOB_STORAGE_ACCOUNT_NAME       Storage account name (option 2)
AZURE_BLOB_STORAGE_ACCESS_KEY         Storage access key (option 2)
AZURE_BLOB_STORAGE_CONTAINER_NAME     Container name (required)
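For reference, a minimal .env fragment using the variables above might look like this (all values are placeholders; pick either option 1 or option 2):

```shell
# Select the Azure Blob Storage backend
STORAGE_TYPE=azure

# Option 1: full connection string (placeholder values)
AZURE_BLOB_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=<key>;EndpointSuffix=core.windows.net"

# Option 2: account name + access key (used when no connection string is set)
# AZURE_BLOB_STORAGE_ACCOUNT_NAME=myaccount
# AZURE_BLOB_STORAGE_ACCESS_KEY=<key>

# Required in both cases
AZURE_BLOB_STORAGE_CONTAINER_NAME=flowise
```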

Test plan

  • Unit tests for getStorageType() with all storage types
  • Unit tests for getAzureBlobConfig() (connection string, account/key, missing credentials)
  • Unit tests for credential class validation
  • Unit tests for document loader node class validation
  • Manual testing with Azure Blob Storage account

🤖 Generated with Claude Code

Add Azure Blob Storage as a fourth storage backend alongside local, S3,
and GCS. This enables users to store files (uploads, document loaders,
chat attachments) in Azure Blob Storage.

Changes:
- Add Azure branches to all 11 storage functions in storageUtils.ts
- Add Azure Blob Storage credential (connection string or account/key)
- Add Azure Blob File document loader node
- Add Azure multer storage engine for file uploads
- Add @azure/storage-blob dependency
- Update .env.example with Azure configuration variables
- Add tests for Azure Blob Storage configuration and credential
@gemini-code-assist

Summary of Changes

Hello @hztBUAA, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the platform's storage capabilities by integrating Azure Blob Storage as a new backend option. This allows users to leverage Azure's robust and scalable object storage service for managing files, providing greater flexibility and choice in deployment environments. The changes encompass new credential and document loader nodes, along with comprehensive updates to existing storage utility functions to support Azure-specific operations.

Highlights

  • Azure Blob Storage Integration: Added Azure Blob Storage as a new storage backend, expanding storage options alongside local, S3, and GCS.
  • Comprehensive Storage Functionality: Implemented all 11 core storage functions (upload, download, list, delete, stream) with dedicated Azure Blob branches.
  • Azure Blob Storage Credential Node: Introduced a new credential node to support Azure Blob Storage authentication via connection string or account name and access key.
  • Azure Blob File Document Loader: Added a document loader node specifically for loading documents from Azure Blob Storage, supporting both built-in and Unstructured processing methods.
  • Custom Multer Storage Engine: Developed a custom Multer storage engine to handle file uploads directly to Azure Blob Storage.
  • Environment Variable Updates: Updated the .env.example file with new configuration variables for Azure Blob Storage, including STORAGE_TYPE=azure, AZURE_BLOB_STORAGE_CONNECTION_STRING, AZURE_BLOB_STORAGE_ACCOUNT_NAME, AZURE_BLOB_STORAGE_ACCESS_KEY, and AZURE_BLOB_STORAGE_CONTAINER_NAME.
  • Dependency Addition: Included @azure/storage-blob as a new dependency to enable Azure Blob Storage interactions.
Changelog
  • packages/components/credentials/AzureBlobStorage.credential.ts
    • Added a new credential class AzureBlobStorageApi to manage Azure Blob Storage authentication details.
    • Configured inputs for connection string, storage account name, and access key, all marked as optional to allow flexible authentication methods.
  • packages/components/nodes/documentloaders/AzureBlobFile/AzureBlobFile.ts
    • Added a new document loader node AzureBlobFile_DocumentLoaders for retrieving and processing files from Azure Blob Storage.
    • Implemented logic to handle file downloads from Azure Blob and process them using either built-in loaders (PDF, DOCX, CSV, Excel, PowerPoint) or the Unstructured API.
    • Included options for text splitting, metadata handling, and various Unstructured API configurations (strategy, encoding, skip infer table types, hi-res model name, chunking strategy, max characters, source ID key).
  • packages/components/nodes/documentloaders/AzureBlobFile/azureBlobStorage.svg
    • Added an SVG icon for the new Azure Blob Storage document loader node.
  • packages/components/package.json
    • Added @azure/storage-blob version ^12.26.0 to the project dependencies.
  • packages/components/src/azureBlobStorage.test.ts
    • Added unit tests to verify the correct export and structure of the AzureBlobStorageApi credential class.
    • Added unit tests to verify the correct export and structure of the AzureBlobFile_DocumentLoaders node class, including its credential and input/output definitions.
  • packages/components/src/storageUtils.azure.test.ts
    • Added new unit tests for Azure Blob Storage configuration functions within storageUtils.ts.
    • Verified getStorageType correctly identifies 'azure' when set.
    • Tested getAzureBlobConfig for proper error handling when container name or credentials are missing.
    • Confirmed getAzureBlobConfig successfully creates a client using both connection string and account name/key authentication methods.
  • packages/components/src/storageUtils.ts
    • Imported BlobServiceClient, ContainerClient, and StorageSharedKeyCredential from @azure/storage-blob.
    • Extended addBase64FilesToStorage, addArrayFilesToStorage, and addSingleFileToStorage functions to support uploading files to Azure Blob Storage.
    • Modified getFileFromUpload, getFileFromStorage, getFilesListFromStorage, removeFilesFromStorage, removeSpecificFileFromUpload, removeSpecificFileFromStorage, removeFolderFromStorage, and streamStorageFile to include Azure Blob Storage logic for respective operations.
    • Implemented fallback mechanisms in getFileFromStorage and streamStorageFile to handle files moved from a non-orgId path to an orgId-prefixed path.
    • Added new utility functions: getAzureBlobConfig for retrieving Azure Blob Storage client configuration, getAzureBlobStorageSize for calculating storage usage, _cleanEmptyAzureBlobFolders for recursive folder cleanup, and _deleteAzureBlobFolder for deleting blobs by prefix.
  • packages/server/.env.example
    • Updated the STORAGE_TYPE comment to include azure as a valid option.
    • Added new environment variables for Azure Blob Storage configuration: AZURE_BLOB_STORAGE_CONNECTION_STRING, AZURE_BLOB_STORAGE_CONTAINER_NAME, AZURE_BLOB_STORAGE_ACCOUNT_NAME, and AZURE_BLOB_STORAGE_ACCESS_KEY.
  • packages/server/src/utils/index.ts
    • Imported getAzureBlobConfig to enable Azure Blob Storage configuration.
    • Extended getMulterStorage to provide a custom Multer storage engine for direct file uploads to Azure Blob Storage.
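The custom Multer storage engine described above can be sketched as follows. This is a hypothetical simplification, not the PR's code: the uploader is injected so the Azure client call (e.g. blockBlobClient.uploadStream) stays outside the sketch, and all names here are assumptions. Multer's custom-engine contract only requires _handleFile and _removeFile.

```typescript
// Hypothetical sketch of a multer-style storage engine that streams uploads
// rather than buffering whole files in memory. All names are assumptions.
import { Readable } from 'stream'

type Uploader = (blobName: string, stream: Readable, contentType: string) => Promise<void>

interface MulterishFile {
    originalname: string
    mimetype: string
    stream: Readable
}

class AzureBlobStorageEngine {
    constructor(private readonly upload: Uploader) {}

    // multer calls _handleFile once per uploaded file
    _handleFile(file: MulterishFile, cb: (err: Error | null, info?: { path: string }) => void): void {
        const blobName = `uploads/${Date.now()}-${file.originalname}`
        this.upload(blobName, file.stream, file.mimetype)
            .then(() => cb(null, { path: blobName }))
            .catch((err: Error) => cb(err))
    }

    // multer calls _removeFile to clean up after a failed request;
    // a real engine would delete the blob here
    _removeFile(_file: MulterishFile, cb: (err: Error | null) => void): void {
        cb(null)
    }
}
```

Streaming via an uploader that consumes file.stream directly avoids the buffering concern raised in the review below the fold.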
Activity
  • The author, hztBUAA, created this pull request to add Azure Blob Storage support.
  • The pull request description includes a detailed summary of changes, new environment variables, and a test plan, indicating the author's thorough preparation.
  • Unit tests for getStorageType(), getAzureBlobConfig(), credential class validation, and document loader node class validation have been completed and passed.
  • Manual testing with an Azure Blob Storage account is planned but not yet marked as complete.

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The addition of Azure Blob Storage support is a valuable enhancement to the storage capabilities of the platform. The implementation follows the established patterns for S3 and GCS backends. However, there are a few critical issues that must be addressed before merging: a syntax error in the server utilities that breaks local storage, incorrect usage of the Azure SDK's upload method signature, and inefficient memory usage during file uploads in the Multer storage engine. Addressing these will ensure the stability and performance of the new storage provider.

Comment on lines 1983 to 1984
} else {
    return multer({ dest: getUploadPath() })
}
}


critical

The else block for local storage is incomplete and missing its closing braces. This will cause a syntax error and prevent the server from starting or correctly falling back to local storage.

    } else {
        return multer({ dest: getUploadPath() })
    }
}

Comment on lines +1962 to +1975
const chunks: Buffer[] = []
file.stream.on('data', (chunk: Buffer) => chunks.push(chunk))
file.stream.on('end', async () => {
    try {
        const buffer = Buffer.concat(chunks)
        await blockBlobClient.upload(buffer, buffer.length, {
            blobHTTPHeaders: { blobContentType: file.mimetype }
        })
        cb(null, { path: blobName, size: buffer.length })
    } catch (err) {
        cb(err)
    }
})
file.stream.on('error', (err: Error) => cb(err))

high

Manually buffering the entire file into memory using file.stream.on('data', ...) is inefficient and can lead to Out-Of-Memory (OOM) errors for large file uploads. It is recommended to use blockBlobClient.uploadStream() to pipe the data directly to Azure Blob Storage. This also avoids the incorrect upload method signature used here.

                    blockBlobClient.uploadStream(file.stream, undefined, undefined, {
                        blobHTTPHeaders: { blobContentType: file.mimetype }
                    }).then(() => {
                        cb(null, { path: blobName })
                    }).catch(cb)

Comment on lines +111 to +113
await blockBlobClient.upload(bf, bf.length, {
    blobHTTPHeaders: { blobContentType: mime, blobContentEncoding: 'base64' }
})

high

The upload method in @azure/storage-blob v12 does not take the buffer length as a second argument. Passing bf.length here will cause it to be interpreted as the options object, which will lead to runtime errors or ignored headers. For Buffers, it is recommended to use uploadData which has the correct signature for this use case.

Suggested change
- await blockBlobClient.upload(bf, bf.length, {
-     blobHTTPHeaders: { blobContentType: mime, blobContentEncoding: 'base64' }
- })
+ await blockBlobClient.uploadData(bf, {
+     blobHTTPHeaders: { blobContentType: mime, blobContentEncoding: 'base64' }
+ })

Comment on lines +318 to +325
const downloadResponse = await blockBlobClient.download(0)
const chunks: Buffer[] = []
if (downloadResponse.readableStreamBody) {
    for await (const chunk of downloadResponse.readableStreamBody) {
        chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
    }
}
return Buffer.concat(chunks)

medium

Instead of manually downloading the stream and concatenating chunks into a Buffer, you can use the built-in downloadToBuffer() method. This is more concise and efficient.

Suggested change
- const downloadResponse = await blockBlobClient.download(0)
- const chunks: Buffer[] = []
- if (downloadResponse.readableStreamBody) {
-     for await (const chunk of downloadResponse.readableStreamBody) {
-         chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
-     }
- }
- return Buffer.concat(chunks)
+ const blockBlobClient = containerClient.getBlockBlobClient(filePath)
+ return await blockBlobClient.downloadToBuffer()

Comment on lines +387 to +394
const downloadResponse = await blockBlobClient.download(0)
const chunks: Buffer[] = []
if (downloadResponse.readableStreamBody) {
    for await (const chunk of downloadResponse.readableStreamBody) {
        chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
    }
}
const objectData = Buffer.concat(chunks)

medium

The manual stream-to-buffer conversion can be simplified by using the downloadToBuffer() method provided by the Azure SDK.

Suggested change
- const downloadResponse = await blockBlobClient.download(0)
- const chunks: Buffer[] = []
- if (downloadResponse.readableStreamBody) {
-     for await (const chunk of downloadResponse.readableStreamBody) {
-         chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
-     }
- }
- const objectData = Buffer.concat(chunks)
+ const objectData = await blockBlobClient.downloadToBuffer()


hztBUAA commented Feb 25, 2026

Thanks for the review and feedback. I am following up on this PR now and will either push the requested changes or reply point-by-point shortly.


hztBUAA commented Feb 25, 2026

Quick follow-up: I am reviewing the feedback and will update this PR shortly.

@HenryHengZJ

A similar PR is already in progress: #5604


Development

Successfully merging this pull request may close these issues.

Add Azure Blob Storage support to Flowise storage configuration

2 participants