ENG-3517: Upload and promotion for user uploaded files - schema variant, storage util, hooks, config#8113
Merged
Conversation
Contributor
|
The latest updates on your projects. Learn more about Vercel for GitHub.
1 Skipped Deployment
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8113 +/- ##
==========================================
+ Coverage 85.57% 85.59% +0.01%
==========================================
Files 657 658 +1
Lines 42727 42847 +120
Branches 5012 5018 +6
==========================================
+ Hits 36565 36675 +110
- Misses 5055 5064 +9
- Partials 1107 1108 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
9dd46c8 to
3294cad
Compare
9d8d72e to
96eecd0
Compare
e6f0a43 to
f28be72
Compare
f321175 to
41ef133
Compare
JadeCara
reviewed
May 13, 2026
JadeCara
reviewed
May 13, 2026
JadeCara
reviewed
May 13, 2026
Contributor
JadeCara
left a comment
There was a problem hiding this comment.
Left a few comments - thinking the bool change is the most important to verify.
Data layer for the data-subject file-upload feature: `attachment_user_provided`
table + repository primitives. Upload service + endpoint live in fidesplus
and consume this PR.
Lifecycle: uploaded → promoted | deleted (deleted rows kept for audit).
The (field_name, property_id, policy_key) triple binds an upload to the
privacy-center config it was issued under so re-validation at submission
can refuse mismatches. storage_key is intentionally non-FK so audit rows
survive storage-config churn.
* Model: AttachmentUserProvidedStatus enum + AttachmentUserProvided model
with composite (status, created_at) index for the orphan-sweep predicate.
field_name/property_id/policy_key are intentionally non-FK so audit rows
survive policy/property/field rename or delete between upload and
submission.
* Migration 9b449105864d, reversible.
* Repository:
- create_uploaded: insert + flush + refresh, returns AttachmentUserProvidedRecord
- lock_by_ids: SELECT … FOR UPDATE ordered by id (deadlock-safe under
overlapping batches), returns dict[str, row] keyed by id; missing ids
absent. session is required (no @with_optional_sync_session) since the
lock is only meaningful inside a caller-owned transaction.
- assert_all_uploaded: lifecycle precondition check
- mark_promoted: pure mutation, caller owns the session boundary
- mark_deleted: in-session only, no commit
- list_uploaded_older_than: orphan-sweep query, exclusive cutoff
* Domain entities (AttachmentUserProvidedRecord) + exceptions
(AttachmentsServiceError + InvalidAttachmentStateError).
* Test suite covers every public method incl. lifecycle invariants,
empty-list short-circuit, missing-id detection, cutoff exclusivity.
* .fides/db_dataset.yml: dataset entries for the new table.
530a1c9 to
2104001
Compare
4f0cb45 to
709742b
Compare
…fig, and tests - Add domain exceptions for the attachments service - Add schema variant (attachment.py), storage util, extension hooks in privacy_request_service - Add config toggles (allow_custom_privacy_request_file_upload, security settings) - Cover new code with unit tests (exceptions, security/execution settings, storage util)
- AttachmentNotFoundError takes an optional reason tag so callers can surface diagnostic detail (e.g. row status) without expanding the exception surface. - FilesMagicBytes.max_prefix_length co-locates the header-peek length with the signatures it derives from; callers no longer recompute it across the repo boundary.
Cover the two diff lines codecov/patch flagged at 100% target: - AttachmentNotFoundError(reason=...) appends the tag to the message. - FilesMagicBytes.max_prefix_length tracks the longest signature.
709742b to
e5b5744
Compare
JadeCara
approved these changes
May 14, 2026
Contributor
JadeCara
left a comment
There was a problem hiding this comment.
Thanks for making the updates! I ran a quick security review on it too and it all came back clean!
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Ticket ENG-3517
List of related changes
attachment_user_providedmodel + repository (lifecycle, locking, orphan-sweep query)FileUploadCustomPrivacyRequestFieldschema variant, storage util (magic bytes, allowed types), promotion hook, config flagsPrivacyRequestServicepromotion hook implementationPOST /privacy-request/attachmentHTTP endpoint (multipart upload, per-field constraints, rate limit)uploadedrows older than cutoffclamav-icapcontainer (Ubuntu + c-icap + libclamav, profile-gated in compose)X-Infection-Foundhandling)Description Of Changes
Schema + storage + service-extension half of the upload foundation. Pairs with the data-layer PR (#8110); the upload service and endpoint live in fidesplus and consume both.
Adds the Privacy Center config variant for file fields, a magic-byte sniff catalog so the upload route can verify a payload's claimed type at byte-zero (client-supplied
Content-Typeis not trusted), two config knobs for the global upload ceilings, and twoPrivacyRequestServicehook points so fidesplus can plug attachment-resolve and attachment-promote into the existing submission flow without overridingcreate_privacy_requestwholesale.Code Changes
schemas/privacy_center_config.py:FileUploadCustomPrivacyRequestFielddiscriminated-union variant (field_type="file",max_size_bytes > 0,allowed_file_typesnon-empty + must be a subset ofAllowedFileType).schemas/attachment.py:AttachmentUploadResponse(the{id: "att_..."}payload returned by the fidesplus upload route).service/storage/util.py:FilesMagicBytes— known signatures (pdf,jpg/jpeg,png,gif, ...) andfrom_bytes(...)for byte-zero detection.AllowedFileType.mime_types(),MIME_TO_EXTENSION,extension_for_mime(...)helpers.FileUploadConstraints— runtime-validated config bag (per-field cap + allowed types) reused by the upload route and the schema validator.service/privacy_request/privacy_request_service.py:_resolve_attachment_state(submitted_data, *, db)and_promote_attachment_state(privacy_request, state)— overridable no-op hooks on the OSS service. Default impls do nothing so OSS behavior is unchanged.create_privacy_requestcalls_resolve_attachment_statebefore persisting and_promote_attachment_stateafter caching/masking-secrets. On promotion failure the request row is deleted (clear_cached_values()runs as part ofdelete()) and the original error is rewrapped asPrivacyRequestError.common/urn_registry.py: route constant for the future fidesplus upload endpoint.config/security_settings.py:request_attachment_max_bytesglobal ceiling (caps the per-fieldmax_size_bytes).config/execution_settings.py:attachment_orphan_ttl_secondsfor the orphan-cleanup sweep.FileUploadConstraints.Steps to Confirm
nox -s "pytest(ops-unit-non-api)" -- tests/ops/schemas/test_attachment.py tests/ops/schemas/test_privacy_center_config.py tests/ops/service/storage/test_util.py— all green.FileUploadCustomPrivacyRequestFieldwithmax_size_bytes=0orallowed_file_types=[]— pydantic should reject. Construct one with an extension outsideAllowedFileType— pydantic should reject.FilesMagicBytes.from_bytes(open("sample.pdf","rb").read(8))returns thepdfmember. The same call against random bytes returnsNone.PrivacyRequestServiceand callcreate_privacy_request(...)with a non-file payload — flow runs unchanged (hooks are no-ops).FileUploadConstraints,FilesMagicBytes, and the new hook points — verify in the companion fidesplus PR.Pre-Merge Checklist
CHANGELOG.mdupdatedmaindowngrade()migration is correct and works