feat(Segment membership inspection PoC): Daily Snowflake-backed per-env segment counts #7464

Open
khvn26 wants to merge 10 commits into main from feat/segment-membership-counts

@khvn26 khvn26 commented May 8, 2026

Thanks for submitting a PR! Please check the boxes below:

  • I have read the Contributing Guide.
  • I have added information to docs/ if required so people know about the feature.
  • I have filled in the "Changes" section below.
  • I have filled in the "How did you test this code" section below.

Changes

Contributes to #5663.

Adds a daily pipeline that backfills Dynamo identities into Snowflake, materialises per-(segment, environment) match counts via flagsmith-sql-flag-engine, and exposes them on the segment endpoint as memberships: [{environment, count, last_synced_at}] for env-dropdown badges. Gated behind the org-scoped segment_membership_inspection FoF flag; no-ops when SNOWFLAKE_* env vars are unset.

Review complexity: 4/5

Review order recommendation: models.py (cache table) → services.py (compile + count, parameterised SQL) → tasks.py (daily recurring backfill fans out per-project refresh) → mappers.py (Dynamo doc → IDENTITIES row) → migrations/0002_* (Snowflake DDL RunPython, no-op when unconfigured) → segments/serializers.py + views.py (read-side memberships field, prefetched).

How did you test this code?

Beyond the existing unit + integration tests:

Ran an end-to-end smoke test against DynamoDB Local + a real Snowflake account configured via .env. The flow:

  1. Brought up DynamoDB Local in Docker.
  2. Ran make docker-up django-migrate — migrations created the SegmentMembership table in core Postgres and the IDENTITIES schema in Snowflake.
  3. Seeded fixtures in core Postgres: a project, an environment, and a segment with rule plan EQUAL "growth".
  4. Seeded 50 synthetic identities in DynamoDB Local — 25 with traits.plan = "growth", 25 with traits.plan = "basic".
  5. Invoked backfill_identities_to_snowflake() directly — confirmed all 50 rows landed in Snowflake's IDENTITIES table with QUERY_TAG correctly attributing the spend per (org, project).
  6. Invoked refresh_project_segment_counts(project_id) — confirmed exactly one SegmentMembership row materialised with count = 25 and a fresh last_synced_at.

More extensive testing will follow on staging.

Backfills identities from Dynamo to Snowflake daily, then refreshes
per-(segment, environment) match counts in the new `SegmentMembership`
cache. The translator from `flagsmith-sql-flag-engine` turns each
canonical segment into a SQL `WHERE` predicate; counts are
materialised as `COUNT(*) ... GROUP BY environment_id` per segment.
The serializer surfaces them as a list of `{environment, count,
last_synced_at}`, ready to back per-env count badges in the
Identities-tab environment dropdown.
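
To make the count shape concrete, here is a minimal sketch that reproduces the smoke-test numbers (25 matching, 25 non-matching) against an in-memory SQLite table. Everything in it is an assumption for illustration: `translate_equal_rule` is a hypothetical toy, not the `flagsmith-sql-flag-engine` API, and the two-column `identities` table stands in for the real IDENTITIES schema.

```python
import sqlite3


def translate_equal_rule(column: str, value: str) -> tuple[str, list[str]]:
    """Hypothetical toy translator: one EQUAL rule -> a WHERE predicate.

    The real engine's API is not shown in this PR; this only illustrates
    the idea of turning a segment rule into SQL.
    """
    if not column.isidentifier():
        raise ValueError(f"unsafe column name: {column!r}")
    return f"{column} = ?", [value]


def count_matches_per_environment(
    conn: sqlite3.Connection, predicate: str, params: list[str]
) -> dict[str, int]:
    """Run COUNT(*) ... GROUP BY environment_id for a single segment."""
    sql = (
        "SELECT environment_id, COUNT(*) FROM identities "
        f"WHERE {predicate} GROUP BY environment_id"
    )
    return dict(conn.execute(sql, params).fetchall())


def demo() -> dict[str, int]:
    conn = sqlite3.connect(":memory:")
    # Assumed stand-in schema, much smaller than the real IDENTITIES table.
    conn.execute("CREATE TABLE identities (environment_id TEXT, plan TEXT)")
    rows = [("env-1", "growth")] * 25 + [("env-1", "basic")] * 25
    conn.executemany("INSERT INTO identities VALUES (?, ?)", rows)
    predicate, params = translate_equal_rule("plan", "growth")
    return count_matches_per_environment(conn, predicate, params)
```

Calling `demo()` yields one entry per environment that has at least one match, which mirrors the single `SegmentMembership` row seen in the smoke test.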

Pipeline shape:

- `backfill_identities_to_snowflake` is the daily recurring task
  (`timeout=4h` to fit large environments). After backfilling each
  project's environments it dispatches one
  `refresh_project_segment_counts(project_id)` per project so the
  count refresh always sees the freshly backfilled snapshot rather
  than racing a separate schedule.
- `refresh_project_segment_counts` opens its own Snowpark session,
  re-checks the FoF flag at execution time so a stale fan-out skips
  orgs that have since been disabled, and bulk-upserts via Postgres
  `ON CONFLICT` (single statement per project).
- `compute_segment_counts_for_project` returns a list of unsaved
  `SegmentMembership` instances; the task stamps `last_synced_at`
  consistently across the batch. Untranslatable segments emit a
  structlog `compute.segment.skipped` error event so we hear about
  predicate gaps rather than silently dropping rows.
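
The stamp-then-upsert step can be sketched as follows. This is an illustration only: it uses SQLite's `ON CONFLICT` as a stand-in for Postgres, and the table layout, column names, and `upsert_counts` helper are assumptions rather than the PR's actual model or task code.

```python
import sqlite3
from datetime import datetime, timezone


def upsert_counts(
    conn: sqlite3.Connection,
    counts: list[tuple[int, int, int]],
    synced_at: datetime,
) -> None:
    """Upsert (segment_id, environment_id, count) rows in one statement.

    Every row in the batch gets the same last_synced_at stamp.
    """
    if not counts:
        return
    stamp = synced_at.isoformat()
    values_sql = ", ".join("(?, ?, ?, ?)" for _ in counts)
    params = [v for seg, env, n in counts for v in (seg, env, n, stamp)]
    conn.execute(
        "INSERT INTO segment_membership "
        "(segment_id, environment_id, count, last_synced_at) "
        f"VALUES {values_sql} "
        "ON CONFLICT (segment_id, environment_id) DO UPDATE SET "
        "count = excluded.count, last_synced_at = excluded.last_synced_at",
        params,
    )
    conn.commit()


def demo() -> list[tuple]:
    conn = sqlite3.connect(":memory:")
    # Assumed cache table; the unique pair is what ON CONFLICT targets.
    conn.execute(
        "CREATE TABLE segment_membership ("
        "segment_id INTEGER, environment_id INTEGER, "
        "count INTEGER, last_synced_at TEXT, "
        "UNIQUE (segment_id, environment_id))"
    )
    first = datetime(2026, 5, 8, tzinfo=timezone.utc)
    second = datetime(2026, 5, 9, tzinfo=timezone.utc)
    upsert_counts(conn, [(1, 10, 25)], first)
    # Second run updates the existing row and adds a new environment.
    upsert_counts(conn, [(1, 10, 30), (1, 11, 5)], second)
    return conn.execute(
        "SELECT segment_id, environment_id, count, last_synced_at "
        "FROM segment_membership ORDER BY environment_id"
    ).fetchall()
```

Passing the timestamp in, rather than calling `now()` per row, is what keeps `last_synced_at` identical across the batch.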

Both tasks short-circuit when SNOWFLAKE_* env vars are unset and
skip per-organisation when the `segment_membership_inspection`
Flagsmith-on-Flagsmith flag is False, so SaaS rolls out gradually
and self-hosted is unaffected.
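
A rough sketch of the two gates, under stated assumptions: the variable names in `REQUIRED_VARS` are hypothetical (the description only says `SNOWFLAKE_*`), and `flag_enabled` is a placeholder callable standing in for the Flagsmith-on-Flagsmith lookup.

```python
import os
from typing import Callable, Iterable, Mapping

# Hypothetical names: the PR does not list which SNOWFLAKE_* vars it reads.
REQUIRED_VARS = ("SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER")


def is_snowflake_configured(env: Mapping[str, str] = os.environ) -> bool:
    """True only when every assumed Snowflake variable is set and non-empty."""
    return all(env.get(name) for name in REQUIRED_VARS)


def organisations_to_process(
    organisation_ids: Iterable[int],
    flag_enabled: Callable[[int], bool],
    env: Mapping[str, str] = os.environ,
) -> list[int]:
    """Apply both gates: global configuration first, then the per-org flag."""
    if not is_snowflake_configured(env):
        # Self-hosted and test runs land here and do no work at all.
        return []
    return [org_id for org_id in organisation_ids if flag_enabled(org_id)]
```

For example, `organisations_to_process([1, 2, 3], lambda org_id: org_id != 2, env={})` returns an empty list because the configuration gate fails before any flag is consulted.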

DELETE-then-INSERT runs without an explicit transaction. Snowflake
holds micropartition locks for the lifetime of an open transaction,
and at 10M+ identities a BEGIN/COMMIT around the whole env partition
would keep that lock open for minutes. Per-statement implicit
commits leave a brief mid-refresh window where readers see an empty
partition; acceptable under the FoF flag's gradual rollout.
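
The mid-refresh window can be demonstrated with a small stand-in. This sketch uses SQLite in autocommit mode in place of Snowflake, so it says nothing about Snowflake's locking; it only shows what a second reader observes when DELETE and INSERT commit separately. The table layout and function names are assumptions.

```python
import os
import sqlite3
import tempfile


def refresh_without_transaction(
    db_path: str, env_id: str, new_identifiers: list[str]
) -> tuple[int, int]:
    """DELETE then INSERT with per-statement commits.

    Returns the row counts a separate reader sees mid-refresh and afterwards.
    """
    # isolation_level=None puts sqlite3 in autocommit mode: each statement
    # commits on its own, with no surrounding BEGIN/COMMIT.
    writer = sqlite3.connect(db_path, isolation_level=None)
    reader = sqlite3.connect(db_path)

    def visible_rows() -> int:
        return reader.execute(
            "SELECT COUNT(*) FROM identities WHERE environment_id = ?",
            (env_id,),
        ).fetchall()[0][0]

    try:
        writer.execute(
            "DELETE FROM identities WHERE environment_id = ?", (env_id,)
        )
        mid_refresh = visible_rows()  # reader sees the emptied partition
        writer.executemany(
            "INSERT INTO identities VALUES (?, ?)",
            [(env_id, identifier) for identifier in new_identifiers],
        )
        return mid_refresh, visible_rows()
    finally:
        writer.close()
        reader.close()


def demo() -> tuple[int, int]:
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "identities.db")
        setup = sqlite3.connect(db_path)
        setup.execute(
            "CREATE TABLE identities (environment_id TEXT, identifier TEXT)"
        )
        setup.executemany(
            "INSERT INTO identities VALUES (?, ?)",
            [("env-1", f"old-{i}") for i in range(3)],
        )
        setup.commit()
        setup.close()
        return refresh_without_transaction(db_path, "env-1", ["new-0", "new-1"])
```

The first element of the returned tuple is the count a reader sees between the two statements, which is the empty-partition window described above.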

Backfill writes via Snowpark DataFrames against the canonical
IDENTITIES schema, with `DynamoIdentity` documents projected through
`segment_membership.mappers.map_identity_document_to_snowflake_row`.
Refresh issues a single batched UNION ALL using parameterised SQL —
env keys are bound, predicates from the engine are already escape-
safe. Schema setup is a `RunPython` migration gated on
`is_snowflake_configured()`, so it no-ops on self-hosted and in the
test suite.
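
One way to picture the batched refresh query, as a sketch rather than the PR's `services.py`: env keys and segment ids go in as bound parameters, while each predicate string is inlined on the assumption that it comes from the engine already escaped. The function name, the `match_count` alias, and the SQLite demo table are all hypothetical.

```python
import sqlite3


def build_batched_count_query(
    predicates: dict[int, str], env_keys: list[str]
) -> tuple[str, list]:
    """Build one UNION ALL statement with a branch per segment.

    predicates maps segment id -> trusted, pre-escaped WHERE predicate.
    """
    if not predicates or not env_keys:
        raise ValueError("need at least one predicate and one env key")
    placeholders = ", ".join("?" for _ in env_keys)
    branches: list[str] = []
    params: list = []
    for segment_id, predicate in predicates.items():
        branches.append(
            "SELECT ? AS segment_id, environment_id, COUNT(*) AS match_count "
            "FROM identities "
            f"WHERE environment_id IN ({placeholders}) AND ({predicate}) "
            "GROUP BY environment_id"
        )
        # Parameter order must follow placeholder order within the branch.
        params.append(segment_id)
        params.extend(env_keys)
    return "\nUNION ALL\n".join(branches), params


def demo() -> list[tuple]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE identities (environment_id TEXT, plan TEXT)")
    conn.executemany(
        "INSERT INTO identities VALUES (?, ?)",
        [
            ("env-1", "growth"),
            ("env-1", "growth"),
            ("env-1", "basic"),
            ("env-2", "growth"),
            ("env-3", "growth"),  # env-3 is not requested below
        ],
    )
    sql, params = build_batched_count_query(
        {1: "plan = 'growth'", 2: "plan = 'basic'"}, ["env-1", "env-2"]
    )
    return sorted(conn.execute(sql, params).fetchall())
```

Note that a (segment, environment) pair with no matches produces no row at all, so the caller has to decide whether a missing pair means zero or not yet computed.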

The segment serializer surfaces cached counts via a new `memberships`
list field; absence of an entry is the read-side signal, no flag
check on the read path. `SegmentMembershipSerializer` gives
drf-spectacular a typed schema. Adds a generic `batched` helper to
`api/util/util.py` for the per-INSERT batching.
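
For readers who have not opened `api/util/util.py`, a generic batching helper of this kind might look like the sketch below. The signature is an assumption; the PR's actual helper may differ.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(iterable: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield consecutive lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    # islice pulls the next chunk lazily; an empty chunk ends the loop.
    while batch := list(islice(iterator, size)):
        yield batch
```

With this shape, `list(batched(range(5), 2))` gives `[[0, 1], [2, 3], [4]]`, so the final INSERT batch is simply shorter than the rest.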

beep boop
@khvn26 khvn26 requested review from a team as code owners May 8, 2026 23:05
@khvn26 khvn26 requested review from gagantrivedi and removed request for a team May 8, 2026 23:05
@vercel
vercel Bot commented May 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

3 Skipped Deployments

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ignored | Preview | May 10, 2026 2:22pm |
| flagsmith-frontend-preview | Ignored | Preview | May 10, 2026 2:22pm |
| flagsmith-frontend-staging | Ignored | Preview | May 10, 2026 2:22pm |

@github-actions github-actions Bot added api Issue related to the REST API docs Documentation updates feature New feature or request and removed docs Documentation updates labels May 8, 2026
Comment thread api/segment_membership/mappers.py Fixed
@github-actions
github-actions Bot commented May 8, 2026

Docker builds report

| Image | Build Status | Security report |
| --- | --- | --- |
| ghcr.io/flagsmith/flagsmith-e2e:pr-7464 | Finished ✅ | Skipped |
| ghcr.io/flagsmith/flagsmith-api-test:pr-7464 | Finished ✅ | Skipped |
| ghcr.io/flagsmith/flagsmith-frontend:pr-7464 | Finished ✅ | Results |
| ghcr.io/flagsmith/flagsmith-api:pr-7464 | Finished ✅ | Results |
| ghcr.io/flagsmith/flagsmith:pr-7464 | Finished ✅ | Results |
| ghcr.io/flagsmith/flagsmith-private-cloud:pr-7464 | Finished ✅ | Results |

…ps prefetch

The new `prefetch_related("memberships")` adds one IN-clause query per
list response, even when no rows exist. Update the regression
expectations so the existing test suite reflects the new baseline.

beep boop
@github-actions github-actions Bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels May 8, 2026
@codecov
codecov Bot commented May 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.39%. Comparing base (e4651d1) to head (89e88f9).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7464      +/-   ##
==========================================
- Coverage   98.44%   98.39%   -0.06%     
==========================================
  Files        1398     1411      +13     
  Lines       52654    53111     +457     
==========================================
+ Hits        51834    52256     +422     
- Misses        820      855      +35     

… pre-release

Switches the api dep from a private-repo git URL — which the Docker
build can't clone in CI — to a versioned pin against Flagsmith's
staging CodeArtifact PyPI (`flagsmith-pypi-staging`,
account 302456015006, eu-west-2). Initial published release: 0.1.0a1.

The reusable docker-build workflow now unconditionally assumes the
OIDC role
`arn:aws:iam::302456015006:role/codeartifact-github-actions-staging`
(trust policy allows any `repo:Flagsmith/*`), fetches an
authorisation token, and exposes it to every build as the
`codeartifact_token` BuildKit secret. Builds that don't mount the
secret simply ignore it; the OIDC + token cost is a couple of
seconds per build.

`Dockerfile`'s four `make install*` lines mount the
`codeartifact_token` secret and export
`POETRY_HTTP_BASIC_FLAGSMITH_PYPI_STAGING_*` so poetry resolves the
dep from CodeArtifact. The header documents the
`--secret="id=codeartifact_token,env=..."` incantation for local
builds.

beep boop
@khvn26 khvn26 requested a review from a team as a code owner May 9, 2026 00:05
@github-actions github-actions Bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels May 9, 2026
@github-actions
github-actions Bot commented May 9, 2026

Visual Regression

16 screenshots compared. See report for details.
View full report

@github-actions github-actions Bot added docs Documentation updates and removed feature New feature or request docs Documentation updates labels May 9, 2026
@github-actions github-actions Bot added feature New feature or request and removed feature New feature or request docs Documentation updates labels May 9, 2026
@khvn26 khvn26 force-pushed the feat/segment-membership-counts branch from 5ff82e9 to 4d954a2 Compare May 9, 2026 19:12
@github-actions github-actions Bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels May 9, 2026
@khvn26 khvn26 changed the title feat(segment_membership): Daily Snowflake-backed per-env segment counts feat(Segment membership inspection PoC): Daily Snowflake-backed per-env segment counts May 9, 2026
The unlabeled metrics added in this PR trip a bug in flagsmith-common's
`assert_metric` test plugin (`MetricWrapperBase.clear()` raises
AttributeError on unlabeled metrics). Fix is in flagsmith-common#224;
pin to the branch until that lands and a new release is cut.

beep boop
@github-actions github-actions Bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels May 9, 2026
Wires the segment-membership pipeline against DynamoDB Local + a real
Snowflake account: seeds a project, environment, and segment in core
Postgres; creates the EdgeIdentities table; seeds 25 matching + 25
non-matching identities; runs backfill + refresh tasks; asserts
SegmentMembership.count equals the matching seed.

Run with `make docker-up django-migrate` followed by
`make smoke-test-segment-membership`. SNOWFLAKE_* env vars come from
.env via Make's existing dotenv include; cleans the env's Snowflake
rows on exit.

beep boop
@github-actions github-actions Bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels May 10, 2026
@github-actions github-actions Bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels May 10, 2026
@github-actions
github-actions Bot commented May 10, 2026

Playwright Test Results (oss - depot-ubuntu-latest-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  38.9 seconds
commit  dd747cd
info  🔄 Run: #16630 (attempt 1)

Playwright Test Results (oss - depot-ubuntu-latest-arm-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  41.8 seconds
commit  dd747cd
info  🔄 Run: #16630 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  47.1 seconds
commit  dd747cd
info  🔄 Run: #16630 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-16)

failed  1 failed
passed  1 passed

Details

stats  2 tests across 2 suites
duration  41.9 seconds
commit  dd747cd
info  📦 Artifacts: View test results and HTML report
🔄 Run: #16630 (attempt 1)

Failed tests

firefox › tests/environment-permission-test.pw.ts › Environment Permission Tests › Environment-level permissions control access to features, identities, and segments @enterprise

Playwright Test Results (oss - depot-ubuntu-latest-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  39.1 seconds
commit  89e88f9
info  🔄 Run: #16631 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  32.1 seconds
commit  89e88f9
info  🔄 Run: #16631 (attempt 1)

Playwright Test Results (oss - depot-ubuntu-latest-arm-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  41.8 seconds
commit  89e88f9
info  🔄 Run: #16631 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)

passed  3 passed

Details

stats  3 tests across 3 suites
duration  41.9 seconds
commit  89e88f9
info  🔄 Run: #16631 (attempt 1)

Mappers: drop private-helper tests, replace with parametrised cases
exercising `map_identity_document_to_snowflake_row` directly; trust
TypedDict-required fields rather than handling absent ones.

Migration: assert the full DDL fed into `sess.sql(...)` and `spec`
the Snowpark mock against the real Session class.

beep boop
@github-actions github-actions Bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels May 10, 2026
Labels

api Issue related to the REST API feature New feature or request