Twin Campaigns (in situ)

A single install, bisected. How the same install_instance_id resolves to two different campaign_ids — one in Postgres, one in ClickHouse — because hazelnut and linkrunner-backend key on different Meta identifiers and each silently creates its own Campaign row.

Specimen · accession: HZ-2026-04-24-0607-TWIN
Date opened: 2026-04-24 06:07 UTC · 11:37 IST
Collection window: 2026-04-23 15:00 → 2026-04-24 03:00 UTC (12 h)
Clinical context: Post-Kafka-truncate (14:57 UTC). New-system on fresh topic offsets; mirror deployed Apr-22.
Referring clinician: Ayaan R. · observed campaign numbers not matching at dashboard level
Pathologist: tools@linkrunner.io · forensic review of live prod replicas
Specimen sources: Postgres 10.1.0.2:5433 · ClickHouse 10.1.0.190:8123 · OTel 10.1.0.33:8123 · API metrics 10.1.0.156:8123
Status: ACTIONABLE · systematic defect · no patient harm from pipeline lag
§01 IMPRESSION

Total traffic is intact. Campaign identity is not.

Across the 12-hour post-truncate window, linkrunner-backend and hazelnut agree on 98.9% of attributed installs in aggregate — the pipeline is healthy, Kafka is flowing, the consumer is not lagging. But at the campaign-id level the two services diverge violently: on Project 89 they share only 2 of 21 Meta campaigns, while still reporting nearly identical install totals. The same install is being attributed to two parallel Campaign rows that differ only in which Meta identifier they were keyed on.

15,878 / 16,053 · Attributed installs, CH vs PG. Aggregate parity: 98.9%. The plumbing works.
2 / 21 · Project 89, campaigns shared. 19 live in PG only, 19 in CH only: same traffic, different IDs.
91.5% · New-schema Meta rows with meta=false. 1,392 of 1,521 hazelnut-touched Meta rows are invisible to legacy's WHERE meta_campaign_id=? lookup.

One sentence: hazelnut's Meta-ID normalizer prefers the modern campaign_id field when adset_id is also present (§04), while linkrunner-backend unconditionally uses campaign_group_id. Those are different numeric IDs on a non-trivial share of payloads. The findExistingCampaign chain in hazelnut (file cites below) then misses every existing legacy row and falls through to INSERT, minting a parallel Campaign. Legacy keeps picking its row; hazelnut keeps picking its own. Neither side ever sees the twin.

Pathologist · the finding is structural, not flaky. Bifurcation reproduces deterministically on every sampled install_instance_id (5/5). Traces on 10.1.0.33 show attribution-consumer p99 at 228 ms; the retry consumer p99 at ~40 s is within design. This is not a lag story.

§02 SPECIMEN COMPARISON

Two slides. Same tissue. Different stain.

[Figure 01 labels. Left, LINKRUNNER · POSTGRES: id 25703326 · display UqozgBjrms · meta_campaign_id …960758 · old schema · meta=true · created 2026-01-15 · network_account_id=567. Right, HAZELNUT · CLICKHOUSE: id 26096423 · display mjoniv · ad_network_campaign_id …960752 · new schema · ad_network_id=3 · created 2026-03-31 · meta=false. Center, ONE PATIENT: install_instance_id 1fe82ab2…cebe. Name (both): FB_AAA_Event Based_AAA Account_Android_Top Ads_India_Acquisition_Purchase_15 Jan 2026]
Figure 01. Twin Campaign rows for the same project + name, both active=true, coexistent in the same Postgres table. Left: the legacy specimen, keyed on Meta campaign_group_id. Right: the hazelnut-created specimen, keyed on Meta campaign_id. The red filament is the install_instance_id each service sees, and each service attaches it to its own row.
SPECIMEN A · PG · id = 25703326 · active · legacy · picked by linkrunner-backend
SPECIMEN B · CH · id = 26096423 · active · new · picked by hazelnut

field                   SPECIMEN A · PG          SPECIMEN B · CH
display_id              UqozgBjrms               mjoniv
project_id              89                       89
created_at              2026-01-15 18:05:04 UTC  2026-03-31 20:44:11 UTC
ad_network_id           NULL                     3 · Meta
meta                    TRUE                     FALSE
meta_campaign_id        120240753929960758       NULL
ad_network_campaign_id  NULL                     120240753929960752
network_account_id      567                      NULL
installs (window)       1,194                    1,194

The Meta IDs differ by six in the trailing digits. …960758 is the Meta campaign_group (top level); …960752 is the Meta campaign / ad-set (middle level). Same campaign, two identifiers. The services chose different levels of the Meta hierarchy to key on (see §04).
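To make the split concrete, here is a minimal Go sketch of the two keying rules applied to one payload. The type and function names (metaPayload, hazelnutKey, legacyKey) are hypothetical stand-ins, not the real hazelnut or linkrunner-backend code, and the adset value is a placeholder; the two campaign IDs are the ones from Specimens A and B.

```go
package main

import "fmt"

// metaPayload is a hypothetical, trimmed-down view of a decrypted Meta payload.
type metaPayload struct {
	AdSetID         string
	CampaignID      string
	CampaignGroupID string
}

// hazelnutKey follows the rule described for hazelnut: prefer campaign_id
// when adset_id and campaign_id are both present, else campaign_group_id.
func hazelnutKey(p metaPayload) string {
	if p.AdSetID != "" && p.CampaignID != "" {
		return p.CampaignID
	}
	return p.CampaignGroupID
}

// legacyKey follows the rule described for linkrunner-backend:
// always campaign_group_id.
func legacyKey(p metaPayload) string {
	return p.CampaignGroupID
}

func main() {
	p := metaPayload{
		AdSetID:         "adset-placeholder", // assumption: any non-empty value
		CampaignID:      "120240753929960752",
		CampaignGroupID: "120240753929960758",
	}
	fmt.Println("hazelnut keys on:", hazelnutKey(p))
	fmt.Println("legacy keys on:  ", legacyKey(p))
	fmt.Println("same key:", hazelnutKey(p) == legacyKey(p))
}
```

On a payload that carries only campaign_group_id the two rules agree, which is why the divergence shows up only on payloads with the modern field set.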

§03 PATIENT HISTORY

Five installs, same night. Each one split, the same way.

Sampled from the post-truncate window. For each install_instance_id the query ran once against Postgres ("Install") and once against ClickHouse (hazelnut.installs_denormalized FINAL). Column campaign_id · PG is what the legacy service attached; campaign_id · CH is what hazelnut attached. Every specimen in the sample bifurcates 25703326 → 26096423. Deterministic, not racy.

install_instance_id                   installed_at · UTC   campaign_id · PG  campaign_id · CH  CH delta
1fe82ab2-4391-4795-8343-dc673014cebe  2026-04-23 23:56:41  25703326          26096423          +24 s
6d648909-b971-4c9b-b456-fe4cc3d16a40  2026-04-23 23:53:01  25703326          26096423          +5 m 21 s
5acb11fb-045b-40fb-b588-49b768e4b673  2026-04-23 23:48:38  25703326          26096423          +2 m 00 s
490deeae-6e12-46ac-803a-eeb45abb42a2  2026-04-23 23:43:11  25703326          26096423          +1 m 31 s
8a03a625-cfbc-478e-9065-e3dd036ae187  2026-04-23 23:41:08  25703326          26096423          +34 s
5 of 5 installs sampled · 100% bifurcation · CH write delay median ~90 s (healthy)
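As a quick check on the "~90 s" median, the five CH delta values can be run through a small median helper. This is only a worked-arithmetic sketch; medianSeconds is a hypothetical helper name, and the inputs are the table's CH delta column converted to seconds.

```go
package main

import (
	"fmt"
	"sort"
)

// medianSeconds returns the median of xs without reordering the caller's slice.
// For an even count it returns the integer mean of the two middle values.
func medianSeconds(xs []int) int {
	if len(xs) == 0 {
		return 0
	}
	s := append([]int(nil), xs...) // sort a copy, not the input
	sort.Ints(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}

func main() {
	// CH delta column from the table above, in table order, as seconds.
	deltas := []int{24, 5*60 + 21, 2 * 60, 1*60 + 31, 34}
	fmt.Printf("median CH delta: %d s\n", medianSeconds(deltas))
}
```

Sorted, the values are 24, 34, 91, 120 and 321 seconds, so the median is 91 s (1 m 31 s), with a range of 24 s to 5 m 21 s.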

A campaign-level dashboard does GROUP BY campaign_id. Postgres counts these installs under 25703326 and ClickHouse counts the same installs under 26096423, so each dashboard shows what its underlying store saw and the per-campaign numbers will never match, even though every single install is accounted for.
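The effect on a campaign-level dashboard can be sketched with an in-memory GROUP BY. The maps below stand in for the two stores and are an assumption for illustration; the install IDs and campaign IDs are the five sampled rows from the table above.

```go
package main

import "fmt"

// countByCampaign is the in-memory equivalent of
// SELECT campaign_id, count(*) ... GROUP BY campaign_id.
// The input maps install_instance_id to the campaign_id a store attached.
func countByCampaign(attributed map[string]int64) map[int64]int {
	counts := make(map[int64]int)
	for _, campaignID := range attributed {
		counts[campaignID]++
	}
	return counts
}

// total sums the per-campaign counts back into an install total.
func total(counts map[int64]int) int {
	sum := 0
	for _, n := range counts {
		sum += n
	}
	return sum
}

func main() {
	installs := []string{
		"1fe82ab2-4391-4795-8343-dc673014cebe",
		"6d648909-b971-4c9b-b456-fe4cc3d16a40",
		"5acb11fb-045b-40fb-b588-49b768e4b673",
		"490deeae-6e12-46ac-803a-eeb45abb42a2",
		"8a03a625-cfbc-478e-9065-e3dd036ae187",
	}
	pg := make(map[string]int64) // what the legacy service attached
	ch := make(map[string]int64) // what hazelnut attached
	for _, id := range installs {
		pg[id] = 25703326
		ch[id] = 26096423
	}
	pgCounts, chCounts := countByCampaign(pg), countByCampaign(ch)
	fmt.Println("PG by campaign:", pgCounts, "total:", total(pgCounts))
	fmt.Println("CH by campaign:", chCounts, "total:", total(chCounts))
}
```

The totals agree while no campaign_id appears in both result sets, which is the same shape as the Project 89 numbers.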

§04 ETIOLOGY

The mutation sits at one branch, in one function.

Hazelnut normalizes the decrypted Meta payload in one place before any Campaign lookup. The branch below decides which Meta identifier becomes campaignID. When Meta sends both adset_id and campaign_id — the modern field set — hazelnut picks d.CampaignID. Legacy linkrunner-backend, when it can read the payload at all, picks campaignGroupId. Those are different Meta entities.

hazelnut internal/consumer/attribution/meta_decrypt.go:49–75
// Meta legacy mapping (matching TS normalizeMetaIds):
//   adgroup_id        → ad_id (creative)
//   campaign_id       → adset_id
//   campaign_group_id → campaign_id
// When modern fields (adset_id + campaign_id) are both present, prefer those directly.
func NormalizeMetaIDs(d *MetaDecryptedData) NormalizedMetaIDs {
    ...
    campaignID := ""
    if d.AdSetID != "" && d.CampaignID != "" {
        // Modern: both present, campaign_id is the real campaign.
        campaignID = d.CampaignID          // ← picks …960752
    } else {
        campaignID = d.CampaignGroupID     //   would pick …960758
    }
    ...
}

That campaignID flows through the Meta strategy to the repository's find-or-create. The lookup chain is literal — four SELECTs, LIMIT 1, no ORDER BY, no cross-schema fallback. Each miss lands on the next. A total miss falls through to INSERT, which mints a fresh Campaign row.

hazelnut cmd/consumer_attribution.go:1155–1193
// findExistingCampaign tries each lookup strategy in priority order:
// google_campaign_id → meta_campaign_id → display_id → natural key.
if campaign.MetaCampaignID != "" {
    lookups = append(lookups, func() (*Campaign, error) {
        return r.findCampaignByMetaCampaignID(ctx, campaign.ProjectID, campaign.MetaCampaignID)
        // WHERE project_id=$1 AND meta_campaign_id=$2 AND NOT deleted  — misses, value is …752
    })
}
if campaign.AdNetworkID > 0 || campaign.AdNetworkCampaignID != "" {
    lookups = append(lookups, func() (*Campaign, error) {
        return r.findCampaignByNaturalKey(ctx, campaign.ProjectID, campaign.AdNetworkID, campaign.AdNetworkCampaignID)
        // WHERE project_id=$1 AND ad_network_id=$2 AND ad_network_campaign_id=$3  — first run = miss.
    })
}
// no ORDER BY, no meta=TRUE bias, no ad_network_id IS NOT NULL preference.
// two equal candidates? implementation-defined heap order.

The legacy service's mirror of this code is shorter and less forgiving. It will only find a row keyed on meta_campaign_id, and will never look at ad_network_campaign_id. Which means: once hazelnut has created its new-schema row, legacy is structurally incapable of noticing it.

linkrunner-backend src/attribution/strategies/meta-strategy.ts:47–53
let campaign = await prismaClient.campaign.findFirst({
    where: {
        meta_campaign_id: metaAdsData.campaignGroupId.toString(),  // …960758 — the legacy top-level ID
        project_id: projectId,
        deleted: false,
    },
    // no orderBy. Prisma returns the row Postgres hands back first —
    // with two rows matching, typically insertion order, so the older row wins.
});

And one more piece — the empirical one: of 1,521 active ad_network_id=3 rows in Postgres right now, 1,392 (91.5%) have meta=FALSE and meta_campaign_id=NULL. Whatever code path minted those specific rows never wrote the legacy columns. Those rows are, by construction, invisible to every legacy WHERE meta_campaign_id=? query in linkrunner-backend. That's what makes the bifurcation permanent rather than self-healing.
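The whole mechanism can be replayed against an in-memory table. This is a simplified sketch, not the real repository code: the campaign and table types, the two-step lookup (the google_campaign_id and display_id steps are omitted) and the use of "" and 0 as NULL stand-ins are all assumptions made for illustration.

```go
package main

import "fmt"

const metaNetworkID = 3 // ad_network_id for Meta, per the specimen table

// campaign is a hypothetical, trimmed-down Campaign row.
type campaign struct {
	ID                  int
	ProjectID           int
	MetaCampaignID      string // old-schema key; "" stands in for NULL
	AdNetworkID         int    // new-schema key; 0 stands in for NULL
	AdNetworkCampaignID string // new-schema key; "" stands in for NULL
}

// table stands in for the Postgres Campaign table.
type table struct {
	rows   []campaign
	nextID int
}

// byMetaCampaignID models WHERE project_id=$1 AND meta_campaign_id=$2.
func (t *table) byMetaCampaignID(projectID int, metaID string) (campaign, bool) {
	for _, r := range t.rows {
		if r.ProjectID == projectID && metaID != "" && r.MetaCampaignID == metaID {
			return r, true
		}
	}
	return campaign{}, false
}

// byNaturalKey models the ad_network_id + ad_network_campaign_id lookup.
func (t *table) byNaturalKey(projectID, networkID int, networkCampaignID string) (campaign, bool) {
	for _, r := range t.rows {
		if r.ProjectID == projectID && r.AdNetworkID == networkID &&
			networkCampaignID != "" && r.AdNetworkCampaignID == networkCampaignID {
			return r, true
		}
	}
	return campaign{}, false
}

// hazelnutFindOrCreate tries both lookups with hazelnut's key, then inserts.
func (t *table) hazelnutFindOrCreate(projectID int, key string) campaign {
	if c, ok := t.byMetaCampaignID(projectID, key); ok {
		return c
	}
	if c, ok := t.byNaturalKey(projectID, metaNetworkID, key); ok {
		return c
	}
	t.nextID++
	c := campaign{ID: t.nextID, ProjectID: projectID,
		AdNetworkID: metaNetworkID, AdNetworkCampaignID: key}
	t.rows = append(t.rows, c)
	return c
}

func main() {
	t := &table{
		rows: []campaign{{ID: 25703326, ProjectID: 89, MetaCampaignID: "120240753929960758"}},
		// Seeded so the minted row reuses Specimen B's id, for readability only.
		nextID: 26096422,
	}
	h := t.hazelnutFindOrCreate(89, "120240753929960752")
	l, _ := t.byMetaCampaignID(89, "120240753929960758")
	fmt.Println("hazelnut attaches to:", h.ID)
	fmt.Println("legacy attaches to:  ", l.ID)
	fmt.Println("rows in table:", len(t.rows))
}
```

After the first miss, every later hazelnut call hits its own row through the natural key, and the legacy lookup keeps hitting the older row, so the table settles at two rows for one campaign.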

§05 GROSS DESCRIPTION · SCOPE

Systemic, not incidental.

Postgres · active Campaign rows, keyed by schema

Old schema only · meta=TRUE                    3,581
New schema only · ad_network_id IS NOT NULL    2,084
Both set · conflicting state                   130
Neither (organic / default-link)               7,753,852

Meta (ad_network_id=3) · legacy-visibility

Active Meta rows · total                       1,521
meta=TRUE · visible to legacy lookup           129
meta=FALSE, meta_campaign_id=NULL              1,392
Legacy-invisible share                         91.5%

In the 12-hour window · PG-attributed campaigns

Old-schema rows with an active new-schema twin             84 / 461
Twin prevalence                                            18.2%
Project 89 · PG campaigns · CH campaigns · overlap         21 · 21 · 2
Project 89 · attributed installs · PG · CH                 7,207 · 7,198

Totals · window

PG attributed installs                         16,053
CH attributed installs                         15,878
CH organic (campaign_id=0)                     36,522
Aggregate install parity                       98.9%
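The percentages in these tables follow directly from the raw counts. The sketch below recomputes them; pct is a hypothetical helper, and the Project 89 parity figure is derived here from the 7,198 and 7,207 install counts rather than quoted from the tables.

```go
package main

import (
	"fmt"
	"math"
)

// pct returns part/whole as a percentage rounded to one decimal place.
// It returns 0 when whole is 0 to avoid dividing by zero.
func pct(part, whole int) float64 {
	if whole == 0 {
		return 0
	}
	return math.Round(float64(part)/float64(whole)*1000) / 10
}

func main() {
	fmt.Printf("legacy-invisible share:    %.1f%%\n", pct(1392, 1521))
	fmt.Printf("twin prevalence:           %.1f%%\n", pct(84, 461))
	fmt.Printf("aggregate install parity:  %.1f%%\n", pct(15878, 16053))
	fmt.Printf("Project 89 install parity: %.1f%%\n", pct(7198, 7207))
}
```

The first three lines reproduce the 91.5%, 18.2% and 98.9% figures above; the Project 89 ratio works out to 99.9%, against a campaign overlap of only 2 of 21.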

New-schema Meta rows created per day · 2026-03-15 → 2026-04-24

[Chart: daily counts from Mar 15 (first new-schema row) to Apr 24 (today). Labeled days: Mar 31 · 212, Apr 8 · 398, Apr 9 · 423.]

Hazelnut has been minting new-schema Meta rows continuously since mid-March, not just as a one-time migration blast. Two conspicuous burst days (Apr 8 · 398, Apr 9 · 423) likely coincide with deploys or catch-up traffic — worth an independent look, but not the cause of the divergence itself.

§06 DIFFERENTIAL · WHAT THIS IS NOT

Ruled out, so we don't chase ghosts.

Ruled out

Kafka lag on install-events-hazelnut

Post-truncate offsets are fresh. OTel on 10.1.0.33: attribution-consumer p99 = 228 ms over 135,082,045 spans in window; retry-consumer p99 = 40 s (by design). CH write delay on the five sampled installs is 24 s → 5 m 21 s (§03), well within norm.

Ruled out

Legacy ingestion gap

On the source side — the mirror-fed 10.1.0.156/api/client/init served 5.28M 200s and 1.89M 202s in the window. Capture-event, trigger, attribution-data all healthy. No platform-side throttling.

Ruled out

Click-matcher routing installs to the wrong campaign

Meta strategy has Priority = 100; click-match has a lower priority, so the strategy's resolved campaign beats any click's CampaignID. The click path isn't choosing these rows.

Ruled out

NetworkAccount preference drift (issue #208)

#208 orders credentialed + INTEGRATED accounts first — but only influences which network_account_id is stamped on a newly created row. It doesn't choose between existing Campaign rows. Confirmed by walking the code at cmd/consumer_attribution.go:1357–1391.

Related, not root

click_instance_id UUID mint-mismatch

A separate issue (hazelnut and legacy mint their own UUIDs for the same logical install-instance). Relevant to lr_ia_id drift for mirrored installs, but does not explain campaign-id bifurcation — install_instance_id agrees, campaign_id doesn't.

§07 RECOMMENDATION

Three treatments, ranked. One is cheap and stops the bleed.

01

Widen findExistingCampaign before INSERT

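Before hazelnut falls through to INSERT, run one more lookup against the other schema: if we have a Meta campaign ID we're about to write as ad_network_campaign_id, also SELECT WHERE meta_campaign_id = $2. For a row like Specimen A to match, that lookup has to be given the payload's campaign_group_id (…960758) as well, because the existing meta_campaign_id lookup with …960752 already misses (§04). Symmetric on the legacy side. Attaches the install to the existing row instead of minting a twin.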

Stops new twins at the source. Does not unify the ~2,084 already-created rows; Option 03 covers that backfill. Trade-off: widens the lookup hot path by one SELECT; measurable impact at find_or_create_campaign (current p99 11 ms) should be negligible.

Cost ~1 day · low risk
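A minimal in-memory sketch of what Option 01 could look like on the hazelnut side. It assumes the widened lookup is handed both Meta identifiers from the payload (campaign_id and campaign_group_id), since the old-schema row in Specimen A is keyed on the group ID; the types, the function name findOrCreateWidened and the lookup order are hypothetical.

```go
package main

import "fmt"

const metaNetworkID = 3 // ad_network_id for Meta, per the specimen table

// campaign is a hypothetical, trimmed-down Campaign row ("" and 0 mean NULL).
type campaign struct {
	ID                  int
	ProjectID           int
	MetaCampaignID      string
	AdNetworkID         int
	AdNetworkCampaignID string
}

// table stands in for the Postgres Campaign table.
type table struct {
	rows   []campaign
	nextID int
}

// findOrCreateWidened tries the new-schema natural key, then the old-schema
// meta_campaign_id column with both candidate IDs, and only then inserts.
// The bool result reports whether a new row was inserted.
func (t *table) findOrCreateWidened(projectID int, campaignID, campaignGroupID string) (campaign, bool) {
	// 1. New-schema natural key.
	for _, r := range t.rows {
		if r.ProjectID == projectID && r.AdNetworkID == metaNetworkID &&
			campaignID != "" && r.AdNetworkCampaignID == campaignID {
			return r, false
		}
	}
	// 2. Cross-schema lookup: the extra SELECT proposed in Option 01.
	for _, id := range []string{campaignID, campaignGroupID} {
		if id == "" {
			continue // never match NULL old-schema columns
		}
		for _, r := range t.rows {
			if r.ProjectID == projectID && r.MetaCampaignID == id {
				return r, false
			}
		}
	}
	// 3. Total miss: INSERT a new-schema row.
	t.nextID++
	c := campaign{ID: t.nextID, ProjectID: projectID,
		AdNetworkID: metaNetworkID, AdNetworkCampaignID: campaignID}
	t.rows = append(t.rows, c)
	return c, true
}

func main() {
	t := &table{
		rows:   []campaign{{ID: 25703326, ProjectID: 89, MetaCampaignID: "120240753929960758"}},
		nextID: 25703326,
	}
	c, created := t.findOrCreateWidened(89, "120240753929960752", "120240753929960758")
	fmt.Println("attached to:", c.ID, "created:", created, "rows:", len(t.rows))
}
```

With the extra step the install attaches to the existing legacy row and the table stays at one row; an unknown campaign still falls through to INSERT as before.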
02

Align NormalizeMetaIDs on campaign_group_id

Drop the modern-field preference at meta_decrypt.go:62. Always use d.CampaignGroupID if present; fall back to d.CampaignID only if the group is absent. Now both services key on the same Meta ID, the same legacy row is found, and no twin is ever created.

Cleanest semantic fix. Risk: modern-only payloads (where Meta omits campaign_group_id) regress to current legacy behavior — investigate frequency before shipping. Consult the Meta Marketing API versioning notes before flipping.

Cost ~2 days · medium risk
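The preference order described in Option 02 can be sketched as follows. The names metaPayload, alignedKey and legacyKey are hypothetical, and the sketch deliberately shows the modern-only case that the risk note asks to measure before shipping.

```go
package main

import "fmt"

// metaPayload is a hypothetical, trimmed-down view of a decrypted Meta payload.
type metaPayload struct {
	AdSetID         string
	CampaignID      string
	CampaignGroupID string
}

// alignedKey sketches Option 02: campaign_group_id whenever it is present,
// campaign_id only as a fallback when the group ID is absent.
func alignedKey(p metaPayload) string {
	if p.CampaignGroupID != "" {
		return p.CampaignGroupID
	}
	return p.CampaignID
}

// legacyKey follows the rule described for linkrunner-backend.
func legacyKey(p metaPayload) string {
	return p.CampaignGroupID
}

func main() {
	both := metaPayload{
		AdSetID:         "adset-placeholder", // assumption: any non-empty value
		CampaignID:      "120240753929960752",
		CampaignGroupID: "120240753929960758",
	}
	modernOnly := metaPayload{
		AdSetID:    "adset-placeholder",
		CampaignID: "120240753929960752",
	}
	fmt.Println("both fields, aligned == legacy:", alignedKey(both) == legacyKey(both))
	fmt.Println("modern-only key:", alignedKey(modernOnly))
}
```

When both fields are present the two services now agree on the group ID; a modern-only payload still keys on campaign_id, which is the case whose frequency needs checking against live traffic.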
03

Deprecate the old schema · backfill + route everything through ad_network_*

One-time migration that sets ad_network_id=3, ad_network_campaign_id=meta_campaign_id, meta_campaign_id=NULL on every old-schema Meta row; then patch linkrunner-backend's Meta strategy to query by ad_network_campaign_id. Unifies the data model; retires the dual-schema era entirely.

Definitive but expensive. Requires coordinated TS + Go release + a backfill that touches ~3,581 existing rows + re-keying any dashboard that filters on meta=TRUE. Not a hotfix; a planned migration.

Cost ~1–2 weeks · high risk
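The per-row effect of the Option 03 backfill can be sketched as a pure function. The campaign type, the backfill name and the choice to skip rows that already have ad_network_id set are assumptions; the meta flag is left untouched because the text does not say what the migration does with it.

```go
package main

import "fmt"

const metaNetworkID = 3 // ad_network_id for Meta, per the specimen table

// campaign is a hypothetical, trimmed-down Campaign row ("" and 0 mean NULL).
type campaign struct {
	ID                  int
	Meta                bool
	MetaCampaignID      string
	AdNetworkID         int
	AdNetworkCampaignID string
}

// backfill applies the Option 03 column moves to one row and reports whether
// the row changed. Rows without meta_campaign_id, or with ad_network_id
// already set, are returned unchanged (an assumption of this sketch).
func backfill(c campaign) (campaign, bool) {
	if c.MetaCampaignID == "" || c.AdNetworkID != 0 {
		return c, false
	}
	c.AdNetworkID = metaNetworkID
	c.AdNetworkCampaignID = c.MetaCampaignID
	c.MetaCampaignID = ""
	return c, true
}

func main() {
	specimenA := campaign{ID: 25703326, Meta: true, MetaCampaignID: "120240753929960758"}
	migrated, changed := backfill(specimenA)
	fmt.Printf("changed=%v row=%+v\n", changed, migrated)

	// Running it again is a no-op, so the migration can be retried safely.
	_, changedAgain := backfill(migrated)
	fmt.Println("changed on second run:", changedAgain)
}
```

On the specimen pair this leaves two rows with different ad_network_campaign_id values (…960758 and …960752), so reconciling existing twins would still be a separate step.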

Recommendation: ship Option 01 as a hotfix this week to stop new twins. File a follow-up for Option 02 once the campaign_group_id coverage question is answered against a week of live Meta payloads. Treat Option 03 as the planned retirement — worth scoping, not worth rushing.