Identity Graph — The Foundation for Accurate Marketing Attribution
When it comes to marketing attribution, most teams jump straight into debating which model to use — first-touch, multi-touch, you name it.
What they often skip, however, is the more boring but essential groundwork: building an Identity Graph first. That’s why many multi-touch attribution (MTA) projects fail and show no significant improvement over the last-click model.
In this article, we uncover the essential pillars required to build a solid data foundation that ensures your marketing attribution project succeeds — no matter which model you choose to use.
What Is an Identity Graph?
Consumers rarely follow a straight path to purchase. The same person might first discover your product while scrolling through Instagram on their phone, later search for your website from a desktop, and eventually convert through another browser after clicking an email link.
Without a robust Identity Graph, these actions are recorded as three separate users rather than one continuous journey. As a result, even the most advanced attribution model — whether first-touch or multi-touch — will produce misleading insights, since it’s built on fragmented data.
An Identity Graph solves this by connecting all user identifiers — cookies, device IDs, emails, and User IDs — into a single, unified profile. This unified view allows marketers to accurately trace how real people move across channels and devices. In short, it’s the foundation of reliable marketing attribution — without it, every model is built on incomplete data.

Key Pillars of a Robust Identity Graph
1. Capture Deterministic Identifiers
The most reliable way to identify users across sessions and platforms is through deterministic identifiers — such as a User ID or hashed email. These are explicit, verifiable signals that, unlike cookies, remain stable over time and can be matched confidently across your analytics, CRM, and ad platforms.
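Collect them wherever possible: during newsletter sign-ups, content downloads, trial registrations, or purchases. Once captured, normalize the identifier (for example, trim whitespace and lowercase email addresses) so the same person always produces the same value, hash it immediately (e.g., using SHA-256) to preserve privacy, and store the hash as your persistent key.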
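As a minimal sketch of that hashing step, the Python snippet below turns an email address into a stable key. The function name hash_email and the normalization rules (trim whitespace, lowercase) are assumptions made for illustration, not a prescribed implementation.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the SHA-256 hex digest of a normalized email address.

    Normalization (strip + lowercase) is an assumption of this sketch.
    It makes " Jane@Example.com " and "jane@example.com" map to one key.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Both spellings of the same address produce the same persistent key.
key_a = hash_email(" Jane@Example.com ")
key_b = hash_email("jane@example.com")
```

Whatever normalization you choose, apply it identically in every system that produces the hash, otherwise the keys will not match across analytics, CRM, and ad platforms.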
2. Enable Cross-Device Continuity via Email Links
When sending marketing emails, decorate every link with a unique, privacy-safe identifier (e.g., uid= or a signed token).
Example: https://yourdomain.com/pricing?uid=<base64url(sha256(email))>
When the user clicks this link from another device, capture that uid on page load and store it in a first-party cookie. All subsequent pageviews and website events should include this uid, allowing you to merge sessions even if the user doesn’t log in on the new device.
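The link format above could be produced and read back roughly as follows. This is a sketch under stated assumptions: the helper names email_uid, decorate_link, and extract_uid are hypothetical, and the email is normalized (trimmed, lowercased) before hashing. Setting the first-party cookie itself happens in the browser or in your server response and is not shown.

```python
import base64
import hashlib
from typing import Optional
from urllib.parse import parse_qs, parse_qsl, urlencode, urlparse, urlunparse


def email_uid(email: str) -> str:
    """Compute base64url(sha256(email)) without padding, as in the example URL."""
    digest = hashlib.sha256(email.strip().lower().encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def decorate_link(url: str, email: str) -> str:
    """Add (or replace) the uid query parameter on an email link."""
    parts = urlparse(url)
    # Keep existing parameters, but drop any stale uid first.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "uid"]
    query.append(("uid", email_uid(email)))
    return urlunparse(parts._replace(query=urlencode(query)))


def extract_uid(url: str) -> Optional[str]:
    """Read the uid parameter from an incoming page URL, if present."""
    values = parse_qs(urlparse(url).query).get("uid")
    return values[0] if values else None


link = decorate_link("https://yourdomain.com/pricing?plan=pro", "Jane@Example.com")
uid_on_landing = extract_uid(link)  # value to store in the first-party cookie
```

A signed token would be harder to guess or forge than a plain hash; the sketch sticks to the hash format shown in the example URL for simplicity.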
3. Propagate Click IDs and UTMs to Preserve Cross-Browser Tracking
When users click on ads within mobile apps like Instagram, LinkedIn, or YouTube, the built-in in-app browser opens instead of the primary browser (Safari or Chrome). If the user then clicks “Open in Safari”, the connection between the original ad click and the future conversion in Safari/Chrome is lost.
Another common case is when users click on an ad, browse your website, and then share a link with a friend or colleague. By default, tracking parameters exist only on the landing page URL and don’t persist across pages. So, if a user copies a link from a later page and sends it to someone, the resulting visit can no longer be attributed to the original campaign.
Click Propagation solves both problems by always preserving tracking parameters such as gclid, fbclid, and li_fat_id.
Learn more in our article: Click Propagation — And How It Impacts Marketing Attribution Accuracy
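To illustrate the idea of propagating tracking parameters, here is a small server-side sketch that copies them from the landing page URL onto an internal link. The function name propagate_params and the parameter list in TRACKING_PARAMS are assumptions for this example; in practice the same logic typically runs in the browser and covers whichever click IDs your ad platforms use.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Illustrative list only; extend it with the click IDs relevant to your channels.
TRACKING_PARAMS = ("gclid", "fbclid", "utm_source", "utm_medium", "utm_campaign")


def propagate_params(landing_url: str, internal_url: str) -> str:
    """Copy tracking parameters from the landing URL onto an internal link.

    Parameters already present on the internal link are left untouched.
    """
    landing_query = dict(parse_qsl(urlparse(landing_url).query))
    target = urlparse(internal_url)
    target_query = dict(parse_qsl(target.query))
    for name in TRACKING_PARAMS:
        if name in landing_query and name not in target_query:
            target_query[name] = landing_query[name]
    return urlunparse(target._replace(query=urlencode(target_query)))


decorated = propagate_params(
    "https://yourdomain.com/?gclid=abc&utm_source=google",
    "https://yourdomain.com/pricing?plan=pro",
)
```

With the parameters carried on every internal URL, a link copied from any page, or reopened in another browser, still carries the original click information.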
4. Connect Anonymous and Authenticated Sessions
Before authentication, a user’s activity (ad clicks, pageviews, form interactions) is recorded under an anonymous identifier such as a client_id. When they authenticate, a User ID or hashed email becomes available. If you don’t connect these identifiers, analytics and CRM systems will treat them as two separate users, losing the first touchpoint.
By stitching the User ID to historical anonymous data, you ensure that all prior website interactions are attributed to the same person — creating a complete, continuous customer journey. So, if the same user authenticates on a different device, their historical website activity will still be available.
Example:
A user clicks a LinkedIn ad, browses anonymously, and a couple of days later signs up for a demo. When they log in, your system should link their User ID to all earlier anonymous sessions so that the initial LinkedIn click is credited as the first touchpoint in the Identity Graph.
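The stitching in this example can be sketched with a union-find structure, where every identifier that has been observed together ends up in the same group. The class IdentityGraph, the helper first_touch, and the identifier and channel names below are all hypothetical; a production system would more likely do this in a data warehouse, but the logic is the same.

```python
class IdentityGraph:
    """Groups identifiers (client IDs, user IDs, hashed emails) per person."""

    def __init__(self) -> None:
        self._parent = {}

    def _find(self, identifier: str) -> str:
        """Return the representative identifier of the group."""
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node directly at the root.
        while self._parent[identifier] != root:
            next_id = self._parent[identifier]
            self._parent[identifier] = root
            identifier = next_id
        return root

    def link(self, a: str, b: str) -> None:
        """Record that two identifiers belong to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a: str, b: str) -> bool:
        return self._find(a) == self._find(b)


def first_touch(graph, events, identifier):
    """Return the channel of the earliest event in the stitched journey."""
    journey = sorted(e for e in events if graph.same_person(e[1], identifier))
    return journey[0][2] if journey else None


# Events as (timestamp, identifier, channel) tuples.
events = [
    (1, "client_id:abc", "linkedin_ad"),
    (2, "client_id:abc", "direct"),
    (3, "user_id:42", "demo_signup"),
]

graph = IdentityGraph()
# The login happens in the browser that holds client_id:abc, so link the two.
graph.link("client_id:abc", "user_id:42")
channel = first_touch(graph, events, "user_id:42")
```

Without the link call, the journey for user_id:42 would start at the demo sign-up; with it, the earlier LinkedIn click becomes the first touchpoint.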
5. Use Probabilistic Signals for Broader Coverage
Even with deterministic identifiers like User ID or hashed email, there will always be users who never sign in, switch devices, or browse in privacy-restricted environments. To bridge these gaps, use probabilistic signals, such as IP addresses, to infer connections between fragmented sessions.
This approach doesn’t guarantee a perfect match, but it significantly improves coverage by linking interactions that likely belong to the same user. Probabilistic matching complements deterministic identity stitching — while deterministic data ensures accuracy, probabilistic signals provide continuity when explicit identifiers are missing.
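A very simple form of probabilistic matching might look like the sketch below: two sessions are linked when they share an IP address within a time window, and a matching user agent raises the confidence. The Session fields, the weights (0.5 and 0.25), the 24-hour window, and the threshold are all illustrative assumptions, not tuned values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    session_id: str
    ip: str
    user_agent: str
    started_at: float  # Unix timestamp in seconds


def match_score(a: Session, b: Session, window_seconds: float = 24 * 3600) -> float:
    """Return a rough confidence that two sessions belong to the same user."""
    if a.ip != b.ip:
        return 0.0
    if abs(a.started_at - b.started_at) > window_seconds:
        return 0.0
    score = 0.5  # same IP within the time window
    if a.user_agent == b.user_agent:
        score += 0.25  # same browser and device signature
    return score


def probable_links(sessions, threshold: float = 0.5):
    """Return (session_id, session_id, score) for pairs at or above the threshold."""
    links = []
    for i, a in enumerate(sessions):
        for b in sessions[i + 1:]:
            score = match_score(a, b)
            if score >= threshold:
                links.append((a.session_id, b.session_id, score))
    return links


sessions = [
    Session("s1", "203.0.113.7", "in-app-browser", 0.0),
    Session("s2", "203.0.113.7", "safari", 3600.0),
    Session("s3", "198.51.100.4", "safari", 3600.0),
]
links = probable_links(sessions)
```

Here s1 and s2 are linked because they share an IP within an hour, while s3 stays separate. Shared networks such as offices or public Wi-Fi will produce false matches, so it is reasonable to keep probabilistic links flagged separately from deterministic ones.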
Health Check
If your Identity Graph is functioning correctly, retargeting or email campaigns should never appear as the first touchpoint in your customer journeys — by definition, these users have already interacted with your website before.
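This health check is easy to automate. The sketch below assumes each journey is an ordered list of channel names keyed by a profile ID; the function name suspicious_journeys and the channel labels "retargeting" and "email" are placeholders for whatever naming your data uses.

```python
# Channels that, by definition, target people who have visited before.
RETURNING_CHANNELS = {"retargeting", "email"}


def suspicious_journeys(journeys: dict) -> list:
    """Return profile IDs whose first touchpoint is a returning-user channel."""
    return [
        profile_id
        for profile_id, touchpoints in journeys.items()
        if touchpoints and touchpoints[0] in RETURNING_CHANNELS
    ]


journeys = {
    "profile-1": ["linkedin_ad", "email", "direct"],
    "profile-2": ["retargeting", "direct"],  # earlier visit is missing
    "profile-3": ["email"],                  # sign-up session is missing
}
flagged = suspicious_journeys(journeys)
```

A high share of flagged profiles suggests that earlier sessions are not being stitched to the same person.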
Learn More
At SegmentStream, the Identity Graph lies at the core of every attribution project — and we take it very seriously.
Our goal is to connect every possible user touchpoint across sessions, browsers, and devices into one unified journey. To achieve this, we go as far as implementing tailored solutions for complex businesses where customer journeys are unique.
If you’d like to discuss how your business can benefit from a robust Identity Graph, don’t hesitate to contact us.