Why I'm Obsessing Over Signal Capture

Analytics isn't the product. It's the foundation. A deep dive into the API design decisions behind Userloom's event ingestion schema, and why getting behavioral signals right matters more than anything else.

Seerat Awan
January 1, 2026
5 min read

Userloom has analytics. Good analytics. But I'm not building it so you can watch numbers go up. I'm building it so you know exactly when to reach out, and never miss the moment.

Behavioral email. In-app surveys. That's what saves users from churning. But those features are only as good as the signals feeding them. If we don't capture behavior perfectly, the emails go out at the wrong time. The surveys miss the moment. The whole system falls apart.

That's why I'm obsessing over the foundation. Analytics isn't the destination. It's the engine that powers everything else.

The Problem With Existing Tools

Here's what frustrated me about Mixpanel, Amplitude, and even Segment when building B2B products:

Three calls to do one thing. Want to identify a user and their company? That's an identify() call, a set_group() call, and maybe a get_group().set() call. Three network requests. Three chances to fail. Three opportunities for your data to end up in a weird partial state.

I've debugged too many "why is this user not linked to their company?" issues. It's always a race condition or a dropped request.

When your behavioral triggers depend on clean data, "mostly works" isn't good enough.
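To make the failure mode concrete, here's a sketch of the multi-call pattern those SDKs push you into. The method names are illustrative (modeled on Mixpanel-style group analytics, not an exact vendor API), and each call stands in for a separate network request:

```typescript
// Sketch of the legacy multi-call pattern. Each recorded call would be
// its own network request in a real SDK, and each can fail independently,
// leaving the user/company link in a partial state.
type Call = { method: string; args: unknown[] };

class LegacyAnalyticsClient {
  calls: Call[] = []; // records what would go over the wire

  identify(userId: string, traits: Record<string, unknown>) {
    this.calls.push({ method: "identify", args: [userId, traits] });
  }
  setGroup(groupType: string, groupId: string) {
    this.calls.push({ method: "set_group", args: [groupType, groupId] });
  }
  groupSet(groupType: string, groupId: string, props: Record<string, unknown>) {
    this.calls.push({ method: "group.set", args: [groupType, groupId, props] });
  }
}

// Linking one user to one company takes three requests:
const client = new LegacyAnalyticsClient();
client.identify("user_123", { email: "jane@acme.com" });
client.setGroup("company", "acme_inc");
client.groupSet("company", "acme_inc", { plan: "enterprise" });
// If request 2 or 3 is dropped, the user exists but isn't linked.
```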

One Call To Rule Them All

So Userloom's $identify does everything at once:

{
  "batch": [{
    "id": "evt_001",
    "event": "$identify",
    "distinct_id": "anonymous_session_abc",
    "timestamp": "2024-12-31T10:00:00Z",
    "properties": {
      "traits": {
        "$user_id": "user_123",
        "$email": "jane@acme.com",
        "$name": "Jane Smith",
        "$created_at": "2024-12-31T10:00:00Z",
        "$group": {
          "$group_id": "acme_inc",
          "$group_type": "company",
          "$name": "Acme Inc",
          "$created_at": "2024-01-01T00:00:00Z",
          "$plan": "enterprise"
        }
      }
    }
  }],
  "sent_at": "2024-12-31T10:00:01Z"
}

One request. User created. Company created (or updated). User linked to company. Anonymous session merged. Done.

If it fails, nothing happened. No zombie users. No orphaned companies. No "user exists but isn't linked" nightmares.

Clean data in. Reliable triggers out.
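Client-side, assembling that payload is one function call. Here's a sketch of a helper that builds the envelope shown above (`buildIdentifyBatch` is a hypothetical name, not the real Userloom SDK):

```typescript
// Hypothetical helper that assembles the single-request $identify payload.
// One object out, one POST to /batch: user, company, link, and merge
// all succeed or fail together.
function buildIdentifyBatch(opts: {
  anonymousId: string;
  userId: string;
  traits: Record<string, unknown>;
  group: Record<string, unknown>;
}) {
  const now = new Date().toISOString();
  return {
    batch: [
      {
        id: `evt_${Date.now()}`,
        event: "$identify",
        distinct_id: opts.anonymousId, // the pre-signup session
        timestamp: now,
        properties: {
          traits: {
            $user_id: opts.userId, // the known identity to merge into
            ...opts.traits,
            $group: opts.group,
          },
        },
      },
    ],
    sent_at: now,
  };
}

const payload = buildIdentifyBatch({
  anonymousId: "anonymous_session_abc",
  userId: "user_123",
  traits: { $email: "jane@acme.com", $name: "Jane Smith" },
  group: { $group_id: "acme_inc", $group_type: "company", $plan: "enterprise" },
});
```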

The $ Prefix Convention

I borrowed this from PostHog (they use $ for default properties), but took it further.

Fields with $ are system fields - reserved names with special meaning to Userloom. These power core features: $email for identity, $group for company relationships, $plan for segmentation.

Fields without $ are your fields - track whatever matters to your product. feature_used, export_format, team_size. All fully queryable, all available for triggers.

The convention keeps things clean: you'll never accidentally overwrite a system field, and Userloom will never clash with your custom properties.
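The rule is mechanical enough to sketch in a few lines. This helper (a made-up name, shown for illustration) routes `$`-prefixed keys to system fields and everything else to custom properties:

```typescript
// Sketch of the $ prefix convention: "$" keys are system fields with
// reserved meaning; everything else is user-defined and fully queryable.
function splitTraits(traits: Record<string, unknown>) {
  const system: Record<string, unknown> = {};
  const custom: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(traits)) {
    (key.startsWith("$") ? system : custom)[key] = value;
  }
  return { system, custom };
}

const { system, custom } = splitTraits({
  $email: "jane@acme.com", // system: powers identity
  $plan: "enterprise",     // system: powers segmentation
  feature_used: "export",  // custom: yours
  team_size: 12,           // custom: yours
});
```

One namespace check at ingest time, and the two worlds can never collide.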

Anonymous → Known: The Identity Gap

This is where most behavioral email fails.

User browses anonymously, views pricing three times, then signs up. Your email tool has no idea they were ever on the pricing page. You can't send them the discount offer because you don't know they need it.

The distinct_id + $user_id pattern solves this:

  1. Track anonymous user with distinct_id: "anon_xyz"
  2. User signs up
  3. Send $identify with both the anonymous ID and the new $user_id
  4. System merges the history

Now you can trigger: "User viewed pricing 3+ times → Send discount offer." That's the insight that actually drives conversions.
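The merge step itself is conceptually simple. This sketch models the server-side behavior described above (the real pipeline is Userloom's; this is just the idea): events tracked under the anonymous ID get reassigned to the identified user.

```typescript
// Sketch of identity merging: once $identify links anon_xyz to user_123,
// the anonymous history is reattributed to the known user.
type TrackedEvent = { distinct_id: string; event: string };

function mergeIdentity(
  events: TrackedEvent[],
  anonymousId: string,
  userId: string
): TrackedEvent[] {
  return events.map((e) =>
    e.distinct_id === anonymousId ? { ...e, distinct_id: userId } : e
  );
}

const history: TrackedEvent[] = [
  { distinct_id: "anon_xyz", event: "pricing_viewed" },
  { distinct_id: "anon_xyz", event: "pricing_viewed" },
  { distinct_id: "anon_xyz", event: "pricing_viewed" },
];
const merged = mergeIdentity(history, "anon_xyz", "user_123");
// All three pricing views now belong to user_123, so the
// "viewed pricing 3+ times" trigger can fire.
```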

Cookieless? Covered.

But what happens when there's no anonymous ID at all? Privacy-focused browsers, cookie blockers, incognito mode.

That's where fingerprinting comes in. When the SDK can't persist an anonymous ID, Userloom falls back to device fingerprinting: a combination of browser characteristics, screen resolution, timezone, and other signals that create a probabilistic identifier.

It's not foolproof, but it's good enough to stitch together most anonymous sessions. And when the user finally identifies themselves, all that fingerprint-linked history merges into their profile.

Important: No personal data is gathered or stored during the fingerprinting process. It's an anonymous device identifier, not PII.
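To make "probabilistic identifier, not PII" concrete, here's a minimal sketch: hash a handful of stable browser signals into an opaque ID. The signal set and hash function are illustrative, not Userloom's exact recipe.

```typescript
// Sketch of a probabilistic device fingerprint. Only environmental
// signals go in; the output is an opaque hash, never personal data.
function fingerprint(signals: {
  userAgent: string; // e.g. navigator.userAgent
  screen: string;    // e.g. "1920x1080"
  timezone: string;  // e.g. "Europe/Berlin"
  language: string;  // e.g. "en-US"
}): string {
  const input = [
    signals.userAgent,
    signals.screen,
    signals.timezone,
    signals.language,
  ].join("|");

  // FNV-1a: a tiny, deterministic, non-cryptographic hash.
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return "fp_" + hash.toString(16);
}

// Same device, same signals, same ID across sessions:
const id = fingerprint({
  userAgent: "Mozilla/5.0",
  screen: "1920x1080",
  timezone: "UTC",
  language: "en-US",
});
```

Because it's deterministic, the same device resolves to the same ID across sessions, which is what lets the history stitch together later.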

Batching By Default

There's no /track endpoint. No /identify endpoint. Everything goes through /batch.

Even if you're sending one event.

Why? Because:

  • SDKs can queue events and flush periodically
  • Offline-first becomes trivial
  • Retry logic is simpler (retry the batch, not individual calls)
  • Fewer connections = less overhead

It's a small API surface that handles everything. And when you're capturing signals from website, mobile, API, webhooks, and forms—simplicity matters.

Context Separation

Event data is split between properties (what happened) and context (where/how):

{
  "properties": {
    "feature_name": "export",
    "format": "csv",
    "row_count": 1500
  },

  "context": {
    "page": {
      "url": "https://app.example.com/reports",
      "path": "/reports",
      "title": "Reports Dashboard",
      "referrer": "https://app.example.com/home"
    },
    "screen": {
      "width": 1920,
      "height": 1080
    },
    "library": {
      "name": "userloom-js",
      "version": "1.0.0"
    },
    "locale": "en-US",
    "userAgent": "Mozilla/5.0..."
  }
}

Why this helps:

  • Clean separation of concerns
  • context can be auto-collected by SDKs (page, screen, library info)
  • properties stay focused on business-meaningful data
  • Easier to filter out noise when building triggers
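Here's a sketch of that split from the SDK's side: the caller supplies only `properties`, and the library fills in `context`. The browser globals are passed in as a parameter so the sketch is self-contained; a real SDK would read them from `window` and `navigator`.

```typescript
// Sketch of context auto-collection: properties come from the caller,
// context comes from the environment. Field names mirror the payload above.
function buildEvent(
  event: string,
  properties: Record<string, unknown>, // business-meaningful, caller-supplied
  env: {
    url: string; path: string; title: string; referrer: string;
    width: number; height: number; locale: string; userAgent: string;
  }
) {
  return {
    event,
    properties,
    context: { // environmental, auto-collected
      page: { url: env.url, path: env.path, title: env.title, referrer: env.referrer },
      screen: { width: env.width, height: env.height },
      library: { name: "userloom-js", version: "1.0.0" },
      locale: env.locale,
      userAgent: env.userAgent,
    },
  };
}
```

When a trigger only cares about "exported a CSV with 1,500 rows," everything in `context` is noise it can ignore without losing it.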

Flexible Group Types

I didn't hardcode "company." The schema uses $group_type:

"$group_type": "company"

Because B2B isn't always Company → Users. Sometimes it's:

  • Company → Workspace → Users
  • Organization → Project → Members
  • Franchise → Location → Staff

One schema handles all of it. One foundation for any structure.
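For example, a workspace-level identify just swaps the type; the envelope stays identical. (The workspace values here are made up for illustration, and how multi-level chains link together isn't covered by this schema excerpt.)

```json
"$group": {
  "$group_id": "ws_design_team",
  "$group_type": "workspace",
  "$name": "Design Team"
}
```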

Why This Matters

I could have shipped a basic analytics layer and moved on to the "exciting" features: email templates, survey builders, dashboards.

But behavioral email only works if you know the behavior. Triggered surveys only work if you catch the trigger. Every feature I build depends on signals being captured correctly, completely, and reliably.

So I'm doing this right. The unsexy work. The foundation.

Because when you send that perfectly-timed email that saves a user from churning? It all starts here.

What's Next

The schema is done (for now 😉). Now comes the fun part: building the ingest pipeline and seeing if this actually holds up at scale.

I'll share the infrastructure decisions next, including why I chose Cloudflare Workers over AWS Lambda, how events flow from SDK to ClickHouse, and how I'm getting 8-9x cost savings.


Under the Hood: Part 1 of 3

Follow along as I figure this out.

Next in series

Core Infra and the Ingest Pipeline

Seerat Awan


Founder & Builder in Chief

Building the product growth platform for teams who act on behavior. Capture signals, trigger emails, launch surveys, and turn more users into engaged customers.
