Frontend Observability and Web Vitals

Real user monitoring with web-vitals, Sentry Performance, and percentile dashboards. Tie frontend optimization to actual field data, not Lighthouse vibes.

Lighthouse said our creator profile pages scored 92. Customers said the pages were slow. Both were right.

This was at a live-video creator platform I led engineering at. We’d spent a quarter on edge caching with Cloudflare Workers, dynamic Open Graph metas, the works. Synthetic runs looked great. Then a creator DM’d our founder a screen recording of his own profile page taking forever to render on a mid-range Android in Berlin. Synthetic was clean. Field was a mess. The gap between those two stories is the entire reason this post exists.

OK so here’s my position, up front. If you’re optimizing frontend performance without a real-user-monitoring pipeline wired to web-vitals, Sentry Performance, and percentile dashboards split by route and device, you’re guessing. Lighthouse CI is a smoke check. It is not a workflow.

Synthetic Scores Lie to You

Lab tests run on a beefy GitHub Actions runner with a simulated 4G profile. Real users run on a five-year-old phone on a Turkish 4G tower at 4 p.m. on a weekday. Same page. Wildly different LCP. The “Core Web Vitals score” badge in your CI pipeline is telling you the median lab result, which is roughly the best case your code can produce. The Web Vitals contract Google actually grades you on is the p75 of your real traffic, measured from real Chrome users over a rolling 28-day window.

That’s a different number. Usually a much worse one.

The thing is, even when lab and field agree, lab can’t tell you which route is the problem. Lab can’t tell you a release shipped Tuesday made INP regress on Android only. Lab can’t tell you the bio section on creator profiles is the LCP element for some locales and the hero image for others. You need the field.

Wiring Up web-vitals in Real Apps

The official web-vitals library is roughly twenty lines of integration code. Use it. Report LCP, INP, CLS, FCP, TTFB. Send via sendBeacon so you don’t block paint or get nuked when the user navigates away mid-flight.

// src/lib/rum/web-vitals.ts
import { onLCP, onINP, onCLS, onFCP, onTTFB, type MetricWithAttribution } from 'web-vitals/attribution';

declare global {
  interface Window {
    // Set by the router on each navigation, e.g. '/creator/:slug'
    __ROUTE_PATTERN__?: string;
  }
}

type VitalsPayload = {
  metric: string;
  value: number;
  rating: 'good' | 'needs-improvement' | 'poor';
  navigationType: string;
  route: string;
  release: string;
  deviceType: 'mobile' | 'tablet' | 'desktop';
  connection: string | null;
  // The base Metric type has no attribution field; the attribution build's type does
  attribution: MetricWithAttribution['attribution'];
  ts: number;
};

const RUM_ENDPOINT = '/api/rum/vitals';

// Viewport-width buckets: a rough proxy for device class, not a user-agent check
function getDeviceType(): VitalsPayload['deviceType'] {
  const w = window.innerWidth;
  if (w < 768) return 'mobile';
  if (w < 1024) return 'tablet';
  return 'desktop';
}

// Network Information API; not available in every browser, hence the null
function getConnection(): string | null {
  const c = (navigator as Navigator & { connection?: { effectiveType?: string } }).connection;
  return c?.effectiveType ?? null;
}

function report(metric: MetricWithAttribution) {
  const body: VitalsPayload = {
    metric: metric.name,
    value: metric.value,
    rating: metric.rating,
    navigationType: metric.navigationType,
    route: window.__ROUTE_PATTERN__ ?? location.pathname,
    release: process.env.NEXT_PUBLIC_RELEASE_SHA ?? 'unknown',
    deviceType: getDeviceType(),
    connection: getConnection(),
    attribution: metric.attribution,
    ts: Date.now(),
  };

  const payload = JSON.stringify(body);
  // sendBeacon queues the request so it survives navigation and tab close.
  // It returns false when the browser could not queue it; fetch with keepalive is the fallback.
  if (!navigator.sendBeacon(RUM_ENDPOINT, new Blob([payload], { type: 'application/json' }))) {
    fetch(RUM_ENDPOINT, {
      method: 'POST',
      body: payload,
      // match the beacon's content type so the ingest endpoint parses both the same way
      headers: { 'Content-Type': 'application/json' },
      keepalive: true,
    }).catch(() => {});
  }
}

export function initWebVitals() {
  onLCP(report);
  onINP(report);
  onCLS(report);
  onFCP(report);
  onTTFB(report);
}

Two things people miss. First, send the route pattern, not the literal pathname. /creator/akin and /creator/jane are the same route from your code’s perspective and you want your dashboards to roll up that way. Stash it on window from your router. Second, send the release SHA. Without it you cannot do “did this deploy regress p75 INP” which is the whole point.
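If your router already exposes the matched pattern, use that directly. Where it only gives you the pathname and the matched params, a rough sketch of rebuilding the pattern could look like this. The helper name toRoutePattern and its signature are my own illustration, not part of any router's API.

```typescript
// Hypothetical helper: rebuild a route pattern from a concrete pathname and
// the params the router matched for it. Illustrative only, not a router API.
type RouteParams = Record<string, string>;

export function toRoutePattern(pathname: string, params: RouteParams): string {
  // Each param may replace at most one path segment.
  const unused = Object.keys(params);

  return pathname
    .split('/')
    .map((segment) => {
      for (let i = 0; i < unused.length; i++) {
        const name = unused[i];
        if (params[name] === segment) {
          unused.splice(i, 1);
          return `:${name}`;
        }
      }
      return segment;
    })
    .join('/');
}

// In the router's navigation hook (browser only):
//   window.__ROUTE_PATTERN__ = toRoutePattern(location.pathname, params);

console.log(toRoutePattern('/creator/akin', { slug: 'akin' })); // "/creator/:slug"
console.log(toRoutePattern('/creator/jane', { slug: 'jane' })); // "/creator/:slug"
```

One limitation of matching by value: a static segment that happens to equal a param value (say /creator/creator) gets replaced at its first occurrence. That's another reason to prefer the router's own pattern when it is available.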

Sentry Performance and Real User Monitoring

I run web-vitals and Sentry Performance side by side. They answer different questions. The custom RUM beacon answers “what is my p75 LCP for /creator/:slug on Android on the current release”. Sentry answers “show me an actual session trace for one of the worst LCP samples in the last hour”. You want both.

// src/lib/observability/sentry.ts
import * as Sentry from '@sentry/react';
import { browserTracingIntegration, replayIntegration } from '@sentry/react';

export function initSentry() {
  Sentry.init({
    dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
    release: process.env.NEXT_PUBLIC_RELEASE_SHA,
    environment: process.env.NEXT_PUBLIC_ENV,
    integrations: [
      browserTracingIntegration({
        // route-level tracing; the router pushes the pattern, not the URL
        instrumentNavigation: true,
        instrumentPageLoad: true,
      }),
      replayIntegration({
        maskAllText: true,
        blockAllMedia: true,
      }),
    ],
    tracesSampler: (ctx) => {
      // sample more aggressively on routes we care about
      const route = ctx.attributes?.['http.route'] as string | undefined;
      if (route?.startsWith('/creator/')) return 0.5;
      return 0.1;
    },
    replaysSessionSampleRate: 0.01,
    replaysOnErrorSampleRate: 1.0,
  });
}

export function tagVital(metric: string, rating: string, route: string) {
  Sentry.getCurrentScope().setTags({
    [`vital.${metric}`]: rating,
    'vital.route': route,
  });
}

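Tag releases, route patterns, and user segments. Call tagVital from the report callback in the beacon code, so events Sentry captures afterwards in that session carry the metric’s rating. If you can’t slice Sentry traces by release, you can’t tell a regression from background noise.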

Build the Percentile Dashboard

p50 is reassurance. p75 is what Google grades you on. p99 is where the actual pain lives, and you ignore it at your peril because it’s also where your loudest users live.
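To make the three numbers concrete, here is a small sketch using the nearest-rank definition of a percentile on made-up INP samples. The sample values are invented for illustration, and ClickHouse’s quantile function is approximate, so its output on real data can differ slightly from an exact calculation like this one.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
export function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Invented INP samples in milliseconds: mostly fast, with a slow tail.
const inpSamples = [
  40, 48, 56, 64, 72, 80, 88, 96, 120, 140,
  180, 220, 260, 320, 400, 520, 700, 900, 1400, 2600,
];

console.log(percentile(inpSamples, 50)); // 140
console.log(percentile(inpSamples, 75)); // 400
console.log(percentile(inpSamples, 99)); // 2600
```

Same traffic, three very different stories: the median looks healthy, p75 is already past the 200 ms mark, and p99 is someone waiting more than two seconds for a tap to register.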

I send the RUM beacon to a tiny ingest service that writes to ClickHouse. You can do this with Postgres, but ClickHouse handles percentile queries on millions of events without breaking a sweat, and you’ll have millions of events sooner than you think.
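The ingest step is mostly validation plus a field mapping. Here is a minimal sketch of that mapping as a pure function, assuming the beacon payload shown earlier and the table defined below. The function name toRumRow is hypothetical, and so is the idea that the service supplies session_id itself (from a cookie, for example), since the beacon does not send one.

```typescript
// One row in rum_vitals. session_id is not in the beacon payload, so this
// sketch assumes the ingest service provides it (hypothetical).
type RumRow = {
  ts: string; // 'YYYY-MM-DD HH:MM:SS.mmm' in UTC
  metric: string;
  value: number;
  rating: string;
  route: string;
  release: string;
  device_type: string;
  connection: string;
  navigation: string;
  session_id: string;
};

const METRICS = ['LCP', 'INP', 'CLS', 'FCP', 'TTFB'];
const RATINGS = ['good', 'needs-improvement', 'poor'];

const isText = (v: unknown): v is string => typeof v === 'string' && v.length > 0;

// Returns null for anything that does not look like a valid beacon.
export function toRumRow(input: unknown, sessionId: string): RumRow | null {
  if (typeof input !== 'object' || input === null) return null;
  const { metric, value, rating, navigationType, route, release, deviceType, connection, ts } =
    input as Record<string, unknown>;

  if (!isText(metric) || METRICS.indexOf(metric) === -1) return null;
  if (!isText(rating) || RATINGS.indexOf(rating) === -1) return null;
  if (typeof value !== 'number' || !isFinite(value) || value < 0) return null;
  if (!isText(route) || !isText(release) || !isText(deviceType)) return null;
  if (typeof ts !== 'number' || !isFinite(ts)) return null;

  const when = new Date(ts);
  if (isNaN(when.getTime())) return null;

  return {
    ts: when.toISOString().replace('T', ' ').replace('Z', ''),
    metric,
    value,
    rating,
    route,
    release,
    device_type: deviceType,
    // The columns are non-nullable strings, so a missing value becomes ''.
    connection: typeof connection === 'string' ? connection : '',
    navigation: typeof navigationType === 'string' ? navigationType : '',
    session_id: sessionId,
  };
}
```

Rejecting bad payloads at the door keeps junk metric names and negative values out of the percentile queries, where they are much harder to spot.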

-- ClickHouse schema for raw RUM events
CREATE TABLE rum_vitals
(
  ts            DateTime64(3, 'UTC'),
  metric        LowCardinality(String),
  value         Float64,
  rating        LowCardinality(String),
  route         LowCardinality(String),
  release       LowCardinality(String),
  device_type   LowCardinality(String),
  connection    LowCardinality(String),
  navigation    LowCardinality(String),
  session_id    String
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(ts)
ORDER BY (metric, route, release, ts)
TTL toDateTime(ts) + INTERVAL 90 DAY;

-- p50 / p75 / p99 by route, release, and device type, last 24h, INP only
SELECT
  route,
  release,
  device_type,
  count() AS samples,
  quantile(0.50)(value) AS p50,
  quantile(0.75)(value) AS p75,
  quantile(0.99)(value) AS p99
FROM rum_vitals
WHERE metric = 'INP'
  AND ts >= now() - INTERVAL 24 HOUR
GROUP BY route, release, device_type
HAVING samples > 200
ORDER BY p75 DESC;

That HAVING samples > 200 matters. p75 on twelve samples is noise. Don’t alert on it.

Tie Optimization to Specific Metrics

Here’s the workflow. A scheduled job runs the percentile query every fifteen minutes. If p75 INP for any high-traffic route on the current release jumps more than 20% versus the previous release for the same device class, it cuts a ticket and pings the squad that owns that route. No vibes. No “let me add a Suspense boundary and see if it feels faster” PRs.
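The comparison step in that job can be sketched as a pure function over the rows the percentile query returns. The type and function names here (PercentileRow, findRegressions) are illustrative, and the defaults mirror the numbers in this post: more than 200 samples on both sides, and a p75 increase above 20%.

```typescript
// One output row of the percentile query, narrowed to what the check needs.
type PercentileRow = {
  route: string;
  release: string;
  deviceType: string;
  samples: number;
  p75: number;
};

type Regression = {
  route: string;
  deviceType: string;
  previousP75: number;
  currentP75: number;
  change: number; // relative increase, e.g. 0.25 means +25%
};

export function findRegressions(
  rows: PercentileRow[],
  currentRelease: string,
  previousRelease: string,
  minSamples = 200,
  maxIncrease = 0.2,
): Regression[] {
  const keyOf = (r: PercentileRow) => `${r.route}|${r.deviceType}`;

  // Index the previous release by route + device class.
  const baseline: Record<string, PercentileRow> = {};
  for (const r of rows) {
    if (r.release === previousRelease && r.samples > minSamples) baseline[keyOf(r)] = r;
  }

  const regressions: Regression[] = [];
  for (const r of rows) {
    if (r.release !== currentRelease || r.samples <= minSamples) continue;
    const before = baseline[keyOf(r)];
    if (!before || before.p75 <= 0) continue; // nothing trustworthy to compare against
    const change = (r.p75 - before.p75) / before.p75;
    if (change > maxIncrease) {
      regressions.push({
        route: r.route,
        deviceType: r.deviceType,
        previousP75: before.p75,
        currentP75: r.p75,
        change,
      });
    }
  }
  return regressions;
}
```

Comparing within the same device class matters: a release that shifts your traffic mix toward mobile would otherwise look like a regression even if no code path got slower.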

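# Datadog monitor (excerpt, query shown schematically) - absolute p75 INP ceiling for one route + device class.
# Complements the per-release comparison: 200 ms is the upper bound of the "good" INP range.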
type: query alert
query: >
  avg(last_30m):percentile(
    rum.web_vitals.inp{env:prod,route:/creator/:slug,device:mobile},
    75
  ) > 200
message: |
  p75 INP on /creator/:slug mobile is above 200ms for the current release.
  Check the release_sha tag, pull the worst-attribution sample from Sentry,
  and look at the long task breakdown before touching the code.

  @squad-creator-surface
options:
  thresholds: { critical: 200, warning: 160 }
  notify_no_data: false
  evaluation_delay: 60

When the alert fires, the first move is not to open the editor. It’s to pull the worst-attribution samples out of Sentry and look at the actual long tasks. The web-vitals/attribution import I used in the beacon gives you the long task culprit, the largest contentful element, the layout shift sources. Read those before you write any code.
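The attribution object mixes scalar fields (selectors, durations, load state) with raw performance entries, and the scalars are usually what you read first. Here is a rough sketch of a helper that keeps only those, which is also handy if you want attribution in your own store without shipping whole entry objects. The helper name scalarFields is mine, and the field names in the example are modelled on what recent web-vitals versions report for INP; check them against the version you have installed.

```typescript
type Scalar = string | number | boolean;

// Keep only string, boolean, and finite number fields. Nested objects and
// arrays (raw performance entries, element references) are dropped.
export function scalarFields(attribution: Record<string, unknown>): Record<string, Scalar> {
  const out: Record<string, Scalar> = {};
  for (const key of Object.keys(attribution)) {
    const v = attribution[key];
    if (typeof v === 'string' || typeof v === 'boolean') {
      out[key] = v;
    } else if (typeof v === 'number' && isFinite(v)) {
      out[key] = v;
    }
  }
  return out;
}

// Illustrative INP attribution; field names are an assumption, values are made up.
const exampleAttribution = {
  interactionTarget: '#follow-button',
  interactionType: 'pointer',
  inputDelay: 12,
  processingDuration: 340,
  presentationDelay: 28,
  longAnimationFrameEntries: [{ duration: 360 }],
};

console.log(scalarFields(exampleAttribution));
```

In this made-up sample the story is already visible from the scalars alone: input delay and presentation delay are small, so the time went into event handlers on the follow button, which is where the long task breakdown should be read next.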

Takeaways

Synthetic Lighthouse is a smoke check. p75 of your real traffic is the contract.
Wire web-vitals to your own ingest plus Sentry Performance. They answer different questions.
Send route pattern and release SHA on every beacon. Without those, dashboards are useless.
p50 reassures, p75 is the contract, p99 is where your loudest users live. Look at all three.
Tie regressions to alerts and tickets. Read the attribution data before you write code.

Thanks for reading. If you’ve got thoughts, send them my way.