Zenovay

Data Pipeline

Understanding how data flows through Zenovay—from a visitor's first page view to insights on the dashboard—helps you build and debug effectively.

Event Tracking Flow

When a visitor lands on a tracked website, here's what happens:

  • Script loads: Our lightweight tracker (under 5 KB) initializes and generates a visitor ID (stored for 365 days) and a session ID (30-minute timeout).
  • Page view fires: Basic data is collected—URL, referrer, viewport size, user agent—and sent to our API.
  • Geolocation: Cloudflare resolves country and region from the visitor's IP address, which it forwards in the CF-Connecting-IP header. No client-side geolocation is performed.
  • Processing: Events are validated, enriched with device detection, and stored in Supabase.
  • Real-time update: Dashboard subscribers receive the new data via Supabase real-time channels.
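
The ID lifetimes in the first step can be sketched as a small state function. This is a minimal illustration, not the tracker's actual code: it assumes the 30-minute timeout is measured from the last activity and the 365 days from when the visitor ID was created. The names `TrackerState`, `newId`, and `resolveState` are hypothetical.

```typescript
// Lifetimes taken from the text above; everything else is illustrative.
const VISITOR_TTL_MS = 365 * 24 * 60 * 60 * 1000; // visitor ID kept for 365 days
const SESSION_TIMEOUT_MS = 30 * 60 * 1000;        // session ends after 30 idle minutes

interface TrackerState {
  visitorId: string;
  visitorCreatedAt: number; // epoch milliseconds
  sessionId: string;
  lastActivityAt: number;   // epoch milliseconds
}

// Hypothetical ID generator: a counter keeps IDs unique within one process,
// the random suffix makes them hard to guess. Not a real UUID.
let idCounter = 0;
function newId(prefix: string): string {
  idCounter += 1;
  return `${prefix}_${idCounter.toString(36)}_${Math.random().toString(36).slice(2, 10)}`;
}

// Returns the state to attach to an event happening at `now`.
function resolveState(prev: TrackerState | null, now: number): TrackerState {
  if (prev === null || now - prev.visitorCreatedAt >= VISITOR_TTL_MS) {
    // First visit, or the stored visitor ID has expired: start from scratch.
    return {
      visitorId: newId("v"),
      visitorCreatedAt: now,
      sessionId: newId("s"),
      lastActivityAt: now,
    };
  }
  if (now - prev.lastActivityAt >= SESSION_TIMEOUT_MS) {
    // Same visitor, but idle for too long: rotate only the session ID.
    return { ...prev, sessionId: newId("s"), lastActivityAt: now };
  }
  // Active session: slide the inactivity window forward.
  return { ...prev, lastActivityAt: now };
}
```

In a browser the previous state would be read from and written back to a cookie or localStorage around each call; that storage layer is left out here.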

Real-Time vs Aggregated

We balance real-time responsiveness with query performance:

  • Real-time: Live visitor counts, active sessions, and recent events are served directly from raw tables.
  • Aggregated: Historical charts and reports query pre-computed daily aggregations for speed.
  • Cron jobs: Daily aggregation runs at 00:00 UTC, consolidating the previous day's events.
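
To make the aggregation step concrete, here is a rough sketch of rolling raw events up into one row per site per UTC day. The event and aggregate shapes (`PageViewEvent`, `DailyAggregate`) and the chosen metrics are assumptions for illustration; the real job presumably runs inside the database rather than in application code.

```typescript
interface PageViewEvent {
  siteId: string;
  visitorId: string;
  timestamp: string; // ISO 8601, e.g. "2024-05-01T23:59:59Z"
}

interface DailyAggregate {
  siteId: string;
  day: string; // UTC calendar day, "YYYY-MM-DD"
  pageViews: number;
  uniqueVisitors: number;
}

// Consolidates all events that fall on the given UTC day.
function aggregateDay(events: PageViewEvent[], day: string): DailyAggregate[] {
  const perSite = new Map<string, { pageViews: number; visitors: Set<string> }>();
  for (const e of events) {
    // toISOString() always reports UTC, so the first 10 characters are the UTC day.
    if (new Date(e.timestamp).toISOString().slice(0, 10) !== day) continue;
    let bucket = perSite.get(e.siteId);
    if (bucket === undefined) {
      bucket = { pageViews: 0, visitors: new Set<string>() };
      perSite.set(e.siteId, bucket);
    }
    bucket.pageViews += 1;
    bucket.visitors.add(e.visitorId);
  }
  const rows: DailyAggregate[] = [];
  perSite.forEach((b, siteId) => {
    rows.push({ siteId, day, pageViews: b.pageViews, uniqueVisitors: b.visitors.size });
  });
  return rows;
}
```

A job running at 00:00 UTC would call this with the previous day's date, so an event stamped exactly at midnight belongs to the new day and is picked up by the next run.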

Session Replay Recording

Session replay uses the rrweb library to capture DOM changes:

  • Recording: rrweb observes DOM mutations, mouse movements, scrolls, and inputs (with sensitive data masked).
  • Chunking: Events are batched into chunks of at most 10 MB (compressed) and uploaded to the API.
  • Storage: Chunks are stored in Supabase Storage with session metadata in PostgreSQL.
  • Playback: The dashboard fetches chunks on-demand and uses rrweb-player to reconstruct the session.
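
The chunking step can be illustrated with a greedy batcher. This is a sketch under stated assumptions: it measures the length of the serialized JSON as a stand-in for compressed size, treats 10 MB as 10 × 1024 × 1024 bytes, and the function name `chunkEvents` is hypothetical. Compression and upload are omitted.

```typescript
// 10 MB limit from the text above (binary megabytes assumed).
const MAX_CHUNK_BYTES = 10 * 1024 * 1024;

// Greedy batching: close the current chunk when the next event would overflow it.
// `sizeOf` defaults to the JSON string length, which only approximates the
// compressed byte size the real limit refers to.
function chunkEvents<T>(
  events: T[],
  maxBytes: number = MAX_CHUNK_BYTES,
  sizeOf: (event: T) => number = (e) => JSON.stringify(e).length,
): T[][] {
  const chunks: T[][] = [];
  let current: T[] = [];
  let currentSize = 0;
  for (const event of events) {
    const size = sizeOf(event);
    if (current.length > 0 && currentSize + size > maxBytes) {
      chunks.push(current);
      current = [];
      currentSize = 0;
    }
    // An event larger than maxBytes still gets a chunk of its own
    // instead of being dropped silently.
    current.push(event);
    currentSize += size;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Because chunks preserve event order, playback can fetch them one at a time and feed each to the player as it arrives.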

Heatmap Collection

Heatmaps aggregate interaction data across many sessions:

  • Click tracking: Element coordinates and click counts are recorded relative to the viewport.
  • Scroll depth: Maximum scroll positions are tracked to show content engagement.
  • Screenshots: Page screenshots (max 2MB) are captured for overlay rendering.
  • Aggregation: Click and scroll data is combined across sessions so that the heatmap reflects patterns from many visitors rather than individual outliers.
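
One way to picture the aggregation is as binning clicks into a grid and computing how many sessions reach each scroll depth. The sketch below assumes clicks are normalized by the viewport size they were recorded in so that different screen sizes can be combined; that normalization, the grid resolution, and the names `binClicks` and `scrollReach` are illustrative choices, not a description of the production code.

```typescript
interface Click {
  x: number;              // px from the left edge of the viewport
  y: number;              // px from the top edge of the viewport
  viewportWidth: number;  // px
  viewportHeight: number; // px
}

// Maps a 0..1 fraction to a cell index, clamping out-of-range values to the edges.
function clampIndex(fraction: number, cells: number): number {
  return Math.min(cells - 1, Math.max(0, Math.floor(fraction * cells)));
}

// Counts clicks per grid cell; the result is indexed as grid[row][col].
function binClicks(clicks: Click[], cols: number, rows: number): number[][] {
  const grid: number[][] = [];
  for (let r = 0; r < rows; r++) {
    const row: number[] = [];
    for (let c = 0; c < cols; c++) row.push(0);
    grid.push(row);
  }
  for (const click of clicks) {
    if (click.viewportWidth <= 0 || click.viewportHeight <= 0) continue;
    const col = clampIndex(click.x / click.viewportWidth, cols);
    const row = grid[clampIndex(click.y / click.viewportHeight, rows)];
    if (row !== undefined) row[col] = (row[col] ?? 0) + 1;
  }
  return grid;
}

// Share of sessions whose maximum scroll depth (in percent) reached each threshold.
function scrollReach(maxDepths: number[], thresholds: number[]): number[] {
  return thresholds.map((t) =>
    maxDepths.length === 0 ? 0 : maxDepths.filter((d) => d >= t).length / maxDepths.length,
  );
}
```

The grid counts can then be drawn as a color overlay on the page screenshot, and the scroll shares as a fade-out band down the page.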

ML-Powered Visitor Scoring

Every visitor receives a value score (0-100) based on multiple factors:

  • Country multipliers: Geographic purchasing power (e.g., Switzerland 95, US 88, India 28).
  • Device signals: OS and browser correlate with conversion rates (macOS 95, iOS 90, Android 45).
  • Behavioral patterns: Session duration, page depth, and engagement metrics.
  • ML refinement: A trained model adjusts scores based on historical conversion data.
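
As a rough illustration of how these factors might combine before the ML step, the sketch below takes a weighted average of three sub-scores and clamps it to 0-100. Only the six example values come from the text above. The weights, the neutral fallback of 50, the behavior formula, and the use of ISO country codes are all assumptions made for the example, and the ML refinement is not modeled.

```typescript
// Example values quoted in the text above; the full tables are not shown there.
const COUNTRY_SCORE = new Map<string, number>([["CH", 95], ["US", 88], ["IN", 28]]);
const OS_SCORE = new Map<string, number>([["macOS", 95], ["iOS", 90], ["Android", 45]]);

// ASSUMPTIONS: these weights and the neutral fallback are illustrative only.
const WEIGHTS = { country: 0.4, device: 0.3, behavior: 0.3 };
const NEUTRAL_SCORE = 50;

interface VisitorSignals {
  country: string; // ISO 3166-1 alpha-2 code (assumed key format)
  os: string;
  sessionSeconds: number;
  pagesViewed: number;
}

// Hypothetical behavior score: saturates at 5 minutes and 10 pages.
function behaviorScore(s: VisitorSignals): number {
  const duration = Math.min(Math.max(s.sessionSeconds, 0) / 300, 1);
  const depth = Math.min(Math.max(s.pagesViewed, 0) / 10, 1);
  return 100 * (0.5 * duration + 0.5 * depth);
}

// Weighted average of the three sub-scores, rounded and clamped to 0..100.
function baseScore(s: VisitorSignals): number {
  const country = COUNTRY_SCORE.get(s.country) ?? NEUTRAL_SCORE;
  const device = OS_SCORE.get(s.os) ?? NEUTRAL_SCORE;
  const raw =
    WEIGHTS.country * country +
    WEIGHTS.device * device +
    WEIGHTS.behavior * behaviorScore(s);
  return Math.round(Math.min(100, Math.max(0, raw)));
}
```

Under these made-up weights, a visitor from India on Android who leaves immediately scores 25, while a US visitor on iOS who spends 150 seconds across 5 pages scores 77.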

Scheduled Tasks

Background jobs keep data fresh and systems healthy:

  • Daily 00:00 UTC: Analytics aggregation, AI insight generation
  • Every 6 hours: Cache cleanup, health checks
  • Mondays 8 AM: Weekly client reports for agencies
  • Every 5 minutes: Uptime monitoring checks
  • Hourly: Subscription expiration warnings
  • Daily 2 AM: Auto-close inactive support tickets
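
For readers wiring up something similar, the schedule above maps onto standard five-field cron expressions roughly as follows. The expressions are one plausible translation, not the actual configuration: the minute offsets are assumed, and the time zone for the "8 AM" and "2 AM" entries is not stated above, so cron would interpret them in whatever zone the scheduler runs in.

```typescript
interface ScheduledJob {
  schedule: string; // wording from the list above
  cron: string;     // assumed five-field cron equivalent: minute hour day-of-month month day-of-week
  tasks: string[];
}

const SCHEDULED_JOBS: ScheduledJob[] = [
  { schedule: "Daily 00:00 UTC", cron: "0 0 * * *", tasks: ["Analytics aggregation", "AI insight generation"] },
  { schedule: "Every 6 hours", cron: "0 */6 * * *", tasks: ["Cache cleanup", "Health checks"] },
  { schedule: "Mondays 8 AM", cron: "0 8 * * 1", tasks: ["Weekly client reports for agencies"] },
  { schedule: "Every 5 minutes", cron: "*/5 * * * *", tasks: ["Uptime monitoring checks"] },
  { schedule: "Hourly", cron: "0 * * * *", tasks: ["Subscription expiration warnings"] },
  { schedule: "Daily 2 AM", cron: "0 2 * * *", tasks: ["Auto-close inactive support tickets"] },
];

// Shallow sanity check: a standard cron expression has exactly five fields.
// It does not validate the contents of each field.
function hasFiveFields(cron: string): boolean {
  return cron.trim().split(/\s+/).length === 5;
}
```

Keeping the human-readable wording next to each expression makes it easier to spot drift between the documentation and the scheduler.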