For Engineering

Sync internals

How each adapter polls, how the auto-router classifies, how watermarks prevent replay, and how the lost-thread detector + signature parser fit on top.

For the user-facing version see Connecting your accounts.

The pipeline

┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐
│ Outlook  │  │  Slack   │  │ Telegram │  │  Notion  │
│ adapter  │  │ adapter  │  │ adapter  │  │ adapter  │
└────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘
     │             │             │              │
     ▼             ▼             ▼              ▼
            ┌─────────────────────────┐
            │      auto-router        │  domain match, blocklist, skip rules,
            │   (sync/auto-router.js) │  internal-only filter
            └────────────┬────────────┘
                         │
            ┌────────────┴────────────┐
            ▼                         ▼
       Known → log              Unknown → sync_inbox
                                          (human gate)
                         │
                         ▼
                  interactions table
                         │
            ┌────────────┴────────────┐
            ▼                         ▼
     signature-parser           lost-thread-detector
     (people enrichment)        (stale outbound)

The flow runs in sync-worker.js, a separate Node process from the API server, managed by systemd as laurelin-sync.service. The one exception is the Slack events webhook, which is mounted on the API server.

`sync-worker.js`

Single-process scheduler. Each adapter runs on its own interval, and runs of the same adapter happen in series within its own loop, so a slow Outlook poll never blocks Telegram and vice versa. The worker reads and writes the same SQLite DB as the API server; SQLite's WAL mode handles the concurrent access.

The worker is intentionally simple:

setInterval(pollOutlook,  60_000);   // every minute
setInterval(pollTelegram, 60_000);
setInterval(reconcileNotion, 5 * 60_000);
// Slack is event-driven via Express webhook, not polled here.
setInterval(runSignatureParser, 5 * 60_000);
setInterval(detectLostThreads, 10 * 60_000);

Intervals are illustrative; real values are in the source.
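One caveat with a bare setInterval is that a poll slower than its interval can overlap with its own next tick. A minimal sketch of a guard that keeps each loop in series, assuming the poll functions return promises; `skipIfRunning` is a hypothetical helper and is not taken from sync-worker.js:

```javascript
// Hypothetical helper: wraps an async task so that a tick is skipped
// while the previous run of the same task is still in flight.
function skipIfRunning(task) {
  let running = false;
  return async function tick() {
    if (running) return false; // previous run not finished: skip this tick
    running = true;
    try {
      await task();
    } catch (err) {
      // One failed poll should not take the scheduler down.
      console.error(`${task.name || 'task'} failed:`, err.message);
    } finally {
      running = false;
    }
    return true;
  };
}

// Usage (not executed here):
//   setInterval(skipIfRunning(pollOutlook), 60_000);
//   setInterval(skipIfRunning(pollTelegram), 60_000);
```

Each wrapped task keeps its own flag, so a stalled Outlook poll only suppresses further Outlook ticks.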

Outlook adapter (`sync/outlook-adapter.js`)

OAuth

sync/outlook-oauth.js. Azure AD app registration (admin-configured once via env / oauth_tokens table). Per-user OAuth via Microsoft authorization-code flow with scopes Mail.Read and Calendars.Read.

Refresh tokens stored in oauth_tokens (encrypted at rest in the table; KMS is on the maybe-later list). Refresh happens automatically when the access token is within 5 minutes of expiry. A failed refresh transitions the user to a "reconnect required" state, surfaced on the Sync tab.
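The refresh rule can be sketched as a pure check plus a state transition. The field name `expiresAtMs` and the state strings are assumptions made for illustration, not the actual oauth_tokens schema:

```javascript
const REFRESH_WINDOW_MS = 5 * 60 * 1000; // refresh when within 5 minutes of expiry

// True when the access token should be refreshed before the next Graph call.
// `expiresAtMs` (assumed name) is the token's expiry as epoch milliseconds.
function needsRefresh(expiresAtMs, nowMs = Date.now()) {
  return expiresAtMs - nowMs <= REFRESH_WINDOW_MS;
}

// Hypothetical transition: a failed refresh marks the connection so the
// Sync tab can prompt the user to reconnect.
function afterRefreshAttempt(connection, refreshSucceeded) {
  return {
    ...connection,
    state: refreshSucceeded ? 'connected' : 'reconnect_required',
  };
}
```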

Polling

Per connected user:

Read the user's watermark: sync_watermarks WHERE user_id = ? AND source = 'outlook' AND channel = 'mail'.
Call GET /me/messages?$filter=receivedDateTime ge {watermark}&$top=50&$orderby=receivedDateTime asc.
For each message:
- Drop if the message carries a List-Unsubscribe header, an Auto-Submitted header, or Precedence: bulk, or if it matches the known automated-sender heuristics.
- Drop if every participant address ends in @valinordigital.com.
- Build the normalized interaction shape: date, subject, summary (≤255 char bodyPreview), direction (outbound if sender is the connected user's mailbox owner, inbound otherwise), source_id (Outlook internet message ID), source_thread_id (conversation ID).
- Hand to the auto-router.
Advance the watermark to the max receivedDateTime of the batch.
Repeat until the page has fewer than 50 messages.
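The per-message steps above can be sketched as one normalization function. The field names on `msg` are those of the Microsoft Graph message resource; the function names are hypothetical, and the sender heuristics are left out:

```javascript
const INTERNAL_DOMAIN = '@valinordigital.com';

// Graph returns headers as [{ name, value }] when internetMessageHeaders is selected.
function isAutomated(headers = []) {
  return headers.some(({ name, value }) => {
    const n = name.toLowerCase();
    if (n === 'list-unsubscribe' || n === 'auto-submitted') return true;
    return n === 'precedence' && String(value).trim().toLowerCase() === 'bulk';
  });
}

// Sender plus To/Cc recipients, lowercased.
function participants(msg) {
  const recipients = [...(msg.toRecipients || []), ...(msg.ccRecipients || [])];
  return [msg.from, ...recipients]
    .filter(Boolean)
    .map((r) => r.emailAddress.address.toLowerCase());
}

function isInternalOnly(addresses) {
  return addresses.length > 0 && addresses.every((a) => a.endsWith(INTERNAL_DOMAIN));
}

// Returns null when the message should be dropped, else the normalized interaction.
function normalizeOutlookMessage(msg, mailboxOwner) {
  if (isAutomated(msg.internetMessageHeaders) || isInternalOnly(participants(msg))) {
    return null;
  }
  const sender = (msg.from?.emailAddress?.address || '').toLowerCase();
  return {
    date: msg.receivedDateTime,
    subject: msg.subject,
    summary: (msg.bodyPreview || '').slice(0, 255),
    direction: sender === mailboxOwner.toLowerCase() ? 'outbound' : 'inbound',
    source_id: msg.internetMessageId,
    source_thread_id: msg.conversationId,
  };
}
```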

The same loop ingests calendar events (GET /me/events), producing interactions with type = 'meeting' and resolved attendees.

Why bodyPreview

We don't store full message bodies because (a) privacy posture is tighter, (b) the summary is enough to identify what the interaction was about for retrospective search, (c) storage cost on SQLite stays small.

Slack adapter (`sync/slack-adapter.js`)

OAuth & app install

sync/slack-oauth.js. Workspace-level app install + per-user OAuth with scopes:

channels:history — public channel history
groups:history — private channel history
im:history — DMs
mpim:history — group DMs
users:read — resolving sender IDs to email/name

Events webhook

sync/slack-adapter.js exposes a webhook handler (mounted on the API server, not the sync worker) that receives Slack message events. For each event:

Look up the channel in slack_channel_map. If unmapped, surface to the Sync tab.
Resolve the user via slack_user_id → people row.
Build the interaction (source: slack, type: note — Slack doesn't fit cleanly into email/call/meeting).
Hand to the auto-router.

Threading is preserved by storing thread_ts in source_thread_id. A reply within a thread gets the same source_thread_id as the parent.
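A minimal sketch of the thread mapping. Falling back to the message's own `ts` when `thread_ts` is absent, and the `channel:ts` form of `source_id`, are assumptions rather than confirmed behavior of the adapter:

```javascript
// Replies carry thread_ts pointing at the parent message.
// Assumption: a message with no thread_ts is treated as the root of its own thread.
function slackThreadId(event) {
  return event.thread_ts || event.ts;
}

// Builds the normalized interaction for a Slack message event.
function buildSlackInteraction(event, companyId) {
  return {
    source: 'slack',
    type: 'note',
    company_id: companyId,
    date: new Date(Math.floor(Number(event.ts) * 1000)).toISOString(),
    summary: (event.text || '').slice(0, 255),
    source_id: `${event.channel}:${event.ts}`, // assumed format for a stable ID
    source_thread_id: slackThreadId(event),
  };
}
```

With this mapping a parent and its replies share one `source_thread_id`, which is what later grouping by thread relies on.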

Backfill

sync/slack-api.js exposes a backfill(channelId, lookbackHours = 48) function that calls conversations.history. Triggered from the Sync UI's "Backfill" button per channel.

Channel mapping

The slack_channel_map table stores slack_channel_id → company_id. Mappings are sticky — once you've mapped C0123ABC to a company, every future event in that channel routes there without asking.

Telegram adapter (`sync/telegram-adapter.js`, `sync/telegram-bot.js`)

Bot setup

Admin creates a bot via @BotFather, stores the token in env. The Valinor bot runs in Business mode — Telegram's feature for letting a bot read chats from a connected personal account.

telegram-bot.js is intentionally lopsided: it imports the Telegram bot library but exports zero send-side methods. The only methods on the helper are getUpdates, pollChats, resolveUser. Audit-checkable: grep the file for sendMessage and you find nothing.

Pairing

To bind a team member's Telegram account to their Laurelin person row:

Team member clicks "Generate pairing code" in the Sync tab. Backend generates a code with HMAC over (person_id, expires_at), expires in 10 minutes.
Team member opens Telegram, finds the bot, sends /pair <code>.
Bot verifies HMAC + expiry, writes telegram_user_id onto the team member's people row, and creates a row in telegram_connections.

Pairing codes are single-use. A leaked code is useless after 10 minutes and useless to anyone who isn't logged in as the bot, which is just us.
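A minimal sketch of the pairing-code scheme using Node's built-in crypto module. The code layout (`personId.expiresAt.mac`), the function names, and the `usedCodes` set standing in for the single-use store are all assumptions:

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

const PAIRING_TTL_MS = 10 * 60 * 1000; // codes expire after 10 minutes

function signPairing(personId, expiresAt, secret) {
  return createHmac('sha256', secret).update(`${personId}.${expiresAt}`).digest('hex');
}

function createPairingCode(personId, secret, nowMs = Date.now()) {
  const expiresAt = nowMs + PAIRING_TTL_MS;
  return `${personId}.${expiresAt}.${signPairing(personId, expiresAt, secret)}`;
}

// Returns { personId } on success, null on a bad MAC, an expired code, or reuse.
function verifyPairingCode(code, secret, usedCodes, nowMs = Date.now()) {
  const [personId, expiresAt, mac] = String(code).split('.');
  if (!personId || !expiresAt || !mac) return null;
  const expected = Buffer.from(signPairing(personId, expiresAt, secret), 'hex');
  const given = Buffer.from(mac, 'hex');
  // timingSafeEqual throws on unequal lengths, so check the length first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  if (Number(expiresAt) < nowMs || usedCodes.has(code)) return null;
  usedCodes.add(code); // enforce single use
  return { personId: Number(personId) };
}
```

Because the expiry is covered by the MAC, a holder of the code cannot extend its lifetime by editing the timestamp.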

Chat scoping

Each Telegram chat starts in the off state: even after the bot is connected to your account, no messages get logged until you flip the toggle. State is stored in telegram_chat_scope (chat_id, user_id, enabled). For chats shared with other team members, telegram_shared_chat_overrides lets one team member force the chat on or off for everyone, with one canonical override per chat.
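The scope resolution can be sketched as a lookup with a default of off. The `mode` column and its `force_on` / `force_off` values are assumed names; only the two table names come from the text above:

```javascript
// scopes:    rows of telegram_chat_scope { chat_id, user_id, enabled }
// overrides: rows of telegram_shared_chat_overrides { chat_id, mode }
//            (`mode` and its values are assumptions for this sketch)
function isChatEnabled(chatId, userId, scopes, overrides) {
  // A shared override applies to everyone in the chat, so it is checked first.
  const override = overrides.find((o) => o.chat_id === chatId);
  if (override) return override.mode === 'force_on';
  const scope = scopes.find((s) => s.chat_id === chatId && s.user_id === userId);
  return scope ? Boolean(scope.enabled) : false; // no row means off
}
```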

Polling

getUpdates long-poll. Per polled message:

Resolve sender via telegram_user_id lookup.
Check the per-user-per-chat scope. Drop if off.
Build the interaction (source: telegram, type: telegram).
Hand to the auto-router.

Edited messages are ignored. Future work: reconcile edits by message_id + content hash.

Notion adapter (`laurelin/notion-pipeline-sync.js`, `laurelin/notion-sync.js`)

Different shape from the others — Notion is a system of record we synchronize with, not a passive event source.

Two scripts:

notion-sync.js — companies. Reads from a Notion companies database, reconciles into Laurelin's companies.
notion-pipeline-sync.js — pipeline state. Pulls Notion pipeline records into Laurelin projects + companies.

Both use an admin-configured Notion integration token (workspace-level). No per-user OAuth.

The reconciler runs on a schedule (default 5 minutes) and on demand via POST /api/laurelin/sync/notion/pipeline. Strategy is:

List Notion records updated since last watermark.
For each record, look up the corresponding Laurelin row by notion_id (stored in source_metadata JSON).
Diff. Apply non-destructive merge: Laurelin fields take precedence if they were edited more recently than Notion, otherwise pull from Notion.
Write back changes (Laurelin → Notion is gated; off by default).

The reconcile endpoint returns a JSON diff so the UI can show "5 records updated, 2 conflicts."
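A record-level sketch of the merge strategy described above. The `updatedAt` field name, the shape of the return value, and the choice never to blank out a Laurelin value with an empty Notion one are assumptions about what "non-destructive" means here:

```javascript
// Merges one Notion record into its Laurelin counterpart.
// Returns the merged row plus the fields pulled from Notion and the fields
// where the two sides disagreed but Laurelin was newer (conflicts).
function mergeRecord(laurelin, notion, fields) {
  const localIsNewer = Date.parse(laurelin.updatedAt) > Date.parse(notion.updatedAt);
  const merged = { ...laurelin };
  const changes = [];
  const conflicts = [];
  for (const field of fields) {
    if (laurelin[field] === notion[field]) continue;
    if (notion[field] == null) continue; // assumed: never erase an existing value
    if (localIsNewer) {
      conflicts.push(field); // Laurelin wins; report it for the UI
    } else {
      merged[field] = notion[field];
      changes.push(field);
    }
  }
  return { merged, changes, conflicts };
}
```

Summing `changes` and `conflicts` across records would give the counts behind a message like the one quoted above.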

Auto-router (`sync/auto-router.js`)

Classification cascade. Every normalized interaction passes through:

Internal-only check. If every participant (sender + recipients) has an @valinordigital.com address, drop. Internal-team email is not logged as an external interaction.
Domain blocklist. Check domain_blocklist for the sender's domain. If matched, drop. Used for gmail.com, outlook.com, and similar — domains that would generate junk companies if auto-routed.
Skip rules. Check sync_skip_rules for sender, domain, subject_pattern, source_id_prefix, contact_skip. Matches → drop.
Do-not-track. Check email_do_not_track for the sender or any recipient. Match → drop.
Known person. Look up the sender by email/slack_user_id/telegram_user_id/telegram_handle. If found, log directly with that person as a participant. Done.
Known company by domain. Look up the sender's domain in any company's email_domains JSON array. If found, log directly with the company; auto-create the person and the affiliation.
Unknown. Write a row to sync_inbox with status = pending. Suggest a company match by token overlap on the sender's domain root (so a sender at bridge.xyz suggests "Bridge" even if Bridge's email_domains doesn't include bridge.xyz yet).

The cascade is documented at the top of the file. Order matters — earlier rules short-circuit later ones.
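The short-circuit order can be sketched with in-memory stand-ins for the tables. Every shape on `ctx`, the return values, and the simplified domain-root logic are assumptions for illustration; the sketch also assumes email-style sender identifiers:

```javascript
const INTERNAL_DOMAIN = 'valinordigital.com';
const domainOf = (email) => email.split('@')[1].toLowerCase();

// ctx (all assumed shapes):
//   blocklist: Set<domain>, skip: (interaction) => boolean, doNotTrack: Set<email>,
//   people: Map<email, { id }>, companies: [{ id, name, email_domains: [domain] }]
function route(interaction, ctx) {
  const { sender, recipients } = interaction;
  const everyone = [sender, ...recipients];

  if (everyone.every((e) => domainOf(e) === INTERNAL_DOMAIN)) return { action: 'drop', reason: 'internal' };
  if (ctx.blocklist.has(domainOf(sender))) return { action: 'drop', reason: 'blocklist' };
  if (ctx.skip(interaction)) return { action: 'drop', reason: 'skip_rule' };
  if (everyone.some((e) => ctx.doNotTrack.has(e))) return { action: 'drop', reason: 'do_not_track' };

  const person = ctx.people.get(sender);
  if (person) return { action: 'log', personId: person.id };

  const company = ctx.companies.find((c) => c.email_domains.includes(domainOf(sender)));
  if (company) return { action: 'log', companyId: company.id, createPerson: true };

  return { action: 'inbox', status: 'pending', suggestedCompanyId: suggestCompany(sender, ctx.companies) };
}

// Token-overlap hint: compare the domain root ("bridge" in bridge.xyz)
// against the tokens of each company name. Multi-part TLDs are not handled.
function suggestCompany(sender, companies) {
  const root = domainOf(sender).split('.').slice(-2)[0];
  const hit = companies.find((c) => c.name.toLowerCase().split(/\W+/).includes(root));
  return hit ? hit.id : null;
}
```

Each rule returns as soon as it matches, which is why a blocklisted sender never reaches the known-person lookup.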

Signature parser (`sync/signature-parser.js`)

Deterministic regex pass over interactions.summary (the 255-char preview). Extracts:

Phone numbers (multiple formats, US + international)
Title / role (from common signature patterns: "VP of X", "Director, X")
LinkedIn URL
Telegram handle (@username patterns)

Findings are written to the matching people row but only into empty fields. The parser never overwrites existing data.
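A minimal sketch of the extract-then-fill-empty approach. The regexes are deliberately simple illustrations, not the patterns in signature-parser.js, and the field names on the person row are assumed:

```javascript
// Illustrative patterns only; real signatures need broader coverage.
const SIGNATURE_PATTERNS = {
  phone: /\+?\d[\d ().-]{7,}\d/,
  title: /\b(?:VP|Director|Head)(?: of|,) [A-Z][A-Za-z&]+(?: [A-Z][A-Za-z&]+)*/,
  linkedin_url: /https?:\/\/(?:www\.)?linkedin\.com\/in\/[\w-]+/i,
  telegram_handle: /(?:^|\s)@([A-Za-z]\w{4,31})\b/,
};

// Runs every pattern over the preview and returns whatever matched.
function parseSignature(summary) {
  const findings = {};
  for (const [field, pattern] of Object.entries(SIGNATURE_PATTERNS)) {
    const match = summary.match(pattern);
    // Use the capture group when the pattern has one, else the whole match.
    if (match) findings[field] = (match[1] || match[0]).trim();
  }
  return findings;
}

// Copies findings onto the person row, but only into empty fields.
function fillEmptyFields(person, findings) {
  const updated = { ...person };
  for (const [field, value] of Object.entries(findings)) {
    if (updated[field] == null || updated[field] === '') updated[field] = value;
  }
  return updated;
}
```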

Gating

signature_parsed_at is set once a person has been parsed.
signature_parse_attempts increments on each pass.
Capped at LAURELIN_SIGNATURE_MAX_ATTEMPTS (default 3) — if a person's signature consistently yields nothing useful, we stop trying.

The parser runs as a separate worker tick, processing people with signature_parsed_at IS NULL AND signature_parse_attempts < cap.
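The gating can be sketched as an eligibility check plus an attempt recorder. One assumption is labeled in the code: `signature_parsed_at` is stamped only when a pass found something, so that empty passes keep counting toward the cap.

```javascript
const DEFAULT_CAP = Number(process.env.LAURELIN_SIGNATURE_MAX_ATTEMPTS || 3);

// Mirrors: signature_parsed_at IS NULL AND signature_parse_attempts < cap
function isEligible(person, cap = DEFAULT_CAP) {
  return person.signature_parsed_at == null && person.signature_parse_attempts < cap;
}

// Assumption: parsed_at is set only on a useful pass; attempts always increment.
function recordAttempt(person, foundAnything, nowIso) {
  return {
    ...person,
    signature_parse_attempts: person.signature_parse_attempts + 1,
    signature_parsed_at: foundAnything ? nowIso : person.signature_parsed_at,
  };
}
```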

Lost-thread detector (`sync/lost-thread-detector.js`)

Scans interactions for outbound emails that haven't received an inbound reply within a threshold.

Candidate generation

For each thread (grouped by source_thread_id) with an external company where the latest message has interactions.direction = 'outbound':
Compute days_stale from the last message.
Classify urgency:
- emergency if the company is high importance + active/core stage, or a project linked to this thread has an upcoming key_dates row.
- normal otherwise.
Generate a content_hash over (sender, recipient, subject, snippet) for dedup.
Insert into lost_thread_candidates if no row exists for this content_hash with status = pending.
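The candidate steps can be sketched as small pure functions. SHA-256 as the hash, the `importance` / `stage` field names, and the `pendingHashes` set standing in for the table lookup are assumptions:

```javascript
const { createHash } = require('node:crypto');

const DAY_MS = 24 * 60 * 60 * 1000;

// Whole days elapsed since the last message in the thread.
function daysStale(lastMessageIso, nowMs = Date.now()) {
  return Math.floor((nowMs - Date.parse(lastMessageIso)) / DAY_MS);
}

function classifyUrgency(company, hasUpcomingKeyDate) {
  const hot = company.importance === 'high' && ['active', 'core'].includes(company.stage);
  return hot || hasUpcomingKeyDate ? 'emergency' : 'normal';
}

// Serializing as a JSON array keeps the field boundaries unambiguous.
function contentHash({ sender, recipient, subject, snippet }) {
  return createHash('sha256')
    .update(JSON.stringify([sender, recipient, subject, snippet]))
    .digest('hex');
}

// Returns true when a new pending candidate would be inserted.
function maybeInsertCandidate(candidate, pendingHashes) {
  const hash = contentHash(candidate);
  if (pendingHashes.has(hash)) return false; // already pending: dedup
  pendingHashes.add(hash);
  return true;
}
```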

State machine

Per candidate row:

status = pending → user hasn't acted yet.
status = resolved (with resolved_at, optional resolved_interaction_id) → user marked it handled.
status = dismissed (with dismissed_scope = 'once' or 'forever') → user dismissed.

When the user requests a draft (POST /api/laurelin/lost-threads/:id/draft):

draft_requested_at set.
A worker tick picks it up, calls Claude via sync/claude-api.js with the thread context + user's voice setting + chosen intent.
On success: draft_completed_at, draft_body, draft_body_preview populated.

Drafts are never sent. The user copies into Outlook.

Watermarks (`sync_watermarks`)

(user_id, source, channel) → last_sync_at + last_source_id. Every adapter reads its watermark at the start of a poll cycle and advances it at the end. Idempotent — re-reading the same range produces no duplicates because dedup happens via source_id uniqueness on interactions.

A worker restart never replays history. If the SQLite DB is restored from backup, watermarks rewind with it (which is fine — re-running a small replay produces no duplicates).
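A small in-memory sketch of why replays are harmless: inserts are keyed by source_id, and the watermark only moves forward. The `store` shape and the function name are hypothetical, and dates are assumed to be ISO strings in one consistent format so that string comparison orders them correctly:

```javascript
// store.interactions: Map keyed by source_id (stands in for the uniqueness constraint)
// store.watermarks:   Map keyed by "user_id:source:channel"
function ingestBatch(store, key, batch) {
  let inserted = 0;
  for (const item of batch) {
    if (store.interactions.has(item.source_id)) continue; // duplicate: ignore
    store.interactions.set(item.source_id, item);
    inserted += 1;
  }
  if (batch.length > 0) {
    const last = batch.reduce((a, b) => (a.date >= b.date ? a : b));
    const current = store.watermarks.get(key);
    if (!current || last.date >= current.last_sync_at) {
      store.watermarks.set(key, { last_sync_at: last.date, last_source_id: last.source_id });
    }
  }
  return inserted;
}
```

Feeding the same batch twice inserts nothing the second time and leaves the watermark where it was.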

Skip rules (`sync_skip_rules`)

5 rule types:

`rule_type`	Pattern
`sender`	Exact email address
`domain`	Exact domain — `mailchimp.com`
`subject_pattern`	Case-insensitive substring — `unsubscribe`
`source_id_prefix`	Prefix on the source's message ID — used to skip a known mailing list ID range
`contact_skip`	"Don't log interactions involving this specific person"

Source can be outlook, slack, telegram, or all. Created by team members from the Sync tab; admin can audit and bulk-delete.
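A sketch of how the five rule types might be matched against a normalized interaction. The `pattern` and `source` column names, and the `sender` and `person_ids` fields on the interaction, are assumptions:

```javascript
// Returns true when the rule says this interaction should be dropped.
function matchesSkipRule(rule, interaction) {
  if (rule.source !== 'all' && rule.source !== interaction.source) return false;
  const sender = (interaction.sender || '').toLowerCase();
  const pattern = String(rule.pattern).toLowerCase();
  switch (rule.rule_type) {
    case 'sender':
      return sender === pattern;
    case 'domain':
      return sender.split('@')[1] === pattern;
    case 'subject_pattern':
      return (interaction.subject || '').toLowerCase().includes(pattern);
    case 'source_id_prefix':
      // Message IDs are compared as stored, without lowercasing.
      return String(interaction.source_id).startsWith(String(rule.pattern));
    case 'contact_skip':
      // Assumed: the pattern holds the person's ID.
      return (interaction.person_ids || []).includes(Number(rule.pattern));
    default:
      return false;
  }
}
```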

Admin setup (one-time per integration)

Outlook

Azure AD app registration in the tenant.
Redirect URI set to the OAuth callback URL.
API permissions: Mail.Read, Calendars.Read (delegated).
Client ID, client secret, tenant ID written to env: OUTLOOK_CLIENT_ID, OUTLOOK_CLIENT_SECRET, OUTLOOK_TENANT_ID.
Set the config via PUT /api/laurelin/sync/outlook/config or the admin UI.

Slack

Create a Slack app at api.slack.com and add the OAuth scopes listed above.
Configure the Events API webhook URL (must be publicly reachable — currently https://laurelin.valinorinfo.com/api/laurelin/sync/slack/events).
Client ID, client secret, signing secret in env.
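The signing secret is what lets the events endpoint confirm that a request really came from Slack. A minimal sketch of Slack's documented v0 scheme (HMAC-SHA256 over `v0:<timestamp>:<raw body>`, compared against the X-Slack-Signature header); the function name and parameter object are hypothetical, and whether the adapter does this itself or delegates to a library is not stated here:

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// timestamp: value of X-Slack-Request-Timestamp
// signature: value of X-Slack-Signature
// rawBody:   the unparsed request body, exactly as received
function verifySlackSignature({ signingSecret, timestamp, rawBody, signature, nowSec = Math.floor(Date.now() / 1000) }) {
  // Reject requests older than five minutes to limit replay.
  if (Math.abs(nowSec - Number(timestamp)) > 60 * 5) return false;
  const expected = 'v0=' + createHmac('sha256', signingSecret)
    .update(`v0:${timestamp}:${rawBody}`)
    .digest('hex');
  const a = Buffer.from(expected);
  const b = Buffer.from(signature || '');
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The check needs the raw body, so in an Express app it has to run before (or alongside) JSON body parsing.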

Telegram

Send /newbot to @BotFather to create a bot and get a token.
Enable inline mode + Business mode in @BotFather settings.
Token in env: TELEGRAM_BOT_TOKEN.

Notion

Create an internal integration at notion.so/my-integrations.
Grant it read access to the relevant databases.
Token in env: NOTION_API_KEY.
Database IDs configured in the Notion sync settings.

When sync goes wrong

Watermark stuck — last_sync_at not advancing. Check journalctl -u laurelin-sync -f. Common causes: token expiry (Outlook refresh failed), Slack scope revoked, Telegram bot ejected from Business chat.
Duplicate interactions — usually means source_id uniqueness was bypassed. Check the adapter's normalization and ensure every message has a stable source_id.
Sync Inbox flooded — email_domains not maintained on companies. Approving from the inbox auto-learns new domains, but if you're seeing many items from one company, add its full list of domains to the company record.
Auto-router routes wrong company — token-overlap suggestion is just a hint; the user should reject it. If a domain consistently mis-routes (e.g., a shared domain like consensys.net that has multiple sub-companies), use the notes field on the company to flag it and route manually.
