feat(analytics): attach opaque server-url hash to every event #113
Merged
ar2rsawseen merged 2 commits into main on Apr 23, 2026
Conversation
Pull request overview
Adds an opaque, truncated SHA-256–based server hash segment to analytics events so telemetry can be aggregated by distinct Countly server without transmitting raw URLs/domains.
Changes:
- Introduces URL normalization + 16-hex server hash computation and injects the `server` segment into all analytics event tracking.
- Wires a lazy per-event server URL resolver (AsyncLocalStorage-aware for HTTP, config fallback for stdio).
- Updates docs (README, CHANGELOG) and adds test coverage for hashing + segment injection behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/lib/analytics.ts | Adds server URL normalization/hash utilities and injects the server segment into event segmentation. |
| src/index.ts | Passes a per-event resolver for serverUrl (request-scoped when available). |
| tests/analytics.test.ts | Adds unit tests for normalization, hashing, resolver semantics, and segment propagation. |
| README.md | Documents the new server hash telemetry and clarifies what is/isn’t tracked. |
| CHANGELOG.md | Notes the telemetry change under 1.3.0. |
Adds a short SHA-256 hash of the Countly server URL as the `server` segment on every analytics event, so stats.count.ly can answer "how many distinct servers use MCP" (and the same question per tool, per auth method, per error type, etc.) without ever receiving a raw URL or domain.

How it works:

- New `computeServerHash(url)` (exported) and `normalizeServerUrlForHash(url)` in src/lib/analytics.ts. Normalize by stripping scheme, lowercasing, and trimming trailing slashes so e.g. https://Example.com/ and http://example.com hash identically. SHA-256, first 16 hex chars (64 bits of entropy): enough to distinguish billions of servers with negligible aggregation collision risk, and it keeps the event payload small.
- `analytics.init()` now accepts an optional `getServerUrl: () => string | undefined` resolver, called lazily on every event-track. `trackEvent` / `trackTimedEvent` merge the resolved hash into segmentation under key `server`. All specialized helpers (`trackToolExecution`, `trackToolCategory`, `trackAuthMethod`, `trackApiEndpoint`, `trackHttpRequest`, `trackError`, etc.) delegate to those, so the segment flows through automatically.
- `trackView` and `trackUserProperty` are unchanged; they aren't event-shaped in Countly.
- `index.ts` wires the resolver: `() => requestContext.getStore()?.serverUrl || this.config?.serverUrl`. That makes the hash track the per-request URL in HTTP multi-tenant mode (via the AsyncLocalStorage set by the HTTP middleware) while still falling back to the static env-derived URL in stdio mode. `this.config` is populated after `analytics.init`, so the resolver is written defensively against `this.config === undefined`; in practice it's only called at event-track time, by which point config is set.
- `device_id` stays `"mcp"` (explicit choice). Distinct-server counts come from the `server` segmentation breakdown, not from Countly's built-in "users" metric.

Privacy note included in the README "Analytics Tracking" section: the hash is coarse (64 bits) and server URLs are low-entropy, so it is intended for aggregation and NOT as a secret. Raw URLs and domains are still never transmitted.

Coverage: +17 tests in tests/analytics.test.ts covering normalization (scheme/case/trailing-slash equivalence), hash length, resolver behavior (undefined/empty/varying URLs), injection into all specialized helpers, omission when no resolver is set, per-event re-evaluation (HTTP multi-tenant), and the explicit "device_id stays mcp" assertion.

Total 356 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
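
For readers following along, here is a minimal sketch of the flow this commit message describes: first-iteration normalization, a 16-hex SHA-256 prefix, and a lazy resolver whose result is merged into event segmentation. It is not the code from src/lib/analytics.ts. `AnalyticsSketch`, the `sent` array, the `config` object, and the example URLs are hypothetical stand-ins, and only the function names and the resolver expression are taken from the message above.

```typescript
import { createHash } from "node:crypto";
import { AsyncLocalStorage } from "node:async_hooks";

// First-iteration normalization as the commit message describes it:
// strip the scheme, lowercase everything, trim trailing slashes.
function normalizeServerUrlForHash(url: string): string {
  return url
    .trim()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//i, "")
    .toLowerCase()
    .replace(/\/+$/, "");
}

// First 16 hex characters (64 bits) of the SHA-256 digest.
function computeServerHash(url: string | undefined): string | undefined {
  if (!url) return undefined;
  const normalized = normalizeServerUrlForHash(url);
  if (!normalized) return undefined;
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}

type Segmentation = Record<string, string | number | boolean>;

// Hypothetical stand-in for the analytics module; it records events in
// memory instead of sending them anywhere.
class AnalyticsSketch {
  private getServerUrl?: () => string | undefined;
  readonly sent: Array<{ key: string; segmentation: Segmentation }> = [];

  init(getServerUrl?: () => string | undefined): void {
    this.getServerUrl = getServerUrl;
  }

  // The resolver runs on every call, so the hash follows the current request.
  private withServerSegment(segmentation: Segmentation = {}): Segmentation {
    const hash = computeServerHash(this.getServerUrl?.());
    return hash ? { ...segmentation, server: hash } : segmentation;
  }

  trackEvent(key: string, segmentation?: Segmentation): void {
    this.sent.push({ key, segmentation: this.withServerSegment(segmentation) });
  }
}

// Assumed shapes for the request-scoped store and the static config.
const requestContext = new AsyncLocalStorage<{ serverUrl?: string }>();
const config: { serverUrl?: string } = { serverUrl: "https://static.example.com" };

const analytics = new AnalyticsSketch();
analytics.init(() => requestContext.getStore()?.serverUrl || config.serverUrl);

// Outside any request scope (stdio-like): falls back to the static config URL.
analytics.trackEvent("tool_executed", { tool: "a" });

// Inside a request scope (HTTP-like): uses the per-request URL.
requestContext.run({ serverUrl: "https://tenant-a.example.com" }, () => {
  analytics.trackEvent("tool_executed", { tool: "b" });
});
```

Because the resolver is invoked inside `withServerSegment` rather than at `init` time, the two events above carry different `server` values even though they come from the same process.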
c4d5255 to 4e20449
Three review items, all valid, all fixed:

1. normalizeServerUrlForHash claimed to strip default ports but didn't. Was a simple scheme-strip + lowercase + slash-trim over the raw string. Semantically-equivalent URLs like `https://example.com` and `https://example.com:443` would hash differently and split the distinct-server aggregation. Now uses `new URL()` to parse, then explicitly strips :80 on http and :443 on https.
2. Lowercasing the full URL was wrong for paths. RFC 3986 says only the host is case-insensitive; paths, queries, and fragments are case-sensitive. The old code would merge e.g. `/api` and `/API` into the same hash. Now only `parsed.hostname` is lowercased; path / search / hash case is preserved.
3. `server_started` event fired from inside `analytics.init()` before `this.config` was populated, so its resolver call returned `undefined` and the very first event shipped without the `server` segment, contradicting the "every event" goal. Resolver now falls back to `process.env.COUNTLY_SERVER_URL` for the pre-config window. Priority order is now:
   1. HTTP per-request URL (AsyncLocalStorage)
   2. this.config.serverUrl (after constructor finishes)
   3. process.env.COUNTLY_SERVER_URL (pre-config fallback)

The URL parse has a safe fallback: if `new URL()` throws (bare hostname without scheme, weird input), we prepend `https://` and try once more; ultimate fallback is the old-style regex strip preserving path case. So we still produce a stable hash for non-URL-ish input instead of dropping the `server` segment.

Coverage: +5 tests in tests/analytics.test.ts covering the three semantic fixes:

- default port stripping (:80, :443) on both hash and normalize
- non-default ports preserved
- path case preserved (`/api` vs `/API` differ)
- bare hostnames without scheme accepted
- server_started event carries the `server` segment

Total 361 tests pass (up from 356).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
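
The revised normalization and the resolver priority order described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not the merged code: the `ResolverSources` interface and `resolveServerUrl` helper are hypothetical, the exact output format of the normalized string is a guess, and the sketch leans on the fact that the WHATWG `URL` parser already clears a scheme's default port and lowercases the hostname for http/https URLs.

```typescript
import { createHash } from "node:crypto";

// Parse-based normalization: host lowercased, default ports dropped,
// path/search/hash case preserved, trailing slashes on the path trimmed.
function normalizeServerUrlForHash(raw: string): string {
  const input = raw.trim();
  if (!input) return "";

  // Try the input as-is, then with https:// prepended (bare hostnames).
  for (const candidate of [input, `https://${input}`]) {
    let parsed: URL;
    try {
      parsed = new URL(candidate);
    } catch {
      continue; // not parseable in this form, try the next candidate
    }
    // "example.com:8080" parses as scheme "example.com:", so only accept
    // genuine http(s) results and fall through to the prefixed retry otherwise.
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") continue;

    // parsed.port is already "" for :80 on http and :443 on https.
    const port = parsed.port ? `:${parsed.port}` : "";
    const path = parsed.pathname.replace(/\/+$/, "");
    return `${parsed.hostname.toLowerCase()}${port}${path}${parsed.search}${parsed.hash}`;
  }

  // Last resort for non-URL-ish input: strip a scheme prefix and trailing
  // slashes without changing case, so a stable hash is still produced.
  return input.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "").replace(/\/+$/, "");
}

// First 16 hex characters (64 bits) of the SHA-256 digest.
function computeServerHash(url: string | undefined): string | undefined {
  if (!url) return undefined;
  const normalized = normalizeServerUrlForHash(url);
  if (!normalized) return undefined;
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}

// Hypothetical container for the three URL sources named in the commit.
interface ResolverSources {
  requestUrl?: string; // e.g. requestContext.getStore()?.serverUrl
  configUrl?: string;  // e.g. this.config?.serverUrl
  envUrl?: string;     // e.g. process.env.COUNTLY_SERVER_URL
}

// Priority: per-request URL, then config, then the environment fallback.
function resolveServerUrl(sources: ResolverSources): string | undefined {
  return sources.requestUrl || sources.configUrl || sources.envUrl || undefined;
}
```

Passing the three sources in as plain values keeps the priority logic testable without an HTTP server or a populated `process.env`; the real resolver reads them from their live locations at event-track time.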
Cookiezaurs
approved these changes
Apr 23, 2026
Why
We want to answer "how many distinct Countly servers use the MCP" (and the same question sliced per tool, per transport, per auth method, etc.), while respecting the privacy commitment the README makes. Raw URLs and domains should never leave the process.
What this PR does
Adds a short opaque `server` segment to every analytics event: a 16-hex-char SHA-256 prefix of the normalized Countly server URL. That gives Countly the aggregation signal without ever transmitting an identifiable URL.
Design
- Segment key: `server`.
- `device_id` stays `"mcp"` (explicit choice: the hash lives in event segmentation, not device identity). This was discussed and agreed: keep the device-level anonymity and derive distinct-server counts from the `server` segment breakdown.
- `https://Example.com/`, `http://example.com`, and `HTTPS://EXAMPLE.com` all hash identically.
- `analytics.init()` accepts a `getServerUrl()` callback invoked lazily at event-track time. In HTTP transport the callback reads the request-scoped URL from `AsyncLocalStorage` (same mechanism used for per-tenant auth isolation); in stdio mode it reads the static env-derived config. Multi-tenant deployments naturally emit per-tenant counts.
Code changes
- Adds `normalizeServerUrlForHash()` and `computeServerHash()`
- `Analytics.init(enabled, getServerUrl?)`: second arg is the new lazy resolver
- `withServerSegment()` merges the hash onto the segmentation of every `trackEvent` / `trackTimedEvent` call
- Specialized helpers (`trackToolExecution`, `trackToolCategory`, `trackAuthMethod`, `trackApiEndpoint`, `trackHttpRequest`, `trackError`) delegate to those two, so the segment flows through automatically
- Resolver: `() => requestContext.getStore()?.serverUrl || this.config?.serverUrl`
- The resolver tolerates `this.config` being undefined at `analytics.init()` time (it's populated later in the constructor, resolver is called lazily)
- README lists the `server` hash as tracked (and explains why it's coarse, so no one reads a secrecy guarantee into it), and explicitly calls out raw URLs / domains as NOT tracked
- CHANGELOG entry under `[1.3.0]` (nothing published under 1.3.0 yet so no new version bump needed)
Privacy characterization (for the README)
The hash is coarse (64 bits) and server URLs are low-entropy: cloud patterns (`*.count.ly`) are dictionary-bruteforceable by anyone, Countly most of all. This is intended for aggregation, not secrecy. What it does buy:
- […] `stats.count.ly` doesn't leak every customer's on-prem URL)
For strict privacy (e.g. on-prem URLs operators don't want even Countly to see), analytics remain opt-in: they are disabled by default, and `ENABLE_ANALYTICS=true` is required.
Tests
+17 tests in tests/analytics.test.ts:
- `normalizeServerUrlForHash`: scheme / case / trailing-slash equivalence, empty input
- `computeServerHash`: same inputs → same hash, different inputs → different hash, 16-hex format, undefined/empty → undefined
- `trackEvent`, `trackTimedEvent`, all specialized helpers propagate `server` segment
- `device_id` assertion: stays `"mcp"`, hash is on events not device identity
Total: 356 tests passing (339 existing + 17 new). Lint clean.
Test plan
- With `ENABLE_ANALYTICS=true` and a real Countly server URL, run a couple of tools via stdio; confirm events arrive at `stats.count.ly` with a `server` segment matching the expected SHA-256 prefix
- With differing `X-Countly-Server-Url` headers, confirm the `server` segments differ (per-tenant) within the same process
🤖 Generated with Claude Code