Privacy policy

Last updated 2026-05-29 · Effective 2026-05-29

This policy explains what data Clew collects, why, how long we keep it, who else sees it, and the rights you have over it. We try to write this in plain language. If you read something here and the Security page contradicts it, the Security page is the implementation truth — tell us so we can fix the language.

1. Who's the controller

Backthread OÜ, registered in the Estonian Commercial Register under number [REGISTRATION NUMBER], with registered office at [REGISTERED ADDRESS], Estonia ("we", "us", "Clew"), is the data controller for the data described in §2.1, §2.2, §2.3, and §2.4 below.

For the source code we temporarily clone when you connect a repository (see §2.5), we act as a data processor on your instructions: you decide what code we read and why. The companion Data Processing Addendum sets out our obligations as your processor.

Contact for any privacy question, request, or complaint: [email protected].

2. What data we collect and why

2.1 Waitlist signups (controller)

When you join the waitlist on useclew.dev:

email address (so we can write back when we open access);
the form variant you used (one of lander-hero, lander-final, rescue-hero, rescue-final) — to tell which page wording landed;
UTM parameters from the URL — for the same reason;
referrer, user-agent, Accept-Language header — capped in length, used for very basic analytics;
edge metadata Cloudflare derives from your connection: approximate country, city, region, timezone, and ISP name;
a salted SHA-256 hash of your IP address — for per-IP rate limiting on the signup endpoint. We never store the raw IP.
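As an illustration of the last item, here is a minimal sketch of how a salted SHA-256 hash of an IP address could be computed so that a rate limiter can recognise a repeat address without the raw IP being written anywhere. The function name `hash_ip`, the way the salt is passed in, and the normalisation step are assumptions made for this example, not a description of our actual code.

```python
import hashlib
import ipaddress


def hash_ip(raw_ip: str, salt: bytes) -> str:
    """Return a hex SHA-256 digest of salt + normalised IP (hypothetical helper)."""
    # Normalise first so equivalent spellings of one address hash identically.
    normalised = ipaddress.ip_address(raw_ip.strip()).compressed
    # Only the digest is returned; the raw address is never stored by this function.
    return hashlib.sha256(salt + normalised.encode("ascii")).hexdigest()
```

The salt matters because the IPv4 address space is small enough to enumerate; without a secret salt, a bare SHA-256 digest could be reversed by hashing every possible address.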

Lawful basis: consent (Art 6(1)(a) GDPR) — you typed your email into a form. You can withdraw at any time by emailing us.

2.2 Account data (controller)

When you sign in:

your GitHub OAuth identity (GitHub user ID, login, email, avatar URL) via Supabase Auth;
the session token Supabase issues, which lives in your browser;
an account record (one personal account per signup; the unit of data isolation in our database).

Lawful basis: contract performance (Art 6(1)(b)).

2.3 GitHub installation linkage (controller)

When you connect a repository via the Clew Ingest GitHub App:

the GitHub installation ID GitHub assigns to your install;
which repositories you granted access to (owner + name + default branch + visibility);
sync status per repo (pending, syncing, ready, failed, disconnected) and the latest sync error, with credentials scrubbed.
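The "credentials scrubbed" step in the last item could look roughly like the sketch below. The pattern list and the `scrub` function are hypothetical; the two patterns (credentials embedded in a URL, and GitHub's `ghp_` / `gho_` / `ghu_` / `ghs_` / `ghr_` token prefixes) are examples of what such a filter might cover, not an exhaustive or authoritative list.

```python
import re

# Hypothetical patterns; a real filter would cover every secret type it can encounter.
_PATTERNS = [
    # Credentials embedded in a URL: https://user:secret@host/...
    (re.compile(r"(https?://)[^/\s:@]+:[^/\s@]+@"), r"\1[REDACTED]@"),
    # GitHub token prefixes followed by the token body.
    (re.compile(r"\bgh[pousr]_[A-Za-z0-9_]+"), "[REDACTED]"),
]


def scrub(message: str) -> str:
    """Replace anything that looks like a credential before the message is stored."""
    for pattern, replacement in _PATTERNS:
        message = pattern.sub(replacement, message)
    return message
```

For example, a git error containing `https://x-access-token:ghs_abc123@github.com/acme/app.git/` would be stored with the user and token portion replaced by `[REDACTED]`.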

We do not store the GitHub App private key in our database (it lives in our orchestration Worker as a secret), and we do not persist installation access tokens: they are minted per ingest job and destroyed with the sandbox.

Lawful basis: contract performance (Art 6(1)(b)).

2.4 Derived diagrams + changelogs (controller; eventually processor for team customers)

For every connected repository we store:

the derived diagram (modules, edges, clustering, layout) — JSONB;
the per-module changelog with the LLM-narrated "why" behind each PR-merged change;
the list of loose ends the system flagged.

This is the product. We do not store your source code in this category — only what we derived from it.

Lawful basis: contract performance (Art 6(1)(b)).

2.5 Source code (processor)

When an ingest runs, the orchestration Worker spawns a fresh, isolated sandbox (a Cloudflare Container — Firecracker microVM); inside it we:

clone your repo with git clone --depth 1 using a job-scoped installation token;
read the source statically (no npm install, no require() of your code, no eval);
write only derived data to our database;
destroy the sandbox.
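The four steps above can be sketched as follows. This is an illustration under stated assumptions, not our implementation: a temporary directory stands in for the isolated sandbox (it is not a microVM and gives no isolation), and `ingest`, `derive_modules`, and `git_clone_shallow` are hypothetical names. The "static read" here only lists top-level directory names, to show that nothing from the checkout is imported or executed.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def git_clone_shallow(clone_url: str, dest: Path) -> None:
    """Shallow-clone the repository (history depth 1) into dest."""
    subprocess.run(
        ["git", "clone", "--depth", "1", clone_url, str(dest)],
        check=True,
        capture_output=True,
    )


def derive_modules(checkout: Path) -> list[str]:
    """Static read: list top-level directories without importing or running anything."""
    return sorted(
        p.name for p in checkout.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def ingest(
    clone_url: str,
    clone: Callable[[str, Path], None] = git_clone_shallow,
) -> list[str]:
    # The temporary directory is removed when the block exits,
    # even if cloning or analysis raises.
    with tempfile.TemporaryDirectory() as workdir:
        checkout = Path(workdir) / "repo"
        clone(clone_url, checkout)
        derived = derive_modules(checkout)
    # Only derived data leaves the function; the checkout no longer exists.
    return derived
```

Passing the clone step in as a parameter keeps the sketch testable without network access, and makes the lifetime rule easy to check: whatever `clone` wrote is gone by the time `ingest` returns.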

The clone, the installation token, and any in-memory representation of your code die with the sandbox. We do not retain your source code in any database, log, or cache outside the lifetime of that single sandbox. See Security for the implementation detail.

When we process source code, we act as your processor under Article 28 GDPR. See the Data Processing Addendum.

2.6 Operational / transient data

To keep the service running we keep a few short-TTL records:

per-IP rate-limit counters on the signup endpoint (10-minute TTL);
per-webhook delivery dedupe markers for GitHub webhook replay protection (10-minute TTL);
per-repo in-flight markers for queue dedupe (10-minute TTL);
operational logs in our worker tier — credentials scrubbed before write — typically retained ≤ 30 days by Cloudflare.
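To make the short-TTL records above concrete, here is a minimal in-memory sketch of a store whose entries expire after 10 minutes, used both as a rate-limit counter and as a "seen this already" marker. The class `TtlStore`, its method names, and the key formats are assumptions for illustration; a production system would more likely rely on the expiry feature of its key-value store.

```python
import time
from typing import Callable, Optional

TTL_SECONDS = 10 * 60  # the 10-minute TTL from the list above


class TtlStore:
    """In-memory stand-in for a key-value store with per-key expiry (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expires_at, count)
        self._entries: dict[str, tuple[float, int]] = {}

    def _live(self, key: str) -> Optional[tuple[float, int]]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] <= self._clock():
            del self._entries[key]  # expired: forget it entirely
            return None
        return entry

    def increment(self, key: str) -> int:
        """Count a hit; the window starts at the first hit and is not extended."""
        entry = self._live(key)
        expires_at, count = entry if entry else (self._clock() + TTL_SECONDS, 0)
        self._entries[key] = (expires_at, count + 1)
        return count + 1

    def first_seen(self, key: str) -> bool:
        """Return True only the first time a key is seen within its TTL."""
        if self._live(key) is not None:
            return False
        self._entries[key] = (self._clock() + TTL_SECONDS, 1)
        return True
```

A signup handler could call `increment` with the hashed IP and reject the request above some threshold, while a webhook handler could call `first_seen` with the delivery identifier and skip replays. In both cases nothing about the key survives past the TTL.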

Lawful basis: legitimate interest (Art 6(1)(f)) — operating and securing the service.

3. What we do not collect or do

We do not use first-party tracking cookies. We use one localStorage flag to remember if you dismissed the cookie banner. That's it.
We do not use third-party analytics, fingerprinting, advertising pixels, or session replay.
We do not sell, rent, or share your data for advertising.
We do not train any model on your code or your prompts. We pass narration context to Anthropic under an agreement that excludes training on commercial API content (see §5).
We do not persist your source code outside the lifetime of a single ingest sandbox.

4. How long we keep things

Data	Retention
Source code in the ingest sandbox	Destroyed at end of job (job timeout ≤ 10 minutes)
Derived diagrams + changelogs + loose ends	Until you disconnect AND explicitly ask us to delete (we keep derived data after a disconnect so you can reconnect without losing history; you can ask for deletion any time)
Account data	Until account deletion (self-serve from `/account` or via email)
GitHub installation linkage	Until you uninstall the GitHub App AND ask us to delete the linkage row
Waitlist signups	Up to 24 months after your last contact with us, or until you ask for removal
Per-IP rate-limit hashes	10 minutes
Webhook delivery / queue-dedupe markers	10 minutes
Operational logs (worker tier)	≤ 30 days at Cloudflare
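As a worked example of the waitlist row, the sketch below computes the latest date by which a signup would be removed, 24 calendar months after the last contact. The helper names are hypothetical, and clamping to the last day of a shorter month is an assumption about how "24 months" is counted.

```python
import calendar
from datetime import date

WAITLIST_RETENTION_MONTHS = 24  # "up to 24 months after your last contact"


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of a shorter month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return d.replace(year=year, month=month0 + 1, day=min(d.day, last_day))


def waitlist_delete_by(last_contact: date) -> date:
    """Latest date on which a waitlist signup may still be held."""
    return add_months(last_contact, WAITLIST_RETENTION_MONTHS)
```

Under these assumptions, a last contact on 29 February 2024 gives a delete-by date of 28 February 2026, because 2026 has no 29 February.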

Where we delete data, we delete it; where the rule requires us to retain it (e.g. accounting records for tax purposes), we keep what the rule requires for as long as it requires.

5. Who else sees it (sub-processors + recipients)

We use the following sub-processors. Each has a Data Processing Agreement with us and uses EU Standard Contractual Clauses where data leaves the EU.

Provider	Role	What they see	Where
Supabase, Inc.	Database, auth, realtime	Account data, derived diagrams, derived changelogs	EU region (`eu-central-1`, Frankfurt). DPA · Trust Center
Cloudflare, Inc.	Pages, Workers, Queues, KV, D1	Lander signups (D1), worker job buffer, transient dedupe / rate-limit data	Global edge, with the EU jurisdiction option for our D1 database. DPA · Sub-processors
Anthropic, PBC	LLM narration of derived graph context	Module names, edge metadata, PR titles + bodies + diff metadata sent during enrichment	US. EU SCCs (Modules 2 + 3) under the Anthropic Commercial DPA. DPA
GitHub, Inc.	OAuth identity + the read-only GitHub App	Your GitHub identity + the source code you grant the App to read	US. EU SCCs under the GitHub Customer DPA.

We send the founder a Telegram direct message when a new email joins the waitlist, containing the email address and the approximate geo / ISP returned by Cloudflare. We treat Telegram FZ-LLC as a recipient (not a sub-processor) of this single message; if you'd rather we didn't notify the founder about your signup, tell us at [email protected] and we'll turn it off for you.

We will publish at least 30 days' notice on this page before adding a new sub-processor that processes customer data. (Telegram is intentionally not in this list — it sees only the waitlist signup notification, not customer data.)

6. International transfers

We are established in Estonia; our primary data store (Supabase) is in the EU (eu-central-1).

Some sub-processors are US-headquartered (Cloudflare, Anthropic, GitHub). For those transfers we rely on the European Commission's Standard Contractual Clauses (SCCs, Commission Implementing Decision (EU) 2021/914) incorporated into each sub-processor's DPA, together with the supplementary measures described on Security. The most important of these is that we never persist your source code, which sharply limits what a US authority can compel about it.

A summary Transfer Impact Assessment is available on request to [email protected].

7. Your rights

Under the GDPR + the Estonian Personal Data Protection Act (PDPA, 2018) you have the right to:

access the personal data we hold about you (Art 15);
rectify inaccurate data (Art 16);
erase your data (Art 17) — subject to retention rules we have to follow;
restrict processing (Art 18);
port your data to another service (Art 20) — we'll export your diagrams as JSON on request;
object to processing based on legitimate interest (Art 21);
withdraw consent at any time, for processing based on consent (Art 7(3));
lodge a complaint with the Estonian Data Protection Inspectorate (Andmekaitse Inspektsioon, Tatari 39, 10134 Tallinn, [email protected], +372 627 4135) or with the supervisory authority in your EU member state of residence.

To exercise any of these rights, email [email protected]. We'll respond within one month, per Art 12(3); if your request is complex we may extend by two months and tell you why. We do not charge for handling a request unless it is manifestly unfounded or excessive.

8. Security

We describe what we do — and what we don't do — on the Security page. Highlights: TLS in transit; encryption at rest by our cloud providers; ephemeral sandboxes for source code; least-privilege GitHub App scope; secrets only inside the worker tier; safety budgets against pathological inputs.

If you think you've found a security issue, write to [email protected] (or [email protected]) and we'll respond within one business day. We do not yet run a bug bounty.

9. Children

Clew is for adults. If we learn we've collected data from a child under 16 without the consent of a parent or guardian, we'll delete it.

10. Changes

We'll update this page when we change a load-bearing fact and bump the date at the top. For material changes affecting your rights, we'll email the address on your account at least 14 days before the change takes effect.

11. AI Act transparency

The Clew product uses a general-purpose AI model (Anthropic Claude) to name and narrate the modules + per-module changelog you see in the diagram. The diagram structure is derived deterministically from your source code; the model never authors it. Strings the model produced are tagged in the UI so you can question them. This disclosure is provided in advance of the Article 50 EU AI Act transparency-obligation deadline (2 December 2026 per the May 2026 Commission consultation).

Backthread OÜ · registration number [REGISTRATION NUMBER] · registered office [REGISTERED ADDRESS], Estonia · [email protected]