What We Collect, and What We Refuse To

The student activity model — and the surveillance features we left on the table.


A learning platform that knew nothing about its students would be useless: teachers need to see who is struggling, who has finished, and whether the work is genuinely theirs. A platform that knew everything would be a surveillance system wearing a classroom badge. The interesting design work lives between those two failures, and it comes down to a small number of explicit decisions about what to record, how to store it, and when to throw it away.

What we collect, and why

For each quiz or exercise attempt, we record only what serves a clear pedagogical or security purpose:

| Data | Why we keep it | How it's stored | Retention |
|---|---|---|---|
| Start / completion timestamps | Time-on-task; spotting who is stuck | Plain UTC time | Life of the educational record |
| Attempt duration | Engagement; integrity signals | Derived value | Life of the educational record |
| Scores | Progress tracking | Numeric (0–100) | Life of the educational record |
| Answers given | Item-level feedback and review | Structured response data | Life of the educational record |
| IP address | Detecting account sharing only | Salted one-way hash, never the raw address | 90 days, then auto-deleted |

Attempt records link to a student by an internal identifier, not by name embedded in the record. The student’s display name lives in one place, the roster, so that correcting or removing it is a single operation rather than a hunt across every row of activity they ever generated.
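As a rough illustration of that separation, here is a minimal Python sketch. The class names, field names, identifier format, and sample values are all hypothetical and are not taken from the actual schema; the point is only that the attempt record carries an internal identifier and the name lives in the roster alone.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RosterEntry:
    student_id: str    # internal identifier (format is an assumption)
    display_name: str  # the only place the student's name is stored


@dataclass
class AttemptRecord:
    student_id: str                   # link to the roster, never a name
    quiz_id: str
    started_at: datetime              # plain UTC time
    completed_at: Optional[datetime]  # None while the attempt is in progress
    score: Optional[float]            # 0-100
    answers: dict                     # structured response data

    @property
    def duration_seconds(self) -> Optional[float]:
        # Duration is derived from the two timestamps, not stored separately.
        if self.completed_at is None:
            return None
        return (self.completed_at - self.started_at).total_seconds()


def rename_student(roster: dict, student_id: str, new_name: str) -> None:
    # Correcting a name touches one roster entry and no activity rows.
    roster[student_id].display_name = new_name


# Illustrative data only.
roster = {"s-001": RosterEntry("s-001", "Sarah")}
attempt = AttemptRecord(
    student_id="s-001",
    quiz_id="quiz-3",
    started_at=datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc),
    completed_at=datetime(2024, 3, 1, 9, 18, tzinfo=timezone.utc),
    score=85.0,
    answers={"q1": "b", "q2": "d"},
)
rename_student(roster, "s-001", "Sara")
```

After the rename, the attempt record is untouched: it still points at `s-001`, and any view that needs a name looks it up in the roster at display time.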

The IP address decision

Account sharing is a real problem in classrooms: one student logs in for another, or a code circulates beyond the class. To detect it you need some notion of where a session came from. The naive solution is to store the IP address. We don’t, because under the GDPR an IP address is personal data, and storing it indefinitely to catch occasional cheating fails any reasonable balancing test.

Instead we store a SHA-256 hash of the IP combined with a per-tenant salt. This is a one-way transformation: we can tell that ten sessions came from the same place (because the hashes match) without ever being able to recover the place itself. The per-tenant salt means the same address in two different schools produces two different hashes, so nothing can be correlated across customers. And the hash is deleted automatically after 90 days — long enough to investigate a live integrity concern, short enough that we are not sitting on a quietly growing pile of network identifiers.
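The scheme described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: the function name `hash_ip`, the `|` separator, the canonicalisation step, and the way salts are generated here are all hypothetical. Only the use of SHA-256 with a per-tenant salt comes from the text.

```python
import hashlib
import ipaddress
import secrets


def hash_ip(raw_ip: str, tenant_salt: bytes) -> str:
    """Return a salted SHA-256 hex digest of an IP address.

    The address is canonicalised first (an assumption of this sketch) so
    that different spellings of the same IPv6 address hash identically.
    """
    canonical = ipaddress.ip_address(raw_ip).compressed.encode("ascii")
    return hashlib.sha256(tenant_salt + b"|" + canonical).hexdigest()


# Hypothetical per-tenant salts; in practice these would be generated once
# per school and stored as secrets, not regenerated on each run.
salt_school_a = secrets.token_bytes(32)
salt_school_b = secrets.token_bytes(32)

h1 = hash_ip("192.0.2.10", salt_school_a)
h2 = hash_ip("192.0.2.10", salt_school_a)
h3 = hash_ip("192.0.2.10", salt_school_b)

same_place = h1 == h2        # same tenant, same address: hashes match
cross_tenant = h1 == h3      # different tenants: hashes differ
```

One design point worth stating: the IPv4 address space is small enough to enumerate, so in a sketch like this the one-way property depends on the salt being kept secret and stored separately from the hashes.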

Holding periods: why two clocks, not one

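Not all data should age the same way, so we run two retention clocks rather than one blunt policy. Hashed IP addresses sit on the short clock: 90 days, then automatic deletion. Everything else in the attempt record (timestamps, duration, scores, answers) sits on the long clock and is kept for the life of the educational record.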

Separating the two clocks is itself a data-minimisation decision. The data with the highest privacy sensitivity and the lowest long-term value — network identifiers — expires fastest, automatically. The data the teacher actually relies on outlives it but never outlives the student’s relationship with the school.
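A minimal sketch of how two clocks might be enforced, assuming session and attempt rows are plain dictionaries. The function names, the field names (`ip_hash`, `recorded_at`, `student_id`), and the idea of running the short clock as a periodic job are assumptions made for illustration; only the 90-day figure and the "life of the educational record" rule come from the text.

```python
from datetime import datetime, timedelta, timezone

IP_HASH_RETENTION = timedelta(days=90)  # short clock, from the policy above


def purge_expired_ip_hashes(sessions: list, now: datetime) -> int:
    """Short clock: blank out IP hashes older than 90 days.

    Returns the number of hashes cleared. The rest of the row is kept.
    """
    cutoff = now - IP_HASH_RETENTION
    cleared = 0
    for session in sessions:
        if session["ip_hash"] is not None and session["recorded_at"] < cutoff:
            session["ip_hash"] = None
            cleared += 1
    return cleared


def purge_student_record(attempts: list, student_id: str) -> list:
    """Long clock: drop a student's attempts when their record ends."""
    return [a for a in attempts if a["student_id"] != student_id]


# Illustrative data only.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
sessions = [
    {"student_id": "s-001", "ip_hash": "ab12", "recorded_at": now - timedelta(days=91)},
    {"student_id": "s-002", "ip_hash": "cd34", "recorded_at": now - timedelta(days=10)},
]
cleared = purge_expired_ip_hashes(sessions, now)
remaining = purge_student_record(sessions, "s-001")
```

The two functions are deliberately independent: the short clock runs on age alone, while the long clock is triggered by an event (the student's record ending) rather than by a timer.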

Our legal basis: legitimate interest, documented

We process this activity data under legitimate interest (GDPR Article 6(1)(f)) rather than consent. Consent is the wrong instrument here: a student cannot meaningfully refuse the basic activity logging that a graded assignment requires, and a “consent” that cannot be declined is not consent. Legitimate interest is honest about what is happening — the school has a genuine educational and integrity interest, and we have minimised the data so that interest is not outweighed by the intrusion. That balancing is written down, not assumed, and the data minimisation described above is what makes it defensible.

What teachers see — and don’t

Collecting data responsibly is only half the job; exposing it responsibly is the other half. Teachers see actionable insight: “Sarah spent 18 minutes on Quiz 3, against a class average of 25,” or a flag that a quiz was completed implausibly fast, or an account-sharing alert. They do not see raw IP hashes, forensic logs, or precise second-by-second timelines. The dashboard is built to answer teaching questions, not to enable monitoring of a child’s every move.
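One way to picture that boundary is a projection function that turns stored rows into the small set of fields a dashboard shows. The sketch below is hypothetical throughout: the function name, the field names, and in particular the `FAST_FRACTION` threshold for "implausibly fast" are assumptions, not the platform's actual rule.

```python
from statistics import mean

# Hypothetical threshold: flag attempts shorter than a quarter of the
# class average. The real integrity rule is not described in this note.
FAST_FRACTION = 0.25


def teacher_summary(attempts: list, display_names: dict) -> list:
    """Project stored attempt rows into teacher-facing rows.

    Only whole minutes, the class average, and a boolean flag are exposed;
    IP hashes and second-level timings never leave this function.
    """
    average_s = mean(a["duration_s"] for a in attempts)
    rows = []
    for a in attempts:
        rows.append({
            "student": display_names[a["student_id"]],
            "minutes": round(a["duration_s"] / 60),
            "class_average_minutes": round(average_s / 60),
            "implausibly_fast": a["duration_s"] < FAST_FRACTION * average_s,
        })
    return rows


# Illustrative data only.
display_names = {"s-001": "Sarah", "s-002": "Ben", "s-003": "Chloe", "s-004": "Dev"}
attempts = [
    {"student_id": "s-001", "duration_s": 1080, "ip_hash": "ab12"},
    {"student_id": "s-002", "duration_s": 2400, "ip_hash": "cd34"},
    {"student_id": "s-003", "duration_s": 2400, "ip_hash": "ef56"},
    {"student_id": "s-004", "duration_s": 120, "ip_hash": "ab12"},
]
summary = teacher_summary(attempts, display_names)
```

With this sample data the first row reads as 18 minutes against a class average of 25, and only the 2-minute attempt is flagged.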

What we deliberately refuse to collect

The clearest statement of a privacy philosophy is the list of easy features you turned down. We explicitly rejected each of these:

Each of these would have been straightforward to build, and some would have made a tidy bullet point on a feature comparison. We left them out on purpose. A platform for children should be judged as much by what it declines to know as by what it does.


This note is adapted from Lesson Commons’ internal architecture decision record on student activity tracking. It describes design intent and current implementation; it is not legal advice. Schools remain responsible for their own data-protection obligations as controllers.

This document was written with the assistance of Claude (Anthropic). The author defined the purpose, audience, and main ideas, directed the editorial approach, and edited the final text.