Cognitive Integrity Lab · interaction layer
About this prototype
This site is an early prototype of a different way of thinking about trust in AI-mediated communication. The core idea: rather than trying to detect AI-generated text after the fact — guessing from word patterns, em-dashes, or polish — we certify the process by which a piece of writing was produced.
Each composition finalized here produces a cryptographically signed receipt recording the editor mode in effect, the time of signing, and a one-way fingerprint of the exact text. The receipt is publicly verifiable, but the writing itself is never stored on this server.
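As a rough illustration of what such a receipt might cover, the sketch below serializes the three recorded items (mode, signing time, text fingerprint) into deterministic bytes suitable for signing. The field names `mode`, `signed_at`, and `text_sha256`, the timestamp format, and the sorted-key JSON encoding are all assumptions made for this example; the real layout is defined in SPEC.md and the published source.

```python
import json
from datetime import datetime, timezone


def build_payload(mode: str, text_sha256_hex: str, signed_at: datetime) -> bytes:
    """Serialize hypothetical receipt fields to a deterministic byte string."""
    fields = {
        "mode": mode,  # hypothetical field name
        # Assumed format: UTC, second precision, trailing "Z".
        "signed_at": signed_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "text_sha256": text_sha256_hex,  # hypothetical field name
    }
    # Sorted keys and fixed separators yield the same bytes on every run,
    # which is what makes the payload safe to sign and re-verify.
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


example = build_payload(
    "ai_free",
    "ab" * 32,  # stand-in for a real 64-character hex fingerprint
    datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(example.decode("utf-8"))
```

Only the fingerprint enters the payload, never the text, which is consistent with the writing itself not being stored on the server.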
For the full technical specification, see SPEC.md.
The cryptographic principle
Earlier cryptographic systems, such as the Enigma machine used by Nazi Germany, depended on shared secrets. Once those secrets were recovered, as they famously were by Turing’s team at Bletchley Park, the system’s communications could be read.
Modern public-key cryptography changed the game. Authenticity can now be verified publicly without exposing the private key that created the signature. That is the principle behind HTTPS, software signing, banking security, and cryptocurrencies.
We apply the same idea to AI-mediated communication: not detecting what “sounds human,” but verifying the conditions under which a message was produced.
What you know with mathematical certainty
When the verify page shows a green “Valid” banner, two things have been mathematically checked:
- Text integrity. The text in your hands matches, character for character (after small whitespace normalization), the text that was originally signed. Change one comma, and verification fails.
- Signature authenticity. That text fingerprint, paired with a timestamp, was signed using the private key paired with the public key shown below.
Both checks happen in your browser. The text you paste is not sent to this server or any other server. The signature can also be verified using any standard Ed25519 verifier outside this site, with no trust in this site required.
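The text-integrity check can be sketched as follows. This is a minimal illustration, not the site's code: it assumes the fingerprint is SHA-256 over UTF-8 bytes, and that normalization means unifying line endings and trimming trailing whitespace. The actual hash function and normalization rules are whatever the published source defines.

```python
import hashlib


def normalize(text: str) -> str:
    """Assumed normalization: unify line endings, trim trailing whitespace."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = [line.rstrip() for line in text.split("\n")]
    return "\n".join(lines).strip()


def fingerprint(text: str) -> str:
    """One-way fingerprint of the normalized text (SHA-256 is an assumption)."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def text_matches(candidate: str, signed_fingerprint: str) -> bool:
    """True if the candidate text hashes to the fingerprint in the receipt."""
    return fingerprint(candidate) == signed_fingerprint


signed = fingerprint("Hello, world.")
print(text_matches("Hello, world.\r\n", signed))  # whitespace-only difference
print(text_matches("Hello world.", signed))       # one comma removed
```

The first call prints True because the difference disappears under normalization; the second prints False because removing a single comma changes the hash.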
What you know with very high confidence
That the code running on this server matches the source code published in the open. The full source is at github.com/cogintegritylab/interaction-layer. The hosting provider (Vercel) records the public commit SHA each deployment is built from; the deployment page is itself public. So anyone can trace any verify page back to a specific public commit.
This is not a mathematical guarantee — it relies on Vercel as a trusted intermediary. A full mathematical guarantee here would require what is called “reproducible builds with cryptographic attestations”: a public, automated process whose build output is signed and matchable byte-for-byte against the public source. The method is well established in supply-chain security, and bringing this project under it is on our roadmap.
What we cannot prove yet
- Thinking. The receipt does not prove whose thinking shaped the words — only that the text passed through a tool whose paste, copy, and drag protections were in effect. Some input channels — voice dictation, accessibility tools — cannot be distinguished from manual typing. And no tool can determine what the writer was reading, referring to, or thinking through as they typed. Stronger enforcement of composition conditions would require a verified client, which is on the protocol roadmap.
- Timestamp independence. The timestamp is not yet independently anchored: the current implementation uses the issuer’s server clock. An independent timestamping service (such as OpenTimestamps, which anchors signatures to a public ledger) would provide times verifiable without trusting any single party. This is on the roadmap.
Why we don’t try to detect
A person can prompt a model, lightly edit the output, and pass along clever or misguided ideas they don’t understand; nothing in a detection result reveals that. And the question remains: was anyone home?
Detection by style also damages what it claims to protect. Em-dashes, semicolons, and well-formed paragraphs become evidence against the writer; choppy prose is rewarded, care is punished. Most stylistic formulas exist for reasons. The deeper cost of abandoning them is ceding authority over language to whoever runs the largest models and detectors. Language stops being a distributed inheritance and becomes a centralized verdict on what counts as human.
This tool does not look at your prose. Use em-dashes to your heart’s content; use phrases critics have decided “sound AI”; write in whatever voice suits you — none of it is evidence about you. Instead of asking readers to inspect every sentence for signs of machine origin, the approach is to let writers commit to a process and let readers check the commitment with math. The writer offers a receipt; the reader verifies it; nobody is on trial for their punctuation.
The bigger picture
ai_free is one demonstration of the idea — and a simple one. It is not the endpoint, and it is not a moralistic stance against AI use. It is a proof of concept: a certified-process model can be public, lightweight, and independently verifiable. The same structure supports many other modes, each appropriate to different settings:
- cognitive_authorship_check: a follow-up assessment in which the writer is asked to extend the composition, consider alternatives, or restate it in different terms. Not a test of understanding; a test of whether the writer can inhabit the work as their own. This is the mode the Cognitive Integrity Lab is currently piloting.
- socratic_mode: composed with an AI assistant that may ask questions and offer challenges but never supplies substantive content.
- ai_assisted: composed under circumscribed AI involvement, with the writer specifying how AI may enter: light editing of grammar and style, translation, paraphrasing, structural feedback, or other declared roles. The mode constrains how AI may enter the composition rather than just disclosing that it did.
- classroom_exam_mode: produced under exam conditions agreed in advance and certified through the protocol rather than through proctoring. Suited to remote learning that wants neither surveillance software nor blue-book substitutes.
- Professional modes such as clinical_note_reviewed, legal_brief_certified_mode, or journalism_sourced_mode: placeholder names for what medical, legal, journalistic, and other professional communities might define on top of the protocol. The specific shape of these is for the communities to determine.
What counts as judgment varies from domain to domain, and should be defined by practitioners in each domain, not by AI or AI companies. A legal mode might require, for instance, that no confidential information was submitted to any AI service and that a named attorney reviewed the brief’s reasoning. A journalism mode might require declared sourcing chains. A clinical mode might require human review of any AI-suggested diagnosis. These specifics belong to the communities that practice each craft; the protocol gives them a way to specify where AI’s participation ends and human judgment begins.
The Cognitive Integrity Lab is the first issuer, but the format is designed to be open: in principle, email clients, learning-management systems like Canvas, universities, journals, hospitals, and other institutions could each issue their own signed receipts under a shared vocabulary. The value of the system is not that one lab signs everything forever; it is that anyone can issue, anyone can verify, and the protocol vocabulary is public.
Verifying independently
Every verify page shows four pieces sufficient to verify the signature without trusting this site:
- The public key (Ed25519 SPKI, base64).
- The signed payload, as canonical JSON — the exact bytes that were signed.
- The signature (base64).
- The URL of the verify page itself.
Any cryptography library or command-line tool that supports Ed25519 can confirm the signature using just those pieces. The full source code shows exactly how the canonical payload is constructed.
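For instance, a check outside this site might look like the sketch below. It assumes the third-party Python `cryptography` package is installed; the function name `verify_receipt` is made up for this example. Because no real receipt is reproduced here, the demonstration signs a hypothetical payload with a throwaway key pair rather than the lab's key.

```python
import base64

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)
from cryptography.hazmat.primitives.serialization import (
    Encoding,
    PublicFormat,
    load_der_public_key,
)


def verify_receipt(public_key_b64: str, payload: bytes, signature_b64: str) -> bool:
    """Check an Ed25519 signature over the exact payload bytes."""
    key = load_der_public_key(base64.b64decode(public_key_b64))
    if not isinstance(key, Ed25519PublicKey):
        raise ValueError("expected an Ed25519 public key")
    try:
        key.verify(base64.b64decode(signature_b64), payload)
    except InvalidSignature:
        return False
    return True


# Throwaway key pair, generated only so the example runs end to end.
demo_key = Ed25519PrivateKey.generate()
demo_public_b64 = base64.b64encode(
    demo_key.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
).decode("ascii")
demo_payload = b'{"mode":"ai_free"}'  # hypothetical payload bytes
demo_signature_b64 = base64.b64encode(demo_key.sign(demo_payload)).decode("ascii")

print(verify_receipt(demo_public_b64, demo_payload, demo_signature_b64))
print(verify_receipt(demo_public_b64, demo_payload + b" ", demo_signature_b64))
```

To check a real receipt, pass the public key, payload, and signature copied from a verify page. The payload must be supplied byte for byte as displayed, since any re-serialization of the JSON would change the signed bytes.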
The current Cognitive Integrity Lab signing key, key ID cil-v1:
MCowBQYDK2VwAyEAkW1e239Y+sREwqnQ8RZtRXlPJasTuV7LUiLJjoM8cJU=
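Some Ed25519 tools expect the raw 32-byte public key rather than the SPKI wrapper. The sketch below, using only the Python standard library, strips the fixed 12-byte DER header that precedes an Ed25519 key in SubjectPublicKeyInfo form; the helper name `raw_ed25519_key` is hypothetical.

```python
import base64

# Fixed DER header of an Ed25519 SubjectPublicKeyInfo structure (RFC 8410).
ED25519_SPKI_PREFIX = bytes.fromhex("302a300506032b6570032100")


def raw_ed25519_key(spki_b64: str) -> bytes:
    """Return the 32 raw key bytes from a base64-encoded Ed25519 SPKI key."""
    der = base64.b64decode(spki_b64, validate=True)
    if len(der) != 44 or not der.startswith(ED25519_SPKI_PREFIX):
        raise ValueError("not an Ed25519 SubjectPublicKeyInfo key")
    return der[len(ED25519_SPKI_PREFIX):]


CIL_V1 = "MCowBQYDK2VwAyEAkW1e239Y+sREwqnQ8RZtRXlPJasTuV7LUiLJjoM8cJU="
raw = raw_ed25519_key(CIL_V1)
print(len(raw), raw.hex())
```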
Roadmap
Current development focuses on closing the limits described above. The protocol specification defines a framework of increasingly strong assurance — independent timestamping, verified clients, and third-party institutional audit — that we are actively working toward. The full framework is in §14 of the protocol specification.
Attribution
Built by the Cognitive Integrity Lab at Temple University. Source code and issues: github.com/cogintegritylab/interaction-layer.