The Missing Foundation

On the rotten abstractions beneath digital life, and what an honest replacement would require

Mar 10, 2026

Every building rests on a foundation. If the foundation is unsound, it does not matter how elegant the floors are, how tasteful the fixtures, how ingenious the plumbing. The building is unsound. You can paper over cracks. You can install load-bearing furniture. You can hire consultants to explain that the cracks are actually features. The foundation remains what it is.

The foundations of digital life are unsound. Not in the trivial sense that software has bugs, or that interfaces are ugly, or that corporations are greedy — though all of these are true. They are unsound in the structural sense: the fundamental abstractions upon which everything is built were never designed to work together, were never designed to serve the person using them, and have not been seriously re-examined since the 1970s.

I want to be precise about what I mean, because vague complaints about technology are cheap. There are exactly five foundational abstractions that every person’s digital existence depends upon, and every one of them is broken.

I. The File Is Dumb

A file is an opaque sequence of bytes identified by a path. Your operating system does not know that a JPEG is a photograph of your daughter. It does not know that two documents are versions of the same report. It does not know that the spreadsheet you received by email is the same spreadsheet you edited last week, now with someone else’s changes. It cannot know these things, because the file abstraction carries no semantic information whatsoever. A file is bytes at a location. That is all it is. That is all it has ever been.

Every application that wants to do anything intelligent with files must build its own model of what those files mean, from scratch, in isolation. Your photo application maintains its own database of image metadata. Your email client maintains its own index. Your note-taking application maintains its own graph of relationships between notes. None of these models talk to each other. None of them can. They are private interpretations of the same dumb bytes, each trapped in its own silo, each reinventing concepts — authorship, versioning, relationships, search — that should have been provided by the foundation.

The file abstraction was designed in an era when storage was expensive and computers were shared by multiple users via terminals. It solved the right problem for 1971. It is now 2026, and we are still building on it, not because it is good, but because it is there.

II. The Link Rots

The web promised to connect everything. It delivered something subtler and more corrosive: a system where every reference is a promissory note issued by a server operator.

A URL does not identify content. It identifies a location where content might be found, if the server is still running, if the domain has not lapsed, if the operator has not reorganised their paths, if the content has not been removed for legal, commercial, or arbitrary reasons. Estimates vary with the study and the kind of page measured, but the half-life of a URL is counted in years, not decades; by some estimates the average web page survives roughly two years before its URL ceases to resolve. The web is a library where the catalogue entries spontaneously erase themselves.

The alternative has been known since 1979, when Ralph Merkle demonstrated that content can be identified by its own cryptographic hash — what it is, not where it sits. A content address is a mathematical fact. It does not depend on a server. It does not decay. It does not require anyone’s permission or continued goodwill. The same content always produces the same address, on any machine, at any time, verifiable by anyone. Git uses this principle. IPFS uses it. Every serious distributed system eventually converges on it. The web did not adopt it, and every rotten link is a consequence of that decision.
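To make the idea concrete, here is a minimal sketch of a content address in Python. It assumes SHA-256 as the hash function; the function names `content_address` and `matches` are illustrative and not taken from Git, IPFS, or any other system named above, each of which has its own encoding rules.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Return a hex SHA-256 digest that serves as the content's address."""
    return hashlib.sha256(data).hexdigest()

def matches(data: bytes, address: str) -> bool:
    """Check that the bytes really are the content the address names."""
    return content_address(data) == address

report = b"Quarterly report, draft 3\n"
address = content_address(report)

# The same bytes give the same address wherever and whenever they are hashed.
assert content_address(b"Quarterly report, draft 3\n") == address
# Any change to the content changes the address, so tampering is detectable.
assert not matches(b"Quarterly report, draft 4\n", address)
```

Nothing in the check involves a server: whoever holds the bytes and the address can verify the match alone.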

III. Identity Is Rented

Your digital identity is a username and password on someone else’s server. You prove who you are by asking a corporation to vouch for you. Lose access to the platform, lose your identity. Get locked out, and your data — your messages, your documents, your photographs, your relationships — goes with it. You do not own what another man can revoke.

This is not a technical necessity. It is a business model disguised as an architecture. Public-key cryptography has existed since 1976. The mathematics required for a person to generate their own unforgeable identity, on their own device, without asking anyone’s permission, has been available for half a century. You generate a key pair. The private key stays on your device. The public key is your identity. You sign things with it. Anyone can verify your signature without contacting a server. If you lose a device, you revoke that device’s key from your own key history — an append-only log that you control.

This is how SSH works. This is how Signal works. This is how every cryptographic system designed by serious people works. The fact that your email address, your social media presence, your cloud storage, and your messaging applications all use the account-first model — where a corporation issues your identity and can destroy it — is not because the alternative is unknown. It is because the alternative does not produce a captive audience for advertisements.
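The sign-and-verify step can be sketched with nothing but a hash function. The example below uses a Lamport one-time signature, chosen only because it fits in the Python standard library; it is not what SSH or Signal use, and a real system would more likely rely on a scheme such as Ed25519 from a vetted library. Each key pair here may sign exactly one message. All function names are illustrative.

```python
import hashlib
import secrets

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def generate_keypair():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public

def _bits(message: bytes):
    """The 256 bits of the message digest, most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(private, message: bytes):
    """Reveal one secret from each pair, chosen by the message digest bits."""
    return [pair[bit] for pair, bit in zip(private, _bits(message))]

def verify_signature(public, message: bytes, signature) -> bool:
    """Anyone holding the public key can check the signature offline."""
    if len(signature) != 256:
        return False
    return all(_h(sig) == pair[bit]
               for sig, pair, bit in zip(signature, public, _bits(message)))

# The private key never leaves the device; the public key is the identity.
private_key, public_key = generate_keypair()
signature = sign(private_key, b"I wrote this")

assert verify_signature(public_key, b"I wrote this", signature)
assert not verify_signature(public_key, b"I wrote that", signature)
```

Verification needs only the public key, the message, and the signature. No account, and no server, takes part.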

IV. Trust Is Outsourced

When you use a social platform, you delegate the question “whose assertions should I believe?” to an algorithm you cannot inspect, designed by people whose incentive is to maximise your engagement, not your understanding. When the platform decides what is misinformation, it decides for hundreds of millions of people simultaneously, using criteria that are opaque, inconsistent, and subject to political pressure. When it gets the decision wrong — and it regularly gets the decision wrong, in both directions — there is no recourse except to appeal to the same authority that made the error.

Trust, in any honest system, is personal. I trust certain people about certain things. You trust different people about different things. Neither of us is wrong, because trust is not a fact about the world. It is a judgement about relationships. A system that respects this would let each person maintain their own trust configuration — who they trust, about what, to what degree, with what decay over time — and would derive different views of the same underlying data based on those configurations. My view of a conversation would reflect the moderators I trust. Your view would reflect the moderators you trust. The data would be the same. The evaluation would be personal.

This is not utopian. It is how trust works in the physical world. You do not have a single, centralised authority that decides which of your neighbours is trustworthy. You form your own judgements, informed by the judgements of people you already trust. The computational infrastructure for this — signed attestations, trust path computation, policy-driven evaluation — is well-understood. It has not been built into any major platform because centralised trust is how platforms maintain control. If you can decide for yourself whom to trust, you do not need the platform to decide for you. And if you do not need the platform, the platform does not need you to look at advertisements.
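A toy version of per-person views might look like the following. It is a sketch under simplifying assumptions: moderation judgements are plain records rather than signed attestations, trust is a yes-or-no set of moderators, and every name (`Flag`, `personal_view`, the moderators and message ids) is invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Flag:
    moderator: str   # who made the judgement
    message_id: str  # which message it is about
    verdict: str     # "hide" or "keep"

def personal_view(messages, flags, trusted_moderators):
    """Return the message ids one reader sees, given whom that reader trusts."""
    hidden = {f.message_id for f in flags
              if f.verdict == "hide" and f.moderator in trusted_moderators}
    return [m for m in messages if m not in hidden]

# The underlying data is shared by everyone.
messages = ["m1", "m2", "m3"]
flags = [Flag("alice", "m2", "hide"), Flag("bob", "m3", "hide")]

# The evaluation is personal.
my_view = personal_view(messages, flags, trusted_moderators={"alice"})
your_view = personal_view(messages, flags, trusted_moderators={"bob"})

assert my_view == ["m1", "m3"]
assert your_view == ["m1", "m2"]
```

Neither view alters the stored messages or flags. The filtering happens at read time, on the reader's side.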

V. Meaning Is Absent

The most fundamental absence is the one least discussed. There is no layer of meaning in the digital foundation. The relationship between a document and its author is not expressed anywhere that a machine can find and verify. The fact that two files are versions of the same thing is not expressed. The fact that a photograph depicts a specific person, that a message is a reply to another message, that a dataset was produced by a specific instrument — none of these semantic relationships exist in the infrastructure. They exist only in the private databases of applications, each using its own schema, each inaccessible to every other application.

The Semantic Web tried to solve this. It failed, for reasons that are instructive. It failed not because the idea was wrong — the idea that semantic relationships should be first-class, queryable data is obviously correct — but because it was built on URLs (which rot), required centralised ontology agreement (which never came), and offered no answer to the question of trust (anyone could assert anything, and there was no mechanism for evaluating whose assertions mattered). The idea was right. The foundation was wrong. And so the Semantic Web became an academic exercise, and the relationships between things remained trapped in corporate databases, exploited for profit, inaccessible to the people who created them.

• • •

These five failures are not independent. They are symptoms of a single architectural vacancy. There is no unified layer where content, identity, meaning, and trust coexist. Every application must reinvent each of these from scratch, in isolation, usually by delegating them to a corporation that has its own interests in maintaining the dependency.

The question is what a correct foundation would look like. Not a perfect foundation — perfectionism is the enemy of existence — but a correct one: one that gets the fundamental principles right, even if the implementation is incomplete.

I think the requirements are these:

Content must be identified by what it is, not where it lives. Hash-based addressing, deterministic encoding, self-verifying integrity. The same content must produce the same identifier on every machine, now and forever, without requiring anyone’s server to remain operational.
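Deterministic encoding matters because structured data can be serialised in many byte-for-byte different ways, and a hash only sees bytes. The sketch below assumes JSON with sorted keys and fixed separators as the canonical form; that is a simplification, and a production scheme would need stricter rules (for number formatting, for instance). The names `canonical_bytes` and `record_id` are illustrative.

```python
import hashlib
import json

def canonical_bytes(record: dict) -> bytes:
    """Encode a record the same way every time: sorted keys, fixed separators, UTF-8."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def record_id(record: dict) -> str:
    """Hash the canonical encoding to get the record's identifier."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()

a = {"title": "Report", "year": 2026}
b = {"year": 2026, "title": "Report"}  # same content, different key order

assert canonical_bytes(a) == canonical_bytes(b)
assert record_id(a) == record_id(b)
assert record_id(a) != record_id({"title": "Report", "year": 2027})
```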

Identity must be self-sovereign. Generated on the device, controlled by the person, revocable at the device level without destroying the identity itself. No accounts. No platforms. No rented existence.
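Device-level revocation can be pictured as an append-only log in which each entry commits to the hash of the one before it. The sketch below is an assumption about how such a key history might be shaped, not a description of an existing protocol. The entries are left unsigned to keep the example short; in a real system each entry would also be signed by a key that was valid at the time. Field names and device labels are hypothetical.

```python
import hashlib
import json

def _entry_hash(entry: dict) -> str:
    body = {k: entry[k] for k in ("prev", "action", "device_key")}
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def append(log: list, action: str, device_key: str) -> None:
    """Add an entry that commits to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else None
    entry = {"prev": prev, "action": action, "device_key": device_key}
    entry["hash"] = _entry_hash(entry)
    log.append(entry)

def chain_is_intact(log: list) -> bool:
    """Detect any edit, removal, or reordering of earlier entries."""
    prev = None
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _entry_hash(entry):
            return False
        prev = entry["hash"]
    return True

def active_devices(log: list) -> set:
    """Replay the log: the identity persists while device keys come and go."""
    active = set()
    for entry in log:
        if entry["action"] == "add":
            active.add(entry["device_key"])
        elif entry["action"] == "revoke":
            active.discard(entry["device_key"])
    return active

history = []
append(history, "add", "laptop-key")
append(history, "add", "phone-key")
append(history, "revoke", "phone-key")  # the phone was lost

assert chain_is_intact(history)
assert active_devices(history) == {"laptop-key"}
```

Losing the phone removes one key from the active set. The log, and so the identity it describes, carries on.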

Relationships must be first-class, signed, and queryable. Not hidden in application databases. Not expressed in fragile URLs. Semantic triples — subject, predicate, object — signed by their author, verifiable by anyone, stored alongside the content they describe. The relationship “this document was authored by this person” should be as real and as findable as the document itself.
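As a rough illustration of relationships as queryable data, the sketch below stores triples in a dictionary keyed by a hash of their contents and answers simple pattern queries. It leaves out the signing itself: the `signature` field is a placeholder string, and the identifiers (`doc:report`, `key:alice`, the predicate names) are invented for the example rather than drawn from any existing vocabulary.

```python
import hashlib
import json
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    author: str     # public key of whoever asserts the triple (placeholder here)
    signature: str  # placeholder: a real system would carry a verifiable signature

def triple_id(t: Triple) -> str:
    """Content address of the assertion: same statement by same author, same id."""
    encoded = json.dumps([t.subject, t.predicate, t.obj, t.author]).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def query(store, subject=None, predicate=None, obj=None):
    """Return triples matching every field that is not None."""
    return [t for t in store.values()
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)]

store = {}
for t in [
    Triple("doc:report", "authoredBy", "key:alice", "key:alice", "sig-placeholder"),
    Triple("doc:report-v2", "versionOf", "doc:report", "key:alice", "sig-placeholder"),
]:
    store[triple_id(t)] = t

# "Who wrote this document?" becomes an ordinary query.
authors = [t.obj for t in query(store, subject="doc:report", predicate="authoredBy")]
assert authors == ["key:alice"]
```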

Trust must be explicit, personal, and computable. Not delegated to a platform. Not binary. Multi-dimensional: I trust this person about this topic to this degree, decaying over time, informed by the trust of people I already trust. Different people seeing different views of the same data is not a bug. It is the only honest architecture for a world where people genuinely disagree.
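One way to make such trust computable is sketched below. It assumes a particular model that the essay does not prescribe: trust edges are scoped to a topic, each edge loses half its weight per year of age, and trust along a path is the product of its edge weights, taking the best path of at most three hops. The people, topics, numbers, and the one-year half-life are all made up for illustration.

```python
# (truster, trustee, topic) -> (degree in [0, 1], age of the judgement in days)
edges = {
    ("me", "ana", "medicine"): (0.9, 0),
    ("ana", "ben", "medicine"): (0.8, 365),
    ("me", "ben", "cooking"): (0.5, 0),
}

HALF_LIFE_DAYS = 365.0  # assumed decay rate

def decayed(degree: float, age_days: float) -> float:
    """Halve the weight of a judgement for every half-life that has passed."""
    return degree * 0.5 ** (age_days / HALF_LIFE_DAYS)

def trust(source: str, target: str, topic: str, max_hops: int = 3) -> float:
    """Best product of decayed edge weights over any path of at most max_hops edges."""
    best = 0.0
    frontier = [(source, 1.0, {source})]
    for _ in range(max_hops):
        next_frontier = []
        for node, score, seen in frontier:
            for (a, b, t), (degree, age) in edges.items():
                if a != node or t != topic or b in seen:
                    continue
                s = score * decayed(degree, age)
                if b == target:
                    best = max(best, s)
                else:
                    next_frontier.append((b, s, seen | {b}))
        frontier = next_frontier
    return best

# I have no direct opinion of ben on medicine, but ana does, and I trust ana.
indirect = trust("me", "ben", "medicine")   # about 0.36: 0.9 * (0.8 halved by age)
direct = trust("me", "ben", "cooking")      # my own judgement, undecayed
unknown = trust("me", "ana", "cooking")     # no path on this topic
```

Trust in ben on medicine says nothing about ben on cooking, and someone with different edges would compute different numbers from the same attestations.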

Data must live locally by default. Networking must be explicit. Sync must be merge-based. The person must own their data in the material sense — it is on their device, encrypted, and nothing leaves without their instruction. “Local-first” is not a feature. It is a prerequisite for every other property on this list. If your data lives on someone else’s server, every other guarantee is a gentleman’s agreement that the server operator can revoke at will.
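Merge-based sync can be sketched with one of the simplest CRDTs, a last-writer-wins map. This is a deliberately small example: it assumes every write is tagged with a timestamp and a replica id that together are unique, and it silently drops the older of two concurrent writes to the same key, which richer CRDTs avoid. The replica names and fields are illustrative.

```python
def merge(a: dict, b: dict) -> dict:
    """Merge two replicas of a last-writer-wins map.

    Each replica maps key -> (timestamp, replica_id, value). For every key the
    entry with the greater (timestamp, replica_id) pair wins, so the result
    does not depend on the order in which replicas are merged.
    """
    merged = dict(a)
    for key, entry in b.items():
        if key not in merged or entry[:2] > merged[key][:2]:
            merged[key] = entry
    return merged

# Two devices edited the same note while offline.
laptop = {"title": (1, "laptop", "Draft"), "body": (3, "laptop", "Hello")}
phone = {"title": (2, "phone", "Final"), "tags": (1, "phone", "work")}

synced = merge(laptop, phone)

assert synced == merge(phone, laptop)   # order of merging does not matter
assert merge(synced, phone) == synced   # merging again changes nothing
assert synced["title"][2] == "Final"    # the later write to "title" wins
```

Because merging is commutative and idempotent, the two devices can exchange state directly, in either direction and any number of times, and still converge.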

The system must interoperate with the existing world. This is the requirement that visionaries most often refuse, and it is the one that determines whether a system exists in reality or only in its creator’s imagination. Plan 9 was technically superior to Unix in nearly every respect and did not matter. Xanadu envisioned content addressing and bidirectional links in 1963 and is still not finished. Technical superiority is necessary and insufficient. A correct foundation must provide bridges to the existing world — to files, to the web, to existing applications — or it will remain a beautiful idea that nobody uses.

• • •

I am aware that this reads as a shopping list for a system that does not exist. I am also aware that every item on this list has been implemented, somewhere, by someone. Content addressing exists in Git and IPFS. Self-sovereign identity exists in dozens of cryptographic protocols. Signed semantic triples exist in various Linked Data implementations. Trust computation exists in the academic literature and in a few niche applications. Local-first architectures exist in software built on CRDTs (conflict-free replicated data types). Each of these works in isolation.

What does not exist — or does not yet exist in a form that has reached ordinary people — is the integration. A single coherent model where all five of these properties reinforce each other rather than existing as separate libraries stitched together with middleware and prayer.

The integration is the hard part. Not because the individual pieces are difficult — they are well-understood — but because the integration requires a single mind, or a small group of aligned minds, willing to hold the entire problem in view simultaneously and make consistent architectural decisions across every layer. This is rare. It is rare because it is not rewarded by the market. The market rewards applications, not foundations. It rewards products that solve one problem visibly, not substrates that solve many problems invisibly. The person who builds a better photo-sharing application gets funded. The person who builds a better foundation gets a PhD and a Wikipedia page about a system that was never finished.

And yet the foundation is what matters. Applications come and go. Platforms rise and fall. MySpace is a punchline. Facebook is a holding action. Twitter was sold to a man who renamed it after a letter of the alphabet. The content people created on these platforms — their messages, their photographs, their relationships, their communities — was lost, degraded, or held hostage, because none of it was built on a foundation that the users controlled. The platforms were landlords. The users were tenants. And when the landlord decided to renovate, demolish, or sell, the tenants had no recourse, because they had never owned the walls.

The question is not whether a correct foundation is possible. The question is whether anyone will build one and then do the unglamorous work of making it usable by people who do not care about cryptographic hash functions, semantic triples, or trust path computation. People who just want their photographs to be theirs, their messages to be private, their identity to survive the next corporate acquisition, and their digital life to be a thing they own rather than a thing they are permitted.

The mathematics is ready. The engineering is understood. The principles are clear. What has been missing is not knowledge but will — the will to build a foundation rather than another floor, and to build it in the world rather than in a paper.

The floor is not the foundation. The application is not the substrate. The product is not the principle. We have been building floors for fifty years. The foundation is still missing. Someone should pour it.

Ma An-Zuo

4d · Edited

Building trust on a trustless system.

Peter McHale

19h

Here is Pedro Domingos acknowledging the ontological difficulties that challenged the early Semantic Web, and then positing that machine learning can help overcome them. That was filmed 12 months before ChatGPT was released. The relevant part of the video is just 4 minutes or so. https://www.youtube.com/live/x72De1pdkFY?t=3064s


Craig’s Substack
