Code & Documentation: Map ⇄ Territory

One thing that AI systems don't seem to know much about is AI systems.

From my post on the future of 'documentation';

But the thing that occurred to me this morning, as I was playing with a local LLM and asking it for help about some technical details around how to configure it, was that there is an opportunity for these models to be "self-documenting" that seems to be being missed.

This continues to be a frustration; models that lack an awareness or understanding of the context that the user is experiencing them in can only possibly hallucinate when asked about 'themselves'.

For example;

Copilot in Excel is writing me a prompt to copy and paste into Copilot in Excel, because Copilot in Excel can’t do what I’m asking for directly, but it’s OK because this is exactly what Copilot in Excel is great for!

What was happening then was that Copilot in Excel (at the time, in the UK) was simply a different 'view' for M365 Copilot. Same system, same chatbot, in a different place; it had no way of 'knowing' that it wasn't in the 'regular' website window, and although it 'knew' about Copilot's 'agent mode' in Excel, it clearly didn't 'know' that it hadn't been rolled out in the UK yet.

(Or that I was in the UK.)

In other words – the model had some 'knowledge' about the Copilot products, but there was specific "self-knowledge" that the LLM really needed to be able to answer the question properly. But didn't have it…

Don't these chatbots have 'knowledge bases'? Shouldn't there be something in the system prompt? "You are Microsoft Copilot, a chatbot in the sidebar of the Microsoft Excel for Mac application." In the rush to AGI, are these companies not writing some basic documentation along the way?
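As a rough illustration of how little it would take, here is a minimal sketch of building that kind of system prompt from facts about the deployment. Everything in it is an assumption made for illustration: the `DeploymentContext` class, its fields, and the `build_system_prompt` function are hypothetical, and this is not how Copilot (or any other product) is actually configured.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentContext:
    """Facts about where the assistant is running (all fields hypothetical)."""
    assistant_name: str
    host_application: str
    surface: str
    region: str
    # Maps a feature name to whether it is available in this deployment.
    features: dict[str, bool] = field(default_factory=dict)


def build_system_prompt(ctx: DeploymentContext) -> str:
    """Render the deployment facts as plain sentences for a system prompt."""
    lines = [
        f"You are {ctx.assistant_name}, a chatbot in the {ctx.surface} "
        f"of the {ctx.host_application} application.",
        f"The user is in this region: {ctx.region}.",
    ]
    # Sort so the prompt is stable regardless of dictionary order.
    for name, available in sorted(ctx.features.items()):
        status = "available" if available else "NOT available"
        lines.append(f"Feature '{name}' is {status} in this deployment.")
    lines.append(
        "Do not tell the user to use a feature that is marked NOT available."
    )
    return "\n".join(lines)


# Hypothetical values based on the Excel example above.
context = DeploymentContext(
    assistant_name="Microsoft Copilot",
    host_application="Microsoft Excel for Mac",
    surface="sidebar",
    region="UK",
    features={"agent mode": False},
)
print(build_system_prompt(context))
```

The point of the sketch is only that the 'self-knowledge' in question is a handful of facts the host application already has; the model doesn't need to guess them.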

To be clear; this isn't just a Copilot issue. I've been in Google's Antigravity, being told by Antigravity how to set global agent preferences by clicking on buttons in Antigravity that don't exist. I've had Claude, in the Claude.app desktop application, tell me that Claude Code is a totally separate desktop app to the 'regular Claude' desktop app (it's exactly the same Claude.app application that it told me to download, and that I was already in). Again - a degree of 'self-knowledge' that was important but missing, leading to responses that looked helpful but were… not.

Similarly - asking Gemini about how to do things in Antigravity is weird; both are Google AI tools, using versions of the same Gemini model – but just a month ago, Gemini didn't even 'know' that Antigravity existed – other than as a Google Search easter egg. (On checking that Gemini wasn't familiar with Antigravity I got the response; "Or – and I say this gently – it might be one of those product names that sounds very plausible but doesn't exist? There's a long tradition of 'Google [noun]' jokes given how many products they've launched and killed." Which I find less infuriating than Copilot's flavour of gaslighting, but not by much.)

I understand that hallucinations are a thing with LLMs that are very difficult to work around, because the model contains a huge amount of knowledge compressed into its billions of weights and biases but no way to differentiate between what it 'knows' and what it 'guesses'. I get it. I also get that agents have instructions, and 'grounding' texts.

So, how hard is it to set rules for agents that tell them not to talk about the specific application that they live inside without grounding those responses in documentation? It doesn't seem hard – it could be addressed with simple context injection into the prompt, some RAG on a self-documenting knowledge base, or fine-tuning on specific questions (finally – an actual use for FAQs!).
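To make the 'rule plus RAG' idea concrete, here is a minimal sketch under some loud assumptions: the knowledge base is two made-up documentation snippets for a hypothetical application, and retrieval is plain word overlap standing in for the embedding search a real RAG setup would more likely use. The function names are hypothetical too. The part that matters is the rule: if no documentation matches, the instruction is to say so rather than guess.

```python
import re

# Hypothetical documentation snippets for a hypothetical application.
KNOWLEDGE_BASE = {
    "global-preferences": "Global agent preferences are set in the settings file, not in the toolbar.",
    "agent-mode": "Agent mode is being rolled out by region and may not be available to every user.",
}

# Very common words that should not count as evidence of a match.
STOPWORDS = {"a", "an", "and", "are", "be", "by", "do", "how", "i", "in",
             "is", "not", "of", "the", "to", "what"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(question, knowledge_base, min_overlap=2):
    """Return (doc_id, text) pairs that share at least min_overlap words
    with the question, best match first."""
    question_words = tokenize(question)
    scored = []
    for doc_id, text in knowledge_base.items():
        overlap = len(question_words & tokenize(text))
        if overlap >= min_overlap:
            scored.append((overlap, doc_id, text))
    scored.sort(reverse=True)
    return [(doc_id, text) for _, doc_id, text in scored]


def grounding_instruction(question, knowledge_base):
    """Build the instruction that accompanies the user's question."""
    passages = retrieve(question, knowledge_base)
    if not passages:
        # The rule: no documentation, no confident answer about the app.
        return ("No documentation was found for this question. "
                "Say that you do not know rather than guessing.")
    quoted = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return ("Answer using only the documentation below, and cite the "
            "passage id you relied on.\n" + quoted)


print(grounding_instruction("How do I set global agent preferences?", KNOWLEDGE_BASE))
```

A question with no matching passage (say, one about the weather) falls through to the "do not know" instruction instead of a confident guess, which is the behaviour that was missing in the examples above.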

Documentation used to be hard to write. Quite suddenly, it has become very easy - exactly the kind of task that a large language model can carry out. And yet, large language models being deployed into applications with very specific use-cases seem to be consistently lacking in the 'grounding' data needed to work properly in these contexts.

The thing is, documentation isn't just easy – if 75% of the code that Google/Anthropic/OpenAI/Microsoft etc. create is being written by AI, then the documentation exists before the code.

Documentation and Code: Map ⇄ Territory

From Charity Majors: "AI demands more engineering discipline. Not less";

What happened in 2025 was this: the economics of code production were turned upside down. Instead of being very hard, time-consuming, and expensive to generate code, it became effectively free and instant. Lines of code went from being treasured, reused, cared for and carefully curated, to being disposable and regenerable, practically overnight.
For most of computing history, the primary way people have learned to understand software is by writing the code. Once you've achieved some mastery, reading and discussing code gets you most of the way there.

The code is the knowledge. It does what it does, and the documentation is - generally - a reflection of what it does.

The code is the territory; the documentation is the map.

So, what happens when the code is "disposable and regenerable"?

The code has been the bundled up repository of developer intent, user expectations, implicit and explicit behaviors, the only fossilized composite record we have of bugs gone by. It’s too much!

Right now, I'm getting used to a mindset shift – I used to write code by hand. Typed into a text editor, then into an IDE, then an IDE with a magic autocomplete. 'Artisanal' code – the sort where I'd be unsatisfied if the indentation style was inconsistent in different cells within the same notebook. (A notebook that nobody else was ever going to see…)

The shift is that I'm now writing messages in 'natural language' to an agent who writes my code for me; sometimes in a chat-type environment, sometimes into a README.md, SKILLS.md or CLAUDE.md file.

That means the code isn't even "my code". I'm not even seeing it, let alone hand-crafting it. The code still does what it does, but the 'documentation' isn't a reflection of what the code does any more. The roles have flipped. If the code and the documentation clash, then the code needs to change. And the type of file I'm spending the most time working on in any given coding project isn't a .py or .js – it's .md.

The code is now the map. The documentation is the territory.


A brief story; it's 1996, and I have a summer job working in a mobile phone R&D centre. I'm in the Product Test Team; my job is pushing buttons on a phone and ticking a box if it does what it's supposed to do. Sometimes in an office; sometimes, in a converted transit van driving around the country, making sure that the phones do what they are supposed to do when switching between transmitter masts. The actual work is boring, but the job is pretty fun. But not this week - it's "all hands on deck", running tests around the clock and through the weekend, because there's a new phone about to be released which needs a last-minute change to its software. There's a printing error in the user manual, but rather than reprint all the manuals it's been decided that the phone menus need to be changed instead; so what was previously Menu > 4 now needs to be Menu > 3, and vice versa. Everyone in the office agrees that this is crazy (because everyone in the office has spent the last six months running all sorts of tests on various versions of this software); the user manual is a 'map' that nobody actually reads, the software is the 'territory', and this decision to make the software match the manual is insane… But reprinting the manuals would mean they won't be in the boxes with the phones when they are supposed to be in the shops so… it's just another day in the sort of world that made the pointy-haired boss in the Dilbert comics feel so accurate.


Now, it's thirty years later. This kind of 'documentation' is no longer colour printed on shiny paper and put in the box (it's a web page – or more likely a PDF – and the only 'documentation' that comes in the box is the "do not swallow the battery" regulatory warnings in forty-two different languages), and the cost of rewriting and republishing the documentation is now zero. And now, it makes perfect sense to say that the documentation says one thing, the software does something else, so it's the software that should be rewritten.

Because the economics of code production were turned upside down. 'Documentation' is more important than ever – not just because the AI needs it to answer users' questions, but because the AI needs it to build the software in the first place.

I'm sure that recursive self-improvement will make future AI great (again?) But maybe some recursive self-documentation could make the systems we have a bit better?

  1. I use scare quotes around 'knowing' as a reminder to myself that it doesn't really "know" things.

The future of 'documentation'

In the late 1990s, right back at the start of my squiggly career, I was a 'Technical Author' - writing tests for mobile phone software. (Not writing software tests - literally, instructions for the people like me who would have to run the tests, e.g. "Press Menu > 1 > 1 : You should be in a 'Compose New SMS' screen".)

I liked the idea of writing - in particular, the idea of writing something that would help people make the most of consumer technology (phones - this was pre-smartphone - PCs, software etc.) User manuals at the time were typically a joke - written in technical jargon that you could only understand if you already understood how the things worked.[1] So my plan at the time was to get into that side of "technical authoring"; making complicated things simple.

After a few meetings and chats with various people in the field, the first realisation I had was that user manuals were so poor because the people writing the "user" documentation were spending >90% of their time writing the technical documentation for engineers; their job was literally the opposite of communicating complex things in simple ways for non-technical people.

The other thing I realised was that the future of this kind of 'documentation' wasn't going to be printed on a piece of paper in the box that the technology came in; it was going to be on the website, where it could be constantly updated, revised etc.

Well - that wasn't entirely incorrect. In the last few weeks, I've needed to find manuals for a few things; we've moved into a new house, and I've needed to understand things like an extractor fan that had stopped spinning, a water pump that controls the heating, and some flat-pack furniture from the old house I needed to reassemble. For all three, I found what I was looking for online - and for all three, it was in the form of a PDF of the printed piece of paper that presumably originally came in the box.

But still - I think it's true as a more general trend. Or at least, it has been.

"Documentation" might not have quite made the leap from the static paper-based things to a truly dynamic, searchable, interactive version - but the vast majority of the time, the web will still get you the answers to your questions. Maybe that's a Reddit thread. Maybe it's an obscure electricians' forum where someone has asked for help for exactly the same extractor fan problem that I've had - and someone else has provided it.

But the thing that occurred to me this morning, as I was playing with a local LLM and asking it for help about some technical details around how to configure it, was that there is an opportunity for these models to be "self-documenting" that seems to be being missed. Meta's Llama model seemed to struggle with some questions about configuring itself (sending me off on a weird path of writing Python scripts and editing .zshrc configuration files - before I did a Google search and realised I could do what I wanted to do with two lines of code in the same window that I was 'talking' to the Llama model in).

A fairly small LLM, trained on the model's own documentation, should surely be able to get you to an accurate answer much faster and more easily than the current 'best option' of Google/Reddit/Stack Overflow searches - which can just as easily get you to outdated/obsolete advice as to the "right" answer.

Sure - LLMs hallucinate; but only when they are trying to provide an "answer" that they don't have enough information to provide and are forced into a 'best available information' situation - which a well-trained model with a single use case should not have a problem with.

  1. Honestly - I think this is still true, for the most part. For example, the manual for a robot hoover we recently got tells you to push a button that is *not labelled on the actual robot* - only in the manual itself, in text so small I had to get my daughter to read it for me. (OK - my eyesight isn't as good as it used to be, but this was literally text on a diagram about 1mm high.) I'm sure it made perfect sense in the version on the designer's 5K screen - but the actual version that the user had to rely on was almost useless.