Open Source Vibe Coding

I'm finding vibe coding fascinating for all sorts of reasons, but I'm really curious about the idea of – for want of a better term – what I'm calling 'open source' vibe coding.

For some context, "open source" software is where the source code is 'open' and visible for anyone to read. Generally, it's linked to "free" software [1], but it's often made available as a binary download: fast and convenient, but impenetrable – you have to trust that the binary code (that is, the .exe or .app or whatever that your computer runs, which you can't 'read') has been built from the source code that you can read (assuming you have the time to read through a load of code, the ability to understand it, and to follow what it's doing), and hasn't been nefariously modified.

But there's also the option to "build from source" – i.e. you build the binary yourself, so you can trust that the code your computer is running is the same as the source code that you can 'read'. (Again – whether you can understand it is a different matter.) It's slower and more hassle, but it takes away the need to trust whoever built the binary version.
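As a rough sketch of what that trust check can look like: if a project's builds are reproducible (the same source always produces a byte-identical binary, which is an assumption here and not true of every project), you can build it yourself and compare the fingerprint of your build with the one you downloaded. The function names and file paths below are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read 64 KiB at a time so large binaries don't need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(downloaded: Path, built_locally: Path) -> bool:
    """True only if the two files have identical SHA-256 digests."""
    return sha256_of(downloaded) == sha256_of(built_locally)


if __name__ == "__main__":
    # Hypothetical paths: swap in the binary you downloaded and the one you built.
    print(same_binary(Path("downloads/app.bin"), Path("build/app.bin")))
```

A mismatch doesn't prove anything nefarious; it may only mean the build isn't reproducible. A match, though, tells you the download is the same thing you built from the source you can read.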

So, what has this got to do with 'vibe coding'?

One way to build an 'app' (or whatever you call the output of a particular vibe coding project) is through an ongoing chat – which isn't necessarily "shareable" (or useful if you can/do share it.) But another approach to building is to write a 'specification', which you can then point your Claude Code/Codex/Antigravity/Cursor/whatever tool at and say "build the thing that this text file describes."

What would a "build from spec" version look like? A specification file that anyone could read, that you can point your favourite chatbot at and say "explain this for me", or point your favourite AI coding assistant at and say "build me the thing that this describes". A 'natural language' specification that anyone can read – not just those who have the ability (and time) to meaningfully inspect the source code.
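To make that a bit more concrete, here is a minimal sketch of the idea, assuming the specification is just a text file. It only assembles the text you would hand over; actually sending it to a chatbot or coding assistant is left out, because every tool does that differently. The names `REQUESTS` and `compose_prompt`, and the file name `SPEC.md`, are hypothetical.

```python
# Hypothetical request wordings for the two uses described above.
REQUESTS = {
    "explain": "Explain this specification for me in plain English.",
    "build": "Build me the thing that this specification describes.",
}


def compose_prompt(spec_text: str, mode: str) -> str:
    """Combine a published specification with the reader's request."""
    if mode not in REQUESTS:
        raise ValueError(f"unknown mode: {mode!r}")
    parts = [
        REQUESTS[mode],
        "--- SPECIFICATION ---",
        spec_text.strip(),  # the spec itself goes in unchanged
    ]
    return "\n\n".join(parts)
```

With a spec saved as `SPEC.md`, `compose_prompt(open("SPEC.md").read(), "explain")` gives you the "explain this for me" version, and swapping in `"build"` gives you the other. The point is that the same human-readable file serves both readers.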

I wrote Code & Documentation: Map ⇄ Territory about the idea that documentation, instead of describing the code, defines it:

The code is the knowledge. It does what it does, and the documentation is - generally - a reflection of what it does.
The code is the territory; the documentation is the map.
So, what happens when the code is "disposable and regenerable"?

[…] the 'documentation' isn't a reflection of what the code does any more. The roles have flipped. If the code and the documentation clash, then the code needs to change. And the type of file I'm spending the most time working on in any given coding project isn't a .py or .js – it's .md. The code is now the map. The documentation is the territory.

The thing is, if the documentation is written as instructions for an LLM to read, then it can also be made available for anyone who speaks English to read. That completely changes the potential audience.

In the same way that you might publish experimental methodology to make your experiments repeatable by others (so they don't have to trust your numbers), or publish your data to make your calculations repeatable (so they don't have to trust your workings), what about publishing your 'code specifications' to make your code-writing process 'repeatable' and transparent?

Maybe you're not a coder or a mathematician. Suppose you're a qualitative researcher, a historian, a regulator, an auditor, a policy maker, a student; i.e. you can't necessarily read and understand the code, but you can understand the relevant principles. So, what if you don't just get to see the data in a graph (with a 'source:' label at the bottom), but can also read the instructions that tell your computer how to build the tool that pulls the data and builds the graph? That changes the 'trust' in the relationship between author and reader. You don't have to trust that the data in the graph is the same as the data published by the organisation named if you're looking at exactly the same graph and pulling the same data from the same source.

There's a saying in open source communities that 'given enough eyeballs, all bugs are shallow' – I'm not sure if non-coding eyeballs would help much with finding bugs (and that's also a job that the frontier LLMs' "eyeballs" are astonishingly good at anyway), but what this would make more transparent is intent. Anything in the specification could be easily questioned, challenged, or even modified to see how it affects the output.

It also makes it easier to build on others' work. "Make me the graph on p23, but add the trend line. Include the five biggest South American cities. Extend to include the last three years of data." Or to provide accessibility benefits – "Make me the graph on p23, but 6x larger with 3x bigger text." "Make me the graph on p23, but use a colourblind-friendly palette." "Break out the five lines into five separate stacked charts." "Turn the pie chart into a bar chart…"

Oh – and an added benefit: you get to see how other people are writing and refining their specifications, and learn what "good" looks like from them. (Like we used to do in the days of "view source" on the web.) In a world of influencers and their "you only use 5% of Claude Code, these 27 tips will supercharge your productivity" posts, that can't be a bad thing.

  1. "Free" as in "freedom" – software you can modify, copy, share etc. Not "free" as in "beer", where the software is 'given away' with a bunch of conditions you have to agree to if you want to use it – e.g. allow the software to collect and sell your personal data. It's a whole thing.