Executive Summary
On June 2, 2026, Boris Cherny, the engineer who built Claude Code, said this in an interview: "I don't prompt Claude anymore. There are loops running, and it's the loops that prompt Claude and decide what to do. My job is to write the loops." Within days the line traveled through the developer community and picked up a name: loop engineering. This piece looks at what that shift is, and what makes it trustworthy.
The core of the shift is that the human's seat moves. You go from the "operator" who types prompts directly to the "designer" who builds the system that prompts the agents. In Cherny's case, Claude Code wrote 100% of his code contributions over the past 30 days (259 pull requests), and merges per engineer per day on his team rose by 200%. But the more a loop runs on its own, the greater the risk. An agent that grades its own output becomes a machine that agrees with its own answer and repeats it indefinitely.
So what makes an autonomous loop trustworthy is not a smarter model but two data instruments that accumulate outside the loop. One is a verifier that judges completion separately; the other is a persistent memory that carries what yesterday's loop learned into today. This piece traces why these two instruments are the safety rail.
Key Figures
Sources: WorkOS, The New Stack
Four numbers show the weight of this shift. One person's contributions are written entirely by an agent, a team's merge rate triples (a 200% rise), a meaningful share of public code carries one tool's fingerprint, and the time it takes a new hire to ramp up shrinks to a couple of days.
- 100%: Cherny's code contributions. All of his contributions over 30 days were written by Claude Code (259 PRs).
- +200%: Engineer productivity. Rise in merges per Anthropic engineer per day.
- ~4%: Public commit share. Estimated share of public GitHub commits made by Claude Code.
- ~2 days: New-hire onboarding. Ramp-up for a new engineer, down from weeks to about two days.
The Developer Who Stopped Prompting
Boris Cherny leads Claude Code at Anthropic. Claude Code is the coding agent he started as a side project in the fall of 2024. On June 2, 2026, he said in an interview that his way of working had crossed another threshold. He no longer prompts Claude directly; instead, loops run, prompting Claude and deciding what to do. And he summed up his job in a single line: "My job is to write the loops."
Cherny describes the change in three stages. At first he wrote code by hand with the help of autocomplete. Next he kept five to ten Claude sessions open at once and fed prompts into each one manually. Now the loops issue the prompts for him. Hundreds of agents read GitHub, Slack, and Twitter and decide for themselves what to build. The person no longer sits inside the loop typing input; they step back to watch the loop run.
The claim is backed by practice. Cherny said he deleted his IDE in November 2025 and has never reopened it. Over the past 30 days, Claude Code wrote 100% of his code contributions, a body of work that came to 259 pull requests. On his team, more than 90% of the code is written by Claude Code, merges per engineer per day rose by 200%, and the time for a new engineer to ramp up shortened from weeks to about two days. By one estimate, roughly 4% of public GitHub commits now carry this tool's fingerprint.
Cherny's remark earned a name in the community within days. On June 8, developer Peter Steinberger wrote on X that "you should stop prompting your coding agent and instead design loops that prompt your agents," and the post was read more than 6.5 million times. Around the same time, Google Cloud's Addy Osmani gave the phenomenon shape, calling it loop engineering. He summed it up this way: "The unit of value in AI has shifted from the response to the trajectory." What matters is no longer a single answer but the path that many attempts build up over time.
Anatomy of a Loop — Maker and Checker
So what is a loop, exactly? In loop engineering, a loop is an automatic cycle that finds its own work, executes it, verifies the result, records what it learned, and then decides what to do next. The cycle keeps turning without a person feeding in a prompt each time. Osmani and other practitioners break the loop into five parts: a goal that decides completion, a maker that produces the result, a checker that inspects it, a memory that carries learning forward, and the token economics that cap iterations and cost.
- Goal: the completion criterion that decides when the loop is allowed to stop. Claude Code sets this with the /goal command, added in May 2026.
- Maker: the agent that actually produces the code or output.
- Checker: the seat where a different model inspects the result. The principle is that whoever turned in the homework does not grade it themselves.
- Memory: the persistent file that passes what this loop learned to the next one.
- Token economics: the ceiling on cost and iteration count. A single agent uses about 4× the tokens of an ordinary chat; multi-agent setups use about 15×.
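The token multipliers in the last item translate into simple budget arithmetic. The sketch below is illustrative only: the function names, the 2,000-token baseline, and the 120,000-token budget are assumptions made for the example, and only the 4× and 15× ratios come from the text.

```python
# Rough multipliers quoted in the text; real usage varies by task.
TOKEN_MULTIPLIER = {"chat": 1, "single_agent": 4, "multi_agent": 15}


def estimated_tokens(chat_baseline: int, mode: str) -> int:
    """Scale a chat-sized token count by the multiplier for the given mode."""
    return chat_baseline * TOKEN_MULTIPLIER[mode]


def max_iterations(budget: int, chat_baseline: int, mode: str) -> int:
    """How many loop iterations fit in a token budget, rounded down."""
    return budget // estimated_tokens(chat_baseline, mode)


if __name__ == "__main__":
    baseline = 2_000   # hypothetical tokens for one ordinary chat exchange
    budget = 120_000   # hypothetical token ceiling for the whole loop
    for mode in TOKEN_MULTIPLIER:
        print(mode, estimated_tokens(baseline, mode),
              max_iterations(budget, baseline, mode))
```

Under these assumed numbers, the same budget that covers 60 chat-sized exchanges covers 15 single-agent iterations and only 4 multi-agent ones, which is why the iteration cap belongs in the loop design and not in an afterthought.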
Of these, the part most often left out is the checker. When you verify your own code, you grade it leniently, because there is always a reason to agree with your own answer. So the maker and the checker are split into different model instances with different instructions. One developer put it this way on X: "Designing the loop is only half of it. The other half is putting something in the loop that can say 'no' — a test, a type check, a real error. A loop with nothing pushing back is just a machine that keeps agreeing with its own answer."
Key point: the real parts of a loop are not the model but the instruments around it — the goal that defines completion, the checker that can say "no" to a result, and the memory that records learning. Strip those three away and an autonomous loop slides into a machine that repeats its own answer.
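The parts named above fit together roughly as follows. This is a minimal sketch, not Claude Code's implementation: `run_loop`, `toy_maker`, and `toy_checker` are hypothetical names, the maker and checker are plain functions standing in for separate model instances, and the iteration cap stands in for the token budget.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    done: bool        # True if the checker accepted before the cap was hit
    iterations: int   # number of maker/checker rounds actually run
    output: str       # last output the maker produced


def run_loop(
    task: str,
    goal: str,
    maker: Callable[[str, List[str]], str],
    checker: Callable[[str, str], str],
    memory: List[str],
    max_iterations: int = 5,
) -> LoopResult:
    """Run maker then checker until the checker accepts or the cap is reached.

    The checker returns "" to accept, or a non-empty objection to say "no".
    Each objection is appended to memory so the next attempt can see it.
    """
    output = ""
    for i in range(1, max_iterations + 1):
        output = maker(task, memory)
        objection = checker(goal, output)
        if not objection:
            return LoopResult(True, i, output)
        memory.append(objection)
    return LoopResult(False, max_iterations, output)


def toy_maker(task: str, memory: List[str]) -> str:
    """Stand-in for a model call: adds a docstring only once told it is missing."""
    if any("docstring" in note for note in memory):
        return 'def add(a, b):\n    """Add two numbers."""\n    return a + b\n'
    return "def add(a, b):\n    return a + b\n"


def toy_checker(goal: str, output: str) -> str:
    """Stand-in for a separate grader: objects when no docstring is present."""
    return "" if '"""' in output else f"goal not met: {goal}"


if __name__ == "__main__":
    notes: List[str] = []
    result = run_loop("write add()", "function has a docstring",
                      toy_maker, toy_checker, notes)
    print(result.done, result.iterations, notes)
```

The design choice worth noting is that the maker never decides when to stop. Only the checker's empty objection or the iteration cap ends the loop.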
Two Pillars of Trust — Verifier and Memory
This is how Pebblous reads the story. What makes an autonomous loop trustworthy is not a better model but two data instruments that accumulate outside the loop. One is a separate verifier that judges completion; the other is a persistent memory that carries learning forward. What matters is that both live not inside the model's weights but in files and gates outside the loop.
3.1 A separate verifier that judges completion
The first pillar is the verifier. It is the instrument that stops a loop from declaring "it's done" on its own. When you set a completion condition in Claude Code with /goal, a separate, fast model independently grades each turn against that condition. The key is that the agent that makes and the agent that grades are different models with different instructions. A loop without a verifier becomes a machine spinning in agreement with its own result; a loop with one turns the verdict of "complete" into a piece of data in its own right. That verdict data is what makes the loop trustworthy.
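One way to picture "the verdict as data" is a verifier that writes every judgment to a log the maker never touches. The sketch below is an assumption-laden illustration, not how /goal works internally: `verify`, `compiles`, and the JSONL log format are hypothetical, and a syntax check stands in for whatever tests or type checks a real loop would run.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable, Dict


def compiles(source: str) -> bool:
    """Example check: does the candidate parse as Python source?"""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def verify(output: str, checks: Dict[str, Callable[[str], bool]],
           log_path: Path) -> bool:
    """Run every named check and append the verdict to a JSONL log."""
    if not checks:
        # A verifier with nothing that can say "no" would accept everything.
        raise ValueError("at least one check is required")
    results = {name: bool(check(output)) for name, check in checks.items()}
    verdict = {
        "timestamp": time.time(),
        "complete": all(results.values()),
        "checks": results,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(verdict) + "\n")
    return verdict["complete"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "verdicts.jsonl"  # hypothetical log location
        print(verify("x = 1", {"compiles": compiles}, log))
        print(verify("x = ", {"compiles": compiles}, log))
        print(log.read_text(encoding="utf-8"))
```

Because each verdict is a line in a file, a person can audit after the fact why the loop was allowed to stop, independently of what the maker claimed.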
3.2 A persistent memory that carries learning forward
The second pillar is memory. To keep an agent from repeating the same mistake every session, you write the correction into a file and have the next session read it automatically at startup. In Claude Code, a file called CLAUDE.md plays this role. In Cherny's words, "what yesterday's loop learned is passed to today's loop." Because the learning is recorded in an external file instead of being burned into model weights, a person can read it, fix it, and version-control it. The loop's trust assets accumulate not inside the model but in the file system.
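The mechanism described here, writing a correction to a file and reading it back at the start of the next session, can be sketched in a few lines. This is a hypothetical illustration and not a description of how Claude Code handles CLAUDE.md: the file name `LOOP_NOTES.md`, the bullet format, and both function names are assumptions.

```python
import tempfile
from pathlib import Path
from typing import List


def load_lessons(path: Path) -> List[str]:
    """Read lessons left by earlier sessions; a missing file means none yet."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


def record_lesson(path: Path, lesson: str) -> None:
    """Append one lesson as a Markdown bullet, skipping exact duplicates."""
    lesson = " ".join(lesson.split())  # collapse newlines so one lesson is one line
    if not lesson or lesson in load_lessons(path):
        return
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {lesson}\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        notes = Path(tmp) / "LOOP_NOTES.md"  # hypothetical memory file
        # "Yesterday's" session records what went wrong.
        record_lesson(notes, "Run the type checker before opening a PR.")
        record_lesson(notes, "Run the type checker before opening a PR.")
        # "Today's" session starts by reading the file.
        print(load_lessons(notes))
```

Keeping the record as plain text is the point: the file can be read, edited, and placed under version control like any other source file.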
The core proposition: what the two pillars share is that they sit outside the model. The verifier leaves the verdict of completion as data; the memory leaves learning as a file. The more autonomy grows, the more the center of gravity for trust shifts from the model to the data that accumulates outside the loop.
From Operator to Designer
So the human role moves. From the operator who typed prompts one line at a time, you cross over to the designer who builds the whole loop. What the designer sets is not individual answers but the rules of the loop: what counts as complete, who verifies that completion, and what gets carried to the next round. In the same interview, when asked what work ultimately stays with people, Cherny answered "values." Deciding what is worth building, and what to call a good result, remains the human's part.
For data-driven organizations, this shift narrows to a single practical question. When you adopt an agent, the first thing to weigh is not "which model is better." It is "do we have the reference data to judge completion," and "is there a structure that carries what we learned from last session's failure into the next one." Without these two, no matter how good a model you bolt on, the loop will either spin in agreement with its own answer or repeat the same mistake. You can borrow a model, but the verification standard and the learning record are data an organization has to build for itself.
Editor's Note: this is also why Pebblous has focused on getting data into an AI-Ready state. Once an agent moves beyond advice and starts doing the work itself, what makes that work trustworthy is not the model but the data that accumulates outside the loop — the standard that judges what counts as complete, and the record that carries yesterday's failure into today. The question loop engineering raises is, in the end, a question about data. Before you decide what to hand an agent, you find yourself asking whether you have first made the data that underpins that judgment trustworthy.
References
Primary sources — remarks & essays
- 1. Boris Cherny. (2026). "Boris Cherny: Claude Code & the Future of Engineering." WorkOS × Acquired Unplugged, 2026-06-02. The original "my job is to write the loops" remark and the three-stage shift.
- 2. Addy Osmani. (2026). "Loop Engineering." Elevate (Substack), 2026-06-08. Coined "loop engineering," the five loop parts, the "response to trajectory" framing.
- 3. Peter Steinberger. (2026). "You should be designing loops that prompt your agents." X, 2026-06-08. The viral spread of the concept (6.5M+ views).
Industry coverage & analysis
- 4. The New Stack. (2026). "The Anthropic leader who built Claude Code says he ditched prompting — now he just writes loops." The New Stack. Interview recap, deleting the IDE, 259 PRs.
- 5. The AI Corner. (2026). "Loop Engineering: Build Self-Running Coding Agents 2026." The AI Corner. Maker–checker separation, token economics (4× / 15×).
- 6. WorkOS Blog. (2026). "Key takeaways from Boris Cherny on building Claude Code." WorkOS. Productivity +200%, team code 90%+, onboarding ~2 days, public commits ~4%.