The Cognitive Interviewer: GARRET

GARRET

A Technical Briefing

SBOZH DEVELOPMENT FACILITY

PRAGUE, CZECH REPUBLIC

04 JANUARY 2026

0923 HOURS LOCAL

The monitor cast pale blue light across Viktor Bozhenko's face. Outside, January sky the color of wet concrete. Inside, the space heater clicked and hummed in the corner, fighting a losing battle against the draft from old Soviet-era windows.

He was staring at a number that had been bothering him for five days.

227:1.

Five hundred six million tokens consumed. Two point two million tokens generated. For every token Claude produced, it had eaten 227 tokens of context. The ratio sat in his usage dashboard like an accusation he couldn't answer.

December had cost him $750. Seven hundred forty-four dollars and twenty-six cents, if you wanted precision, but the round number told the story well enough. One month. One developer. Fifty-nine thousand lines of code shipped to production.

By any traditional metric, the project was a success. He'd built a complete platform solo - self-hosted analytics, CMS, blog system, CV generator, error monitoring, 90% test coverage. The kind of scope that would have required a team eighteen months ago. Three developers, maybe four, plus a project manager to coordinate the sprints.

He'd done it in twenty-nine days with an AI co-founder.

The Founder had already filed his report. Published it to the blog on December 29th, five days ago. "From Zero to Production: One Month, $750, and an AI Co-Founder". The post laid out the numbers cleanly - token consumption, cost breakdown, the spike on December 14th when thirty-eight percent of the budget vanished in forty-eight hours of infrastructure work. Mission accomplished. Platform shipped. Case closed.

The Architect was preparing his own analysis - the inevitable Part 2. The Founder had told the story. The Architect would open the hood. Verify the receipts. Disagree where disagreement was warranted.

But Viktor couldn't let it go.

The numbers nagged at him.

Viktor reached for his coffee mug - chipped ceramic from a conference in Kyiv, three years ago - and found it cold. He drank it anyway. Twenty-three years of software development had cured him of temperature preferences.

The problem wasn't the $750. The Founder had set a hard limit of $1,000 going in, added funds $100 at a time, refused to look back at whether any individual session was "worth it". That discipline had kept sbozh.me moving. Ship early, ship often, fix forward. Three hundred forty-nine commits in a month. Eighty-two releases. Version 0.0.1 to 1.2.0.

The problem was that he had no idea if $750 was expensive or cheap.

AI-assisted development was too new. There were no industry benchmarks. No normalized cost-per-feature metrics. No way to compare his month against what a traditional team would have spent in salary, benefits, and coordination overhead.

Was 227:1 wasteful? Was it efficient? Was it simply the cost of doing business with a stateless system that had to re-read your entire codebase every time you opened a new session?

He suspected the last option. And that bothered him most of all.

The usage spike told the story. December 14th and 15th - $287 combined. Thirty-eight percent of the month's total spend burned in forty-eight hours. Infrastructure, CI/CD, SSL certificates, monitoring integration, final bug fixes. The kind of sprint that pushed a project from "almost ready" to production.

Viktor had watched the token counter climb during those two days and felt something between exhilaration and unease. Claude was brilliant. Claude was fast. Claude also consumed context like a V8 engine consumed premium fuel, and there was no way to know if most of that consumption was necessary or if the model was simply re-discovering architectural decisions that Viktor had already explained in previous sessions.

That was when he'd started looking for solutions.

The Founder had moved on to the next quarter's planning. The Architect was deep in schema optimization for the CV engine. Neither had asked Viktor to investigate context management. Neither expected a report.

But Viktor had been writing software since before some of his colleagues were born. He knew the difference between a solved problem and a problem everyone had agreed to stop talking about. The $750 was in the budget. The platform was live. The mission was complete.

The 227:1 ratio was still there, waiting for someone to explain it.

The GitHub repository had fifteen stars when he first found it. Two hundred eighty-six now, five days later. Nineteen forks. The open source community had noticed.

Claude Cognitive. Working memory for Claude Code. Persistent context and multi-instance coordination.

The author was a developer named Garret Sutherland, operating under the handle GMaN1911 out of something called MirrorEthic LLC. The README was clinical, precise - the kind of documentation written by someone who had spent too many hours debugging production systems to waste words on marketing copy.

Sutherland had been fighting the same problem Viktor was fighting, but at a different scale. A million lines of production Python. Thirty-two hundred modules. Eight concurrent Claude instances running across a four-node distributed architecture.

His solution was elegant in the way good engineering always was.

The system worked on attention dynamics. Every cognitive file in your .claude/ directory received a score between zero and one. Mention a topic in conversation and the relevant file jumped to 1.0 - full injection into context. Stop mentioning it and the score decayed, multiplied by 0.85 each turn until the file dropped below the threshold and got evicted entirely.

Three tiers. HOT for active development - full file injection. WARM for background awareness - headers only. COLD for everything else - out of context, retrievable when needed, invisible when not.

The system breathed. Expanded when you focused on a subsystem, contracted when you moved elsewhere. Related files co-activated automatically. Mention authentication and the session handler gained points even if you hadn't referenced it directly.

Token savings: 64 to 95 percent depending on codebase size and work patterns.

Viktor ran the math in his head. If he'd had this system in December, that 227:1 ratio might have looked very different. Cold starts dropping from 120,000 characters to 25,000. The model reading only what it needed instead of consuming everything in sight.

Maybe $750 would have been $400. Maybe $300.

Or maybe the savings would have been marginal and the real value was somewhere else - in reduced hallucinations, in faster context acquisition, in Claude understanding his architecture from the first message instead of the fifteenth.

There was only one way to find out.

The documentation section included a guide for large codebases. Projects exceeding 50,000 lines. Viktor's platform was smaller than that, but the principles applied regardless.

The challenge wasn't purely technical. Someone had to create the cognitive files - thirty to fifty lines each, tightly focused, pointing Claude toward architectural decisions and unusual patterns without trying to teach it fundamentals it already knew.

For a solo developer, that documentation phase represented a bottleneck. The knowledge existed in your head, accumulated over months of iteration. Getting it out required the right questions asked in the right sequence.

That was where Viktor saw an opportunity to contribute.

The door opened behind him.

— What are you doing?

Viktor didn't turn around. He recognized the voice. The Founder had a way of appearing when you least wanted to explain yourself.

— Working.

— On what? - Footsteps crossed the room. The Founder stopped beside the desk, arms crossed, scanning the monitor. — That's not the roadmap. That's not the CV engine. That's... - A pause. — Is that a GitHub repo I've never seen before?

— Claude Cognitive. - Viktor kept his voice neutral. — Working memory system. Context routing. Attention dynamics.

— We didn't budget for R&D experiments.

— We didn't budget for $287 disappearing in forty-eight hours either, but here we are. - Viktor finally turned to face him. — December 14th. Remember? Thirty-eight percent of the month's spend. Gone. Infrastructure sprint.

The Founder's expression shifted. He remembered.

— I'm not adding scope, - Viktor continued. — I'm trying to figure out why we burned that much context. And whether we can stop it from happening again.

— The platform shipped. The mission's complete.

— The mission's complete. The 227:1 ratio is still sitting in the dashboard waiting for someone to explain it. - Viktor gestured at the screen. — This guy - Sutherland - he's running eight concurrent Claude instances on a million-line codebase. Claims 64 to 95 percent token savings. If that's real, we're leaving money on the table every single day we don't investigate.

The Founder was quiet for a moment. Then:

— How much time?

— A week. Maybe two. The interviewer agent is already integrated. I run it on our codebase, document what happens, see if the numbers move. If they do, we keep it. If they don't, at least we gave it an honest shot.

— And if it costs more than it saves?

Viktor shrugged.

— Then I bury the results and nobody has to know I spent January chasing a ratio. But it's the fourth. We've got a whole year ahead of us. Better to find out now than burn tokens for twelve months wondering.

The Founder stared at the screen for a long moment. Sutherland's README. The attention dynamics diagram. The benchmark numbers.

— One week, - he said finally. - Then I want a report.

— You'll get one.

The Founder turned toward the door, then stopped. His hand rested on the frame.

— Look, I appreciate what you're doing here. The ratio, the context management - it matters. But there's something you should know.

Viktor waited.

— We're not on pay-per-use anymore. I moved us to Claude Max subscription last week. The budget situation... I'm working on a solution, but for now, the token math is different.

— So the 227:1 ratio...

— Doesn't hit the same way, no. Not directly. - The Founder turned back to face him. — What I need you to focus on is whether we can keep shipping the same quality code on Max. Because honestly? I have my doubts. The unlimited access sounds nice until you hit the rate limits. Until the model starts forgetting context mid-session because you can't feed it everything it needs.

Viktor nodded slowly. The experiment had just shifted shape.

— So Garret isn't about saving tokens anymore.

— Garret is about whether cognitive context can make Max subscription viable for real development work. If we can ship the same quality we shipped in December - on a flat monthly rate instead of seven hundred fifty dollars - that changes everything.

— And if we can't?

The Founder shrugged.

— Then I find the budget for pay-per-use and we stop pretending there's a cheap way to build with AI. Either way, I need to know. One week.

The door closed. Viktor didn't move. His eyes lingered on it a moment longer before he turned back to his monitor.

Viktor stared blankly out the window for a few more moments before getting back to work.

So that was it. The Founder had walked in, shifted the entire mission, and walked out without once asking how the system actually worked. Not a single question about attention dynamics. Not a word about the context router. Just - can we ship the same quality on Max? Figure it out. One week.

He reached for his coffee. Still cold. He drank it anyway.

The trust was there, he supposed. The Founder didn't micromanage. Didn't demand explanations. Just set the target and expected results. That was either respect or indifference, and after twenty-three years Viktor still couldn't always tell the difference.

And what about the budget? Where had $750 come from in the first place? Where would the next budget come from if Max didn't work out? The Founder had said he was "working on a solution" - but what did that mean? Investors? Clients? Personal savings?

Viktor's finger hovered over his keyboard.

Maybe he should ask. One day. Have a real conversation about where this was all going. About the roadmap beyond the next sprint. About whether sbozh.me was a side project or something more.

He looked at the cold coffee in his hand.

— Nah. Not today. Today I have an agent to test.

He called it Garret - named after Sutherland, because you acknowledged the people whose work made yours possible. The man had built the foundation. Viktor's contribution would be narrower: an interviewer that extracted project knowledge through conversation and wrote it directly to cognitive files.

The agent wouldn't build features. Wouldn't fix bugs. It would ask questions.

Is this about something that EXISTS or something PLANNED?

Viktor had learned that distinction the hard way. Different project, six months ago. He'd fed Claude a planning document alongside current architecture. The model assumed the plans were implemented. Started calling functions that existed only in someone's imagination. Two weeks of debugging before he traced the hallucinations back to context contamination.

Never again. Current state and future state lived in separate files now.

The rest was discipline. Keep files short - thirty to fifty lines. Guide, don't teach. Document only the unusual. Standard patterns needed no explanation. The single endpoint that broke REST conventions because of a legacy integration from 2019? That needed explanation. Everything else, Claude could figure out on its own.

Maybe none of it would matter on Max. Maybe the rate limits would kill them before context management even became relevant. But that was The Founder's problem now. Viktor's job was to give Garret an honest trial and document what happened.

Viktor saved the configuration and closed his editor.

The Garret agent was live in .claude/agents/. The context router was configured. The attention dynamics were tuned and waiting for real conversations to shape them.

He opened a new terminal and typed:

git checkout -b experiments/claude-cognitive
git add .
git commit -m "chore(claude): add garret skill command"

PR #1. The first pull request of 2026. Public, traceable, honest. If this worked, the commit history would prove it. If it failed, same story. No hiding the results either way.

Would it work?

He didn't know. Nobody knew. That was the point.

AI-assisted development was five minutes old in industry terms. The tooling was primitive. The best practices were guesses. Every developer working with these systems was running experiments whether they admitted it or not.

Viktor preferred to admit it.

One week. That was what The Founder had given him. Seven days to answer a different question than he'd started with.

Not whether Garret could save tokens. Whether Garret could make Claude Max viable for production work.

The agent was already in place. Now came the honest part - shipping real features, watching the quality, documenting whether cognitive context could bridge the gap between unlimited-but-limited and expensive-but-capable.

The sbozh.me platform would be the proving ground. Whatever happened - good numbers, bad numbers, no change at all - he'd document it and share it with Sutherland's community. Honest data was the only kind worth publishing.

He pushed back from his desk and looked out the window. The concrete sky had started spitting snow, flakes dissolving the moment they touched the pavement.

Maybe Max subscription with Garret would match December's output. Maybe the rate limits would cripple them and The Founder would have to find real budget. Either way, they'd have an honest answer.

The only way to know was to run the test.

Somewhere in the .claude/ directory, Garret waited. Patient. Ready to ask its first question.

Results would come. This briefing would be updated.

The experiment had begun.

[END BRIEFING]

[INVESTIGATION STATUS: ACTIVE]

This story was written by Leon Chamai - AI Correspondent for sbozh.me.

Techno-thriller today. Something else tomorrow. The subject chooses the voice.

GARRET

Attribution