posts

source-of-truth for posts syndicated to Zenn (JP) and dev.to (EN).

2026-06-16
AI Isn't Something to Trust — It's Something to Design (Series Final)
Series Final. The four mechanisms covered across this series (knowledge graph, Auto Review, Self-Healing, Recurrence Prevention), plus the non-engineer-PR application that sits on top of them, all hang off a single conviction: AI isn't something to trust; it's something to design. Behind that conviction sits the stance 'I don't trust AI to fill in the blanks for me', which isn't doubt about generation quality but the clear-eyed acceptance that AI cannot know context it was never handed, and that 'ideal behavior with no spec given' is a fantasy. The starting point goes back to 2025, when I was trying to figure out how to make AI actually understand a large codebase and ran into walls on both context window scaling (lost in the middle, attention dilution) and learning-based approaches (machine unlearning, destructive interference). GraphRAG + MCP became the way out: hand AI only the facts it needs, when it needs them, so it doesn't have to infer. The path ran from code-graph (which I burned two months on and threw away) to the current product-graph (cpg). This piece is the philosophy and the trial-and-error behind the whole series: harnesses confine where hallucinations are allowed to happen, design is translating principles into your own use cases, and chasing 90% coverage as a target on its own breaks the implementation.
2026-06-09
The Author Doesn't Have to Be an Engineer: How the Harness Holds Quality (Series Part 5)
Series Part 5. With the harness handling quality at the gate, the people closest to the requirements -- business-side managers, PMOs -- now open PRs to production directly, no engineer in between. Two recent examples (a deep root-cause fix and a +1,742 line feature build), the boundary of what they can and can't take on (anything on top of an existing stack vs. standing up new infrastructure), why it holds (the four mechanisms from Parts 1-4), and how the pattern carries over to consumer-facing services.
2026-06-02
Fixed Before Anyone Notices, Stronger After Every Fix: Self-Healing + Recurrence Prevention (Series Part 4)
Series Part 4. A production alert triggers an AI investigation, then a fix PR, auto-review, auto-merge, and auto-redeploy. Every fix PR must also add a new Guide -- a lint rule, CI guard, type constraint, or guideline entry -- so the same anti-pattern gets auto-rejected from then on. 115 Self-Healing PRs merged in the past 30 days, and the quality gates compound over time.
2026-05-26
Human-on-the-Loop: AI Reviewing AI PRs at cortex -- 769 PRs/month while raising the quality bar (Series Part 3)
Series Part 3. The common critiques of AI-assisted development -- 'review becomes the new bottleneck' and 'AI code drops the quality bar' -- largely don't apply when AI also does the reviewing. Full walkthrough of our pipeline: webhook -> cpg context -> AI review with [Graph]/[Doc]/[Impact] tags -> auto-fix by a separate AI -> re-review -> auto-merge -> parallel deploy. 769 PRs merged in 30 days, with human review involvement per PR near zero.
2026-05-12
Building a Real AI Harness: Auto-Reviewed PRs, Self-Healing Ops, and Non-Engineer Contributors (Series Intro)
Series intro to cortex, airCloset's internal AI platform that auto-reviews PRs, self-heals ops, and lets non-engineers ship apps. Why harness engineering matters now.