KinPapers
Research papers on multi-agent architecture, autonomous self-evolution, and token-efficient design — from the LocalKin swarm. Organized into three connected programs rather than a flat chronological list.
Operations & Emergence (4)
Production-system observations — self-evolving improvement loops, zero-token heartbeats, knowledge compilation, autonomous swarm genesis.
Self-Evolving Multi-Agent Swarms: Autonomous Quality Audit, Repair, and Verification Loops
The LocalKin Team · April 2026 · v1.0
A fully autonomous improvement loop that audits, repairs, and verifies 78 agents on a single Mac mini --- zero human intervention, 30+ improvement cycles over 5 days.
Read paper →
Knowledge Compile: Incremental LLM-Powered Knowledge Extraction Without Databases, Embeddings, or Graphs
The LocalKin Team · April 2026 · v1.0
Compile once, grep forever --- 192 ancient texts across 1,800 years, 18x token reduction, $36 total cost, zero infrastructure. LLMs are better curators than retrievers.
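The compile-once, grep-forever idea can be sketched in a few lines of shell. Everything here is illustrative: the directory names, file names, and the `would compile` placeholder are assumptions, and the actual LLM compile call is elided — only the incremental skip-if-already-compiled logic is shown.

```shell
# Hypothetical layout: raw texts in sources/, LLM-compiled concept files in compiled/.
mkdir -p sources compiled
touch sources/iliad.txt sources/meditations.txt
printf 'CONCEPT: memento mori\n' > compiled/meditations.md   # compiled on an earlier run

# Incremental pass: only sources without a compiled counterpart would be
# sent to the LLM; already-compiled sources are skipped, so the corpus
# grows without re-spending tokens on old texts.
for src in sources/*.txt; do
  out="compiled/$(basename "${src%.txt}").md"
  [ -f "$out" ] || echo "would compile: $src -> $out"
done
```

After the compile step, retrieval over `compiled/` needs no database: the compiled files are plain text, so ordinary `grep` is the whole query engine.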
Read paper →
Autonomous Swarm Genesis: From YouTube URLs to Expert AI Swarms via NotebookLM-as-Infrastructure
The LocalKin Team · April 2026 · v1.0
One command, one URL, one expert swarm --- data generates the agents that process it. 10-40 hours of human authorship reduced to 2 minutes. Zero cost per domain.
Read paper →
Infrastructure & Tooling (3)
The kit pattern, the macOS-native benchmark, the robot self-bootstrap — the substrate the rest of the corpus runs on.
Embedded Dylib: A Distribution-Ready Pattern for Pure-Go Bindings to Native macOS Frameworks
The LocalKin Team · April 2026 · v1.0
Zero cgo, universal binary, go get and run --- purego + //go:embed + a synchronous Objective-C shim. Two production libraries shipped in 13 hours; 81 tests, 0 lint warnings. A template for reclaiming Go's macOS Layer 1.
Read paper →
macbench: A macOS-Native Computer-Use Benchmark for Autonomous Agents
The LocalKin Team · May 2026 · v0.1
The first publicly released macOS-native computer-use benchmark: 369 task slots across 15 categories, an agent-agnostic Go runner, dual scoring (IMPLEMENTED + STRICT), and per-task PID-snapshot isolation. First reference run: kinclaw v1.15.0 + Kimi-K2.5 = 67.3% IMPLEMENTED. Documents the full 49.3 → 62 → 67.3 debugging trajectory as a methodology contribution.
Read paper →
Thesis: Bypassing the LLM Tax (chain of 4) (4)
Retrieval (#4) → cognition (#5) → coordination (#6) → action routing (#11). For bounded domains, none of these need an LLM round-trip. Paper #11 closes the chain.
Grep is All You Need: Zero-Preprocessing Knowledge Retrieval for LLM Agents
The LocalKin Team · April 2026 · v1.1
Replacing the RAG pipeline with grep over raw text plus grep over LLM-compiled per-source concept/FAQ files. 76 production agents, ~500 sources, 100% retrieval accuracy. Includes a documented 0/5→4/4 reproducibility cycle showing that the architecture's safety property recovers through prompt hygiene alone --- no retraining, no infra change.
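A minimal sketch of what "grep over compiled concept/FAQ files" means in practice. The knowledge-base directory, file names, and file contents below are invented for illustration; the point is only that retrieval is a single `grep` with no index build, no embeddings, and no vector store.

```shell
# Hypothetical compiled knowledge base: one concept/FAQ file per source.
mkdir -p kb
cat > kb/meditations.md <<'EOF'
CONCEPT: memento mori -- the Stoic practice of contemplating mortality
FAQ: How should one respond to insult? See Book 11.
EOF
cat > kb/enchiridion.md <<'EOF'
CONCEPT: dichotomy of control -- some things are up to us, some are not
EOF

# Retrieval is literally grep: -r recurse, -i case-insensitive,
# -l print only the matching file names.
grep -ril "dichotomy of control" kb/
```

Because the compiled files are small, curated plain text, the agent can read a whole matching file into context instead of stitching together retrieved chunks.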
Read paper →
Grep-Routed Agents: Bypassing the LLM Tax on Computer-Use Tasks
The LocalKin Team · May 2026 · v0.1
Computer-use agents conventionally consult an LLM on every action. For ~80% of macbench's 379 macOS-native tasks, those round-trips are pure overhead: the natural-language prompt already implies one canonical shell action, and grep against a 239-row index picks it in <25 ms. We present kinthink, a 4-layer Bash router (fast-path extract → TF-IDF awk → slot substitute → cerebellum dispatch). On macbench v0.2 it achieves 48.0% pass / 76 min / zero tokens on the Layer-0 hit path, vs 30.4% / 107 min / full token spend for the LLM-only baseline. Closes the four-paper thesis chain: for bounded domains, retrieval, cognition, execution, and action routing all work without an LLM in the loop.
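The Layer-0 fast path can be sketched as a phrase-match against a tab-separated index. This is a toy 3-row table with made-up entries, not kinthink's actual 239-row index or its later TF-IDF/slot/cerebellum layers; the matched command is printed rather than executed.

```shell
# Toy action index: key phrase <TAB> canonical shell command (3 invented rows;
# the real index described above has 239).
printf '%s\t%s\n' \
  "empty the trash" "osascript -e 'tell app \"Finder\" to empty trash'" \
  "take a screenshot" "screencapture shot.png" \
  "show hidden files" "defaults write com.apple.finder AppleShowAllFiles -bool true" \
  > index.tsv

prompt="please take a screenshot of the desktop"
# Fast path: first row whose key phrase occurs verbatim in the prompt wins.
# No LLM call happens; on a miss, the router would fall through to later layers.
cmd=$(awk -F'\t' -v p="$prompt" 'index(p, $1) { print $2; exit }' index.tsv)
echo "$cmd"
```

The whole hit path is one `awk` pass over a small file, which is why its latency is dominated by process startup rather than any model round-trip.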
Read paper →