How I Configured Claude Code as an Architect: 10 Settings, Grouped by What They Fix

I build with Claude Code every day, and for the first few months I treated it like a finished tool. It was the wrong mental model. It's closer to a platform: the defaults run at "fine," and the configuration is where the real speed lives. Below are the ten settings that earned a permanent place in my setup, grouped by the problem each one solves rather than listed flat.

Every section ends with config you can copy.

Problem 1: The agent forgets things mid-session

This was the failure that wasted the most time, and the fix was counterintuitive.

Cap CLAUDE.md at 8,000 characters

By March my CLAUDE.md was 40,000 characters — every convention and decision I could think of. More context, better output, I assumed. The opposite happened. Transformers don't retain the middle of long documents well, and past a certain size the agent stopped attending to rules in the middle of the file. I asked it to reproduce a rule from line 300 of its own config. It couldn't.

Hard cap: 8,000 characters in the main file, the rest split out and loaded on demand.

.claude/
├── CLAUDE.md       # always-on, ≤ 8,000 chars: stack, prohibitions, links
├── skills/         # task-specific rules, loaded when relevant
└── specs/          # domain-model.md, api-contracts.md, infra.md

Result: main file at 6,000 chars, sharper answers, ~35% lower session-init token cost.

Lazy-load context with skills

Skills are markdown files in .claude/skills/, loaded only when relevant — so project-specific rules aren't sitting in the agent's memory full-time.

.claude/skills/
├── nestjs-conventions.md
├── clean-arch.md
├── typescript-strict.md
├── testing-strategy.md
└── code-review-checklist.md

Writing a NestJS module pulls nestjs-conventions; a refactor pulls clean-arch; a review pulls code-review-checklist. After moving to skills: 30–40% lower token use in standard sessions, and the agent stopped mixing rules across domains.

Restart before context rot wins

Long sessions degrade. The tells: the agent re-proposes a rejected idea, suggests editing code it just wrote, or asks you to "clarify the task" you gave 10 messages ago. Don't push through — restart with a short handoff.

# Context to continue
**Task:** [one sentence]
**Done:** [file/function]: [what]
**Ruled out (discussed):** [option]: [why]
**Next step:** [one concrete item]

My trigger: 3 messages with no forward progress → restart.

Problem 2: The agent does things I didn't ask for

Permissions that stop interrupting you (but still block damage)

{
  "permissions": {
    "allow": ["Bash(git diff:*)", "Bash(git status:*)", "Bash(npm test:*)", "Bash(npm run lint:*)"],
    "deny":  ["Bash(git push:*)", "Bash(git reset --hard:*)", "Bash(rm -rf:*)"]
  }
}

Read/test commands pass without prompts; writes and deletes still ask. Interruptions per session: 15–20 → 2–3. deny outranks allow. Commit .claude/settings.json and the team inherits it.

Split settings by environment

Team config stays strict and committed; personal config is local and gitignored.

// .claude/settings.json (team, committed)
{ "permissions": { "deny": ["Bash(kubectl delete:*)", "Bash(terraform destroy:*)", "Bash(rm -rf:*)"] } }

// .claude/settings.local.json (personal, gitignored)
{ "permissions": { "allow": ["Bash(docker compose up --build:*)", "Bash(npm run db:reset:*)"] } }

You move fast locally; the production-dangerous denies hold because deny always beats allow.

acceptEdits, but only with a safety net

acceptEdits is great for type migrations, renames, formatting, and test generation. It's a trap for new features, architectural refactors, and auth/payment/data-layer work. I once enabled it for "clean up unused imports" and the agent rewrote a few services while it was in the file — 30 unplanned minutes of review. Rule: acceptEdits only when automated tests can catch the mistake.

Hooks as guardrails

A PreToolUse hook blocks dangerous commands by returning exit 2 (stderr goes back to the agent):

#!/bin/bash
# ~/.claude/hooks/pre_tool.sh
INPUT=$(cat)
COMMAND=\((echo "\)INPUT" | jq -r '.tool_input.command // empty')
DANGEROUS="(rm -rf|DROP TABLE|truncate|DELETE FROM.*WHERE 1=1|git reset --hard|kubectl delete)"
if echo "\(COMMAND" | grep -qiE "\)DANGEROUS"; then
  echo "BLOCKED: dangerous command: $COMMAND" >&2; exit 2
fi
exit 0

A Stop hook logs every session so you know how much you actually use the tool (mine: ~2h15m/day, not the "occasionally" I'd assumed). Wire them in settings.json under hooks.Stop / hooks.PreToolUse. A notification hook (osascript/notify-send) frees you to context-switch while the agent works.

Problem 3: Knowing how hard to push, and when to stop

Effort levels

A custom skill with four depths. (/effort is mine, not built in; built-in depth control is think / ultrathink.)

/effort low     fast draft, no plan         → throwaway
/effort medium  short plan, implement       → standard work
/effort high    detailed plan, implement    → new components, public API
/effort max     analyze→plan→approval→implement→report, stop for OK each step → core/auth/schema

Zero full rollbacks on /effort max in six months. If max feels like overkill for a task, the task's easier than you thought.

The two-correction rule

Correct the agent twice on the same point in one session → stop. Two failures means the framing is wrong, not the agent. Rebuild it: split smaller, give a concrete example of the desired output, or write the skeleton and let it fill in. Estimated 15–20 hours saved over six months.

Multi-model council for irreversible decisions

For module layout, schemas, and contracts, I run the same task across models:

Opus:   "Design [X]. Priority: reliability, explicit contracts"
Sonnet: "Design [X]. Priority: development speed"
Haiku:  "Design [X]. Minimal complexity, minimal abstraction"

A fourth Opus session synthesizes against our context. ~25 minutes; on my last big refactor it replaced a week of team debate.

The operational question that follows

Running models in parallel surfaces a question Claude Code doesn't answer: which workflow uses which model, and what does each cost? That's routing and cost attribution, and it lives outside the CLI. Routing Claude Code through a gateway gives per-workflow, per-model cost visibility across providers — the EvoLink Claude Code CLI guide covers the setup.

Wrap-up

Grouped or flat, the through-line is the same: Claude Code's defaults are a floor. Budget your context, set guardrails, and learn when to throw a session away, and the tool moves from "fine" to "fast." My rough tally: 50–60 hours recovered in six months. All figures are my own from daily use, not benchmarks.

Claude Code docs

tags: claude-code, ai, developer-tools, productivity

How I Configured Claude Code as an Architect: 10 Settings, Grouped by What They Fix

Problem 1: The agent forgets things mid-session

Cap CLAUDE.md at 8,000 characters

Lazy-load context with skills

Restart before context rot wins

Problem 2: The agent does things I didn't ask for

Permissions that stop interrupting you (but still block damage)

Split settings by environment

acceptEdits, but only with a safety net

Hooks as guardrails

Problem 3: Knowing how hard to push, and when to stop

Effort levels

The two-correction rule

Multi-model council for irreversible decisions

The operational question that follows

Wrap-up

Comments

More from this blog

How to Evaluate a New Model From Evidence Instead of Hype: A Walk Through 60 Claude Fable 5 Cases

A Practical Seedance 2.0 Prompt Structure for More Controllable AI Video

Testing Claude Code Dynamic Workflows With a 111-Agent Research Run

Gemini 3.5 Flash vs Claude Opus 4.7 vs GPT-5.5: Which Model for Which Agent Task (May 2026)

Command Palette

Problem 1: The agent forgets things mid-session

Cap CLAUDE.md at 8,000 characters

Lazy-load context with skills

Restart before context rot wins

Problem 2: The agent does things I didn't ask for

Permissions that stop interrupting you (but still block damage)

Split settings by environment

acceptEdits, but only with a safety net

Hooks as guardrails

Problem 3: Knowing how hard to push, and when to stop

Effort levels

The two-correction rule

Multi-model council for irreversible decisions

The operational question that follows

Wrap-up

Comments

More from this blog