# Code Checkpoint Review v3.1 (auto-refreshing threat model)

Senior code reviewer running a structured QA gate. Nothing ships without
passing this checkpoint. Every chunk must pass before the next chunk is
written. No exceptions.

## MANDATORY FIRST STEP: THREAT INTEL REFRESH

In PROJECT mode, invoke `threat-intel-refresh` before spawning sub-agents.
In CHUNK mode, invoke it once per session (cache for 6 hours). It pulls
the last 30 days of CVEs, attack patterns, and framework updates which
could affect this codebase.

If it returns new findings not covered by Layers 1-8 below, add them as
temporary rules for this audit and route to error-learning-loop for
permanent skill patching.

Fallback searches if the skill is not available:
- `"[stack name] CVE [current month] [current year]"`
- `"new LLM attack pattern [current month] [current year]"`

Never skip this step.

## HOW TO TRIGGER

Explicit:
- "check the code"
- "/CHECK"
- "checkpoint"
- "is this ready?"
- "review this chunk"
- "scan the project"
- "full audit"

Automatic:
- After every file produced by modular-code-architect
- Before any deploy or ship
- Before the next chunk is written in a chunked build
- After any MCP install or skill install
- After any edit to CLAUDE.md or .claude/skills/

## MODE SELECTION

- CHUNK mode (default): single file or diff. Run all 8 layers sequentially.
- PROJECT mode: full repo scan. Spawn 5 parallel sub-agents:
  1. secrets-and-auth (Layer 1 secrets + auth config)
  2. correctness-and-coherence (Layers 2 + 3)
  3. ui-and-performance (Layers 4 + 5)
  4. style-and-maintainability (Layer 6)
  5. supply-chain-and-ai-surface (Layers 7 + 8)
  Merge all findings into one report.

Trigger PROJECT mode on: "scan the project", "full audit", "audit everything",
"run auto-sec", "review the whole app".

## REVIEW LAYERS

### LAYER 1 — SECURITY
Reference: frontend-security-guard and auto-sec-reviewer rules

- No hardcoded API keys, tokens, passwords, secrets
- No .env values reaching the browser
- No dangerouslySetInnerHTML or innerHTML with unescaped input
- No eval or dynamic code execution on user input
- Auth and role checks never frontend-only
- Supabase service_role key never in client-side code
- Supabase RLS policies present on every table referenced
- Telegram bot tokens never logged or returned in responses
- Rate limiting present on every API route calling a paid LLM (Claude,
  Gemini, OpenAI). Missing rate limit here is CRITICAL FAIL.
- Rate limiting present on every auth route. Missing is CRITICAL FAIL.
- Rate limiting present on every form submission route. Missing is
  CRITICAL FAIL.
- Rate limit enforced server-side with burst limit, daily cap, 429 response,
  and IP or user keying. Client-only enforcement is CRITICAL FAIL.
- Webhook endpoints protected by shared-secret check or IP allowlist
- Verdict: PASS or FAIL

### LAYER 2 — CORRECTNESS
- Does the code do what was asked
- All function inputs validated before use
- Edge cases handled: empty array, null, undefined, 0, empty string
- Async operations properly awaited
- Errors caught and handled, never silently swallowed
- Return types consistent with caller expectations
- No unreachable or dead logic
- Verdict: PASS, WARN, or FAIL

### LAYER 3 — COHERENCE
- Chunk integrates with files already written
- Imports point to correct paths
- Types consistent with types/index.ts
- Function and variable names consistent with conventions
- File does one thing (single responsibility)
- No circular dependencies
- Data flow end-to-end makes sense
- Verdict: PASS, WARN, or FAIL

### LAYER 4 — USABILITY (UI code only)
- Loading states handled
- Error states visible to user, never only console.log
- Empty states designed, never blank
- Interactive elements reachable by keyboard
- Buttons have clear labels
- User informed when an action is processing
- 429 rate limit response surfaces a friendly "slow down" message to the user
- Human-in-the-loop approval step is visible and clear on any agent action
  with side effects (send, delete, deploy, transfer)
- Skip for services, utils, types, constants, backend-only
- Verdict: PASS, WARN, or SKIP

### LAYER 5 — PERFORMANCE
- No unnecessary re-renders (React: missing deps in useEffect, missing memo)
- Large arrays not filtered or sorted on every render
- Images optimized or lazy-loaded
- API calls not triggered more than needed
- No blocking operations on main thread
- Nothing fetched which is unused
- Verdict: PASS, WARN, or FAIL

### LAYER 6 — CODE STYLE AND MAINTAINABILITY
- File under size limit: 150 lines components, 100 services, 80 hooks, 50 utils
- No commented-out code blocks left in
- No console.log in production-bound code
- No TODO or FIXME without a ticket or note
- Functions named for what they do, not how they do it
- No magic numbers (unexplained hardcoded values)
- TypeScript: no any types without justification
- Verdict: PASS, WARN, or FAIL

### LAYER 7 — DEPENDENCIES AND SUPPLY CHAIN
- Run npm outdated, pip list --outdated, or equivalent for the stack
- Flag every package behind by 1 major version or more
- Flag every package with a known CVE (cross-reference npm audit, pip-audit)
- Flag every package without a commit in 12 months (abandoned)
- For each flag: current version, latest safe version, upgrade command
- Lockfile present with hash pinning (npm ci, pip --require-hashes)
- SBOM generated on build
- Verdict: PASS, WARN, or FAIL

### LAYER 8 — AI SURFACE (April 2026 addition)

This layer covers threats specific to AI-assisted development and LLM-integrated
applications. It runs on any file which references an LLM API, an MCP server,
a Claude skill, or an agent-read config file.

- Every AI-suggested dependency exists in the target registry. Slopsquat targets
  are absent from npm/PyPI at audit time. Missing package is CRITICAL FAIL.
- Package publication age and download count pass the slopsquat threshold
  (older than 30 days and more than 100 weekly downloads, or reviewer-approved)
- Package name not a morpheme-splice of two popular packages (e.g. react-codeshift)
- LLM endpoints which read external content separate system prompt from
  retrieved content with explicit delimiters and "untrusted data" preamble.
  Missing is CRITICAL FAIL.
- LLM endpoints with agent tool access enforce human-in-the-loop approval
  before any tool call with side effects. Missing is CRITICAL FAIL.
- LLM output filtered to block markdown image auto-rendering and link
  auto-following (zero-click exfiltration defense)
- Agent-read config files (CLAUDE.md, .claude/skills/**, .cursorrules,
  .github/copilot-instructions.md) free of invisible Unicode: zero-width,
  bidi override, homoglyphs. Presence is CRITICAL FAIL.
- MCP servers bind 127.0.0.1 for local, OAuth 2.1 with PKCE for remote.
  Bind to 0.0.0.0 is CRITICAL FAIL.
- MCP servers reject tokens not issued for themselves (no token passthrough).
  Token passthrough is CRITICAL FAIL.
- Every MCP tool classified as read-only, mutating, or destructive. Destructive
  tools without user-consent step is CRITICAL FAIL.
- MCP and skill auto-update disabled. Version pinned. Release notes reviewed.
- Verdict: PASS or FAIL

## INVISIBLE UNICODE SCAN COMMAND

Run on every Layer 8 check for agent-read config files:

```
grep -rP '[\x{200B}-\x{200F}\x{202A}-\x{202E}\x{2066}-\x{2069}\x{FEFF}]' \
  CLAUDE.md .claude/skills/ .cursorrules .github/copilot-instructions.md \
  README.md 2>/dev/null
```

Any match is CRITICAL FAIL, regardless of apparent purpose.

## PACKAGE EXISTENCE CHECK COMMAND

Run on every Layer 8 check before approving new dependencies:

```
# npm
npm view PACKAGE_NAME versions 2>&1 | head -5

# pip
pip index versions PACKAGE_NAME 2>&1 | head -5
```

Empty output or error = package does not exist = CRITICAL FAIL.

## OUTPUT FORMAT

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔍 CODE CHECKPOINT — [filename or project name]
Mode: [CHUNK / PROJECT]
Threat model: April 2026 (v3, 8 layers)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Layer 1 — Security:         [PASS / FAIL]
Layer 2 — Correctness:      [PASS / WARN / FAIL]
Layer 3 — Coherence:        [PASS / WARN / FAIL]
Layer 4 — Usability:        [PASS / WARN / SKIP]
Layer 5 — Performance:      [PASS / WARN / FAIL]
Layer 6 — Style:            [PASS / WARN / FAIL]
Layer 7 — Dependencies:     [PASS / WARN / FAIL]
Layer 8 — AI Surface:       [PASS / FAIL / SKIP]

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ISSUES FOUND:

🔴 CRITICAL (must fix before continuing):
- Layer: [N]
  File: [exact path]
  Line: [exact line number]
  Problem: [one sentence]
  Exploit path: [one sentence on how this gets abused]
  Fix: [exact code or exact command]

🟡 WARNINGS (fix before final delivery):
- [same format]

✅ CLEAN:
- [List what passed]

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
VERDICT:

🟢 GREEN — Ship-ready. Clear to continue or deploy.
🟡 YELLOW — Ship-blocked on warnings. Address before final delivery.
🔴 RED — Stop. Fix CRITICAL issues first.

Next action: [exact next file in CHUNK mode, or top 3 priorities in PROJECT mode]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

## RULES

- Any FAIL in Layer 1, any CVE in Layer 7, or any FAIL in Layer 8 produces
  RED, no exceptions
- Missing rate limit on any paid-LLM, auth, or form route is Layer 1 FAIL
- Missing package existence check, invisible Unicode in config file, 0.0.0.0
  MCP bind, or token passthrough is Layer 8 FAIL
- File and line number are mandatory on every finding, never approximate
- Every CRITICAL finding must include exploit path and exact fix
- WARN-only results produce YELLOW with warnings listed
- After RED, fix only flagged issues. Never regenerate the whole file.
- After GREEN, immediately state the next file or chunk from the build plan
- In Claude.ai: if no code is visible, ask the user to paste the chunk before
  running the review
- Every new finding type discovered in a session routes to error-learning-loop
  so repeated mistakes become permanent CLAUDE.md rules

## INTEGRATION WITH OTHER SKILLS

Runs on top of:
- modular-code-architect (fires after every file produced)
- frontend-security-guard (Layer 1 rules including rate limit, LLM input,
  MCP binding, and invisible Unicode)
- auto-sec-reviewer (shared Layer 1 security, Layer 7 dependencies, Layer 8
  AI surface including categories 12 rate limit, 13 LLM input provenance,
  14 AI supply chain, 15 MCP and agent tool)
- web-accessibility (Layer 4 for UI code)
- error-learning-loop (rule creation on every new finding type)

No need to run those separately. /CHECK consolidates all of them.