Autonomous Coding

Lesson 4 of 5

Review & Deploy Pipeline

Estimated time: 10 minutes


The agent writes code and opens PRs. But between "PR opened" and "shipped to production," there's a critical pipeline: automated checks, human review, staging validation, and deployment. In this lesson, you'll build a pipeline that ensures agent-written code meets the same bar as human-written code.

    The Pipeline

      Agent PR              Automated Checks         Human Review
      ┌──────────┐         ┌──────────────────┐     ┌──────────────┐
      │ PR #142  │────────>│ ✓ Lint           │────>│ Code review  │
      │ 5 files  │         │ ✓ Type check     │     │ by human     │
      │ 3 tests  │         │ ✓ Unit tests     │     │ developer    │
      └──────────┘         │ ✓ Integration    │     └──────┬───────┘
                           │ ✓ Security scan  │            │
                           │ ✓ Build          │            ▼
                           └──────────────────┘     ┌──────────────┐
                                                    │ Staging      │
                                                    │ deploy +     │
                                                    │ smoke test   │
                                                    └──────┬───────┘
                                                           │
                                                           ▼
                                                    ┌──────────────┐
                                                    │ Production   │
                                                    │ deploy       │
                                                    └──────────────┘
    

    Configure CI Checks for Agent PRs

    Agent PRs should go through the same (or stricter) CI pipeline as human PRs. Add an OpenClaw-specific workflow that includes extra validation.

    .github/workflows/agent-pr.yml
    name: Agent PR Checks
    on:
      pull_request:
        types: [opened, synchronize]

    jobs:
      validate:
        # Only run on agent-created PRs
        if: contains(github.event.pull_request.labels.*.name, 'openclaw-agent')
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
            with:
              fetch-depth: 0   # full history, so origin/main exists for the diff check

          - name: Install dependencies
            run: npm ci

          - name: Lint
            run: npm run lint

          - name: Type check
            run: npx tsc --noEmit

          - name: Unit tests
            run: npm test -- --coverage

          - name: Coverage threshold
            run: |
              # Ensure the agent didn't reduce coverage
              npx nyc check-coverage --lines 80 --branches 75

          - name: Security audit
            run: npm audit --audit-level=moderate

          - name: Build
            run: npm run build

          - name: Diff size check
            run: |
              # Warn if the PR is suspiciously large
              LINES=$(git diff --stat origin/main | tail -1 | grep -oP '\d+(?= insertion)')
              if [ "$LINES" -gt 500 ]; then
                echo "::warning::Large PR ($LINES insertions). Review carefully."
              fi

    Label-Based Triggers

    The openclaw-agent label is automatically added to agent PRs. This lets you run additional checks (like diff size limits) that you might not need for human PRs.
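    If your agent opens PRs from a dedicated bot account, the label itself can be applied automatically. A minimal sketch as an extra workflow step — the `openclaw-bot` login is an assumption, substitute your agent's actual account name:

```yaml
# Hypothetical step: auto-label PRs opened by the agent's bot account.
# 'openclaw-bot' is an assumed login — use your agent's real account name.
- name: Label agent PRs
  if: github.event.pull_request.user.login == 'openclaw-bot'
  run: gh pr edit "${{ github.event.pull_request.number }}" --add-label openclaw-agent
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```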

    Set Up Automated Code Review

    Before a human looks at the PR, run automated review tools to catch common issues.

    Automated review config
    review:
      auto_reviewers:
        # Static analysis
        - tool: eslint
          block_on: error         # Block merge on lint errors
          warn_on: warning        # Comment but don't block on warnings
        # Security scanning
        - tool: semgrep
          rules: ["p/typescript", "p/jwt", "p/sql-injection"]
          block_on: error
        # Dependency check
        - tool: socket
          block_on: high_risk     # Block on risky new dependencies
        # Complexity check
        - tool: complexity
          max_cyclomatic: 15      # Flag functions over 15 complexity
          block_on: exceeded

      # Auto-assign human reviewer
      assign_reviewer:
        strategy: codeowners       # Use the CODEOWNERS file
        fallback: "team-lead"      # If no owner matches
        required_approvals: 1      # At least 1 human approval
    The automated review catches things before your eyes see the code:

    | Check      | What It Catches                        | Action            |
    |------------|----------------------------------------|-------------------|
    | Lint       | Style violations, unused vars          | Block or auto-fix |
    | Type check | Missing types, wrong types             | Block             |
    | Semgrep    | SQL injection, XSS, hardcoded secrets  | Block             |
    | Complexity | Functions too long or nested           | Flag for review   |
    | Diff size  | PRs over 500 lines of changes          | Warn reviewer     |
    | New deps   | Unexpected npm packages added          | Flag for review   |
    Human Review Best Practices

    Automated checks passed. Now you review. Here's what to focus on for agent-written code.

    Agent PR review checklist
    SECURITY (most important for agent code)
    [ ] No hardcoded secrets, tokens, or API keys
    [ ] Input validation on all user-facing inputs
    [ ] No SQL injection or XSS vectors
    [ ] Auth checks on new endpoints
    [ ] No elevated permissions or privilege escalation

    ARCHITECTURE
    [ ] Changes are in the right location for your project structure
    [ ] No unnecessary abstractions or over-engineering
    [ ] Consistent with existing patterns (naming, imports, exports)
    [ ] No circular dependencies introduced

    EDGE CASES
    [ ] Handles empty/null inputs gracefully
    [ ] Error states covered (network failures, invalid data)
    [ ] Concurrent access considered where relevant

    TESTS
    [ ] Tests actually test the behavior, not just the implementation
    [ ] Edge cases tested (empty input, max length, auth failure)
    [ ] No flaky tests (timeouts, race conditions)

    The Agent's Blind Spot

    AI coding agents are excellent at implementing the happy path but can miss subtle edge cases, race conditions, and security implications. Your review should focus on what the agent is weakest at: things that require understanding the broader system context and adversarial thinking.

    Staging Deploy and Production

    After approval, deploy to staging for a final sanity check before production.

    Deploy pipeline
    deployment:
      staging:
        trigger: on_approval      # Deploy to staging when PR approved
        url: "https://staging.yourapp.com"
        smoke_tests:
          - "curl -f https://staging.yourapp.com/health"
          - "npx playwright test --project=smoke"
        hold_time: 30m            # Wait 30 min for manual testing

      production:
        trigger: on_merge          # Deploy when PR merged to main
        strategy: rolling          # Zero-downtime rolling deploy
        rollback:
          auto: true               # Auto-rollback if error rate > 5%
          window: 10m              # Monitor for 10 minutes post-deploy
        notifications:
          channel: "#deploys"
          message: "Deployed PR #: "
    Full lifecycle in chat
    Bot:  PR #142 — "Add dark mode toggle"
          ✓ CI passed (lint, types, tests, security)
          ✓ You approved the PR
          ✓ Deployed to staging: staging.yourapp.com
          ✓ Smoke tests passed

          Ready to merge to production. Reply "merge" or
          "hold" if you want more staging time.

    You:  Merge

    Bot:  Merged and deploying to production...
          ✓ Production deploy complete
          ✓ Health check passing
          ✓ Error rate: 0% (monitoring for 10 min)

          Dark mode is live! 🌙

    If something goes wrong after deploy:

    Auto-rollback
    Bot:  ⚠️ Alert — Error rate spiked to 8% after deploying PR #142
          Auto-rollback initiated...
          ✓ Rolled back to previous version
          ✓ Error rate returning to normal (0.2%)

          The dark mode PR caused errors on Safari 16.
          I've reopened the PR and added a comment with the
          error logs. Want me to attempt a fix?

    You:  Yes, fix the Safari compatibility issue

    Bot:  Investigating... found the issue. Safari 16 doesn't
          support the 'color-mix()' CSS function. Replacing
          with fallback values. New commit pushed to PR #142.
    Knowledge Check

    Why should agent-created PRs go through stricter automated checks than human PRs?
