Second Brain

Lesson 3 of 5

Capture & Organize Workflow

Estimated time: 8 minutes

Your notes apps are connected — but what about everything else? The article you read on your phone, the podcast insight during your commute, the random idea at 2 AM. In this lesson, you'll build capture pipelines so nothing valuable slips through the cracks.

    The Capture Pipeline

      Input Sources              Processing              Knowledge Base
      ┌──────────────┐          ┌──────────────┐        ┌──────────────┐
      │ Chat message  │         │ Extract text  │        │              │
      │ Web clip      │         │ Identify type │        │  Chunked     │
      │ PDF upload    │────────>│ Add metadata  │───────>│  Embedded    │
      │ Voice note    │         │ Auto-tag      │        │  Tagged      │
      │ Email forward │         │ Chunk & embed │        │  Linked      │
      │ Screenshot    │         │               │        │              │
      └──────────────┘          └──────────────┘        └──────────────┘
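
To make the diagram concrete, here is a minimal sketch of the processing column as a chain of small functions. Everything in it (the `Entry` class, the stage names, the toy keyword tagger, the 50-word chunk size) is a hypothetical illustration, not OpenClaw's actual implementation, and the embedding step is left out because it depends on the model you use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Entry:
    """One captured item moving through the pipeline (hypothetical shape)."""
    text: str
    source: str                                  # e.g. "chat", "web-clip", "pdf"
    metadata: dict = field(default_factory=dict)
    tags: list = field(default_factory=list)
    chunks: list = field(default_factory=list)


def identify_type(entry):
    # Toy heuristic: text that opens with a quotation mark is treated as a quote.
    entry.metadata["type"] = "quote" if entry.text.startswith('"') else "note"
    return entry


def add_metadata(entry):
    entry.metadata["captured_at"] = datetime.now(timezone.utc).isoformat()
    entry.metadata["word_count"] = len(entry.text.split())
    return entry


def auto_tag(entry):
    # Stand-in for real tagging: a tiny keyword-to-tag lookup.
    keywords = {"email": "email", "marketing": "marketing"}
    lowered = entry.text.lower()
    entry.tags = sorted({tag for word, tag in keywords.items() if word in lowered})
    return entry


def chunk(entry, max_words=50):
    # Split the text into fixed-size word windows; embedding would follow here.
    words = entry.text.split()
    entry.chunks = [
        " ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)
    ]
    return entry


def run_pipeline(text, source):
    entry = Entry(text=text.strip(), source=source)
    for stage in (identify_type, add_metadata, auto_tag, chunk):
        entry = stage(entry)
    return entry
```

Each stage takes an entry and returns it, so stages can be added, removed, or reordered without touching the others.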
    

    Quick Capture via Chat

    The fastest way to save something is to tell the bot directly. No context-switching, no opening another app.

    Chat capture examples
    You:  Remember: the best time to send marketing emails
    is Tuesday 10am according to the HubSpot study

    Bot:  Saved to your knowledge base.
    Tags: #marketing #email #research
    Source: manual capture (chat)

    You:  Save this quote: "The best way to predict the future
    is to invent it" - Alan Kay

    Bot:  Saved quote by Alan Kay.
    Tags: #quotes #innovation
    Source: manual capture (chat)

    Trigger words that activate capture: remember, save this, note that, capture, store.
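
If you are curious how trigger words could be detected, here is a minimal sketch. It assumes the trigger must open the message and be followed by a word boundary; the function name `parse_capture` and that matching rule are illustrative assumptions, since the lesson does not describe how the bot actually matches triggers.

```python
import re

# Trigger phrases listed in this lesson.
TRIGGERS = ("remember", "save this", "note that", "capture", "store")

# Match a trigger at the start of the message, then skip separators like ":".
_TRIGGER_RE = re.compile(
    r"^\s*(" + "|".join(re.escape(t) for t in TRIGGERS) + r")\b[\s:,-]*",
    re.IGNORECASE,
)


def parse_capture(message):
    """Return the text to save, or None if the message is not a capture request."""
    match = _TRIGGER_RE.match(message)
    if match is None:
        return None
    content = message[match.end():].strip()
    return content or None
```

The word-boundary check keeps a message like "Storefront hours are 9-5" from being mistaken for a "store" command.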

    Bulk Capture

    After a meeting or conference, just dump everything into chat: "Remember these key takeaways from today's product meeting: 1) We're targeting Q3 for launch, 2) Budget approved for $50k, 3) Sarah is the new PM." The bot saves it all as one structured entry.
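
As a rough illustration of what "one structured entry" could look like, the sketch below splits a numbered brain-dump into a title and a list of items. The function name `split_takeaways` and the output shape are assumptions made for this example; the lesson does not specify the format the bot stores.

```python
import re


def split_takeaways(message):
    """Split 'intro: 1) a, 2) b' style text into a title and a list of items."""
    # Split on list markers such as "1)" or "2)"; "Q3" or "$50k" do not match.
    parts = re.split(r"\s*\b\d+\)\s*", message)
    title = parts[0].strip().rstrip(":")
    items = [p.strip().rstrip(",") for p in parts[1:] if p.strip()]
    return {"title": title, "items": items}
```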

    Web Clipping

    Save articles and web pages without leaving your browser.

    Install the OpenClaw Web Clipper from your browser's extension store. When you find something worth saving:

    1. Click the OpenClaw icon in your toolbar
    2. Choose: Full page, Selection, or Simplified article
    3. Add optional tags or notes
    4. Click Save to Second Brain

    The extension extracts clean text (stripping ads and navigation), adds it to your knowledge base, and syncs within seconds.
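
To give a feel for what "extracting clean text" involves, here is a minimal sketch using only Python's standard library. It simply drops anything inside tags that usually hold navigation, scripts, or page chrome. The tag list and the `extract_clean_text` name are assumptions for illustration; the real extension very likely uses a more sophisticated readability algorithm.

```python
from html.parser import HTMLParser

# Tags whose contents are usually not part of the article (an assumed list).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class ArticleText(HTMLParser):
    def __init__(self):
        super().__init__()
        self._skip_depth = 0      # > 0 while inside any skipped tag
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside every skipped tag.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return "\n".join(self._parts)


def extract_clean_text(html):
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return parser.text()
```

One known limitation of this sketch: inline tags such as `<b>` split a sentence across lines, so a production extractor would need to track block-level boundaries.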

    CLI install
    openclaw extensions install web-clipper

    PDF and Document Upload

    Research papers, reports, ebooks — PDFs contain some of the most valuable knowledge, and they're notoriously hard to search later.

    pdf-config.yaml
    documents:
      watch_folders:
        - path: "~/Documents/Research"
          auto_index: true
          file_types: [pdf, docx, epub]
        - path: "~/Downloads"
          auto_index: false       # Only index when manually triggered
          file_types: [pdf]
      pdf_processing:
        ocr: true                 # Handle scanned PDFs
        extract_images: false     # Skip image extraction
        table_extraction: true    # Convert tables to structured text
        max_pages: 500            # Safety limit
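
Here is a minimal sketch of how the `watch_folders` settings might be interpreted: folders with `auto_index: true` are always scanned, others only on a manual trigger, and only the listed file types are picked up. The `files_to_index` function, the `manual` flag, and the recursive scan are assumptions for illustration, and the config is written as a plain Python list to avoid needing a YAML library.

```python
from pathlib import Path

# Mirrors the watch_folders section of pdf-config.yaml.
WATCH_FOLDERS = [
    {"path": "~/Documents/Research", "auto_index": True,
     "file_types": ["pdf", "docx", "epub"]},
    {"path": "~/Downloads", "auto_index": False, "file_types": ["pdf"]},
]


def files_to_index(watch_folders, manual=False):
    """Return files that should be indexed on this pass."""
    selected = []
    for folder in watch_folders:
        # Folders with auto_index off are skipped unless triggered manually.
        if not (folder["auto_index"] or manual):
            continue
        root = Path(folder["path"]).expanduser()
        if not root.is_dir():
            continue
        allowed = {"." + ext.lower() for ext in folder["file_types"]}
        # Assumption: subfolders are scanned too.
        for path in sorted(root.rglob("*")):
            if path.is_file() and path.suffix.lower() in allowed:
                selected.append(path)
    return selected
```

Calling `files_to_index(WATCH_FOLDERS)` covers the automatic case; passing `manual=True` also includes folders like `~/Downloads`.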
    PDF upload via chat
    You:  [Attaches quarterly-report-q4.pdf]
    Index this report

    Bot:  Processing "Quarterly Report Q4 2024" (42 pages)...

    Indexed successfully:
    42 pages processed
    156 chunks created
    Key topics: revenue growth, customer acquisition,
    product roadmap, hiring plan
    Tags: #business #quarterly-report #2024

    Sample queries you could ask:
    "What was Q4 revenue growth?"
    "What's on the product roadmap?"
    "How many new hires are planned?"

    Auto-Tagging and Organization

    You should not have to manually organize everything. Configure auto-tagging rules so content is categorized as it arrives.

    auto-tag-config.yaml
    auto_tagging:
      enabled: true
      ai_tags: true              # Let AI suggest tags based on content
      max_ai_tags: 5             # Limit auto-generated tags
      rules:
        - match: "source:notion AND workspace:Work"
          tags: [work]
        - match: "source:obsidian AND path:Books/**"
          tags: [book-notes, reading]
        - match: "content contains 'quarterly'"
          tags: [business, reports]
        - match: "source:web-clip AND domain:arxiv.org"
          tags: [research, academic]
      collections:               # Group related content
        - name: "Conference Notes"
          auto_add: "tag:conference"
        - name: "Book Summaries"
          auto_add: "tag:book-notes"
        - name: "Work Projects"
          auto_add: "tag:work AND tag:project"
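
To show how rules like these could be evaluated, here is a minimal sketch of a matcher. It assumes clauses are joined by `AND`, that `key:value` clauses support glob patterns such as `Books/**`, and that `content contains '...'` is a case-insensitive substring check. These semantics and the function names are assumptions inferred from the config above, not documented OpenClaw behavior.

```python
from fnmatch import fnmatchcase

# The rules from auto-tag-config.yaml, written as Python data.
RULES = [
    {"match": "source:notion AND workspace:Work", "tags": ["work"]},
    {"match": "source:obsidian AND path:Books/**", "tags": ["book-notes", "reading"]},
    {"match": "content contains 'quarterly'", "tags": ["business", "reports"]},
    {"match": "source:web-clip AND domain:arxiv.org", "tags": ["research", "academic"]},
]


def clause_matches(clause, item):
    clause = clause.strip()
    prefix = "content contains "
    if clause.startswith(prefix):
        needle = clause[len(prefix):].strip("'\"")
        return needle.lower() in item.get("content", "").lower()
    # Otherwise treat the clause as key:pattern, with glob support.
    key, _, pattern = clause.partition(":")
    return fnmatchcase(str(item.get(key, "")), pattern)


def tags_for(item, rules=RULES):
    """Collect tags from every rule whose clauses all match the item."""
    tags = []
    for rule in rules:
        clauses = rule["match"].split(" AND ")
        if all(clause_matches(c, item) for c in clauses):
            tags.extend(t for t in rule["tags"] if t not in tags)
    return tags
```

Because every matching rule contributes tags, a single note can pick up tags from several rules at once.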
    Auto-tagging in action
    [You clip an article from hbr.org]

    Bot:  Saved "Why Digital Transformations Fail" (hbr.org)

    Auto-tags applied:
    #business (rule: domain hbr.org)
    #digital-transformation (AI-suggested)
    #management (AI-suggested)
    #strategy (AI-suggested)
    Added to collection: "Work Projects" (matched: business + strategy)

    For audio/video content, use OpenClaw's transcript pipeline:

    Podcast capture
    You:  Index this podcast: https://youtube.com/watch?v=example

    Bot:  Fetching transcript for "Lex Fridman #412 - Sam Altman"...

    Transcript indexed:
    Duration: 2h 34m
    12,400 words across 84 chunks
    Key topics: AGI timeline, safety research,
    compute scaling, OpenAI governance
    Tags: #podcast #ai #interviews

    You can now ask questions like:
    "What did Sam Altman say about AGI timelines on Lex Fridman?"

    Pairs well with the YouTube & Podcast Factory course.

    When the same content arrives from multiple sources (you clip an article AND someone shares it in your Notion workspace), OpenClaw deduplicates:

    • Content hashing detects identical or near-identical text
    • URL matching catches the same web page saved twice
    • Fuzzy matching identifies paraphrased duplicates (>90% similarity)

    Duplicates are merged, keeping metadata from both sources, so the same quote should appear only once in search results.
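
The three checks above can be sketched with Python's standard library. This is only an illustration: the function names are hypothetical, the URL normalization rules are assumptions, and `difflib.SequenceMatcher` is a character-level stand-in for whatever similarity measure OpenClaw actually uses (the lesson only states the 90% threshold).

```python
import hashlib
from difflib import SequenceMatcher
from urllib.parse import urlsplit


def content_hash(text):
    # Normalize case and whitespace so trivially different copies hash the same.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def normalize_url(url):
    # Assumption: scheme, "www.", query string, and trailing slash are ignored.
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def is_duplicate(a, b, threshold=0.90):
    """a and b are dicts with a 'text' key and an optional 'url' key."""
    if a.get("url") and b.get("url"):
        if normalize_url(a["url"]) == normalize_url(b["url"]):
            return True
    if content_hash(a["text"]) == content_hash(b["text"]):
        return True
    # Fuzzy check: character-level similarity above the threshold.
    ratio = SequenceMatcher(None, a["text"].lower(), b["text"].lower()).ratio()
    return ratio > threshold
```

The cheap checks (URL and hash) run first, so the slower fuzzy comparison is only needed when they find nothing.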

    Knowledge Check

    What is the most important design principle for a knowledge capture workflow?