Second Brain

Lesson 4 of 5

Smart Retrieval & Querying

Estimated time: 7 minutes

You've been capturing knowledge for weeks now. Your Second Brain has hundreds of chunks from notes, articles, PDFs, and chat saves. The real test: can you actually get answers when you need them? In this lesson, you'll master the query patterns that make retrieval feel like talking to a research assistant.

<Prerequisites items={["Knowledge base populated with content (previous lessons)", "At least 50 indexed items for meaningful results", "Chat platform connected"]} />

How Retrieval Works

  Your Question               Search Pipeline              Answer
  ┌──────────────┐           ┌──────────────────┐         ┌──────────────┐
  │ "What did the│           │ 1. Embed query   │         │ Exact quote  │
  │  Stripe CTO  │──────────>│ 2. Vector search │────────>│ with source  │
  │  say about   │           │ 3. Rerank top 20 │         │ and context  │
  │  payments?"  │           │ 4. LLM synthesize│         │              │
  └──────────────┘           └──────────────────┘         └──────────────┘
                                      │
                              Searches 1,832 chunks
                              in ~200ms

The pipeline: your question gets embedded into a vector, matched against your knowledge base for the most semantically similar chunks, reranked for relevance, then an LLM synthesizes the answer with citations.
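
The four steps above can be sketched in miniature. Everything here is illustrative: the chunk vectors are toy two-dimensional embeddings, and the reranking step is a placeholder where a real system would run a cross-encoder.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, chunks, top_k=20, keep=5):
    # 1-2. The query was embedded upstream; vector-search every chunk
    scored = [(cosine(query_vec, c["vec"]), c) for c in chunks]
    scored.sort(key=lambda s: s[0], reverse=True)
    candidates = scored[:top_k]
    # 3. Rerank the top candidates (placeholder: keep vector order)
    reranked = candidates[:keep]
    # 4. Hand the survivors, with sources, to the LLM for synthesis
    return [{"text": c["text"], "source": c["source"], "score": round(s, 3)}
            for s, c in reranked]

chunks = [
    {"text": "Embedded payments will grow", "source": "FinTech Summit", "vec": [0.9, 0.1]},
    {"text": "Kubernetes networking notes", "source": "Obsidian",       "vec": [0.1, 0.9]},
]
print(retrieve([1.0, 0.0], chunks, keep=1))
```

The key design point survives even in the toy version: source metadata travels with every chunk, which is what lets the final answer carry citations.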

Start simple. Ask questions the way you'd ask a colleague who read all your notes.

You: What did the Stripe CTO say about payments?

Bot: At the FinTech Summit (June 2024), Patrick Collison made several points about payment trends:

  1. "Embedded payments will grow 340% by 2027, making
     standalone payment pages obsolete for most B2B SaaS."
     — Source: Conference Notes/FinTech Summit Day 2 (Notion)

  2. He also mentioned that cross-border payment costs
     will drop below 1% within 3 years due to stablecoin
     rails competing with SWIFT.
     — Source: Conference Notes/FinTech Summit Day 2 (Notion)

  Related: You have 3 other notes about payment trends
  from different sources. Want me to pull those too?

Notice the bot provides exact quotes with sources. This is the critical difference from a regular search — you get attribution you can verify.

Different question types unlock different retrieval strategies.

Ask the bot to combine knowledge from multiple sources.

Bot: Based on 7 sources in your knowledge base:

  OVERVIEW
  Vector databases store data as high-dimensional embeddings
  for similarity search. Your notes cover:

  COMPARISONS (from "DB Comparison Notes", Obsidian)
  - Pinecone: managed, easy setup, expensive at scale
  - Weaviate: open-source, good hybrid search
  - Qdrant: fastest for pure vector ops, Rust-based

  USE CASES (from "AI Architecture Decisions", Notion)
  - You noted RAG pipelines work best with chunk sizes
    of 500-1500 tokens with 10-20% overlap

  PERFORMANCE (from arxiv paper, web clip)
  - HNSW indexing provides 95%+ recall at <10ms latency
    for collections under 10M vectors

  Sources: 3 Obsidian notes, 2 Notion pages, 1 PDF, 1 web clip

Ask the bot to find contrasting viewpoints in your notes.

Bot: Your notes contain perspectives from 5 sources:

  FOR MICROSERVICES:
  - "Teams can deploy independently" (Tech Lead Meeting, Mar)
  - "Scaling individual services saves cost" (AWS re:Invent notes)
  - Martin Fowler article: "Right choice when teams > 20 people"

  AGAINST MICROSERVICES:
  - "Distributed debugging is a nightmare" (Post-mortem Nov 2024)
  - "We spent 3 months on infra before writing features" (Retro)
  - Book note: "A distributed monolith is worse than both"
    (Building Microservices, Sam Newman)

  Your own conclusion (from "Architecture Decision Record #14"):
  "Start monolith, extract services when pain is real, not theoretical"

Ask the bot to arrange knowledge chronologically.

Bot: Based on your notes, here's the timeline:

  Jan 2024: "AI safety feels overhyped" (Journal entry)
  Mar 2024: Saved 3 articles about alignment research
  Jun 2024: Conference note — "The Anthropic talk changed
            my mind about interpretability research"
  Sep 2024: Book notes from "Alignment Problem" — extensively
            highlighted section on reward hacking
  Dec 2024: Your blog draft titled "Why I Changed My Mind
            About AI Safety" (Notion)

  Your perspective shifted notably after June 2024,
  with the conference as the turning point.

If answers aren't precise enough, adjust these parameters.

Reranking model (improves relevance after vector search)

reranker: true
rerank_model: "cross-encoder"
rerank_top_n: 5          # Keep top 5 after reranking

Similarity threshold (0.0 - 1.0)

min_similarity: 0.65 # Ignore chunks below this score

Source diversity

diversify_sources: true  # Don't return 5 chunks from same doc
max_per_source: 2        # At most 2 chunks per source
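
A minimal sketch of what source diversification does, assuming results arrive sorted best-first (the function name and chunk shape are illustrative, not your tool's API):

```python
def diversify(ranked_chunks, max_per_source=2):
    # Walk results best-first, capping how many chunks each source contributes
    seen = {}
    out = []
    for chunk in ranked_chunks:
        src = chunk["source"]
        if seen.get(src, 0) < max_per_source:
            out.append(chunk)
            seen[src] = seen.get(src, 0) + 1
    return out

ranked = [{"source": "doc-a"}] * 4 + [{"source": "doc-b"}]
print([c["source"] for c in diversify(ranked)])
# → ['doc-a', 'doc-a', 'doc-b']
```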

Recency bias (0.0 = no bias, 1.0 = strongly prefer recent)

recency_weight: 0.15
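
How these knobs interact depends on your tool; one plausible scoring scheme (an assumption, not the actual formula) applies the similarity floor first, then blends in recency:

```python
def final_score(similarity, age_days, min_similarity=0.65, recency_weight=0.15):
    # Chunks below the similarity floor are dropped outright
    if similarity < min_similarity:
        return None
    # Newer chunks (smaller age) get a higher recency factor
    recency = 1.0 / (1.0 + age_days / 365)
    return (1 - recency_weight) * similarity + recency_weight * recency

print(final_score(0.80, age_days=30))   # recent, relevant chunk
print(final_score(0.60, age_days=0))    # filtered: below the 0.65 floor
```

Note the asymmetry: `min_similarity` is a hard cutoff, while `recency_weight` only reorders chunks that already passed it.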

Sometimes you want to search a subset of your knowledge. Use filters to narrow scope.

Bot: From your book notes (4 sources):

  1. "Who" by Geoff Smart — "The #1 hiring mistake is not
     defining the scorecard before interviewing"
  2. "An Elegant Puzzle" by Will Larson — Chapter on growing
     teams: "Hire for the team you need in 6 months"
  ...

You: What did I save from arxiv this year?

Bot: You have 12 arxiv papers indexed in 2025:

  - "Attention Is All You Need... Again" (Jan)
  - "Scaling Laws for Transfer Learning" (Feb)
  ...

You: In my work notes, what decisions did we make about the API?

Bot: Searching work-tagged content only...

  Found 8 relevant entries about API decisions: ...

Filter syntax: in:books, from:notion, tagged:work, after:2024-06, by:author.
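
A rough sketch of how those tokens might be separated from the free-text part of a query. Only the token names come from the syntax above; the parsing logic itself is an illustrative assumption:

```python
import re

FILTER_KEYS = {"in", "from", "tagged", "after", "by"}

def parse_query(raw):
    # Split a query into {filter: value} pairs and the remaining search text
    filters, terms = {}, []
    for token in raw.split():
        m = re.fullmatch(r"(\w+):(\S+)", token)
        if m and m.group(1) in FILTER_KEYS:
            filters[m.group(1)] = m.group(2)
        else:
            terms.append(token)
    return filters, " ".join(terms)

print(parse_query("in:books from:notion what did I note about hiring"))
# → ({'in': 'books', 'from': 'notion'}, 'what did I note about hiring')
```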

If you're getting irrelevant answers, troubleshoot with these steps:

  1. Check coverage — run openclaw knowledge status to verify the content is actually indexed
  2. Try different phrasing — semantic search matches meaning, not keywords. "Payment trends" and "fintech forecasts" might return different results
  3. Lower the similarity threshold — if min_similarity is too high, relevant chunks get filtered out
  4. Check chunk sizes — if chunks are too large, embeddings become diluted. Try re-indexing with smaller chunks
  5. Add context — "What did the CTO say at the June conference about payments?" is better than "payment stuff"