Lesson 2 of 5
Configuring Research Sources
Estimated time: 7 minutes
Your research reports are only as good as your sources. In this lesson, you'll connect OpenClaw to search APIs, news feeds, and academic databases so the pipeline has high-quality data to work with.
Prerequisites
Understanding the Source Architecture
Source Layer
┌─────────────────────────────────────────────┐
│                                             │
│    ┌─────────┐  ┌─────────┐  ┌─────────┐    │
│    │   Web   │  │  News   │  │ Academic│    │
│    │  Search │  │  APIs   │  │   DBs   │    │
│    └────┬────┘  └────┬────┘  └────┬────┘    │
│         │            │            │         │
│         └────────────┼────────────┘         │
│                      │                      │
│                Source Router                │
│          (picks sources per topic)          │
└─────────────────────────────────────────────┘
The Source Router analyzes your research request and decides which sources to query. A market analysis request hits financial APIs; an academic topic queries Semantic Scholar. You configure which sources are available — OpenClaw decides when to use them.
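To make the routing idea concrete, here is a minimal Python sketch of topic-based source selection. It is not OpenClaw's actual router: the category names, keyword lists, and the `route` function are all hypothetical, and a real router would likely use something richer than keyword matching. The sketch only shows the division of labor described above: you declare which sources are available, and the router picks among them per request.

```python
# Hypothetical sketch of topic-based source routing (not OpenClaw's real logic).

# Assumed keyword lists used to guess a request's category.
CATEGORY_KEYWORDS = {
    "academic": {"study", "paper", "peer-reviewed", "citation"},
    "news": {"latest", "today", "announcement", "breaking"},
    "market": {"market", "revenue", "pricing", "competitor"},
}

# Assumed mapping from category to the source types worth querying.
CATEGORY_SOURCES = {
    "academic": ["academic", "web"],
    "news": ["news", "web"],
    "market": ["news", "web"],
}


def route(request: str, available: set[str]) -> list[str]:
    """Return the configured source types to query for a request."""
    words = set(request.lower().split())
    for category, keywords in CATEGORY_KEYWORDS.items():
        if words & keywords:
            wanted = CATEGORY_SOURCES[category]
            break
    else:
        wanted = ["web"]  # fall back to web search when nothing matches
    # Only sources the user has configured are eligible.
    return [source for source in wanted if source in available]
```

For example, `route("latest announcement on GPUs", {"web", "news"})` selects the news and web sources, while an academic-sounding request with only web search configured falls back to web search alone.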
Add a Web Search Source
Open your OpenClaw configuration and add a web search provider. Brave Search is recommended for its generous free tier.
Sign up at Brave Search API — the free tier gives you 2,000 queries/month.
Add a News Source
News sources keep your reports current. Add NewsAPI or an RSS aggregator.
RSS as a Free Alternative
If you don't have a NewsAPI key, you can add RSS feeds from industry publications. OpenClaw will parse them as a news source:
- name: tech-news-rss
  type: rss
  feeds:
    - https://techcrunch.com/feed/
    - https://feeds.arstechnica.com/arstechnica/index
  max_items: 20
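If you're curious what parsing a feed as a news source involves, the sketch below reads an RSS 2.0 document with Python's standard library and keeps at most `max_items` entries, mirroring the config key above. The `parse_rss` function and the inline sample feed are hypothetical illustrations, not OpenClaw internals, and Atom feeds would need different element names.

```python
import xml.etree.ElementTree as ET

# A tiny inline RSS 2.0 document so the example runs without network access.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First story</title><link>https://example.com/1</link></item>
    <item><title>Second story</title><link>https://example.com/2</link></item>
    <item><title>Third story</title><link>https://example.com/3</link></item>
  </channel>
</rss>"""


def parse_rss(xml_text: str, max_items: int = 20) -> list[dict]:
    """Extract title and link from the first max_items <item> elements."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall("./channel/item")[:max_items]:
        items.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
        })
    return items
```

Calling `parse_rss(SAMPLE_FEED, max_items=2)` returns only the first two stories, which is the effect a `max_items` cap has on a busy feed.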
Add an Academic Source (Optional)
For research that needs peer-reviewed data, connect Semantic Scholar.
Semantic Scholar's API is free without a key (100 requests/5 min). An API key bumps that to 1,000 requests/5 min.
Set Source Priorities
Tell OpenClaw which sources to prefer for different research categories.
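Conceptually, a priority setting is an ordered list of sources per research category. The Python sketch below shows one way such a table could be resolved against the sources you have configured. The category names, the `PRIORITIES` table, and every source name except `tech-news-rss` are assumptions made for illustration; they are not OpenClaw's actual configuration keys.

```python
# Hypothetical priority table: category -> source names in preferred order.
PRIORITIES = {
    "academic": ["semantic-scholar", "brave-search"],
    "current-events": ["newsapi", "tech-news-rss", "brave-search"],
}

# Assumed default for categories that have no explicit entry.
DEFAULT_ORDER = ["brave-search"]


def ordered_sources(category: str, configured: set[str]) -> list[str]:
    """Return configured sources for a category, most preferred first."""
    order = PRIORITIES.get(category, DEFAULT_ORDER)
    # Skip preferred sources the user has not set up.
    return [name for name in order if name in configured]
```

With only the RSS feed and web search configured, a current-events request would try `tech-news-rss` first and `brave-search` second, since the missing NewsAPI entry is simply skipped.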
Test Your Configuration
Verify everything is connected with a quick test.
openclaw research test-sources
Expected output:
You can add any REST API as a research source. Define the endpoint, headers, and a response mapping so OpenClaw knows how to extract results.
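A response mapping boils down to telling the tool where the result list lives in the JSON and which field inside each result holds the title, URL, and so on. The following Python sketch shows that idea with dotted paths. The function names, the dotted-path convention, and the sample response shape are hypothetical; OpenClaw's real mapping syntax may differ.

```python
from typing import Any


def get_path(data: Any, path: str) -> Any:
    """Follow a dotted path such as 'meta.href' through nested dicts."""
    for key in path.split("."):
        if not isinstance(data, dict) or key not in data:
            return None  # missing keys yield None instead of raising
        data = data[key]
    return data


def map_results(response: dict, results_path: str, fields: dict[str, str]) -> list[dict]:
    """Pull the result list out of a response and rename its fields."""
    raw = get_path(response, results_path) or []
    return [
        {name: get_path(entry, path) for name, path in fields.items()}
        for entry in raw
    ]


# Made-up response from an imaginary REST API.
response = {"data": {"hits": [{"headline": "A", "meta": {"href": "https://example.com/a"}}]}}
mapped = map_results(response, "data.hits", {"title": "headline", "url": "meta.href"})
```

Here `mapped` holds one normalized result with `title` and `url` keys, regardless of what the upstream API called those fields.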
Set global rate limits to prevent runaway API costs.
OpenClaw tracks usage and will fall back to secondary sources if a primary source hits its limit.
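The fallback behavior can be pictured as a per-source request budget plus an ordered list of sources to try. The Python sketch below uses a sliding-window counter for each source and picks the first source that still has budget. The `SourceBudget` class and `pick_source` function are hypothetical names for illustration, and the limits used are arbitrary; this is not how OpenClaw is known to implement usage tracking.

```python
import time
from typing import Optional


class SourceBudget:
    """Sliding-window request counter for a single source."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps: list[float] = []

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Record a request if the source is under its limit."""
        now = time.monotonic() if now is None else now
        cutoff = now - self.window_seconds
        # Forget requests that have aged out of the window.
        self.timestamps = [t for t in self.timestamps if t > cutoff]
        if len(self.timestamps) >= self.max_requests:
            return False
        self.timestamps.append(now)
        return True


def pick_source(ordered: list[tuple[str, SourceBudget]],
                now: Optional[float] = None) -> Optional[str]:
    """Return the first source with remaining budget, or None if all are exhausted."""
    for name, budget in ordered:
        if budget.try_acquire(now):
            return name
    return None
```

With a primary source limited to two requests per five minutes, the third call inside that window would be served by the secondary source, and the primary becomes eligible again once its earlier requests age out.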
Why does OpenClaw use a Source Router instead of querying all sources for every request?