How to Build an AI Research Agent with Claude Code in 2026
Step-by-step tutorial for building a multi-step AI research agent using only Claude Code, a project-level CLAUDE.md operating brief, and a tight permission allowlist. The example agent fetches web pages, extracts claims, cross-checks against a second source, and writes a structured Markdown report. Tested on Claude Sonnet as of April 2026.
Overview
Claude Code is Anthropic's official command-line interface for Claude. Although it is marketed primarily as a coding assistant, the CLI accepts arbitrary instructions and can read files, run shell commands, and call out to other tools, which makes it a practical scaffold for lightweight research agents. This tutorial builds a multi-step AI research agent using only Claude Code, a project-level CLAUDE.md, and a handful of permitted Bash commands.
The example agent fetches recent web pages on a topic, extracts key claims, cross-checks them against a second source, and writes a structured Markdown report. The same pattern extends to competitive intelligence, internal knowledge base updates, and audit-style reviews.
Prerequisites
- Claude Code installed and authenticated (run npm install -g @anthropic-ai/claude-code, then claude auth)
- An Anthropic account with API access or a Claude Pro/Max subscription that includes Claude Code
- curl and pandoc available on the PATH (used to fetch and clean web pages)
- A working directory dedicated to the agent's outputs
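Before going further, it can help to confirm that the command-line prerequisites are actually on the PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, not part of Claude Code, and it only checks that each command resolves, not that it is authenticated or up to date.

```shell
# Print the name of every listed tool that cannot be found on the PATH.
# Prints nothing when all tools are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Running `missing_tools claude curl pandoc` should print nothing on a machine that is ready for the rest of the tutorial.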
Step 1: Create the Project Skeleton
mkdir -p ~/agents/research-agent/{sources,reports}
cd ~/agents/research-agent
touch CLAUDE.md
mkdir .claude && touch .claude/settings.json
Step 2: Write CLAUDE.md
The CLAUDE.md file is the agent's standing brief. Claude reads it at the start of every session and treats its instructions as the project rulebook.
# Research Agent — Operating Brief
## Mission
Produce a balanced, citation-backed Markdown report on the topic the user names.
Reports go in `reports/<slug>.md`. Source material goes in `sources/<slug>/`.
## Method
1. List 3-5 candidate URLs that cover the topic from different angles.
2. Use `curl` + `pandoc -f html -t plain` to fetch each URL into `sources/<slug>/<n>.txt`.
3. Read each file. Extract claims as a bullet list.
4. For every claim, cite the source URL inline.
5. Cross-check load-bearing claims against a second source. Flag any disagreement.
6. Write `reports/<slug>.md` with: Summary, Key Claims, Disagreements, Open Questions, Sources.
## Rules
- Do not invent URLs. If a fetch fails, report the failure and pick another candidate.
- Always include the date the page was fetched ("retrieved YYYY-MM-DD").
- Keep tone encyclopedic; avoid marketing language.
- Stop and ask the user before making more than 8 outbound HTTP requests.
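Method step 2 in the brief can be tried by hand to see roughly what the agent will run. The sketch below is an assumption about one reasonable way to do it, not the exact commands Claude Code will choose: `source_path` and `fetch_page` are hypothetical helper names, the 30-second timeout is arbitrary, and the provenance header on the first line is an illustrative way to satisfy the "retrieved YYYY-MM-DD" rule.

```shell
# Build the path for the n-th source file of a report slug.
source_path() {
  printf 'sources/%s/%s.txt\n' "$1" "$2"
}

# Fetch one URL and store it as plain text with a provenance header.
# Usage: fetch_page <url> <slug> <n>
fetch_page() {
  local url="$1" slug="$2" n="$3"
  local out tmp status=0
  out="$(source_path "$slug" "$n")"
  tmp="$(mktemp)" || return 1
  mkdir -p "$(dirname "$out")"

  # -f: fail on HTTP errors, -sS: quiet but still print errors, -L: follow redirects
  if ! curl -fsSL --max-time 30 -o "$tmp" "$url"; then
    echo "fetch failed: $url" >&2
    rm -f "$tmp"
    return 1
  fi

  # Write the retrieval date first, then the HTML converted to plain text.
  {
    printf 'Source: %s (retrieved %s)\n\n' "$url" "$(date +%Y-%m-%d)"
    pandoc -f html -t plain "$tmp"
  } > "$out" || status=$?

  rm -f "$tmp"
  return "$status"
}
```

For example, `fetch_page https://example.com oss-workflow-engines 1` would write `sources/oss-workflow-engines/1.txt`. Downloading to a temporary file before converting keeps a failed fetch from leaving a half-written source file behind.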
Step 3: Lock Down Permissions
Put a tight allowlist in .claude/settings.json so the agent does not stop to ask before each safe command:
{
  "permissions": {
    "allow": [
      "Read",
      "Write",
      "Edit",
      "Bash(curl:*)",
      "Bash(pandoc:*)",
      "Bash(ls:*)",
      "Bash(mkdir:*)"
    ]
  }
}
Anything not on this list still prompts for approval, so the agent cannot run, for example, rm -rf without explicit consent.
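For anyone who sets up more than one agent, the skeleton from Step 1 and the allowlist from Step 3 can be bundled into one repeatable function. This is a sketch under the assumption that the same layout and the same allowlist are wanted every time; `init_agent` is a hypothetical name, and the function refuses to overwrite an existing settings file.

```shell
# Create the project skeleton and write the permission allowlist into <dir>.
# Usage: init_agent <dir>
init_agent() {
  local dir="$1"
  if [ -e "$dir/.claude/settings.json" ]; then
    echo "refusing to overwrite $dir/.claude/settings.json" >&2
    return 1
  fi
  mkdir -p "$dir/sources" "$dir/reports" "$dir/.claude"
  touch "$dir/CLAUDE.md"   # leaves an existing brief untouched
  cat > "$dir/.claude/settings.json" <<'EOF'
{
  "permissions": {
    "allow": [
      "Read",
      "Write",
      "Edit",
      "Bash(curl:*)",
      "Bash(pandoc:*)",
      "Bash(ls:*)",
      "Bash(mkdir:*)"
    ]
  }
}
EOF
}
```

Calling `init_agent ~/agents/research-agent` reproduces the directory layout used in this tutorial; CLAUDE.md still has to be filled in by hand.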
Step 4: Run the Agent
From the project directory:
claude "Research the current state of open-source workflow engines. Slug: oss-workflow-engines."
Claude Code reads CLAUDE.md, drafts the candidate URL list, fetches each page through the allowlisted curl, and writes the cleaned text to sources/oss-workflow-engines/. It then reads the files back, extracts claims, cross-checks the load-bearing ones, and writes reports/oss-workflow-engines.md.
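Because every run uses the same prompt shape, a small wrapper can keep the topic and slug consistent across runs. The helpers below are a sketch: `slugify` and `build_prompt` are hypothetical names, and the slug rule (lowercase, runs of non-alphanumeric characters collapsed to a hyphen) is an assumption, not something Claude Code requires.

```shell
# Turn a free-text topic into a lowercase, hyphen-separated slug.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Build the prompt in the same shape as the command above.
# Usage: build_prompt <topic> [slug]   (the slug defaults to slugify <topic>)
build_prompt() {
  local topic="$1"
  local slug="${2:-$(slugify "$1")}"
  printf 'Research %s. Slug: %s.\n' "$topic" "$slug"
}
```

The command from this step then becomes `claude "$(build_prompt "the current state of open-source workflow engines" "oss-workflow-engines")"`.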
Step 5: Iterate
Refine CLAUDE.md after each run. Common upgrades:
- Add a "Stop conditions" section that lists when to abort (paywalled site, geo-blocked content, contradictory data).
- Add an "Output schema" section with the exact Markdown headings expected, so reports stay comparable.
- Record the model and date at the top of each report so older outputs do not get mistaken for current ones.
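Once an output schema is fixed, a quick check after each run can catch reports that drifted from it. The sketch below assumes the five sections from the brief appear as level-2 Markdown headings (for example `## Summary`) and that retrieval dates use the literal form "retrieved YYYY-MM-DD"; both are assumptions about how the report is formatted, and `check_report` is a hypothetical helper name.

```shell
# Verify that a report contains the five expected sections and at least one
# retrieval date. Prints each problem found; returns non-zero if there is any.
# Usage: check_report <file>
check_report() {
  local file="$1" heading status=0
  for heading in "Summary" "Key Claims" "Disagreements" "Open Questions" "Sources"; do
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF "## $heading" "$file"; then
      echo "missing section: $heading"
      status=1
    fi
  done
  if ! grep -Eq 'retrieved [0-9]{4}-[0-9]{2}-[0-9]{2}' "$file"; then
    echo "no retrieval date found"
    status=1
  fi
  return "$status"
}
```

Running `check_report reports/oss-workflow-engines.md` after a run gives a fast signal on whether CLAUDE.md needs a stricter "Output schema" section.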
Cost Notes
A single research run on Claude Sonnet using the agent above typically reads 5-8 pages, runs ~15 tool calls, and produces a 600-900 word report. As of April 2026 that costs roughly $0.10-$0.30 in API spend per run on the Anthropic API. Pro and Max subscribers running the same prompt against the included quota pay nothing extra per run, subject to the rate limit on their plan.
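To turn the per-run figure into a budget, multiply it by the expected number of runs. The sketch below hard-codes the $0.10-$0.30 range quoted above, which is this article's April 2026 estimate and not a published price; `estimate_spend` is a hypothetical helper name.

```shell
# Print the estimated API spend range for a given number of runs,
# using the per-run estimate of $0.10 to $0.30 quoted in this section.
# Usage: estimate_spend <runs>
estimate_spend() {
  awk -v runs="$1" 'BEGIN { printf "$%.2f-$%.2f\n", runs * 0.10, runs * 0.30 }'
}
```

A weekly brief, for instance, is 52 runs a year, so `estimate_spend 52` prints `$5.20-$15.60`.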
See the Claude Code tool page for the current entitlement matrix and the How to Set Up Claude Code with VS Code tutorial for the editor-side companion. For agent platforms with a hosted runtime, see the Best AI Agent Tools ranking.
Editor's Note: We use a slightly more elaborate version of this agent at ShadowGen for client weekly intelligence briefs. The biggest practical lesson was forcing the agent to record retrieval dates inline; without that the same report could not be revised three months later because no one could tell which claims were stale. The cheapest model that still produced reliable cross-checks in our testing was Claude Sonnet, not Opus, mainly because cross-checking does not need long-form reasoning, just careful comparison.