Arxiv
Access arXiv, the open-access archive of more than 2 million scholarly articles in Physics, Mathematics, Computer Science, and more.
This tool allows your agents to search for preprints, summaries, and authors to ground their answers in scientific research.
💡 Core Concepts
The Arxiv tool is straightforward:
- Configuration: You can (optionally) set limits on how many results to return or how much text to fetch.
- Querying: The agent sends a search term to find relevant papers.
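Because configuration is optional, the smallest tool assignment is just the tool name. A minimal sketch, assuming that omitting the config key falls back to the defaults listed in the Configuration Reference and that the nesting matches the recipes in this guide:

```yaml
tools:
  tool_assigned:
    # No config block: assumed to fall back to the tool's default limits
    - name: Arxiv
```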
⚙️ Configuration Reference
Tool Configuration (config)
These parameters control the behavior of the arXiv wrapper and are set under the tool's config key in your YAML.
| Field | Type | Default | Description |
|---|---|---|---|
| top_k_results | int | 3 | The maximum number of papers to return for a single query. |
| load_max_docs | int | 3 | Same as top_k_results. Limits the number of documents loaded. |
| doc_content_chars_max | int | 4000 | The maximum number of characters to read from the abstract/summary. Useful for saving tokens. |
| load_all_available_meta | bool | False | If true, loads extra metadata (DOI, published date, etc.) beyond just the summary. |
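A sketch of a config block that sets all four fields together. The values are illustrative only, and the nesting under tool_assigned is assumed from the recipes in this guide:

```yaml
tools:
  tool_assigned:
    - name: Arxiv
      config:
        top_k_results: 5               # return at most 5 papers per query
        load_max_docs: 5               # kept equal to top_k_results
        doc_content_chars_max: 2000    # read at most 2000 characters of each summary
        load_all_available_meta: true  # also load extra metadata such as DOI and published date
```

Lowering doc_content_chars_max below its default trades detail for fewer tokens, while load_all_available_meta adds fields on top of the summary.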
Helper
Need to convert JSON payloads or configs to YAML? Check out our JSON to YAML Conversion Guide.
📚 Practical Recipes (Examples)
Recipe 1: Basic Research Assistant
Use Case: A general-purpose researcher that finds the top 3 papers.
create_vertical_agent_network:
  agent-1:
    agent_name: Researcher
    LLM_config:
      params:
        model: gpt-5-nano
    tools:
      tool_assigned:
        - name: Arxiv
          config:
            top_k_results: 3
    agent_function:
      - "Search Arxiv for papers on specific topics when asked."
Recipe 2: Deep Dive (More Content)
Use Case: When you need more details from the abstracts and more papers.
create_vertical_agent_network:
  agent-1:
    agent_name: DeepResearcher
    tools:
      tool_assigned:
        - name: Arxiv
          config:
            top_k_results: 10
            load_max_docs: 10
            doc_content_chars_max: 10000
🚑 Troubleshooting
- Empty Results
  - The query might be too specific or contain typos. arXiv search can be sensitive to exact wording.
  - Try broadening the search terms.
- Rate Limits
  - The arXiv API has rate limits. If you make too many requests in quick succession, you might be temporarily blocked. The wrapper handles some retries, but be careful when the tool is called repeatedly inside a loop.