Arxiv

Access the open-access archive of more than 2 million scholarly articles in Physics, Mathematics, Computer Science, and more.

This tool allows your agents to search for preprints, summaries, and authors to ground their answers in scientific research.

💡 Core Concepts

The Arxiv tool is straightforward:

  1. Configuration: You can (optionally) set limits on how many results to return or how much text to fetch.
  2. Querying: The agent sends a search term to find relevant papers.

⚙️ Configuration Reference

Tool Configuration (config)

These parameters control the behavior of the arXiv wrapper and are set under the tool's config key in your YAML.

| Field | Type | Default | Description |
|---|---|---|---|
| top_k_results | int | 3 | The maximum number of papers to return for a single query. |
| load_max_docs | int | 3 | Same as top_k_results. Limits the number of documents loaded. |
| doc_content_chars_max | int | 4000 | The maximum number of characters to read from the abstract/summary. Useful for saving tokens. |
| load_all_available_meta | bool | False | If true, loads extra metadata (DOI, published date, etc.) beyond just the summary. |
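
To make the defaults above concrete, here is a minimal Python sketch that merges a user-supplied config with the documented defaults and checks each value's type. The field names, types, and defaults come from the table; the `resolve_config` helper itself is hypothetical and is not part of the tool, which may validate its config differently.

```python
# Defaults and types copied from the configuration table.
DEFAULTS = {
    "top_k_results": 3,
    "load_max_docs": 3,
    "doc_content_chars_max": 4000,
    "load_all_available_meta": False,
}


def resolve_config(user_config):
    """Hypothetical helper: return the defaults overridden by user values."""
    resolved = dict(DEFAULTS)
    for key, value in user_config.items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown Arxiv config field: {key}")
        expected = type(DEFAULTS[key])
        # Compare exact types: bool is a subclass of int in Python,
        # so isinstance(True, int) would wrongly accept a bool here.
        if type(value) is not expected:
            raise TypeError(
                f"{key} expects {expected.__name__}, got {type(value).__name__}"
            )
        resolved[key] = value
    return resolved


# Only the overridden fields change; the rest keep their defaults.
print(resolve_config({"top_k_results": 10, "doc_content_chars_max": 10000}))
```

Any field you leave out of config keeps its default, so a recipe only needs to list the values it changes.
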
Helper

Need to convert JSON payloads or configs to YAML? Check out our JSON to YAML Conversion Guide.


📚 Practical Recipes (Examples)

Recipe 1: Basic Research Assistant

Use Case: A general-purpose researcher that finds the top 3 papers.

create_vertical_agent_network:
  agent-1:
    agent_name: Researcher
    LLM_config:
      params:
        model: gpt-5-nano
    tools:
      tool_assigned:
        - name: Arxiv
          config:
            top_k_results: 3
    agent_function:
      - "Search Arxiv for papers on specific topics when asked."

Recipe 2: Deep Dive (More Content)

Use Case: When you need more details from the abstracts and more papers.

create_vertical_agent_network:
  agent-1:
    agent_name: DeepResearcher
    tools:
      tool_assigned:
        - name: Arxiv
          config:
            top_k_results: 10
            load_max_docs: 10
            doc_content_chars_max: 10000

🚑 Troubleshooting

  • Empty Results

    • The query might be too specific or contain typos. arXiv search can be sensitive to exact wording.
    • Try broadening the search terms.
  • Rate Limits

    • The arXiv API has rate limits. If you make too many requests in quick succession, you might be temporarily blocked. The wrapper handles some retries, but avoid calling the tool repeatedly in tight agent loops.
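
If you drive searches from your own code rather than relying only on the wrapper's retries, exponential backoff is one common way to stay under a rate limit. The sketch below is illustrative only: `RateLimited`, `search_with_backoff`, and the `search` callable are hypothetical names, not part of the Arxiv tool or the arXiv API, and the delay values are arbitrary.

```python
import time


class RateLimited(Exception):
    """Hypothetical error standing in for a rate-limit response."""


def search_with_backoff(search, query, max_attempts=4, base_delay=1.0,
                        sleep=time.sleep):
    """Call search(query), doubling the wait after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return search(query)
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...


# Demo with a fake search that is rate limited twice, then succeeds.
# The sleep function is replaced by a recorder so the demo runs instantly.
attempts = {"count": 0}


def fake_search(query):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited()
    return [f"paper about {query}"]


waits = []
print(search_with_backoff(fake_search, "graph neural networks", sleep=waits.append))
print(waits)  # [1.0, 2.0]
```

Passing the sleep function as a parameter keeps the helper easy to test; in real use you would leave it at its default so the waits actually happen.
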