Tool use & agents
When the model stops just talking and starts doing
- Explain what “tool use” (a.k.a. function calling) means and when it changes the failure surface.
- Distinguish chat assistants, coding assistants, and autonomous agents.
- Reason about the marginal risks of giving an AI write access to your filesystem, shell, or APIs.
The agentic-AI landscape moves on a six-month cycle. Specific products, APIs, and capability claims here will be out of date soon. Treat the categories and risk patterns as durable. Verify any specific tool claim before relying on it.
Tool use in one paragraph
Modern LLMs can be given access to tools: functions they can call during a conversation, like run a shell command, query a database, or fetch a URL. The model decides when to call a tool, constructs the call, receives the result, and continues. This moves the system from “text predictor” to “loop that can act on the world”.
The spectrum (by blast radius)
- A chat assistant (pure text) has a low blast radius. Failure means a bad paragraph. Mitigation: read it.
- A coding assistant with file edits (Cursor, Claude Code, Copilot) has a moderate blast radius. Failure means bad diffs and broken local state. Mitigation: human review before commit, version control, sandboxed checkout.
- An agentic assistant with shell and network has a large blast radius. Failure means destructive commands, leaked credentials, runaway loops, and exfiltration. Mitigation: sandboxing, scoped permissions, explicit confirmations, dry-run modes, audit logs.
The blast-radius framing matters more than the marketing label. A “chat assistant” with a single shell tool is functionally an agent.
Implications for research workflows
- For analysis tasks, a coding assistant that reads and writes files is often the sweet spot. High productivity, low surprise surface.
- For long-running literature or data-collection tasks, agentic setups are tempting, but verify at every boundary and prefer human-in-the-loop confirmation for irreversible actions.
- Never give an agent credentials it doesn’t need. Never give an agent credentials whose blast radius you can’t bound.
- Treat any agent with write access to a shared system (cloud bucket, lab database, version-controlled repo with auto-deploy) as a junior collaborator who never sleeps, has no judgement, and submits to whoever’s prompt is loudest.
Pattern A: a minimal agent for bibliography cleanup
A postdoc has a .bib file with 80 entries, many with incorrect or missing DOIs, inconsistent journal abbreviations, and wrong years. She wants to verify and fix each entry against CrossRef. The task is tedious by hand; the AI can automate the lookups. This is an agent: it reads a file, calls an external API in a loop, and proposes writes.
The loop, at the level of what the human approves:
- Read. Load
references.biband parse it into a list of entries. - For each entry. Send the title and first author to the CrossRef
/worksendpoint and retrieve the best match above a similarity threshold. - Propose. Surface the diff (current vs. CrossRef) to the human:
DOI: missing → 10.1093/nar/gkab1049,year: 2020 → 2021. - Human confirms. The human reviews each proposed change and types
y,n, orskip. - Write. Only after explicit confirmation, overwrite the entry.
The key design decision is step 4. Without a human-confirm gate, the agent would silently correct entries that CrossRef matched wrong (common for short titles or names shared across multiple papers), overwrite your canonical abbreviation choices with CrossRef’s, and propagate errors faster than you can track them. The confirm gate turns an irreversible bulk operation into a reviewed sequence of small, reversible decisions. The agent’s value is not autonomy. It is eliminating tedious API lookups and leaving only judgement calls.
Pattern B: a coding assistant editing your repo
Cursor, Claude Code, and Copilot Chat all sit one step short of full agents: they propose file edits and they also propose the diff for human review before applying. The blast-radius argument applies in miniature.
A typical edit-with-confirm loop:
- You describe a change. “Refactor
validate_qc_outputsto take an optionalmin_cellsargument, default 1000.” - The assistant reads the relevant files. Your test file, the function, its callers.
- It proposes a diff. Added parameter, updated docstring, updated tests, updated one caller.
- You review the diff. Accept all, accept some, reject. The assistant cannot apply changes without your click.
- Tests run, locally or in CI.
The confirm gate here is the diff review. The asymmetry that makes it work: the AI is fast at producing many small edits, and you are fast at scanning a diff. Removing the gate (an “auto-apply” mode) turns a coding assistant into an unsupervised agent. That is the moment to re-evaluate the blast radius.
A failure mode specific to this pattern is an assistant that confidently rewrites a passing test to make a buggy implementation pass. The diff looks reasonable. The test no longer guards what it used to guard. The mitigation is to read the test changes, not just the implementation changes. That is exactly the Discernment move.
When not to use an agent
A loop is overhead. If the task does not benefit from the loop, do not use one.
- One-shot tasks. “Generate a regex for the date format” does not need a loop. A chat completion is enough.
- Irreversible actions without easy rollback. Submitting a manuscript, sending email, transferring funds, dropping database tables. Even with a confirm gate, the social or scientific cost of one wrong confirmation is too high. Keep these in human-only workflows.
- Tasks where the human-confirm gate fires every step. If you would have to approve every action anyway (reviewing every proposed edit to a manuscript), the agent is just a slower text editor with extra failure modes.
- High-trust, low-frequency tasks. Rare actions are exactly the ones whose patterns the AI has not memorised reliably. The cost of a bug is high. The speed-up from automation is small.
The right question is not “can I build an agent for this?”. Current frameworks make that easy. The right question is “does the loop save more human time than the human-confirm gates and verification overhead cost?”.
Further reading
- Anthropic. (2025). Building effective agents. A practical taxonomy of agentic patterns (augmented LLM, routing, parallelisation, orchestration). The blast-radius framing maps onto the orchestrator-subagent distinction.
- Weng, L. (2023). LLM-powered autonomous agents. Lilian Weng’s blog. A thorough technical walkthrough of memory, tool use, and planning in agent architectures.
- Yao, S., et al. (2023). ReAct: Synergizing reasoning and acting in language models. ICLR 2023. Established the think-act-observe loop underlying most current agent frameworks.
- Ruan, Y., et al. (2023). Identifying the risks of LM agents with an LM-emulated sandbox. arXiv:2309.15817. Empirical study of agent failure modes. Good grounding for the blast-radius framing above.
- A “chat assistant” is given a single tool:
run_shell_command. Why is the marketing category misleading? - You are using a coding assistant in edit-with-confirm mode. What is the most important thing to actually read in the proposed diff, and why?
- Name two task types where building an agent is the wrong move, and explain why for one of them.
Answers: 1. The blast radius is the right framing, not the label. A single shell tool gives the assistant the ability to take destructive, irreversible actions on the host system. Functionally, that’s an agent. 2. Read the test changes, not just the implementation. An assistant can “fix” a failing build by rewriting the test to match a buggy implementation. The diff looks reasonable but the test no longer guards what it used to. 3. Examples: irreversible actions without rollback (submitting a manuscript, sending email, where one wrong confirmation has a high social or scientific cost); tasks where the human would confirm every step anyway (the agent becomes a slower text editor with extra failure modes); one-shot tasks (a chat completion is enough; no loop needed).