App	What you say	What the agent does
Google Assistant	"Set a timer for 10 minutes"	Calls the clock API
Siri	"Text Mom I'll be late"	Calls the messaging API
ChatGPT with browsing	"Summarize today's news"	Calls the Bing search API
Claude Code	"Fix the failing test"	Reads files, edits code, runs pytest
GitHub Copilot	"Add error handling here"	Reads context, writes code inline

Deep Dive: What Does Claude Code Actually Do?

You type: "Fix the failing test in test_api.py"

Claude Code doesn't magically know the answer. Here's what the agent loop does:

Step	Action	Tool used
1	Read `test_api.py` to understand the test	`Read file`
2	Run `pytest test_api.py` to see the error	`Bash (shell)`
3	Read the error traceback	No tool (output of step 2)
4	Read `app.py` (the file under test) to find the bug	`Read file`
5	Edit line 42 of `app.py` to fix the bug	`Edit file`
6	Run `pytest test_api.py` again to verify	`Bash (shell)`
7	Tests pass → report success to user	No tool (done)

Seven steps, three different tools, zero human intervention. The LLM
decided what to do at every step based on what it saw.
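
The loop in the table can be sketched in a few lines of Python. This is a minimal illustration, not how Claude Code is actually implemented: `run_agent`, `scripted_decide`, and the three fake tools are hypothetical names, and a fixed script stands in for the LLM's decisions so the example runs without a model.

```python
# Minimal agent loop sketch. A fixed script stands in for the LLM, and the
# tools are fakes that only track whether the bug has been "fixed".
state = {"fixed": False}

def read_file(path):
    return f"<contents of {path}>"

def edit_file(path):
    state["fixed"] = True  # pretend the edit repaired the bug
    return f"edited {path}"

def bash(command):
    return "1 passed" if state["fixed"] else "1 failed"

TOOLS = {"read_file": read_file, "edit_file": edit_file, "bash": bash}

# Hypothetical plan mirroring the table: read, run, read, edit, re-run.
SCRIPT = [
    ("read_file", "test_api.py"),
    ("bash", "pytest test_api.py"),
    ("read_file", "app.py"),
    ("edit_file", "app.py"),
    ("bash", "pytest test_api.py"),
]

def scripted_decide(history):
    """Stand-in for the LLM. A real model would read the observations;
    this stub only counts them to find its place in the script."""
    step = len(history) - 1  # history[0] is the task itself
    return SCRIPT[step] if step < len(SCRIPT) else None

def run_agent(task, decide, tools, max_steps=10):
    history = [("task", task)]
    for _ in range(max_steps):
        action = decide(history)
        if action is None:  # nothing left to do: report back to the user
            break
        name, argument = action
        observation = tools[name](argument)  # run the chosen tool
        history.append((name, observation))  # the model sees this next turn
    return history

history = run_agent("Fix the failing test in test_api.py", scripted_decide, TOOLS)
```

After the run, `history` holds the task plus five tool observations: the first `bash` entry reports a failure and the last one reports a pass. Swapping `scripted_decide` for a call to a real model is the only change needed to turn the sketch into a working agent.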

Prompt	Think about it...
"Book me a flight from Ahmedabad to Delhi under ₹5000"	What APIs? What order?
"Summarize the top 3 news stories about AI today"	Search, scrape, summarize?
"Generate a bar chart of India's GDP from 2015 to 2024"	Search data, write code, run it?
"Review this pull request and leave comments"	Read diff, read code, write comments?

Piece	What it does	Analogy
LLM	Thinks, reasons, decides what to do next	The brain
Tools	Functions the LLM can call (calculator, search, code runner)	The hands
Loop	Keeps going until the task is done	The work ethic

Step	Who does it	What happens
1	You	Hand the model a menu of available tools
2	Model	Reads the question, decides which tool to call
3	Model	Outputs a structured request: `{"name": "get_weather", "args": {"city": "Gandhinagar"}}`
4	You	Execute the function, get the result
5	You	Feed the result back to the model
6	Model	Writes the final answer using the real data
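
Here is a rough sketch of steps 1, 3, 4, and 5 in code. The JSON request is the one from the table; everything else is an assumption made for illustration: the `TOOL_MENU` field names, the `execute_tool_call` helper, the shape of the returned message, and the canned weather data. Real model APIs each define their own schema for these.

```python
import json

def get_weather(city):
    """Hypothetical tool backed by canned data instead of a real weather service."""
    canned = {"Gandhinagar": "34 C, sunny"}
    return canned.get(city, "no data")

TOOLS = {"get_weather": get_weather}

# Step 1: the menu the model is shown (field names are an assumed schema).
TOOL_MENU = [
    {
        "name": "get_weather",
        "description": "Current weather for a city",
        "parameters": {"city": "string"},
    }
]

def execute_tool_call(raw_request):
    """Steps 4 and 5: run the requested function and wrap the result as a message."""
    request = json.loads(raw_request)      # parse the model's structured output
    function = TOOLS[request["name"]]      # look the tool up by name
    result = function(**request["args"])   # step 4: execute it
    # Step 5: this dict is what gets appended to the conversation for the model.
    return {"role": "tool", "name": request["name"], "content": result}

# Step 3: what the model emits (copied from the table).
model_output = '{"name": "get_weather", "args": {"city": "Gandhinagar"}}'
tool_message = execute_tool_call(model_output)
```

The model never runs `get_weather` itself. It only writes the request; your code does the executing and decides what goes back into the conversation.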

What the user asks	What the model calls
"I weigh 70 kg — in pounds?"	`convert_units(70, "kg", "pounds")`
"Convert 100°F to Celsius"	`convert_units(100, "fahrenheit", "celsius")`
"How far is 5 miles in km?"	`convert_units(5, "miles", "km")`

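One possible body for `convert_units`, covering only the units that appear in the table. The call signature comes from the table; the lookup-table design and the `_FACTORS` name are assumptions, and a real tool would support many more units.

```python
# Factors that convert each unit to a base unit for its dimension
# (kg for mass, km for length). Temperature is handled separately
# because it needs an offset, not just a scale factor.
_FACTORS = {
    "kg": ("mass", 1.0),
    "pounds": ("mass", 0.45359237),  # 1 lb is defined as 0.45359237 kg
    "km": ("length", 1.0),
    "miles": ("length", 1.609344),   # 1 mile is defined as 1.609344 km
}
_TEMPERATURES = {"celsius", "fahrenheit"}

def convert_units(value, from_unit, to_unit):
    if from_unit in _TEMPERATURES and to_unit in _TEMPERATURES:
        celsius = value if from_unit == "celsius" else (value - 32) * 5 / 9
        return celsius if to_unit == "celsius" else celsius * 9 / 5 + 32
    if from_unit not in _FACTORS or to_unit not in _FACTORS:
        raise ValueError(f"cannot convert {from_unit} to {to_unit}")
    from_dimension, from_factor = _FACTORS[from_unit]
    to_dimension, to_factor = _FACTORS[to_unit]
    if from_dimension != to_dimension:
        raise ValueError(f"cannot convert {from_unit} to {to_unit}")
    return value * from_factor / to_factor
```

Raising `ValueError` for unknown or mismatched units matters more here than in ordinary code: the error text is fed back to the model, which can then retry with corrected arguments.
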
Building AI Agents from Scratch

User question	Tool the model picks
"What's 17 squared?"	`calculate("17 ** 2")`
"Is it raining in Mumbai?"	`get_weather("Mumbai")`
"Convert 100°F to Celsius"	`convert_units(100, "fahrenheit", "celsius")`
"When did we learn about Docker?"	`search_notes("docker")`
"What is the capital of France?"	No tool — answers from memory

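The `calculate` tool receives an expression string written by the model, so passing it straight to `eval` would let a bad string run arbitrary code. A minimal sketch of a safer version, assuming the tool only needs basic arithmetic, walks the parsed expression with Python's `ast` module and rejects anything else. The helper names `_OPERATORS` and `_evaluate` are hypothetical.

```python
import ast
import operator

# Only these arithmetic operators are allowed; anything else is rejected.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def _evaluate(node):
    # Plain numbers are returned as-is (bools and strings are refused).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.operand))
    # Function calls, names, attribute access, etc. all end up here.
    raise ValueError("unsupported expression")

def calculate(expression):
    tree = ast.parse(expression, mode="eval")
    return _evaluate(tree.body)
```

With this version, `calculate("17 ** 2")` evaluates normally while something like `calculate("__import__('os')")` raises `ValueError`. It does not bound the size of the numbers, so a production tool would probably also cap exponents or add a timeout.
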
Step	Model decides	Tool called	Result
1	"I need sqrt(144)"	`calculate("144 ** 0.5")`	`12.0`
2	"I need Mumbai's temp"	`get_weather("Mumbai")`	`32°C`
3	"Convert 32°C to °F"	`convert_units(32, "celsius", "fahrenheit")`	`89.6°F`
4	"Add them"	`calculate("12.0 + 89.6")`	`101.6`
5	No tool needed	—	"The answer is 101.6"

Agent	What it does	Tools it uses
Claude Code	Writes, edits, and tests code from your terminal	File read/write, bash, grep, git
Cursor / Windsurf	AI-powered code editors	File system, LSP, terminal
Devin	Autonomous software engineer	Browser, terminal, code editor
Perplexity	Search engine with citations	Web search, web scrape
Google Deep Research	Multi-step research reports	Search, summarize, cite
OpenAI Operator	Uses websites on your behalf	Browser automation

Risk	Example	Guardrail
Infinite loop	Agent keeps calling tools forever	`max_steps` parameter
Wrong tool	Agent calls `delete_file` when it meant `read_file`	Require user confirmation for dangerous tools
Hallucinated args	`get_weather("Narnia")`	Validate arguments before execution
Cost explosion	Agent makes 1000 API calls	Budget / rate limiting
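
Three of these guardrails (confirmation, argument validation, and a call budget) can live in a thin wrapper around the tools, while `max_steps` belongs to the agent loop itself. The sketch below is one way to arrange that, not a standard API: `GuardedToolbox`, its parameters, and the `KNOWN_CITIES` whitelist are all hypothetical.

```python
class GuardedToolbox:
    """Wraps a dict of tools with three checks that run before any call."""

    def __init__(self, tools, validators=None, dangerous=(), max_calls=20, confirm=None):
        self.tools = tools
        self.validators = validators or {}
        self.dangerous = set(dangerous)
        self.max_calls = max_calls
        # Deny by default; pass a callback that actually asks the user.
        self.confirm = confirm or (lambda name, args: False)
        self.calls = 0

    def call(self, name, **args):
        if self.calls >= self.max_calls:  # guardrail: cost explosion
            raise RuntimeError("call budget exhausted")
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        validator = self.validators.get(name)
        if validator is not None and not validator(**args):  # guardrail: hallucinated args
            raise ValueError(f"invalid arguments for {name}: {args}")
        if name in self.dangerous and not self.confirm(name, args):  # guardrail: wrong tool
            raise PermissionError(f"{name} needs user confirmation")
        self.calls += 1  # only calls that actually run count against the budget
        return self.tools[name](**args)

KNOWN_CITIES = {"Mumbai", "Gandhinagar"}  # illustrative whitelist

toolbox = GuardedToolbox(
    tools={
        "get_weather": lambda city: f"weather for {city}",
        "delete_file": lambda path: f"deleted {path}",
    },
    validators={"get_weather": lambda city: city in KNOWN_CITIES},
    dangerous={"delete_file"},
    max_calls=2,
)
```

With this setup, `toolbox.call("get_weather", city="Narnia")` raises `ValueError`, `toolbox.call("delete_file", path="notes.txt")` raises `PermissionError` until a `confirm` callback approves it, and a third successful call raises `RuntimeError`. In an agent, each error message would be returned to the model as the tool result so it can change course.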

Task	What you'll do
Load Gemma 4	Run a 4-bit quantized model on a free Colab T4 GPU
Single tool call	Define a calculator tool, watch the model use it
Multi-tool agent	Give the model 4 tools, ask compound questions
Build your own tool	Write a new function, add it to the agent
Multi-step reasoning	Ask questions that need 3+ tool calls
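
For the "build your own tool" task, one convenient pattern is a decorator that registers a function and derives its menu entry from the signature and docstring. This is only a sketch of the idea, and the lab may wire tools up differently: `REGISTRY`, `tool`, `tool_menu`, and the `word_count` example are all hypothetical names.

```python
import inspect

REGISTRY = {}

def tool(func):
    """Register func so the agent can call it and describe it to the model."""
    REGISTRY[func.__name__] = {
        "function": func,
        "description": inspect.getdoc(func) or "",
        "parameters": list(inspect.signature(func).parameters),
    }
    return func  # the function itself is left unchanged

@tool
def word_count(text):
    """Count the whitespace-separated words in a piece of text."""
    return len(text.split())

def tool_menu():
    """Build the list of tool descriptions that would be shown to the model."""
    return [
        {"name": name, "description": entry["description"], "parameters": entry["parameters"]}
        for name, entry in REGISTRY.items()
    ]
```

Adding a new tool is then a matter of writing a function with a clear docstring and putting `@tool` above it. The docstring matters because it is the only description the model gets when choosing which tool to call.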