As AI agents become more capable, one of the most powerful features they gain is the ability to use tools (also known as function calling). This capability lets an AI model not just respond with text, but actively perform actions: fetching weather data, doing calculations, querying databases, or sending emails.
But how do AI agents decide what to do and when? Do they rely on hardcoded logic? Or can they autonomously choose actions based on your request?
In this blog post, we'll explore:
What function/tool calling is
How it works in GPT-4 and Gemini
How LangChain uses or simulates function calling
The difference between built-in function calling and LangChain's logic
Code examples to show both styles in action
🔮 What is Function Calling in LLMs?
Function calling (also called tool use) lets a language model call predefined functions when the input requires it.
Instead of just replying to:
"What's the weather in Chennai?"
...with text like:
"I'm not sure, but you can check weather.com"
...the model returns a structured request to call a weather function, with the arguments it inferred from the question.
You then run the function in your code and return the real answer back to the model.
🔧 Function Calling in GPT-4 (OpenAI)
OpenAI's GPT-4 (especially gpt-4-0613, gpt-4-1106-preview, and gpt-4o) supports native function calling: you define a function schema, and the model decides whether to call a function based on the user's input.
Example:
When a user asks a question, the model can respond with a function_call (or, in newer API versions, a tool call), which your app handles.
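A minimal sketch using the openai Python SDK's tools parameter (the live API call is commented out because it needs an API key; get_weather and the simulated tool call below are our own illustration, not part of the API):

```python
import json

# Hypothetical local function the model can "call".
def get_weather(city: str) -> str:
    # In a real app this would hit a weather API; stubbed for illustration.
    return f"31°C and sunny in {city}"

# Tool schema you send with the request (names here are our own choices).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "What's the weather in Chennai?"}],
#     tools=tools,
# )

# Instead of text, the model may answer with a tool call; you dispatch it:
tool_call = {"name": "get_weather", "arguments": json.dumps({"city": "Chennai"})}
result = get_weather(**json.loads(tool_call["arguments"]))
print(result)  # 31°C and sunny in Chennai
```

You would then append the result to the conversation as a tool message so the model can phrase the final answer.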
📈 Function Calling in Google Gemini (1.5 Pro)
Google Gemini 1.5 Pro also supports tool use via "function declarations". The flow is similar to OpenAI's, but the model returns structured function-call parts in its response, which you execute and feed back.
Highlights:
Tools are defined via JSON schemas
Gemini returns structured tool calls
You handle the tool execution and return results
This feature works with the Vertex AI SDK or Gemini API and is intended for advanced agent use cases.
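A sketch against the google-generativeai SDK (pip install google-generativeai); the API calls are commented out because they need an API key, and get_weather with its return value is our own illustration:

```python
# The SDK can build a function declaration from a plain Python function
# with type hints and a docstring.
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"31°C and sunny in {city}"

# import google.generativeai as genai
# genai.configure(api_key="YOUR_KEY")
# model = genai.GenerativeModel("gemini-1.5-pro", tools=[get_weather])
# chat = model.start_chat(enable_automatic_function_calling=True)
# print(chat.send_message("What's the weather in Chennai?").text)

# Handling the tool call yourself: Gemini returns a structured
# function-call part with a name and args, which you execute:
call = {"name": "get_weather", "args": {"city": "Chennai"}}
result = get_weather(**call["args"])
print(result)
```

With enable_automatic_function_calling, the SDK runs this execute-and-reply loop for you.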
💡 How LangChain Handles Tool Use
LangChain is a powerful Python framework for building LLM-based apps. It supports tool use in two ways:
1. Using the LLM's Native Function Calling
If you're using GPT-4 with function calling (or Gemini), LangChain can hand over tool selection to the model.
In this case:
GPT-4 decides which tool to call
LangChain sends tool schemas
LangChain executes the function and continues the chain
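Under the hood this is a dispatch loop. Here is a minimal plain-Python sketch of it (the LangChain calls are commented out since they need an installed provider and an API key; the multiply tool and the simulated tool_calls list are our own illustration):

```python
# The tool the model may choose to call.
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

TOOLS = {"multiply": multiply}

# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(model="gpt-4o").bind_tools(list(TOOLS.values()))
# ai_msg = llm.invoke("What is 5 times 10?")
# tool_calls = ai_msg.tool_calls  # model-chosen calls, as structured dicts

# Simulated model decision, then LangChain-style dispatch:
tool_calls = [{"name": "multiply", "args": {"a": 5, "b": 10}}]
for call in tool_calls:
    result = TOOLS[call["name"]](**call["args"])
    print(result)  # 50
```

The key point: the model picks the tool and fills in the arguments; your code only executes what it asked for.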
2. LangChain's Custom Logic (ReAct Prompting)
For models that don't support function calling, LangChain uses the ZERO_SHOT_REACT_DESCRIPTION agent type.
This method relies on prompt engineering and output parsing:
Thought: I need to calculate something.
Action: Calculator
Action Input: 5 * 10
LangChain parses this output, runs the correct function, and continues the loop.
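A toy version of that parse-and-dispatch step might look like this (the regexes and the calculator tool are our own simplification, not LangChain's actual parser):

```python
import re

# Toy "Calculator" tool: evaluates a vetted arithmetic expression.
def calculator(expr: str) -> str:
    if not re.fullmatch(r"[\d\s+\-*/().]+", expr):
        raise ValueError("unsupported expression")
    return str(eval(expr))

TOOLS = {"Calculator": calculator}

# Raw text the LLM produced, in the ReAct format shown above.
llm_output = """Thought: I need to calculate something.
Action: Calculator
Action Input: 5 * 10"""

# Parse out the action and its input, then run the matching tool.
action = re.search(r"Action:\s*(.+)", llm_output).group(1).strip()
action_input = re.search(r"Action Input:\s*(.+)", llm_output).group(1).strip()
observation = TOOLS[action](action_input)
print(observation)  # 50
```

The observation is appended back into the prompt, and the loop repeats until the model emits a final answer. Because this relies on the model sticking to the text format, it is more fragile than native function calling.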
🔢 Comparison Table
Native function calling: the model (GPT-4 or Gemini) picks the tool and returns a structured call; requires a model that supports it.
LangChain ReAct logic: prompt engineering plus output parsing; works with any text model, but depends on the model following the text format.
🤝 When Should You Use Each?
Use native function calling when your model supports it (GPT-4, Gemini 1.5 Pro) and you want structured, reliable tool calls.
Use LangChain's ReAct-style agents when your model has no native function calling, or when you want full control over the reasoning loop.
🚀 Final Thoughts
Tool use/function calling is what makes LLMs act like true agents. They go beyond just answering questions to actually doing things.
LangChain gives you the flexibility to:
Use the LLM's native function calling when available
Or build your own logic layer when it isn't
With the rise of GPT-4 Turbo and Gemini 1.5 Pro, now is the perfect time to start experimenting with tool-using agents.