Introduction
LLMs are all the rage, and tool calling has broadened the scope of large language models. Instead of generating only text, it enables LLMs to accomplish complex automation tasks that were previously impossible, such as dynamic UI generation, agentic automation, and more.
These models are trained on vast amounts of data, so they understand and can generate structured data, making them ideal for tool-calling applications that require precise outputs. This has driven the widespread adoption of LLMs in AI-driven software development, where tool calling, ranging from simple functions to sophisticated agents, has become a focal point.
In this article, you'll go from learning the fundamentals of LLM tool calling to implementing it to build agents using open-source tools.
Learning Objectives
- Learn what LLM tools are.
- Understand the fundamentals of tool calling and its use cases.
- Explore how tool calling works in OpenAI (Chat Completions API, Assistants API, parallel tool calling, and Structured Outputs), Anthropic models, and LangChain.
- Learn to build capable AI agents using open-source tools.
This article was published as a part of the Data Science Blogathon.
Tools are objects that allow LLMs to interact with external environments. These tools are functions made available to LLMs, which can be executed separately whenever the LLM determines that their use is appropriate.
A tool definition usually has three components:
- Name: A meaningful name for the function/tool.
- Description: A detailed description of the tool.
- Parameters: A JSON schema of the function/tool's parameters.
Tool calling enables the model to generate a response to a prompt that aligns with a user-defined schema for a function. In other words, when the LLM determines that a tool should be used, it generates a structured output that matches the schema for the tool's arguments.
For instance, if you have provided the schema of a get_weather function to the LLM and ask it for the weather in a city, instead of generating a text response, it returns a formatted set of function arguments, which you can use to execute the function and fetch the weather for that city.
Despite the name "tool calling," the model doesn't actually execute any tool itself. Instead, it produces a structured output formatted according to the defined schema. You can then supply this output to the corresponding function and run it on your end.
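For example, a tool call for the get_weather function described above might look like the following (the field names here follow OpenAI's Chat Completions format; the exact shape varies by provider):

{
  "name": "get_weather",
  "arguments": "{\"city\": \"New York\"}"
}

Note that "arguments" is a JSON-encoded string rather than a nested object, so you typically json.loads it before use.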
AI labs like OpenAI and Anthropic have trained their models so that you can provide the LLM with many tools and have it select the right one according to the context.
Each provider handles tool invocation and responses differently. Here's the general flow of how tool calling works when you pass a prompt and tools to the LLM:
- Define Tools and Provide a User Prompt
  - Define tools and functions with names, descriptions, and structured schemas for their arguments.
  - Also include the user-provided text, e.g., "What's the weather like in New York today?"
- The LLM Decides to Use a Tool
  - The assistant assesses whether a tool is needed.
  - If so, it halts text generation.
  - The assistant generates a JSON-formatted response containing the tool's parameter values.
- Extract Tool Input, Run Code, and Return Outputs
  - Extract the parameters provided in the function call.
  - Run the function with those parameters.
  - Pass the outputs back to the LLM.
- Generate Answers from Tool Outputs
  - The LLM uses the tool outputs to formulate a final answer, as shown in the sketch after this list.
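Provider specifics aside, the whole loop fits in a few lines of Python. This is a minimal, provider-agnostic sketch: the llm object, its generate and send_tool_outputs methods, and the shape of reply are hypothetical placeholders standing in for whichever SDK you use.

def get_weather(city: str) -> str:
    # Dummy implementation standing in for a real weather lookup
    return f"72°F and sunny in {city}"

TOOL_REGISTRY = {"get_weather": get_weather}  # tool name -> callable

def run_tool_loop(llm, prompt: str, tool_schemas: list) -> str:
    reply = llm.generate(prompt, tools=tool_schemas)    # hypothetical call
    while reply.tool_calls:                             # model requested tools
        outputs = []
        for call in reply.tool_calls:
            func = TOOL_REGISTRY[call.name]             # look up the function
            outputs.append(func(**call.arguments))      # execute it locally
        reply = llm.send_tool_outputs(outputs)          # hypothetical call
    return reply.text                                   # final text answer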
Example Use Cases
- Enabling LLMs to take action: Connect LLMs to external applications like Gmail, GitHub, and Discord to automate actions such as sending an email, pushing a PR, or sending a message.
- Providing LLMs with data: Fetch data from knowledge sources like the web, Wikipedia, and weather APIs to supply niche information to LLMs.
- Dynamic UIs: Update your application's UI based on user inputs.
Different model providers take different approaches to handling tool calls. This article discusses the tool-calling approaches of OpenAI, Anthropic, and LangChain. You can also use open-source models like Llama 3 and inference providers like Groq for tool calling.
Currently, OpenAI offers four different models (GPT-4o, GPT-4o-mini, GPT-4-turbo, and GPT-3.5-turbo), all of which support tool calling.
Let's understand it using a simple calculator function example.
def calculator(operation, num1, num2):
    if operation == "add":
        return num1 + num2
    elif operation == "subtract":
        return num1 - num2
    elif operation == "multiply":
        return num1 * num2
    elif operation == "divide":
        return num1 / num2
Create a tool-calling schema for the calculator function.
import openai

openai.api_key = OPENAI_API_KEY

# Define the function schema (this is what the model will use to understand how to call the function)
calculator_function = {
    "name": "calculator",
    "description": "Performs basic arithmetic operations",
    "parameters": {
        "type": "object",
        "properties": {
            "operation": {
                "type": "string",
                "enum": ["add", "subtract", "multiply", "divide"],
                "description": "The operation to perform"
            },
            "num1": {
                "type": "number",
                "description": "The first number"
            },
            "num2": {
                "type": "number",
                "description": "The second number"
            }
        },
        "required": ["operation", "num1", "num2"]
    }
}
A typical OpenAI function/tool-calling schema has a name, a description, and a parameters section. Inside the parameters section, you can provide the details of the function's arguments:
- Each property has a data type and a description.
- Optionally, an enum defines the specific values a parameter expects. Here, the "operation" parameter expects any of "add", "subtract", "multiply", and "divide".
- The required section lists the parameters the model must generate.
Now, use the defined schema of the function to get a response from the chat completions endpoint.
# Example of calling the OpenAI API with a tool
response = openai.chat.completions.create(
    model="gpt-4-0613",  # You can use any model that supports function calling
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 3 plus 4?"},
    ],
    functions=[calculator_function],
    function_call={"name": "calculator"},  # Instruct the model to call the calculator function
)

# Extract the function call and its arguments from the response
function_call = response.choices[0].message.function_call
name = function_call.name
arguments = function_call.arguments
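Here, arguments is a JSON string rather than a Python dict, so for the prompt above you'd expect something like '{"operation": "add", "num1": 3, "num2": 4}'.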
You can now pass the arguments to the calculator function to get an output.

import json

args = json.loads(arguments)
result = calculator(args['operation'], args['num1'], args['num2'])

# Output the result
print(f"Result: {result}")
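To complete the final step of the flow, you can pass the result back to the model so it phrases the answer in natural language. A minimal sketch, assuming the same legacy functions API used above (in that API, the "function" role carries the tool output):

# Return the function result to the model for a natural-language answer
followup = openai.chat.completions.create(
    model="gpt-4-0613",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 3 plus 4?"},
        {"role": "assistant", "content": None,
         "function_call": {"name": name, "arguments": arguments}},
        {"role": "function", "name": name, "content": str(result)},
    ],
    functions=[calculator_function],
)
print(followup.choices[0].message.content)  # e.g., "3 plus 4 equals 7."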
This is the simplest way to use tool calling with OpenAI models.
Using the Assistants API
You can also use tool calling with the Assistants API. This provides more freedom and control over the entire workflow, allowing you to build agents that accomplish complex automation tasks.
Here is how to use tool calling with the Assistants API, using the same calculator example.
from openai import OpenAI

client = OpenAI(api_key=OPENAI_API_KEY)

assistant = client.beta.assistants.create(
    instructions="You are a helpful assistant. Use the provided functions to answer questions.",
    model="gpt-4o",
    tools=[{
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Performs basic arithmetic operations",
            "parameters": {
                "type": "object",
                "properties": {
                    "operation": {
                        "type": "string",
                        "enum": ["add", "subtract", "multiply", "divide"],
                        "description": "The operation to perform"
                    },
                    "num1": {
                        "type": "number",
                        "description": "The first number"
                    },
                    "num2": {
                        "type": "number",
                        "description": "The second number"
                    }
                },
                "required": ["operation", "num1", "num2"]
            }
        }
    }]
)
Create a thread and a message.

thread = client.beta.threads.create()
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is 3 plus 4?",
)
Initiate a run.

run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
Retrieve the arguments and run the calculator function.

import json

arguments = run.required_action.submit_tool_outputs.tool_calls[0].function.arguments
args = json.loads(arguments)
result = calculator(args['operation'], args['num1'], args['num2'])
Loop through the required action section and collect the outputs in a list.

tool_outputs = []

# Loop through each tool call in the required action section
for tool in run.required_action.submit_tool_outputs.tool_calls:
    if tool.function.name == "calculator":
        tool_outputs.append({
            "tool_call_id": tool.id,
            "output": str(result)
        })
Submit the tool outputs to the API and generate a response.

# Submit the tool outputs to the API
client.beta.threads.runs.submit_tool_outputs_and_poll(
    thread_id=thread.id,
    run_id=run.id,
    tool_outputs=tool_outputs
)

messages = client.beta.threads.messages.list(
    thread_id=thread.id
)
print(messages.data[0].content[0].text.value)
This will output a response like `3 plus 4 equals 7`.
Parallel Function Calling
You can also use multiple tools simultaneously for more sophisticated use cases, for instance, getting the current weather at a location and the chance of precipitation there. To achieve this, use the parallel function calling feature.
Define two dummy functions and their schemas for tool calling.
from openai import OpenAI

client = OpenAI(api_key=OPENAI_API_KEY)

def get_current_temperature(location, unit="Fahrenheit"):
    return {"location": location, "temperature": "72", "unit": unit}

def get_rain_probability(location):
    return {"location": location, "probability": "40"}

assistant = client.beta.assistants.create(
    instructions="You are a weather bot. Use the provided functions to answer questions.",
    model="gpt-4o",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_current_temperature",
                "description": "Get the current temperature for a specific location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g., San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["Celsius", "Fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the user's location."
                        }
                    },
                    "required": ["location", "unit"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "get_rain_probability",
                "description": "Get the probability of rain for a specific location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g., San Francisco, CA"
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ]
)
Now, create a thread and initiate a run. Based on the prompt, this will output the required JSON schema of function parameters.

thread = client.beta.threads.create()
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What's the weather in San Francisco today, and how likely is it to rain?",
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
Parse the tool parameters and call the functions.

import json

location = json.loads(run.required_action.submit_tool_outputs.tool_calls[0].function.arguments)
weather = json.loads(run.required_action.submit_tool_outputs.tool_calls[1].function.arguments)

temp = get_current_temperature(location['location'], location['unit'])
rain_prob = get_rain_probability(weather['location'])

# Output the results
print(f"Result: {temp}")
print(f"Result: {rain_prob}")
Define a list to store the tool outputs.

# Define the list to store tool outputs
tool_outputs = []

# Loop through each tool call in the required action section
for tool in run.required_action.submit_tool_outputs.tool_calls:
    if tool.function.name == "get_current_temperature":
        tool_outputs.append({
            "tool_call_id": tool.id,
            "output": str(temp)
        })
    elif tool.function.name == "get_rain_probability":
        tool_outputs.append({
            "tool_call_id": tool.id,
            "output": str(rain_prob)
        })
Submit the tool outputs and generate an answer.

# Submit all tool outputs at once after gathering them in tool_outputs
if tool_outputs:
    try:
        run = client.beta.threads.runs.submit_tool_outputs_and_poll(
            thread_id=thread.id,
            run_id=run.id,
            tool_outputs=tool_outputs
        )
        print("Tool outputs submitted successfully.")
    except Exception as e:
        print("Failed to submit tool outputs:", e)
else:
    print("No tool outputs to submit.")

if run.status == 'completed':
    messages = client.beta.threads.messages.list(
        thread_id=thread.id
    )
    print(messages.data[0].content[0].text.value)
else:
    print(run.status)
The model will generate a complete answer based on the tools' outputs: `The current temperature in San Francisco, CA, is 72°F. There is a 40% chance of rain today.`
Refer to the official documentation for more.
Structured Outputs
Recently, OpenAI introduced Structured Outputs, which ensures that the arguments the model generates for a function call exactly match the JSON schema you provided. This feature prevents the model from generating incorrect or unexpected enum values, keeping its responses aligned with the specified schema.
To use Structured Outputs for tool calling, set strict: True. The API will pre-process the supplied schema and constrain the model to adhere strictly to it.
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    instructions="You are a weather bot. Use the provided functions to answer questions.",
    model="gpt-4o-2024-08-06",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_current_temperature",
                "description": "Get the current temperature for a specific location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g., San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["Celsius", "Fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the user's location."
                        }
                    },
                    "required": ["location", "unit"],
                    "additionalProperties": False
                },
                "strict": True
            }
        },
        {
            "type": "function",
            "function": {
                "name": "get_rain_probability",
                "description": "Get the probability of rain for a specific location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g., San Francisco, CA"
                        }
                    },
                    "required": ["location"],
                    "additionalProperties": False
                },
                "strict": True
            }
        }
    ]
)
The initial request will take a few seconds; subsequent tool calls, however, will reuse the cached schema artifacts.
Anthropic's Claude family of models is efficient at tool calling as well.
The workflow for calling tools with Claude is similar to OpenAI's. However, the crucial difference lies in how tool responses are handled. In OpenAI's setup, tool responses are managed under a separate role, whereas Claude's models include tool responses directly within the user role.
A typical tool definition in Claude includes the function's name, description, and input JSON schema.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature, either 'celsius' or 'fahrenheit'"
                    }
                },
                "required": ["location"]
            }
        },
    ],
    messages=[
        {
            "role": "user",
            "content": "What is the weather like in San Francisco?"
        }
    ]
)
print(response)
The function schema definition is similar to the one used in OpenAI's chat completions API, which we discussed earlier.
However, the response is what differentiates Claude's models from OpenAI's.
{
  "id": "msg_01Aq9w938a90dw8q",
  "model": "claude-3-5-sonnet-20240620",
  "stop_reason": "tool_use",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "I need to call the get_weather function, and the user wants SF, which is likely San Francisco, CA."
    },
    {
      "type": "tool_use",
      "id": "toolu_01A09q90qw90lq917835lq9",
      "name": "get_weather",
      "input": {"location": "San Francisco, CA", "unit": "celsius"}
    }
  ]
}
You can extract the arguments, execute the original function, and pass the output back to the LLM to get a text response that incorporates the information from the function call.
response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature, either 'celsius' or 'fahrenheit'"
                    }
                },
                "required": ["location"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "What's the weather like in San Francisco?"
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "I need to use get_weather, and the user wants SF, which is likely San Francisco, CA."
                },
                {
                    "type": "tool_use",
                    "id": "toolu_01A09q90qw90lq917835lq9",
                    "name": "get_weather",
                    "input": {"location": "San Francisco, CA", "unit": "celsius"}
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": "toolu_01A09q90qw90lq917835lq9",  # from the API response
                    "content": "65 degrees"  # from running your tool
                }
            ]
        }
    ]
)
print(response)
Here, you can observe that we passed the tool call's output under the user role.
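In practice, rather than hard-coding the tool_use block, you would extract it from the first response programmatically. A minimal sketch, where get_weather is a hypothetical local implementation of the tool:

# Find the tool_use block in Claude's first response
tool_use = next(block for block in response.content if block.type == "tool_use")

if tool_use.name == "get_weather":
    result = get_weather(**tool_use.input)  # hypothetical local function
    # tool_use.id and str(result) then populate the "tool_result" message shown above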
For more on Claude's tool calling, refer to the official documentation.
Here’s a comparative overview of tool-calling options throughout completely different LLM suppliers.
Managing multiple LLM providers can quickly become difficult when building complex AI applications. Hence, frameworks like LangChain provide a unified interface for handling tool calls across LLM providers.
Create a custom tool using the @tool decorator in LangChain.
from langchain_core.tools import tool

@tool
def add(a: int, b: int) -> int:
    """Adds a and b.

    Args:
        a: first int
        b: second int
    """
    return a + b

@tool
def multiply(a: int, b: int) -> int:
    """Multiplies a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

tools = [add, multiply]
Initialise an LLM.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
Use the bind_tools method to attach the defined tools to the LLM.

llm_with_tools = llm.bind_tools(tools)
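You can now invoke the tool-aware model as usual; LangChain parses any tool calls from the provider into a uniform tool_calls attribute on the response message:

# Tool calls are parsed into a provider-agnostic format
response = llm_with_tools.invoke("What is 3 multiplied by 12?")
print(response.tool_calls)
# e.g., [{'name': 'multiply', 'args': {'a': 3, 'b': 12}, 'id': '...'}]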
Sometimes, you may want to force the LLM to use certain tools, and many LLM providers allow this behaviour. To achieve it in LangChain, use:

always_multiply_llm = llm.bind_tools([multiply], tool_choice="multiply")

And if you want the model to call at least one of the provided tools:

always_call_tool_llm = llm.bind_tools([add, multiply], tool_choice="any")
Schema Definition Using Pydantic
You can also use Pydantic to define tool schemas. This is helpful when the tool has a complex schema.
from langchain_core.pydantic_v1 import BaseModel, Field

# Note that the docstrings here are crucial, as they are passed to
# the model along with the class name.
class add(BaseModel):
    """Add two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")

class multiply(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")

tools = [add, multiply]
Ensure detailed docstrings and clear parameter descriptions for optimal results.
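Note that, unlike functions decorated with @tool, these Pydantic classes carry no implementation: the model can select them, but you still dispatch the parsed arguments to real code yourself. A minimal sketch:

llm_with_tools = llm.bind_tools(tools)
msg = llm_with_tools.invoke("What is 11 plus 49?")

# Pydantic tools are schema-only, so map each call to real code manually
for call in msg.tool_calls:
    if call["name"] == "add":
        print(call["args"]["a"] + call["args"]["b"])  # 60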
Agents are automated programs powered by LLMs that interact with external environments. Instead of executing one action after another in a fixed chain, agents can decide which actions to take based on conditions.
Getting structured responses from LLMs to drive AI agents used to be tedious. Tool calling, however, has made getting the desired structured response fairly simple, and this key feature is now powering the AI agent revolution.
So, let's see how you can build a real-world agent, such as a GitHub PR reviewer, using the OpenAI SDK and an open-source toolset called Composio.
What is Composio?
Composio is an open-source tooling solution for building AI agents. To assemble complex agentic automation, it offers out-of-the-box integrations for applications like GitHub, Notion, Slack, etc. It lets you integrate tools with agents without worrying about complex app authentication methods like OAuth.
These tools can be used with LLMs and are optimized for agentic interactions, which makes them more reliable than simple function calls. They also handle user authentication and authorization.
You can use these tools with the OpenAI SDK, LangChain, LlamaIndex, etc.
Let's walk through an example where you build a GitHub PR review agent using the OpenAI SDK.
Install the OpenAI SDK and Composio.

pip install openai composio

Log in to your Composio account.

composio login
Add the GitHub integration by completing the integration flow.

composio add github
composio apps update

Enable a trigger to receive PRs when they are created.

composio triggers enable github_pull_request_event
Create a new file, import the libraries, and define the tools.

import os

from composio_openai import Action, ComposioToolSet
from openai import OpenAI
from composio.client.collections import TriggerEventData

composio_toolset = ComposioToolSet()

pr_agent_tools = composio_toolset.get_actions(
    actions=[
        Action.GITHUB_GET_CODE_CHANGES_IN_PR,  # For a given PR, gets all the changes
        Action.GITHUB_PULLS_CREATE_REVIEW_COMMENT,  # For a given PR, creates a comment
        Action.GITHUB_ISSUES_CREATE,  # If required, allows you to create issues on GitHub
    ]
)
Initialise an OpenAI instance and define a prompt.

openai_client = OpenAI()

code_review_assistant_prompt = (
    """
    You are an experienced code reviewer.
    Your task is to review the provided file diff and give constructive feedback.

    Follow these steps:
    1. Identify if the file contains significant logic changes.
    2. Summarize the changes in the diff in clear and concise English, within 100 words.
    3. Provide actionable suggestions if there are any issues with the code.

    Once you have decided on the changes for any TODOs, create a GitHub issue.
    """
)
Create an OpenAI assistant with the prompt and the tools.

# Give the assistant access to all the tools
assistant = openai_client.beta.assistants.create(
    name="PR Review Assistant",
    description="An assistant to help you with reviewing PRs",
    instructions=code_review_assistant_prompt,
    model="gpt-4o",
    tools=pr_agent_tools,
)
print("Assistant is ready")
Now, set up a webhook listener to receive the PR events fetched by the triggers, along with a callback function to process them.

## Create a trigger listener
listener = composio_toolset.create_trigger_listener()

## Triggers when a new PR is opened
@listener.callback(filters={"trigger_name": "github_pull_request_event"})
def review_new_pr(event: TriggerEventData) -> None:
    # Using the information from the trigger, execute the agent
    code_to_review = str(event.payload)
    thread = openai_client.beta.threads.create()
    openai_client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=code_to_review
    )

    ## Let's print our thread
    url = f"https://platform.openai.com/playground/assistants?assistant={assistant.id}&thread={thread.id}"
    print("Visit this URL to view the thread: ", url)

    # Execute the agent with integrations and start the run
    run = openai_client.beta.threads.runs.create(
        thread_id=thread.id, assistant_id=assistant.id
    )
    composio_toolset.wait_and_handle_assistant_tool_calls(
        client=openai_client,
        run=run,
        thread=thread,
    )

print("Listener started!")
print("Create a PR to get the review")
listener.listen()
Here is what's going on in the above code block:
- Initialize Listener and Define Callback: We define an event listener filtered by the trigger name and attach a callback function. The callback runs whenever the listener receives an event from the specified trigger, i.e., github_pull_request_event.
- Process PR Content: Extracts the code diffs from the event payload.
- Run Assistant Agent: Creates a new OpenAI thread and sends the code diffs to the model.
- Manage Tool Calls and Start Listening: Handles tool calls during execution and keeps the listener active for ongoing PR monitoring.
With this, you have a fully functional AI agent to review new PRs. Whenever a new pull request is raised, the webhook triggers the callback function, and the agent finally posts a summary of the code diffs as a comment on the PR.
Conclusion
Tool calling by large language models is at the forefront of the agentic revolution. It has enabled use cases that were previously impossible, such as letting machines interact with external applications as needed, dynamic UI generation, and more. Developers can build complex agentic automation by leveraging tools and frameworks like the OpenAI SDK, LangChain, and Composio.
Key Takeaways
- Tools are objects that let LLMs interface with external applications.
- Tool calling is the approach in which LLMs generate a structured schema of arguments for a required function based on the user's message.
- Major LLM providers such as OpenAI and Anthropic offer function calling with differing implementations.
- LangChain offers a unified API for tool calling across LLMs.
- Composio offers tools and integrations like GitHub, Slack, and Gmail for complex agentic automation.
Frequently Asked Questions
Q1. What are tools in the context of LLMs?
A. Tools are objects that let LLMs interact with external environments, such as code interpreters, GitHub, databases, the internet, etc.
Q2. What are LLMs?
A. LLMs, or large language models, are advanced AI systems designed to understand, generate, and respond to human language by processing vast amounts of text data.
Q3. What is tool calling?
A. Tool calling enables LLMs to generate the structured schema of function arguments as and when needed.
Q4. What are AI agents?
A. AI agents are systems powered by AI models that can autonomously perform tasks, interact with their environment, and make decisions based on their programming and the data they process.
The media shown in this article is not owned by Analytics Vidhya and is used at the author's discretion.