How to Build a ReAct AI Agent with Claude 3.5 and Python
We present a Reason+Act agent that iteratively reasons and gathers information from external tools before providing an answer.
Reason+Act (ReAct) agents[1] can perform complex reasoning tasks by combining chain-of-thought reasoning with access to external tools and the ability to iterate towards a solution.
A major component of the agent is the system prompt which defines the overall behaviour of the agent (we’ll see an example shortly).
Processing begins with a user prompt which requests a solution to a problem. The agent is directed by the system prompt to reason about the question and to select, as appropriate, an external tool that will help solve it.
The agent invokes the tool and, when it has received a response, decides if more processing is necessary. If it is, it can request help again. The agent iterates through reasoning and action (calling a tool) until it has solved the problem, and then returns a result to the user.
The process is illustrated in the diagram below.
Perhaps, the easiest way to explain how a ReAct agent works is with a simple case study. Below is an example response from an agent solving a simple arithmetic problem.
We start with the question “What is 20 * 15” and in response, the agent begins to ‘think out loud’.
What is 20 * 15
Thought: This is a simple multiplication problem that I can solve using the calculate action.
Action: calculate: 20 * 15
Observation: 300
Answer: The result of 20 * 15 is 300.
The agent’s response is preceded with the label Thought where it reasons about what it should do. In this case, it decides to use the calculate action to solve the problem (calculate is a tool we have provided to the agent).
The next response is preceded by the label Action where we see the agent using the calculate tool to retrieve an answer. Following this we see an Observation which is the response from the tool.
The final Answer is the result of the agent creating a response from the original question and the output from the calculate tool.
On the face of it, there is nothing terrifically unusual going on here — except that the agent is using a tool that we have provided.
And that is fundamental to a ReAct agent: the agent reasons and then acts using external tools.
Iteration
Where it gets more exciting is when the agent decides that it has not reached a conclusion and needs to go through another reason/act sequence. The agent can iterate through the reasoning/acting sequence until it comes to a valid conclusion.
The example above is trivial but shows you the basic sequence of events.
If we try a more complex query we can see the iteration in practice.
I asked the agent the following: “What is the result of multiplying the number of players in a basketball team by the number of players in a lacrosse team?”.
To answer this the agent needs to find out the number of players in each sport and multiply them together. It can use the calculate tool for multiplication but to find out about basketball and lacrosse it needs another tool — wikipedia. The wikipedia tool allows the agent to look anything up (from Wikipedia, obviously) and get a result.
Below we see the way that the agent goes through the thought-action-observation sequence three times before it comes up with a satisfactory answer.
To answer this question, I need to find out how many players are in a basketball team and a lacrosse team, and then multiply these numbers. Let’s start with basketball.
Thought: I’ll look up the number of players in a basketball team first.
Action: wikipedia: Basketball
Observation: Basketball is a team sport. Each team consists of five players on the court at a time, although a team may have substitute players on the bench.
Thought: Now that I know a basketball team has 5 players on the court, I need to find out how many players are in a lacrosse team.
Action: wikipedia: Lacrosse
Observation: Lacrosse is a team sport played with 10 players on each team on the field at a time: three attackmen, three midfielders, three defensemen, and one goalie. However, the number can vary in different versions of the game. For example, box lacrosse is played with fewer players.
Thought: Now that I have the number of players for both sports, I can calculate the result.
Action: calculate: 5 * 10
Observation: 50
Answer: The result of multiplying the number of players in a basketball team (5) by the number of players in a lacrosse team (10) is 50.
If you read through the output you can see how the agent reasons and consults the tools before it produces a result and, in doing that, builds up its knowledge as it goes.
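Before looking at the real implementation, the control flow behind a transcript like this can be simulated with a scripted stand-in for the LLM and canned tool results. This is a toy sketch; the scripted responses and stub tools below are invented purely for illustration:

```python
import re

# A scripted stand-in for the LLM: each call pops the next pre-written
# response, mimicking the basketball/lacrosse session above.
scripted_responses = [
    "Thought: I'll look up basketball first.\nAction: wikipedia: Basketball",
    "Thought: Now lacrosse.\nAction: wikipedia: Lacrosse",
    "Thought: I can calculate the result.\nAction: calculate: 5 * 10",
    "Answer: The result is 50.",
]

def fake_llm(prompt_text):
    return scripted_responses.pop(0)

# Stub tools returning canned observations
def wikipedia(q):
    return {"Basketball": "five players", "Lacrosse": "10 players"}[q]

def calculate(expr):
    return eval(expr)

tools = {"wikipedia": wikipedia, "calculate": calculate}
action_re = re.compile(r'^Action: (\w+): (.*)$')

next_prompt = "Question: basketball players times lacrosse players?"
for _ in range(5):  # max_turns
    result = fake_llm(next_prompt)
    print(result)
    actions = [m for m in (action_re.match(line) for line in result.split('\n')) if m]
    if not actions:
        break  # no action requested: we have an Answer
    action, action_input = actions[0].groups()
    observation = tools[action](action_input)
    next_prompt = f"Observation: {observation}"
```

Each turn, the driver looks for an Action line, runs the matching tool, and feeds the Observation back in; when no Action is found, the Answer has been reached.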
Very little code
There is surprisingly little code needed to implement a ReAct agent because much of the work is done in the prompt.
So let’s look at that first.
prompt = """
You run in a loop of Thought, Action, Observation, Answer.
At the end of the loop you output an Answer
Use Thought to describe your thoughts about the question you have been asked.
Use Action to run one of the actions available to you.
Observation will be the result of running those actions.
Answer will be the result of analysing the Observation
Your available actions are:
calculate:
e.g. calculate: 4 * 7 / 3
Runs a calculation and returns the number - uses Python so be sure to use floating point syntax if necessary
wikipedia:
e.g. wikipedia: Django
Returns a summary from searching Wikipedia
Always look things up on Wikipedia if you have the opportunity to do so.
Example session:
Question: What is the capital of France?
Thought: I should look up France on Wikipedia
Action: wikipedia: France
You should then call the appropriate action and determine the answer from
the result
You then output:
Answer: The capital of France is Paris
"""
The first part of the prompt tells the LLM how to approach a problem, looping through the labelled parts of the process that we’ve seen previously.
The actions calculate and wikipedia are then described, followed by an example session.
This prompt is a simple example and most definitely for demonstration purposes only. Very clearly, the prompt is designed for this specific application: the description of the tools is hard-coded, whereas in a production system you would want to be able to build these programmatically. (We will also see shortly that calculate is implemented using the Python function eval(). This is a fairly dodgy thing to do as it is very vulnerable to code-injection attacks.)
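As a sketch of what "programmatically" might look like, the tool section of the prompt could be assembled from a list of tool descriptions. The helper and the entries below are illustrative assumptions on my part, not part of the original code:

```python
# Sketch: build the tool section of the system prompt from data,
# rather than hard-coding it. Names, examples and descriptions here
# are illustrative only.
def tool_description(name, example, doc):
    return f"{name}:\ne.g. {name}: {example}\n{doc}\n"

tools = [
    ("calculate", "4 * 7 / 3",
     "Runs a calculation and returns the number"),
    ("wikipedia", "Django",
     "Returns a summary from searching Wikipedia"),
]

tool_section = "Your available actions are:\n\n" + "\n".join(
    tool_description(*t) for t in tools
)
print(tool_section)
```

Registering a new tool then only requires adding an entry to the list, keeping the prompt and the dispatch table in step.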
It is at this point that I should acknowledge the work of Simon Willison, on which much of this code is based — see A simple Python implementation of the ReAct pattern for LLMs | Simon Willison’s TILs — the code is Apache 2 licensed.
The fundamental structure of the code and the prompt is Simon’s but I have modified it to use Anthropic’s Claude 3.5 Sonnet LLM and simplified both code and prompt a little.
Claude 3.5
Claude 3.5 Sonnet is a recent release from Anthropic and, in their release announcement, they state: “Claude 3.5 Sonnet raises the industry bar for intelligence, outperforming competitor models and Claude 3 Opus on a wide range of evaluations, with the speed and cost of our mid-tier model, Claude 3 Sonnet”.
To put that into context, Anthropic have three versions of their Claude LLM, ranging from the cheaper, faster, Haiku model, to the cleverer but more expensive Opus. Sonnet is the in-between model and is the only one to have a version 3.5 release, at the time of writing.
I’ve been meaning to look at Claude for some time and this looked like a good opportunity to do so.
The code begins with the required imports:
import anthropic
import re
import httpx
There’s not a lot there: anthropic, of course, for the LLM; re for regular expressions; and httpx, an HTTP library used to access Wikipedia.
To run the code you will, of course, need an account with Anthropic, and they will charge you for using it. It’s pretty cheap, though: Claude 3.5 Sonnet is, I believe, cheaper than previous versions as well as being more powerful — I’ve run the code here (and quite a lot more) multiple times and have only been charged a few tens of cents.
The code begins by creating a client and then defining a Python class that will implement a chatbot. Unlike OpenAI, Claude keeps the system prompt separate from the user and assistant messages, and we initialise it when instantiating ChatBot. The __call__ method takes care of storing the user messages and the responses from the chatbot, and calls execute to run the agent.
client = anthropic.Anthropic(api_key="YOUR API KEY HERE")

class ChatBot:
    def __init__(self, system=""):
        self.system = system
        self.messages = []

    def __call__(self, message):
        self.messages.append({"role": "user", "content": message})
        result = self.execute()
        self.messages.append({"role": "assistant", "content": result})
        return result

    def execute(self):
        message = client.messages.create(
            model="claude-3-5-sonnet-20240620",
            max_tokens=1000,
            temperature=0,
            system=self.system,
            messages=self.messages
        )
        # return the text of the response, not the raw content block
        return message.content[0].text
The next step is to define a query() function that uses an instance of the chatbot. The query function implements a loop that continues until there are no more actions (or we’ve reached the maximum number of iterations). The function detects an action, and extracts its name and input, using a regex. The actions are called and, when there are no more, the chatbot messages are returned.
action_re = re.compile(r'^Action: (\w+): (.*)$')

def query(question, max_turns=5):
    i = 0
    bot = ChatBot(prompt)
    next_prompt = question
    while i < max_turns:
        i += 1
        result = bot(next_prompt)
        print(result)
        actions = [action_re.match(a) for a in result.split('\n') if action_re.match(a)]
        if actions:
            # There is an action to run
            action, action_input = actions[0].groups()
            if action not in known_actions:
                raise Exception("Unknown action: {}: {}".format(action, action_input))
            print(" -- running {} {}".format(action, action_input))
            observation = known_actions[action](action_input)
            print("Observation:", observation)
            next_prompt = "Observation: {}".format(observation)
        else:
            return bot.messages
    # max_turns exhausted: return whatever we have
    return bot.messages
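As a quick sanity check, the action-parsing regex can be exercised on its own against a typical line of agent output:

```python
import re

action_re = re.compile(r'^Action: (\w+): (.*)$')

# A typical Action line from the agent's response
line = "Action: calculate: 20 * 15"
match = action_re.match(line)
action, action_input = match.groups()
print(action, action_input)  # calculate 20 * 15
```

The first group captures the tool name and the second captures everything after the second colon, which is passed to the tool unchanged.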
We now need to define the action functions (tools) wikipedia and calculate, and store references to them in a dictionary.
def wikipedia(q):
    return httpx.get("https://en.wikipedia.org/w/api.php", params={
        "action": "query",
        "list": "search",
        "srsearch": q,
        "format": "json"
    }).json()["query"]["search"][0]["snippet"]

def calculate(what):
    # Note: eval() is convenient but unsafe with untrusted input
    return eval(what)

known_actions = {
    "wikipedia": wikipedia,
    "calculate": calculate
}
Next is a utility function that prints out the last message generated by the bot. It takes the list of messages returned by query().

def get_last_message(messages):
    content = messages[-1]['content']
    # execute() returns plain text; cope with a raw content block too
    text = content if isinstance(content, str) else content[0].text
    for m in text.split('\n'):
        print(m)

And, after all that, we can start using the agent.

messages = query("What is 20 * 15")
get_last_message(messages)

The code above results in the very first example we saw at the top of the article, and the code below shows another example of the way that the ReAct agent can iteratively work through various actions to come to a valid conclusion.

Here is another example where the agent must iterate through calls to a tool. In this case, it only uses Wikipedia but it must intelligently analyse the responses to come up with an answer.

messages = query("What languages are spoken in Spain that are also spoken in France")
get_last_message(messages)
What languages are spoken in Spain that are also spoken in France?
Thought: To answer this question accurately, I should look up information about languages spoken in both Spain and France. I’ll start by checking Wikipedia for information about languages in Spain.
Action: wikipedia: Languages of Spain
Observation: Spain has a number of languages and dialects spoken within its borders. The official language of Spain is Spanish (also called Castilian), which is spoken by the vast majority of Spaniards. However, there are several co-official languages in various autonomous communities:
Catalan: Spoken in Catalonia, the Balearic Islands, and parts of the Valencian Community (where it’s known as Valencian).
Galician: Spoken in Galicia.
Basque: Spoken in the Basque Country and parts of Navarre.
Occitan (Aranese): Spoken in the Aran Valley of Catalonia.
Other recognized languages include Aragonese and Asturian. There are also several dialects of Spanish spoken in different regions.
Thought: Now that I have information about languages in Spain, I should check for languages in France to see which ones overlap.
Action: wikipedia: Languages of France
Observation: France’s official language is French, which is the first language of 88% of the population. However, there are several regional languages and dialects spoken in France:
Occitan: Spoken in southern France.
Breton: A Celtic language spoken in Brittany.
Alsatian: A Germanic dialect spoken in Alsace.
Corsican: Spoken on the island of Corsica.
Catalan: Spoken in the Pyrénées-Orientales department.
Basque: Spoken in the French Basque Country.
Other regional languages include Franco-Provençal, Lorraine Franconian, and various Oïl languages.
Thought: Comparing the languages spoken in Spain and France, I can identify the overlapping languages.
Answer: The languages spoken in Spain that are also spoken in France are:
Catalan: Spoken in parts of Spain (Catalonia, Balearic Islands, and Valencia) and in the Pyrénées-Orientales department of France.
Basque: Spoken in the Basque Country and parts of Navarre in Spain, and in the French Basque Country.
Occitan: Known as Aranese in the Aran Valley of Catalonia, Spain, and spoken in southern France.
These three languages are recognized and spoken in both countries, although their official status and the number of speakers may vary between Spain and France.
The response above demonstrates the sensible process that the agent goes through in order to come up with a suitable answer.
I’ve also tried more complex questions which the agent seems to deal with quite happily (you can see more examples in the downloadable code).
Conclusion
The code presented here illustrates the way a ReAct agent works. It is not robust and is in no way suitable for production purposes. But I hope that you now have a fairly good understanding of the principles of ReAct agents and how they may be implemented.
Please try out the code with more complex examples and, maybe, add more tools. I would be very interested to hear about your experiments.
Of course, there are simpler solutions using Langchain and Llamaindex (for example) and I shall probably look at these in a future article. However, their abstracted implementations hide the inner workings of the agent, something that I hope this article and Simon’s code have made clear.
All code and examples are available in my GitHub repo (alanjones2/claudeapps) — feel free to download them.
To see more articles, follow me on Medium, or subscribe to this Substack.
Older articles are listed on my webpage.
Notes and references
[1] Shunyu Yao, et al., ReAct: Synergizing Reasoning and Acting in Language Models, The Eleventh International Conference on Learning Representations (ICLR), 2023. (Retrieved from https://arxiv.org/pdf/2210.03629, 27/06/2024.)
Simon Willison’s blog is here.
All images and illustrations are by me, the author, unless otherwise noted.