Ever wished you had a personal tutor to help you solve tough math problems? In this article, we’ll explore how to build a math problem solver chat app using LangChain, Gemma 2 9B, Llama 3.2 Vision, and Streamlit. The app will not only understand and solve text-based math problems but also solve image-based questions. Let’s look at the problem statement and explore how to approach and solve this problem step by step.
Learning Outcomes
- Learn to create a powerful, interactive chat app using LangChain to integrate external tools and solve tasks.
- Master the process of building a chat app with LangChain that can efficiently solve complex math problems.
- Explore the use of APIs and environment variables to securely interact with large language models.
- Gain hands-on experience in designing a user-friendly web app with dynamic question-solving capabilities.
- Discover techniques for seamless interaction between frontend interfaces and backend AI models.
This article was published as a part of the Data Science Blogathon.
Defining the Challenge: Business Case and Objectives
We’re an EdTech company looking to develop an innovative AI-powered application that can solve both text-based and image-based math problems in real time. The app should provide solutions with step-by-step explanations to enhance learning and engagement for students, educators, and independent learners.
We’re tasking you to design and build this application using the latest AI technologies. The app must be scalable, user-friendly, and capable of processing both textual inputs and images with a seamless experience.
Proposed Solution: Approach and Implementation Strategy
The proposed solution combines the following components:
Gemma2-9B IT
It’s an open-source large language model from Google designed to process and generate human-like text. In this application:
- Role: It serves as the “brain” for solving math problems presented in text format.
- How It Works: When a user inputs a text-based math problem, Gemma2-9B interprets the question, applies the necessary mathematical logic, and generates a solution.
Llama 3.2 Vision
It’s an open-source model from Meta AI, capable of processing and analyzing images, including handwritten or printed math problems.
- Role: Enables the app to “see” and interpret math problems provided in image format and generate a response.
- How It Works: When users upload an image, the Llama 3.2 Vision model identifies the mathematical expressions or questions within it and converts them into a format suitable for problem-solving.
LangChain
It’s a framework designed for building applications that involve interactions between language models and external systems.
- Role: Acts as the intermediary between the app’s interface and the AI models, managing the flow of information.
- How It Works:
  - It coordinates how the user’s input (text or image) is processed.
  - It ensures the smooth exchange of information between Gemma2-9B, the Llama 3.2 Vision model, and the app interface.
Streamlit
It’s an open-source Python library for creating interactive web applications quickly and easily.
- Role: It’s used to write the frontend in Python.
- How It Works:
  - Developers can use Streamlit to design and deploy a web interface where users enter text or upload images.
  - The interface interacts with LangChain and the underlying AI models to display results.
Visualizing the Approach: Flow Diagram of the Solution
The process begins by setting up the environment, checking the Groq API key, and configuring the Streamlit page settings. It then initializes the text LLM (ChatGroq) and integrates tools like Wikipedia and a Calculator to enhance the text agent’s capabilities. A welcome message and sidebar navigation guide the user through the interface, where they can enter either text or image-based queries. The text section collects user questions and processes them using the text agent, which uses the LLM and external tools to generate answers. Similarly, for image queries, the image section lets users upload images, which are then processed by the image-specific LLM (ChatGroq).
Once the text or image query is processed, the respective agent generates and displays the answer. The system switches between handling text and image queries depending on the section the user selects. After displaying the answer, the process concludes, and the system is ready for the next query. This flow creates a multi-modal experience where users can ask both text and image-based questions.
Setting Up the Foundation
Setting up the foundation is a crucial step in ensuring a seamless integration of tools and processes, laying the groundwork for the successful operation of the system.
Environment Setup
First things first, set up your development environment. Make sure you have Python installed and create a virtual environment to keep your project dependencies organized.
# Create an environment
python -m venv env
# Activate it on Windows
.\env\Scripts\activate
# Activate it on macOS/Linux
source env/bin/activate
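To confirm that the virtual environment is actually active before installing anything, a quick check from Python can help. The following is a minimal sketch; the helper name `in_virtualenv` is hypothetical and not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
```

Running this right after activation should report that the environment is active; if it does not, the activation command for your platform probably did not take effect in the current shell.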
Install Dependencies
Install the necessary libraries using:
pip install -r https://raw.githubusercontent.com/Gouravlohar/Math-Solver/refs/heads/master/requirements.txt
Get the Groq API Key
- To access the Llama and Gemma models, we will use Groq.
- Get your free API key from here.
Import Necessary Libraries
import streamlit as st
import os
import base64
from dotenv import load_dotenv
from langchain_groq import ChatGroq
from langchain.chains import LLMMathChain, LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.utilities import WikipediaAPIWrapper
from langchain.agents.agent_types import AgentType
from langchain.agents import Tool, initialize_agent
from langchain_community.callbacks.streamlit import StreamlitCallbackHandler
from groq import Groq
Together, these imports set up the libraries and modules needed to create a Streamlit web application that interacts with language models to solve mathematical problems and answer questions based on text and image inputs.
Load Environment Variables
load_dotenv()
groq_api_key = os.getenv("GROQ_API_KEY")
if not groq_api_key:
    st.error("Groq API Key not found in .env file")
    st.stop()
This section of the code loads the environment variables and ensures that the required Groq API key is available. If the key is missing, the app shows an error and stops.
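The same "fail early if the key is missing" idea can be tried outside Streamlit. The sketch below is an illustration only: `require_env` is a hypothetical helper, and it raises an exception where the app instead shows an error and stops. It assumes the key has already been placed in the process environment, for example by `load_dotenv()` reading a `GROQ_API_KEY=...` line from a `.env` file.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or shell environment")
    return value


if __name__ == "__main__":
    # Placeholder value for demonstration only; a real key would come from .env.
    os.environ.setdefault("GROQ_API_KEY", "placeholder-key")
    key = require_env("GROQ_API_KEY")
    print("Key loaded, length:", len(key))
```

Treating an empty string the same as a missing variable mirrors the `if not groq_api_key` check in the app.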
Set Up Both LLMs
st.set_page_config(page_title="Math Solver", page_icon="👨‍🔬")
st.title("Math Solver")
llm_text = ChatGroq(model="gemma2-9b-it", groq_api_key=groq_api_key)
llm_image = ChatGroq(model="llama-3.2-90b-vision-preview", groq_api_key=groq_api_key)
This section of the code sets up the Streamlit application by configuring its page title and icon. It then initializes two language models (LLMs) through Groq: llm_text for handling text-based questions using the “gemma2-9b-it” model, and llm_image for handling questions that include images using the “llama-3.2-90b-vision-preview” model. Both models are authenticated using the previously retrieved Groq API key.
Initialize Tools and Prompt Template
wikipedia_wrapper = WikipediaAPIWrapper()
wikipedia_tool = Tool(
    name="Wikipedia",
    func=wikipedia_wrapper.run,
    description="A tool for searching the Internet to find various information on the topics mentioned."
)
math_chain = LLMMathChain.from_llm(llm=llm_text)
calculator = Tool(
    name="Calculator",
    func=math_chain.run,
    description="A tool for solving mathematical problems. Provide only the mathematical expressions."
)
prompt = """
You are a mathematical problem-solving assistant tasked with helping users solve their questions. Arrive at the solution logically, providing a clear and step-by-step explanation. Present your response in a structured point-wise format for better understanding.
Question: {question}
Answer:
"""
prompt_template = PromptTemplate(
    input_variables=["question"],
    template=prompt
)
# Wrap the prompt and the text LLM in a chain for reasoning questions
chain = LLMChain(llm=llm_text, prompt=prompt_template)
reasoning_tool = Tool(
    name="Reasoning Tool",
    func=chain.run,
    description="A tool for answering logic-based and reasoning questions."
)
# Initialize the agent for text questions
assistant_agent_text = initialize_agent(
    tools=[wikipedia_tool, calculator, reasoning_tool],
    llm=llm_text,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=False,
    handle_parsing_errors=True
)
This part of the code initializes the tools and configuration required to handle text-based questions in the Streamlit application. It sets up a Wikipedia search tool using WikipediaAPIWrapper, which lets the application fetch information from the internet, and a Calculator tool built on the LLMMathChain class, which uses the llm_text model to evaluate mathematical expressions. It also defines a prompt template that structures questions and expected answers in a clear, step-by-step manner and wraps it in a Reasoning Tool. Finally, initialize_agent combines the three tools into a zero-shot ReAct agent that decides which tool to call for each user query.
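To see what text the model receives once a question is substituted into the template, the substitution can be reproduced with plain `str.format`, without calling any model. This is a sketch under the assumption that, for a single simple placeholder like this one, the result matches what the prompt template produces; `render_prompt` is a hypothetical helper name.

```python
# Template text from the prompt above, filled with plain str.format
# so the final prompt can be inspected without calling any model.
PROMPT = """
You are a mathematical problem-solving assistant tasked with helping users solve their questions. Arrive at the solution logically, providing a clear and step-by-step explanation. Present your response in a structured point-wise format for better understanding.
Question: {question}
Answer:
"""


def render_prompt(question: str) -> str:
    """Substitute the user's question into the {question} placeholder."""
    return PROMPT.format(question=question)


if __name__ == "__main__":
    print(render_prompt("What is 15% of 240?"))
```

Ending the template with "Answer:" nudges the model to continue directly with the solution rather than restating the question.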
Streamlit Session State
if "messages" not in st.session_state:
    st.session_state["messages"] = [
        {"role": "assistant", "content": "Welcome! I am your Assistant. How can I help you today?"}
    ]
for msg in st.session_state.messages:
    if msg["role"] == "user" and "image" in msg:
        st.chat_message(msg["role"]).write(msg['content'])
        st.image(msg["image"], caption='Uploaded Image', use_column_width=True)
    else:
        st.chat_message(msg["role"]).write(msg['content'])
The code initializes the chat messages in the session state if they don’t exist, starting with a default welcome message from the assistant. It then loops through the messages in st.session_state and renders each one in the chat interface. If a message is from the user and carries an image, the text content is rendered together with the uploaded image and a caption. If the message doesn’t contain an image, only the text content is displayed. This way, all chat messages, along with any uploaded images, are displayed correctly in the chat interface.
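The branching in the render loop depends only on the shape of each stored message, so it can be illustrated without Streamlit. The sketch below uses a hypothetical helper, `plan_render`, that reports how each message would be drawn instead of drawing it; the sample history is made up for illustration.

```python
def plan_render(messages: list[dict]) -> list[tuple[str, str]]:
    """Return (role, kind) pairs describing how each stored message would be drawn."""
    plan = []
    for msg in messages:
        # Only user messages that carry an "image" key get the extra image element.
        if msg["role"] == "user" and "image" in msg:
            plan.append((msg["role"], "text+image"))
        else:
            plan.append((msg["role"], "text"))
    return plan


if __name__ == "__main__":
    history = [
        {"role": "assistant", "content": "Welcome! I am your Assistant. How can I help you today?"},
        {"role": "user", "content": "What is 7 * 8?"},
        {"role": "user", "content": "Solve the problem in this picture.", "image": b"raw-image-bytes"},
    ]
    for role, kind in plan_render(history):
        print(role, kind)
```

Because Streamlit reruns the script on every interaction, keeping the history in session state and replaying it like this is what makes earlier messages reappear after each button click.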
Sidebar and Response Cleaning
st.sidebar.header("Navigation")
if st.sidebar.button("Text Question"):
    st.session_state["section"] = "text"
if st.sidebar.button("Image Question"):
    st.session_state["section"] = "image"
if "section" not in st.session_state:
    st.session_state["section"] = "text"
def clean_response(response):
    if "```" in response:
        response = response.split("```")[1].strip()
    return response
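This section of the code builds the sidebar with buttons for the Text Question and Image Question sections, defaulting to the text section. The clean_response function tidies the LLM output: if the response contains a triple-backtick code fence, it keeps only the content of the first fenced block; otherwise it returns the response unchanged.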
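A few sample inputs make the behavior of clean_response easier to see. The sketch below reimplements the same logic in a standalone form; the `FENCE` constant is an addition for readability, and the sample responses are invented for illustration.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def clean_response(response: str) -> str:
    """Keep only the first fenced block of a response, as the app's helper does."""
    if FENCE in response:
        response = response.split(FENCE)[1].strip()
    return response


if __name__ == "__main__":
    plain = "x = 4"
    fenced = f"Here is the result:\n{FENCE}\nx = 4\n{FENCE}\nHope that helps."
    tagged = f"{FENCE}text\nx = 4\n{FENCE}"

    print(repr(clean_response(plain)))
    print(repr(clean_response(fenced)))
    print(repr(clean_response(tagged)))
```

Two consequences are worth keeping in mind: any explanation outside the first fenced block is discarded, and a language tag placed right after the opening fence stays in the cleaned text. If the model tends to put its step-by-step reasoning outside the fence, a gentler cleanup may be preferable.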
Processing Text-Based Inquiries
This section handles user questions in text form, using language models to generate responses based on the input provided.
if st.session_state["section"] == "text":
    st.header("Text Question")
    st.write("Please enter your mathematical question below, and I'll provide a detailed solution.")
    question = st.text_area("Your Question:", "Example: I have 5 apples and 3 oranges. If I eat 2 apples, how many fruits do I have left?")
    if st.button("Get Answer"):
        if question:
            with st.spinner("Generating response..."):
                st.session_state.messages.append({"role": "user", "content": question})
                st.chat_message("user").write(question)
                st_cb = StreamlitCallbackHandler(st.container(), expand_new_thoughts=False)
                try:
                    response = assistant_agent_text.run(st.session_state.messages, callbacks=[st_cb])
                    cleaned_response = clean_response(response)
                    st.session_state.messages.append({'role': 'assistant', "content": cleaned_response})
                    st.write('### Response:')
                    st.success(cleaned_response)
                except ValueError as e:
                    st.error(f"An error occurred: {e}")
        else:
            st.warning("Please enter a question to get an answer.")
This section of the code handles the “Text Question” section of the Streamlit application. When the section is active, it provides a header and a text area for entering a math question. When the user clicks the “Get Answer” button and the text area is not empty, the app displays a spinner to indicate that a response is being generated. The question is added to the session state messages and rendered in the chat interface. The text agent then processes the message history, and the cleaned response is stored in the session state and displayed. If the text area is empty, a warning asks the user to enter a question.
Processing Image-Based Inquiries
This section handles images uploaded by users, using a vision model to generate responses based on the visual content.
elif st.session_state["section"] == "image":
    st.header("Image Question")
    st.write("Please enter your question below and upload an image. I'll provide a detailed solution.")
    question = st.text_area("Your Question:", "Example: What will be the answer?")
    uploaded_file = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if st.button("Get Answer"):
        if question and uploaded_file is not None:
            with st.spinner("Generating response..."):
                image_data = uploaded_file.read()
                image_data_url = f"data:image/jpeg;base64,{base64.b64encode(image_data).decode()}"
                st.session_state.messages.append({"role": "user", "content": question, "image": image_data})
                st.chat_message("user").write(question)
                st.image(image_data, caption='Uploaded Image', use_column_width=True)
This section of the code handles the “Image Question” functionality in the Streamlit application. When the “Image Question” section is active, it displays a header, a text area for users to enter their questions, and an option to upload an image. When the user clicks the “Get Answer” button and both a question and an image are provided, it shows a spinner indicating that a response is being generated. The uploaded image is read and encoded in base64 format as a data URL. The user’s question and the image data are appended to the session state messages and displayed in the chat interface, with the image shown alongside the question. This setup ensures that both the text and image inputs are captured and displayed before further processing.
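The app labels every upload as image/jpeg in the data URL, even though PNG files are also accepted. Whether that matters to the model is not something this article establishes, but a variant that picks the MIME type from the filename is easy to sketch. `to_data_url` is a hypothetical helper; in the app, the filename would presumably come from `uploaded_file.name`.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode raw image bytes as a base64 data URL, guessing the MIME type from the filename."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        # Fall back to the type the app hard-codes when the extension is unknown.
        mime_type = "image/jpeg"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


if __name__ == "__main__":
    # The bytes below are just the PNG signature, used as placeholder content.
    url = to_data_url(b"\x89PNG\r\n\x1a\n", "problem.png")
    print(url[:40])
```

The part after the comma is ordinary base64, so decoding it returns the original image bytes unchanged.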
Initialize Groq Client for Llama 3.2 Vision Model
                # Groq() reads GROQ_API_KEY from the environment
                client = Groq()
                messages = [
                    {
                        "role": "user",
                        "content": [
                            {
                                "type": "text",
                                "text": question
                            },
                            {
                                "type": "image_url",
                                "image_url": {
                                    "url": image_data_url
                                }
                            }
                        ]
                    }
                ]
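This section creates the Groq client and prepares the message for the Llama vision model: a single user message whose content combines the question text with the base64-encoded image URL.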
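The message structure above can be factored into a small function, which makes it easier to inspect or reuse. The following is a sketch only; `build_vision_messages` is a hypothetical helper that is not part of the original app, and the data URL in the demo is a truncated placeholder.

```python
import json


def build_vision_messages(question: str, image_data_url: str) -> list[dict]:
    """Build a single user message that carries both the question text and the image."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_data_url}},
            ],
        }
    ]


if __name__ == "__main__":
    # Truncated placeholder data URL, for illustration only.
    demo = build_vision_messages("What will be the answer?", "data:image/jpeg;base64,AAAA")
    print(json.dumps(demo, indent=2))
```

Printing the structure as JSON is a quick way to confirm that the text part and the image part sit inside the same user message.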
Groq API Call
                try:
                    completion = client.chat.completions.create(
                        model="llama-3.2-90b-vision-preview",
                        messages=messages,
                        temperature=1,
                        max_tokens=1024,
                        top_p=1,
                        stream=False,
                        stop=None,
                    )
This call sends the user’s question and image to the Groq API, which processes the inputs using the specified model and returns a generated response.
Response from Image Model
                    response = completion.choices[0].message.content
                    cleaned_response = clean_response(response)
                    st.session_state.messages.append({'role': 'assistant', "content": cleaned_response})
                    st.write('### Response:')
                    st.success(cleaned_response)
                except ValueError as e:
                    st.error(f"An error occurred: {e}")
        else:
            st.warning("Please enter a question and upload an image to get an answer.")
This section of the code processes the response from the Groq API. It extracts the content of the first choice in the completion result and cleans it using the clean_response function. The cleaned response is appended to the session state messages with the role of “assistant” and displayed under a “Response” header in a success box. If a ValueError occurs, the app displays an error message. If either the question or the image is not provided, a warning prompts the user to enter both to get an answer.
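The attribute path used to read the answer, `choices[0].message.content`, can be exercised without an API call by using a hand-built stand-in object. This sketch is only meant to show the shape the code relies on; `first_choice_text` is a hypothetical helper, and the stand-in is not a real Groq response.

```python
from types import SimpleNamespace


def first_choice_text(completion) -> str:
    """Return the text of the first choice from a chat-completion-shaped object."""
    return completion.choices[0].message.content


if __name__ == "__main__":
    # Hand-built stand-in that mimics only the attributes the app reads;
    # it is not a real Groq response object.
    fake_completion = SimpleNamespace(
        choices=[SimpleNamespace(message=SimpleNamespace(content="x = 4"))]
    )
    print(first_choice_text(fake_completion))
```

A stand-in like this can be handy for testing the display and cleanup logic without spending API calls.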
Check the full code in the GitHub repo here.
Output
Input for Text Section
A tank has three pipes attached to it. Pipe A can fill the tank in 4 hours, Pipe B can fill it in 6 hours, and Pipe C can empty the tank in 3 hours. If all three pipes are opened together, how long will it take to fill the tank completely?
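As a sanity check on whatever answer the app returns for this problem, the expected result can be worked out with exact fractions. This sketch is independent of the app and simply adds the fill rates of the three pipes.

```python
from fractions import Fraction

# Fill rates in tanks per hour; the draining pipe contributes a negative rate.
rate_a = Fraction(1, 4)
rate_b = Fraction(1, 6)
rate_c = -Fraction(1, 3)

net_rate = rate_a + rate_b + rate_c
hours_to_fill = 1 / net_rate

if __name__ == "__main__":
    print(f"Net rate: {net_rate} tank per hour")
    print(f"Time to fill: {hours_to_fill} hours")
```

The net rate is 1/4 + 1/6 - 1/3 = 1/12 of the tank per hour, so a correct solution should arrive at 12 hours.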
Input for Image Section
Conclusion
By combining Gemma 2 9B, Llama 3.2 Vision, LangChain, and Streamlit, it’s possible to create a robust and user-friendly math problem-solving app that can change how students learn and engage with mathematics, providing step-by-step solutions and real-time feedback. This not only helps with the complexity of mathematical concepts but, more importantly, offers a scalable and accessible solution for learners at all levels.
This is one example of the many ways large language models and AI can be used in education. As we continue to develop these technologies, even more creative and impactful applications will emerge to change how we learn and teach.
What do you think of such a concept? Have you ever tried to develop AI-based edutainment applications? Share your experiences and ideas in the comments below!
Key Takeaways
- You can build a powerful math problem solver using advanced AI models like Gemma 2 9B and Llama 3.2.
- Combine text and image processing to create an app that can handle various types of math problems.
- Learn how to integrate LangChain with various tools to create a powerful math problem solver chat app that enhances user experience.
- Leverage Groq acceleration to ensure your app delivers quick responses.
- Streamlit makes it easy to build an intuitive and engaging user interface.
- Consider the ethical implications and design your app to promote learning and understanding.
Frequently Asked Questions
A. Gemma 2 9B is a powerful language model developed by Google, capable of understanding and solving complex math problems presented in text form.
A. The app uses the Meta Llama 3.2 vision model to interpret math problems in images. It then extracts the problem and generates the response.
A. Yes, you can design the app to display the steps involved in solving a problem, which can be a helpful learning tool for users.
A. It’s important to ensure the app is used responsibly and doesn’t facilitate cheating or hinder real learning. Design features that promote understanding and encourage users to engage with the problem-solving process.
A. You can find more information about Gemma 2 9B, Llama 3.2, Groq, LangChain, and Streamlit on Analytics Vidhya and on their respective official websites and documentation pages.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.