Ray

React: A Combined Front-End and Back-End LLaMA AI Project Demo

My internship organization, BOPRC, has a large amount of water meter data, but the current data-entry process is cumbersome and inefficient: employees manually take pictures and type the information into an Excel spreadsheet. This approach is time-consuming and error-prone, and it increases labor costs.

To solve this challenge, we are trying to develop a new approach based on AI. By integrating an AI chatbot into the web front end, employees can upload photos of water meters and get the text recognized automatically. This way, they can enter data into the back-end database much faster and automate more of the office work.

This project not only improves the efficiency of data entry but also reduces BOPRC's labor costs. In the future, we hope to further optimize the recognition accuracy and expand the system's functionality to make it more intelligent and easy to use.

First of all, I made a lot of attempts with different models, but unfortunately the OpenAI API has to be paid for (and I was a bit shy about asking my organization for funds). Fortunately, when I tried deploying the LLaMA 2 model released by Meta (Facebook), I found that it has good language understanding and generation ability and is useful across multiple language tasks: a versatile, highly customizable language model. Most importantly, it can be deployed locally on my M1 MacBook Pro.

So my idea was to build a front-end page using my self-taught knowledge of React and deploy the LLaMA 2 model locally as the back-end program, implementing a simple demo first.

There are several ways to deploy LLaMA locally; in general, llama.cpp is the most efficient, and it also supports the M1 GPU (via Metal).

First, we need to clone the repository from GitHub:

git clone https://github.com/ggerganov/llama.cpp.git

Since LLaMA 2 comes in many variants, if you want to run one locally (and are not chasing performance for the moment) I recommend 7B-Chat, which is about 4 GB in size.

cd llama.cpp
curl -L https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf --output ./models/llama-2-7b-chat.Q4_K_M.gguf

Take note:

LLaMA models now use the '.gguf' format; the previous '.bin' format can no longer be used. If you need other models, you can browse and download them at the URL below. Please put the models in the models directory.
https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/tree/main
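If you would rather script the download than use curl, the huggingface_hub package can fetch the same file. This is an extra dependency and my own suggestion, not part of the original setup:

from huggingface_hub import hf_hub_download

# Download the quantized chat model straight into the models directory.
hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGUF",
    filename="llama-2-7b-chat.Q4_K_M.gguf",
    local_dir="./models",
)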

Once everything is ready, we can try LLaMA in a local deployment:

LLAMA_METAL=1 make
./server -m ./models/llama-2-7b-chat.Q4_K_M.gguf

Back end

With the model served locally, open the URL displayed in the terminal in your browser to reach the built-in user interface, adjust the settings you want, and start chatting.
(Screenshots: the llama.cpp server's web interface)
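The browser UI is not the only way in; the llama.cpp server also exposes a small HTTP API. As a quick sketch (the /completion endpoint and default port 8080 are the llama.cpp server's defaults at the time of writing; check your terminal output for the actual address), you can query it from Python:

import requests

# Ask the llama.cpp server for a completion; the generated text
# comes back in the 'content' field of the JSON response.
resp = requests.post(
    "http://127.0.0.1:8080/completion",
    json={"prompt": "What does a water meter measure?", "n_predict": 64},
)
print(resp.json()["content"])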

You can see that the M1 GPU resources are being used:
(Screenshot: M1 GPU utilization while the model runs)

So how do we connect an AI model running in the back end to a front-end web page? We need two libraries: 'FastAPI' and 'llama_cpp' (the llama-cpp-python package).

After installing these two, we can start writing our backend Python program:

(Screenshot: the backend Python program)
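Since the program itself only appears as a screenshot above, here is a minimal sketch of what such a main.py could look like. The /chat route, the message field, and port 8000 are my own illustrative choices; only the 'data' key in the response comes from this post:

# main.py -- requires: pip install fastapi uvicorn llama-cpp-python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from llama_cpp import Llama

app = FastAPI()

# Allow the React dev server (a different origin) to call this API.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

# Load the local GGUF model once at startup.
llm = Llama(model_path="./models/llama-2-7b-chat.Q4_K_M.gguf")

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
def chat(req: ChatRequest):
    # Run the prompt through LLaMA; the front end reads the reply
    # from the 'data' key of this JSON response.
    output = llm(f"Q: {req.message} A:", max_tokens=256, stop=["Q:"])
    return {"data": output["choices"][0]["text"].strip()}

if __name__ == "__main__":
    # Matches the `python main.py` start-up command below.
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)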

Since I'm using a Python virtual environment, the commands to start the back-end program look like this:

source ~/myenv/bin/activate
python main.py

Now the backend program is waiting for requests:
(Screenshot: the back end running in the terminal)
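At this point you can already test the back end on its own, without the React page. A small sketch, reusing the hypothetical /chat route and port from the back-end sketch above:

import requests

# Send a chat message to the local back end and print the reply,
# which arrives under the 'data' key.
resp = requests.post("http://127.0.0.1:8000/chat", json={"message": "Hello!"})
print(resp.json()["data"])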

Front end

Now we can start on the front-end tweaks. Here I used a React page that I made myself (I originally wanted to use the OpenAI API, but it has to be paid for). You can find the repository on my GitHub; I won't expand too much on the front-end code, as my front-end knowledge is still quite limited:
https://github.com/KyrieRui/BOPRC_water_meter_Monitor_Demo

(Screenshots: the React code that sends the request and sets botReply)
Note that botReply is set this way because the back end receives the request and returns its reply under the 'data' key. Once the front end is set up, we can test it:

(Screenshots: testing the chat page in the browser)
The user clicks the 'Send' button, the back end receives the request, the model starts working, and the reply is returned to the front end as 'data'.

This demo only demonstrates the experiments I have run so far. Next, I will do more research and try to connect the local AI model to the database to meet the customer's specific needs. I welcome all suggestions on my demo code, and I will take the time to reply and make changes! Thanks for reading, and may you all have a great day!
