DEV Community


Installing PrivateGPT on WSL with GPU support

Emilien Lancelot on January 20, 2024

[ UPDATED 23/03/2024 ] PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large ...
enshoe

Thank you very much for this guide. I have run into some issues regarding an updated version of Poetry.

They have now gotten rid of --with in favor of --extras, and the group 'local' is missing.
I went through all the errors I got without installing local and came up with this command:
poetry install -E llms-llama-cpp -E ui -E vector-stores-qdrant -E embeddings-huggingface

The model runs (without GPU support, for some reason) and errors out when I input something in the UI to interact with the LLM. Any thoughts?

Emilien Lancelot • Edited

Thx for the input. I'll try and update the tutorial as soon as possible.
In the meantime, a Stack Overflow post has been made. Maybe it can help you? stackoverflow.com/questions/781499...

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.

Phyktion

Thanks! This worked for me. Installing the huggingface via pip didn't work.

Martin Staiger

Thank you very much for this information! Now it's running, but without GPU support.

Emilien Lancelot

When you start the server it should show "BLAS=1". If not, recheck all GPU-related steps: for instance, reinstall the NVIDIA drivers and check that the binaries respond accordingly. Maybe start over the whole thing... Before getting things right I had to redo the whole process a bunch of times... If you messed up a package it could have impacted GPU support...

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.
For CPU-only problems a simple reboot tends to do the trick... lol.

Ai Curious • Edited

I was able to get it working after about 6 hours of trying to follow the changes and around 9-10 tries. It does recognize my GPU drivers, but it just does not use them. Maybe because the local command in poetry is not there anymore..... 1 week after you made this tutorial. :(


It now has cuda-12.4, and that's what I used in my .bashrc, but why does nvidia-smi.exe display 12.3?


Sadly, BLAS = 0 and not 1.


I got errors on the final page... I was just about ready to scrap it again, after getting errors about API split and API tokens exceeded... when it started working after a restart.


Looking forward to an update, maybe? I just came to this page because of Network Chuck. It's nice to get a taste of what AI can do.

Emilien Lancelot

Thx for all the investigation! I updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF ^^.

IAmGroot

nvidia-smi.exe called in WSL is actually the Windows Nvidia driver, which is currently running CUDA version 12.3. Your GPU isn't being used because you have installed the CUDA 12.4 toolkit in WSL, but the Nvidia driver installed on Windows is older and still using CUDA 12.3.

I suggest you update the Nvidia driver on Windows and try again. nvidia.com/download/index.aspx
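The version rule described above (the Windows driver's supported CUDA version must be at least the toolkit version installed inside WSL) can be sketched as a quick check. The function name and version strings here are illustrative, not part of any tool:

```python
# Compare the CUDA version reported by nvidia-smi (Windows driver side)
# with the toolkit version installed inside WSL (e.g. from `nvcc --version`).
def cuda_driver_ok(driver_cuda: str, toolkit_cuda: str) -> bool:
    """True if the driver supports at least the installed toolkit version."""
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(driver_cuda) >= as_tuple(toolkit_cuda)

# The mismatch discussed above: driver reports 12.3, toolkit is 12.4
print(cuda_driver_ok("12.3", "12.4"))  # False -> GPU stays unused
print(cuda_driver_ok("12.4", "12.4"))  # True
```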

Ai Curious

I updated my graphics driver just like you said, but I used Nvidia Experience because it was already waiting for an update, and restarted my host computer.


But when I tried to run it, the graphics card was still not being used.

BLAS =0 :(


So instead of starting from scratch, I just started at the "Building and Running PrivateGPT" section, since I noticed that there was a --force-reinstall flag already there.


Now I have the BLAS =1 flag. :)


Thanks..

That was a challenge, because I had never used WSL before, even though it was already on my computer. I just never knew what it was. I might tempt fate and start over and make a video. I had some DNS issues when I first started, and I did not know where the .bashrc file was... or which one to use... I found many... I have not messed with Linux in a hot minute; this was a good refresher.

It's about 5 times faster now.

IAmGroot

Good to hear you got it working!

Felipe Tani

Hi,

Thank you very much for your trouble doing this guide. I found it very useful, and I could follow it and end up with a perfectly working PrivateGPT.

In the future, how could I update the LLM PrivateGPT uses to a newer one?

Emilien Lancelot

Honestly... you really should consider using Ollama to deal with LLM installation and simply plug all your software (privateGPT included) directly into Ollama. Ollama is very simple to use and is compatible with OpenAI standards.
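Since Ollama speaks the OpenAI chat-completions format, pointing a client at it takes only a few lines of stdlib Python. A minimal sketch, with the caveat that it assumes an Ollama server already running on its default port 11434 with the model (here "mistral", just an example) pulled:

```python
import json
import urllib.request

def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-format chat-completion request body."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def ask_ollama(prompt: str, model: str = "mistral",
               base_url: str = "http://localhost:11434") -> str:
    """POST to Ollama's OpenAI-compatible endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(chat_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Any software that can talk to the OpenAI API can be pointed at the same base URL, which is what makes this setup attractive compared to per-app model configuration.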

Felipe Tani

Hi Emilien,

Thanks for the tip. How could I do that? Should I change /scripts/setup in any way before running:

poetry run python scripts/setup ?

I'm looking forward to changing my LLM to llama3.

Emilien Lancelot

There is a "settings-ollama.yaml" file at the root of the repo. Follow the documentation on how to start privateGPT with this file instead of the default "settings.yaml" and you should be okay.

ALEX8642 • Edited

I have used this tutorial multiple times with success. I even set up remote access despite WSL2 being a PITA to host through. I am having a recent/new issue, though. The Mistral model is now gated at Hugging Face, so I get an error. I have my token but I am not sure how to execute the command. My error happens at nearly the last step. Any ideas? I did log into Hugging Face and get access as discussed here: huggingface.co/mistralai/Mistral-7... and also added my Hugging Face token to settings.yaml. No luck.

poetry run python scripts/setup
11:34:46.973 [INFO    ] private_gpt.settings.settings_loader - Starting application with profiles=['default']
Downloading embedding BAAI/bge-small-en-v1.5
Fetching 14 files: 100%|███████████████████████████████████| 14/14 [00:00<00:00, 33.98it/s]
Embedding model downloaded!
Downloading LLM mistral-7b-instruct-v0.2.Q4_K_M.gguf
LLM model downloaded!
Downloading tokenizer mistralai/Mistral-7B-Instruct-v0.2
Traceback (most recent call last):
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 270, in hf_raise_for_status
    response.raise_for_status()
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1374, in hf_hub_download
    raise head_call_error
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1247, in hf_hub_download
    metadata = get_hf_file_metadata(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1624, in get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 402, in _request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 426, in _request_wrapper
    hf_raise_for_status(response)
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 286, in hf_raise_for_status
    raise GatedRepoError(message, response) from e
huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-66228f13-5be64525249d142a0400593c;f2728940-4ac3-4b40-bbe6-858e3a1fae0c)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json.
Repo model mistralai/Mistral-7B-Instruct-v0.2 is gated. You must be authenticated to access it.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/alex8642/privateGPT/scripts/setup", line 43, in <module>
    AutoTokenizer.from_pretrained(
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 782, in from_pretrained
    config = AutoConfig.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1111, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 633, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 688, in _get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 416, in cached_file
    raise EnvironmentError(
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2.
401 Client Error. (Request ID: Root=1-66228f13-5be64525249d142a0400593c;f2728940-4ac3-4b40-bbe6-858e3a1fae0c)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json.
Repo model mistralai/Mistral-7B-Instruct-v0.2 is gated. You must be authenticated to access it.
ALEX8642

solved it:

pip install --upgrade huggingface_hub

huggingface-cli login

YOUR_ACCESS_TOKEN
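As an alternative to the interactive login, the token can be exported as an environment variable in the shell that runs the setup script. HF_TOKEN is recognized by recent huggingface_hub releases (older releases read HUGGING_FACE_HUB_TOKEN instead); YOUR_ACCESS_TOKEN stays a placeholder:

```shell
# Export the Hugging Face token before re-running the setup script,
# e.g. `poetry run python scripts/setup`. YOUR_ACCESS_TOKEN is a placeholder.
export HF_TOKEN="YOUR_ACCESS_TOKEN"
export HUGGING_FACE_HUB_TOKEN="$HF_TOKEN"   # fallback name for older versions
```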
Emilien Lancelot

Good job ;-)

VIJAY

In which step do I need to include it?

Emilien Lancelot

About the current deprecation, which I'll address ASAP: someone in the Medium comments made it out alive with the command poetry install --extras "llms-llama-cpp ui vector-stores-qdrant embeddings-huggingface". Haven't tried it myself yet, but if you're stuck then you don't have much to lose.

Michael Krailo • Edited

For the Nvidia drivers part: "Choose Windows > x86_64 > WSL-Ubuntu > 2.0 > deb (network)". I see Windows and x86_64, but then there is no WSL-Ubuntu. So I'm stuck at the driver-download part of the tutorial. Did Nvidia remove those drivers?


UPDATE: This was an error in the guide, or the Nvidia website changed the way these options work recently. To get the correct drivers, select Linux > x86_64 > WSL-Ubuntu > 2.0 > deb (Network). So it appears the guide needs to be updated.

Emilien Lancelot

The website must have changed. I'll update. Thx.

Michael Krailo

I just want to say thank you for the guide. I only had to go through it once and was up and running in the browser in a couple of hours, after figuring out where the Nvidia drivers were actually located. I'll start digging deeper into it tomorrow, but this sure is amazingly useful.

Mind you, it wasn't without difficulty, as I misunderstood some of the steps to just be commands pasted into the terminal and run, when they should have been pasted into the ~/.bashrc file instead. Also, it might be a good idea to put a note in the "Building and Running PrivateGPT" section about being in the proper ~/privateGPT directory, for those of us who are moving around while editing the ~/.bashrc file. Minor detail, and that command is in an earlier step; it's just easy to find yourself in a different directory if not doing this all in one single session.

One final note: ~/.bash_profile might need to be created with the command "source ~/.bashrc" in it, so that file gets sourced automatically when launching a fresh new WSL shell.
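The ~/.bash_profile suggestion above can be done with a small heredoc; a sketch, assuming the usual bash behavior that login shells read ~/.bash_profile rather than ~/.bashrc:

```shell
# Create/extend ~/.bash_profile so login shells (like a fresh WSL session)
# also pick up everything in ~/.bashrc, including the CUDA exports.
cat >> "$HOME/.bash_profile" <<'EOF'
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
EOF
```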

Emilien Lancelot

Thx for the input :-)
Hope you got it working.

Boris Duran

This is a really good tutorial, thanks for sharing your work!

Emilien Lancelot

Thx ;-)
Hope it helped.

Boris Duran • Edited

Hello again. Something happened and now I'm not able to see the Gradio UI; instead I get the following text when opening localhost:8001. Do you know what this is and how to solve it?

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 1806.0
python_gc_objects_collected_total{generation="1"} 578.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 195.0
python_gc_collections_total{generation="1"} 17.0
python_gc_collections_total{generation="2"} 1.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="13",version="3.10.13"} 1.0
# HELP omni_service_connect_count_total Number of reconnects
# TYPE omni_service_connect_count_total counter
omni_service_connect_count_total{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 2.0
# HELP omni_service_connect_count_created Number of reconnects
# TYPE omni_service_connect_count_created gauge
omni_service_connect_count_created{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 1.7134406589269998e+09
# HELP omni_service_listen_count_total Number of channel listen loops
# TYPE omni_service_listen_count_total counter
omni_service_listen_count_total{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 1.0
# HELP omni_service_listen_count_created Number of channel listen loops
# TYPE omni_service_listen_count_created gauge
omni_service_listen_count_created{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 1.713440663977283e+09
# HELP omni_service_call_avg_time Number of function calls and avg execution time
# TYPE omni_service_call_avg_time summary
# HELP omni_service_exception_count_total Number of exceptions raised by a function
# TYPE omni_service_exception_count_total counter
# HELP omni_service_call_time Execution time of a function
# TYPE omni_service_call_time gauge
# HELP omni_service_process_cpu_times Process CPU utilization total time 
# TYPE omni_service_process_cpu_times gauge
omni_service_process_cpu_times{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="user"} 1.125
omni_service_process_cpu_times{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="system"} 7.953125
# HELP omni_service_process_cpu_percent Process CPU utilization in percent
# TYPE omni_service_process_cpu_percent gauge
omni_service_process_cpu_percent{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 0.0
# HELP omni_service_process_memory_info Process Memory information
# TYPE omni_service_process_memory_info gauge
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="rss"} 2.2736896e+07
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="vms"} 6.92355072e+08
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="num_page_faults"} 17854.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="peak_wset"} 6.5949696e+07
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="wset"} 2.2736896e+07
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="peak_paged_pool"} 277200.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="paged_pool"} 276704.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="peak_nonpaged_pool"} 46692.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="nonpaged_pool"} 37992.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="pagefile"} 6.92355072e+08
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="peak_pagefile"} 6.9257216e+08
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="private"} 6.92355072e+08
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="uss"} 6.402048e+06
# HELP omni_service_process_status_info Process Status
# TYPE omni_service_process_status_info gauge
omni_service_process_status_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",status="running"} 1.0
# HELP omni_service_process_num_threads Process number of threads
# TYPE omni_service_process_num_threads gauge
omni_service_process_num_threads{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 22.0
# HELP omni_service_task_queue_size Number of tasks in queue
# TYPE omni_service_task_queue_size gauge
Emilien Lancelot

Hmmm. I don't see any stack traces in your output.

The Archangel

The path for the Nvidia drivers is incorrect or has been changed.

From blog: Choose Windows > x86_64 > WSL-Ubuntu > 2.0 > deb (network)
On site: Choose Linux > x86_64 > WSL-Ubuntu > 2.0 > deb (network)

Emilien Lancelot

The website must have changed. I'll update. Thx.

Shiv

Help! No errors until the last command:

17:45:44.928 [INFO    ] private_gpt.components.embedding.embedding_component - Initializing the embedding model in mode=huggingface
17:45:45.896 [INFO ] llama_index.core.indices.loading - Loading all indices.
17:45:46.243 [INFO ] private_gpt.ui.ui - Mounting the gradio UI, at path=/
17:45:46.304 [INFO ] uvicorn.error - Started server process [588]
17:45:46.304 [INFO ] uvicorn.error - Waiting for application startup.
17:45:46.305 [INFO ] uvicorn.error - Application startup complete.
17:45:46.307 [INFO ] uvicorn.error - Uvicorn running on 0.0.0.0:8001 (Press CTRL+C to quit)

BLAS = 1 and I have an RTX 3070.

Emilien Lancelot

Hum... Have you tried rebooting your computer (and I'm not even joking)? Also update your drivers to the latest version and redo the CUDA installation part. Then re-reboot.
If nothing works, roll back to the commit dating from the time I updated the article (23/03/2024).

Mitch

It still doesn't work... so frustrating... I am getting the same error as the people above... the server is not starting; in the browser, 127.0.0.1:8001 fails to even load. I have tried the CUDA driver installation again and also the Nvidia local drivers and no luck... please help...

Mitch

I have a bit of an update. I did a Google search and found that the uvicorn server can also be reached by using localhost:8001 in the browser, and guess what, it worked for me this time. :)

Mitch

@docteurrs If people are getting uvicorn errors with the local IP address as 0.0.0.0, then get them to try the URL localhost:8001 in their browser; this worked for me.

Carlo • Edited

Same problem here


RAG3T

Hello, I am currently stuck at this section:


I am receiving this error:


I am just wondering if anyone knows or understands what I am missing?

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.

RAG3T

After putting in every line of code from the comments I finally got it running on the CPU, but for some reason I am getting this issue:


Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.
For your specific issue it seems like the privateGPT server crashed. Check out the new troubleshooting section I added ;-).
Also, you can provide any stack trace from the terminal here so that I can help you debug.

UBenosa

I have the same issue as this:

ZeusG

same here :


punkra1n • Edited

Found the bug.

Please replace the poetry install --with ui and local commands

with this:

poetry install --extras "llms-llama-cpp ui vector-stores-qdrant embeddings-huggingface"

But somehow BLAS = 0 and it runs on the CPU.

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.
For CPU-related problems, a reboot or a driver update seems to be all it needs to work ^^.

Ed Ricketts • Edited

I added your amendment (thanks!) and everything's working fine for me now, using GPU (BLAS=1).

One thing I did have to do was change the WSL installation to version 2 after initially installing the Linux distro:
wsl --set-version <distro name> 2

So for example
wsl --set-version Ubuntu-22.04 2

No idea if that was the problem, but it's worth a go.

Ai Curious • Edited

I tried it... just for testing...


But my WSL version was already on 2.


BLAS still equals 0.

Still learning.

Mitch

I have finally gotten the GUI running, but now when I try to upload anything, even a basic .txt file, I get the following error.

RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Any help please ?

Emilien Lancelot

No idea. Do you have a graphics card that has CUDA cores? Maybe it's too old.
Try installing the latest Nvidia drivers and rebooting your computer (no joke).
If nothing works, you really should consider dealing with LLM installation using Ollama and simply plug all your software (privateGPT included) directly into Ollama. Ollama is very simple to use and is compatible with OpenAI standards.

StevenTheGeek

I followed this and got no errors along the way. However, in the UI I can upload files, but as soon as I try to execute any query I get:

CUDA error: the provided PTX was compiled with an unsupported toolchain.
current device: 0, in function ggml_cuda_op_flatten at /tmp/pip-install-m0l1c58x/llama-cpp-python_4761054bac7e4d008a824921145fb2da/vendor/llama.cpp/ggml-cuda.cu:10010
cudaGetLastError()
GGML_ASSERT: /tmp/pip-install-m0l1c58x/llama-cpp-python_4761054bac7e4d008a824921145fb2da/vendor/llama.cpp/ggml-cuda.cu:255: !"CUDA error"
make: *** [Makefile:36: run] Aborted

Any ideas? I thought it was maybe because Nvidia now installs toolkit 12.4 and this guide referenced 12.3, but even when I install 12.3 and update bash, I get the same error.

StevenTheGeek

Never mind, I never thought to check my Windows host Nvidia driver, lol. All good now.

Emilien Lancelot

Glad it works ^^

John Boyle • Edited

Sorry to be a bother, but will this current setup only work on Windows with the benefit of a GPU? Or can I simply skip the Nvidia driver installation step and continue with building/running the GPT part? TIA

Emilien Lancelot

You don't need to have a GPU, on Windows or elsewhere. It works on CPU by default but will be way slower, that's all ;-)

Emilien Lancelot

Hum. You are not required to use a GPU. By default it will run on CPU :-)

pauloarn

Neeed Heeeeelp!
On poetry run python scripts/setup I get the error below:

Any idea what it could be?

CyberRift

Has anyone run into this before?

This was the first exception thrown:

KeyError:

Then the below was thrown.

TypeError: BertModel.__init__() got an unexpected keyword argument 'safe_serialization'

I was completing a fresh install and did not have the drivers set properly, so it was running extremely slowly. I removed the WSL install and everything, and installed anew.

Went through all the same exact steps in the article, and now I get the above error consistently when trying to run. I have been trying to chase it down for a while now.

Any insight would be greatly appreciated.

Emilien Lancelot

No idea... If it really isn't working, you really should consider dealing with LLM installation using Ollama and simply plug all your software (privateGPT included) directly into Ollama. Ollama is very simple to use and is compatible with OpenAI standards.

Crysis • Edited

For anyone still encountering issues after the updated tutorial, like I did: check the version of Poetry that is installed. In my case it was using Poetry 1.12, which doesn't work with the updated tutorial.
I just did this:

sudo apt install pipx
pipx install poetry==1.2.0
pipx ensurepath

You will have to log in again and check the Poetry version before proceeding.

PS: Thanks a lot for this guide, had tried quite a few others before getting it right with this one.

Dragoon731

I don't have an Nvidia GPU; I have a Radeon RX 7600M XT. Will this privateGPT still work? Or should I buy the ADLINK Pocket AI external NVIDIA GPU EGX-TBT-A500 with RTX A500 to get it to work? Please help, thank you.

Emilien Lancelot

No idea. Sorry. I only use NVIDIA stuff with all the cuda thingy.

andrewmule

OK, a few things. Sort of redundant, but one is for automation, the other for one-off execution.

  1. What needs to be added to get this to run as a service, so that each time I reboot it executes and runs?
  2. What's the command to run the app?
Henrique Crachat • Edited

Has anyone made an updated video or blog for this? I got it working by following everybody's comments.

How can I add more models to this?

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.
For changing the LLM model you can create a config file that specifies the model you want privateGPT to use. Just grep -rn mistral in the repo.
However, you should consider using Ollama (and use any model you wish) and make privateGPT point to the Ollama web server instead.

Henrique Crachat

Thank you Emilien. I don't have the necessary knowledge to do that; could you provide an example of how to do this? Thx

Emilien Lancelot

Inside the privateGPT repo there are multiple YAML settings files. The default one used by privateGPT is "settings.yaml". It contains all available configuration keys. You can derive this file to make your own configs if you wish (or modify this one directly, but that wouldn't be great).
To switch to your own config file, you can create a "settings-myconfig.yaml" and set an env variable "PGPT_PROFILES=myconfig" so that privateGPT loads this config instead of the default one. You can also load some of the sample ones that are already present, for instance "settings-local.yaml" or "settings-ollama.yaml".
I would recommend that, instead of waging war on privateGPT and GPU support, you use the "settings-ollama.yaml" config, which will point to an Ollama server.
Ollama is a very simple piece of software that allows you to download and run LLMs locally. It's very simple and doesn't require silly YAML configs. So: install Ollama, pull a model you want, and tell privateGPT to use Ollama via the config file I mentioned previously.
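As an illustration of the profile mechanism described above, a derived config might look like this. The key names follow the shape of the repo's sample settings files but can differ between privateGPT versions, so treat this as a sketch and check the sample files in your own checkout:

```yaml
# settings-myconfig.yaml -- hypothetical derived profile, loaded with
# PGPT_PROFILES=myconfig; key names may vary between privateGPT versions
llm:
  mode: ollama
ollama:
  llm_model: llama3
  api_base: http://localhost:11434
```

It would then be picked up by starting privateGPT with PGPT_PROFILES=myconfig set in the environment.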

andrewmule

Is there a way to use a different LLM other than Mistral?

Emilien Lancelot

Yes. PrivateGPT supports horrible configuration files... But I guess that with the latest update it would be simpler to use Ollama and plug privateGPT into it directly. Clearly easier.

MAK

How can I run other LLMs instead of Mistral? Can you give the syntax of the command that would be needed instead of:

poetry run python scripts/setup

Istria kab • Edited

May I ask what the minimum hardware requirements are to run privateGPT? Thanks

Emilien Lancelot

There are no real minimum hardware requirements. If you don't have a GPU it will run on CPU. If you have a GPU, then the better it is, the quicker it will respond. Simple as that. :-)

Ed Ricketts

As someone who's very tech savvy but useless at Linux/WSL, is there a way to configure this to use the "Llama-2-7B-Chat-GGUF" model from TheBloke? I'm lost in configuration files.

Emilien Lancelot

For changing the LLM model you can create a config file that specifies the model you want privateGPT to use. Just grep -rn mistral in the repo and you'll find the YAML file.
However, you should consider using Ollama (and use any model you wish) and make privateGPT point to the Ollama web server instead.

Henrique Crachat

Have you added models from ollama.com/library? If yes, can you explain how to do it?

Emilien Lancelot

For changing the LLM model you can create a config file that specifies the model you want privateGPT to use. Just grep -rn mistral in the repo and you'll find the YAML file.
However, you should consider using Ollama (and use any model you wish) and make privateGPT point to the Ollama web server instead.

Lightbender255

The NVIDIA drivers are under "Linux > x86_64 > WSL-Ubuntu > 2.0 > deb (network)" not "Windows > x86_64 > WSL-Ubuntu > 2.0 > deb (network)" as of 2024-03-17.

Dragoon731

Would there be an option for the private GPT to scan a hard drive for files? And maybe even train it on images and videos; I know that last one is a long shot.

Emilien Lancelot

In theory you could ingest any kind of text document on your hard drive. Check the API in the documentation, or look for the network call that is done in your browser during ingest. Then you just have to reproduce it programmatically.

For images it won't work. It's not a multimodal RAG system; it only works on text.

You should check out other RAG-related content that is more programming-focused.
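The "reproduce the ingest call programmatically" idea can be sketched with the stdlib alone. The /v1/ingest/file route matches recent privateGPT documentation, but verify it against the /docs page of your own running instance; the helper names here are illustrative:

```python
import json
import mimetypes
import pathlib
import urllib.request
import uuid

def find_documents(folder: str, pattern: str = "*.txt") -> list[pathlib.Path]:
    """Collect candidate files to ingest (text only -- no images/videos)."""
    return sorted(pathlib.Path(folder).rglob(pattern))

def ingest_file(path: pathlib.Path, base_url: str = "http://localhost:8001") -> dict:
    """POST one file as multipart/form-data to privateGPT's ingest endpoint."""
    boundary = uuid.uuid4().hex
    ctype = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {ctype}\r\n\r\n"
    ).encode() + path.read_bytes() + f"\r\n--{boundary}--\r\n".encode()
    req = urllib.request.Request(
        f"{base_url}/v1/ingest/file",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Typical use: for doc in find_documents("/mnt/c/Users/me/docs"): ingest_file(doc)
```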

HarvsG

Has anyone managed to get it working on an AMD GPU? I fail at the compilation step (6650 XT).