DEV Community

Emilien Lancelot

Installing PrivateGPT on WSL with GPU support

[ UPDATED 23/03/2024 ]

PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private, no data leaves your execution environment at any point.

Running it on Windows Subsystem for Linux (WSL) with GPU support can significantly enhance its performance. In this guide, I will walk you through the step-by-step process of installing PrivateGPT on WSL with GPU acceleration.

Installing this was a pain in the a** and took me 2 days to get it to work. Hope this can help you on your own journey… Good luck !

Prerequisites

Before we begin, make sure you have the latest version of Ubuntu installed on WSL. You can choose a release such as Ubuntu 22.04.3 LTS, available on the Microsoft Store.
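
GPU support needs WSL 2, and one commenter reported having to switch their distro with wsl --set-version <distro name> 2. Here is a small sketch to check from inside Ubuntu. It assumes that WSL 2 kernels report a release string containing "microsoft-standard" (WSL 1 usually reports something like 4.4.0-19041-Microsoft). The function name is mine, not part of any tool.

```shell
# Succeeds when the given kernel release string looks like a WSL 2 kernel.
# Assumption: WSL 2 releases contain "microsoft-standard",
# e.g. "5.15.133.1-microsoft-standard-WSL2".
is_wsl2_kernel() {
  case "$1" in
    *microsoft-standard*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_wsl2_kernel "$(uname -r)"; then
  echo "Looks like WSL 2"
else
  echo "Not a WSL 2 kernel. From Windows, try: wsl --set-version <distro name> 2"
fi
```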

Updating Ubuntu
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential

ℹ️ The “upgrade” step is very important: Python stuff will explode later if you skip it.

Cloning the PrivateGPT repo

git clone https://github.com/imartinez/privateGPT

Setting Up Python Environment

To manage Python versions, we’ll use pyenv. Follow the commands below to install it and set up the Python environment:

sudo apt-get install git gcc make openssl libssl-dev libbz2-dev libreadline-dev libsqlite3-dev zlib1g-dev libncursesw5-dev libgdbm-dev libc6-dev tk-dev libffi-dev
curl https://pyenv.run | bash
export PATH="$HOME/.pyenv/bin:$PATH"

Add the following lines to your .bashrc file:

export PYENV_ROOT="$HOME/.pyenv"
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"

Reload your terminal

source ~/.bashrc
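
Several readers pasted the .bashrc lines into the terminal instead of the file, or ended up with duplicates after a few attempts. The sketch below is one way to append a line to a file only when it is not already there. append_once is a hypothetical helper name of mine, and the example call is commented out so that pasting the block does not touch your config.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
  line="$1"
  file="$2"
  touch "$file"
  # -x: match whole lines, -F: treat the text as a fixed string, not a regex
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (uncomment to use):
# append_once 'export PYENV_ROOT="$HOME/.pyenv"' ~/.bashrc
# append_once 'eval "$(pyenv init -)"' ~/.bashrc
```

Running it twice with the same line leaves a single copy in the file, so it is safe to retry.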

Install the lzma packages that pyenv needs to build Python:

sudo apt-get install lzma
sudo apt-get install liblzma-dev

Install Python 3.11 and set it as the global version:

pyenv install 3.11
pyenv global 3.11
pip install pip --upgrade
pyenv local 3.11
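
Before going further it is worth confirming that the shell really picks up Python 3.11 and not the system Python. A minimal sketch, assuming python --version prints something like "Python 3.11.8" (the function name is hypothetical):

```shell
# Succeeds when the given "python --version" output is a 3.11.x release.
is_python_311() {
  case "$1" in
    "Python 3.11."*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_python_311 "$(python --version 2>&1)"; then
  echo "Python 3.11 is active"
else
  echo "Python 3.11 is not the active version; check pyenv and your .bashrc"
fi
```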

Poetry Installation

Install poetry to manage dependencies:

curl -sSL https://install.python-poetry.org | python3 -

Add the following line to your .bashrc:

export PATH="/home/<YOUR USERNAME>/.local/bin:$PATH"

ℹ️ Replace <YOUR USERNAME> with your WSL username (run whoami to find it).

Reload your configuration

source ~/.bashrc
poetry --version # should display something without errors
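
If a command is "not found" at this point, it is almost always a PATH problem in .bashrc. Here is a small sketch that lists which of the tools used so far are still missing. missing_cmds is a helper name I made up; the tool list is just the commands this guide has installed up to here.

```shell
# Print every given command name that cannot be found on PATH.
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

missing="$(missing_cmds git make pyenv poetry)"
if [ -n "$missing" ]; then
  echo "Not on PATH yet:"
  echo "$missing"
else
  echo "All tools found"
fi
```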

Installing PrivateGPT Dependencies

Navigate to the PrivateGPT directory and install dependencies:

cd privateGPT
poetry install --extras "ui embeddings-huggingface llms-llama-cpp vector-stores-qdrant"

Nvidia Drivers Installation

Visit Nvidia’s official website to download and install the CUDA Toolkit for WSL (the GPU driver itself is the one installed on Windows). Choose Linux > x86_64 > WSL-Ubuntu > 2.0 > deb (network).

Follow the instructions provided on the page.

Add the following lines to your .bashrc:

export PATH="/usr/local/cuda-12.4/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-12.4/lib64:$LD_LIBRARY_PATH"

ℹ️ Maybe check the content of “/usr/local” to be sure that you do have the “cuda-12.4” folder. Yours might have a different version.
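
Rather than guessing the version, you can let the shell find the folder. The sketch below prints the two export lines for the highest versioned cuda-* folder it finds. It assumes the toolkit landed under /usr/local and relies on GNU sort -V (available on Ubuntu); latest_cuda_dir is a hypothetical helper name.

```shell
# Print the highest versioned cuda-* directory under a base folder
# (defaults to /usr/local). Prints nothing when none is found.
latest_cuda_dir() {
  base="${1:-/usr/local}"
  ls -d "$base"/cuda-* 2>/dev/null | sort -V | tail -n 1
}

cuda_dir="$(latest_cuda_dir)"
if [ -n "$cuda_dir" ]; then
  # These are the lines to put in your .bashrc
  echo "export PATH=\"$cuda_dir/bin:\$PATH\""
  echo "export LD_LIBRARY_PATH=\"$cuda_dir/lib64:\$LD_LIBRARY_PATH\""
else
  echo "No cuda-* folder found under /usr/local"
fi
```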

Reload your configuration and check that all is working as expected

source ~/.bashrc
nvcc --version
nvidia-smi.exe

ℹ️ “nvidia-smi” may not be available inside WSL, so just verify that the .exe one detects your hardware. Both commands should display gibberish but no apparent errors.

Building and Running PrivateGPT

Finally, install LLAMA CUDA libraries and Python bindings:

CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python

Let PrivateGPT download a local LLM for you (a Mistral model by default):

poetry run python scripts/setup
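
A commenter hit a 401 "gated repo" error at this step because the Mistral tokenizer repo on Hugging Face requires authentication, and fixed it by upgrading huggingface_hub and running huggingface-cli login. The sketch below only detects that situation in a saved log. The function name and the setup.log file name are assumptions of mine, and running the login through poetry run is my guess at keeping it inside the project environment.

```shell
# Succeeds when the text on stdin contains a Hugging Face "gated repo" error.
is_gated_repo_error() {
  grep -qi 'gated repo'
}

# Example (uncomment to use):
# poetry run python scripts/setup 2>&1 | tee setup.log
# if is_gated_repo_error < setup.log; then
#   poetry run pip install --upgrade huggingface_hub
#   poetry run huggingface-cli login   # paste your access token when asked
# fi
```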

To run PrivateGPT, use the following command:

make run
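
The build, setup and make run commands all have to be launched from the privateGPT folder, which is easy to forget after editing .bashrc in another session. A minimal sketch of a guard, assuming the repo root contains a Makefile and scripts/setup (both are used by the commands in this guide); the function name is hypothetical.

```shell
# Succeeds when the given directory (default: current one) looks like
# the privateGPT repo root.
in_privategpt_repo() {
  dir="${1:-.}"
  [ -f "$dir/Makefile" ] && [ -f "$dir/scripts/setup" ]
}

if in_privategpt_repo; then
  echo "OK, you are in the privateGPT folder"
else
  echo "Wrong folder: cd into privateGPT first"
fi
```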

This will initialize and boot PrivateGPT with GPU support on your WSL environment.

ℹ️ You should see “BLAS = 1” if GPU offload is working.

...............................................................................................
llama_new_context_with_model: n_ctx      = 3900
llama_new_context_with_model: freq_base  = 1000000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init:      CUDA0 KV buffer size =   487.50 MiB
llama_new_context_with_model: KV self size  =  487.50 MiB, K (f16):  243.75 MiB, V (f16):  243.75 MiB
llama_new_context_with_model: graph splits (measure): 3
llama_new_context_with_model:      CUDA0 compute buffer size =   275.37 MiB
llama_new_context_with_model:  CUDA_Host compute buffer size =    15.62 MiB
AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
18:50:50.097 [INFO    ] private_gpt.components.embedding.embedding_component - Initializing the embedding model in mode=local 
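
The startup log scrolls fast, so here is a small sketch to check the flag from a saved log instead of by eye. It simply looks for the "BLAS = 1" text shown in the output above; the function name and the run.log file name are mine.

```shell
# Succeeds when the text on stdin reports BLAS = 1 (GPU offload active).
gpu_offload_enabled() {
  grep -q 'BLAS = 1'
}

# Demo with a shortened line from the log above
sample='AVX = 1 | AVX2 = 1 | FMA = 1 | F16C = 1 | BLAS = 1 | SSE3 = 1 |'
if printf '%s\n' "$sample" | gpu_offload_enabled; then
  echo "GPU offload is on"
else
  echo "Running on CPU only"
fi

# With a real run (uncomment to use):
# make run 2>&1 | tee run.log
# gpu_offload_enabled < run.log && echo "GPU offload is on"
```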

ℹ️ Go to http://127.0.0.1:8001 in your browser (if it fails to load, try http://localhost:8001)

Uploaded the Orca paper and asking random stuff about it.

Conclusion

By following these steps, you have successfully installed PrivateGPT on WSL with GPU support. Enjoy the enhanced capabilities of PrivateGPT for your natural language processing tasks.

If something went wrong then open your window and throw your computer away. Then start again at step 1.

You can also remove the WSL distribution entirely (this deletes everything inside it) with:

wsl.exe --list -v
wsl.exe --unregister <name of the WSL distribution to remove>

If this article helped you in any way, consider giving it a like! Thx

Troubleshooting

Getting a crash when asking a question or running make run? Here are the issues I encountered and how I fixed them.

  • Cuda error
CUDA error: the provided PTX was compiled with an unsupported toolchain.
  current device: 0, in function ggml_cuda_op_flatten at /tmp/pip-install-3kkz0k8s/llama-cpp-python_a300768bdb3b475da1d2874192f22721/vendor/llama.cpp/ggml-cuda.cu:9119
  cudaGetLastError()
GGML_ASSERT: /tmp/pip-install-3kkz0k8s/llama-cpp-python_a300768bdb3b475da1d2874192f22721/vendor/llama.cpp/ggml-cuda.cu:271: !"CUDA error"
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
make: *** [Makefile:36: run] Aborted

This one happens when the CUDA toolkit installed in WSL is newer than what your Windows Nvidia driver supports. Open the Nvidia “GeForce Experience” app on Windows, upgrade the driver to the latest version, and then reboot.
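
To spot this mismatch before it crashes, you can compare the two versions yourself. The sketch below assumes nvcc --version prints a line containing "release 12.4" and that nvidia-smi.exe prints "CUDA Version: 12.3" in its header, which is what both tools showed for me and for commenters. The helper names are mine, and the comparison uses GNU sort -V.

```shell
# Extract "12.4" from nvcc --version output read on stdin.
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Extract "12.3" from nvidia-smi output read on stdin.
smi_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeeds when version $1 is lower than or equal to version $2.
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

if command -v nvcc >/dev/null 2>&1 && command -v nvidia-smi.exe >/dev/null 2>&1; then
  toolkit="$(nvcc --version | nvcc_release)"
  driver="$(nvidia-smi.exe | smi_cuda_version)"
  if version_le "$toolkit" "$driver"; then
    echo "OK: toolkit $toolkit, driver supports up to $driver"
  else
    echo "Mismatch: toolkit $toolkit is newer than driver CUDA $driver, update the Windows driver"
  fi
fi
```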

  • CPU only

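If privateGPT still sets BLAS to 0 and runs on CPU only, try to close all WSL2 instances. Then reopen one and try again. If you updated your Windows driver in the meantime, also re-run the llama-cpp-python install command from the “Building and Running PrivateGPT” section (one commenter got BLAS = 1 that way).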

If it's still on CPU only then try rebooting your computer. This is not a joke… Unfortunately.

A note on using LM Studio as backend

I tried to use the server of LMStudio as fake OpenAI backend. It does work but not very well. Need to do more tests on that and I’ll update here.

For now, what I did was start the LMStudio server on port 8002 and uncheck “Apply Prompt Formatting”.

On PrivateGPT I edited “settings-vllm.yaml” and updated “openai > api_base” to “http://localhost:8002/v1” and the model to “dolphin-2.7-mixtral-8x7b.Q5_K_M.gguf”, which is the one I use in LMStudio. It’s displayed in LMStudio if you’re wondering.

Top comments (77)

enshoe

Thank you very much for this guide. I have run into some issues regarding an updated version of poetry.

They have now gotten rid of --with in favor of --extras and the group 'local' is missing.
I went through all the errors I got without installing local and came up with this command:
poetry install -E llms-llama-cpp -E ui -E vector-stores-qdrant -E embeddings-huggingface

The model runs, without GPU support for some reason, and errors out when I input something in the UI to interact with the LLM. Any thoughts?

Emilien Lancelot • Edited

Thx for the input. I'll try and update the tutorial as soon as possible.
In the meantime a SO post has been made. Maybe it can help you ? stackoverflow.com/questions/781499...

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.

Phyktion

Thanks! This worked for me. Installing the huggingface via pip didn't work.

Martin Staiger

Thank you very much for this information! Now it's running but without gpu support.

Emilien Lancelot

When you start the server it should show "BLAS=1". If not, recheck all GPU related steps. For instance, install the nvidia drivers and check that the binaries are responding accordingly. Maybe start over the whole thing... Before getting things right I had to redo the whole process a bunch of times... If you messed up a package it could have impacted GPU support...

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.
For CPU only problems a simple reboot tends to do the trick... lol.

ALEX8642 • Edited

I have used this tutorial multiple times with success. I even set up remote access despite WSL2 being a PITA to host through. I am having a recent/new issue though. The Mistral model is now gated at huggingface, so I get an error. I have my token but I am not sure how to execute the command. My error happens at nearly the last step. Any ideas? I did log into huggingface and get access as discussed here: huggingface.co/mistralai/Mistral-7... and also added my huggingface token to settings.yaml. No luck.

poetry run python scripts/setup
11:34:46.973 [INFO    ] private_gpt.settings.settings_loader - Starting application with profiles=['default']
Downloading embedding BAAI/bge-small-en-v1.5
Fetching 14 files: 100%|███████████████████████████████████| 14/14 [00:00<00:00, 33.98it/s]
Embedding model downloaded!
Downloading LLM mistral-7b-instruct-v0.2.Q4_K_M.gguf
LLM model downloaded!
Downloading tokenizer mistralai/Mistral-7B-Instruct-v0.2
Traceback (most recent call last):
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 270, in hf_raise_for_status
    response.raise_for_status()
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1374, in hf_hub_download
    raise head_call_error
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1247, in hf_hub_download
    metadata = get_hf_file_metadata(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1624, in get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 402, in _request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 426, in _request_wrapper
    hf_raise_for_status(response)
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 286, in hf_raise_for_status
    raise GatedRepoError(message, response) from e
huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-66228f13-5be64525249d142a0400593c;f2728940-4ac3-4b40-bbe6-858e3a1fae0c)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json.
Repo model mistralai/Mistral-7B-Instruct-v0.2 is gated. You must be authenticated to access it.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/alex8642/privateGPT/scripts/setup", line 43, in <module>
    AutoTokenizer.from_pretrained(
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 782, in from_pretrained
    config = AutoConfig.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1111, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 633, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 688, in _get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/home/alex8642/.cache/pypoetry/virtualenvs/private-gpt-hQFbaQIY-py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 416, in cached_file
    raise EnvironmentError(
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2.
401 Client Error. (Request ID: Root=1-66228f13-5be64525249d142a0400593c;f2728940-4ac3-4b40-bbe6-858e3a1fae0c)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json.
Repo model mistralai/Mistral-7B-Instruct-v0.2 is gated. You must be authenticated to access it.
 
ALEX8642

solved it:

pip install --upgrade huggingface_hub

huggingface-cli login

YOUR_ACCESS_TOKEN
 
Emilien Lancelot

Good job ;-)

VIJAY

In which step do i need to include it

Ai Curious • Edited

I was able to get it working after about 6 hours of trying to follow the changes and around 9-10 tries. It does recognize my GPU drivers but it just does not use them. Maybe because the local command in poetry is not there anymore..... 1 week after you made this tutorial. :(

Image description

It now has cuda-12.4 and that's what I used on my .bashrc but why does the nvidia-smi.exe display 12.3 ?

Image description

sadly BLAS = 0 and not 1.

Image description

I got errors on the final page... I was just about ready to scrap it again, after getting errors about API Split and API tokens exceeded....... when it started working after a restart.

Image description

Looking forward to an update maybe? I just came to this page because of Network Chuck.. It's nice to get a taste of what AI can do..

Emilien Lancelot

Thx for all the investigation ! I updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF ^^.

IAmGroot

Nvidia-smi.exe called in WSL is actually the Windows Nvidia driver, which is currently running the Cuda version 12.3. Your GPU isn't being used because you have installed the 12.4 Cuda toolkit in WSL but your Nvidia driver installed on Windows is older and still using Cuda 12.3.

I suggest you update the Nvidia driver on Windows and try again. nvidia.com/download/index.aspx

Ai Curious

I updated my graphics driver just like you said, but I used the Nvidia Experience because it was already waiting for an update and restarted my host computer..

Image description

But when I tried to run it, the graphics card was still not being used.

BLAS =0 :(

Image description

So instead of starting from scratch, I just started at the "Building and Running PrivateGPT" section, since I noticed that there was a --force-reinstall flag already there.

Image description

Now I have the BLAS =1 flag. :)

Image description

Thanks..

That was a challenge, because I had never used wsl before, even though it was already on my computer.. I just never knew what it was. I might tempt fate and start over and make a video.. I had some DNS issues when I first started, and I did not know where the .bashrc file was... or which one to use..... I found many.... I have not messed with Linux in a hot minute, this was a good refresher.

It's about 5 times faster now.

IAmGroot

Good to hear you got it working!

Felipe Tani

Hi,

Thank you very much for your trouble doing this guide, I found it very useful and I could follow it and end up with a perfectly working Private GPT.

In the future how could I update the LLM Private GPT uses to an updated one?

Emilien Lancelot

Honestly... You really should consider using ollama to deal with LLM installation and simply plug all your software (privateGPT included) directly into ollama. Ollama is very simple to use and is compatible with openAI standards.

Felipe Tani

Hi Emilien,

Thanks for the tip, how could I do that? Should I change the /scripts/setup in any way before running:

poetry run python scripts/setup ?

I'm looking forward to changing my LLM to llama3.

Emilien Lancelot

About the current deprecation, which I'll address ASAP: someone in the Medium comments made it out alive with the command poetry install --extras "llms-llama-cpp ui vector-stores-qdrant embeddings-huggingface". Haven't tried it myself yet but if you're stuck then you don't have much to lose.

Michael Krailo • Edited

For the Nvidia Drivers part, Choose Windows > x86_64 > WSL-Ubuntu > 2.0 > deb (network). I see Windows and x86_64, but then there is no WSL-Ubuntu. So I'm stuck at downloading the drivers part of the tutorial. Did Nvidia remove those drivers?

Image description

UPDATE: This was an error in the guide or the Nvidia website changed the way these options work recently. To get the correct drivers, select Linux > x86_64 > WSL-Ubuntu > 2.0 > Deb (Network). So it appears the guide needs to be updated.

Emilien Lancelot

The website must have changed. I'll update. Thx.

Michael Krailo

I just want to say thank you for the guide. I only had to go through it once and was up and running in the browser in a couple of hours after figuring out where the Nvidia drivers were actually located. I'll start digging deeper into it tomorrow, but this sure is amazingly useful.

Mind you, it wasn't without difficulty as I misunderstood some of the steps to just be commands pasted in the terminal and run when they should have been pasted into ~/.bashrc file instead. Also it might be a good idea to put in a note at the end about being in the proper ~/privateGPT directory in the section on Building and Running PrivateGPT for those of us who are moving around while editing the ~/.bashrc file. Minor detail and that command is in an earlier step, it's just easy to find yourself in a different directory if not doing this all in one single session.

One final note about the ~/.bash_profile might need to be created and add the command "source ~/.bashrc" in it so that file gets sourced automatically when launching a fresh new wsl shell.

Emilien Lancelot

Thx for the input :-)
Hope you got it working.

Boris Duran

This is a really good tutorial, thanks for sharing your work!

Emilien Lancelot

Thx ;-)
Hope it helped.

Boris Duran • Edited

Hello again, something happened and now I'm not able to see the GradioUI, instead I get the following text when opening the localhost:8001. Do you know what this is and how to solve it?

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 1806.0
python_gc_objects_collected_total{generation="1"} 578.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 195.0
python_gc_collections_total{generation="1"} 17.0
python_gc_collections_total{generation="2"} 1.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="13",version="3.10.13"} 1.0
# HELP omni_service_connect_count_total Number of reconnects
# TYPE omni_service_connect_count_total counter
omni_service_connect_count_total{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 2.0
# HELP omni_service_connect_count_created Number of reconnects
# TYPE omni_service_connect_count_created gauge
omni_service_connect_count_created{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 1.7134406589269998e+09
# HELP omni_service_listen_count_total Number of channel listen loops
# TYPE omni_service_listen_count_total counter
omni_service_listen_count_total{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 1.0
# HELP omni_service_listen_count_created Number of channel listen loops
# TYPE omni_service_listen_count_created gauge
omni_service_listen_count_created{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 1.713440663977283e+09
# HELP omni_service_call_avg_time Number of function calls and avg execution time
# TYPE omni_service_call_avg_time summary
# HELP omni_service_exception_count_total Number of exceptions raised by a function
# TYPE omni_service_exception_count_total counter
# HELP omni_service_call_time Execution time of a function
# TYPE omni_service_call_time gauge
# HELP omni_service_process_cpu_times Process CPU utilization total time 
# TYPE omni_service_process_cpu_times gauge
omni_service_process_cpu_times{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="user"} 1.125
omni_service_process_cpu_times{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="system"} 7.953125
# HELP omni_service_process_cpu_percent Process CPU utilization in percent
# TYPE omni_service_process_cpu_percent gauge
omni_service_process_cpu_percent{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 0.0
# HELP omni_service_process_memory_info Process Memory information
# TYPE omni_service_process_memory_info gauge
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="rss"} 2.2736896e+07
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="vms"} 6.92355072e+08
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="num_page_faults"} 17854.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="peak_wset"} 6.5949696e+07
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="wset"} 2.2736896e+07
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="peak_paged_pool"} 277200.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="paged_pool"} 276704.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="peak_nonpaged_pool"} 46692.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="nonpaged_pool"} 37992.0
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="pagefile"} 6.92355072e+08
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="peak_pagefile"} 6.9257216e+08
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="private"} 6.92355072e+08
omni_service_process_memory_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",type="uss"} 6.402048e+06
# HELP omni_service_process_status_info Process Status
# TYPE omni_service_process_status_info gauge
omni_service_process_status_info{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail",status="running"} 1.0
# HELP omni_service_process_num_threads Process number of threads
# TYPE omni_service_process_num_threads gauge
omni_service_process_num_threads{omni_alert="0",omni_instance="utl-thumbnail-127.0.0.1",omni_service="thumbnail"} 22.0
# HELP omni_service_task_queue_size Number of tasks in queue
# TYPE omni_service_task_queue_size gauge
 
Emilien Lancelot

Hummm. I don't see any stacktraces in your output.

The Archangel

The path for Nvidia drivers are incorrect or have been changed.

From blog: Choose Windows > x86_64 > WSL-Ubuntu > 2.0 > deb (network)
On site: Choose Linux > x86_64 > WSL-Ubuntu > 2.0 > deb (network)

Emilien Lancelot

The website must have changed. I'll update. Thx.

RAG3T

hello, I am currently stuck at this section:

Image description

I am receiving this error:

Image description

I am just wondering if anyone knows/ understands what I am missing?

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.

RAG3T

After putting in every line of code from the comments I finally got it running with the CPU, but for some reason I am getting this issue:

Image description

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.
For your specific issue it seems like privateGPT server crashed. Checkout the new troubleshooting section I added ;-).
Also you can provide any stacktrace from the terminal here so that I can help you debug.

UBenosa

i have the same issue as this
Image description

ZeusG

same here :

Image description

Shiv

Help ! no errors until last command - 17:45:44.928 [INFO ] private_gpt.components.embedding.embedding_component - Initializing the embedding model in mode=huggingface
17:45:45.896 [INFO ] llama_index.core.indices.loading - Loading all indices.
17:45:46.243 [INFO ] private_gpt.ui.ui - Mounting the gradio UI, at path=/
17:45:46.304 [INFO ] uvicorn.error - Started server process [588]
17:45:46.304 [INFO ] uvicorn.error - Waiting for application startup.
17:45:46.305 [INFO ] uvicorn.error - Application startup complete.
17:45:46.307 [INFO ] uvicorn.error - Uvicorn running on 0.0.0.0:8001 (Press CTRL+C to quit)

BLAS = 1 and I have an rtx 3070

Emilien Lancelot

Hum... Have you tried rebooting your computer (and I'm not even joking). Also update your drivers to the last version and redo the cuda installation part. Then re-reboot.
If nothing works rollback to the commit dating to the time I updated the article (23/03/2024).

Mitch

Doesn't work still...... so frustrating... I am getting the same error as the above people.... the server is not starting, and 127.0.0.1:8001 fails to even load in the browser. I have tried the Cuda driver installation again and also the Nvidia local drivers and no luck..... please help.....

Mitch

I have a bit of an update.. I did a Google search and found that the uvicorn server can also be reached by opening localhost:8001 in the browser, and guess what, it worked for me this time. :)

Mitch

@docteurrs If people are getting errors of uvicorn with local IP address as 0.0.0.0 then get them to try to use the url in their browser as localhost:8001 this worked for me.

Carlo • Edited

Same problem here

Image description

punkra1n • Edited

found the bug

pls replace the poetry install --with ui and local

with that

poetry install --extras "llms-llama-cpp ui vector-stores-qdrant embeddings-huggingface"

but somehow BLAS = 0 and runs on CPU

Emilien Lancelot

Hey. Updated the tutorial to the latest version of privateGPT. Works for me on a fresh install. HF.
For CPU related problems, a reboot or driver updates seems to be all it needs to work ^^.

Ed Ricketts • Edited

I added your amendment (thanks!) and everything's working fine for me now, using GPU (BLAS=1).

One thing I did have to do was change the WSL installation to version 2 after initially installing the Linux distro:
wsl --set-version <distro name> 2

So for example
wsl --set-version Ubuntu-22.04 2

No idea if that is the problem, but it's worth a go.

Ai Curious • Edited

I tried it... just for testing...

Image description

But my wsl version was already on 2.

Image description

BLAS Still Eq 0

-still learning.

pauloarn

Neeed Heeeeelp
On poetry run python scripts/setup I get the error below:
Image description

any idea in what it could be?

StevenTheGeek

I followed this and got no errors along the way. However on the UI, I can upload files but as soon as I tried to execute any query I get:

CUDA error: the provided PTX was compiled with an unsupported toolchain.
current device: 0, in function ggml_cuda_op_flatten at /tmp/pip-install-m0l1c58x/llama-cpp-python_4761054bac7e4d008a824921145fb2da/vendor/llama.cpp/ggml-cuda.cu:10010
cudaGetLastError()
GGML_ASSERT: /tmp/pip-install-m0l1c58x/llama-cpp-python_4761054bac7e4d008a824921145fb2da/vendor/llama.cpp/ggml-cuda.cu:255: !"CUDA error"
make: *** [Makefile:36: run] Aborted

Any Ideas? I thought it was maybe because NVidia now installs toolkit 12.4 and this guide referenced 12.3, but even when I install 12.3 and update bash, I get the same error.

StevenTheGeek

Nevermind, never thought to check my windows host Nvidia driver lol, all good now.

Emilien Lancelot

Glad it works ^^

John Boyle • Edited

Sorry to be a bother, but will this current set up only work on Windows that has the benefit of a GPU? or can I simply skip the NVidia driver installation step and continue with building/running the GPT part ? TIA

Emilien Lancelot

You don't need a GPU, on Windows or otherwise. It works on CPU by default but will be way slower, that's all ;-)

Emilien Lancelot

Hum. You are not required to use GPU. By default it will run on CPU :-)