DEV Community

Arjun Shetty
notes: local AI setup

  • Use LLMFit to select a model from the list.
  • Download the selected model. It is saved to .cache/llmfit/models/Qwen2.5-Coder-7B-Instruct-Q8_0.gguf
  • Download a llama.cpp build from its GitHub releases page.
  • Run llama-server -m .cache/llmfit/models/Qwen2.5-Coder-7B-Instruct-Q8_0.gguf -ngl -1 to start a localhost server with a built-in chat interface.
  • -ngl sets how many model layers to offload to the GPU; -1 lets llama.cpp decide automatically.
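Besides the web chat interface, llama-server also exposes an OpenAI-compatible HTTP API (by default on http://localhost:8080). A minimal stdlib-only Python sketch of talking to it, assuming the server from the steps above is running on the default port (the URL and the build_payload/chat helper names are my own, not part of llama.cpp):

```python
import json
import urllib.request

# Assumed default llama-server address; adjust if you passed --port.
BASE_URL = "http://localhost:8080/v1/chat/completions"


def build_payload(prompt, temperature=0.2):
    """Build an OpenAI-style chat-completions request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def chat(prompt):
    """POST a prompt to the local llama-server and return the reply text."""
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    try:
        print(chat("Write a one-line summary of what a GGUF file is."))
    except OSError:
        print("llama-server does not appear to be running on localhost:8080")
```

The same endpoint shape means most OpenAI client libraries also work if you point their base URL at the local server.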