- Use LLMFit to select a model from the list.
- Download the selected model. It will be saved to `.cache/llmfit/models/Qwen2.5-Coder-7B-Instruct-Q8_0.gguf`.
- Get `llama.cpp` from its releases tab.
- Run the local chat-interface server with this command:
  `llama-server -m .cache/llmfit/models/Qwen2.5-Coder-7B-Instruct-Q8_0.gguf -ngl -1`
- `-ngl` sets how many model layers are offloaded to the GPU; `-1` means auto.
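Once the server is running, it exposes an OpenAI-compatible API. Here is a minimal sketch of querying it from Python, assuming llama-server's default address of `http://localhost:8080` (change the URL if you passed `--port`); the prompt text and `temperature` value are just illustrative choices:

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port 8080.
URL = "http://localhost:8080/v1/chat/completions"

# OpenAI-style chat payload; llama-server serves whatever model was
# loaded with -m, so the "model" field is informational here.
payload = {
    "model": "Qwen2.5-Coder-7B-Instruct-Q8_0",
    "messages": [
        {"role": "user", "content": "Write a Python one-liner that reverses a string."}
    ],
    "temperature": 0.2,
}

def ask(url: str = URL) -> str:
    """Send the chat request and return the assistant's reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask())
```

Because the endpoint follows the OpenAI chat-completions shape, you can also point the official `openai` client (or any compatible tool) at the same base URL instead of hand-rolling requests.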