Thank you for the post! As of September 2023, do you have any recommendations for running the model locally on a MacBook with an Intel CPU and Intel graphics card?
wget https://huggingface.co/substratusai/Llama-2-13B-chat-GGUF/resolve/main/model.bin -O model.q4_k_s.gguf has worked for me.
However, when I run
./main -t 4 -m ./models/model.q4_k_s.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write a story about llamas\n### Response:"
it could not load the model:
ggml_metal_init: load pipeline error: Error Domain=CompilerError Code=2 "AIR builtin function was called but no definition was found." UserInfo={NSLocalizedDescription=AIR builtin function was called but no definition was found.}
llama_new_context_with_model: ggml_metal_init() failed
llama_init_from_gpt_params: error: failed to create context with model './models/model.q4_k_s.gguf'
main: error: unable to load model
I would start by making sure your llama.cpp source is up to date and doing a recompile, as well as double-checking the path where your model is located.
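Roughly something like this (a minimal sketch, assuming you built llama.cpp from source with the default Makefile and are running from the repository root):

```sh
# Pull the latest llama.cpp source and rebuild from scratch
git pull
make clean
make

# Confirm the model file actually exists at the path passed to ./main
ls -lh ./models/model.q4_k_s.gguf
```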
I don't know much about getting Llama.cpp working on Intel Macs, but I'd try running it without Metal enabled (set -ngl to 0) and see if that works; a sketch of the command is below. I'm not finding much about running Llama.cpp on Intel other than a few issues saying it doesn't work, where the reporters were trying to get it running with another GPU in their system (both were iMacs with additional GPUs installed). If you can't get it working with the above advice, I'd suggest opening an issue on the Llama.cpp GitHub.
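For example (a rough sketch based on the command above; -ngl is short for --n-gpu-layers, and 0 should keep all layers on the CPU):

```sh
# Same invocation as before, but with GPU offload disabled via -ngl 0
./main -t 4 -m ./models/model.q4_k_s.gguf --color -c 4096 --temp 0.7 \
  --repeat_penalty 1.1 -ngl 0 -n -1 \
  -p "### Instruction: Write a story about llamas\n### Response:"
```

If the Metal init error still appears even with -ngl 0, a build with Metal support disabled may be needed; depending on your llama.cpp version that is something like make clean && make LLAMA_NO_METAL=1, but check the Makefile for the exact flag.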