This is a quick five-minute overview of how to set up a Node.js server that responds to prompts using a local DeepSeek model, or any other model supported by Ollama.
This is based on the instructions found here.
Pull a DeepSeek model of your choice. You can find more models here.
ollama pull deepseek-r1:7b
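If you want to double-check that Ollama is running and the model finished downloading, here's a small optional sketch that queries Ollama's local API (it assumes the default port 11434; /api/tags is the endpoint Ollama uses to list locally available models, and the file name is just an example):

// check-ollama.ts — optional sanity check
// Assumes Ollama is running on its default port 11434.
const res = await fetch('http://localhost:11434/api/tags');
const { models } = (await res.json()) as { models: { name: string }[] };
console.log(models.map((m) => m.name)); // should include 'deepseek-r1:7b'

You can run it with npx tsx check-ollama.ts once the project below is set up, or simply run ollama list instead.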
Initialise your project
pnpm init
Install Vercel AI
pnpm install ai
Install the ollama provider
pnpm install ollama-ai-provider
Install fastify
pnpm install fastify
Install Zod
pnpm install zod
Create an index.ts file and paste in the following code
import { generateText } from 'ai';
import { createOllama } from 'ollama-ai-provider';
import createFastify from 'fastify';
import { z } from 'zod';

const fastify = createFastify();

// shape of the request body we expect on /prompt
const promptSchema = z.object({
  prompt: z.string(),
});

// point the provider at the local Ollama API
const ollama = createOllama({
  baseURL: 'http://localhost:11434/api',
});

fastify.post('/prompt', async (request, reply) => {
  const promptResult = promptSchema.safeParse(request.body);

  if (!promptResult.success) {
    console.log(promptResult.error);
    // the body failed validation, so this is a client error
    return reply.code(400).send();
  }

  const result = await generateText({
    model: ollama('deepseek-r1:7b'),
    prompt: promptResult.data.prompt,
  });

  return { answer: result.text };
});

await fastify.listen({ port: 3000 });
console.log('listening on port 3000');
Make sure you have tsx installed and run
npx tsx index.ts
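If everything is wired up correctly you should see listening on port 3000 in your terminal.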
You can curl the endpoint to get a response
curl -X POST http://localhost:3000/prompt -H "Content-Type: application/json" --json "{\"prompt\": \"Tell me a story\"}"
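The endpoint responds with a JSON body of the form { "answer": "..." } containing the model's full reply. If you'd rather stream tokens back as they're generated instead of waiting for the whole completion, the AI SDK also exports streamText. Here's a minimal sketch of what a streaming route could look like, assuming the same setup as index.ts above (the /prompt/stream path is just an example name):

import { streamText } from 'ai';

fastify.post('/prompt/stream', async (request, reply) => {
  const promptResult = promptSchema.safeParse(request.body);
  if (!promptResult.success) {
    return reply.code(400).send();
  }

  const result = await streamText({
    model: ollama('deepseek-r1:7b'),
    prompt: promptResult.data.prompt,
  });

  // take over the raw Node response so chunks can be flushed as they arrive
  reply.hijack();
  reply.raw.writeHead(200, { 'Content-Type': 'text/plain; charset=utf-8' });

  // textStream is an async iterable of the generated text chunks
  for await (const chunk of result.textStream) {
    reply.raw.write(chunk);
  }
  reply.raw.end();
});

With that in place you can watch the tokens arrive with curl -N --json "{\"prompt\": \"Tell me a story\"}" http://localhost:3000/prompt/stream.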