DEV Community

Tom Nijhof

Locally hosted AI writing assistant

Most artificial intelligence (AI) tools are either closed-source or require a server to function. However, with the help of Ollama, it is possible to create simple yet powerful local AI tools without any of these constraints. In this blog post, I will demonstrate how I built my own local AI tool using Ollama's user-friendly interface and flexible architecture. By doing so, I aim to showcase the ease with which one can leverage AI capabilities without relying on proprietary software or remote servers.

An AI image of a bird robot writing

I recently developed an AI-powered writing assistant using Ollama. I built it with Vue.js and the Ollama JavaScript package: Vue.js provides the user interface, and the Ollama package sends the text to the locally running model and returns the rewritten result.

The solution only works on Linux, Windows Subsystem for Linux (WSL), and maybe macOS.

A screenshot of AI writer showing how the first draft of this blog was made

Integrate into JavaScript

Ollama sends the message to the Llama2 model. The response can be streamed back, so the model's output is displayed in real time and users no longer have to look at a loading bar while they wait.
It uses a model named blog that I create in the next section.



import { ref } from 'vue'
// Browser build of the Ollama package; under Node.js, import from 'ollama' instead
import ollama from 'ollama/browser'

// Reactive text that the UI displays
const output = ref("")

async function sendToModel(message: string) {
  // Empty the output
  output.value = ""

  // Send the message to Ollama
  // It will use the model named 'blog'
  const response = await ollama.chat({
    model: 'blog',
    messages: [{ role: 'user', content: message }],
    stream: true
  })

  // Append every part of the stream to the output in real time
  for await (const part of response) {
    output.value += part.message.content
  }
}


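The accumulation loop does not depend on Ollama itself; it only needs an async iterable of parts. A minimal sketch, assuming a hypothetical helper `collectStream` and a fake stream `fakeStream` (neither is part of the Ollama package), shows how the loop could be tried without a running model:

```typescript
// Shape of one streamed part, reduced to the only field the loop reads.
interface StreamPart {
  message: { content: string }
}

// Hypothetical helper: appends every part to a running string and reports
// the text so far through onUpdate (for example, to assign it to a Vue ref).
async function collectStream(
  stream: AsyncIterable<StreamPart>,
  onUpdate: (textSoFar: string) => void
): Promise<string> {
  let text = ""
  for await (const part of stream) {
    text += part.message.content
    onUpdate(text)
  }
  return text
}

// Fake stream standing in for a streamed chat response (an assumption
// made for illustration, so no server is needed).
async function* fakeStream(chunks: string[]): AsyncGenerator<StreamPart> {
  for (const chunk of chunks) {
    yield { message: { content: chunk } }
  }
}
```

Calling `collectStream(fakeStream(["Hel", "lo"]), (t) => console.log(t))` logs the growing text after each chunk, which is the same behavior the user sees in the interface.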

Customizing the model

The base model I used is Llama2, which is suitable for casual conversations. However, I wanted it to generate rewritten versions of my initial ideas. To achieve this, I created a custom model file and made two key adjustments. Firstly, I lowered the temperature to 0.5, which results in less creative output that still conveys my intended message. Secondly, I modified the system message so that the model rewrites the input text instead of answering it. Below is the updated model file for your reference.



FROM llama2

# set the temperature to 0.5 [higher is more creative, lower is more coherent]
PARAMETER temperature 0.5

# set the system message
SYSTEM """
Rewrite the following into a paragraph of a technical blog with simple and formal English.
"""


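To see why a lower temperature gives less varied output, the sketch below applies the usual temperature-scaled softmax to three made-up scores. It illustrates the general idea only: it is an assumption that this matches what the model runtime does internally, and `softmaxWithTemperature` and the scores are hypothetical.

```typescript
// Hypothetical helper: converts raw scores (logits) into probabilities,
// dividing each score by the temperature first.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    throw new Error("temperature must be positive")
  }
  const scaled = logits.map((x) => x / temperature)
  // Subtract the maximum for numerical stability.
  const max = Math.max(...scaled)
  const exps = scaled.map((x) => Math.exp(x - max))
  const sum = exps.reduce((a, b) => a + b, 0)
  return exps.map((e) => e / sum)
}

// Made-up scores for three candidate next words.
const logits = [2.0, 1.0, 0.5]

const atOne = softmaxWithTemperature(logits, 1.0)
const atHalf = softmaxWithTemperature(logits, 0.5)
console.log(atOne, atHalf)
```

With these scores, the most likely candidate goes from roughly 63% at temperature 1.0 to roughly 84% at temperature 0.5, so sampling picks the top choice more often and the output varies less.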

To use the model file, I ran the following command, which creates a new model named "blog". I can then call this model from my JavaScript application.
ollama create blog -f ./Modelfile
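
Once the model exists, it can also be called without streaming. The following is a minimal sketch, assuming a hypothetical `ChatClient` interface that mirrors only the part of the chat call used here and a hypothetical `rewriteParagraph` helper. The stub client returns canned text so the example runs without Ollama; whether the real client object fits this interface unchanged is an assumption.

```typescript
// Minimal, hypothetical view of a chat client. The Ollama package's client
// is assumed to be compatible when stream is false.
interface ChatClient {
  chat(request: {
    model: string
    messages: { role: string; content: string }[]
    stream: false
  }): Promise<{ message: { content: string } }>
}

// Hypothetical helper: sends a draft to the 'blog' model and returns the rewrite.
async function rewriteParagraph(client: ChatClient, draft: string): Promise<string> {
  const response = await client.chat({
    model: 'blog',
    messages: [{ role: 'user', content: draft }],
    stream: false
  })
  return response.message.content
}

// Stub client for trying the helper without a running Ollama server.
const stubClient: ChatClient = {
  async chat(request) {
    const draft = request.messages[0].content
    return { message: { content: `[${request.model}] ${draft}` } }
  }
}
```

Passing the client in as a parameter keeps the helper easy to test: `rewriteParagraph(stubClient, "hello")` resolves to the canned string, and the stub can later be swapped for the real client.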

Conclusion

In conclusion, Ollama provides an accessible way to use Large Language Models (LLMs) in local projects. With a few lines of JavaScript and a short model file, a model can be customized for a specific task and run without relying on proprietary software or remote servers.

Top comments (14)

Sami

I started using a local tool that corrects and explains my errors directly within any Windows app, without sending my text to the cloud. The real advantage is the global shortcut, which makes instant corrections easy. Sometimes the suggestions deserve a second look, but keeping everything local has really reassured me about confidentiality without affecting my workflow.

Tom Nijhof

That is really cool! What do you use?

Sami • Edited

I actually ended up building it myself! It's a desktop app I developed called LinguaPilot AI (I often call it my personal AI Writing Mentor).

I couldn't find any tool on the market that combined a global Windows shortcut with data privacy, so I coded it to run offline using local LLMs via Ollama.

If interested, you can easily find it on Google by searching for: LinguaPilot AI — Your AI Writing Mentor for Windows.

Let me know what you think if you check it out!

Tom Nijhof

I tried searching for it, but came back to this comment and a Netlify website that is blocked by security.

Could you share your git project?

Sami

Hey! Thanks a lot for letting me know, I really appreciate it. It is certainly the Netlify subdomain that triggered the security system.

To keep it straightforward, the official GitHub repository for the project is available directly in my profile bio.

The app is packed with advanced features tailored for local execution:

  • Smart Auto-Detection: It automatically detects your installed Ollama models (as long as Ollama is running).
  • Custom Flexibility: It lets you enter custom models manually and test the connection before you even start.

Originally, it was purely local. However, after getting feedback from users with older PC configurations who wanted more speed for non-sensitive text, I integrated cloud models like OpenAI, Gemini, Mistral, and Anthropic.

The architecture is completely modular—if you choose to stick to Ollama, it runs entirely independently of the cloud. This means you can use it 100% offline with absolute data privacy for sensitive documents.

Just to be upfront, there is a free evaluation version available so you can test the global shortcut and see how it fits your workflow.

⚠️ Quick note on Windows SmartScreen: Since it's a brand new software and I haven't invested in a pricey Windows code-signing certificate yet, SmartScreen might flash a warning. You can safely bypass it by clicking 'More info' ➡️ 'Run anyway'.

Would love to know your thoughts on the architecture or the shortcut integration if you give it a spin!

ben ajaero

Tom, this is a stellar walkthrough on setting up a local AI writing assistant with Ollama. It's refreshing to see AI tools being developed with privacy and local execution in mind.

Tom Nijhof

Thank you, it still feels so weird that Meta is the one to help with privacy 😄

ben ajaero

Meta is pretty good on the developer side. Their maintenance of React and GraphQL has been great.

Tom Nijhof

True, but I did not see an LLM as a dev tool, especially when you see how much revenue OpenAI gets from a chatbot 😯

Sami Ekblad

Setting up local stuff has got so much easier lately. As a Java developer I tested the setup of my own custom web app UI with Ollama and it takes like 20 lines of UI code and 2 commands to get Mistral running locally in a Docker container. Now I can build local AI systems for anything! But my question is: Is there a way to use GPUs in local containers to speed up LLMs?
