Lakshmi Shankar

Fine-tuning Qwen as my personal assistant

I Built a Local “Remembering” AI Assistant Using Qwen 2.5 — In Under 50 Lines

I wanted my own tiny AI that runs locally, remembers things only when I tell it to, and doesn’t burn my GPU alive.
So I grabbed Qwen 2.5 (1.5B) and wrote this little assistant.

It talks.
It thinks.
It remembers on command.
And it fits in one Python file.


The Code (Yes, it’s this small)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    dtype=torch.float16,
    device_map="auto",
    local_files_only=True,  # expects the weights to be downloaded already
)

system_prompt = "You are a helpful assistant that helps as a software developer."
current_prompt = [{"role": "system", "content": system_prompt}]
chat_history = []  # long-term "memory", filled only on explicit command

while True:
    user_input = input("User:\n")
    current_prompt.append({"role": "user", "content": user_input})

    if user_input.lower() in ["bye", "exit", "quit"]:
        print("\nExiting model...")
        break

    # Store the message in memory only when the user says "remember" or "log"
    if any(word in user_input.lower() for word in ["remember", "log"]):
        chat_history.append({"role": "user", "content": user_input})

    final_prompt = chat_history + current_prompt
    text = tokenizer.apply_chat_template(final_prompt, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=180)
    # Decode only the newly generated tokens, not the echoed prompt
    reply = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ).strip()

    # Keep the assistant's acknowledgement alongside the remembered message
    if "remember" in user_input.lower():
        chat_history.append({"role": "assistant", "content": reply})

    print("\nAssistant:\n" + reply + "\n")
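
The memory trick is just list concatenation: chat_history holds the turns I explicitly asked it to keep, and it gets prepended to the running conversation on every turn. After one remembered message, the list handed to apply_chat_template looks roughly like this (an illustrative sketch based on the code above, not a captured dump):

final_prompt = [
    # chat_history: turns the user explicitly asked to remember
    {"role": "user", "content": "Remember my project name is SkillGapMatcherAi"},
    {"role": "assistant", "content": "Noted! I'll remember that."},
    # current_prompt: the system prompt plus every user message so far
    {"role": "system", "content": "You are a helpful assistant that helps as a software developer."},
    {"role": "user", "content": "Remember my project name is SkillGapMatcherAi"},
    {"role": "user", "content": "What's my project name?"},
]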

What It Can Do

  • Talks like a normal chat model
  • Keeps track of conversation
  • Remembers things only when I say “remember” or “log”
  • Everything runs offline with Qwen 2.5 (after a one-time download, sketched below)
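
Since the script loads the model with local_files_only=True, the weights need to be on disk before the first run. One way to pull them down ahead of time, a quick sketch assuming the huggingface_hub package is installed:

from huggingface_hub import snapshot_download

# One-time download so local_files_only=True can find the weights later
snapshot_download(repo_id="Qwen/Qwen2.5-1.5B-Instruct")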

Example moment:

User: Remember my project name is SkillGapMatcherAi
Assistant: Noted! I’ll remember that.

Later…

User: What's my project name?
Assistant: You said your project name is SkillGapMatcherAi.

Feels like magic, but it’s just Python + Qwen.

