DEV Community

Wincent Balin

Closure

#ai

After a pause, this series is coming to a close, mostly because of the rapid developments in large language models.

Original intention

At the beginning, I intended to create a language model that would take the prompt "Geschirrabwaschgesetz" (a law about washing dishes) and write a corresponding law text in German.

I was discouraged from training the original char RNN by the daunting training time that 110 M of training data would have required. Therefore, I went with fine-tuning a German GPT-2 instead (and later a better one; thanks, Jo!). The fine-tuning process for such a model is described here or here, for example.
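To make the idea of "title in, law text out" a bit more concrete, here is a minimal sketch of how training examples for such a fine-tuning run could be laid out. This is not the exact setup I used. The `Law` class, the helper names, the title/body layout, and the `<|endoftext|>` separator are all assumptions for illustration (the original GPT-2 uses that token, but a particular German checkpoint may define a different one).

```python
from dataclasses import dataclass

# Assumption: the end-of-text marker. The original GPT-2 uses "<|endoftext|>",
# but a given German GPT-2 checkpoint may define a different one.
EOS_TOKEN = "<|endoftext|>"


@dataclass
class Law:
    """Hypothetical container for one law: its title and its full text."""
    title: str
    body: str


def format_example(law: Law, eos: str = EOS_TOKEN) -> str:
    """Join title and body into one training string that ends with the EOS marker."""
    return f"{law.title.strip()}\n\n{law.body.strip()}{eos}"


def build_corpus(laws: list[Law], eos: str = EOS_TOKEN) -> str:
    """Concatenate all formatted laws into a single text for causal LM fine-tuning."""
    return "".join(format_example(law, eos) for law in laws)


def make_prompt(title: str) -> str:
    """At generation time only the title is given; the model should continue with the body."""
    return f"{title.strip()}\n\n"


if __name__ == "__main__":
    # The body below is a made-up placeholder, not real training data.
    sample = Law(title="Geschirrabwaschgesetz", body="(placeholder law text)")
    print(format_example(sample))
    print(repr(make_prompt("Geschirrabwaschgesetz")))
```

The point of this layout is that every training example starts exactly like the prompt used later, so a model fine-tuned on it has seen "title, blank line, law text" many times and can continue from the title alone.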

(Un-)expected discovery

I happened to discover that my intended use case is covered almost perfectly by the LLAMA 2 Chat German model ("almost" because of a few grammatical errors). This is very likely because it was fine-tuned on the German legal SQuAD dataset, among others.

I do not want to withhold the result from you (produced in LM Studio):

[Image: Output to "Geschirrabwaschgesetz"]

Just look at this beauty! It even defined "Hygiene" in the last subparagraph! And with that, this series is concluded.
