Hello fairy dairy diary~!
Shelved the causal LLM for a while and moved on to a new horizon...
...masked language modeling!
The reasoning is simple: the hypothesis is that it's easier to train an acceptable MLM on 16GB of VRAM, and that the model can then be chained to itself to generate text, as sketched below.
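To make "chained to itself" concrete, here's a minimal sketch of one way it could work: append a run of [MASK] tokens, then fill them one at a time, feeding each prediction back in. This is just an illustration, not the plan's final decoding scheme; the checkpoint name is a placeholder, not a model from this post.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "bert-base-uncased"  # placeholder checkpoint for illustration
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name).eval()

prompt = "The reasoning is simple:"
n_new = 8  # how many tokens to generate

ids = tok(prompt, return_tensors="pt").input_ids
# insert n_new [MASK] tokens just before the trailing [SEP]
masks = torch.full((1, n_new), tok.mask_token_id)
ids = torch.cat([ids[:, :-1], masks, ids[:, -1:]], dim=1)

with torch.no_grad():
    for _ in range(n_new):
        # find the first remaining [MASK], predict it, write it back
        pos = (ids[0] == tok.mask_token_id).nonzero()[0].item()
        logits = model(input_ids=ids).logits
        ids[0, pos] = logits[0, pos].argmax()

print(tok.decode(ids[0], skip_special_tokens=True))
```

Greedy left-to-right filling is the simplest variant; mask-predict-style approaches instead refill the lowest-confidence positions over several passes.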
For now, the points of reference are:
https://github.com/samvher/bert-for-laptops/blob/main/BERT_for_laptops.ipynb
https://arxiv.org/abs/2212.14034 (the "Cramming" paper)
And there was a paper that explored whether a decoder can be trained from an encoder-only model; IIRC they found that it can, but I can't find it now and might be hallucinating it. I'll look for it later.
For now, some pretraining is in order; then I'll decide what to do with the model and how to update it!
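For reference, "pretraining" here means the standard masked-token objective. A minimal sketch, assuming BERT-style 15% dynamic masking via the Hugging Face collator (the checkpoint name is again just a placeholder):

```python
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# dynamic masking: each batch gets a fresh random 15% of prediction targets
collator = DataCollatorForLanguageModeling(tokenizer=tok, mlm_probability=0.15)
texts = ["a small batch of text for the masked language modeling objective",
         "another short line so the collator has something to mask"]
batch = collator([tok(t) for t in texts])

# labels are -100 everywhere except masked positions, so the
# cross-entropy loss only covers the tokens the model must recover
loss = model(**batch).loss
loss.backward()
print(float(loss))
```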
Chill!