Most language models overlook low-resource languages. Khasibert is built to change that: it is the first open-source Khasi language model, designed for translation, summarization, and civic NLP tasks in Northeast India.
What Is Khasibert?
- A compact transformer-based LLM trained on Khasi-language corpora
- Optimized for low-resource deployment and real-world usability
- Built by MWire Labs to support inclusive, culturally aware AI
Why It Matters
- Khasi is spoken by over a million people, yet underrepresented in mainstream NLP
- Khasibert enables language technology research, civic applications, and education tools
- It’s part of a broader mission to democratize AI for Northeast India
What’s Under the Hood
- Pretrained on cleaned, deduplicated Khasi text
- Fine-tuned for translation, summarization, and semantic understanding
- Benchmarked for responsiveness in resource-constrained environments
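To make "cleaned, deduplicated" concrete, here is a minimal sketch of one common way to do it: normalize each line, then drop exact repeats by hash. This is an illustration only, not MWire Labs' actual pipeline; the function names `normalize` and `clean_and_deduplicate` and the sample sentences are assumptions made for this example.

```python
import hashlib
import unicodedata


def normalize(line: str) -> str:
    """Build a comparison key: NFC-normalize, collapse whitespace, casefold."""
    text = unicodedata.normalize("NFC", line)
    return " ".join(text.split()).casefold()


def clean_and_deduplicate(lines):
    """Drop empty lines and exact duplicates (after normalization).

    Hypothetical helper: keeps the first occurrence of each line and
    preserves the original order. Near-duplicates are NOT detected.
    """
    seen = set()
    kept = []
    for line in lines:
        key = normalize(line)
        if not key:
            continue  # skip blank or whitespace-only lines
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # already kept an equivalent line
        seen.add(digest)
        # Keep the original casing, with whitespace tidied up.
        kept.append(" ".join(unicodedata.normalize("NFC", line).split()))
    return kept


if __name__ == "__main__":
    # Illustrative sample lines, not taken from the real training corpus.
    sample = ["Khublei  shibun", "khublei shibun", "", "Phi long kumno?"]
    print(clean_and_deduplicate(sample))
```

Hashing the normalized key keeps memory use low on large corpora, since only digests are stored. Catching near-duplicates (for example, lines that differ by one word) would need a different technique such as MinHash, which this sketch does not attempt.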