Derrick Zziwa

Open-Source Models Surpass GPT-3.5 on Several Benchmarks

Unleashing the Power of Open Source AI to Redefine Possibilities

Stability.AI, in an audacious move to push the boundaries of AI capabilities, proudly presents FreeWilly - a groundbreaking open-source project aimed at challenging the mighty GPT-3.5. Developed in collaboration with CarperAI lab, FreeWilly emerges as a formidable Large Language Model (LLM) with exceptional reasoning abilities, poised to revolutionize the world of natural language processing.

Empowering Open Source AI Research

FreeWilly's mission is to democratize AI research and foster open collaboration. Both FreeWilly1 and its successor FreeWilly2 are released under open-source licenses, encouraging the AI community to explore, innovate, and contribute to the future of language models. As staunch advocates of open access, Stability.AI is committed to making advancements in AI accessible to all.

A Formidable Challenge to GPT-3.5

Built on the strong foundations of the LLaMA 65B and LLaMA 2 70B models, FreeWilly1 and FreeWilly2 were carefully fine-tuned on a new synthetically generated dataset. Stability.AI's team employed Supervised Fine-Tuning (SFT) in the standard Alpaca format to bring these models to peak performance.
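
For readers unfamiliar with the Alpaca format mentioned above, here is a minimal sketch of its standard prompt template. The helper function and the sample record are illustrative assumptions, not Stability.AI's actual training code.

```python
# Minimal sketch of the standard Alpaca prompt template used for
# supervised fine-tuning (SFT). The build_prompt helper and the sample
# record are illustrative assumptions, not Stability.AI's actual code.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

def build_prompt(record: dict) -> str:
    """Render one instruction/input/output record into a single SFT training string."""
    return ALPACA_TEMPLATE.format(**record)

example = {
    "instruction": "Summarize the following paragraph in one sentence.",
    "input": "FreeWilly2 is a fine-tuned LLaMA 2 70B model released by Stability AI.",
    "output": "FreeWilly2 is Stability AI's fine-tuned version of LLaMA 2 70B.",
}
print(build_prompt(example))
```

Alpaca also defines a shorter template for instructions that have no input field; the variant shown here is the one that includes an input for context.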

Red-Teaming and Community Collaboration

Stability.AI leaves no stone unturned when it comes to the safety and ethics of FreeWilly. Thorough internal red-teaming has been conducted to keep the models polite and harmless. The community is also encouraged to participate in the red-teaming process, as diverse perspectives can help strengthen the models and identify areas for improvement.

A Data-Driven Journey

Drawing inspiration from Microsoft's trailblazing "Orca: Progressive Learning from Complex Explanation Traces of GPT-4" paper, Stability.AI took a novel approach to data generation. By building a variant dataset of 600,000 meticulously curated, high-quality instruction data points from Enrico Shippole's datasets, FreeWilly sets new standards in training efficiency. Rigorous filtering and careful curation ensured that examples from the evaluation benchmarks did not leak into the training data, allowing for a fair and robust evaluation.
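
To make the idea of benchmark decontamination concrete, the sketch below drops any training example that shares a long word n-gram with an evaluation question. The helper names and the 8-gram threshold are assumptions chosen for illustration, not details of Stability.AI's actual pipeline.

```python
# Illustrative sketch of n-gram-based decontamination: remove any training
# example that shares a long n-gram with a benchmark question. The helper
# names and the 8-gram threshold are assumptions, not Stability.AI's pipeline.

def ngrams(text: str, n: int = 8) -> set:
    """Return the set of lowercased word n-grams in a string."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def decontaminate(train_examples: list[str], eval_questions: list[str], n: int = 8) -> list[str]:
    """Keep only training examples with no n-gram overlap against the eval set."""
    eval_grams = set()
    for question in eval_questions:
        eval_grams |= ngrams(question, n)
    return [ex for ex in train_examples if not (ngrams(ex, n) & eval_grams)]

# Tiny usage example with made-up strings.
train = ["Explain why the sky is blue in simple terms and give an everyday analogy a child would understand."]
evals = ["Which of the following best explains why the sky appears blue during the day?"]
print(decontaminate(train, evals))  # no 8-gram overlap, so the example is kept
```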

Unleashing Unprecedented Performance

Internal evaluation using EleutherAI's lm-eval-harness, with AGIEval added, reveals the FreeWilly models' exceptional performance across domains. From intricate reasoning to linguistic subtleties, FreeWilly shows strong proficiency in specialized tasks such as legal questions and mathematical problem-solving.
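
As a rough illustration of how such an evaluation can be run, the snippet below calls the harness's Python entry point. The backend name, argument names, and available task names vary between harness versions, and the Hugging Face model ID and task list shown are assumptions, so treat this strictly as a sketch rather than Stability.AI's exact configuration.

```python
# Sketch of reproducing a benchmark run with EleutherAI's lm-evaluation-harness.
# Backend name, argument names, and task names differ between harness versions;
# the model ID and task list here are illustrative assumptions.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",                               # Hugging Face causal-LM backend
    model_args="pretrained=stabilityai/FreeWilly2",  # assumed Hugging Face model ID
    tasks=["arc_challenge", "hellaswag", "truthfulqa_mc"],
    batch_size=4,
)
print(results["results"])  # per-task metrics, e.g. accuracy
```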

Leading the Open LLM Leaderboard

FreeWilly's exceptional results were not only validated by Stability.AI researchers but also independently reproduced by Hugging Face on July 21st, 2023. These benchmark results place FreeWilly at the top of the Open LLM Leaderboard, making it a powerful contender in the AI landscape.

An Open Future of Possibilities

FreeWilly1 and FreeWilly2 mark a transformative step forward in the world of open-source Large Language Models. By enhancing natural language understanding and enabling complex tasks, these models offer limitless possibilities and applications. Stability.AI extends its heartfelt gratitude to its team of passionate researchers, engineers, and collaborators whose unwavering dedication led to this remarkable milestone.

Source: https://lnkd.in/dNcY_9jW
