DEV Community

Fleszarjacek

Llama-2-70b is almost as strong at factuality as gpt-4, and considerably better than gpt-3.5-turbo.

We compared Llama 2 7b, 13b and 70b (the chat-hf fine-tuned variants) against OpenAI's gpt-3.5-turbo and gpt-4. The test set was 373 news report statements, hand-labeled and verified three ways. For each statement we presented one correct and one incorrect summary, and each LLM had to decide which of the two was factually correct.😭
https://link.medium.com/ugIcBrTXxCb
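
For readers who want to try a similar comparison, here is a minimal sketch of how a two-choice factuality test like the one described above could be scored. It is not the harness used for the results in this post. The names `Example`, `build_prompt`, `evaluate` and the `choose` callable are hypothetical, the prompt wording is made up, and shuffling which summary appears as A or B is an assumption added here so that a model that always picks the same letter does not look accurate by accident.

```python
import random
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Example:
    """One hand-labeled item: a statement plus a correct and an incorrect summary."""
    statement: str
    correct_summary: str
    incorrect_summary: str


def build_prompt(statement: str, summary_a: str, summary_b: str) -> str:
    # Hypothetical prompt wording; the post does not give the actual prompt.
    return (
        f"Statement: {statement}\n"
        f"A: {summary_a}\n"
        f"B: {summary_b}\n"
        "Which summary is factually correct? Answer with A or B."
    )


def evaluate(
    examples: Sequence[Example],
    choose: Callable[[str], str],
    seed: int = 0,
) -> float:
    """Return the fraction of examples where `choose` picks the correct summary.

    `choose` stands in for a call to any LLM: it takes a prompt string and
    returns a reply whose first character is "A" or "B".
    """
    if not examples:
        raise ValueError("need at least one example")
    rng = random.Random(seed)  # fixed seed so every model sees the same ordering
    hits = 0
    for ex in examples:
        # Assumption: randomize the position of the correct summary.
        correct_is_a = rng.random() < 0.5
        if correct_is_a:
            a, b = ex.correct_summary, ex.incorrect_summary
        else:
            a, b = ex.incorrect_summary, ex.correct_summary
        reply = choose(build_prompt(ex.statement, a, b)).strip().upper()
        expected = "A" if correct_is_a else "B"
        if reply[:1] == expected:
            hits += 1
    return hits / len(examples)
```

To compare models, you would wrap each model's API call in its own `choose` function and run `evaluate` over the same examples with the same seed, so that accuracy differences come from the models and not from the ordering.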
