Really liked this article, and thanks for sharing the part on "Other Approaches to Evaluation". I was thinking along the same lines: in what ways can we explore the capacity of current LLMs?
Would you be looking into other models as well, like Mistral, Llama 2, and the like?
There's actually a new model called Prometheus (huggingface.co/kaist-ai/prometheus...) that claims to be on par with GPT-4 for evaluation!