DEV Community

Discussion on: I Ran Hermes Agent on the Same Task for 7 Days. The Skill File on Day 7 Looked Nothing Like Day 1.

Collapse
 
adriens profile image
adriens

Very interesting, and for the Nous you used from OpenRouter, did you use nousresearch/hermes-4-70b or less ? Which size makes interesting on local GPU ?

Collapse
 
sreejit_ profile image
Sreejit Pradhan

For this experiment I used a Nous Hermes endpoint via OpenRouter rather than full local inference, so not specifically Hermes-4-70B local. But from testing, the really interesting threshold for Hermes-style persistent learning feels around the 30B–70B range — that’s where skill evolution, preference inference, autonomous refinement, and “operational intuition” become much more coherent across sessions.

That said, even 7B–14B quantized models on consumer GPUs can become surprisingly strong for recurring workflows because the persistent skill/memory layer compounds over time. A smaller model with 7 days of accumulated skills can genuinely outperform a stronger stateless model for specific tasks.

Collapse
 
adriens profile image
adriens

Thanks a lot for the benchs and tips, they will be very useful !

Thread Thread
 
sreejit_ profile image
Sreejit Pradhan

You too man!☺️