I'm an independent AI safety researcher, or trying to be one. I build things, break things, and write about what I find. Feedback welcome, especially disagreement.
Does it "learn" (as in "training"), or just memorize? I don't really see anything about learning here. And "self-improvement" seems kind of misleading, unless I completely misunderstood.
Here "self-improving" essentially means improving the skills the agent generates over time. Generally, skills are created just once. When they need improvement, for example because the agent makes mistakes while using a skill, we either have to explicitly prompt the agent to improve the skill or edit it ourselves. With Hermes, the agent realizes it needs to update the skill and does so on its own. That's the self-improvement part. No weight finetuning at all.
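To make that concrete, here's a minimal sketch of the loop being described: run a skill, and on failure have the model rewrite the skill text itself rather than waiting for a human edit. The function names (`run_skill`, `revise`) and the stub logic are hypothetical stand-ins, not Hermes's actual API.

```python
def revise(skill: str, task: str) -> str:
    """Placeholder for an LLM call that rewrites the skill text.
    Stub: just bumps the version string for illustration."""
    return skill.replace("v1", "v2")

def run_skill(skill: str, task: str) -> bool:
    """Placeholder: True if the skill handled the task without errors.
    Stub: pretend only the revised skill succeeds."""
    return "v2" in skill

def solve(task: str, skill: str, max_revisions: int = 3) -> str:
    """Retry the task, letting the agent rewrite its own skill on failure.
    No weights are updated; only the stored skill text changes."""
    for _ in range(max_revisions):
        if run_skill(skill, task):
            return skill
        # The agent notices the failure and revises the skill itself,
        # instead of being prompted to or having a human edit it.
        skill = revise(skill, task)
    return skill

print(solve("parse dates", "date-parsing skill v1"))
```

The point of the sketch is that "learning" here lives entirely in the stored skill text, which is why no training or finetuning is involved.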
Thanks. All I do is coding; that's why Hermes confused me.