Tinkering with Tinker
Video
My first impressions and first few tests exploring Tinker: training an LLM to respond to every question with ‘foo’, then moving on to an RL example with a custom synthetic task and watching reward and accuracy go up and up :)
Really impressed by this so far. Expect more Tinker tinkering in the new year :)