updated 2 public sources
reinforcement learning, large language model post-training, foundation models

Current frame

Researcher at Cohere