large-language-models, optimization, reinforcement-learning, alignment, reasoning

Current frame

JD Explore Academy technical staff member and UT Austin Ph.D. student focused on LLM reasoning and alignment