Reinforcement Fine-Tuning LLMs with GRPO