learning (RL) in AI model building has been a growing topic over the past few months. From DeepSeek models incorporating RL into their training processes to other success stories of RL-based improvement, “AI Twitter” has been ablaze. As more agents get deployed, a question emerges: can reinforcement learning control systems be built entirely in prompts? After all, reinforcement learning is about using real-world feedback to optimize toward a goal, traditionally by adjusting model weights.