For a while now, I’ve been grinding through reinforcement learning theory: value functions, policy gradients, Bellman equations, exploration strategies, the whole stack. And like a lot of people, I hit that wall where the math made sense on paper, but the intuition wasn’t sticking. So I decided to flip the script. Instead of forcing myself...
RL Fundamentals: Bandits & GridWorld Guide
Why Reinforcement Learning Feels Different (And Why That’s Good)
If you’ve worked with supervised learning, you’re used to a straightforward paradigm: show the model labeled examples, and it learns to predict labels for new data. Unsupervised learning asks the model to find patterns in unlabeled data. Reinforcement Learning (RL) flips the script entirely. In RL,...
