q-learning – maxant

About a year ago I set myself the goal of writing an algorithm that could learn to play tic-tac-toe. I didn't want to tell the algorithm what the rules of the game are, nor did I want it to try and use some kind of calculation to look ahead at possible moves which might lead to a win from the current state of the board. I wanted the algorithm to "learn" how to play, by playing against itself. I knew nothing about machine learning, so I spent a bit of time learning about neural networks but didn't get very far and convinced myself that neural networks wouldn't work and I put the project to the back of my mind. A few months ago at a Christmas party, I bumped into an acquaintance, JP, and I ended up telling him about my goal. He suggested reinforcement learning and a few days later I started reading about it. It didn't take long before I was trying to understand Q-learning and failing fast by getting lost in the maths. So with my limited knowledge I went about iteratively designing an algorithm to "learn" to play. Here is the result, which you can play against interactively by clicking on the board or buttons. It contains no code telling it what good or bad moves are and the AI algorithm knows nothing about the rules of the game - it's simply told what the outcome of each move is (unfinished, won, drawn, lost). It doesn't…

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Tag: q-learning

Using Reinforcement Learning To Learn To Play Tic-Tac-Toe