The exploration/exploitation trade-off is a dilemma we frequently face when choosing between options: do we stick with the option that has worked best so far, or try another one that might turn out to be better? To help students get a better feel for three of the most popular "multi-armed bandit" strategies for balancing exploration and exploitation (Epsilon Greedy, Thompson Sampling, and Upper Confidence Bound), I combined my R package "contextual" with the versatile "animation" package to create some interactive animations. The source code of all three animations can be found in this Gist.
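For readers who want to see how the three strategies differ in how they pick an arm, here is a minimal, self-contained sketch in plain Python. It is not the code behind the animations and does not use the "contextual" package; the function names (`epsilon_greedy`, `thompson_sampling`, `ucb1`, `simulate`), the Bernoulli reward model, the `epsilon = 0.1` default, the uniform Beta(1, 1) prior, and the UCB1 variant of Upper Confidence Bound are all assumptions chosen for illustration.

```python
import math
import random


def epsilon_greedy(counts, sums, t, rng, epsilon=0.1):
    """With probability epsilon explore a random arm, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    # Estimated mean reward per arm (0.0 for arms never pulled).
    means = [s / n if n else 0.0 for s, n in zip(sums, counts)]
    return max(range(len(counts)), key=means.__getitem__)


def thompson_sampling(counts, sums, t, rng):
    """Sample from each arm's Beta posterior and play the largest draw."""
    # Beta(1 + successes, 1 + failures), assuming a uniform prior.
    draws = [rng.betavariate(1 + s, 1 + n - s) for s, n in zip(sums, counts)]
    return max(range(len(counts)), key=draws.__getitem__)


def ucb1(counts, sums, t, rng):
    """Play the arm with the highest optimistic estimate (UCB1)."""
    # Pull every arm once before the confidence bound is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [s / n + math.sqrt(2 * math.log(t) / n)
              for s, n in zip(sums, counts)]
    return max(range(len(counts)), key=scores.__getitem__)


def simulate(policy, probs, horizon, seed=0):
    """Run one policy on a Bernoulli bandit; return (pull counts, total reward)."""
    rng = random.Random(seed)
    counts = [0] * len(probs)  # number of pulls per arm
    sums = [0] * len(probs)    # total reward per arm
    for t in range(1, horizon + 1):
        arm = policy(counts, sums, t, rng)
        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        sums[arm] += reward
    return counts, sum(sums)


if __name__ == "__main__":
    # Hypothetical three-armed bandit; the last arm pays out most often.
    for policy in (epsilon_greedy, thompson_sampling, ucb1):
        counts, total = simulate(policy, [0.2, 0.5, 0.8], horizon=1000)
        print(policy.__name__, counts, total)
```

The three functions share one signature so they can be swapped into the same loop. Epsilon Greedy explores at a fixed rate regardless of what it has learned, whereas Thompson Sampling and UCB1 explore less as their estimates for each arm become more certain.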