Reinforcement Learning - CyberAgent AI Lab

2023.7.20

Exploration of Unranked Items in Safe Online Learning to Re-Rank

2023.7.10

Rate-Optimal Bayesian Simple Regret in Best Arm Identification

2023.6.20

An Optimal Clustering Algorithm for the Labeled Stochastic Block Model

2023.6.12

Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium

2023.4.19

Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games

2022.10.24

Fair Matrix Factorisation for Large-Scale Recommender Systems

2022.7.12

General Reinforcement Learning

2022.6.26

Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games

2022.5.17

Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search

2022.5.17

Computing Strategies of American Football via Counterfactual Regret Minimization

2022.5.16

Thresholded Lasso Bandit

2021.12.21

Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games

2021.12.13

Mean Variance Efficient Reinforcement Learning

2021.12.1

A Study of Policy Gradient Methods in the Repeated Prisoner's Dilemma with Observation Errors

2021.8.27

Reinforcement Learning

2020.1.28

Online Learning for Bidding Agent in First Price Auction