Reinforcement Learning - CyberAgent AI Lab

2023.10.29

Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium

2023.10.29

Zero-Variance Perturbation Utility for Extensive-Form Games

2023.10.29

A Slingshot Approach to Learning in Monotone Games

2023.9.6

A Study on Algorithms for Achieving Fair Resource Allocation in Online Environments

2023.9.6

Monte Carlo Tree Search for Constraints That Adjust Regional Disparities in Medical Resident Assignment

2023.7.10

Rate-Optimal Bayesian Simple Regret in Best Arm Identification

2023.6.20

An Optimal Clustering Algorithm for the Labeled Stochastic Block Model

2023.6.12

Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium

2023.6.6

A Study on Algorithms for Achieving Fair Resource Allocation in Online Environments

2023.6.6

A Study on Multiplicative Weights Update with Mutation in Two-Player Zero-Sum Extensive-Form Games

2023.4.19

Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games

2023.3.13

Applying Decision Transformer to Policy Learning in Task-Oriented Dialogue Systems

2023.3.13

Clarifying the Role of Adversarial Learning in Reinforcement-Learning-Based Dialogue Policy Learning for Task-Oriented Dialogue

2023.3.2

Monte Carlo Tree Search for Constraints to Adjust Regional Disparities in Medical Resident Assignment

2023.3.2

A Study on Algorithms for Achieving Fair Resource Allocation in Online Environments

2022.11.20

Thresholded Lasso Bandit

2022.11.20

Reinforcement Learning for Beam Search Inference

2022.11.20

Last-Iterate Convergence with Full- and Noisy-Information Feedback in Two-Player Zero-Sum Games

2022.7.12

Reinforcement Learning in General

2022.6.26

Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games

2022.6.26

A Study on Learning Algorithms Using Replicator Dynamics with Mutation in Two-Player Zero-Sum Games