Publications - CyberAgent AI Lab

Policy Gradient with Kernel Quadrature

Reinforcement Learning

Transactions on Machine Learning Research (TMLR)

Reinforcement Learning for Edit-Based Non-Autoregressive Neural Machine Translation

Reinforcement Learning

NAACL SRW 2024

Policy Gradient Algorithms with Monte-Carlo Tree Learning for Non-Markov Decision Processes

Reinforcement Learning

Reinforcement Learning Conference (RLC) 2024

On the True Distribution Approximation of Minimum Bayes-Risk Decoding

Natural Language Processing

NAACL 2024

Model-Based Minimum Bayes Risk Decoding

Reinforcement Learning

ICML 2024

Generating Diverse and High-Quality Texts by Minimum Bayes Risk Decoding

Reinforcement Learning

Findings of ACL 2024

Hyperparameter-Free Approach for Faster Minimum Bayes Risk Decoding

Reinforcement Learning

Findings of ACL 2024

Adaptively Perturbed Mirror Descent for Learning in Games

Reinforcement Learning

ICML 2024

On Universally Optimal Algorithms for A/B Testing

Reinforcement Learning

ICML 2024

Matroid Semi-Bandits in Sublinear Time

Machine Learning

41st International Conference on Machine Learning (ICML 2024)

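Last-Iterate Convergence of Mutation-Driven Follow the Regularized Leader in Two-Player Zero-Sum Games

Reinforcement Learning

IPSJ Journal (Information Processing Society of Japan)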

Investigating Effect of Altered Auditory Feedback on Self-Representation, Subjective Operator Experience, and Task Performance in Teleoperation of a Social Robot

Customer Service Dialogue Agents

CHI 2024

Regular Expressions with Backreferences and Lookaheads Capture NLOG

Theoretical Computer Science

51st EATCS International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Optimal PSPACE-hardness of Approximating Set Cover Reconfiguration

Theoretical Computer Science

51st EATCS International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Alphabet Reduction for Reconfiguration Problems

Theoretical Computer Science

51st EATCS International Colloquium on Automata, Languages, and Programming (ICALP 2024)

The potentiality of telepsychiatry using a teleoperated robot for a patient with alcohol abuse on an isolated island

Customer Service Dialogue Agents

PCNR

Computational complexity of normalizing constants for the product of determinantal point processes

Theoretical Computer Science

Theoretical Computer Science

Field Experiments on the Effects of Multiple-Robot Expressions for Robot Influence in Recommendation Situations

Customer Service Dialogue Agents

RA-L

Grasping Both Query Relevance and Essential Content for Query-focused Summarization

Natural Language Processing

SIGIR'24

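Monte Carlo Tree Search over Constraints for Adjusting Regional Disparities in Medical Resident Assignment

Reinforcement Learning

86th National Convention of IPSJ (Information Processing Society of Japan)

A Study on State Abstraction Methods in Two-Player Zero-Sum Markov Games

Reinforcement Learning

86th National Convention of IPSJ (Information Processing Society of Japan)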