2025

COLM: Off-Policy Corrected Reward Modeling for Reinforcement Learning from Human Feedback
Johannes Ackermann, Takashi Ishida, and Masashi Sugiyama
In Conference on Language Modeling (COLM) 2025, Oct 2025
[PDF]

RLC: Recursive Reward Aggregation
Yuting Tang, Yivan Zhang, Johannes Ackermann, Yu-Jie Zhang, Soichiro Nishimori, and Masashi Sugiyama
In Reinforcement Learning Conference (RLC) 2025, Aug 2025
[PDF]

RLC: Offline Reinforcement Learning with Domain-Unlabeled Data
Soichiro Nishimori, Xin-Qiang Cai, Johannes Ackermann, and Masashi Sugiyama
In Reinforcement Learning Conference (RLC) 2025, Aug 2025
[arXiv] [PDF]

2024

RLC: Offline Reinforcement Learning from Datasets with Structured Non-Stationarity
Johannes Ackermann, Takayuki Osa, and Masashi Sugiyama
In Reinforcement Learning Conference (RLC) 2024, Aug 2024
[arXiv] [PDF] [Website]

2022

Workshop: High-Resolution Image Editing via Multi-Stage Blended Diffusion
Johannes Ackermann and Minjun Li
In NeurIPS Machine Learning for Creativity and Design Workshop 2022, Dec 2022
[arXiv] [PDF] [Blog]

2021

ECML-PKDD: Unsupervised Task Clustering for Multi-Task Reinforcement Learning
Johannes Ackermann, Oliver Richter, and Roger Wattenhofer
In ECML-PKDD 2021, Sep 2021
[PDF] [Video]

2020

ECOC: Convolutional Neural Network Based Blind Estimation of Generalized Mutual Information for Optical Communication
Johannes Ackermann, Maximilian Schädler, and Christian Blümm
In European Conference on Optical Communication (ECOC), Dec 2020
[PDF]

2019

Workshop: Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics
Johannes Ackermann, Volker Gabler, Takayuki Osa, and Masashi Sugiyama
In Deep Reinforcement Learning Workshop at NeurIPS, Dec 2019
[arXiv] [PDF]