2026

Workshop
Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests
Thanawat Lodkaew, Johannes Ackermann, Soichiro Nishimori, Nontawat Charoenphakdee, Masashi Sugiyama, and Takashi Ishida
In ICML 2026 Workshop: Statistical Frameworks for Uncertainty in Agentic Systems, Jul 2026
Links: arXiv

Workshop
Mitigating Reward Hacking in RLHF via Advantage Sign Robustness
Shinnosuke Ono, Johannes Ackermann, Soichiro Nishimori, Takashi Ishida, and Masashi Sugiyama
In ICML 2026 Workshop: Epistemic Intelligence in Machine Learning, Jul 2026
Links: arXiv

ICML
Gradient Regularization Prevents Reward Hacking in Reinforcement Learning from Human Feedback and Verifiable Rewards
Johannes Ackermann, Michael Noukhovitch, Takashi Ishida, and Masashi Sugiyama
In ICML 2026, Jul 2026
Links: arXiv, PDF, Website

ICML
Bridging Spherical Black-Box Optimizers
Johannes Ackermann and Stefano Peluchetti
In ICML 2026, Jul 2026
Links: arXiv, PDF

2025

COLM
Off-Policy Corrected Reward Modeling for Reinforcement Learning from Human Feedback
Johannes Ackermann, Takashi Ishida, and Masashi Sugiyama
In Conference on Language Modeling (COLM) 2025, Oct 2025
Links: arXiv, PDF

RLC
Recursive Reward Aggregation
Yuting Tang, Yivan Zhang, Johannes Ackermann, Yu-Jie Zhang, Soichiro Nishimori, and Masashi Sugiyama
In Reinforcement Learning Conference (RLC) 2025, Aug 2025
Links: arXiv, PDF

RLC
Offline Reinforcement Learning with Domain-Unlabeled Data
Soichiro Nishimori, Xin-Qiang Cai, Johannes Ackermann, and Masashi Sugiyama
In Reinforcement Learning Conference (RLC) 2025, Aug 2025
Links: arXiv, PDF

2024

RLC
Offline Reinforcement Learning from Datasets with Structured Non-Stationarity
Johannes Ackermann, Takayuki Osa, and Masashi Sugiyama
In Reinforcement Learning Conference (RLC) 2024, Aug 2024
Links: arXiv, PDF, Website

2022

Workshop
High-Resolution Image Editing via Multi-Stage Blended Diffusion
Johannes Ackermann and Minjun Li
In NeurIPS Machine Learning for Creativity and Design Workshop 2022, Dec 2022
Links: arXiv, PDF, Blog

2021

ECML-PKDD
Unsupervised Task Clustering for Multi-Task Reinforcement Learning
Johannes Ackermann, Oliver Richter, and Roger Wattenhofer
In ECML-PKDD 2021, Sep 2021
Links: PDF, Video

2020

ECOC
Convolutional Neural Network Based Blind Estimation of Generalized Mutual Information for Optical Communication
Johannes Ackermann, Maximilian Schädler, and Christian Blümm
In European Conference on Optical Communication (ECOC), Dec 2020
Links: PDF

2019

Workshop
Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics
Johannes Ackermann, Volker Gabler, Takayuki Osa, and Masashi Sugiyama
In Deep Reinforcement Learning Workshop at NeurIPS, Dec 2019
Links: arXiv, PDF