publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2026

Workshop

Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests

Thanawat Lodkaew, Johannes Ackermann, Soichiro Nishimori, Nontawat Charoenphakdee, Masashi Sugiyama, and Takashi Ishida

In ICML 2026 Workshop: Statistical Frameworks for Uncertainty in Agentic Systems, Jul 2026

arXiv

Workshop

Mitigating Reward Hacking in RLHF via Advantage Sign Robustness

Shinnosuke Ono, Johannes Ackermann, Soichiro Nishimori, Takashi Ishida, and Masashi Sugiyama

In ICML 2026 Workshop: Epistemic Intelligence in Machine Learning, Jul 2026

arXiv

ICML

Gradient Regularization Prevents Reward Hacking in Reinforcement Learning from Human Feedback and Verifiable Rewards

Johannes Ackermann, Michael Noukhovitch, Takashi Ishida, and Masashi Sugiyama

In ICML 2026, Jul 2026

arXiv PDF Website

ICML

Bridging Spherical Black-Box Optimizers

Johannes Ackermann and Stefano Peluchetti

In ICML 2026, Jul 2026

arXiv PDF

2025

COLM

Off-Policy Corrected Reward Modeling for Reinforcement Learning from Human Feedback

Johannes Ackermann, Takashi Ishida, and Masashi Sugiyama

In Conference on Language Modeling (COLM) 2025, Oct 2025

arXiv PDF

RLC

Recursive Reward Aggregation

Yuting Tang, Yivan Zhang, Johannes Ackermann, Yu-Jie Zhang, Soichiro Nishimori, and Masashi Sugiyama

In Reinforcement Learning Conference (RLC) 2025, Aug 2025

arXiv PDF

RLC

Offline Reinforcement Learning with Domain-Unlabeled Data

Soichiro Nishimori, Xin-Qiang Cai, Johannes Ackermann, and Masashi Sugiyama

In Reinforcement Learning Conference (RLC) 2025, Aug 2025

arXiv PDF

2024

RLC

Offline Reinforcement Learning from Datasets with Structured Non-Stationarity

Johannes Ackermann, Takayuki Osa, and Masashi Sugiyama

In Reinforcement Learning Conference (RLC) 2024, Aug 2024

arXiv PDF Website

2022

Workshop

High-Resolution Image Editing via Multi-Stage Blended Diffusion

Johannes Ackermann and Minjun Li

In NeurIPS Machine Learning for Creativity and Design Workshop 2022, Dec 2022

arXiv PDF Blog

2021

ECML-PKDD

Unsupervised Task Clustering for Multi-Task Reinforcement Learning

Johannes Ackermann, Oliver Richter, and Roger Wattenhofer

In ECML-PKDD 2021, Sep 2021

PDF Video

2020

ECOC

Convolutional Neural Network Based Blind Estimation of Generalized Mutual Information for Optical Communication

Johannes Ackermann, Maximilian Schädler, and Christian Blümm

In European Conference on Optical Communication (ECOC), Dec 2020

PDF

2019

Workshop

Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics

Johannes Ackermann, Volker Gabler, Takayuki Osa, and Masashi Sugiyama

In Deep Reinforcement Learning Workshop at NeurIPS, Dec 2019

arXiv PDF