While significant research advances have been made in the field of deep
reinforcement learning, a major challenge to its widespread industrial
adoption, one that has surfaced only recently and remains little explored,
is its potential vulnerability to privacy breaches. In particular, the
literature offers no concrete adversarial attack strategies tailored to
studying the vulnerability of deep reinforcement learning algorithms to
membership inference attacks. To address this gap, we propose an adversarial
attack framework designed to test the vulnerability of deep reinforcement
learning algorithms to membership inference attacks. More specifically, we design a
series of experiments to investigate the impact of temporal correlation, which
naturally exists in reinforcement learning training data, on the probability of
information leakage. Furthermore, we study the differences in the performance
of \emph{collective} and \emph{individual} membership attacks against deep
reinforcement learning algorithms. Experimental results show that the proposed
adversarial attack framework is surprisingly effective at inferring the data
used during deep reinforcement learning training, with an accuracy exceeding $84\%$ in
individual and $97\%$ in collective mode on two different control tasks in
OpenAI Gym, which raises serious privacy concerns in the deployment of models
resulting from deep reinforcement learning. Moreover, we show that the learning
state of a reinforcement learning algorithm significantly influences the
severity of the privacy breach.
