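General-Purpose Quantum Architecture Search Based on Deep Reinforcement Learning
Zhi-Rong Zhong's team at Fuzhou University recently reported a study of general-purpose quantum architecture search based on deep reinforcement learning. The work was published in Physical Review A on November 4, 2025.
Reinforcement learning (RL) shows promise for automated quantum circuit design, but it often stalls in a "fidelity trap": by optimizing only state fidelity, agents overlook entanglement structure and get stranded in suboptimal, overly complex circuits, which significantly reduces search efficiency.
The team proposes a scheme that overcomes this barrier by implementing an entanglement-aware learning framework, augmenting the agent's reward function with a direct, quantitative measure of entanglement. This gives a more comprehensive physical description of the state space. They demonstrate the principle on three- and four-qubit state-synthesis tasks within an expanded gate set. On this problem, where the fidelity-driven agent systematically fails to discover the minimal-depth circuit, the entanglement-aware agent consistently succeeds.
The result is highly robust to variations in the initial random seed and extends to multiqubit systems even in the presence of noise. The findings establish a generalizable principle: incorporating entanglement as an auxiliary reward can significantly enhance RL-based solutions for a broad class of fidelity-centric tasks in quantum physics, paving the way for scalable, automated discovery on near-term quantum devices.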
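To make the idea of an entanglement-augmented reward concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the summary does not say which entanglement measure is used or how it enters the reward. The sketch assumes the Meyer-Wallach global entanglement as the measure and a simple linear penalty on the entanglement gap to the target state. The function names and the `weight` parameter are hypothetical.

```python
import numpy as np

def fidelity(psi, target):
    """State fidelity |<target|psi>|^2 for normalized pure states."""
    return float(abs(np.vdot(target, psi)) ** 2)

def single_qubit_purity(psi, qubit, n_qubits):
    """Purity Tr(rho_k^2) of the reduced state of one qubit."""
    # View the amplitude vector as a tensor with one axis per qubit.
    tensor = psi.reshape([2] * n_qubits)
    # Bring the chosen qubit's axis to the front and flatten the others.
    mat = np.moveaxis(tensor, qubit, 0).reshape(2, -1)
    rho = mat @ mat.conj().T  # 2x2 reduced density matrix
    return float(np.real(np.trace(rho @ rho)))

def meyer_wallach(psi, n_qubits):
    """Meyer-Wallach global entanglement Q = 2 * (1 - mean single-qubit purity)."""
    purities = [single_qubit_purity(psi, k, n_qubits) for k in range(n_qubits)]
    return 2.0 * (1.0 - float(np.mean(purities)))

def entanglement_aware_reward(psi, target, n_qubits, weight=0.5):
    """Assumed reward: fidelity minus a weighted entanglement mismatch."""
    gap = abs(meyer_wallach(psi, n_qubits) - meyer_wallach(target, n_qubits))
    return fidelity(psi, target) - weight * gap

# Example states: three-qubit GHZ target and the product state |000>.
ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)
zero = np.zeros(8, dtype=complex)
zero[0] = 1.0

r_product = entanglement_aware_reward(zero, ghz, 3)
r_target = entanglement_aware_reward(ghz, ghz, 3)
```

In this toy setting the product state |000> has fidelity 0.5 with the GHZ target but no entanglement, so the auxiliary term lowers its reward relative to a fidelity-only signal. That is the kind of extra information an entanglement-aware agent could use to avoid settling on such states.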
Appendix: original English text
Title: General-purpose quantum architecture search based on deep reinforcement learning
Author: Xiao-Yu Bi, Yi-Ming Yu, Ye-Hong Chen, Zhi-Rong Zhong
Issue&Volume: 2025/11/04
Abstract: Reinforcement learning (RL) shows promise for automated quantum circuit design but often stalls due to a “fidelity trap”: by optimizing only state fidelity, agents overlook entanglement structure and become stranded in suboptimal, overly complex circuits, resulting in significantly reduced search efficiency. In this paper, we propose a scheme that overcomes this barrier by implementing an entanglement-aware learning framework and enhancing the agent's reward function with a direct, quantitative measure of entanglement. This approach offers a more comprehensive physical description of the state space. We demonstrate the efficacy of this principle on three- and four-qubit state-synthesis tasks within an expanded gate set. For this problem, where the fidelity-driven agent systematically fails to discover the minimal-depth circuit, our entanglement-aware agent consistently succeeds. This transformative result is highly robust against variations in initial random seeds and extends to multiqubit systems even in the presence of noise. Our findings establish a generalizable principle that incorporating entanglement as an auxiliary reward can significantly enhance RL-based solutions for a broad class of fidelity-centric tasks in quantum physics and pave the way for scalable, automated discovery on near-term quantum devices.
DOI: 10.1103/7rc4-p446
Source: https://journals.aps.org/pra/abstract/10.1103/7rc4-p446
