A Reconfigurable Two‐WSe2‐Transistor Synaptic Cell for Reinforcement Learning. Issue 48 (25th February 2022)
- Record Type:
- Journal Article
- Title:
- A Reconfigurable Two‐WSe2‐Transistor Synaptic Cell for Reinforcement Learning. Issue 48 (25th February 2022)
- Main Title:
- A Reconfigurable Two‐WSe2‐Transistor Synaptic Cell for Reinforcement Learning
- Authors:
- Zhou, Yue
Wang, Yasai
Zhuge, Fuwei
Guo, Jianmiao
Ma, Sijie
Wang, Jingli
Tang, Zijian
Li, Yi
Miao, Xiangshui
He, Yuhui
Chai, Yang
- Abstract:
- Abstract: Reward‐modulated spike‐timing‐dependent plasticity (R‐STDP) is a brain‐inspired reinforcement learning (RL) rule, exhibiting potential for decision‐making tasks and artificial general intelligence. However, the hardware implementation of the reward‐modulation process in R‐STDP usually requires complicated Si complementary metal–oxide–semiconductor (CMOS) circuit design that causes high power consumption and large footprint. Here, a design with two synaptic transistors (2T) connected in a parallel structure is experimentally demonstrated. The 2T unit based on WSe2 ferroelectric transistors exhibits reconfigurable polarity behavior, where one channel can be tuned as n‐type and the other as p‐type due to nonvolatile ferroelectric polarization. In this way, opposite synaptic weight update behaviors with multilevel (>6 bit) conductance states, ultralow nonlinearity (0.56/−1.23), and large G_max/G_min ratio of 30 are realized. By applying positive/negative reward to (anti‐)STDP component of 2T cell, R‐STDP learning rules are realized for training the spiking neural network and demonstrated to solve the classical cart–pole problem, exhibiting a way for realizing low‐power (32 pJ per forward process) and highly area‐efficient (100 µm²) hardware chip for reinforcement learning.
Abstract: Hardware implementation for reward‐modulated spike‐timing‐dependent plasticity (R‐STDP) is demonstrated with a unique 2T synaptic cell structure, which realizes the functions of both STDP and anti‐STDP using a simple hardware structure. The total synaptic weight is increased (decreased) by applying feedback signal to gate1 or gate2 when, respectively, a positive or negative reward signal comes.
- Is Part Of:
- Advanced materials. Volume 34:Issue 48(2022)
- Journal:
- Advanced materials
- Issue:
- Volume 34:Issue 48(2022)
- Issue Display:
- Volume 34, Issue 48 (2022)
- Year:
- 2022
- Volume:
- 34
- Issue:
- 48
- Issue Sort Value:
- 2022-0034-0048-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2022-02-25
- Subjects:
- 2D semiconductors -- ferroelectric materials -- reinforcement learning -- reward‐modulated spike‐timing‐dependent plasticity -- synaptic device
Materials -- Periodicals
Chemical vapor deposition -- Periodicals
620.11
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1521-4095
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/adma.202107754
- Languages:
- English
- ISSNs:
- 0935-9648
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0696.897800
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24534.xml