S. No. | Type | Area | Power/Energy | Performance | Remarks |
---|---|---|---|---|---|
GPU | | | | | |
1 | Multi-GPU, Garcia et al. [33] |  |  | 3.71 × speedup | Motion Estimation System |
Non CMOS | | | | | |
2 | Memristive Dynamical System, Bavandpour et al. [49] | 4n memristors and no switch for implementing an n-cell system | | Similar to Cellular Memristive Dynamical System (CMDS) | FitzHugh-Nagumo (FHN), Adaptive Exponential (AdEx) integrate and fire, and Izhikevich neuron models |
3 | Spiking Deep Neural Nets, Indiveri et al. [72] | cxQuad (43.79 mm²), ROLLS (51.4 mm²) | cxQuad (945 μW @ 1.8 V), ROLLS (4 mW @ 1.8 V) | Up to 100 % accuracy on toy problems | Event-based convolutional stage for feature extraction connected to a spike-based learning stage for feature classification |
Accelerators | | | | | |
4 | Memristive Boltzmann Machine, Bojnordi et al. [28] | | 25 × lower energy compared to multicore system; fully utilized accelerator chip consumes 1.3 W | 57 × higher performance compared to multicore system | Hardware Accelerator for Combinatorial Optimization and Deep Learning |
5 | Processor (PuLP), Conti et al. [24] | Overall cluster area is 1.2 mm² | Peak theoretical energy efficiency of 211 GOPS/W, achieved up to 192 GOPS/W | Scaled over a 1 × to 354 × range | Parallel Ultra Low-power Processor for ConvNet-based detector for smart surveillance; 4 OpenRISC cores, 64 kB of L2 memory and 24 kB of TCDM, fabricated in 28 nm STMicroelectronics FD-SOI technology |
6 | Processor (Mobile), Kim et al. [29] | Area overhead of 9 % | Energy savings of 22 % | Average speedups of 126 % and 23 % over CPU and a state-of-the-art MLP accelerator | Neural Network Accelerator for Mobile Application Processors, applied for edge detection |
7 | Memristor Based Crossbar, Liu et al. [26] | 0.943 mm² (M-net) and 1.793 mm² (D-net) | 184.2 × (25.23 ×) energy saving over MLP (AAM) | 178.4 × (27.06 ×) performance speedup over MLP (AAM) | RENO: Reconfigurable Neuromorphic Computing Accelerator benchmarked with Multi-layer perceptron and Auto-associative memory |
8 | Accelerator for machine learning, Liu et al. [27] | 3.51 mm² | 596 mW | 1.20 × faster than NVIDIA K20M GPU | PuDianNao: A Polyvalent Machine Learning Accelerator |
9 | Hardware Co-processor, Shen et al. [22] | 5 × 5 mm² | 0.84 mW/MHz with 1.8 V power supply | 92.7 % classification accuracy | Darwin Neuromorphic co-processor unit for spiking and artificial neural nets |
FPGA | | | | | |
10 | Accelerator for large scale neural networks, Chung et al. [38] | 3.02 mm² | 485 mW | 117.87 × faster, with total energy reduced by 21.08 × | For convolutional and deep neural networks |
Digital CMOS | | | | | |
11 | CMOS Motion Sensor, Chiang et al. [18] | 4 × 4 mm², 86.2 % fill factor | 13.2 mW | 6.8 % for ± X motion, 3.5 % for ± Y motion, and 6 % for ± Z motion | Motion sensor for Z-motion direction/velocity detection |
12 | ASIC Neural Network, Knag et al. [19] | 3.06 mm², 65 nm CMOS ASIC test chip | 6.67 mW for a 140 Mpixel/s throughput at 35 MHz | Memory bit error rate of 0.01 | ASIC for image and video feature extraction |
Analog | | | | | |
13 | Vertical Resistive RAM, Piccolboni et al. [92] | Area gain of 3-10 | | 98 % recognition rate | For Cochlea and CNN applications |
Applications | | | | | |
14 | CMOS Analog VLSI Circuit, Chien et al. [136] | 330 μm × 210 μm | | Theoretically linear relationship between output ISI distribution and input current | Spike-based random sampling |
15 | Memristor Array + CMOS Neuron, Chu et al. [107] | | | 55–100 % recognition rate based on noise level | Digit recognition task |
16 | Neuromorphic Bio-amplifier, Corradi et al. [116] | 0.178 mm² | 90 μW | 96 % classification accuracy | EEG bio-amplifier has a programmable gain of 45–54 dB, with a Root Mean Squared (RMS) input-referred noise level of 2.1 μV |
17 | Arithmetic Units, Kim et al. [148] | 121 μm² | 0.111 mW | 0.098 % error rate | Approximate adders and comparators |
18 | Processor + on-chip learning, Kim et al. [101] | 1.8 mm² | 5.7 pJ/pixel | Classification accuracy of 90 % | 256 neurons, 83K synapses for Spiking LCA with classification for object detection |
19 | Tactile Sensors for Touch, Lee et al. [122] | 37 × 43.5 cm² active sensor area | | 4096-element tactile sensor array that can be sampled at over 5 kHz | Kilohertz Kilotaxel Tactile Sensor Array for Investigating Spatiotemporal Features |
20 | RRAM Multistate Register, Lorenzi et al. [108] | 2.8–5.2 μm² | 6.5 % energy reduction | 40 % improvement over switch-on-event processor | Multistate register for continuous flow multithreading |
21 | CM1K chip, Suri et al. [138] |  | 668 μJ for learning and 487 μJ for recognition, while operating at 25 MHz | 91 % recognition accuracy | Multi-modal authentication (person identification) system based on simultaneous recognition of face and speech data |
22 | Switched Capacitor Circuit, Mayr et al. [110] | 600 μm × 600 μm | 1.9 mW | Short- and long-term plasticity, 8k synapses | Closed-loop interface to in vitro cortical neuron cultures |