Abstract
<jats:p>Vector–matrix multiplication (VMM), implemented through multiply–accumulate (MAC) operations, is the dominant computational primitive in many artificial intelligence (AI) workloads. When executed on conventional von Neumann architectures, VMM operations incur significant energy consumption and latency due to the separation between memory and processing units. To overcome these limitations, crossbar arrays built from Resistive Random Access Memory (RRAM) cells have been proposed to accelerate VMM computations. In this work, we investigate the key optimization trade-offs involved in implementing RRAM-based neural networks for classification applications. A simple two-layer neural network is first defined and trained in software to generate the weight matrices and bias parameters. Three hardware implementation scenarios are then evaluated, according to how negative floating-point weights and biases are mapped: Positive Weights Only (PWO), Positive and Negative Weights Only (PNWO), and Positive and Negative Weights with Biases (PNWB). The implementations are compared at the hardware level in terms of classification accuracy, energy efficiency, latency, and area overhead. The study also accounts for key RRAM non-idealities, including the restricted conductance range and device variability. Hardware results show that the PWO scenario offers the lowest energy consumption (189 fJ/MAC) and the smallest area overhead but yields the lowest accuracy. PNWO and PNWB improve accuracy substantially (+177% and +180%, respectively) at the cost of higher energy consumption (+63% and +87%) and larger area (×2 and ×2.1). Under variability effects, PWO achieves the best accuracy (94.65%), followed by PNWO (93.11%) and PNWB (92.11%).</jats:p>