On-NAS: On-Device Neural Architecture Search on Memory-Constrained Intelligent Embedded Systems
Authors: Bosung Kim, Seulki Lee
ABSTRACT
We introduce On-NAS, a memory-efficient on-device neural architecture search (NAS) solution that enables memory-constrained embedded devices to find the best deep model architecture and train it on the device. Based on cell-based differentiable NAS, it drastically curtails the massive memory requirement of architecture search, one of the major bottlenecks in realizing NAS on embedded devices. On-NAS first pre-trains a basic architecture block, called a meta cell, by combining n cells into a single condensed cell via two-fold meta-learning; the meta cell can flexibly evolve into various architectures while saving device storage space by a factor of n. Then, the offline-learned meta cell is loaded onto the device and unfolded to perform online on-device NAS via 1) expectation-based operation and edge pair search, which enables memory-efficient partial architecture search by reducing the required memory up to k and m/4 times, respectively, given k candidate operations and m nodes in a cell, and 2) step-by-step back-propagation, which reduces the memory usage of the backward pass of the n-cell architecture up to n times. To the best of our knowledge, On-NAS is the first standalone NAS and training solution fully operable on embedded devices with limited memory. Our experimental results show that On-NAS effectively identifies optimal architectures and trains them on the device, on par with GPU-based NAS in both few-shot and full-task learning settings, e.g., achieving even 1.3% higher accuracy on miniImageNet, while reducing run-time memory and storage usage by up to 20x and 4x, respectively.
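As background for the expectation-based operation search mentioned above, the following is a minimal PyTorch sketch of the mixed operation used in cell-based differentiable NAS (DARTS-style), where each edge computes a softmax-weighted expectation over k candidate operations. The class and attribute names (MixedOp, alpha) are illustrative and not taken from the paper's code; the sketch shows only the expectation form on a single edge, not On-NAS's actual memory-saving search procedure.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MixedOp(nn.Module):
        # Softmax-weighted mixture over k candidate operations on one edge.
        # The weighted outputs are accumulated one operation at a time rather
        # than stacked, illustrating how the expectation over candidates can
        # be formed without materializing all k outputs at once.
        def __init__(self, ops):
            super().__init__()
            self.ops = nn.ModuleList(ops)                      # k candidate operations
            self.alpha = nn.Parameter(torch.zeros(len(ops)))   # architecture parameters

        def forward(self, x):
            w = F.softmax(self.alpha, dim=0)                   # operation weights
            out = None
            for wi, op in zip(w, self.ops):
                y = wi * op(x)                                 # weighted candidate output
                out = y if out is None else out + y            # running expectation
            return out

    # Illustrative usage with a small candidate set:
    ops = [nn.Conv2d(16, 16, 3, padding=1),
           nn.MaxPool2d(3, stride=1, padding=1),
           nn.Identity()]
    mixed = MixedOp(ops)
    out = mixed(torch.randn(1, 16, 32, 32))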