μNAS: Constrained Neural Architecture Search for Microcontrollers

October 27, 2020 · Entered Twilight · 🏛 EuroMLSys@EuroSys

🌅 TWILIGHT: Old Age
Predates the code-sharing era: a pioneer of its time

"Last commit was 5.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, Makefile, Pipfile, Pipfile.lock, README.md, architecture.py, cnn, config.py, configs, dataset, dragonfly_adapters, driver.py, generate_tflite_models.py, mlp, model_trainer.py, pruning.py, resource_models, schema_types.py, search_algorithms, search_space.py, search_state_processor.py, slurm_arcus.sh, slurm_job.sh, teachers, test, utils.py

Authors: Edgar Liberis, Łukasz Dudziak, Nicholas D. Lane
arXiv ID: 2010.14246
Category: cs.LG: Machine Learning
Cross-listed: cs.AR
Citations: 123
Venue: EuroMLSys@EuroSys
Repository: https://github.com/eliberis/uNAS ⭐ 82
Last Checked: 1 month ago
Abstract
IoT devices are powered by microcontroller units (MCUs), which are extremely resource-scarce: a typical MCU may have an underpowered processor and around 64 KB of memory and persistent storage, orders of magnitude less than deep learning typically requires. Designing neural networks for such a platform requires an intricate balance between maintaining high predictive performance (accuracy) and achieving low memory usage, storage usage, and inference latency. This is extremely challenging to achieve manually, so in this work we build a neural architecture search (NAS) system, called μNAS, to automate the design of such small-yet-powerful MCU-level networks. μNAS explicitly targets the three primary aspects of MCU resource scarcity: RAM size, persistent storage, and processor speed. μNAS represents a significant advance in resource-efficient models, especially for "mid-tier" MCUs with memory requirements ranging from 0.5 KB to 64 KB. We show that on a variety of image classification datasets μNAS is able to (a) improve top-1 classification accuracy by up to 4.8%, or (b) reduce memory footprint by 4--13x, or (c) reduce the number of multiply-accumulate operations by at least 2x, compared to the existing MCU-specialist literature and resource-efficient models.
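The constrained search the abstract describes can be pictured as a feasibility filter: candidate architectures are only considered if an analytical cost model says they fit the MCU's RAM, storage, and compute budgets. Below is a minimal illustrative sketch of that idea using random search over small conv-net chains. It is not the paper's implementation: the `ConvSpec` search space, the cost model, and all budget numbers are hypothetical simplifications (stride-1 "same"-padded convolutions, biases ignored, one input and one output buffer live at a time).

```python
from dataclasses import dataclass
import random

@dataclass(frozen=True)
class ConvSpec:
    out_channels: int
    kernel: int  # square kernel, stride 1, 'same' padding

def resource_usage(layers, in_shape=(32, 32, 3)):
    """Estimate the three MCU constraints for a chain of conv layers:
    weight count (persistent storage), peak activation memory (RAM),
    and multiply-accumulate operations (processor time)."""
    h, w, c = in_shape
    storage = 0            # total weight parameters
    macs = 0               # total multiply-accumulates
    peak_ram = h * w * c   # input buffer must fit in RAM
    for spec in layers:
        weights = spec.kernel * spec.kernel * c * spec.out_channels
        storage += weights
        macs += weights * h * w  # one weight reuse per output pixel
        # during a layer, the input and output buffers coexist in RAM
        peak_ram = max(peak_ram, h * w * c + h * w * spec.out_channels)
        c = spec.out_channels
    return storage, peak_ram, macs

def random_search(budget_storage, budget_ram, budget_macs,
                  trials=200, seed=0):
    """Sample random architectures and keep only those that fit
    all three budgets simultaneously."""
    rng = random.Random(seed)
    feasible = []
    for _ in range(trials):
        depth = rng.randint(1, 4)
        layers = tuple(
            ConvSpec(rng.choice([4, 8, 16]), rng.choice([1, 3]))
            for _ in range(depth)
        )
        s, r, m = resource_usage(layers)
        if s <= budget_storage and r <= budget_ram and m <= budget_macs:
            feasible.append((layers, s, r, m))
    return feasible
```

A real system such as the one the abstract describes would pair a far more accurate resource model with a smarter search strategy (the repo lists `dragonfly_adapters` and `search_algorithms`, suggesting Bayesian optimisation rather than random sampling), but the constraint-filtering structure is the same.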