Machine Learning isn't just for supercomputers. TinyML allows us to run intelligence on the billions of microcontrollers that power our everyday objects.
1. Living with 256KB of RAM
Deploying AI to a Microcontroller (MCU) is a battle against physics. Unlike a server with gigabytes of RAM, an MCU might only have 256KB of SRAM and 1MB of Flash. We cannot use standard TensorFlow or even standard TFLite. Instead, we use TFLite Micro (TFLM), a specialized runtime built around Zero Dynamic Allocation. This means it never calls malloc() or new, so memory use is fixed up front and critical embedded systems are not exposed to unpredictable out-of-memory crashes at runtime.
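To make the idea of Zero Dynamic Allocation concrete, here is a minimal sketch of a bump allocator that hands out memory from a fixed buffer. It is an illustration only, not TFLite Micro's actual allocator; the names StaticArena, g_buffer and g_arena and the 1024-byte size are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative only: a fixed-size arena that hands out memory by bumping an
// offset. This is NOT TFLite Micro's real allocator, just the same idea.
class StaticArena {
 public:
  StaticArena(uint8_t* buffer, size_t size) : buffer_(buffer), size_(size) {}

  // Returns `bytes` bytes at an offset aligned to `alignment`, or nullptr if
  // the arena is exhausted. Offsets are aligned relative to the buffer start,
  // so the buffer itself must be aligned at least as strictly.
  void* Allocate(size_t bytes, size_t alignment = 16) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const size_t aligned = (used_ + alignment - 1) & ~(alignment - 1);
    if (aligned > size_ || bytes > size_ - aligned) return nullptr;
    used_ = aligned + bytes;
    return buffer_ + aligned;
  }

  size_t used() const { return used_; }

 private:
  uint8_t* buffer_;
  size_t size_;
  size_t used_ = 0;
};

// The buffer is reserved at link time, so its size is known before boot.
// 1024 bytes is a hypothetical size chosen for the example.
alignas(16) static uint8_t g_buffer[1024];
static StaticArena g_arena(g_buffer, sizeof(g_buffer));
```

Calling g_arena.Allocate(100) returns the start of g_buffer, and a request that does not fit returns nullptr instead of touching the heap. Because nothing is ever freed individually, there is no fragmentation to reason about.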
# TinyML: AI on the Metal
# Deploying Models to 256KB RAM Devices
2. The Tensor Arena
Because TFLite Micro doesn't use dynamic memory, the developer must provide a Tensor Arena: a static byte array in SRAM. During the AllocateTensors() phase, the interpreter maps every tensor in the model's graph into this buffer. If the arena is too small, AllocateTensors() fails and the model never runs. If it's too large, you won't have room for your sensor data or Wi-Fi stack. Balancing the arena size is the core challenge of TinyML engineering.
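The balancing act can be written down as a compile-time budget. The sketch below is one possible way to do it: only the 256KB SRAM figure comes from the text, while every other size (sensor buffer, network stack, stack and globals, and the candidate arena size) is a hypothetical number picked for illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t kSramBytes = 256 * 1024;  // SRAM size from the text

// Hypothetical reservations for everything that is not the tensor arena.
constexpr size_t kSensorBufferBytes = 32 * 1024;
constexpr size_t kNetworkStackBytes = 64 * 1024;
constexpr size_t kStackAndGlobalsBytes = 24 * 1024;
constexpr size_t kOtherBytes =
    kSensorBufferBytes + kNetworkStackBytes + kStackAndGlobalsBytes;

// True if an arena of this size still fits next to the other reservations.
constexpr bool FitsInSram(size_t arena_bytes) {
  return arena_bytes <= kSramBytes - kOtherBytes;
}

// Hypothetical arena size; in practice it is tuned per model.
constexpr size_t kTensorArenaBytes = 96 * 1024;
static_assert(FitsInSram(kTensorArenaBytes), "SRAM budget exceeded");

// Bytes left over for a given arena size. Only valid when the budget fits.
inline size_t HeadroomBytes(size_t arena_bytes) {
  assert(FitsInSram(arena_bytes));
  return kSramBytes - kOtherBytes - arena_bytes;
}
```

With these assumed numbers, a 96KB arena leaves 40KB of headroom, while a 200KB arena would fail the static_assert and stop the build before the firmware is ever flashed.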
// Arduino Memory Map
Flash Memory: 1MB (Model Storage)
SRAM: 256KB (Tensor Arena)
// We cannot train on the edge.
// Models must be quantized to INT8.
3. Flash vs. RAM
To save precious RAM, we store the model's weights in read-only Flash memory using the PROGMEM attribute. The weights are read directly from Flash during inference. SRAM is reserved for the intermediate 'Activations': the temporary mathematical results that change as data flows through the layers. This split architecture allows us to run 1MB models on devices with only 100KB of working memory.
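The split between read-only weights and writable activations can be sketched with a tiny INT8 dense layer. This is an illustration under stated assumptions, not TFLite Micro code: the weight values, the layer shape and the names kWeights, g_activations and DenseInt8 are made up for the example. It relies on plain const rather than PROGMEM, on the assumption that the toolchain places const data in a read-only section, which typical MCU linker scripts map to Flash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kInputs = 4;   // hypothetical layer shape
constexpr size_t kOutputs = 2;

// Weights never change, so they are declared const. On a typical MCU build
// this lets the linker keep them in Flash instead of copying them to SRAM.
alignas(16) const int8_t kWeights[kOutputs][kInputs] = {
    {1, -2, 3, -4},
    {5, 6, -7, 8},
};

// Activations are rewritten on every inference, so they need writable SRAM.
static int32_t g_activations[kOutputs];

// Multiply-accumulate each weight row against the input. Weights are read
// in place; only the small activation buffer is written.
void DenseInt8(const int8_t* input) {
  for (size_t o = 0; o < kOutputs; ++o) {
    int32_t acc = 0;
    for (size_t i = 0; i < kInputs; ++i) {
      acc += static_cast<int32_t>(kWeights[o][i]) *
             static_cast<int32_t>(input[i]);
    }
    g_activations[o] = acc;
  }
}
```

Feeding the input {1, 2, 3, 4} writes -10 and 28 into g_activations. The working memory needed is the size of the activation buffer, no matter how large the weight table in Flash grows.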
Booting Arduino...
Initializing sensors...
[INFO] SRAM Available: 240 KB
[INFO] Ready for TFLite Micro.