Traffic Audio Monitoring Using Neural Networks On Microcontrollers

Bachelor's thesis about vehicle classification and counting.

As part of my bachelor’s degree project, I collaborated with a fellow student to write a paper titled “Traffic Audio Monitoring Using Neural Networks on Microcontrollers: A Study on Vehicle Classification and Counting.”

The project combined theoretical research with practical implementation. Our goal was to evaluate the feasibility of using low-cost microcontrollers (Raspberry Pi Pico 2W) for real-time audio-based vehicle classification and counting using Convolutional Neural Networks (CNNs). In essence, the work served as a benchmark study to assess how far cost-efficient embedded hardware can go in performing modern machine learning tasks.

Approach

Throughout the project, we followed a systematic process — from data preparation and model design to deployment and evaluation. Key milestones included:

Training a lightweight CNN on MFCC features extracted from vehicle audio recordings.
Running the model on the microcontroller using a minimal version of TensorFlow Lite Micro.
Developing a custom tool to stream live audio to the MCU via USB, enabling real-time testing and accuracy evaluation.
Measuring computational limits, such as the number of Multiply–Accumulate operations (MACs) the MCU could handle.
Implementing a soft plurality voting mechanism to compensate for limited computational power and short audio frames.
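
To illustrate the last milestone, here is a minimal sketch of soft plurality voting. It is an assumption about the general technique, not the code from the thesis: each short audio frame yields a vector of class probabilities, the vectors are summed over a window of frames, and the class with the largest total wins. The function names, the window of three frames, and the probability values are all made up for illustration. A hard vote (one vote per frame) is included for comparison.

```python
from typing import Sequence

# Class labels taken from the post; the order is an assumption.
CLASSES = ("car", "motorcycle", "commercial vehicle", "background noise")


def soft_plurality_vote(frame_probs: Sequence[Sequence[float]]) -> int:
    """Sum per-frame class probabilities and return the winning class index."""
    if not frame_probs:
        raise ValueError("need at least one frame")
    num_classes = len(frame_probs[0])
    totals = [0.0] * num_classes
    for probs in frame_probs:
        if len(probs) != num_classes:
            raise ValueError("all frames must have the same number of classes")
        for i, p in enumerate(probs):
            totals[i] += p
    return max(range(num_classes), key=totals.__getitem__)


def hard_plurality_vote(frame_probs: Sequence[Sequence[float]]) -> int:
    """Give each frame one vote for its top class and return the most voted class."""
    if not frame_probs:
        raise ValueError("need at least one frame")
    num_classes = len(frame_probs[0])
    counts = [0] * num_classes
    for probs in frame_probs:
        counts[max(range(num_classes), key=probs.__getitem__)] += 1
    return max(range(num_classes), key=counts.__getitem__)


# Hypothetical model outputs for three consecutive frames of one vehicle.
example_frames = [
    [0.48, 0.52, 0.00, 0.00],  # barely "motorcycle"
    [0.45, 0.55, 0.00, 0.00],  # barely "motorcycle"
    [0.95, 0.05, 0.00, 0.00],  # confidently "car"
]

print("soft vote:", CLASSES[soft_plurality_vote(example_frames)])  # car
print("hard vote:", CLASSES[hard_plurality_vote(example_frames)])  # motorcycle
```

In this made-up example the two uncertain frames outvote the confident one under hard voting, while soft voting lets the confident frame carry more weight. That is the usual motivation for soft voting when individual frames are short and noisy.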

Results

Our final system achieved strong performance:

Classification accuracy: 90.4%
Counting accuracy: 99.8%
Model complexity: ~4 million MACs per inference, running on a €9 Raspberry Pi Pico 2W
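
For readers unfamiliar with MAC counts, the sketch below shows how such a figure is typically estimated for a CNN. The layer shapes are hypothetical and are not the architecture from the thesis. The formulas are the standard ones: a convolution layer costs one MAC per output element, per kernel position, per input channel, and a dense layer costs one MAC per weight. Bias additions, activations, and pooling are ignored.

```python
def conv2d_macs(out_h: int, out_w: int, out_channels: int,
                kernel_h: int, kernel_w: int, in_channels: int) -> int:
    """MACs for a standard 2D convolution, given its output shape."""
    return out_h * out_w * out_channels * kernel_h * kernel_w * in_channels


def dense_macs(in_features: int, out_features: int) -> int:
    """MACs for a fully connected layer: one per weight."""
    return in_features * out_features


# Hypothetical toy network, chosen only to make the arithmetic easy to follow.
hypothetical_layers = [
    ("conv1 (3x3, 1 -> 8 channels, 32x32 output)", conv2d_macs(32, 32, 8, 3, 3, 1)),
    ("conv2 (3x3, 8 -> 16 channels, 16x16 output)", conv2d_macs(16, 16, 16, 3, 3, 8)),
    ("dense (4096 -> 4 classes)", dense_macs(16 * 16 * 16, 4)),
]

total_macs = sum(macs for _, macs in hypothetical_layers)

for name, macs in hypothetical_layers:
    print(f"{name}: {macs:,} MACs")
print(f"total: {total_macs:,} MACs per inference")
```

This toy network comes to roughly 0.4 million MACs, about a tenth of the ~4 million reported above, which gives a sense of scale for what the microcontroller was handling.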

The system successfully classified four categories — car, motorcycle, commercial vehicle, and background noise — demonstrating that even inexpensive hardware can perform advanced machine learning tasks when carefully optimized.

Reflections

While our initial goal was to perform a benchmark, the project grew into much more. We gained deep insights into embedded AI, model compression, and audio signal processing — learning more in one semester than in most courses combined.

We hope our work serves as a foundation for developers and researchers who want to use microcontrollers for on-device classification or to improve lightweight neural network architectures.

If you’re curious, I recommend reading the abstract of the thesis — it gives a great overview of what we achieved and why it matters.

William Fridh

Develop, Sleep, Repeat