Koala | AI-powered Vending Machine

Some industries have changed very little over the decades. A prime example is the traditional vending machine, often found at train stations and other busy locations. My first impression of these machines? They’re usually dirty and filled with unhealthy products. But does the user experience really have to be that bad?

Let’s change that! We built a smart vending machine using a Raspberry Pi and machine learning, and in this post I’ll show you the result. Two of us completed the project in just four days. 😮

The rise of smart vending machines

Even with increasing digitalization, most vending machines remain relics of the last century. The limitations aren’t just in payment options. The design and mechanics are also outdated.

We’ve all likely experienced the frustration of a product getting stuck in a machine, or the unpleasant task of reaching a hand into that nasty, sticky slot.

While Amazon is pioneering with its Amazon Go Stores, countries like Japan are already seeing the rise of smart vending machines that promise a simple and pleasant user experience. Shopping should feel as easy as taking products out of your own fridge.

Building the first prototype

Our goal was to develop a prototype for this user experience, focusing on two key aspects:

Aesthetic design
Great user experience

For a rough overview, we sketched this simplified activity diagram.

After discussing the application flow, we outlined the following work packages:

Constructing a physical vending machine with a controllable locking mechanism
Assembling the hardware and electronics
Training an artificial intelligence model for product recognition
Developing the backend system to control the electronics and the AI application
Developing a web app that communicates with the vending machine

Construction using the IKEA cube

So we went to IKEA and bought one of the well-known EKET cubes. With its matching glass door and hinges, the cube served as the basic structure for the vending machine.

We implemented the locking mechanism using a simple electromagnetic lock. To be able to query the current door status (open or closed), we positioned a standard push button so that it touches the hinge of the glass door. While the door is closed, the push button stays pressed. When the magnetic lock releases and the door opens, the hinge moves away and the push button is no longer pressed. The Raspberry Pi reads the push button signal to determine the door state.
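The door logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: `DoorMonitor` and `read_pressed` are hypothetical names, and the button is injected as a plain callable so that the sketch also runs without any hardware attached. On a Raspberry Pi, the callable could wrap a GPIO input such as gpiozero's `Button.is_pressed` (the pin number in the comment is an assumption).

```python
from typing import Callable, Optional


class DoorMonitor:
    """Tracks the door state from a push button that is pressed while the door is closed."""

    def __init__(self, read_pressed: Callable[[], bool]) -> None:
        # read_pressed is a hypothetical callable that returns True while the button is pressed.
        # On a Pi this could be, for example:
        #   button = gpiozero.Button(17)   # pin 17 is an assumption
        #   monitor = DoorMonitor(lambda: button.is_pressed)
        self._read_pressed = read_pressed
        self._was_closed = read_pressed()

    def poll(self) -> Optional[str]:
        """Return "opened" or "closed" if the state changed since the last poll, else None."""
        is_closed = self._read_pressed()
        if is_closed == self._was_closed:
            return None
        self._was_closed = is_closed
        return "closed" if is_closed else "opened"


# Simulated button readings: closed, closed, open, open, closed.
samples = iter([True, True, False, False, True])
monitor = DoorMonitor(lambda: next(samples))
events = [monitor.poll() for _ in range(4)]
print(events)  # [None, 'opened', None, 'closed']
```

The "closed" event is the natural trigger for taking a picture of the shelf and running the product recognition.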

We used a Raspberry Pi 3 to control the electronics, which also has enough power to run the machine learning model. The Pi’s internet connection allows it to communicate with the web app.

To detect product removal, we installed a wide-angle camera on the ceiling of the cube. Proper lighting is crucial to avoid shadows and other artifacts that could interfere with detection.

Training of the machine learning model

We used Google AutoML Vision for product recognition, training a machine learning model to classify images. The model was trained in the Google Cloud using a total of 150 images showing different selections, combinations, quantities, rotations, and positions of products in the vending machine.

We trained the model on the following six products. Because most images show several products at once, each product is labeled in about 77 of the 150 images:

Product	Label Count
Bebe Lotion	78
Dornfelder Wine	75
Feelissimo Condoms	78
Seitenbacher Cereal Bar	75
Koala Biscuits	76
Sagrotan Disinfectant Gel	77

The trained model runs directly on the Raspberry Pi using TensorFlow Lite. After training in the Google Cloud, the model is exported in the TensorFlow Lite format, which is optimized for making predictions on embedded devices with limited computing resources. That makes it a good fit for running AI applications on the Pi.
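To give an idea of what on-device inference can look like, here is a hedged sketch using the `tflite_runtime` interpreter. The file name `model.tflite`, the label order in `LABELS`, and the 0.5 threshold are assumptions for illustration; a real export ships its own label file, and the input image must already match the model's expected size and data type. The helper functions `dequantize` and `products_in_view` are hypothetical names and run without TensorFlow installed.

```python
from typing import List, Sequence

# Assumed label order; a real exported model comes with its own label file.
LABELS = [
    "Bebe Lotion", "Dornfelder Wine", "Feelissimo Condoms",
    "Seitenbacher Cereal Bar", "Koala Biscuits", "Sagrotan Disinfectant Gel",
]


def dequantize(raw: Sequence[int], scale: float, zero_point: int) -> List[float]:
    """Map quantized integer outputs back to floating-point scores."""
    return [scale * (value - zero_point) for value in raw]


def products_in_view(scores: Sequence[float], labels: Sequence[str],
                     threshold: float = 0.5) -> List[str]:
    """Return every label whose score reaches the threshold."""
    return [label for label, score in zip(labels, scores) if score >= threshold]


def classify(image, model_path: str = "model.tflite") -> List[str]:
    """Run one inference; needs the tflite_runtime package and a real model file."""
    import numpy as np
    from tflite_runtime.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_info = interpreter.get_input_details()[0]
    output_info = interpreter.get_output_details()[0]

    # Add the batch dimension; the image must already match the input shape and dtype.
    interpreter.set_tensor(input_info["index"], np.expand_dims(image, axis=0))
    interpreter.invoke()
    raw = interpreter.get_tensor(output_info["index"])[0].tolist()

    # Quantized models report a non-zero scale; float models report (0.0, 0).
    scale, zero_point = output_info["quantization"]
    scores = dequantize(raw, scale, zero_point) if scale else raw
    return products_in_view(scores, LABELS)
```

The imports inside `classify` are deliberate: the post-processing helpers can then be tested on a laptop, while the interpreter is only needed on the Pi.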

Development of the web app

To open the vending machine, users enter a three-digit code via a web app located on the vending machine. Once the machine is open, users simply take out their products and close the door. The vending machine automatically recognizes the products in real time, adds them to the shopping cart in the web app, and processes the payment.
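The article does not spell out how the shopping cart is derived from the recognition results. One plausible approach, sketched here under that assumption, is to compare the products recognized before the door opened with those recognized after it closed. The function names and the prices (in euro cents) are made up for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical prices in euro cents; not taken from the project.
PRICES: Dict[str, int] = {
    "Koala Biscuits": 199,
    "Seitenbacher Cereal Bar": 129,
    "Dornfelder Wine": 599,
}


def removed_products(before: Iterable[str], after: Iterable[str]) -> List[str]:
    """Products seen before the door opened but missing after it closed."""
    return sorted(set(before) - set(after))


def cart_total(items: Iterable[str], prices: Dict[str, int]) -> int:
    """Sum the prices of all items in the cart, in cents."""
    return sum(prices[item] for item in items)


before = ["Koala Biscuits", "Seitenbacher Cereal Bar", "Dornfelder Wine"]
after = ["Dornfelder Wine"]
cart = removed_products(before, after)
print(cart)                      # ['Koala Biscuits', 'Seitenbacher Cereal Bar']
print(cart_total(cart, PRICES))  # 328
```

Note that a set difference only captures whether a product is present, not how many units were taken; tracking quantities would need a model that also counts items.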

Results

The calculated precision and recall of the trained model are very promising. Precision, the share of the model's positive predictions that are actually correct, is 97.62%, meaning few false positives. Recall, the share of actual positives that the model finds, is 93.18%, meaning few false negatives.
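For readers who want to see how these two metrics are calculated, here is a short sketch. The counts of true positives, false positives, and false negatives are hypothetical: the article does not report them, and they were chosen only because they reproduce the reported percentages.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of positive predictions that are correct."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Share of actual positives that the model found."""
    return true_pos / (true_pos + false_neg)


# Hypothetical counts, chosen to match the reported figures.
tp, fp, fn = 41, 1, 3
print(f"{precision(tp, fp):.2%}")  # 97.62%
print(f"{recall(tp, fn):.2%}")     # 93.18%
```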

Overall, the trained model appears to make reliable predictions. However, early real-world tests revealed some weaknesses despite the excellent metrics. Occasionally the model recognized products incorrectly, but we corrected these errors using heuristics that flatten the recognized products in the shopping cart, which significantly reduced incorrect recognition.
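The article does not describe how the shopping cart heuristic works. As one possible illustration of such a correction, and explicitly not the project's actual method, the following sketch runs the recognition on several camera frames and keeps only the products that show up in more than half of them. The function name and the `min_share` parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def stable_products(frames: Sequence[Iterable[str]], min_share: float = 0.5) -> List[str]:
    """Keep only products recognized in more than min_share of the frames."""
    if not frames:
        return []
    # Count each product at most once per frame.
    counts = Counter(label for frame in frames for label in set(frame))
    return sorted(label for label, n in counts.items() if n / len(frames) > min_share)


frames = [
    ["Koala Biscuits", "Bebe Lotion"],      # spurious "Bebe Lotion"
    ["Koala Biscuits"],
    ["Koala Biscuits", "Dornfelder Wine"],  # spurious "Dornfelder Wine"
]
print(stable_products(frames))  # ['Koala Biscuits']
```

A vote like this trades a little latency for robustness: a single misrecognized frame no longer ends up in the cart.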

In our first test with 15 predictions, only one prediction contained false positives, which our shopping cart heuristic corrected automatically. The recognition rate for this test was 93.33% (14 of 15 predictions correct).

These results were obtained with products that were carefully arranged in the vending machine. If products are stacked or placed too close together, detection fails.

What happens next?

The AutoML Vision model used was sufficient for proof of concept, but there’s still room for improvement. A detection rate of up to 98% seems realistic with a customized model tailored specifically for this use case.

A two-stage model that first detects the individual objects and then classifies each one could significantly improve recognition rates. Additionally, training a separate model for each product could allow new products to be added without retraining the entire model, which would address scaling issues.

To solve the problem of overlapping products, multiple cameras could be installed from different angles. Recognition could also be enhanced by training the model with a greater variety of images, including those with more chaotic inventories and additional artifacts.