[//]: # (title: Detect people with a RaspberryPi, a thermal camera, Platypush and Tensorflow)
[//]: # (description: Use cheap components and open-source software to build a robust presence detector.)
[//]: # (image: /img/people-detect-1.png)
[//]: # (published: 2019-09-27)

Triggering events based on the presence of people has been the dream of many geeks and DIY automation junkies for a
while. Having your house turn the lights on or off when you enter or exit your living room is an interesting
application, for instance. Most of the solutions out there for these kinds of problems, even higher-end solutions
like the [Philips Hue sensors](https://www2.meethue.com/en-us/p/hue-motion-sensor/046677473389), detect
motion, not actual people presence — which means that the lights will switch off once you lie on your couch like a
sloth. The ability to turn off the music and/or the TV when you exit the room and head to your bedroom, without the
hassle of switching all the buttons off, is also an interesting corollary. Detecting the presence of people in your
room while you’re not at home is another interesting application.

Thermal cameras coupled with deep neural networks are a much more robust strategy to actually detect the presence of
people. Unlike motion sensors, they will detect the presence of people even when they aren’t moving. And unlike optical
cameras, they detect bodies by measuring the heat that they emit in the form of infrared radiation, and are therefore
much more robust — their sensitivity doesn’t depend on lighting conditions, on the position of the target, or on its
colour. Before exploring the thermal camera solution, I tried for a while to build a model that instead relied on
optical images from a traditional webcam. The differences are staggering: I trained the optical model on more than ten
thousand 640x480 images taken over the course of a week in different lighting conditions, while I trained the thermal
camera model on a dataset of 900 24x32 images taken during a single day. Even with more complex network architectures,
the optical model wouldn’t score above 91% accuracy in detecting the presence of people, while the thermal model
achieved around 99% accuracy within a single training phase of a simpler neural network. Despite the high potential,
there’s not much on the market yet — there’s been some research work on the topic (if you google “people detection
thermal camera” you’ll mostly find research papers) and a few high-end, expensive products for professional
surveillance. In the absence of ready-to-go solutions for my house, I decided to take it upon myself to build my own
solution — making sure that it can easily be replicated by anyone.

## Prepare the hardware

For this example we'll use the following hardware:

- A RaspberryPi (cost: around $35). In theory any model should work, but it’s probably not a good idea to use a
single-core RaspberryPi Zero for machine learning tasks — the task itself is not very expensive (we’ll only use the
Raspberry for doing predictions on a trained model, not to train the model), but it may still suffer some latency on a
Zero. Plus, it may be really painful to install some of the required libraries (like Tensorflow or OpenCV) on
the `armv6` architecture used by the RaspberryPi Zero. Any better performing model (from the RPi3 onwards) should
definitely do the job.

- A thermal camera. For this project, I’ve used the
[MLX90640](https://shop.pimoroni.com/products/mlx90640-thermal-camera-breakout) Pimoroni breakout camera (cost: $55),
as it’s relatively cheap, easy to install, and it provides good results. This camera comes in standard (55°) and
wide-angle (110°) versions. I’ve used the wide-angle model as the camera monitors a large living room, but take into
account that both have the same resolution (32x24 pixels), so the wider angle comes at the cost of a lower spatial
resolution. If you want to use a different thermal camera there’s not much you’ll need to change, as long as it comes
with a software interface for the RaspberryPi and
it’s [compatible with Platypush](https://platypush.readthedocs.io/en/latest/platypush/plugins/camera.ir.mlx90640.html).

Setting up the MLX90640 on your RaspberryPi is as easy as pie if you have a Breakout Garden: fit the Breakout Garden on
top of your RaspberryPi, fit the camera breakout into an I2C slot, boot the RaspberryPi, and you’re done. Otherwise, you
can also connect the device directly to the [RaspberryPi I2C interface](https://radiostud.io/howto-i2c-communication-rpi/),
either using the right hardware PINs or the software emulation layer.

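Either way, once the camera is wired up you can quickly check that it responds on the I2C bus. The following is a
minimal sketch (not part of the original setup) that assumes the sensor sits at its default `0x33` address on bus 1 and
that the `smbus2` Python package is installed:

```python
from smbus2 import SMBus

MLX90640_ADDRESS = 0x33  # default I2C address of the MLX90640

# Bus 1 is the I2C bus exposed on the RaspberryPi GPIO header
with SMBus(1) as bus:
    try:
        # A single read is enough to verify that the device ACKs its address
        bus.read_byte(MLX90640_ADDRESS)
        print('MLX90640 detected on the I2C bus')
    except OSError:
        print('No device found at 0x33 - check the wiring')
```

If nothing answers, make sure the I2C interface is enabled (see the next section) and double-check the wiring.
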
## Prepare the software

I tested my code on Raspbian, but with a few minor modifications it should be easily adaptable to any distribution
installed on the RaspberryPi.

The software support for the thermal camera requires a bit of work. The MLX90640 doesn’t (yet) come with a ready-to-use
Python interface, but an [open-source C++ driver is provided](https://github.com/pimoroni/mlx90640-library) -
and that's the driver wrapped by the Platypush integration. Instructions to install it:

```shell
# Install the dependencies
[sudo] apt-get install libi2c-dev

# Enable the I2C interface
echo dtparam=i2c_arm=on | sudo tee -a /boot/config.txt

# It's advised to configure the I2C bus baud rate to
# 400kHz to support the higher throughput of the sensor
echo dtparam=i2c1_baudrate=400000 | sudo tee -a /boot/config.txt

# A reboot is required here if you didn't have the
# options above enabled in your /boot/config.txt
[sudo] reboot

# Clone the driver's codebase
git clone https://github.com/pimoroni/mlx90640-library
cd mlx90640-library

# Compile the rawrgb example
make clean
make bcm2835
make I2C_MODE=LINUX examples/rawrgb
```

If it all went well you should see an executable named `rawrgb` under the `examples` directory. If you run it you should
see a bunch of binary data — that’s the raw binary representation of the frames captured by the camera. Remember where
it is located, or move it to a custom bin folder, as it’s the executable that Platypush will use to interact with the
camera module.

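If you are curious about what that binary stream contains, the following sketch grabs a single frame and saves it as an
image. It assumes (as the Platypush integration does) that `rawrgb` writes raw 24-bit RGB frames of 32x24 pixels to its
standard output; adjust the executable path and the frame size if your setup differs:

```python
import subprocess

from PIL import Image

WIDTH, HEIGHT = 32, 24           # MLX90640 resolution
FRAME_SIZE = WIDTH * HEIGHT * 3  # assuming 3 bytes (RGB) per pixel

# Path of the executable compiled in the previous step - adjust it to your setup
rawrgb = subprocess.Popen(['/path/to/mlx90640-library/examples/rawrgb'],
                          stdout=subprocess.PIPE)

try:
    # Read exactly one frame worth of bytes from the process' standard output
    frame = b''
    while len(frame) < FRAME_SIZE:
        chunk = rawrgb.stdout.read(FRAME_SIZE - len(frame))
        if not chunk:
            break
        frame += chunk
finally:
    rawrgb.terminate()

# Interpret the buffer as a 32x24 RGB image and save it to disk
Image.frombytes('RGB', (WIDTH, HEIGHT), frame).save('frame.png')
```
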
This post assumes that you have already installed and configured Platypush on your system. If not, head to my post on
[getting started with Platypush](https://blog.platypush.tech/article/Ultimate-self-hosted-automation-with-Platypush),
the [readthedocs page](https://platypush.readthedocs.io/en/latest/), the
[Gitlab page](https://git.platypush.tech/platypush/platypush) or
[the wiki](https://git.platypush.tech/platypush/platypush/-/wikis/home).

Also install the Python dependencies for the HTTP server, the MLX90640 plugin and Tensorflow:

```shell
[sudo] pip install 'platypush[http,tensorflow,mlx90640]'
```

Moving on to your computer (we'll be using it to build the model that will then run on the RaspberryPi), install
OpenCV, Tensorflow, Jupyter and my utilities for handling images:

```shell
# For image manipulation
[sudo] pip install opencv-python

# Install Jupyter notebook to run the training code
[sudo] pip install jupyterlab
# Then follow the instructions at https://jupyter.org/install

# Tensorflow framework for machine learning and utilities
[sudo] pip install tensorflow numpy matplotlib

# Clone my repository with the image and training utilities
# and the Jupyter notebooks that we'll use for training.
git clone https://github.com/BlackLight/imgdetect-utils
```

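Once the installation is done, a quick way to verify that the libraries are available in your environment:

```python
# Quick sanity check for the Python dependencies installed above
import cv2
import numpy as np
import tensorflow as tf

print('OpenCV:', cv2.__version__)
print('numpy:', np.__version__)
print('Tensorflow:', tf.__version__)
```
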
## Capturing phase

Now that you’ve got all the hardware and software in place, it’s time to start capturing frames with your camera and use
them to train your model. First, configure
the [MLX90640 plugin](https://platypush.readthedocs.io/en/latest/platypush/plugins/camera.ir.mlx90640.html) in your
Platypush configuration file (by default, `~/.config/platypush/config.yaml`):

```yaml
# Enable the webserver
backend.http:
  enabled: True

camera.ir.mlx90640:
  fps: 16      # Frames per second
  rotate: 270  # Can be 0, 90, 180, 270
  rawrgb_path: /path/to/your/rawrgb
```

Restart the service and, if you haven't already, create a user from the web interface at `http://your-rpi:8008`. You
should now be able to take pictures through the API:

```shell
curl -XPOST -H 'Content-Type: application/json' -d '
{
  "type": "request",
  "action": "camera.ir.mlx90640.capture",
  "args": {
    "output_file": "~/snap.png",
    "scale_factor": 20
  }
}' -u 'username:password' http://localhost:8008/execute
```

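The same request can also be scripted, for example from Python through the `requests` library; the host, credentials
and output path below are placeholders to adapt to your setup:

```python
import requests

response = requests.post(
    'http://localhost:8008/execute',
    auth=('username', 'password'),  # the user created on the web interface
    json={
        'type': 'request',
        'action': 'camera.ir.mlx90640.capture',
        'args': {
            'output_file': '~/snap.png',
            'scale_factor': 20,
        },
    },
)

print(response.json())
```
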
If everything went well, the thermal picture should be stored under `~/snap.png`. In my case it looks like this while
I’m standing in front of the sensor:

![Thermal camera snapshot](../img/people-detect-1.png)

Notice the glow at the bottom-right corner — that’s actually the heat from my RaspberryPi 4 CPU. It’s there in all the
images I take, and you’ll probably see similar results if you mount your camera on top of the Raspberry itself, but
it shouldn’t be an issue for your model training purposes.

If you open the web panel (`http://your-host:8008`) you’ll also notice a new tab, represented by the sun icon, that you
can use to monitor your camera from a web interface.

![Thermal camera web panel screenshot](../img/people-detect-2.png)

You can also monitor the camera directly outside of the web panel by pointing your browser to
`http://your-host:8008/camera/ir/mlx90640/stream?rotate=270&scale_factor=20`.

Now add a cronjob to your `config.yaml` to take snapshots every minute:

```yaml
cron.ThermalCameraSnapshotCron:
  cron_expression: '* * * * *'
  actions:
    - action: camera.ir.mlx90640.capture
      args:
        output_file: "${__import__('datetime').datetime.now().strftime('/your/img/folder/%Y-%m-%d_%H-%M-%S.jpg')}"
        grayscale: true
```

Or directly as a Python script under e.g. `~/.config/platypush/thermal.py` (make sure that `~/.config/platypush/__init__.py` also exists so the folder is recognized as a Python module):

```python
from datetime import datetime

from platypush.config import Config
from platypush.cron import cron
from platypush.utils import run


@cron('* * * * *')
def take_thermal_picture(**context):
    run('camera.ir.mlx90640.capture', grayscale=True,
        output_file=datetime.now().strftime('/your/img/folder/%Y-%m-%d_%H-%M-%S.jpg'))
```

The images will be stored under `/your/img/folder` in the format
`YYYY-mm-dd_HH-MM-SS.jpg`. No scale factor is applied — the images will be tiny,
but we only need them to train our model. Also, we convert the images
to grayscale — the neural network will be lighter and actually more accurate,
as it will only have to rely on one variable per pixel without being tricked by
RGB combinations.

Restart Platypush and verify that every minute a new picture is created under
your images directory. Let it run for a few hours or days until you’re happy
with the number of samples. Try to balance the number of pictures with no
people in the room and those with people in the room, trying to cover as many
cases as possible — e.g. sitting or standing in different points of the room.
As I mentioned earlier, in my case I only needed less than 1000 pictures with
enough variety to achieve accuracy levels above 99%.

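Before moving on it may be worth sanity-checking the capture folder. The sketch below (the folder path is a
placeholder) counts the captured frames and verifies that the first one has the expected resolution and a single
grayscale channel:

```python
import os

from PIL import Image

img_dir = '/your/img/folder'
images = sorted(f for f in os.listdir(img_dir) if f.endswith('.jpg'))
print(f'{len(images)} frames captured so far')

# The size should be 32x24 (or 24x32, depending on the rotation) and the
# mode should be 'L', i.e. a single grayscale value per pixel
with Image.open(os.path.join(img_dir, images[0])) as img:
    print(f'size={img.size}, mode={img.mode}')
```
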
## Labelling phase

Once you’re happy with the number of samples you’ve taken, copy the images over
to the machine you’ll be using to train your model (they should all be small
JPEG files weighing under 500 bytes each). Copy them to the folder where you
have cloned my `imgdetect-utils` repository:

```shell
BASEDIR=~/git_tree/imgdetect-utils

# This directory will contain your raw images
IMGDIR=$BASEDIR/datasets/ir/images

# This directory will contain the raw numpy training
# data parsed from the images
DATADIR=$BASEDIR/datasets/ir/data

mkdir -p $IMGDIR
mkdir -p $DATADIR

# Copy the images
scp pi@raspberry:/your/img/folder/*.jpg $IMGDIR

# Create the labels for the images. Each label is a
# directory under $IMGDIR
mkdir $IMGDIR/negative
mkdir $IMGDIR/positive
```

Once the images have been copied and the directories for the labels created,
run the `label.py` script provided in the repository to interactively label the
images:

```shell
cd $BASEDIR
python utils/label.py -d $IMGDIR --scale-factor 10
```

Each image will open in a new window and you can label it by typing either 1
(negative) or 2 (positive) - the label names are gathered from the names of the
directories you created in the previous step:

![Thermal camera pictures labelling](../img/people-detect-3.png)

At the end of the procedure the `negative` and `positive` directories under the
images directory should have been populated.

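Before training, you may also want to check that the two classes are reasonably balanced. A tiny sketch that counts the
images per label, assuming the directory layout created above:

```python
import os

imgdir = os.path.expanduser('~/git_tree/imgdetect-utils/datasets/ir/images')

# Each label is a sub-directory populated by label.py in the previous step
for label in ('negative', 'positive'):
    n_images = len(os.listdir(os.path.join(imgdir, label)))
    print(f'{label}: {n_images} images')
```
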
## Training phase

Once we’ve got all the labelled images it’s time to train our model. A
[`train.ipynb`](https://github.com/BlackLight/imgdetect-utils/blob/master/notebooks/ir/train.ipynb)
Jupyter notebook is provided under `notebooks/ir` and it should be
relatively self-explanatory:

```python
### Import stuff

import os
import sys

import numpy as np

import tensorflow as tf
from tensorflow import keras

######
# Change this with the directory where you cloned the imgdetect-utils repo
basedir = os.path.join(os.path.expanduser('~'), 'git_tree', 'imgdetect-utils')
sys.path.append(os.path.join(basedir))

from src.image_helpers import plot_images_grid, create_dataset_files
from src.train_helpers import load_data, plot_results, export_model

# Define the dataset directory - replace it with the path on your local
# machine where you have stored the previously labelled dataset.
dataset_dir = os.path.join(basedir, 'datasets', 'ir')

# Define the size of the input images. In the case of an
# MLX90640 it will be (24, 32) for horizontal images and
# (32, 24) for vertical images
image_size = (32, 24)

# Image generator batch size
batch_size = 64

# Number of training epochs
epochs = 3
######

# The Tensorflow model and properties file will be stored here
tf_model_dir = os.path.join(basedir, 'models', 'ir', 'tensorflow')
tf_model_file = os.path.join(tf_model_dir, 'ir.pb')
tf_properties_file = os.path.join(tf_model_dir, 'ir.json')

# Base directory that contains your training images and dataset files
dataset_base_dir = os.path.join(basedir, 'datasets', 'ir')
dataset_dir = os.path.join(dataset_base_dir, 'data')

# Store your thermal camera images here
img_dir = os.path.join(dataset_base_dir, 'images')

### Create model directories

os.makedirs(tf_model_dir, mode=0o775, exist_ok=True)

### Create dataset files from the available images

dataset_files = create_dataset_files(img_dir, dataset_dir,
                                     split_size=1000,
                                     num_threads=1,
                                     resize=image_size)

### Or load existing .npz dataset files

dataset_files = [os.path.join(dataset_dir, f)
                 for f in os.listdir(dataset_dir)
                 if os.path.isfile(os.path.join(dataset_dir, f))
                 and f.endswith('.npz')]

### Get the training and test set randomly out of the dataset with a split of 70/30

train_set, test_set, classes = load_data(*dataset_files, split_percentage=0.7)
print('Loaded {} training images and {} test images. Classes: {}'.format(
    train_set.shape[0], test_set.shape[0], classes))

# Example output:
# Loaded 623 training images and 267 test images. Classes: ['negative' 'positive']

# Extract training set and test set images and labels
train_images = np.asarray([item[0] for item in train_set])
train_labels = np.asarray([item[1] for item in train_set])
test_images = np.asarray([item[0] for item in test_set])
test_labels = np.asarray([item[1] for item in test_set])

### Inspect the first 25 images in the training set

plot_images_grid(images=train_images, labels=train_labels,
                 classes=classes, rows=5, cols=5)

### Declare the model

# - Flatten input
# - Layer 1: 50% the number of pixels per image (RELU activation)
# - Layer 2: 20% the number of pixels per image (RELU activation)
# - Layer 3: as many neurons as the output labels
#   (in this case 2: negative, positive) (Softmax activation)

model = keras.Sequential([
    keras.layers.Flatten(input_shape=train_images[0].shape),
    keras.layers.Dense(int(0.5 * train_images.shape[1] * train_images.shape[2]),
                       activation=tf.nn.relu),
    keras.layers.Dense(int(0.2 * train_images.shape[1] * train_images.shape[2]),
                       activation=tf.nn.relu),
    keras.layers.Dense(len(classes), activation=tf.nn.softmax)
])

### Compile the model

# - Loss function: This measures how accurate the model is during training. We
#   want to minimize this function to "steer" the model in the right direction.
# - Optimizer: This is how the model is updated based on the data it sees and
#   its loss function.
# - Metrics: Used to monitor the training and testing steps. The following
#   example uses accuracy, the fraction of the images that are correctly classified.

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

### Train the model

model.fit(train_images, train_labels, epochs=epochs)

# Example output:
# Epoch 1/3 623/623 [======] - 0s 487us/sample - loss: 0.2672 - acc: 0.8860
# Epoch 2/3 623/623 [======] - 0s 362us/sample - loss: 0.0247 - acc: 0.9936
# Epoch 3/3 623/623 [======] - 0s 373us/sample - loss: 0.0083 - acc: 0.9984

### Evaluate accuracy against the test set

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)

# Example output:
# 267/267 [======] - 0s 243us/sample - loss: 0.0096 - acc: 0.9963
# Test accuracy: 0.9962547

### Make predictions on the test set

predictions = model.predict(test_images)

# Plot a grid of 36 images and show expected vs. predicted values
plot_results(images=test_images, labels=test_labels,
             classes=classes, predictions=predictions,
             rows=9, cols=4)

### Export as a Tensorflow model

export_model(model, tf_model_file,
             properties_file=tf_properties_file,
             classes=classes,
             input_size=image_size)
```

If you managed to execute the whole notebook correctly you’ll have a file named
`ir.pb` under `models/ir/tensorflow`. That’s your Tensorflow model file; you can
now copy it over to the RaspberryPi and use it to do predictions:

```shell
scp $BASEDIR/models/ir/tensorflow/ir.pb pi@raspberry:/home/pi/models
```

## Detect people in the room

Once the Tensorflow model has been deployed to the RaspberryPi you can replace the
previous cronjob that stores pictures at regular intervals with a cronjob that captures
pictures and feeds them to the previously trained model.

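A minimal sketch of such a cronjob is shown below, following the same structure as the capture script used earlier. It
assumes that the Platypush `tensorflow` plugin exposes a `tensorflow.predict` action that takes the model file and the
image to classify (the parameter names here are assumptions, so check the plugin documentation for your version), and
that the model was copied to `/home/pi/models/ir.pb`; the reaction to a detection is left as a placeholder:

```python
import os
import tempfile

from platypush.cron import cron
from platypush.utils import run

MODEL_FILE = '/home/pi/models/ir.pb'


@cron('* * * * *')
def check_people_presence(**context):
    # Capture a grayscale frame to a temporary file, like in the capture cronjob
    image_file = os.path.join(tempfile.gettempdir(), 'thermal.jpg')
    run('camera.ir.mlx90640.capture', grayscale=True, output_file=image_file)

    # Classify the frame against the trained model.
    # NOTE: the action and argument names below are assumptions - check the
    # documentation of the Platypush tensorflow plugin for the exact signature.
    prediction = run('tensorflow.predict', model=MODEL_FILE, files=[image_file])
    print('Prediction:', prediction)

    # Inspect the response once, then branch on the predicted label to trigger
    # your automation - e.g. turn the lights on when people are detected and
    # off when the room has been empty for a while.
```
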