diff --git a/static/img/baby-1.png b/static/img/baby-1.png new file mode 100644 index 0000000..a4a5c59 Binary files /dev/null and b/static/img/baby-1.png differ diff --git a/static/img/baby-2.jpg b/static/img/baby-2.jpg new file mode 100644 index 0000000..25b55ec Binary files /dev/null and b/static/img/baby-2.jpg differ diff --git a/static/pages/Create-your-smart-baby-monitor-with-Platypush-and-Tensorflow.md b/static/pages/Create-your-smart-baby-monitor-with-Platypush-and-Tensorflow.md new file mode 100644 index 0000000..6af349a --- /dev/null +++ b/static/pages/Create-your-smart-baby-monitor-with-Platypush-and-Tensorflow.md @@ -0,0 +1,638 @@ +[//]: # (title: Create your smart baby monitor with Platypush and Tensorflow) +[//]: # (description: Use open-source software and cheap hardware to build a solution that can detect your baby's cries.) +[//]: # (image: /img/baby-1.png) +[//]: # (author: Fabio Manganiello ) +[//]: # (published: 2020-10-31) + +Some of you may have noticed that it’s been a while since my last article. That’s because I’ve become a dad in the +meantime, and I’ve had to take a momentary break from my projects to deal with some parental tasks that can’t (yet) be +automated. + +Or, can they? While we’re probably still a few years away from a robot that can completely take charge of the task of +changing your son’s diapers (assuming that enough crazy parents agree to test such a device on their own toddlers), +there are some less risky parental duties out there that offer some margin for automation. + +One of the first things I’ve come to realize as a father is that infants can really cry a lot, and even if I’m at home I +may not always be nearby enough to hear my son’s cries. Commercial baby monitors usually step in to fill that gap and +they act as intercoms that let you hear your baby’s sounds even if you’re in another room. But I’ve soon realized that +commercial baby monitors are dumber than the ideal device I’d want. They don’t detect your baby’s cries — they simply +act like intercoms that take sound from a source to a speaker. It’s up to the parent to move the speaker as they move to +different rooms, as they can’t play the sound on any other existing audio infrastructure. They usually come with +low-power speakers, and they usually can’t be connected to external speakers — it means that if I’m in another room +playing music I may miss my baby’s cries, even if the monitor is in the same room as mine. And most of them work on +low-power radio waves, which means that they usually won’t work if the baby is in his/her room and you have to take a +short walk down to the basement. + +So I’ve come with a specification for a smart baby monitor. + +- It should run on anything as simple and cheap as a RaspberryPi with a cheap USB microphone. + +- It should detect my baby’s cries and notify me (ideally on my phone) when he starts/stops crying, or track the data + points on my dashboard, or do any kind of tasks that I’d want to run when my son is crying. It shouldn’t only act as a + dumb intercom that delivers sound from a source to one single type of compatible device. + +- It should be able to stream the audio on any device — my own speakers, my smartphone, my computer etc. + +- It should work no matter the distance between the source and the speaker, with no need to move the speaker around the + house. 
+ +- It should also come with a camera, so I can either check in real-time how my baby is doing or I can get a picture or a + short video feed of the crib when he starts crying to check that everything is alright. + +Let’s see how to use our favourite open-source tools to get this job done. + +## Recording some audio samples + +First of all, get a RaspberryPi and flash any compatible Linux OS on an SD card — it’s better to use a RaspberryPi 3 +or higher to run the Tensorflow model. Also get a compatible USB microphone — anything will work, really. + +Then install the dependencies that we’ll need: + +```shell +[sudo] apt-get install ffmpeg lame libatlas-base-dev alsa-utils +[sudo] pip3 install tensorflow +``` + +As a first step, we’ll have to record enough audio samples where the baby cries and where the baby doesn’t cry that +we’ll use later to train the audio detection model. *Note: in this example I’ll show how to use sound detection to +recognize a baby’s cries, but the same exact procedure can be used to detect any type of sounds — as long as they’re +long enough (e.g. an alarm or your neighbour’s drilling) and loud enough over the background noise*. + +First, take a look at the recognized audio input devices: + +```shell +arecord -l +``` + +On my RaspberryPi I get the following output (note that I have two USB microphones): + +``` +**** List of CAPTURE Hardware Devices **** +card 1: Device [USB PnP Sound Device], device 0: USB Audio [USB Audio] + Subdevices: 0/1 + Subdevice #0: subdevice #0 +card 2: Device_1 [USB PnP Sound Device], device 0: USB Audio [USB Audio] + Subdevices: 0/1 + Subdevice #0: subdevice #0 +``` + +I want to use the second microphone to record sounds — that’s card 2, device 0. The ALSA way of identifying it is either +`hw:2,0` (which accesses the hardware device directly) or `plughw:2,0` (which infers sample rate and format conversion +plugins if required). Make sure that you have enough space on your SD card or plug an external USB drive, and then start +recording some audio: + +```shell +arecord -D plughw:2,0 -c 1 -f cd | lame - audio.mp3 +``` + +Record a few minutes or hours of audio while your baby is in the same room — preferably with long sessions both of +silence, baby cries and other non-related sounds — and Ctrl-C the process when done. Repeat the procedure as many times +as you like to get audio samples over different moments of the day or over different days. + +## Labeling the audio samples + +Once you have enough audio samples, it’s time to copy them over to your computer to train the model — either use `scp` +to copy the files, or copy them directly from the SD card/USB drive. + +Let’s store them all under the same directory, e.g. `~/datasets/sound-detect/audio`. Also, let’s create a new folder for +each of the samples. Each folder will contain an audio file (named `audio.mp3`) and a labels file (named `labels.json`) +that we’ll use to label the negative/positive audio segments in the audio file. So the structure of the raw dataset will +be something like: + +``` +~/datasets/sound-detect/audio + -> sample_1 + -> audio.mp3 + -> labels.json + + -> sample_2 + -> audio.mp3 + -> labels.json + + ... +``` + +The boring part comes now: labeling the recorded audio files — and it can be particularly masochistic if they contain +hours of your own baby’s cries. Open each of the dataset audio files either in your favourite audio player or in +Audacity and create a new `labels.json` file in each of the samples directories. 
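If you have recorded several sessions, a small shell loop can scaffold this layout for you. This is only a sketch — it assumes the recordings are named `sample_1.mp3`, `sample_2.mp3` and so on, so adjust the glob to your actual file names:

```shell
cd ~/datasets/sound-detect/audio

for f in sample_*.mp3; do
    d="${f%.mp3}"
    mkdir -p "$d"
    mv "$f" "$d/audio.mp3"
    # Start from an empty labels file that you'll fill in below
    [ -f "$d/labels.json" ] || echo '{}' > "$d/labels.json"
done
```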
Identify the exact times where the cries start and where they end, and report them in `labels.json` as a key-value structure in the form `time_string -> label`. Example:

```json
{
  "00:00": "negative",
  "02:13": "positive",
  "04:57": "negative",
  "15:41": "positive",
  "18:24": "negative"
}
```

In the example above, all the audio segments between 00:00 and 02:12 will be labelled as negative, all the audio segments between 02:13 and 04:56 will be labelled as positive, and so on.

## Generating the dataset

Once you have labelled all the audio samples, let's proceed with generating the dataset that will be fed to the Tensorflow model. I have created a generic library and set of utilities for sound monitoring called micmon. Let's start with installing it:

```shell
git clone https://github.com/BlackLight/micmon.git
cd micmon
[sudo] pip3 install -r requirements.txt
[sudo] python3 setup.py build install
```

The model is designed to work on frequency samples instead of raw audio. The reason is that, if we want to detect a specific sound, that sound will have a specific "spectral" signature — i.e. a base frequency (or a narrow range where the base frequency usually falls) and a specific set of harmonics bound to the base frequency by specific ratios. Moreover, the ratios between such frequencies are affected neither by amplitude (the frequency ratios are constant regardless of the input volume) nor by phase (a continuous sound will have the same spectral signature regardless of when you start recording it). Such amplitude and time invariance makes this approach much more likely to produce a robust sound detection model than simply feeding raw audio samples to a model. Moreover, this model can be simpler (we can easily group frequencies into bins without affecting the performance, thus we can effectively perform dimensionality reduction), much lighter (the model will have between 50 and 100 frequency bands as input values, regardless of the sample duration, while one second of raw audio usually contains 44100 data points, and the length of the input increases with the duration of the sample) and less prone to overfitting.

`micmon` provides the logic to calculate the [*FFT*](https://en.wikipedia.org/wiki/Fast_Fourier_transform) (Fast Fourier Transform) of some segments of the audio samples, group the resulting spectrum into bands with low-pass and high-pass filters and save the result to a set of numpy compressed (`.npz`) files. You can do it from the command line through the `micmon-datagen` command:

```shell
micmon-datagen \
    --low 250 --high 2500 --bins 100 \
    --sample-duration 2 --channels 1 \
    ~/datasets/sound-detect/audio \
    ~/datasets/sound-detect/data
```

In the example above we generate a dataset from the raw audio samples stored under `~/datasets/sound-detect/audio` and store the resulting spectral data to `~/datasets/sound-detect/data`. `--low` and `--high` respectively identify the lowest and highest frequency to be taken into account in the resulting spectrum. The default values are respectively 20 Hz (the lowest frequency audible to a human ear) and 20 kHz (the highest frequency audible to a healthy and young human ear). However, you will usually want to restrict this range to capture as much as possible of the sound that you want to detect and to limit as much as possible any other type of background audio and unrelated harmonics.
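If you're not sure where your target sound actually sits in the spectrum, you can check before committing to a range: export a few seconds of a positive segment (e.g. as WAV from Audacity) and look at where most of the energy is concentrated. A rough sketch with numpy/scipy — the file name is a placeholder and scipy is assumed to be installed:

```python
import numpy as np
from scipy.io import wavfile

# A few seconds of a "positive" segment, exported as WAV (placeholder path)
sample_rate, audio = wavfile.read('positive-sample.wav')
if audio.ndim > 1:
    audio = audio.mean(axis=1)  # mix stereo down to mono

spectrum = np.abs(np.fft.rfft(audio))
freqs = np.fft.rfftfreq(len(audio), d=1 / sample_rate)

# Print the 10 frequencies carrying the most energy
for i in np.argsort(spectrum)[-10:][::-1]:
    print(f'{freqs[i]:8.1f} Hz -> {spectrum[i]:.1f}')
```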
I have found in my case that a 250–2500 Hz range is good enough to detect baby cries. Baby cries are usually high-pitched (consider that the highest note an opera soprano can reach is around 1000 Hz), and you may want to set the upper cutoff to at least twice the highest expected fundamental, to make sure that you get enough higher harmonics (the harmonics are the higher frequencies that actually give a *timbre*, or colour, to a sound), but not so high that you pollute the spectrum with harmonics from other background sounds. I also cut anything below 250 Hz — a baby's cry probably won't have much happening on those low frequencies, and including them may also skew detection. A good approach is to open some positive audio samples in e.g. Audacity or any equalizer/spectrum analyzer, check which frequencies are dominant in the positive samples and center your dataset around those frequencies. `--bins` specifies the number of groups for the frequency space (default: 100). A higher number of bins means a higher frequency resolution/granularity, but if it's too high it may make the model prone to overfitting.

The script splits the original audio into smaller segments and calculates the spectral "signature" of each of those segments. `--sample-duration` specifies how long each of these segments should be (default: 2 seconds). A higher value may work better with sounds that last longer, but it'll increase the time-to-detection and it'll probably fail on short sounds. A lower value may work better with shorter sounds, but the captured segments may not have enough information to reliably identify the sound if the sound is longer.

An alternative approach to the `micmon-datagen` script is to write your own script for generating the dataset through the provided micmon API. Example:

```python
import os

from micmon.audio import AudioDirectory, AudioFile
from micmon.dataset import DatasetWriter

basedir = os.path.expanduser('~/datasets/sound-detect')
audio_dir = os.path.join(basedir, 'audio')
datasets_dir = os.path.join(basedir, 'data')
cutoff_frequencies = [250, 2500]

# Scan the base audio_dir for labelled audio samples
audio_dirs = AudioDirectory.scan(audio_dir)

# Save the spectrum information and labels of the samples to a
# different compressed file for each audio file.
for audio_dir in audio_dirs:
    dataset_file = os.path.join(datasets_dir, os.path.basename(audio_dir.path) + '.npz')
    print(f'Processing audio sample {audio_dir.path}')

    with AudioFile(audio_dir) as reader, \
            DatasetWriter(dataset_file,
                          low_freq=cutoff_frequencies[0],
                          high_freq=cutoff_frequencies[1]) as writer:
        for sample in reader:
            writer += sample
```

Whether you used `micmon-datagen` or the micmon Python API, at the end of the process you should find a bunch of `.npz` files under `~/datasets/sound-detect/data`, one for each labelled audio file in the original dataset. We can use this dataset to train our neural network for sound detection.

## Training the model

`micmon` uses Tensorflow+Keras to define and train the model. It can easily be done with the provided Python API.
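Before training, it doesn't hurt to peek into one of the generated `.npz` files and check that it contains what you'd expect — roughly one row per audio segment and one column per frequency bin, plus the matching labels. The names of the arrays inside the archive are micmon internals, so list `npz.files` rather than assuming them:

```python
import os
import numpy as np

# One of the files generated in the previous step
dataset_file = os.path.expanduser('~/datasets/sound-detect/data/sample_1.npz')

npz = np.load(dataset_file)
for name in npz.files:
    print(name, npz[name].shape, npz[name].dtype)
```

If the shapes look sane, you're ready to train.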
+Example: + +```python +import os +from tensorflow.keras import layers + +from micmon.dataset import Dataset +from micmon.model import Model + +# This is a directory that contains the saved .npz dataset files +datasets_dir = os.path.expanduser('~/datasets/sound-detect/data') + +# This is the output directory where the model will be saved +model_dir = os.path.expanduser('~/models/sound-detect') + +# This is the number of training epochs for each dataset sample +epochs = 2 + +# Load the datasets from the compressed files. +# 70% of the data points will be included in the training set, +# 30% of the data points will be included in the evaluation set +# and used to evaluate the performance of the model. +datasets = Dataset.scan(datasets_dir, validation_split=0.3) +labels = ['negative', 'positive'] +freq_bins = len(datasets[0].samples[0]) + +# Create a network with 4 layers (one input layer, two intermediate layers and one output layer). +# The first intermediate layer in this example will have twice the number of units as the number +# of input units, while the second intermediate layer will have 75% of the number of +# input units. We also specify the names for the labels and the low and high frequency range +# used when sampling. +model = Model( + [ + layers.Input(shape=(freq_bins,)), + layers.Dense(int(2 * freq_bins), activation='relu'), + layers.Dense(int(0.75 * freq_bins), activation='relu'), + layers.Dense(len(labels), activation='softmax'), + ], + labels=labels, + low_freq=datasets[0].low_freq, + high_freq=datasets[0].high_freq +) + +# Train the model +for epoch in range(epochs): + for i, dataset in enumerate(datasets): + print(f'[epoch {epoch+1}/{epochs}] [audio sample {i+1}/{len(datasets)}]') + model.fit(dataset) + evaluation = model.evaluate(dataset) + print(f'Validation set loss and accuracy: {evaluation}') + +# Save the model +model.save(model_dir, overwrite=True) +``` + +After running this script (and after you’re happy with the model’s accuracy) you’ll find your new model saved under +`~/models/sound-detect`. In my case it was sufficient to collect ~5 hours of sounds from my baby’s room and define a +good frequency range to train a model with >98% accuracy. If you trained this model on your computer, just copy it to +the RaspberryPi and you’re ready for the next step. + +## Using the model for predictions + +Time to make a script that uses the previously trained model on live audio data from the microphone and notifies us when +our baby is crying: + +```python +import os + +from micmon.audio import AudioDevice +from micmon.model import Model + +model_dir = os.path.expanduser('~/models/sound-detect') +model = Model.load(model_dir) +audio_system = 'alsa' # Supported: alsa and pulse +audio_device = 'plughw:2,0' # Get list of recognized input devices with arecord -l + +with AudioDevice(audio_system, device=audio_device) as source: + for sample in source: + # Pause recording while we process the frame + source.pause() + prediction = model.predict(sample) + print(prediction) + # Resume recording + source.resume() +``` + +Run the script on the RaspberryPi and leave it running for a bit — it will print `negative` if no cries have been +detected over the past 2 seconds and `positive` otherwise. + +There’s not much use however in a script that simply prints a message to the standard output if our baby is crying — we +want to be notified! Let’s use Platypush to cover this part. 
In this example, we’ll use +the [`pushbullet`](https://platypush.readthedocs.io/en/latest/platypush/plugins/pushbullet.html) integration to send a +message to our mobile when cry is detected. Let’s install Redis (used by Platypush to receive messages) and Platypush +with the HTTP and Pushbullet integrations: + +```shell +[sudo] apt-get install redis-server +[sudo] systemctl start redis-server.service +[sudo] systemctl enable redis-server.service +[sudo] pip3 install 'platypush[http,pushbullet]' +``` + +Install the Pushbullet app on your smartphone and head to https://pushbullet.com to get an API token. Then create a +`~/.config/platypush/config.yaml` file that enables the HTTP and Pushbullet integrations: + +```yaml +backend.http: + enabled: True + +pushbullet: + token: YOUR_TOKEN +``` + +Now, let’s modify the previous script so that, instead of printing a message to the standard output, it triggers a +[`CustomEvent`](https://platypush.readthedocs.io/en/latest/platypush/events/custom.html) that can be captured by a +Platypush hook: + +```python +#!/usr/bin/python3 + +import argparse +import logging +import os +import sys + +from platypush import RedisBus +from platypush.message.event.custom import CustomEvent + +from micmon.audio import AudioDevice +from micmon.model import Model + +logger = logging.getLogger('micmon') + + +def get_args(): + parser = argparse.ArgumentParser() + parser.add_argument('model_path', help='Path to the file/directory containing the saved Tensorflow model') + parser.add_argument('-i', help='Input sound device (e.g. hw:0,1 or default)', required=True, dest='sound_device') + parser.add_argument('-e', help='Name of the event that should be raised when a positive event occurs', required=True, dest='event_type') + parser.add_argument('-s', '--sound-server', help='Sound server to be used (available: alsa, pulse)', required=False, default='alsa', dest='sound_server') + parser.add_argument('-P', '--positive-label', help='Model output label name/index to indicate a positive sample (default: positive)', required=False, default='positive', dest='positive_label') + parser.add_argument('-N', '--negative-label', help='Model output label name/index to indicate a negative sample (default: negative)', required=False, default='negative', dest='negative_label') + parser.add_argument('-l', '--sample-duration', help='Length of the FFT audio samples (default: 2 seconds)', required=False, type=float, default=2., dest='sample_duration') + parser.add_argument('-r', '--sample-rate', help='Sample rate (default: 44100 Hz)', required=False, type=int, default=44100, dest='sample_rate') + parser.add_argument('-c', '--channels', help='Number of audio recording channels (default: 1)', required=False, type=int, default=1, dest='channels') + parser.add_argument('-f', '--ffmpeg-bin', help='FFmpeg executable path (default: ffmpeg)', required=False, default='ffmpeg', dest='ffmpeg_bin') + parser.add_argument('-v', '--verbose', help='Verbose/debug mode', required=False, action='store_true', dest='debug') + parser.add_argument('-w', '--window-duration', help='Duration of the look-back window (default: 10 seconds)', required=False, type=float, default=10., dest='window_length') + parser.add_argument('-n', '--positive-samples', help='Number of positive samples detected over the window duration to trigger the event (default: 1)', required=False, type=int, default=1, dest='positive_samples') + + opts, args = parser.parse_known_args(sys.argv[1:]) + return opts + + +def main(): + args = get_args() + if 
args.debug: + logger.setLevel(logging.DEBUG) + + model_dir = os.path.abspath(os.path.expanduser(args.model_path)) + model = Model.load(model_dir) + window = [] + cur_prediction = args.negative_label + bus = RedisBus() + + with AudioDevice(system=args.sound_server, + device=args.sound_device, + sample_duration=args.sample_duration, + sample_rate=args.sample_rate, + channels=args.channels, + ffmpeg_bin=args.ffmpeg_bin, + debug=args.debug) as source: + for sample in source: + # Pause recording while we process the frame + source.pause() + prediction = model.predict(sample) + logger.debug(f'Sample prediction: {prediction}') + has_change = False + + if len(window) < args.window_length: + window += [prediction] + else: + window = window[1:] + [prediction] + + positive_samples = len([pred for pred in window if pred == args.positive_label]) + if args.positive_samples <= positive_samples and \ + prediction == args.positive_label and \ + cur_prediction != args.positive_label: + cur_prediction = args.positive_label + has_change = True + logging.info(f'Positive sample threshold detected ({positive_samples}/{len(window)})') + elif args.positive_samples > positive_samples and \ + prediction == args.negative_label and \ + cur_prediction != args.negative_label: + cur_prediction = args.negative_label + has_change = True + logging.info(f'Negative sample threshold detected ({len(window)-positive_samples}/{len(window)})') + + if has_change: + evt = CustomEvent(subtype=args.event_type, state=prediction) + bus.post(evt) + + # Resume recording + source.resume() + + +if __name__ == '__main__': + main() +``` + +Save the script above as e.g. `~/bin/micmon_detect.py`. The script only triggers an event if at least `positive_samples` +samples are detected over a sliding window of `window_length` seconds (that’s to reduce the noise caused by prediction +errors or temporary glitches), and it only triggers an event when the current prediction goes from negative to positive +or the other way around. The event is then dispatched to Platypush over the `RedisBus`. The script should also be +general-purpose enough to work with any sound model (not necessarily that of a crying infant), any positive/negative +labels, any frequency range and any type of output event. + +Let’s now create a Platypush hook to react on the event and send a notification to our devices. 
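Before creating the hook, you can sanity-check the detector on its own by running it manually for a minute — the flags below are the same ones we'll put into its service file shortly. Make sure Redis is running, since the script posts its events over the `RedisBus`, and add `-v` for debug output (depending on your logging configuration you may also need a `logging.basicConfig(level=logging.DEBUG)` call at the top of `main()` to actually see the per-sample predictions):

```shell
chmod +x ~/bin/micmon_detect.py
~/bin/micmon_detect.py -v -i plughw:2,0 -e baby-cry -w 10 -n 2 ~/models/sound-detect
```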
First, prepare the Platypush scripts directory if it hasn't been created already:

```shell
mkdir -p ~/.config/platypush/scripts
cd ~/.config/platypush/scripts

# Define the directory as a module
touch __init__.py

# Create a script for the baby-cry events
vi babymonitor.py
```

Content of `babymonitor.py`:

```python
from platypush.context import get_plugin
from platypush.event.hook import hook
from platypush.message.event.custom import CustomEvent


@hook(CustomEvent, subtype='baby-cry', state='positive')
def on_baby_cry_start(event, **_):
    pb = get_plugin('pushbullet')
    pb.send_note(title='Baby cry status', body='The baby is crying!')


@hook(CustomEvent, subtype='baby-cry', state='negative')
def on_baby_cry_stop(event, **_):
    pb = get_plugin('pushbullet')
    pb.send_note(title='Baby cry status', body='The baby stopped crying - good job!')
```

Now create a service file for Platypush if it's not present already, and start/enable the service so it will automatically restart on termination or reboot:

```shell
mkdir -p ~/.config/systemd/user

wget -O ~/.config/systemd/user/platypush.service \
    https://git.platypush.tech/platypush/platypush/-/raw/master/examples/systemd/platypush.service

systemctl --user start platypush.service
systemctl --user enable platypush.service
```

Also create a service file for the baby monitor — e.g. `~/.config/systemd/user/babymonitor.service`:

```yaml
[Unit]
Description=Monitor to detect my baby's cries
After=network.target sound.target

[Service]
ExecStart=/home/pi/bin/micmon_detect.py -i plughw:2,0 -e baby-cry -w 10 -n 2 ~/models/sound-detect
Restart=always
RestartSec=10

[Install]
WantedBy=default.target
```

This service will start the microphone monitor on the ALSA device `plughw:2,0`, and it will fire a `baby-cry` event with `state=positive` if at least 2 positive 2-second samples have been detected over the past 10 seconds and the previous state was negative, and `state=negative` if fewer than 2 positive samples were detected over the past 10 seconds and the previous state was positive. We can then start/enable the service:

```shell
systemctl --user start babymonitor.service
systemctl --user enable babymonitor.service
```

Verify that you receive a notification on your phone as soon as the baby starts crying. If you don't, you may want to review the labels you applied to your audio samples, the architecture and parameters of your neural network, or the sample length/window/frequency band parameters.

Also, consider that this is a relatively basic example of automation — feel free to spice it up with more automation tasks. For example, you can send a request to another Platypush device (e.g. in your bedroom or living room) with the [`tts`](https://platypush.readthedocs.io/en/latest/platypush/plugins/tts.html) plugin to say aloud that the baby is crying. You can also extend the `micmon_detect.py` script so that the captured audio samples can also be streamed over HTTP — for example using a Flask wrapper and `ffmpeg` for the audio conversion. Another interesting use case is to send data points to your local database when the baby starts/stops crying (you can refer to my previous article on how to use Platypush+PostgreSQL+Mosquitto+Grafana to create your own flexible and self-managed dashboards): it's a useful set of data to track when your baby sleeps, is awake or needs feeding.
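As a concrete example of the first idea, the positive hook in `babymonitor.py` can also announce the event out loud through the [`tts`](https://platypush.readthedocs.io/en/latest/platypush/plugins/tts.html) plugin. The sketch below runs the announcement on the same device for simplicity, and assumes that the `tts` plugin and its requirements are available — routing it to another Platypush host would follow the same pattern on that host:

```python
from platypush.context import get_plugin
from platypush.event.hook import hook
from platypush.message.event.custom import CustomEvent


@hook(CustomEvent, subtype='baby-cry', state='positive')
def on_baby_cry_start(event, **_):
    # Push notification, as before
    get_plugin('pushbullet').send_note(title='Baby cry status', body='The baby is crying!')
    # Also say it out loud on this device's speakers
    get_plugin('tts').say('The baby is crying')
```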
And, again, monitoring my baby has been the main motivation behind developing micmon, but the exact same procedure can be used to train and use models to detect any type of sound. Finally, you may consider using a good power bank or a pack of lithium batteries to make your sound monitor mobile.

## Baby camera

Once you have a good audio feed and a way to detect when a positive audio sequence starts/stops, you may want to add a video feed to keep an eye on your baby. While in my first setup I had mounted a PiCamera on the same RaspberryPi 3 I used for the audio detection, I found this configuration quite impractical. A RaspberryPi 3 sitting in its case, with an attached pack of batteries and a camera somehow glued on top, can be quite bulky if you're looking for a light camera that you can easily install on a stand or flexible arm and move around to keep an eye on your baby wherever he/she is. I eventually opted for a smaller RaspberryPi Zero with a PiCamera-compatible case and a small power bank.

![RaspberryPi Zero + PiCamera setup](../img/baby-2.jpg)

Like on the other device, plug in an SD card flashed with a RaspberryPi-compatible OS. Then plug a RaspberryPi-compatible camera into its slot, make sure that the camera module is enabled in `raspi-config` and install Platypush with the PiCamera integration:

```shell
[sudo] pip3 install 'platypush[http,camera,picamera]'
```

Then add the camera configuration in `~/.config/platypush/config.yaml`:

```yaml
camera.pi:
    # Listen port for TCP/H264 video feed
    listen_port: 5001
```

You can already check this configuration on Platypush restart and get snapshots from the camera over HTTP:

```shell
wget http://raspberry-pi:8008/camera/pi/photo.jpg
```

Or open the video feed in your browser:

```shell
http://raspberry-pi:8008/camera/pi/video.mjpg
```

Or you can create a hook that starts streaming the camera feed over TCP/H264 when the application starts:

```shell
mkdir -p ~/.config/platypush/scripts
cd ~/.config/platypush/scripts
touch __init__.py
vi camera.py
```

Content of `camera.py`:

```python
from platypush.context import get_plugin
from platypush.event.hook import hook
from platypush.message.event.application import ApplicationStartedEvent


@hook(ApplicationStartedEvent)
def on_application_started(event, **_):
    cam = get_plugin('camera.pi')
    cam.start_streaming()
```

You will be able to play the feed in e.g. VLC:

```
vlc tcp/h264://raspberry-pi:5001
```

Or on your phone, either through the VLC app or through apps like [RPi Camera Viewer](https://play.google.com/store/apps/details?id=ca.frozen.rpicameraviewer&hl=en_US&gl=US).

## Audio monitor

The last step is to set up a direct microphone stream from your baby's RaspberryPi to whichever client you may want to use. The Tensorflow model is good at nudging you when the baby is crying, but we all know that machine learning models aren't exactly famous for achieving 100% accuracy. Sometimes you may simply be sitting in another room and want to hear what's happening in your baby's room.

I have made a tool/library for this purpose called [`micstream`](https://github.com/BlackLight/micstream/) — it can actually be used in any situation where you want to set up an audio feed from a microphone over HTTP/mp3. Note: if you use a microphone to feed audio to the Tensorflow model, then you'll need another microphone for streaming.
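Since the detection script is already holding on to one microphone, double-check which ALSA card the spare one got — that's the device you'll pass to micstream's `-i` option below:

```shell
arecord -l
```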
+ +Just clone the repository and install the software (the only dependency is the ffmpeg executable installed on the +system): + +```shell +git clone https://github.com/BlackLight/micstream.git +cd micstream +[sudo] python3 setup.py install +``` + +You can get a full list of the available options with `micstream --help`. For example, if you want to set up streaming +on the 3rd audio input device (use `arecord -l` to get the full list), on the `/baby.mp3` endpoint, listening on port +8088 and with 96 kbps bitrate, then the command will be: + +```shell +micstream -i plughw:3,0 -e '/baby.mp3' -b 96 -p 8088 +``` + +You can now simply open `http://your-rpi:8088/baby.mp3` from any browser or audio player and you’ll have a real-time +audio feed from the baby monitor.
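Finally, if you want the stream to come back up on its own after a reboot, you can wrap it in a user service just like the detector — e.g. `~/.config/systemd/user/micstream.service`. This is a sketch along the lines of `babymonitor.service`; the `ExecStart` path is an assumption, so check where the `micstream` executable actually landed (e.g. with `which micstream`) and adjust the device and options to your setup:

```yaml
[Unit]
Description=HTTP audio stream from the baby monitor microphone
After=network.target sound.target

[Service]
ExecStart=/usr/local/bin/micstream -i plughw:3,0 -e '/baby.mp3' -b 96 -p 8088
Restart=always
RestartSec=10

[Install]
WantedBy=default.target
```

```shell
systemctl --user start micstream.service
systemctl --user enable micstream.service
```

The stream will then be available on the same `/baby.mp3` endpoint right after boot.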