blog/markdown/Automate-your-music-collect...

[//]: # (title: Automate your music collection)
[//]: # (description: Use Platypush to manage your music activity, discovery playlists and be on top of new releases.)
[//]: # (image: /img/music-automation.png)
[//]: # (author: Fabio Manganiello <fabio@platypush.tech>)
[//]: # (published: 2022-09-19)
[//]: # (latex: 1)

I have been an enthusiastic user of mpd and mopidy for nearly two decades. I
have already [written an
article](https://blog.platypush.tech/article/Build-your-open-source-multi-room-and-multi-provider-sound-server-with-Platypush-Mopidy-and-Snapcast)
on how to leverage mopidy (with its tons of integrations, including Spotify,
Tidal, YouTube, Bandcamp, Plex, TuneIn, SoundCloud etc.), Snapcast (with its
multi-room listening experience out of the box) and Platypush (with its
automation hooks that allow you to easily create if-this-then-that rules for
your music events) to take your listening experience to the next level, while
using open protocols and easily extensible open-source software.

There is a feature that I haven't yet covered in my previous articles, and
that's the automation of your music collection.

Spotify, Tidal and other music streaming services offer you features such as a
_Discovery Weekly_ or _Release Radar_ playlists, respectively filled with
tracks that you may like, or newly released tracks that you may be interested
in.

The problem is that these services come with heavy trade-offs:

1. Their algorithms are closed. You don't know how Spotify figures out which
   songs should be picked in your smart playlists. In the past months, Spotify
   would often suggest me tracks from the same artists that I had already
   listened to or skipped in the past, and there's no transparent way to tell
   the algorithm "hey, actually I'd like you to suggest me more this kind of
   music - or maybe calculate suggestions only based on the music I've listened
   to in this time range, or maybe weigh this genre more".

2. Those features are tightly coupled with the service you use. If you cancel
   your Spotify subscription, you lose those smart features as well.
   Companies like Spotify use such features as a lock-in mechanism -
   you can check out any time you like, but if you do then nobody else will
   provide you with their clever suggestions.

After migrating from Spotify to Tidal in the past couple of months (TL;DR:
Spotify f*cked up their developer experience multiple times over the past
decade, and their killing of libspotify without providing any alternatives was
the last nail in the coffin for me) I felt like missing their smart mixes,
discovery and new releases playlists - and, on the other hand, Tidal took a
while to learn my listening habits, and even when it did it often generated
smart playlists that were an inch below Spotify's. I asked myself why on earth
my music discovery experience should be so tightly coupled to one single cloud
service. And I decided that the time had come for me to automatically generate
my service-agnostic music suggestions: it's not rocket science anymore, there's
plenty of services that you can piggyback on to get artist or tracks similar to
some music given as input, and there's just no excuses to feel locked in by
Spotify, Google, Tidal or some other cloud music provider.

In this article we'll cover how to:

1. Use Platypush to automatically keep track of the music you listen to from
   any of your devices;
2. Calculate the suggested tracks that may be similar to the music you've
   recently listen to by using the Last.FM API;
3. Generate a _Discover Weekly_ playlist similar to Spotify's without relying
   on Spotify;
4. Get the newly released albums and single by subscribing to an RSS feed;
5. Generate a weekly playlist with the new releases by filtering those from
   artists that you've listened to at least once.

## Ingredients

We will use Platypush to handle the following features:

1. Store our listening history to a local database, or synchronize it with a
   scrobbling service like [last.fm](https://last.fm).
2. Periodically inspect our newly listened tracks, and use the last.fm API to
   retrieve similar tracks.
3. Generate a discover weekly playlist based on a simple score that ranks
   suggestions by match score against the tracks listened on a certain period
   of time, and increases the weight of suggestions that occur multiple times.
4. Monitor new releases from the newalbumreleases.net RSS feed, and create a
   weekly _Release Radar_ playlist containing the items from artists that we
   have listened to at least once.

This tutorial will require:

1. A database to store your listening history and suggestions. The database
   initialization script has been tested against Postgres, but it should be
   easy to adapt it to MySQL or SQLite with some minimal modifications.
2. A machine (it can be a RaspberryPi, a home server, a VPS, an unused tablet
   etc.) to run the Platypush automation.
3. A Spotify or Tidal account. The reported examples will generate the
   playlists on a Tidal account by using the `music.tidal` Platypush plugin,
   but it should be straightforward to adapt them to Spotify by using the
   `music.spotify` plugin, or even to YouTube by using the YouTube API, or even
   to local M3U playlists.

## Setting up the software

Start by installing Platypush with the
[Tidal](https://docs.platypush.tech/platypush/plugins/music.tidal.html),
[RSS](https://docs.platypush.tech/platypush/plugins/rss.html) and
[Last.fm](https://docs.platypush.tech/platypush/plugins/lastfm.html)
integrations:

```
[sudo] pip install 'platypush[tidal,rss,lastfm]'
```

If you want to use Spotify instead of Tidal then just remove `tidal` from the
list of extra dependencies - no extra dependencies are required for the
[Spotify
plugin](https://docs.platypush.tech/platypush/plugins/music.spotify.html).

If you are planning to listen to music through mpd/mopidy, then you may also
want to include `mpd` in the list of extra dependencies, so Platypush can
directly monitor your listening activity over the MPD protocol.

Let's then configure a simple configuration under `~/.config/platypush/config.yaml`:

```yaml
music.tidal:
  # No configuration required

# Or, if you use Spotify, create an app at https://developer.spotify.com and
# add its credentials here
# music.spotify:
#   client_id: client_id
#   client_secret: client_secret

lastfm:
  api_key: your_api_key
  api_secret: your_api_secret
  username: your_user
  password: your_password

# Subscribe to updates from newalbumreleases.net
rss:
  subscriptions:
    - https://newalbumreleases.net/category/cat/feed/

# Optional, used to send notifications about generation issues to your
# mobile/browser. You can also use Pushbullet, an email plugin or a chatbot if
# you prefer.
ntfy:
  # No configuration required if you want to use the default server at
  # https://ntfy.sh

# Include the mpd plugin and backend if you are listening to music over
# mpd/mopidy
music.mpd:
  host: localhost
  port: 6600

backend.music.mopidy:
  host: localhost
  port: 6600
```

Start Platypush by running the `platypush` command. The first time it should
prompt you with a tidal.com link required to authenticate your user. Open it in
your browser and authorize the app - the next runs should no longer ask you to
authenticate.

Once the Platypush dependencies are in place, let's move to configure the
database.

## Database configuration

I'll assume that you have a Postgres database running somewhere, but the script
below can be easily adapted also to other DBMS's.

Database initialization script:

```sql
-- New listened tracks will be pushed to the tmp_music table, and normalized by
-- a trigger.
drop table if exists tmp_music cascade;
create table tmp_music(
    id          serial not null,
    artist      varchar(255) not null,
    title       varchar(255) not null,
    album       varchar(255),
    created_at  timestamp with time zone default CURRENT_TIMESTAMP,
    primary key(id)
);

-- This table will store the tracks' info
drop table if exists music_track cascade;
create table music_track(
    id          serial not null,
    artist      varchar(255) not null,
    title       varchar(255) not null,
    album       varchar(255),
    created_at  timestamp with time zone default CURRENT_TIMESTAMP,
    primary key(id),
    unique(artist, title)
);

-- Create an index on (artist, title), and ensure that the (artist, title) pair
-- is unique
create unique index track_artist_title_idx on music_track(lower(artist), lower(title));
create index track_artist_idx on music_track(lower(artist));

-- music_activity holds the listened tracks
drop table if exists music_activity cascade;
create table music_activity(
    id          serial not null,
    track_id    int not null,
    created_at  timestamp with time zone default CURRENT_TIMESTAMP,
    primary key(id)
);

-- music_similar keeps track of the similar tracks
drop table if exists music_similar cascade;
create table music_similar(
    source_track_id   int not null,
    target_track_id   int not null,
    match_score       float not null,
    primary key(source_track_id, target_track_id),
    foreign key(source_track_id) references music_track(id),
    foreign key(target_track_id) references music_track(id)
);

-- music_discovery_playlist keeps track of the generated discovery playlists
drop table if exists music_discovery_playlist cascade;
create table music_discovery_playlist(
    id          serial not null,
    name        varchar(255),
    created_at  timestamp with time zone default CURRENT_TIMESTAMP,
    primary key(id)
);

-- This table contains the track included in each discovery playlist
drop table if exists music_discovery_playlist_track cascade;
create table music_discovery_playlist_track(
    id            serial not null,
    playlist_id   int not null,
    track_id      int not null,
    primary key(id),
    unique(playlist_id, track_id),
    foreign key(playlist_id) references music_discovery_playlist(id),
    foreign key(track_id) references music_track(id)
);

-- This table contains the new releases from artist that we've listened to at
-- least once
drop table if exists new_release cascade;
create table new_release(
    id serial not null,
    artist varchar(255) not null,
    album varchar(255) not null,
    genre varchar(255),
    created_at timestamp with time zone default CURRENT_TIMESTAMP,

    primary key(id),
    constraint u_artist_title unique(artist, album)
);

-- This trigger normalizes the tracks inserted into tmp_track
create or replace function sync_music_data()
    returns trigger as
$$
declare
    track_id int;
begin
    insert into music_track(artist, title, album)
        values(new.artist, new.title, new.album)
    on conflict(artist, title) do update
        set album = coalesce(excluded.album, old.album)
    returning id into track_id;

    insert into music_activity(track_id, created_at)
        values (track_id, new.created_at);

    delete from tmp_music where id = new.id;
    return new;
end;
$$
language 'plpgsql';

drop trigger if exists on_sync_music on tmp_music;
create trigger on_sync_music
    after insert on tmp_music
    for each row
    execute procedure sync_music_data();

-- (Optional) accessory view to easily peek the listened tracks
drop view if exists vmusic;
create view vmusic as
select t.id as track_id
     , t.artist
     , t.title
     , t.album
     , a.created_at
from music_track t
join music_activity a
on t.id = a.track_id;
```

Run the script on your database - if everything went smooth then all the tables
should be successfully created.

## Synchronizing your music activity

Now that all the dependencies are in place, it's time to configure the logic to
store your music activity to your database.

If most of your music activity happens through mpd/mopidy, then storing your
activity to the database is as simple as creating a hook on
[`NewPlayingTrackEvent`
events](https://docs.platypush.tech/platypush/events/music.html)
that inserts any new played track on `tmp_music`. Paste the following
content to a new Platypush user script (e.g.
`~/.config/platypush/scripts/music/sync.py`):

```python
# ~/.config/platypush/scripts/music/sync.py

from logging import getLogger

from platypush.context import get_plugin
from platypush.event.hook import hook
from platypush.message.event.music import NewPlayingTrackEvent

logger = getLogger('music_sync')

# SQLAlchemy connection string that points to your database
music_db_engine = 'postgresql+pg8000://dbuser:dbpass@dbhost/dbname'


# Hook that react to NewPlayingTrackEvent events
@hook(NewPlayingTrackEvent)
def on_new_track_playing(event, **_):
    track = event.track

    # Skip if the track has no artist/title specified
    if not (track.get('artist') and track.get('title')):
        return

    logger.info(
        'Inserting track: %s - %s',
        track['artist'], track['title']
    )

    db = get_plugin('db')
    db.insert(
        engine=music_db_engine,
        table='tmp_music',
        records=[
            {
                'artist': track['artist'],
                'title': track['title'],
                'album': track.get('album'),
            }
            for track in tracks
        ]
    )
```

Alternatively, if you also want to sync music activity that happens on
other clients (such as the Spotify/Tidal app or web view, or over mobile
devices), you may consider leveraging Last.fm. Last.fm (or its open alternative
Libre.fm) is a _scrobbling_ service compatible with most of the music
players out there. Both Spotify and Tidal support scrobbling, the [Android
app](https://apkpure.com/last-fm/fm.last.android) can grab any music activity
on your phone and scrobble it, and there are even [browser
extensions](https://chrome.google.com/webstore/detail/web-scrobbler/hhinaapppaileiechjoiifaancjggfjm?hl=en)
that allow you to keep track of any music activity from any browser tab.

So an alternative approach may be to send both your mpd/mopidy music activity,
as well as your in-browser or mobile music activity, to last.fm / libre.fm. The
corresponding hook would be:

```python
# ~/.config/platypush/scripts/music/sync.py

from logging import getLogger

from platypush.context import get_plugin
from platypush.event.hook import hook
from platypush.message.event.music import NewPlayingTrackEvent

logger = getLogger('music_sync')


# Hook that react to NewPlayingTrackEvent events
@hook(NewPlayingTrackEvent)
def on_new_track_playing(event, **_):
    track = event.track

    # Skip if the track has no artist/title specified
    if not (track.get('artist') and track.get('title')):
        return

    lastfm = get_plugin('lastfm')
    logger.info(
        'Scrobbling track: %s - %s',
        track['artist'], track['title']
    )

    lastfm.scrobble(
        artist=track['artist'],
        title=track['title'],
        album=track.get('album'),
    )
```

If you go for the scrobbling way, then you may want to periodically synchronize
your scrobble history to your local database - for example, through a cron that
runs every 30 seconds:

```python
# ~/.config/platypush/scripts/music/scrobble2db.py

import logging

from datetime import datetime

from platypush.context import get_plugin, Variable
from platypush.cron import cron

logger = logging.getLogger('music_sync')
music_db_engine = 'postgresql+pg8000://dbuser:dbpass@dbhost/dbname'

# Use this stored variable to keep track of the time of the latest
# synchronized scrobble
last_timestamp_var = Variable('LAST_SCROBBLED_TIMESTAMP')


# This cron executes every 30 seconds
@cron('* * * * * */30')
def sync_scrobbled_tracks(**_):
    db = get_plugin('db')
    lastfm = get_plugin('lastfm')

    # Use the last.fm plugin to retrieve all the new tracks scrobbled since
    # the last check
    last_timestamp = int(last_timestamp_var.get() or 0)
    tracks = [
        track for track in lastfm.get_recent_tracks().output
        if track.get('timestamp', 0) > last_timestamp
    ]

    # Exit if we have no new music activity
    if not tracks:
        return

    # Insert the new tracks on the database
    db.insert(
        engine=music_db_engine,
        table='tmp_music',
        records=[
            {
                'artist': track.get('artist'),
                'title': track.get('title'),
                'album': track.get('album'),
                'created_at': (
                    datetime.fromtimestamp(track['timestamp'])
                    if track.get('timestamp') else None
                ),
            }
            for track in tracks
        ]
    )

    # Update the LAST_SCROBBLED_TIMESTAMP variable with the timestamp of the
    # most recent played track
    last_timestamp_var.set(max(
        int(t.get('timestamp', 0))
        for t in tracks
    ))

    logger.info('Stored %d new scrobbled track(s)', len(tracks))
```

This cron will basically synchronize your scrobbling history to your local
database, so we can use the local database as the source of truth for the next
steps - no matter where the music was played from.

To test the logic, simply restart Platypush, play some music from your
favourite player(s), and check that everything gets inserted on the database -
even if we are inserting tracks on the `tmp_music` table, the listening history
should be automatically normalized on the appropriate tables by the triggered
that we created at initialization time.

## Updating the suggestions

Now that all the plumbing to get all of your listening history in one data
source is in place, let's move to the logic that recalculates the suggestions
based on your listening history.

We will again use the last.fm API to get tracks that are similar to those we
listened to recently - I personally find last.fm suggestions often more
relevant than those of Spotify's.

For sake of simplicity, let's map the database tables to some SQLAlchemy ORM
classes, so the upcoming SQL interactions can be notably simplified. The ORM
model can be stored under e.g. `~/.config/platypush/music/db.py`:

```python
# ~/.config/platypush/scripts/music/db.py

from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import sessionmaker, scoped_session

music_db_engine = 'postgresql+pg8000://dbuser:dbpass@dbhost/dbname'
engine = create_engine(music_db_engine)

Base = automap_base()
Base.prepare(engine, reflect=True)
Track = Base.classes.music_track
TrackActivity = Base.classes.music_activity
TrackSimilar = Base.classes.music_similar
DiscoveryPlaylist = Base.classes.music_discovery_playlist
DiscoveryPlaylistTrack = Base.classes.music_discovery_playlist_track
NewRelease = Base.classes.new_release


def get_db_session():
    session = scoped_session(sessionmaker(expire_on_commit=False))
    session.configure(bind=engine)
    return session()
```

Then create a new user script under e.g.
`~/.config/platypush/scripts/music/suggestions.py` with the following content:

```python
# ~/.config/platypush/scripts/music/suggestions.py

import logging

from sqlalchemy import tuple_
from sqlalchemy.dialects.postgresql import insert
from sqlalchemy.sql.expression import bindparam

from platypush.context import get_plugin, Variable
from platypush.cron import cron

from scripts.music.db import (
    get_db_session, Track, TrackActivity, TrackSimilar
)


logger = logging.getLogger('music_suggestions')

# This stored variable will keep track of the latest activity ID for which the
# suggestions were calculated
last_activity_id_var = Variable('LAST_PROCESSED_ACTIVITY_ID')


# A cronjob that runs every 5 minutes and updates the suggestions
@cron('*/5 * * * *')
def refresh_similar_tracks(**_):
    last_activity_id = int(last_activity_id_var.get() or 0)

    # Retrieve all the tracks played since the latest synchronized activity ID
    # that don't have any similar tracks being calculated yet
    with get_db_session() as session:
        recent_tracks_without_similars = \
            _get_recent_tracks_without_similars(last_activity_id)

    try:
        if not recent_tracks_without_similars:
            raise StopIteration(
                'All the recent tracks have processed suggestions')

        # Get the last activity_id
        batch_size = 10
        last_activity_id = (
            recent_tracks_without_similars[:batch_size][-1]['activity_id'])

        logger.info(
            'Processing suggestions for %d/%d tracks',
            min(batch_size, len(recent_tracks_without_similars)),
            len(recent_tracks_without_similars))

        # Build the track_id -> [similar_tracks] map
        similars_by_track = {
            track['track_id']: _get_similar_tracks(track['artist'], track['title'])
            for track in recent_tracks_without_similars[:batch_size]
        }

        # Map all the similar tracks in an (artist, title) -> info data structure
        similar_tracks_by_artist_and_title = \
            _get_similar_tracks_by_artist_and_title(similars_by_track)

        if not similar_tracks_by_artist_and_title:
            raise StopIteration('No new suggestions to process')

        # Sync all the new similar tracks to the database
        similar_tracks = \
            _sync_missing_similar_tracks(similar_tracks_by_artist_and_title)

        # Link listened tracks to similar tracks
        with get_db_session() as session:
            stmt = insert(TrackSimilar).values({
                'source_track_id': bindparam('source_track_id'),
                'target_track_id': bindparam('target_track_id'),
                'match_score': bindparam('match_score'),
            }).on_conflict_do_nothing()

            session.execute(
                stmt, [
                    {
                        'source_track_id': track_id,
                        'target_track_id': similar_tracks[(similar['artist'], similar['title'])].id,
                        'match_score': similar['score'],
                    }
                    for track_id, similars in similars_by_track.items()
                    for similar in (similars or [])
                    if (similar['artist'], similar['title'])
                    in similar_tracks
                ]
            )

            session.flush()
            session.commit()
    except StopIteration as e:
        logger.info(e)

    last_activity_id_var.set(last_activity_id)
    logger.info('Suggestions updated')


def _get_similar_tracks(artist, title):
    """
    Use the last.fm API to retrieve the tracks similar to a given
    artist/title pair
    """
    import pylast
    lastfm = get_plugin('lastfm')

    try:
        return lastfm.get_similar_tracks(
            artist=artist,
            title=title,
            limit=10,
        )
    except pylast.PyLastError as e:
        logger.warning(
            'Could not find tracks similar to %s - %s: %s',
            artist, title, e
        )


def _get_recent_tracks_without_similars(last_activity_id):
    """
    Get all the tracks played after a certain activity ID that don't have
    any suggestions yet.
    """
    with get_db_session() as session:
        return [
            {
                'track_id': t[0],
                'artist': t[1],
                'title': t[2],
                'activity_id': t[3],
            }
            for t in session.query(
                Track.id.label('track_id'),
                Track.artist,
                Track.title,
                TrackActivity.id.label('activity_id'),
            )
            .select_from(
                Track.__table__
                .join(
                    TrackSimilar,
                    Track.id == TrackSimilar.source_track_id,
                    isouter=True
                )
                .join(
                    TrackActivity,
                    Track.id == TrackActivity.track_id
                )
            )
            .filter(
                TrackSimilar.source_track_id.is_(None),
                TrackActivity.id > last_activity_id
            )
            .order_by(TrackActivity.id)
            .all()
        ]


def _get_similar_tracks_by_artist_and_title(similars_by_track):
    """
    Map similar tracks into an (artist, title) -> track dictionary
    """
    similar_tracks_by_artist_and_title = {}
    for similar in similars_by_track.values():
        for track in (similar or []):
            similar_tracks_by_artist_and_title[
                (track['artist'], track['title'])
            ] = track

    return similar_tracks_by_artist_and_title


def _sync_missing_similar_tracks(similar_tracks_by_artist_and_title):
    """
    Flush newly calculated similar tracks to the database.
    """
    logger.info('Syncing missing similar tracks')
    with get_db_session() as session:
        stmt = insert(Track).values({
            'artist': bindparam('artist'),
            'title': bindparam('title'),
        }).on_conflict_do_nothing()

        session.execute(stmt, list(similar_tracks_by_artist_and_title.values()))
        session.flush()
        session.commit()

        tracks = session.query(Track).filter(
            tuple_(Track.artist, Track.title).in_(
                similar_tracks_by_artist_and_title
            )
        ).all()

        return {
            (track.artist, track.title): track
            for track in tracks
        }
```

Restart Platypush and let it run for a bit. The cron will operate in batches of
10 items each (it can be easily customized), so after a few minutes your
`music_suggestions` table should start getting populated.

## Generating the discovery playlist

So far we have achieved the following targets:

- We have a piece of logic that synchronizes all of our listening history to a
  local database.
- We have a way to synchronize last.fm / libre.fm scrobbles to the same
  database as well.
- We have a cronjob that periodically scans our listening history and fetches
  the suggestions through the last.fm API.

Now let's put it all together with a cron that runs every week (or daily, or at
whatever interval we like) that does the following:

- It retrieves our listening history over the specified period.
- It retrieves the suggested tracks associated to our listening history.
- It excludes the tracks that we've already listened to, or that have already
  been included in previous discovery playlists.
- It generates a new discovery playlist with those tracks, ranked according to
  a simple score:

$$
\rho_i = \sum_{j \in L_i} m_{ij}
$$

Where \( \rho_i \) is the ranking of the suggested _i_-th suggested track, \(
L_i \) is the set of listened tracks that have the _i_-th track among its
similarities, and \( m_{ij} \) is the match score between _i_ and _j_ as
reported by the last.fm API.

Let's put all these pieces together in a cron defined in e.g.
`~/.config/platypush/scripts/music/discovery.py`:

```python
# ~/.config/platypush/scripts/music/discovery.py

import logging
from datetime import date, timedelta

from platypush.context import get_plugin
from platypush.cron import cron

from scripts.music.db import (
    get_db_session, Track, TrackActivity, TrackSimilar,
    DiscoveryPlaylist, DiscoveryPlaylistTrack
)

logger = logging.getLogger('music_discovery')


def get_suggested_tracks(days=7, limit=25):
    """
    Retrieve the suggested tracks from the database.

    :param days: Look back at the listen history for the past <n> days
        (default: 7).
    :param limit: Maximum number of track in the discovery playlist
        (default: 25).
    """
    from sqlalchemy import func

    listened_activity = TrackActivity.__table__.alias('listened_activity')
    suggested_activity = TrackActivity.__table__.alias('suggested_activity')

    with get_db_session() as session:
        return [
            {
                'track_id': t[0],
                'artist': t[1],
                'title': t[2],
                'score': t[3],
            }
            for t in session.query(
                Track.id,
                func.min(Track.artist),
                func.min(Track.title),
                func.sum(TrackSimilar.match_score).label('score'),
            )
            .select_from(
                Track.__table__
                .join(
                    TrackSimilar.__table__,
                    Track.id == TrackSimilar.target_track_id
                )
                .join(
                    listened_activity,
                    listened_activity.c.track_id == TrackSimilar.source_track_id,
                )
                .join(
                    suggested_activity,
                    suggested_activity.c.track_id == TrackSimilar.target_track_id,
                    isouter=True
                )
                .join(
                    DiscoveryPlaylistTrack,
                    Track.id == DiscoveryPlaylistTrack.track_id,
                    isouter=True
                )
            )
            .filter(
                # The track has not been listened
                suggested_activity.c.track_id.is_(None),
                # The track has not been suggested already
                DiscoveryPlaylistTrack.track_id.is_(None),
                # Filter by recent activity
                listened_activity.c.created_at >= date.today() - timedelta(days=days)
            )
            .group_by(Track.id)
            # Sort by aggregate match score
            .order_by(func.sum(TrackSimilar.match_score).desc())
            .limit(limit)
            .all()
        ]


def search_remote_tracks(tracks):
    """
    Search for Tidal tracks given a list of suggested tracks.
    """
    # If you use Spotify instead of Tidal, simply replacing `music.tidal`
    # with `music.spotify` here should suffice.
    tidal = get_plugin('music.tidal')
    found_tracks = []

    for track in tracks:
        query = track['artist'] + ' ' + track['title']
        logger.info('Searching "%s"', query)
        results = (
            tidal.search(query, type='track', limit=1).output.get('tracks', [])
        )

        if results:
            track['remote_track_id'] = results[0]['id']
            found_tracks.append(track)
        else:
            logger.warning('Could not find "%s" on TIDAL', query)

    return found_tracks


def refresh_discover_weekly():
    # If you use Spotify instead of Tidal, simply replacing `music.tidal`
    # with `music.spotify` here should suffice.
    tidal = get_plugin('music.tidal')

    # Get the latest suggested tracks
    suggestions = search_remote_tracks(get_suggested_tracks())
    if not suggestions:
        logger.info('No suggestions available')
        return

    # Retrieve the existing discovery playlists
    # Our naming convention is that discovery playlist names start with
    # "Discover Weekly" - feel free to change it
    playlists = tidal.get_playlists().output
    discover_playlists = sorted(
        [
            pl for pl in playlists
            if pl['name'].lower().startswith('discover weekly')
        ],
        key=lambda pl: pl.get('created_at', 0)
    )

    # Delete all the existing discovery playlists
    # (except the latest one). We basically keep two discovery playlists at the
    # time in our collection, so you have two weeks to listen to them before they
    # get deleted. Feel free to change this logic by modifying the -1 parameter
    # with e.g. -2, -3 etc. if you want to store more discovery playlists.
    for playlist in discover_playlists[:-1]:
        logger.info('Deleting playlist "%s"', playlist['name'])
        tidal.delete_playlist(playlist['id'])

    # Create a new discovery playlist
    playlist_name = f'Discover Weekly [{date.today().isoformat()}]'
    pl = tidal.create_playlist(playlist_name).output
    playlist_id = pl['id']

    tidal.add_to_playlist(
        playlist_id,
        [t['remote_track_id'] for t in suggestions],
    )

    # Add the playlist to the database
    with get_db_session() as session:
        pl = DiscoveryPlaylist(name=playlist_name)
        session.add(pl)
        session.flush()
        session.commit()

    # Add the playlist entries to the database
    with get_db_session() as session:
        for track in suggestions:
            session.add(
                DiscoveryPlaylistTrack(
                    playlist_id=pl.id,
                    track_id=track['track_id'],
                )
            )

        session.commit()

    logger.info('Discover Weekly playlist updated')


@cron('0 6 * * 1')
def refresh_discover_weekly_cron(**_):
    """
    This cronjob runs every Monday at 6 AM.
    """
    try:
        refresh_discover_weekly()
    except Exception as e:
        logger.exception(e)

        # (Optional) If anything went wrong with the playlist generation, send
        # a notification over ntfy
        ntfy = get_plugin('ntfy')
        ntfy.send_message(
            topic='mirrored-notifications-topic',
            title='Discover Weekly playlist generation failed',
            message=str(e),
            priority=4,
        )
```

You can test the cronjob without having to wait for the next Monday through
your Python interpreter:

```python
>>> import os
>>>
>>> # Move to the Platypush config directory
>>> path = os.path.join(os.path.expanduser('~'), '.config', 'platypush')
>>> os.chdir(path)
>>>
>>> # Import and run the cron function
>>> from scripts.music.discovery import refresh_discover_weekly_cron
>>> refresh_discover_weekly_cron()
```

If everything went well, you should soon see a new playlist in your collection
named _Discover Weekly [date]_. Congratulations!

## Release radar playlist

Another great feature of Spotify and Tidal is the ability to provide "release
radar" playlists that contain new releases from artists that we may like.

We now have a powerful way of creating such playlists ourselves though. We
previously configured Platypush to subscribe to the RSS feed from
newalbumreleases.net. Populating our release radar playlist involves the
following steps:

1. Creating a hook that reacts to [`NewFeedEntryEvent`
   events](https://docs.platypush.tech/platypush/events/rss.html) on this feed.
2. The hook will store new releases that match artists in our collection on the
   `new_release` table that we created when we initialized the database.
3. A cron will scan this table on a weekly basis, search the tracks on
   Spotify/Tidal, and populate our playlist just like we did for _Discover
   Weekly_.

Let's put these pieces together in a new user script stored under e.g.
`~/.config/platypush/scripts/music/releases.py`:

```python
# ~/.config/platypush/scripts/music/releases.py

import html
import logging
import re
import threading
from datetime import date, timedelta
from typing import Iterable, List

from platypush.context import get_plugin
from platypush.cron import cron
from platypush.event.hook import hook
from platypush.message.event.rss import NewFeedEntryEvent

from scripts.music.db import (
    music_db_engine, get_db_session, NewRelease
)


create_lock = threading.RLock()
logger = logging.getLogger(__name__)


def _split_html_lines(content: str) -> List[str]:
    """
    Utility method used to convert and split the HTML lines reported
    by the RSS feed.
    """
    return [
        l.strip()
        for l in re.sub(
            r'(</?p[^>]*>)|(<br\s*/?>)',
            '\n',
            content
        ).split('\n') if l
    ]


def _get_summary_field(title: str, lines: Iterable[str]) -> str | None:
    """
    Parse the fields of a new album from the feed HTML summary.
    """
    for line in lines:
        m = re.match(rf'^{title}:\s+(.*)$', line.strip(), re.IGNORECASE)
        if m:
            return html.unescape(m.group(1))


@hook(NewFeedEntryEvent, feed_url='https://newalbumreleases.net/category/cat/feed/')
def save_new_release(event: NewFeedEntryEvent, **_):
    """
    This hook is triggered whenever the newalbumreleases.net has new entries.
    """
    # Parse artist and album
    summary = _split_html_lines(event.summary)
    artist = _get_summary_field('artist', summary)
    album = _get_summary_field('album', summary)
    genre = _get_summary_field('style', summary)

    if not (artist and album):
        return

    # Check if we have listened to this artist at least once
    db = get_plugin('db')
    num_plays = int(
        db.select(
            engine=music_db_engine,
            query=
            '''
            select count(*)
            from music_activity a
            join music_track t
            on a.track_id = t.id
            where artist = :artist
            ''',
            data={'artist': artist},
        ).output[0].get('count', 0)
    )

    # If not, skip it
    if not num_plays:
        return

    # Insert the new release on the database
    with create_lock:
        db.insert(
            engine=music_db_engine,
            table='new_release',
            records=[{
                'artist': artist,
                'album': album,
                'genre': genre,
            }],
            key_columns=('artist', 'album'),
            on_duplicate_update=True,
        )


def get_new_releases(days=7):
    """
    Retrieve the new album releases from the database.

    :param days: Look at albums releases in the past <n> days
        (default: 7)
    """
    with get_db_session() as session:
        return [
            {
                'artist': t[0],
                'album': t[1],
            }
            for t in session.query(
                NewRelease.artist,
                NewRelease.album,
            )
            .select_from(
                NewRelease.__table__
            )
            .filter(
                # Filter by recent activity
                NewRelease.created_at >= date.today() - timedelta(days=days)
            )
            .all()
        ]


def search_tidal_new_releases(albums):
    """
    Search for Tidal albums given a list of objects with artist and title.
    """
    tidal = get_plugin('music.tidal')
    expanded_tracks = []

    for album in albums:
        query = album['artist'] + ' ' + album['album']
        logger.info('Searching "%s"', query)
        results = (
            tidal.search(query, type='album', limit=1)
            .output.get('albums', [])
        )

        if results:
            album = results[0]

            # Skip search results older than a year - some new releases may
            # actually be remasters/re-releases of existing albums
            if date.today().year - album.get('year', 0) > 1:
                continue

            expanded_tracks += (
                tidal.get_album(results[0]['id']).
                output.get('tracks', [])
            )
        else:
            logger.warning('Could not find "%s" on TIDAL', query)

    return expanded_tracks


def refresh_release_radar():
    tidal = get_plugin('music.tidal')

    # Get the latest releases
    tracks = search_tidal_new_releases(get_new_releases())
    if not tracks:
        logger.info('No new releases found')
        return

    # Retrieve the existing new releases playlists
    playlists = tidal.get_playlists().output
    new_releases_playlists = sorted(
        [
            pl for pl in playlists
            if pl['name'].lower().startswith('new releases')
        ],
        key=lambda pl: pl.get('created_at', 0)
    )

    # Delete all the existing new releases playlists
    # (except the latest one)
    for playlist in new_releases_playlists[:-1]:
        logger.info('Deleting playlist "%s"', playlist['name'])
        tidal.delete_playlist(playlist['id'])

    # Create a new releases playlist
    playlist_name = f'New Releases [{date.today().isoformat()}]'
    pl = tidal.create_playlist(playlist_name).output
    playlist_id = pl['id']

    tidal.add_to_playlist(
        playlist_id,
        [t['id'] for t in tracks],
    )


@cron('0 7 * * 1')
def refresh_release_radar_cron(**_):
    """
    This cron will execute every Monday at 7 AM.
    """
    try:
        refresh_release_radar()
    except Exception as e:
        logger.exception(e)
        get_plugin('ntfy').send_message(
            topic='mirrored-notifications-topic',
            title='Release Radar playlist generation failed',
            message=str(e),
            priority=4,
        )
```

Just like in the previous case, it's quite easy to test that it works by simply
running `refresh_release_radar_cron` in the Python interpreter. Just like in
the case of the discovery playlist, things will work also if you use Spotify
instead of Tidal - just replace the `music.tidal` plugin references with
`music.spotify`.

If it all goes as expected, you will get a new playlist named _New Releases
[date]_ every Monday with the new releases from artist that you have listened.

## Conclusions

Music junkies have the opportunity to discover a lot of new music today without
ever leaving their music app. However, smart playlists provided by the major
music cloud providers are usually implicit lock-ins, and the way they select
the tracks that should end up in your playlists may not even be transparent, or
even modifiable.

After reading this article, you should be able to generate your discovery and
new releases playlists, without relying on the suggestions from a specific
music cloud. This could also make it easier to change your music provider: even
if you decide to drop Spotify or Tidal, your music suggestions logic will
follow you whenever you decide to go.