Migrated 6th article
parent
6a4a902dbd
commit
d14763d63a
2 changed files with 362 additions and 0 deletions
BIN
static/img/rss-1.jpeg
Normal file
Binary file not shown.
After Width: | Height: | Size: 78 KiB |
|
@ -0,0 +1,362 @@
|
||||||
|
[//]: # (title: Deliver articles to your favourite e-reader using Platypush)
|
||||||
|
[//]: # (description: Leverage the RSS and HTML scraping capabilities of Platypush to set up automations to deliver articles to an e-reader.)
|
||||||
|
[//]: # (image: /img/rss-1.jpeg)
|
||||||
|
[//]: # (published: 2019-12-04)
|
||||||
|
|
||||||
|
[RSS feeds](https://www.lifewire.com/what-is-an-rss-feed-4684568) are a largely underestimated feature of the web
|
||||||
|
nowadays — at least outside the circles of geeks. Many apps and paid services exist today to aggregate and curate news
|
||||||
|
from multiple sources, often delegating the task of selecting articles and order on the screen to an opaque algorithm,
|
||||||
|
and the world seems to have largely forgotten this two-decade-old technology that already solved the problem of news
|
||||||
|
curation and aggregation a while ago.
|
||||||
|
|
||||||
|
However, RSS (or Atom) feeds are far more widespread than many think - every single respectable news website provides
|
||||||
|
at least one feed, although some news outlets may not advertise them much for fear of losing organic traffic. Feeds
|
||||||
|
empower users with the possibility of creating their own news feeds and boards through aggregators, without relying on
|
||||||
|
the mercy of a cloud-run algorithm. And their structured nature (under the hood, an RSS feed is just structured XML)
|
||||||
|
offers the possibility to build automation pipelines that deliver the content we want wherever we want, whenever we
|
||||||
|
want, and in whichever format we want.
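To make the “structured XML” point concrete, here is a minimal sketch that extracts titles and links from an RSS 2.0 document using only the Python standard library. The sample feed and the `parse_items` helper are made up for illustration; they aren’t part of Platypush, which handles feed parsing on its own.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 document used only for illustration
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First article</title>
      <link>https://example.com/articles/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/articles/2</link>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return a list of {title, link} dicts, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for entry in parse_items(SAMPLE_FEED):
        print(entry["title"], "->", entry["link"])
```

Atom feeds use different element names and XML namespaces, so a real aggregator needs a bit more care than this sketch shows.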
|
||||||
|
|
||||||
|
[IFTTT](https://ifttt.com) is a popular option to build custom logic on RSS feeds. It makes it very intuitive to build
|
||||||
|
relatively complex rules such as “send me a weekly digest with The Economist articles published in the latest issue” or
|
||||||
|
“send a Telegram message with the digest from the NYT every day at 6 a.m.” or “send a notification to my mobile whenever
|
||||||
|
XKCD publishes new comics.” However, IFTTT has recently pivoted to become
|
||||||
|
a [paid service](https://thenextweb.com/apps/2020/09/10/ifttt-introduces-a-paid-plan-reduces-free-usage-to-3-applets/)
|
||||||
|
with very limited possibility for free users to create new applets.
|
||||||
|
|
||||||
|
In my opinion, however, it’s thanks to internet-connected e-readers, such as the Kindle or MobiScribe, as well as web
|
||||||
|
services like Mercury and Instapaper that can convert a web page into a clean print-friendly format, that RSS feeds can
|
||||||
|
finally shine at their full brightness.
|
||||||
|
|
||||||
|
It’s great to have our news sources neatly organized in an aggregator. It’s also nice to have the possibility to
|
||||||
|
configure push notifications upon the publication of new articles or daily/weekly/monthly digests delivered whenever we
|
||||||
|
like.
|
||||||
|
|
||||||
|
But these features solve only the first part of the problem — the content distribution. The second part of the problem —
|
||||||
|
content consumption — comes when we click on a link, delivered on whichever device and in whichever format we like, and
|
||||||
|
we start reading the actual article.
|
||||||
|
|
||||||
|
Such an experience nowadays happens mostly on laptop screens or, worse, tiny smartphone screens, where we’re expected to
|
||||||
|
hectically scroll through content that often isn’t optimized for mobile and is filled with ads and paywalls, while a myriad of other
|
||||||
|
notifications demand their share of our attention. Reading lengthy content on a smartphone screen is arguably as bad
|
||||||
|
of an experience as browsing the web on a Kindle is.
|
||||||
|
|
||||||
|
Wouldn’t it be great if we could get our favorite content automatically delivered to our favorite reading device,
|
||||||
|
properly formatted and in a comfortably readable size and without all the clutter and distractions? And without having a
|
||||||
|
backlit screen always in front of our eyes?
|
||||||
|
|
||||||
|
In this piece, we’ll see how to do this by using several technological tools (an e-reader, a Kindle account, the Mercury
|
||||||
|
API, and Instapaper) and how to glue all the pieces together through Platypush.
|
||||||
|
|
||||||
|
## Configure your Kindle account for e-mail delivery
|
||||||
|
|
||||||
|
I’ll assume in this first section you have a Kindle, a linked Amazon account, and a Gmail account that we’ll use to
|
||||||
|
programmatically send documents to the device via email - although it's also possible to leverage the [`mail.smtp`](https://platypush.readthedocs.io/en/latest/platypush/plugins/mail.smtp.html)
|
||||||
|
plugin and use another domain for delivering PDFs. We’ll also see later how to leverage Instapaper with other devices.
|
||||||
|
|
||||||
|
First, you’ll have to create an email address associated with your Kindle that’ll be used to remotely deliver documents:
|
||||||
|
|
||||||
|
- Head to the [Amazon content and device portal](https://amazon.com/mycd), and log in with your Amazon account.
|
||||||
|
|
||||||
|
- Click on the second tab (“Your Devices”), and click on the context menu next to the device where your content should
|
||||||
|
be delivered.
|
||||||
|
|
||||||
|
- You’ll see the email address associated with your device. Copy it, or click on “Edit” to change it.
|
||||||
|
|
||||||
|
- Click on the third tab (“Settings”), and scroll to the bottom to the section titled “Personal Document Settings.”
|
||||||
|
|
||||||
|
- Scroll to the bottom to the section named “Approved Personal Document E-mail List” and add your Gmail address as a
|
||||||
|
trusted source.
|
||||||
|
|
||||||
|
To check that everything works, you can now try and send a PDF document to your Kindle from your personal email address.
|
||||||
|
If the device is connected to WiFi, then the document should automatically download within a few seconds.
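If you’d rather script this check than use a mail client, the sketch below builds a message with a PDF attachment using Python’s standard library and sends it through Gmail’s SMTP endpoint. The function names, the addresses, and the use of an app password are assumptions for illustration; your account may require a different authentication setup.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_kindle_message(sender, kindle_address, pdf_path):
    """Build an email with the given PDF attached (nothing is sent here)."""
    pdf = Path(pdf_path)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = kindle_address
    msg["Subject"] = "Test document"
    msg.set_content("Test delivery to the Kindle.")
    msg.add_attachment(
        pdf.read_bytes(),
        maintype="application",
        subtype="pdf",
        filename=pdf.name,
    )
    return msg


def send_message(msg, username, app_password):
    """Send the message over Gmail's SMTP-over-SSL endpoint.

    Assumption: the account allows SMTP logins with an app password.
    """
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(username, app_password)
        smtp.send_message(msg)
```

The sender must be the address you added to the approved list, otherwise Amazon will reject the document.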
|
||||||
|
|
||||||
|
## Configure Platypush
|
||||||
|
|
||||||
|
Platypush offers all the ingredients we need for the purpose of this piece. We need, in particular, to build an
|
||||||
|
automation pipeline that:
|
||||||
|
|
||||||
|
- Periodically checks a list of RSS sources for new content
|
||||||
|
- Preprocesses the new items by simplifying the web page (through the Mercury parser or Instapaper) and optionally
|
||||||
|
exports them to PDF
|
||||||
|
- Programmatically sends emails to your device(s) with the new content
|
||||||
|
|
||||||
|
First, install Platypush with the required extras (any device with any compatible OS will do: a Raspberry Pi, an unused
|
||||||
|
laptop, or a remote server):
|
||||||
|
|
||||||
|
```shell
pip install 'platypush[http,pdf,rss,google]'
```
|
||||||
|
|
||||||
|
You’ll also need to install `npm` and `mercury-parser`. Postlight used to provide a web API for its parser, but
|
||||||
|
[they’ve discontinued it, choosing to make the project open-source](https://postlight.com/trackchanges/mercury-goes-open-source):
|
||||||
|
|
||||||
|
```shell
# Supposing you're on Debian or Debian-derived OS
apt-get install nodejs npm
npm install @postlight/mercury-parser
```
|
||||||
|
|
||||||
|
Second, link Platypush to your Gmail account to send documents via email:
|
||||||
|
|
||||||
|
- Create a new project on the [Google Developers Console](https://console.developers.google.com/).
|
||||||
|
- Click on “Credentials” in the context menu, then select “OAuth Client ID” to create new credentials.
|
||||||
|
- Once generated, you can see your new credentials in the “OAuth 2.0 client IDs” section. Click on the “Download” icon
|
||||||
|
to save them to a JSON file.
|
||||||
|
- Copy the file to your Platypush device/server under e.g., `~/.credentials/client_secret.json`.
|
||||||
|
- Run the following command on the device to authorize the application:
|
||||||
|
|
||||||
|
```shell
python -m platypush.plugins.google.credentials \
    "https://www.googleapis.com/auth/gmail.modify" \
    ~/.credentials/client_secret.json \
    --noauth_local_webserver
```
|
||||||
|
|
||||||
|
- Copy the link into your browser; log in with your Google account, if required; and authorize the application.
|
||||||
|
|
||||||
|
Now that you’ve got everything in place, it’s time to configure Platypush to process your favorite feeds.
|
||||||
|
|
||||||
|
## Create a rule to automatically send articles to your Kindle
|
||||||
|
|
||||||
|
The [`http.poll`](https://platypush.readthedocs.io/en/latest/platypush/backend/http.poll.html) backend is a flexible
|
||||||
|
component that can be configured to poll and process updates from many web resources — JSON, RSS, Atom etc.
|
||||||
|
|
||||||
|
Suppose you want to check for updates
|
||||||
|
on [The Daily](https://www.nytimes.com/2018/07/16/podcasts/the-daily/how-do-i-listen-to-the-daily.html) RSS feed twice a
|
||||||
|
day and deliver a digest with the new content to your Kindle.
|
||||||
|
|
||||||
|
You’ll want to create a configuration like this in `~/.config/platypush/config.yaml`:
|
||||||
|
|
||||||
|
```yaml
backend.http.poll:
    requests:
        # This poll will handle an RSS feed
        - type: platypush.backend.http.request.rss.RssUpdates
          # RSS feed URL and title
          url: http://feeds.podtrac.com/zKq6WZZLTlbM
          title: NYT - The Daily
          # How often we want to check for updates
          # 12h = 43200 secs
          poll_seconds: 43200
          # We want to convert content to PDF
          digest_format: pdf
          # We want to parse and extract the content from
          # the web page using Mercury Parser
          extract_content: True
```
|
||||||
|
|
||||||
|
Create an event hook under `~/.config/platypush/scripts/` that reacts to a `NewFeedEvent` and sends the processed
|
||||||
|
content to your Kindle via email:
|
||||||
|
|
||||||
|
```python
from platypush.event.hook import hook
from platypush.utils import run

from platypush.message.event.http.rss import NewFeedEvent

@hook(NewFeedEvent)
def on_new_feed_digest(event, **context):
    run('google.mail.compose',
        sender='you@gmail.com',
        to='your-kindle@kindle.com',
        subject=f'{event.title} feed digest',
        body=f'Your {event.title} feed digest delivered to your e-reader',
        files=[event.digest_filename])
```
|
||||||
|
|
||||||
|
Restart Platypush. As soon as the application finds items in the target feed that haven’t yet been processed, it’ll
|
||||||
|
parse them, convert them to PDF, trigger a `NewFeedEvent` that’ll be captured by your hook, and the resulting PDF will
|
||||||
|
be delivered to your Kindle.
|
||||||
|
|
||||||
|
You can add more monitored RSS sources by simply adding more items in the `requests` attribute of the `http.poll`
|
||||||
|
backend. Now enjoy reading your articles from a proper screen, delivered directly to your e-reader once or twice a day —
|
||||||
|
tiny smartphone screens, paywalls, pop-ups, and ads feel so much more old-fashioned once you dive into this new
|
||||||
|
experience.
|
||||||
|
|
||||||
|
## Sharing content to your e-reader from your mobile on the fly
|
||||||
|
|
||||||
|
RSS feeds are awesome, but they aren’t the only way we discover and consume content today.
|
||||||
|
|
||||||
|
Many times we scroll through our favorite social-media timeline, bump into an interesting article, start reading it on
|
||||||
|
our tiny screen, and we’d like to keep reading it later when we are on a bigger screen.
|
||||||
|
|
||||||
|
Several tools and products have sprung up to provide a solution to the “parse it, save it, and read it later” problem,
|
||||||
|
among those [Evernote](https://evernote.com/), [Pocket](https://getpocket.com/),
|
||||||
|
and [Instapaper](https://instapaper.com/) itself.
|
||||||
|
|
||||||
|
Most of them, however, are still affected by the same issue: Either they don’t do a good job at actually parsing and
|
||||||
|
extracting the content in a more readable format (except for Instapaper — Pocket only saves a link to the original
|
||||||
|
content, while Evernote’s content-parsing capabilities have quite some room for improvement, to say the least), or
|
||||||
|
they’re still bound to the backlit screen of the smartphone or computer that runs them.
|
||||||
|
|
||||||
|
Wouldn’t it be cool to bump into an interesting article while we scroll our Facebook timeline on our Android device and
|
||||||
|
with a single click deliver it to our Kindle in a nice and readable format? Let’s see how to implement such a rule in
|
||||||
|
Platypush.
|
||||||
|
|
||||||
|
First, we’ll need something that runs on our mobile device to programmatically communicate with the instance of
|
||||||
|
Platypush installed on our Raspberry/computer/server.
|
||||||
|
|
||||||
|
I consider [Tasker](https://tasker.joaoapps.com/) one of the best applications suited for this purpose: with Tasker (and
|
||||||
|
the other related apps developed by joaoapps), it’s possible to automate anything on your Android device and create
|
||||||
|
sophisticated rules that connect it to anything.
|
||||||
|
|
||||||
|
There are many ways for Tasker to communicate with Platypush (direct RPC over HTTP calls, using Join with an external
|
||||||
|
MQTT server to dispatch messages, using an intermediate IFTTT hook, or Pushbullet, etc.), and there are many ways for
|
||||||
|
Platypush to communicate back to Tasker on your mobile device (using [AutoRemote](https://joaoapps.com/autoremote/) with
|
||||||
|
the [Platypush plugin](https://platypush.readthedocs.io/en/latest/platypush/plugins/autoremote.html) to send custom
|
||||||
|
events, using IFTTT with any service connected to your mobile, using the [Join API](https://joaoapps.com/join/api/), or,
|
||||||
|
again, Pushbullet).
|
||||||
|
|
||||||
|
We’ll use Pushbullet in this piece because it doesn’t require as many configuration steps as other techniques.
|
||||||
|
|
||||||
|
- Install [Tasker](https://tasker.joaoapps.com/), [AutoShare](https://joaoapps.com/autoshare/),
|
||||||
|
and [Pushbullet](https://pushbullet.com/) on your Android device.
|
||||||
|
|
||||||
|
- Go to your Pushbullet account page, and click “Create Access Token” to create a new access token that’ll be used by
|
||||||
|
Platypush to listen for the messages sent to your account. Enable the Pushbullet plugin and backend on Platypush by
|
||||||
|
adding these lines to `~/.config/platypush/config.yaml`:
|
||||||
|
|
||||||
|
```yaml
backend.pushbullet:
    token: YOUR-TOKEN
    device: platypush-device

pushbullet:
    enabled: True
```
|
||||||
|
|
||||||
|
Also add a procedure to `~/.config/platypush/scripts` that, given a URL as input, extracts the content, converts it to
|
||||||
|
PDF, and sends it to your Kindle:
|
||||||
|
|
||||||
|
```python
import re

from platypush.procedure import procedure
from platypush.utils import run

@procedure
def send_web_page_to_kindle(url, **context):
    # Some apps don't share only the link, but also some
    # text such as "I've found this interesting article
    # on XXX". The following action strips out extra content
    # from the input and only extracts the URL.
    url = re.sub(r"^.*(https?://[^\s]*).*", r"\1", url)

    # Extract the content through the Mercury SDK and generate a PDF
    outfile = '/tmp/extract.pdf'
    response = run('http.webpage.simplify', url=url, outfile=outfile)
    title = response.get('title')

    # Rename the file to match the title of the page
    if title:
        new_outfile = f'/tmp/{title}.pdf'
        run('file.rename', file=outfile, name=new_outfile)
        outfile = new_outfile

    # Send the file to your Kindle email address
    run('google.mail.compose',
        sender='you@gmail.com',
        to='your-kindle@kindle.com',
        subject=f'{title or "[No Title]"} feed digest',
        body=f'Original URL: {url}',
        files=[outfile])

    # Remove the temporary file
    run('file.unlink', file=outfile)
```
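Two steps of the procedure above are worth trying in isolation: pulling the URL out of the shared text, and turning the page title into a file name. The sketch below is one possible variant, not what Platypush does internally. `extract_url` and `safe_filename` are hypothetical helpers. Unlike the `re.sub` call above, `extract_url` returns `None` when no link is found, and `safe_filename` replaces characters such as `/` that would otherwise break the path under `/tmp`.

```python
import re


def extract_url(shared_text):
    """Return the first http(s) URL found in the shared text, or None."""
    match = re.search(r"https?://\S+", shared_text)
    return match.group(0) if match else None


def safe_filename(title, default="extract"):
    """Replace characters that are awkward in file names with underscores."""
    cleaned = re.sub(r"[^\w\- ]+", "_", title or "").strip(" _")
    return cleaned or default


if __name__ == "__main__":
    text = "I've found this interesting article https://example.com/a?b=1 enjoy"
    print(extract_url(text))
    print(safe_filename("AC/DC: live") + ".pdf")
```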
|
||||||
|
|
||||||
|
- Restart Platypush, and check from Pushbullet that your new virtual device, platypush-device in the example above, has
|
||||||
|
been created.
|
||||||
|
|
||||||
|
- On your mobile, open AutoShare, select “Manage Commands,” and create a new command named, for example, *Send to
|
||||||
|
Kindle*.
|
||||||
|
|
||||||
|
- In the task associated with this trigger, tap the plus icon to add a new action, and select “Push a notification” (the
|
||||||
|
action with the green Pushbullet icon next to it)
|
||||||
|
|
||||||
|
- Select “platypush-device” as a target device, and paste the following JSON as message:
|
||||||
|
|
||||||
|
```json
{"type":"request", "action":"procedure.send_web_page_to_kindle", "args": {"url":"%astext"}}
```
|
||||||
|
|
||||||
|
- In the example above, `%astext` is a special variable in Tasker that contains the text shared by the source app (in
|
||||||
|
this case, the link sent to AutoShare).
|
||||||
|
|
||||||
|
- Open your browser, and go to the web link of an article you’d like to send to your Kindle. Select Share > AutoShare
|
||||||
|
command > Send to Kindle.
|
||||||
|
|
||||||
|
- The parsed article should be delivered to your e-reader in an optimized PDF format within seconds.
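The request message used in the steps above has a simple shape: a `type`, an `action`, and an `args` object. If you want to generate it from a script rather than typing it by hand, a minimal sketch could look like this. `build_request` is a hypothetical helper, not a Platypush function. Serializing through `json.dumps` also escapes any quotes in the shared text, which a hand-written template wouldn’t do.

```python
import json


def build_request(procedure_name, **args):
    """Serialize a request message that calls the given procedure."""
    return json.dumps({
        "type": "request",
        "action": f"procedure.{procedure_name}",
        "args": args,
    })


if __name__ == "__main__":
    print(build_request("send_web_page_to_kindle",
                        url="https://example.com/article"))
```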
|
||||||
|
|
||||||
|
## Using Instapaper on other Android-based e-readers
|
||||||
|
|
||||||
|
I’ve briefly mentioned Instapaper already. I really love both the service and the app. I consider it, in a way, an
|
||||||
|
implementation of what Evernote should have been but has never been.
|
||||||
|
|
||||||
|
Just browse to an article on the web, click “Share to Instapaper,” and within one click, that web page will be parsed
|
||||||
|
into a readable format, with all the clutter and ads removed, and it’ll be added to your account.
|
||||||
|
|
||||||
|
What makes Instapaper really interesting, though, is the fact that its Android app is really minimal (yet extremely well
|
||||||
|
designed), and it also runs well on devices that run older versions of Android or aren’t that powerful.
|
||||||
|
|
||||||
|
That wouldn’t be such a big deal in itself if products like
|
||||||
|
the [MobiScribe](https://www.indiegogo.com/projects/mobiscribe-the-e-ink-notepad#/) weren’t slowly hitting the market —
|
||||||
|
and I hope its example will be followed by others. The MobiScribe can be used both as an e-reader and as an e-ink
|
||||||
|
notepad, but what really makes it interesting is that it runs Android — even if it’s
|
||||||
|
an [ancient Android Kit-Kat modified release](https://goodereader.com/blog/reviews/mobiscribe-e-reader-review-a-great-first-effort),
a more recent version should arrive sooner or later.
|
||||||
|
|
||||||
|
The presence of an Android OS is what makes this e-reader/tablet much more interesting than other similar products -
|
||||||
|
like [reMarkable](https://remarkable.com/), which has better specs, looks better, and costs more, but has opted instead to
|
||||||
|
use its own OS, limiting the possibilities to run any apps other than those developed by the company itself. Even if
|
||||||
|
it’s an old version of Android that runs on an underpowered device, it’s still possible to install some apps on it — and
|
||||||
|
Instapaper is one of them.
|
||||||
|
|
||||||
|
It makes it very easy to enhance your reading experience: Simply browse the web, add articles to your Instapaper
|
||||||
|
account, and deliver them on the fly to your e-reader. If you want, you can also use
|
||||||
|
the [Instapaper API](https://www.instapaper.com/api/simple) in Platypush to programmatically send content to your
|
||||||
|
Instapaper account instead of your Kindle. Just create a procedure like this:
|
||||||
|
|
||||||
|
```python
from platypush.procedure import procedure
from platypush.utils import run

@procedure
def instapaper_add(url, **context):
    run('http.request.get', url='https://www.instapaper.com/api/add',
        params={
            'url': url,
            'username': 'your_instapaper_username',
            'password': 'your_instapaper_password',
        })
```
|
||||||
|
|
||||||
|
I know what you're thinking - the idea of sending my credentials for a web service over a GET request gives me shivers as
|
||||||
|
well - but Instapaper has only recently [developed an OAuth-based API](https://www.instapaper.com/api) and I haven't
|
||||||
|
yet managed to implement it in Platypush.
|
||||||
|
|
||||||
|
This procedure is now callable through a simple JSON request:
|
||||||
|
|
||||||
|
```json
{"type":"request", "action":"procedure.instapaper_add", "args": {"url":"https://custom-url/article"}}
```
|
||||||
|
|
||||||
|
If you prefer this method over the Kindle-over-email way, you can just call this procedure in the examples above to
|
||||||
|
parse the content of the page and save it to your Instapaper account instead of sending an email to your Kindle address.
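On the credentials concern raised above: one way to at least keep the username and password out of the query string is to send them in an `Authorization` header on a POST request. The sketch below only builds such a request with the standard library. It assumes the Instapaper Simple API accepts HTTP Basic authentication and POST on the same `/api/add` endpoint, so check the API documentation before relying on it. `build_instapaper_request` is a hypothetical helper, not part of Platypush.

```python
import base64
import urllib.parse
import urllib.request


def build_instapaper_request(url, username, password):
    """Build (but don't send) a POST request with Basic auth credentials.

    Assumption: the Simple API accepts Basic auth on /api/add.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    data = urllib.parse.urlencode({"url": url}).encode()
    return urllib.request.Request(
        "https://www.instapaper.com/api/add",
        data=data,
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_instapaper_request(
        "https://example.com/article",
        "your_instapaper_username",
        "your_instapaper_password",
    )
    # Sending is left commented out so the sketch has no side effects:
    # with urllib.request.urlopen(req) as response:
    #     print(response.status)
    print(req.get_method(), req.full_url)
```

The credentials still travel with every request, only over HTTPS and outside the URL, so they are less likely to end up in server logs or browser history.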
|
||||||
|
|
||||||
|
## Conclusions
|
||||||
|
|
||||||
|
The amount of information and news channels available on the web has increased exponentially in recent years, but the
|
||||||
|
methods to distribute and consume such content, at least when it comes to flexibility, haven’t improved much. The
|
||||||
|
exponential growth of social media and platforms like Google News means a few large companies nowadays decide which
|
||||||
|
content should appear in front of your eyes, how that content should be delivered to you, and where you can consume it.
|
||||||
|
|
||||||
|
Technology should be about creating more opportunities and flexibility, not reducing them, so such a dramatic
|
||||||
|
centralization shouldn’t be acceptable for a power user. Luckily, decades-old technologies like RSS feeds can come to
|
||||||
|
the rescue, allowing us to tune what we want to read and build automation pipelines that distribute the content wherever
|
||||||
|
and whenever we like.
|
||||||
|
|
||||||
|
Also, e-readers are becoming more and more pervasive, thanks also to the drop in the price of e-ink displays in the last
|
||||||
|
few years and to more companies and products entering the market. Automating the delivery of web content to e-readers
|
||||||
|
can really create a new and more comfortable way to stay informed — and helps us find another great use case for our
|
||||||
|
Kindle, other than downloading novels to read on the beach.
|