# Create a Mastodon bot to forward Twitter and RSS feeds to your timeline
Take your favourite accounts and sources with you on the Fediverse, even if they aren't there
[//]: # (image: /img/twitter2mastodon.png)
Fabio Manganiello
Published: 2022-05-06
This article is divided into three sections:
1. A first section where I share some of my thoughts on the Fediverse, on the
trade-offs between centralized and decentralized social networks, and go
over a brief history of the protocols behind platforms like Mastodon.
2. A second section where I show, through a practical example built on
Platypush, how to set up a bot that brings your favorite Twitter profiles and
RSS feeds to your Fediverse timeline, even if they don't have an account there.
3. Some final observations on the current drawbacks of the Fediverse, with a
particular focus on Mastodon and the current state of relaying.
If you are just here for the code, feel free to skip to the _Creating a
cross-posting bot_ section and skip the last section. Otherwise, grab a coffee
while I go over some techno/philosophical analysis of social media in 2022, how
we got here and what the future may hold.
## Searching for a social safe harbor
My interest in the [Fediverse]( and
its ideas, protocols and products dates back more than a decade.
I've had an account on the [centralized Diaspora
instance]( more or less since the service was spawned
in 2010 until it shut down, even though I haven't updated it for the last
couple of years.
And I've been running a [Mastodon instance](
mainly dedicated to Platypush for a while. However, I haven't advertised it
much so far, since I haven't been spending much time on it myself until
My interest in the Fediverse used to be quite sporadic until recently. Yes, I
would rant a lot about Facebook/Meta, about the irresponsibility and greediness
rooted deep in its culture, their very hostile and opaque approach against
external researchers and auditors and the deeply flawed thirst for further
centralization that motivates each of its decisions. And, whenever I got too
sick of Facebook, I would just move my social tents to Twitter for a while.
Which is far from perfect, but it probably used to be the less poisonous of
the two necessary evils. As somebody who has been on alternative social
networks for more than a decade, I know way too well the feeling of excitement
when a new shiny toy comes to town, quickly followed by the rolling
That applies [until
I don't feel comfortable anymore sharing my thoughts and communications on a
platform owned by the richest man on earth, who also happens to be a chief
troll with distorted ideas about the balance between freedom of speech and
responsibilities for one's words.
So, just like [many other
did after Musk's takeover, I also rushed (back) to the Fediverse as a safe and
uncompromising solution. But, unlike the majority of them, instead of rushing
to []( (I don't like the idea of moving
from a centralized platform/instance to another), I rushed to upgrade and
prepare my dusty [](
## Give me back the old web
The whole idea of a Fediverse is as old as Facebook and Twitter themselves.
[](, launched in 2008, was
probably the first usable implementation of an open-source social network based
on [Activity Streams](,
an open syndication format drafted by the W3C to represent entities, accounts,
media, posts and more across several social platforms. Considering the time
when it was born, it was heavily influenced by the ideas of the semantic web that
were popular at the time (it's about
[that pre-crypto Web 3.0 that didn't
at least not in this universe's timeline).
[GNU Social]( followed in 2009 (and it's still
active today), then
[Diaspora]( in 2010
brought the world of alternative open-source social networks into the spotlight
for a while.
A lot of progress has happened since then.
[ActivityPub](, another open protocol
drafted by the W3C, has become a de-facto standard when it comes to sharing
content across different instances and platforms. And dozens of platforms
(including Mastodon itself, Pleroma, PeerTube, Pubcast, Hubzilla, NextCloud
Social, Friendica) currently support ActivityPub, making it possible for users
to follow, interact and share content regardless of where it is hosted.
Anybody can install and run a public instance using one of these platforms, and
anybody on that instance can follow and interact with other users, even if they
are on other platforms, as long as those instances are publicly searchable.
This is possible because the underlying protocols are the same, no matter who
runs the server or what application the server runs. If I have an account on a
Mastodon instance, I can use it to follow a video channel on a PeerTube
instance and comment on it. Even if they run on different machines and they run
different applications, the platforms are able to share content and ensure
federated authentication with one another, just like your web browser can be
used to render content from different web servers: as long as they speak the
same protocol (in this case, HTTP), a browser can render any content,
regardless of whether it comes from an Apache or a Tomcat server.
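
To make "speaking the same protocol" a bit more concrete: before a server can follow an account hosted somewhere else, it has to resolve an address like `user@domain` into an actor URL, and Mastodon-compatible servers do that through a WebFinger lookup. Here's a minimal sketch that only builds the lookup URL. The helper name and the sample address are made up for the example, and no request is actually sent.

```python
from urllib.parse import urlencode


def webfinger_url(address: str) -> str:
    """Build the WebFinger lookup URL for a `user@domain` address."""
    user, _, domain = address.lstrip('@').partition('@')
    if not user or not domain:
        raise ValueError(f'Not a valid user@domain address: {address!r}')

    # The account is passed as an `acct:` URI in the `resource` parameter
    query = urlencode({'resource': f'acct:{user}@{domain}'})
    return f'https://{domain}/.well-known/webfinger?{query}'


# Hypothetical account, for illustration only
print(webfinger_url('@someone@example.social'))
```

Any server that answers this request with a link to an ActivityPub actor can take part in the network, no matter which application it runs.
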
This is the way social networks should have been implemented from the very
beginning. Anybody can run one, it's up to admins of instances to decide which
other instances they want to _federate_ with (therefore importing traffic from
other instances into a unique _federated_ timeline), and it's up to individual
users to decide who they want to follow and therefore be part of their home
timeline, regardless of who runs the servers where those accounts are hosted.
It's an idea that sits somewhere between email (you can exchange emails with
anyone as long as you have their email address, even if you have a ``
account and they have a `` account, even if you use Thunderbird as
a client and they use a web app) and RSS feeds (you can aggregate links from
any source under the same interface, as long as that source provides an
RSS/Atom feed).
And that's indeed the trajectory that social networks were projected to follow
until the early 2010s. The W3C and ISO had worked feverishly on open protocols
that could make the social network experience open and distributed, like the
whole Internet had been designed to run up to that date. And implementations
such as, GNU Social and Diaspora were quickly popping up to showcase
those implementations.
But that's not how history went in this universe, as we all know.
Facebook underwent an exponential growth through aggressive centralization and
controversial data collection and monetization practices. Most of the
other social networks also followed the Facebook model.
Open chat protocols like XMPP were gradually replaced by centralized apps with
nearly no integrations with the outside world.
Open syndication protocols like RSS and Atom were replaced by closed timelines
curated by centralized and closely guarded algorithms. This was in part also
due to Google killing Reader, the most used interface for feeds, because it was
in the way of their idea of web content monetization: without a major player
like Google who had interest in the development of those open protocols,
innovation on RSS/Atom largely stalled.
Open activity pub/sub algorithms were replaced by a handful of walled gardens,
whose concept of "data portability" often involved manually downloading a
heavy, unsorted and often unusable zip dump of all of your data.
Transparent, machine-readable data access was replaced by proprietary user
interfaces, and a few half-heartedly implemented APIs that cover only part of
the features, and can be deprecated with nearly no notice depending on whatever
objective a private company decides to pursue in the short term.
I would argue that the aggressive push towards centralization, closed protocols
and walled gardens of the 2010s has only benefited a handful of private
companies, while throwing a wrench in a machinery that was already working
well, replacing it with a vision of the Web that created way more problems than
the ones that it aimed to solve. All in all, the 5-6 companies behind that
disaster named Web 2.0 are responsible for pushing the Web back by at least a
The wave however, as it always happens in that eternal swing between
centralization and decentralization that propels our industry, is changing. The
drawbacks of the centralized social network model have been under everyone's
eyes for the past few years. The "_you can check out any time you like, but you can
never leave, because all of your friends and relatives are here_" blackmail
strategy starts to be less effective, because alternatives are popping up, they
are starting to gain traction, and the bleeding of active users on Facebook and
Twitter has been a fact for at least the past two years.
Facebook is aware of it, but for some reason they believe that the solution to the
problems of centralized social networks is a creepy clone of
[SecondLife]( that they call Metaverse. Twitter is much
more aware of the issue, and they have in fact decided to speed up things with
their [Bluesky
They have recently published a [GitHub
repo]( with a simple MVP consisting of a
server, an in-memory database and a command-line interface, and a (still quite
vague) [architecture
document]( that
closely resembles the ActivityPub implementation, except with a more centralized
and hierarchical control chain with a (still vaguely defined)
consortium/committee sitting at its top, and a blockchain-like append-only
ledger to manage information.
I see Twitter's announcement as a reflex reaction to the bleeding of users
towards decentralized platforms that happened shortly after Musk's takeover. It
almost feels as if an engineer was rushed to push some MVP on their laptop to
show that they have a carrot they can give to their users. But it's too little,
too late.
There are nearly two decades of work behind ActivityPub. A lot of smart people
have already figured out the (open) solutions to most of the problems. I don't
see the value of reinventing the wheel through a solution owned by a private
company, with a private consortium behind it, that proposes a solution that is
largely incompatible with what the W3C has been working on since the mid-2000s.
And I don't trust the sincerity of Twitter and the BlueSky investors. If
Twitter was that interested in building a decentralized social network, then
where have they been for the past 15 years, and why haven't they contributed
more to open protocols like ActivityPub? What's the need for yet another
closed-access committee to design the future of social media when we already
have the W3C?
It sounds like they have preferred instead to milk their centralized,
closed-source and closed-protocol cow as long as they could (even when it was
clear that it wasn't profitable). They have built some hype around BlueSky for
the past two years that was all marketing talk and no architecture document
(let alone a usable codebase), and they have rushed to push a half-baked MVP
after the richest man on earth bought them and thousands of users opened
accounts somewhere else - and, most of all, a lot of people realized that
almost anybody can set up a social network server. The sudden
Twitter❤open-source and Twitter❤open-protocols shift is [quite
Whenever it happens, it's because a company in a monopoly/oligopoly-like market
has stopped growing, and the closed+centralized approach that made their
fortunes (and allowed them to make profits without innovating much) has become
too hard to maintain and scale. Whenever this happens, the company usually
displays a sudden burst of love for the open-source community, and it turns to
them for new ideas (and to write code for their products so their engineers
don't have to). They usually admit that the solutions proposed by the community
and the committees for standards were right all the time, but they usually
don't take responsibility for slowing down innovation by years while they
dragged their feet and milked their cows. However, they still want a chance of
running the show. They still want to lead the discussions around the new
platforms and protocols, or at least have a majority stake in them, so they can
more easily prepare the ground for the next step of the
cycle. Needless to say, we should play our roles so that such strategies stop
being successful.
## Is there anybody out there?
The open-source alternatives and the open protocols haven't succeeded in the
past decade not because their proposed solutions were technically inferior to
those provided by Facebook or Twitter. On the contrary, they had figured out
the solutions to the problems of distributed moderation, federated
authentication and cross-platform data exchange long before them.
They didn't succeed because it's hard to replicate the exponential snowball of
a true network effect once all the people are already using a certain platform.
Even if you pour a lot of time, money and resources into building an
alternative (like Google+ tried to do for a while), people are naturally
resistant to change, and it's just too hard to move them once all of their
contacts are on a single platform. Especially when social networks are owned by
private businesses that keep the barriers towards data portability artificially
high.
So, even with all the advantages of a federated network of instances, the two
titans still prevailed in an industry where the winner takes it all, and for a
long time Mastodon and Diaspora instances were deserts comparable to Google+ -
except for a few enthusiastic niches, and for a few active instances run from
places with strict social media limitations.
The wind has started to change [in April
And [the EU has also recently announced further
in enforcing their [vision for greater digital
After the early April diaspora I picked up my instance again, started following
some new interesting accounts and federating with some relays, and there's now
enough activity for me to use my Mastodon instance as my daily social driver.
Even if the scale of the Mastodon network (around 3-4 million users) still
pales in comparison to that of Facebook's empire, it starts to be a
considerable fraction of Twitter's active (human) user base.
However, even if many influential accounts have moved to Mastodon (or at least
they cross-post to Mastodon), such as [The
Guardian](, [Hacker
News]( and the [official EU News
channel](, there is still a big gap in terms of
accounts and content that are only available on Twitter/Facebook.
So I took some initiative, and decided that if the mountain doesn't come to me,
then I'll move it to me myself.
## Creating a cross-posting bot
There are a lot of amazing profiles to follow on the Fediverse, but you also
still miss a lot of the "official" accounts that make a timeline actually
stimulating. In my case, it's accounts of publications like the MIT Technology
Review, Quanta Magazine, Scientific American, IoT-4-All, The Gradient and The
Economist that really give me food for thought and make my social media
experience worth the effort of scrolling through memes and rants.
Those accounts are only on Twitter and Facebook for now, or maybe on some RSS
feed. But Platypush also provides integrations for [RSS
feeds]( and
[Mastodon]( So
a bot that brings our social newspaper to our new doormat is just a few lines
of code away.
Let's start by creating a new account on any Mastodon instance we like (if you
don't host one yourself, just make sure that you are aligned with the instance
admins and rules when it comes to bot activity). You can probably start your
adventure with a bot hosted on one of the largest platforms - e.g.
``/``. Specify username, email address and
password for your bot, confirm the email address, login with the bot account,
navigate to `Preferences` ⇛ `Development` ⇛ Create a `New Application`, give it
full access (`read`+`write`+`follow`+`push`) to the account, and copy the
provided `Access Token` - you'll need it soon.
![New application screenshot](../img/mastodon-screenshot-1.png)
It's also advised to navigate to `Profile` and tick the `This is a bot account`
box, so people on the network know that there's not a human behind it. You can
also provide a brief description of what profiles/feeds it mirrors so people
know what to expect.
![Bot account flag](../img/mastodon-screenshot-2.png)
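
Before wiring the token into Platypush it may be worth checking that it actually works. The sketch below is not part of the original setup: it relies on Mastodon's `GET /api/v1/accounts/verify_credentials` REST endpoint, while the function names and the placeholder values are my own. Only `fetch_bot_account` touches the network.

```python
import json
import urllib.request


def build_verify_request(base_url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request for Mastodon's `verify_credentials` endpoint."""
    url = base_url.rstrip('/') + '/api/v1/accounts/verify_credentials'
    return urllib.request.Request(
        url, headers={'Authorization': f'Bearer {access_token}'}
    )


def fetch_bot_account(base_url: str, access_token: str) -> dict:
    """Return the account that owns the token (performs a network call)."""
    request = build_verify_request(base_url, access_token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Example usage (placeholders: use your own instance URL and token):
# account = fetch_bot_account('https://some.mastodon.instance', 'YOUR-ACCESS-TOKEN')
# print(account['username'], 'bot flag:', account['bot'])
```

If the call fails with an HTTP 401 error, the token is probably wrong or has been revoked. The `bot` field in the response should be `true` once you have ticked the bot checkbox.
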
## The Platypush automation part
You can install and run the Platypush bot on any device, including a Raspberry
Pi or an old Android phone running [Termux](, as long as it
can run a UNIX-like system and it has HTTP access to the instance that hosts
your bot.
Install Python 3 and `pip` if they aren't installed already. Then install
Platypush with the `rss` integration:

```shell
[sudo] pip3 install 'platypush[rss]'
```

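Now create a configuration file under `~/.config/platypush/config.yaml` that
enables both integrations:

```yaml
mastodon:
  base_url: https://some.mastodon.instance
rss:
  poll_seconds: 300
```

The `Access Token` you copied earlier also goes in the `mastodon` section (as
`access_token`, assuming a recent Platypush release), and the feeds to follow
go under `rss.subscriptions`.
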
Twitter no longer supports RSS feeds for profiles or lists (so much again for
the "Twitter❤open protocols" narrative), and there's a multitude of (mostly
paid or freemium) services out there that currently bridge that gap.
Fortunately, the admins of `` still do a good job in bridging Twitter
timelines to RSS feeds, so in `rss.subscriptions` we use `` URLs as a
proxy to Twitter timelines.
> UPDATE: `` has got a lot of traffic lately, especially after the
> recent events at Twitter. So keep in mind that the main instance may not
> always be accessible. You can consider using other nitter instances, or, even
> better, run one yourself (Nitter is open-source and light enough to run on a
> Raspberry Pi).
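
If you mirror many profiles, you may prefer to generate the feed URLs instead of typing them. This is only a sketch: it assumes the usual Nitter URL layout (`<instance>/<handle>/rss`), and both the instance host and the handles below are hypothetical.

```python
def nitter_feed_url(instance: str, handle: str) -> str:
    """Return the RSS URL of a Twitter profile as mirrored by a Nitter instance."""
    return f"{instance.rstrip('/')}/{handle.lstrip('@')}/rss"


# Hypothetical instance and handles, for illustration only
instance = 'https://nitter.example.org'
for handle in ('@some_publication', 'another_account'):
    # Each printed line can be pasted under `rss.subscriptions`
    print(f'- {nitter_feed_url(instance, handle)}')
```

Switching to another Nitter instance, or to one that you host yourself, then only requires changing `instance`.
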
Now create a script under `~/.config/platypush/scripts` named e.g.
``. Its content can be something like the following:

```python
import logging
import re
import requests

from platypush.event.hook import hook
from platypush.message.event.rss import NewFeedEntryEvent
from platypush.utils import run

logger = logging.getLogger('rss2mastodon')
url_regex = re.compile(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')

# The host names were lost from this listing. The three values below are
# assumptions/placeholders: adjust them to the services that you use.
shortener_regex = re.compile(r'https?://t\.co/[a-zA-Z0-9]+')  # assumed: Twitter's shortener
proxy_host = 'nitter.example.org'  # hypothetical Nitter instance used in the feed URLs
origin_host = 'twitter.com'  # assumed: the site mirrored by the proxy


# Utility function to parse links content
def parse_bitly_link(link):
    rs = requests.get(link, allow_redirects=False, timeout=10)
    return rs.headers.get('Location', link)


# Run this hook when the application receives a `NewFeedEntryEvent`
@hook(NewFeedEntryEvent)
def sync_feeds_to_mastodon(event, **context):
    item_url = event.url or ''
    content = event.title or ''
    source_name = event.feed_title or item_url

    # Find and expand the shortened links
    bitly_links = set(shortener_regex.findall(content))
    for link in bitly_links:
        expanded_link = parse_bitly_link(link)
        content = content.replace(link, expanded_link)

    # Find all the referenced URLs
    referenced_urls = url_regex.findall(content)

    # Replace the proxy prefixes with the original host
    if f'{proxy_host}/' in item_url:
        item_url = item_url.replace(f'{proxy_host}/', f'{origin_host}/')
        source_name += f'@{origin_host}'  # assumed suffix format

    if item_url and content:
        content = f'Originally posted by {source_name}: {item_url}\n\n{content}'
    if referenced_urls:
        content = f'Referenced link: {referenced_urls[-1]}\n{content}'

    # Publish the status to Mastodon
    run('mastodon.publish_status', status=content)
    logger.info(f'The URL has been successfully cross-posted: {item_url}')
```

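
To get a feeling for what the bot will actually post, the text formatting done by the hook can be pulled into a pure function and tried without Platypush or a Mastodon account. This is just a sketch: `format_status` is a name I made up, the URL regex is a simplified stand-in for the one used in the script, and the sample values are invented.

```python
import re

# Simplified URL matcher (an assumption, not the regex used by the script)
url_regex = re.compile(r'https?://[^\s<>"]+')


def format_status(title: str, item_url: str, source_name: str) -> str:
    """Compose the text of a cross-post the same way the hook does."""
    content = title or ''
    referenced_urls = url_regex.findall(content)

    if item_url and content:
        content = f'Originally posted by {source_name}: {item_url}\n\n{content}'
    if referenced_urls:
        # Put the last referenced link on top of the post
        content = f'Referenced link: {referenced_urls[-1]}\n{content}'

    return content


print(format_status(
    title='New paper out https://example.org/paper',
    item_url='https://feeds.example.org/item/1',
    source_name='Example Feed',
))
```

Keeping this step separate from the hook also makes it easier to tweak the layout of the posts later on.
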
Now just start the service by running `platypush` as your local user.
The service will poll the configured RSS sources every five minutes (the
interval is configurable through `rss.poll_seconds` in `config.yaml`). When a
feed contains new items, a `NewFeedEntryEvent` is fired and your automation
will be triggered, resulting in a new toot from your bot account.
![Some cross-posts from a bot timeline](../img/mastodon-screenshot-3.png)
If you like, you can follow
[`crossbot`](, a Platypush-based
bot that uses the automation described in this article to cross-post several
Twitter accounts and RSS feeds to the `` Mastodon instance.
### Some performance considerations
Note that on the first execution the bot will start from an empty backlog, and
depending on the number of items in your feeds you may end up with lots of API
requests made to the instance. Depending on how large (and how bot-friendly)
the instance is, this may result either in a (small) DoS against the instance,
or your bot account being flagged/banned. A good idea may be to throttle the
number of posts that the bot publishes on every scan, especially on the first
one. A few solutions (and common sense considerations) can work:
- Start a [Python
`Timer`]( education/how to perform threading timer in python/)
when a new item is received, if a timer is not already running. Every time a
`NewFeedEntryEvent` is received, you can append the event to the queue, and
upon a selected timeout the queue will be flushed and the most recent `n`
items synchronized to Mastodon.

```python
from threading import Timer, RLock

from platypush.event.hook import hook
from platypush.message.event.rss import NewFeedEntryEvent

# How often we should synchronize the feeds, in seconds
flush_interval = 30
# Maximum number of items to be flushed per iteration
batch_size = 10
# Shared events cache
events_cache = []
# Current timer and its creation lock
feed_proc_timer = None
feed_proc_lock = RLock()


def feed_entries_publisher():
    with feed_proc_lock:
        # Only pick the most recent events
        events = sorted(
            filter(lambda e: e.published, events_cache),
            key=lambda e: e.published,
            reverse=True,
        )[:batch_size]
        # Reset the events cache
        events_cache.clear()

    for event in events:
        # Your event conversion and `mastodon.publish_status`
        # logic goes here
        pass


@hook(NewFeedEntryEvent)
def push_feed_item_to_queue(event, **context):
    global feed_proc_timer

    with feed_proc_lock:
        # Push the event to the cache
        events_cache.append(event)
        # Create and start a timer if it's not already running
        if not feed_proc_timer or not feed_proc_timer.is_alive():
            feed_proc_timer = Timer(flush_interval, feed_entries_publisher)
            feed_proc_timer.start()
```

- A producer/consumer solution can also work. Create a new hook upon
`ApplicationStartedEvent` that starts a thread that reads feed item events
from a queue and synchronizes them to your bot:

```python
from queue import Queue, Empty
from threading import Thread
from time import time

from platypush.event.hook import hook
from platypush.message.event.application import ApplicationStartedEvent
from platypush.message.event.rss import NewFeedEntryEvent

# How often the events should be flushed, in seconds
flush_interval = 30
# Maximum number of items to be flushed per iteration
batch_size = 10
# Shared events queue
events_queue = Queue()


def feed_entries_publisher():
    events_cache = []
    last_flush = time()

    while True:
        # Read an event from the queue
        try:
            events_cache.append(events_queue.get(timeout=1))
        except Empty:
            pass

        # Wait until the flush interval has elapsed
        if time() - last_flush < flush_interval:
            continue
        last_flush = time()

        # Only pick the most recent events
        events = sorted(
            filter(lambda e: e.published, events_cache),
            key=lambda e: e.published,
            reverse=True,
        )[:batch_size]

        for event in events:
            # Your event conversion and `mastodon.publish_status`
            # logic goes here
            pass

        # Reset the events cache
        events_cache.clear()


@hook(ApplicationStartedEvent)
def on_application_started(*_, **__):
    # Start the feed processing thread
    Thread(target=feed_entries_publisher, daemon=True).start()


@hook(NewFeedEntryEvent)
def push_feed_item_to_queue(event, **context):
    # Just push the event to the processor
    events_queue.put(event)
```

- A workaround for bootstrapping your bot could be to perform a _slow boot_.
Add one feed at a time to the configuration, and restart the service when
the latest feed has been synchronized, until all the items have been
After the first run the feeds' latest timestamps are updated and they won't be
reprocessed entirely upon restart. However, it's generally a good idea to keep
your bot light. If it posts too much, it may end up polluting many timelines, as
well as fill up a lot of storage space on many instances. So apply some common
sense: don't cross-post the whole of Twitter, or your cross-posting bot will not
add much value.
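
Another way to keep the bot light, not covered by the solutions above, is to enforce a minimum delay between two consecutive posts. Here's a minimal sketch of that idea. The `PostThrottle` class is hypothetical (it isn't part of Platypush), and the clock is injectable only so that the logic can be tried without waiting.

```python
import time
from typing import Callable, Optional


class PostThrottle:
    """Allow at most one post every `min_interval` seconds."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic):
        self.min_interval = min_interval
        self._clock = clock
        self._last_post: Optional[float] = None

    def try_acquire(self) -> bool:
        """Return True (and record the post) if enough time has passed."""
        now = self._clock()
        if self._last_post is not None and now - self._last_post < self.min_interval:
            return False

        self._last_post = now
        return True


# At most one post per minute
throttle = PostThrottle(min_interval=60)
print(throttle.try_acquire())  # First post: allowed
print(throttle.try_acquire())  # Immediately afterwards: rejected
```

In a publisher loop you would call `try_acquire()` before each `mastodon.publish_status` call, and leave the event in the queue or cache when it returns `False`.
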
## The advantages of a cross-posting bot
If used and configured responsibly, a cross-posting bot can vastly improve the
social experience on the Fediverse.
It brings relevant content shared on other platforms to the Fediverse, spinning
off discussions and interactions outside of the mainstream centralized
It's also a quick and efficient way to bootstrap your new instance. Many new
administrators are faced with a dilemma when it comes to kickstarting their
instances. Either they go the conventional slow way (advertise their instance
to increase their user base, and manually discover and follow accounts on other
instances in order to slowly populate the federated timeline, hoping that users
won't leave in the meantime), or they subscribe to one or more _relays_ (some
kind of _instance aggregators_ that bring traffic from multiple instances to
the federated timeline), just to be overwhelmed by an endless torrent of mostly
irrelevant toots that will quickly fill up their disk storage. Such a bot is an
efficient middle ground: it populates your instance with the content that you
want, it brings in some hashtags and links from Twitter that you may or may
not decide to boost on your instance, and it attracts people that are looking for
curated lists of content on the Fediverse.
## ...but the Fediverse isn't all that rosy either...
After so much praise for ActivityPub, Mastodon and its siblings, the time has
come to highlight some of their drawbacks.
I briefly mentioned _relays_ in the article, and that's not a coincidence.
Relays, if implemented, maintained and adopted properly, can be the killer
feature of the Fediverse. No more cold bootstrapping would be required for new
instances: as long as they share common interests and adhere to similar rules
as other instances, they can easily federate with one another by joining a
relay.
A relay is basically a server with a list of instance URLs. It subscribes to
the local timelines of the instances and it broadcasts their activities over
ActivityPub. Therefore, all the instances that are part of the same relay can
see all the public posts published on all the other instances in their
federated timeline.
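
The fan-out described above can be sketched with a toy in-memory model. To be clear, this is not how a real relay is implemented (there is no ActivityPub, no HTTP and no signature handling here), and the class and the instance URLs are invented for the example. It only shows who ends up seeing what.

```python
from collections import defaultdict


class ToyRelay:
    """In-memory model of a relay: fan out public posts to member instances."""

    def __init__(self):
        self.subscribers = set()
        # Federated timeline of each instance, as fed by the relay
        self.federated_timelines = defaultdict(list)

    def subscribe(self, instance_url: str) -> None:
        self.subscribers.add(instance_url)

    def broadcast(self, origin: str, post: str) -> None:
        """Deliver a public post from `origin` to every other member."""
        if origin not in self.subscribers:
            return  # Only posts coming from member instances are relayed

        for instance in self.subscribers - {origin}:
            self.federated_timelines[instance].append((origin, post))


relay = ToyRelay()
for url in ('https://a.example', 'https://b.example', 'https://c.example'):
    relay.subscribe(url)

relay.broadcast('https://a.example', 'Hello from instance A')
print(relay.federated_timelines['https://b.example'])
```

Note that even in this toy model every member receives every public post of every other member, with no filtering at all.
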
Amazing, isn't it? Except that, as of today, the experience with relays is far
from this vision of a curated and manageable aggregator of instances. There are
[only a few usable open-source relay
projects](, and most of them
are still in a beta/pre-production stage. Most of the URLs you find on Reddit
or on forums are no longer working. An up-to-date list of active relays is
[available here](, it includes about
40 nodes as of today, and after trying most of them I can tell that they fall
into three categories:
- About half of them will turn your timeline into an endless torrent of spam
and saturate your database. Most of them automatically accept any relay
requests, and with no inbound filter spammers can easily take over. Also,
with no clear mission/purpose/shared interests or languages, and poor
filtering by topics and languages provided by the platform, after relaying
you can expect your federated timeline to turn into a Babylon with all the
languages and topics in this world. My database storage inflated by ~40 MB
just a couple of minutes after joining the most populated relay.
- A third of the URLs point to servers that no longer seem to accept relay
requests, or with nearly no content.
- The remaining ~15% point to a couple of relays that actually push
not-so-spammy content in a manageable way.
At the time being I have joined those relays, but there's really no concept of
curation/aggregation yet at the current stage. To me, relays should be to
Fediverse instances what OPML is to RSS feeds and podcasts: a curated way to
aggregate sources that share common traits, not a chaotic party where everybody
is allowed to join. We don't seem to be at that stage yet.
It also doesn't help that the two main instances (`` and
``) aren't part of any relays. The only way to get posts from
the largest instances pumped into yours is to follow individual accounts. I
understand the challenges of having to moderate large-scale relays involving
the two official instances, but I also think that if we keep the largest
instances out of the relay game then we can't expect relaying to improve much.
On the contrary, I see the risk for things to evolve in a direction where large
instances don't have any incentive to join a relay, while relays are mostly
run by hobbyists and end up attracting a long tail of unfiltered and
non-curated traffic from all the other small instances. In such a scenario,
most of the people will simply open their accounts on the largest instances,
because that's where most of the things happen anyway. And then things will
just swing back towards centralization. That's why I don't get those who praise
decentralized social networks and then simply move to one of the two main
Mastodon instances. Supporting decentralization isn't just about migrating from
a large centralized platform to a smaller one. It's a much better idea to
support a smaller instance: it'll still act as a gateway to follow and interact
with anyone on the Fediverse anyway, while keeping the content really
All in all, however, I still believe that the Fediverse is the only possible
future for social media that is scalable, portable and transparent. The
current immature state of the relaying technology will probably be fixed one
iteration at a time. And, even if Mastodon turns out to be a new centralized
titan in the future, we can simply move our data and accounts to another
instance running another server, just like we would move a website from one
hosting service to another. Because, after all, data portability and
interoperability are what the web was supposed to be all about.