RSS feeds are amazing! Many of us have built entire apps, services and businesses around them. The specifications have existed for a long time, and newer ones like JSON Feed are powering new workflows around them.
Early in the previous decade, websites began injecting ads and trackers into their RSS feeds. Those were easy for Elytra to get around.
However, soon after the rise of ad blockers and similar technologies, more and more websites wanted people to open their web pages from their RSS feeds. This led to websites providing only truncated excerpts in their feeds.
This became a legitimate problem soon after social media giants began fiddling with users’ feeds, turning a timeline into a list of suggestions, like the coupons section of a local newspaper ¯\_(ツ)_/¯.
To solve this problem, a lot of genius services like Postlight’s Mercury popped up. They provided a huge boost to feed reader apps (and many others). Almost every app I used before Elytra used it. Elytra used it up till version 2022.2.4.
However, Postlight later made Mercury an open-source project, and it hasn’t received much attention from the original authors or the community since 2019. I tried to add my own patches to it, but it became tedious to maintain in its current form without a complete rewrite against the latest tooling and Node.js versions.
Since I was considering a rewrite anyway, I thought to myself: why not do it with my new favourite programming language as of 2022, Swift? And as I was going to rewrite things, why not make it simpler (if not better!)?
This led me to create Neptune, a pure-Swift system to fetch and parse full-text content from webpages for Elytra. It’s fast, very very fast (more details below), and uses simpler logic than Mercury to parse and process webpages.
The Swift programming language, by design, leads to type-safe code that brings fewer surprises compared to Node.js. It also compiles down to a single executable binary, which has its own tradeoffs, but brings three powerful, key features:
- more performance
Oh, and I’m not kidding. On a tiny t3.nano (512MB RAM, 2GB swap, 1vCPU) server, these are some numbers:
| Process | 0.1K Articles | 1K Articles | 10K Articles |
|---|---|---|---|
- These tests were performed with the t3.nano instance having full CPU credits available to it.
- These tests did not persist any data to disk themselves.
- These tests were performed in partial isolation: only the process being tested was run along with the test script.
If you’re asking yourself what the numbers mean for you, the answer is quite simple: Neptune is very quick at fetching full-text content from your favourite blogs. Additionally, because it’s simpler to maintain and upgrade, adding support for new websites is vastly simpler and requires only a few lines of code from me (sometimes only 6 lines, 4 of which are bootstrap code).
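To give a feel for how small a per-site rule can be, here is a minimal sketch in Swift. It is not Neptune’s actual API: the `SiteExtractor` protocol, its `host`, `startMarker` and `endMarker` properties, and the `ExampleBlogExtractor` type are all hypothetical names invented for illustration, and the plain string matching stands in for whatever parsing Neptune really does.

```swift
import Foundation

// Hypothetical protocol, NOT Neptune's real API: a site rule only
// declares which host it handles and where the article body sits.
protocol SiteExtractor {
    var host: String { get }
    var startMarker: String { get }
    var endMarker: String { get }
}

extension SiteExtractor {
    // Shared logic: return the text between the two markers,
    // or nil when the page does not contain them in order.
    func extract(from html: String) -> String? {
        guard let start = html.range(of: startMarker),
              let end = html.range(of: endMarker,
                                   options: [],
                                   range: start.upperBound..<html.endIndex)
        else { return nil }
        return String(html[start.upperBound..<end.lowerBound])
            .trimmingCharacters(in: .whitespacesAndNewlines)
    }
}

// Supporting a new (made-up) site is then only a handful of lines,
// most of which are boilerplate.
struct ExampleBlogExtractor: SiteExtractor {
    let host = "blog.example.com"
    let startMarker = "<article>"
    let endMarker = "</article>"
}

let page = "<html><body><article> Hello, world </article></body></html>"
print(ExampleBlogExtractor().extract(from: page) ?? "no article found")
```

The point of the design is that the shared logic lives in one place, so each new website only contributes its own few declarations.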
I’m sure Neptune is not ready for any commercial use, not in its current form anyway. But I will eventually offer Neptune as a standalone service for other apps to use. I’ll document this at a later point in time.
Users of Elytra can take advantage of Neptune starting with the v2022.03 release as the default extractor. No settings to toggle. It’s all set up.
If you spot any issues, or articles from specific websites failing to load, please submit an issue on GitHub. It has a standard format, making it easy for you to submit reports. I look forward to hearing from you about your experience with Neptune.