By Trey Grainger, CTO of Presearch
Blocklists are common on the internet, but not many people know what they are or how they work. A blocklist is a list of websites, parts of websites, or other items to hide. The most popular use case online is ad blocking. More than 42% of Internet users worldwide now use an ad blocker.
Most people who turn on ad blockers just assume that the software transparently “protects” them from content they wouldn’t want to see online or unnecessary tracking. In reality, most ad blockers primarily work by checking blocklists that have been compiled manually over time, listing websites or parts of websites that the person editing the blocklist wants to censor.
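To make the mechanism concrete, here is a minimal sketch of a domain-based blocklist check. It is an illustration only, not the implementation of any particular ad blocker: the function name is_blocked and the entries in BLOCKLIST are hypothetical, and real blocklists use richer rule syntax than a plain set of domains.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist entries; real lists contain many thousands of rules.
BLOCKLIST = {"ads.example.com", "tracker.example.net"}


def is_blocked(url, blocklist=BLOCKLIST):
    """Return True if the URL's host, or any parent domain, is on the blocklist."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check "a.b.example.com", then "b.example.com", then "example.com", and so on.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))
```

In this sketch a single entry hides every request to that domain and its subdomains, so whoever edits the list decides what the user never sees.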
As the CTO of a search engine, I am quite familiar with this technology. But it wasn’t until recently that I learned how few people control these increasingly ubiquitous blocklists. Presearch, the decentralized, community-based search engine with 2.3 million users that I help manage, had major cascading issues caused by a bad blocklist in the spring of 2021.
A blocklist headache
Starting March 31, many Presearch users began reporting that they could not log into the site, or that when they logged in, the functionality of the site was interrupted. People flooded our internal community channels and Reddit with their concerns, saying things like: “My searches are stuck, what’s the problem? Reinstalled and tried several browsers. Unfortunately, I have temporarily returned to Google.”
Obviously this was a big deal, but it didn’t affect everyone, and we couldn’t initially find any reasonable source of the problem within the platform. After several days of issues reported by people using Brave, uBlock, Windscribe VPN, and similar services, we discovered that the entire Presearch website had been placed on a blocklist. We scoured Reddit, Discord, and GitHub, downloading and combing through dozens of blocklists to find out which one had added Presearch and why, while trying to contact each service and resolve the issue.
We discovered that Presearch had been added to a public blocklist maintained by only four people. One of them had seen a referral banner on another unrelated website and hastily added a general block for ALL requests to Presearch from the web. This person’s overzealous editing created a cascading effect, because other independent services had made the public blocklist part of their systems and started blocking Presearch as well.
In the case of one major VPN, the Presearch website was unreachable and appeared to have disappeared from the internet. In the case of ad blockers and privacy browsers like uBlock and Brave, the website would load, but search results were blocked, causing users to think our website was badly broken. We lost millions of searches and a week of development time tracking down the problem and implementing emergency countermeasures. Our traffic and our revenues took a big hit. I’m sure some users left and never returned, and it was a blow to the reputation of the brand that we’ve worked for years to build.
We reached out to one of the developers running the blocklist for help. To be removed from the list, we had to prove that we were not an ad network and that the blocklist broke our website because users could not access it properly. The developer addressed the issue by adding a more selective block that only applied to the referral banner he initially saw. We were warned not to get around this, or they would apply a tougher filter.
I contacted another ad blocking service that was clearly linked to the original list, but the developer in charge did not disclose any details. Although their process was a black box, they ultimately helped us solve the problem by adding Presearch to an “exceptions” list.
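The difference between the original general block, the later selective block, and an exceptions list can be sketched as follows. This is a simplified model under stated assumptions, not the filter syntax or logic of any real ad blocker: the function decide, the rule format, the example.com domain, and the /referral-banner path are all hypothetical.

```python
from urllib.parse import urlsplit


def decide(url, block_rules, exceptions):
    """Return "allow" or "block" for a URL.

    block_rules: list of (domain, path_prefix) pairs; an empty prefix
                 blocks every request to that domain.
    exceptions:  set of hosts that are never blocked.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Exceptions take priority over any block rule.
    if host in exceptions:
        return "allow"
    for domain, path_prefix in block_rules:
        on_domain = host == domain or host.endswith("." + domain)
        if on_domain and parts.path.startswith(path_prefix):
            return "block"
    return "allow"


# Hypothetical rules, with example.com standing in for the affected site.
broad = [("example.com", "")]                     # blocks the whole site
selective = [("example.com", "/referral-banner")]  # blocks only one path
```

Under the broad rule, a search request such as https://example.com/search is blocked along with the banner. Under the selective rule, or with the host on the exceptions list, the search request goes through.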
Soon Presearch was back up and running. But after a week of navigating the blocklist rabbit hole, I started to reexamine this ecosystem and how subjective it all is. Why are the blocklists that censor the internet content of hundreds of millions of users controlled by so few people? With these blocklists now built into many services by default, are consumers even aware of what content is being hidden from them? What gives a few people the right to decide what should and shouldn’t be available online, using whatever criteria they choose, with little or no due diligence?
I don’t believe these developers are malicious. Blocklists serve a useful purpose. However, I think the ad blocking industry is overzealous in its efforts to ‘clean up’ the web and decide what should and should not be viewed, breaking or hiding parts of many websites in the process. What if this practice were infiltrated by bad actors? It’s easy to imagine how quickly things could escalate. Entire websites could start to disappear depending on the whims of a small group of powerful blockers.
One of the founding principles of Presearch is resistance to “control of the few over the many”. We’re trying to create a decentralized search engine, where decisions about content and algorithms are open and made by the community, not by a small group of companies or developers.
My hope is that as people start to see what’s behind the curtain, we can start to build a better future. A future where search engines open up and reveal their algorithms, developers explain how they relegate ads, blocklists are community powered, and we all benefit from a more open, transparent and decentralized web.
About the Author
Trey Grainger is the CTO of Presearch, a decentralized search engine. He is also the founder of Searchkernel, an AI-based research and consulting development company. He lives in South Carolina.
The views and opinions expressed herein are the views and opinions of the author and do not necessarily reflect those of Nasdaq, Inc.