How to love your local newspaper again: block Taboola and spam ads, forever

Begone, Taboola, Outbrain and your horrible brethren

  • Taboola, Outbrain, Zergnet and others wreck browsing
  • Very rarely delivering content of value or quality
  • No need for subscriptions, software or plugins

The days of webrings and Geocities, of people sharing links to each other’s sites and creating content for the joy of sharing their interests may be long gone, but there’s very little inherently wrong with the model that Taboola and its ilk espouse – given the number of tracking cookies that exist on every legitimate site you visit, why shouldn’t a web service find and recommend further reading for you?

It’s a great idea, except for one thing. Monetisation. It’s a dirty word; endless pages of adverts, MPUs that interrupt and break content, pop-ups… and the people that discovered the world of targeting search terms and AdSense revenue also discovered the success of clickbait.

Thus, Taboola’s inherently agnostic concept has become fundamentally evil, full of dreck you wouldn’t ever choose to click on, but packaged in a way to be compelling. Frankly, I am sure we all just wish it would go back from whence it came, never to darken an interesting local news story or tech blog again…

Blocking malware, spam and advertising sites with a hosts file

The beauty of the hosts file is that it can be configured to block everything you don’t want to see on your computer. It’s a great mechanism for protecting your kids from the less safe areas of the internet, for killing off the countless tracking systems that are monitoring your internet use, and for wiping out objectionable adverts.

It’s also a fairly effective way – though one that is easily circumvented – of weaning yourself off social media, gambling sites or blocking ‘fake news’. Wonderful people contribute new toxic hosts identified on the internet all the time, so if you’re finding unwanted content is getting through it’s worth checking for updates.

Make your local newspaper website readable again

Sick of seeing an interesting local story linked from the BBC News website only to discover that the Pagwell Tribune or Trumpton Gazette’s website has become an unusable spam-fest of pop-ups, videos and interruptions to the actual story? Most of these local news brands have been acquired by one large corporation, National World PLC.

This is a group designed to deliver shareholder value, through “technology to manage and monetise it (content) will be embedded in National World. These capabilities will be deployed across news publishing assets”. It’s only going to get worse, as this April 2024 press release confirms yet another misguided application of ‘AI’.

News publishing assets such as: “the Yorkshire Post, The Scotsman, The Portsmouth News and The Sheffield Star stretch from the south of England to the north of Scotland. In addition, the Northern Ireland titles include The Newsletter, the oldest English language newspaper in the world, and The Derry Journal.”

These were once crucial, trusted and essential publishers supporting serious journalism. They still could be, if it weren’t for this sort of mentality: “The company was sold in November 2015 to Reach plc (the owner of Trinity Mirror plc) and delivered a total return to its investors of approximately four times their original investment.”

National World’s model does not support local businesses or local audiences, or provide any real value as a local news service, because nearly all of the content is unreadable behind an astonishing load of hideously-coded websites and irrelevant advertising. By using a well-crafted hosts file on your computer or router, you stand a chance of being able to read the articles again – and it’s a lot more effective than an ad blocker.

Steven Black hosts file – download link

The big kahuna* is the Steven Black hosts file. This is an epic gathering of net addresses (an aggregated hosts file) that is user-generated and maintained to block everything unpleasant anyone’s found on the internet, ever. Well, except for Donald Trump and Nigel Farage. Those turds just won’t flush.

To make use of it, you can either download the main hosts file – direct link to the download page – or you can browse the alternate versions which have pretty self-explanatory names.

Making your computer blind to Taboola

  • Redirect any attempt a website makes to contact clickbait sites
  • Involves editing a system file – ‘hosts’
  • Ideal for laptops – works when you change networks too

At the core, Taboola is just another set of (around the) web pages. For all the measures people take to get around ad blockers and other services, if you tell your computer to go to a dead end when a web page asks it to go somewhere – that’s where it will go.

That can be done on several levels – you could just block the server where the images are, or the servers that provide the tracking, but why not get rid of the lot?

This can be done on Windows, on Mac OS, on Linux and other Unix-like derivatives, and on many routers too, by editing the ‘hosts’ file. In fact, any computer that communicates using internet protocol (the IP part of your IP address) has some form of hosts file.

This Wikipedia article gives the locations of hosts files on most operating systems – chances are you want access to it on one of the three major ones though (hat-tip to anyone else running Haiku).
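If you’d rather let a script work out where the file lives, here is a minimal Python sketch. The function name hosts_path is made up for this example, and it assumes anything that isn’t Windows keeps the file at /etc/hosts, which holds for Mac OS and Linux but not for every system on that Wikipedia list.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath


def hosts_path(system=None):
    """Return the usual hosts file location for an OS name.

    `system` is a name as reported by platform.system(), for example
    "Windows", "Darwin" (Mac OS) or "Linux". Defaults to this machine.
    """
    system = system or platform.system()
    if system == "Windows":
        # The Windows location used later in this article.
        return PureWindowsPath(r"C:\Windows\System32\Drivers\etc\hosts")
    # Assumption: every other system keeps it at /etc/hosts.
    return PurePosixPath("/etc/hosts")


print(hosts_path())
```

It only returns a path; it doesn’t open or change anything.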

Here’s an example of a hosts file that will block most of Taboola and Outbrain:

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1	localhost
255.255.255.255	broadcasthost
::1             localhost 

##
# To use this in your own hosts file copy the entries below
# and add them after the existing entries in your version
##

0.0.0.0 popup.taboola.com
0.0.0.0 www.popup.taboola.com
0.0.0.0 taboola.com
0.0.0.0 www.taboola.com
0.0.0.0 cdn.taboolasyndication.com
0.0.0.0 taboolasyndication.com
0.0.0.0 www.taboolasyndication.com
0.0.0.0 www.cdn.taboolasyndication.com
0.0.0.0 trc.taboola.com
0.0.0.0 www.trc.taboola.com
0.0.0.0 images.taboola.com
0.0.0.0 adserver.adtechus.com
0.0.0.0 www.adserver.adtechus.com
0.0.0.0 us-u.openx.net
0.0.0.0 www.us-u.openx.net
0.0.0.0 api.taboola.com
0.0.0.0 www.api.taboola.com
0.0.0.0 c2.taboola.com
0.0.0.0 www.c2.taboola.com
0.0.0.0 cdn.taboola.com
0.0.0.0 www.cdn.taboola.com
0.0.0.0 urc.taboolasyndication.com
0.0.0.0 www.urc.taboolasyndication.com
0.0.0.0 taboola.com.edgekey.net
0.0.0.0 www.taboola.com.edgekey.net
0.0.0.0 esd-secure.taboola.com.edgekey.net
0.0.0.0 www.esd-secure.taboola.com.edgekey.net
0.0.0.0 nr.taboola.com
0.0.0.0 outbrain.com
0.0.0.0 www.outbrain.com
0.0.0.0 odb.outbrain.com
0.0.0.0 images.outbrain.com
0.0.0.0 log.outbrain.com
0.0.0.0 amplify.outbrain.com
0.0.0.0 amplifypixel.outbrain.com
0.0.0.0 vra.outbrain.com
0.0.0.0 widgets.outbrain.com
0.0.0.0 sync.outbrain.com
0.0.0.0 tr.outbrain.com
0.0.0.0 vrp.outbrain.com
0.0.0.0 vrt.outbrain.com

It’s an old-school approach – when IP networks were first created, the hosts file served as the address book to other machines on your network (and the wider net), a process that’s now automated and transparent via services such as DNS. Normally you don’t need to think about it – but your computer’s OS still uses it.

Editing your hosts file

Have you used Notepad or TextEdit before? You’re all set. The hosts file is just a straight text file – you’ll be cutting and pasting the IP addresses and domains anyway.

You can use the Taboola-blocking hosts above – just copy the entries as directed, and paste them into your file.
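If you’re pasting from several lists, duplicates and malformed lines creep in easily (a hosts entry must be a bare hostname, with no https:// in front). The following Python sketch shows one way to merge new block entries into existing hosts text while skipping both. The function names and the loose hostname pattern are my own assumptions, not part of any tool mentioned here, and it works on text in memory rather than touching your real file.

```python
import re

# Loose check: two or more dot-separated labels of letters, digits, hyphens.
HOSTNAME = re.compile(
    r"^(?!-)[a-z0-9-]{1,63}(?<!-)(\.(?!-)[a-z0-9-]{1,63}(?<!-))+$"
)


def blocked_hosts(text):
    """Yield the names on lines that start with 0.0.0.0 or 127.0.0.1."""
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()  # ignore comments
        if len(parts) >= 2 and parts[0] in ("0.0.0.0", "127.0.0.1"):
            for name in parts[1:]:
                yield name.lower()


def merge_block_entries(existing, new_entries):
    """Append valid entries from new_entries that existing lacks."""
    seen = set(blocked_hosts(existing))
    added = []
    for name in blocked_hosts(new_entries):
        if HOSTNAME.match(name) and name not in seen:
            seen.add(name)
            added.append(f"0.0.0.0 {name}")
    if not added:
        return existing
    return existing.rstrip("\n") + "\n" + "\n".join(added) + "\n"


current = "127.0.0.1\tlocalhost\n0.0.0.0 taboola.com\n"
extra = "0.0.0.0 taboola.com\n0.0.0.0 https://cdn.taboola.com\n0.0.0.0 cdn.taboola.com\n"
print(merge_block_entries(current, extra))
```

In the example, the repeated taboola.com and the entry with a URL scheme are dropped, and only cdn.taboola.com is appended.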

If you want an easier solution you can use an application such as Hostman for Windows, which will manage your hosts file and auto-apply updates as needed. On Mac OS there are some barriers for developers poking system files, but Hosts by Dirk Fröhling (who gets a link partly for being an Apple IIgs fan) works on recent systems as a quick and easy way of editing the file, and there’s a preference pane version too.

It’s quite simple though – for example, there’s no easy way to search for a specific host to unblock. When the aggregated hosts file mentioned at the start of this article can have between 132,000 and 202,000 entries, that’s a LOT of scrolling…
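A few lines of Python can do the scrolling for you. This is a sketch under some assumptions: the function name unblock is invented for the example, blocking lines are taken to start with 0.0.0.0, and a matching line is commented out rather than deleted so the change is easy to undo.

```python
def unblock(hosts_text, hostname):
    """Comment out each 0.0.0.0 line that names `hostname` exactly.

    Returns the new text and the number of lines changed.
    """
    target = hostname.lower()
    out = []
    changed = 0
    for line in hosts_text.splitlines():
        # First field is the address, the rest are names mapped to it.
        fields = line.split("#", 1)[0].split()
        names = [n.lower() for n in fields[1:]]
        if len(fields) >= 2 and fields[0] == "0.0.0.0" and target in names:
            out.append("# " + line)
            changed += 1
        else:
            out.append(line)
    return "\n".join(out) + "\n", changed


sample = "0.0.0.0 analytics.google.com\n0.0.0.0 taboola.com\n"
new_text, count = unblock(sample, "analytics.google.com")
print(count, "line(s) commented out")
print(new_text)
```

The match is on the whole hostname, so unblocking analytics.google.com leaves any other google.com entries alone. If a line lists several names, the whole line is commented out.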

On Windows 8, 8.1 and 10

Modern Windows systems mean an extra step over older ones – start by running Notepad as an administrator. Click the search box, type ‘Notepad’ and right-click the result – select ‘Run as Administrator’ and authenticate.

In Notepad, open:

C:\Windows\System32\Drivers\etc\hosts

Add the required domains and save.

On Windows 7 and earlier

Bring up the ‘run’ box in the Start menu:

Then type this command.

notepad c:\windows\system32\drivers\etc\hosts

As with later versions, just save and restart your browser.

There are many Windows utilities to automate managing your hosts files – I can’t make any recommendations for them, and on the whole, don’t recommend giving anything you download from the internet that’s ‘solving a common problem’ Administrator rights on a Windows machine.

As a rule of thumb, if the website you’re looking at with the solution looks exactly like the kind of sites you’re trying to block in the first place… run away.

On Mac OS X

It’s fastest to do this from the terminal, but if you’re building a complex set of sites to block you might want to have it in a friendlier editor. TextEdit is ideal.

Terminal is in your Applications/Utilities folder.

The terminal app on Mac OS

When it’s opened, type:

open /etc/hosts

Which will open TextEdit and your hosts file. You can’t edit it here, though! Save a copy of it so you can make changes, and add the hosts you want to block. Some example hostfile entries point the sites you want to block to 127.0.0.1 (your computer) rather than 0.0.0.0 – the latter ensures the requests for the blocked sites don’t interfere with any web services you might be running.
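If a list you’ve found uses 127.0.0.1 throughout, the entries can be switched to 0.0.0.0 before you paste them in. Here is a minimal Python sketch of that conversion. The function name and the KEEP set are assumptions for this example, and it treats every 127.0.0.1 line that doesn’t mention localhost or broadcasthost as a block entry, so check the output if your file maps other local names to that address.

```python
# Names that must keep their original address.
KEEP = {"localhost", "broadcasthost"}


def use_null_address(hosts_text):
    """Rewrite '127.0.0.1 some.site' block entries to use 0.0.0.0."""
    out = []
    for line in hosts_text.splitlines():
        parts = line.split()
        is_block_entry = (
            len(parts) >= 2
            and parts[0] == "127.0.0.1"
            and not KEEP.intersection(parts[1:])
        )
        if is_block_entry:
            out.append(" ".join(["0.0.0.0"] + parts[1:]))
        else:
            out.append(line)  # left exactly as it was
    return "\n".join(out) + "\n"


sample = "127.0.0.1\tlocalhost\n127.0.0.1 taboola.com\n0.0.0.0 outbrain.com\n"
print(use_null_address(sample))
```

The localhost line and anything already on 0.0.0.0 pass through untouched.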

Once you’ve added them all, open terminal again and type:

sudo nano /etc/hosts

It will ask for your password; this will only work if your account is an administrator. Then it will open the hosts file your computer is using.

nano text editor, hosts file on Mac

Cut and paste the new entries at the end of the file, like this…

Domains to block added to hosts file

nano is quite self-explanatory; once you’ve pasted the hosts you want to block, press control-o to write the file, press enter (it should say File name to write: /etc/hosts above the menu), and then press control-x and close the terminal.

Restart your browser, go back to the offending site, and marvel at all the empty space – or just a steady flow of quality content without interruptions…
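If you’d like something less subjective than marvelling, you can ask the operating system how a blocked name now resolves. The sketch below uses Python’s standard socket.gethostbyname, which normally consults the hosts file. The function name is_blocked and the choice to treat both 0.0.0.0 and 127.0.0.1 as ‘blocked’ are assumptions for this example.

```python
import socket


def is_blocked(hostname, resolve=socket.gethostbyname):
    """Return True if `hostname` resolves to a dead-end address.

    `resolve` can be swapped for another lookup function, which is
    handy for trying this out without changing your hosts file.
    """
    try:
        return resolve(hostname) in ("0.0.0.0", "127.0.0.1")
    except OSError:
        # The name didn't resolve at all. That isn't proof of a
        # hosts-file block, so report it as not blocked.
        return False


# Example with a stand-in resolver that pretends the block is in place.
print(is_blocked("cdn.taboola.com", resolve=lambda name: "0.0.0.0"))
```

Called with just a hostname, it performs a real lookup on your machine, so the answer depends on your own hosts file and network.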

The process on Linux is much the same as on Mac OS – chances are, if you’re using Linux, you don’t need this guide.

Is modifying my hosts file safe?

If you’re the one doing it, absolutely. Chances are if you’re reading this far, you’ve never even thought about such files – but it’s a pretty harmless text file from the user’s perspective as long as you’re aware of it. Other applications – malware, ransomware, viruses or whatever – may attempt to modify the hosts file in the background if they are given, or have acquired, permissions to where it’s stored.

This risk means that rather than it being a hidden part of the system, it’s actually better to be aware of your hosts file and its contents, than to be unaware of its existence. Just as it’s the quickest blunt object that can be used to redirect your computer for nefarious purposes, it’s the quickest way to stop malware even reaching the outside world.

Taking a backup of the hosts file when you first set up your computer allows you to see what, if any, changes have been made to it without your knowledge – and it also allows you to restore it quickly. It’s not a magic bullet – you will want to keep anti-malware software running – but it’s a great way to kill off frustrations and simple attacks/redirects.
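Comparing the backup with the live file is a job for a diff. As a minimal sketch, this uses Python’s standard difflib on two pieces of text. The function name and the file labels hosts.backup and hosts are placeholders for the example, and the suspicious entry uses a reserved documentation address and a made-up hostname.

```python
import difflib


def hosts_changes(backup_text, current_text):
    """Return unified-diff lines showing how the hosts file changed."""
    return list(difflib.unified_diff(
        backup_text.splitlines(),
        current_text.splitlines(),
        fromfile="hosts.backup",
        tofile="hosts",
        lineterm="",
    ))


backup = "127.0.0.1 localhost\n"
current = "127.0.0.1 localhost\n203.0.113.5 mybank.example\n"
for line in hosts_changes(backup, current):
    print(line)
```

Lines starting with + were added since the backup and lines starting with - were removed. An empty result means nothing has changed.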

Downsides to blocking all sites in the hosts file?

In an attempt to limit tracking you will probably block sites such as analytics.google.com – which you need if you want to see your website’s stats. You will also break things you might have chosen to use, such as TopCashBack, and lose the tracked purchases.

That’s why you probably want to start with a subset of known pestilence, such as Taboola/Outbrain etc.
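One way to build that subset is to filter a big aggregated list down to the networks you actually want rid of. The Python sketch below keeps only entries whose host is one of the chosen base domains or a subdomain of one. The function name subset_for is invented for the example, and it assumes the list uses one ‘0.0.0.0 hostname’ entry per line.

```python
def subset_for(hosts_text, base_domains):
    """Return block entries belonging to any of `base_domains`."""
    wanted = [d.lower() for d in base_domains]
    keep = []
    for line in hosts_text.splitlines():
        parts = line.split("#", 1)[0].split()  # ignore comments
        if len(parts) != 2 or parts[0] != "0.0.0.0":
            continue
        host = parts[1].lower()
        # Match the domain itself or anything ending in ".domain".
        if any(host == d or host.endswith("." + d) for d in wanted):
            keep.append(f"0.0.0.0 {host}")
    return keep


big_list = (
    "0.0.0.0 analytics.google.com\n"
    "0.0.0.0 cdn.taboola.com\n"
    "0.0.0.0 outbrain.com\n"
)
for entry in subset_for(big_list, ["taboola.com", "outbrain.com"]):
    print(entry)
```

Matching on a leading dot means a lookalike such as nottaboola.com is not swept up by accident, and analytics.google.com stays unblocked.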

Removing them across your home network

  • Solves the problem no matter which computer you’re using
  • Benefits smartphones and other devices, too
  • Success can depend on how sophisticated your router is

You can save time by blocking these networks at your router, instead. This should work for your smartphones and tablets as well, as long as they’re connecting over WiFi, and it naturally saves having to edit all your computers. Guests will also benefit from your sanitised internet feed.

One of the most popular tools for sanitising the internet is a Pi-hole. No, not the thing 1940s American wiseguys tell you to shut; a Raspberry Pi adapted to the single purpose of running firmware that dynamically updates a blocklist of toxic and unwanted hosts, then blocks them for everything on that network from smart speakers to smartphones. Set up correctly it can even block ads within apps.

However, you may be able to implement a hosts file on your existing router. It’s down to the device you’re using; there are different tactics depending on the level of router you have – the kind supplied by your provider may be much harder to configure than consumer routers, and high-end routers may be difficult in a different way, giving you all the power you need but none of the hand-holding.

It’s easy enough to search for instructions on blocking sites with your particular router, but you need to know what to block – you can choose domain names from lists like the Steven Black hosts file mentioned above.

Some routers will allow you to set up your own DNS server. Again, if you know how and why you’d do that, you don’t need this guide – but grab the list of ‘contextual advertising’ URLs from the sources provided!

Keyword blocking on a Netgear router

My own experience with this on a Netgear Nighthawk D7000 V2 is that the keyword blocking for domains isn’t always reliable and, as you can see from the list of domains Taboola operates across, it would be very time-consuming to set up.

If you want to take an advanced route, DD-WRT firmware gives more powerful options (including a hosts file in the router you can edit), if your router is supported.

Will blocking Taboola stop anything else from working?

I’ve been blocking Taboola via the hosts file for ages. All it’s done is cause absolute shock when I’ve used a different computer and been confronted with the awful ‘One neat trick… you won’t believe’ upscaled grainy-image shit on an otherwise reputable brand’s website. None of those sites have failed to work otherwise and local news websites have become tolerable again.

Is it hurting the publishers financially? Probably, if we all block it this way – chances are, if you’re going to this much effort to remove it, you weren’t going to contribute to their revenue anyway.

But there are other models of advertising and revenue generation. Ones that can be vetted and curated for quality and audience relevance. Taboola is lazy, serving low-quality, brand-damaging ‘content’. It is no loss if it goes.

Holy crap, it worked, Taboola’s gone!

I know, right! No ‘paid extensions’, no ‘subscribe here’. It’s a wonderful world where you’ll never again wonder what that one neat trick with WD40 is, and what it’s got to do with toilets…

If you want to say thanks, feel free to click on this link to Amazon before buying something; affiliate links are the only revenue-generating measure on my site. Yes, it does make a loss – it wasn’t created to make money!

* I’m sure there are loads of similar projects, but this one is, frankly, big enough for anyone
