
Soft 404s and When to Say No

Some SEOs are super diligent. We believe that removing all of the “bad” or “wrong” stuff is important to having a shiny, clean website that cannot possibly be penalized in any way. The philosophy is part of a broader strategy: “You don’t have to be the best, you have to be better than the next guy.”

But sometimes, it’s just not possible to clean it all up. For a detail-oriented, 100% white hat SEO like myself, it hurts to see that Google associates my website with crap. Especially when I can’t see any way to fix it.

Here’s the story:

On a recent routine inspection of the Indexing toolset in Google Search Console, I noticed 139 Soft 404s.

Soft 404 Report from Google Search Console

A Soft 404 is a page that probably should load some content (like Page 6 of a series of pages) but doesn’t, while the server still reports success instead of a real 404. I have seen this most commonly on e-commerce sites, where inventory comes and goes but URLs are forever.

There are better ways to deal with inventory inconsistency, but Soft 404s are a reality, and almost a kindness from Google. Rather than de-index your page, they keep checking it to see if real content appears on it so they can index it again.
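
If you want to spot likely Soft 404s on your own site before Google does, here is a rough sketch of the idea in Python. It is a guess at the kind of signal involved, not Google's actual logic: the function names, the hint phrases, and the 200-character threshold are all assumptions made up for illustration.

```python
import re

# Phrases that often show up in the <title> of "nothing here" pages.
# This list is an assumption for illustration, not anything Google publishes.
EMPTY_PAGE_HINTS = ("not found", "no results", "no longer available")


def visible_text(html: str) -> str:
    """Drop script/style blocks and tags, then collapse whitespace."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def looks_like_soft_404(status: int, html: str, min_chars: int = 200) -> bool:
    """Heuristic: a 200 response whose page is nearly empty or titled like an error."""
    if status != 200:
        return False  # a real 404 or 410 is a hard error, not a soft one
    match = re.search(r"(?is)<title[^>]*>(.*?)</title>", html)
    title = match.group(1).lower() if match else ""
    if any(hint in title for hint in EMPTY_PAGE_HINTS):
        return True
    # Thin pages (hypothetical threshold) are the other common pattern.
    return len(visible_text(html)) < min_chars
```

Feed it the status code and HTML from whatever crawler you already use. Expect false positives on pages that are short on purpose, so treat the result as a list of pages to look at, not a verdict.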

Except sometimes those Soft 404s come from external websites that had no good reason to link to you in the first place.

Which brings me back to my point. In my list of Soft 404s crawled on January 17, 2023 are these two beauties:

  • http://ns1.wineclubreviews.net/
  • http://ns1.wineclubreviews.net/cgi-sys/defaultwebpage.cgi

Let’s talk about what’s wrong with these URLs and how Google even found them in the first place.

  1. My whole site is https. There are no URLs at wineclubreviews.net that don’t redirect to https.
  2. “ns1” is a conventional hostname for a nameserver. Except I don’t handle nameserving on my domain, so there is no nameserver at this location.
  3. The second URL is a default error message page from cPanel (which itself serves a 200 status).
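
The three complaints above are easy to check mechanically across a long Soft 404 export. This is a minimal sketch, assuming you have the URLs as plain strings; the function name and the flag labels are hypothetical, and the `/cgi-sys/` path check is based only on the cPanel default page URL shown above.

```python
from urllib.parse import urlsplit


def url_red_flags(url: str) -> list[str]:
    """Return the reasons a reported URL probably isn't a real page on the site."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    flags = []

    # 1. The whole site is https, so a plain http URL is suspect.
    if parts.scheme != "https":
        flags.append("not https")

    # 2. A first label like "ns", "ns1", "ns2" looks like a nameserver hostname.
    first_label = host.split(".")[0]
    if first_label == "ns" or (first_label.startswith("ns") and first_label[2:].isdigit()):
        flags.append("nameserver-style subdomain")

    # 3. cPanel's default page lives under /cgi-sys/ (as in the URL above).
    if parts.path.startswith("/cgi-sys/"):
        flags.append("cPanel default page path")

    return flags
```

Anything that comes back with two or three flags is a good candidate for the "not my problem" pile.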

I’ve owned this website for 14 years and I know its history. I know that these URLs date from before I started using Cloudflare (sometime in 2014). So, Google is newly crawling a 9-year-old nameserver. The question begs to be asked… why?!

By using the URL Inspection tool to determine how Google found these ridiculously old nameserver URLs, I found the culprit: MyWOT has a “scorecard” for my site.

MyWOT is a business so old and with such a checkered past I’d forgotten it even exists. Their home page claims there are 141 million installations of their Chrome Extension (the Extension store says 1,000,000+ and their welcome email says they have 2,000,000 members).

Anyway, they let random people tell you if a website is safe. And seven years ago someone went on MyWOT, said my website is a “front for wine clubs,” and rated the site 0.5 stars. The user left six crappy reviews from 2015 to 2019, including one for a competitor of mine.

So now, because some jerk seven years ago proclaimed I was a “front,” I’m stuck trying to figure out how to get Google to ignore these Soft 404s.

Or really, and here’s the whole point of this rant, sometimes we just need to let it go. Fixing this particular problem is not worth my time, and therefore there will be cruft in my Google Pages tool forevermore.


Recognizing when to put the bone down and stop growling at the garbage on the Internet is an important part of being a good SEO.

Here are some additional resources to help you sort out whether you should deal with your own Soft 404s.