This guide discusses how to clean up a hacked WordPress website from an SEO perspective, not a security perspective.
In particular, it covers the hack where hackers create hundreds or thousands of pages on your website and get those pages indexed in search results.
How Hacked URLs Tank Rankings
Why care about cleaning up hacked URLs from Google’s search engine result pages (SERPs)?
I have found that rankings and crawl budget suffer when Google "believes" that these hacked pages are part of your website.
Google will process these pages and conclude that your website is no longer only about topic (a), e.g. dentistry in New York, but now also about topic (b), the subject of the hacked pages, e.g. Viagra sales in Japan.
If Google knew 40 dentistry pages on your site to start with and hackers added 2000 Viagra pages, Google may think you are now mainly a Viagra outlet and will no longer trust you for rankings in the dental industry in New York.
As a Google user and dental patient, I also prefer dentists that don't peddle Viagra on the side.
I have seen this happen in live SERPs: once the hacked pages were removed, the rankings recovered.
1. Get A List Of Hacked URLs
There are three ways to get the list of URLs that hackers have inserted into your website. Some are better than others...
a) Decrypting the hackers' files on your server (best method, as this gets you the full list immediately)
b) Getting the list of URLs from Google Search Console (second best)
c) Using site:domain.com search operator (Use Serp Extractor)
Decrypting files on your server
Decrypting files on your server is the ONLY way you will ever get all URLs that the hackers created all at once.
DO NOT delete the hacked files before decrypting them.
The problem with all other methods is that you are relying on a secondary source, such as Google search results or Google Search Console. Secondary sources only give you a small percentage of the total URLs at a time, so you will spend a lot of time assembling the full list.
I use a guy on Fiverr who specializes in decryption. I won’t mention him here, but he is easy to find and his prices start at $10.
Search: “decrypt php” on Fiverr.
2. Set Hacked Pages To 410 Error Status
A 410 (Gone) status code tells search engines that the page has been deleted. GSC reports do not support 410 errors as a separate status.
The URL will show as an error in GSC, but that is for reporting purposes only. Google will still process the 410 correctly on its own servers.
There are two ways to create 410 errors on your WP site that I recommend. I have coded two plugins that help you do this.
You can choose either the paid or the free route at this point.
Plugin 1) IndexJet (paid)
Plugin 2) 410 Delete Pages SEO (free)
IndexJet also lets you generate XML sitemaps and ask Google, in bulk, to visit your 410 pages again. More on that below.
Go to the 410 tool/section in either of the WP plugins and copy/paste the list of hacked URLs that should return a 410 status.
Benefits of 410 errors over 404:
With a 410, we are telling Google clearly that this page has been deleted. Once Google has discovered the 410, there is no need for it to come back and check again, because the page is gone. This can save crawl budget compared to a 404.
A 404 forces Google to come back regularly to check whether your page is back online. This is not something you want when Google would have to recheck thousands of hacked pages.
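Before moving on, it can be worth confirming that every hacked URL really answers with a 410 and not a 404 or 200. The following is a minimal Python sketch, not part of either plugin. The function names `fetch_status` and `not_yet_gone` are made up for this example, and it assumes your server answers HEAD requests with the same status code as GET requests.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code a URL answers with (HEAD request)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the code (e.g. 410 or 404) is what we want.
        return error.code


def not_yet_gone(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> List[str]:
    """Return the URLs that do NOT answer with 410 and still need fixing."""
    return [url for url in urls if fetch(url) != 410]
```

Calling `not_yet_gone(hacked_urls)` with your list returns the URLs that still need attention. Note that `urlopen` follows redirects, so a hacked URL that redirects to your homepage would be reported with the status of the final page.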
4. Create a Sitemap Of The Hacked URLs
Inside the IndexJet plugin you can directly create a sitemap.xml that includes all hacked pages, which you have now set to 410. This is important because we want Google to visit all these pages to discover our new 410 deleted status.
If you do not have IndexJet, the workaround is to use this Google sheet (make a copy) that I created to generate an XML sitemap from a list of URLs that you can copy and paste. You would then need to manually upload this XML sitemap to your server. The last step is to submit your sitemap to Google via GSC.
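If you prefer a script over a spreadsheet, the same conversion from a URL list to an XML sitemap can be sketched in a few lines of Python. This is an illustration only and not how IndexJet or the Google sheet work internally. The function name `build_sitemap` is an assumption of this example; the 50,000 URL cap comes from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000  # per-file limit in the sitemaps.org protocol


def build_sitemap(urls: Iterable[str]) -> str:
    """Build sitemap XML from a list of URLs (blank lines and duplicates dropped)."""
    cleaned = (url.strip() for url in urls)
    unique = list(dict.fromkeys(url for url in cleaned if url))
    if len(unique) > MAX_URLS_PER_SITEMAP:
        raise ValueError("Too many URLs for one sitemap; split the list.")

    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in unique:
        # ElementTree escapes characters such as '&' in the URL for us.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url

    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Write the returned string to a file (saved as UTF-8) and upload that file to your server, as described above.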
5. Submit Sitemap To Google Search Console
Copy and paste your sitemap URL into the Sitemaps section of Google Search Console and submit it.
6. Get Google To Discover The New Status (410) Of Your Hacked URLs
We have submitted a sitemap to Google, but this is not enough. Hackers have likely also hijacked your GSC and submitted their own sitemap previously to get the hacked pages indexed.
Google is reluctant to visit these hacked pages again because they are of low quality.
So how do we force Googlebot to come back as soon as possible to look at these hacked pages, in order to make Google realise that they are now 410 - deleted/gone?
The solution is IndexJet’s integration with three indexing services:
Via these services you can request Google to come discover your pages, IN ONE CLICK, directly from your WP dashboard.
At this point you need to bring your own API key. We might later offer plans that include a set number of indexing credits per month/year.
An alternative for SEOs who are not IndexJet users would be to go to Google Search Console and submit the hacked pages for indexing manually.
7. Monitor Googlebot While It Crawls Your URLs
We want to know when Google has discovered your 410 status codes.
Once that has happened, Google is aware that you have deleted the hacked pages from your server and can start to reprocess the signals (such as the topical relevance of your website and internal link signals).
You can use the "Crawl Optimizer" section in IndexJet to monitor URLs that have been visited by Googlebot (this section is currently being worked on and will be released on 12th Jan 2021.)
An alternative to using IndexJet would be to look at your server logs. A tool such as Screaming Frog Log File Analyser would be needed in this case.
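If you want a quick look at the logs without a dedicated tool, a short script can do a rough version of the same check. The sketch below is an assumption-laden example: it assumes your server writes the common "combined" access log format (Apache/Nginx default style), and the names `LOG_LINE`, `path_of` and `crawled_as_gone` are invented for this illustration. It reports which hacked paths Googlebot has already requested and received a 410 for.

```python
import re
from typing import Iterable, Set
from urllib.parse import urlsplit

# Matches the request, status and user agent of a "combined" format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def path_of(url: str) -> str:
    """Reduce a full URL to the path (plus query string) seen in the log."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return f"{path}?{parts.query}" if parts.query else path


def crawled_as_gone(log_lines: Iterable[str], hacked_urls: Iterable[str]) -> Set[str]:
    """Return hacked paths that a Googlebot user agent fetched with status 410."""
    hacked_paths = {path_of(url) for url in hacked_urls}
    seen: Set[str] = set()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # line is not in the assumed format
        if "Googlebot" in match["agent"] and match["status"] == "410":
            if match["path"] in hacked_paths:
                seen.add(match["path"])
    return seen
```

Comparing the result with your full list shows which hacked URLs Googlebot has not yet revisited. Keep in mind that the user agent string can be faked, so this only gives an approximate picture.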
I hope this article has helped you map out a straightforward way to clean up hacked pages. Both free and paid tools were covered.
My goal is to help SEOs and WordPress users make their workflows easier. Cleaning up a hacked website can be a real headache!