Why did Google/Bing bots get blocked?


Written by Pradheep BS. Last updated 4 months ago.

If you've seen an entry on the Threats page about a Suspicious Google/Bing Bot being blocked, it's not what it looks like. It means that Astra has blocked a malicious user pretending to be a search engine bot, not the real bot itself. We understand that search engine bots are important for your website. Bad bots, on the other hand, are a problem for it, and Astra stops them.

How to verify Google/Bing bots

Google Bot


The Google bot that was stopped is most likely a bad bot pretending to be a Google bot. Hackers often program bad bots to look like Google bots. Astra verifies each bot against Google's servers: when a bot claims to be a Google bot, Astra checks the bot's IP address with Google's servers and only lets legitimate Google bots scan your website.

You can verify the IP of Google Bot yourself: https://support.google.com/webmasters/answer/80553?hl=en


Since the hostname is not googlebot.com or google.com, it is a fake bot
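The hostname check described above can be sketched in a few lines of PHP. This is a minimal illustration, not Astra's actual implementation: the helper name hostBelongsTo is hypothetical, and the domain list is taken from the checker script later in this article. It matches on a label boundary, so a hostname such as googlebot.com.example.net is rejected.

```php
<?php
// Hypothetical helper (not part of Astra): checks whether a reverse-DNS
// hostname belongs to one of the given crawler domains.
function hostBelongsTo(string $host, array $domains): bool
{
    // Normalise case and drop a trailing dot (fully qualified form).
    $host = rtrim(strtolower($host), '.');

    foreach ($domains as $domain) {
        $suffix = '.' . $domain;
        // Accept the domain itself or any subdomain of it.
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Domains assumed from the allowed list in the checker script.
$googleDomains = ['googlebot.com', 'google.com', 'googleusercontent.com'];

var_dump(hostBelongsTo('crawl-66-249-66-1.googlebot.com', $googleDomains)); // bool(true)
var_dump(hostBelongsTo('googlebot.com.example.net', $googleDomains));       // bool(false)
```

A matching hostname is only half of the verification; the forward lookup of that hostname must also return the original IP address.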

Bing Bot


Just as with Google, hackers often make their malicious bots look like Bing bots when in reality they aren't. Astra verifies the authenticity of each bot visiting your website with Bing itself and only lets legitimate bots through.

You can verify the IP of Bing Bot yourself: https://www.bing.com/toolbox/verify-bingbot

IP verification failed in the Bing tool
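For readers who want to see the whole verification in one place, here is a rough sketch of a reverse lookup followed by a confirming forward lookup. It is an assumption about how such a check can be written, not Astra's code: the function name isVerifiedCrawler and the injectable $reverse and $forward callables are hypothetical, and the hostname and IP address in the usage lines are made-up sample values. The callables default to PHP's gethostbyaddr and gethostbyname, and are replaced with stubs here so the example runs without touching real DNS.

```php
<?php
// Sketch of forward-confirmed reverse DNS. Names are illustrative only.
function isVerifiedCrawler(
    string $ip,
    array $allowedDomains,
    ?callable $reverse = null,
    ?callable $forward = null
): bool {
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbyname';

    // Step 1: reverse lookup. gethostbyaddr() returns the IP unchanged
    // when the lookup fails, and false on malformed input.
    $host = $reverse($ip);
    if ($host === false || $host === $ip) {
        return false;
    }

    // Step 2: the hostname must sit under an allowed crawler domain.
    $host = rtrim(strtolower($host), '.');
    $matches = false;
    foreach ($allowedDomains as $domain) {
        $suffix = '.' . $domain;
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            $matches = true;
            break;
        }
    }
    if (!$matches) {
        return false;
    }

    // Step 3: the forward lookup must lead back to the original IP.
    return $forward($host) === $ip;
}

// Stub resolvers with made-up sample values, so no real DNS is queried.
$reverse = function (string $ip) { return 'msnbot-157-55-39-1.search.msn.com'; };
$forward = function (string $host) { return '157.55.39.1'; };

var_dump(isVerifiedCrawler('157.55.39.1', ['search.msn.com'], $reverse, $forward)); // bool(true)
```

Calling the function without the last two arguments uses the server's own resolver, which is what you would want outside of a test.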


Troubleshooting


If there is a problem with the hosts file or the DNS configuration on your server, the plugin may not be able to resolve the bot's hostname and verify it.
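One way to spot this situation is to look at what PHP's gethostbyaddr returns: the hostname on success, the IP address unchanged when the lookup fails, or false when the input is malformed. The sketch below interprets those three outcomes. The helper name describeReverseLookup and its messages are hypothetical, and the sample values are placeholders.

```php
<?php
// Hypothetical helper: interprets the return value of gethostbyaddr().
function describeReverseLookup(string $ip, $result): string
{
    if ($result === false) {
        return 'malformed IP address';
    }
    if ($result === $ip) {
        // PHP hands back the input unchanged when the lookup fails.
        return 'no hostname found (check DNS or the hosts file)';
    }
    return 'resolved to ' . $result;
}

// On a real server you would pass gethostbyaddr($ip) as the second argument.
// Placeholder values are used here so the example does not depend on DNS.
echo describeReverseLookup('192.0.2.10', '192.0.2.10'), "\n";
echo describeReverseLookup('192.0.2.10', 'host.example.com'), "\n";
```

If a known search engine IP address comes back unresolved, the server's DNS setup is the more likely culprit than the bot.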

You can run additional tests using a script and request assistance from our support team.

1. Log in to your hosting account using the Hosting Panel File Manager, (S)FTP or SSH

2. Create a file called host-checker.php with the following code. Do not forget to delete the file from your server once you have finished testing, to prevent misuse:

<html>

    <head><title>Hostname Resolution Checker</title></head>
    <body>
        <h1>Hostname Resolution Checker</h1>
        <form method="post">
            <label for="ip"><strong>IP address</strong></label><br><br>
            <input type="text" id="ip" name="ip" placeholder="Enter an IP address" required pattern="((^|\.)((25[0-5])|(2[0-4]\d)|(1\d\d)|([1-9]?\d))){4}$">
            <input type="submit" value="Submit">
        </form>
    </body>
</html>

<?php

function addLog($msg)
{
    printf("<p><code>%s</code></p>", htmlentities($msg));
}

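// Returns the last two labels of a hostname,
// e.g. "crawl-66-249-66-1.googlebot.com" becomes "googlebot.com"
function getHost($host_with_subdomain)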
{
    $array = explode(".", $host_with_subdomain);
    return (array_key_exists(count($array) - 2, $array) ? $array[count($array) - 2] : "") . "." . $array[count($array) - 1];
}

$ip = $_POST['ip'] ?? null;
if($ip) {

    if (!filter_var($ip, FILTER_VALIDATE_IP)) {
        die("Please enter a valid IP address");
    }

    echo "<h2>Logs</h2>";
    addLog("IP entered: " . $ip);
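    $host = gethostbyaddr($ip);
    // gethostbyaddr() returns the IP unchanged when the reverse lookup fails
    if ($host === false || $host === $ip) {
        addLog("Could not resolve a hostname for this IP");
        exit;
    }
    addLog("Hostname is: " . $host);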
    
    $allowedDomains = array('googlebot.com', 'google.com', 'googleusercontent.com', 'msn.com','search.msn.com');
    $domain = getHost($host);
    
    if (!in_array($domain, $allowedDomains)) {
        addLog("It is a fake search engine bot");
        exit;
    }
    
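    // The forward lookup of the hostname must lead back to the IP entered
    $forward_ip = gethostbyname($host);
    if ($forward_ip === $ip) {
        addLog("Legitimate search engine bot");
    } else {
        addLog("Forward lookup returned " . $forward_ip . ", which does not match the IP entered");
    }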
}
?>


3. Open the file in your web browser
4. Enter the IP address you want to verify



5. Under the Logs section you will see the results of the test
6. For our engineers to verify the findings, please share the text under the Logs section by creating a support ticket
