Hello,
Thank you for your question regarding the adjustments to your robots.txt file. Your understanding of how the user-agent rules work is correct: the * wildcard covers every user agent that does not have its own specific group, so you won't need to list every possible variant.
Your proposed rules would look like this:
User-agent: *
Disallow: /wp-content/uploads

User-agent: Bingbot
Allow: /wp-content/uploads

User-agent: Google-Extended
Allow: /wp-content/uploads
This setup will block all compliant crawlers from the /wp-content/uploads directory while allowing Bingbot and Google-Extended to crawl it. Because a crawler follows the most specific user-agent group that matches it, Bingbot and Google-Extended obey their own Allow rules rather than the general Disallow.
Regarding your question about taking it back a directory level to /wp-content, you can certainly do that. If you want to disallow access to the entire /wp-content directory for all user agents except Bingbot and Google-Extended, you would adjust your rules as follows:
User-agent: *
Disallow: /wp-content/

User-agent: Bingbot
Allow: /wp-content/

User-agent: Google-Extended
Allow: /wp-content/
This will prevent all other user agents from crawling anything in the /wp-content directory while still allowing Bingbot and Google-Extended access.
Keep in mind that while these rules can discourage scraping, they are not foolproof, as some scrapers may ignore the robots.txt directives.
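If you would like to double-check how a compliant parser interprets these rules before deploying them, here is a minimal sketch using Python's built-in robots.txt parser (example.com, photo.jpg, and SomeOtherBot are placeholders used purely for illustration):

# Check which user agents may fetch a URL under the second ruleset above.
from urllib.robotparser import RobotFileParser

rules = """
User-agent: *
Disallow: /wp-content/

User-agent: Bingbot
Allow: /wp-content/

User-agent: Google-Extended
Allow: /wp-content/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

url = "https://example.com/wp-content/uploads/photo.jpg"
for agent in ("Bingbot", "Google-Extended", "SomeOtherBot"):
    # can_fetch() returns True when the named agent is allowed to crawl the URL.
    print(agent, parser.can_fetch(agent, url))
# Expected output: True for Bingbot and Google-Extended, False for SomeOtherBot.

This only tells you what a rule-following crawler would do; as noted above, it cannot stop scrapers that ignore robots.txt.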
If you have any further questions or need assistance with anything else, feel free to ask!
Sources:
What Are the Best Robots.txt File Rules for Multisite? » Rank Math
Using Rank Math’s Robots.txt Generator » Rank Math
Hello,
You seem to have marked this ticket as resolved.
We’re just following up to see whether this is still a concern we can assist you with. If your issue has been resolved, please let us know so we can mark this ticket as closed.
Thank you for choosing Rank Math.
Thanks Adetayo,
much appreciated. I’ll be doing more research into this. I need to give the best possible advice to clients on where to block and allow AI crawlers, depending on the content of their sites.
For example, from SiteGround’s documentation: “We block crawlers intended for AI model training to protect our clients’ website data and intellectual property and save resources.” So there are some nuances.
Hello,
Yes, there are indeed nuances when it comes to managing AI crawlers and protecting creative content. Your current approach is a solid start, and continuing your research on how different crawlers handle directives will help refine it further.
If you ever need help configuring or testing specific robots.txt rules with Rank Math, we’ll be happy to assist.
Hello,
Since we have not heard back from you in 15 days, we are assuming that you found the solution, and we are closing this support ticket.
If you still need assistance or any other help, please feel free to open a new support ticket, and we will be more than happy to assist.
Thank you.