-
Dear Team,
I have 10K not-indexed URLs and 1.4K indexed URLs. Almost all of the 10K are parameter URLs which I don’t want Google to crawl.
Ex. https://example.com/category/stock_status=out_of_stock%2Cin_stock&view_mode=grid&et_columns-count=5
?stock_status, ?add_to_cart, and many such parameters are getting indexed. I have blocked them with robots.txt, but pages are still being added in Search Console with the "Blocked by robots.txt" error message.
Could you please help me stop them from getting crawled?
-
Hello,
Thank you for reaching out to us with your concern about stopping specific parameters from being crawled by Google. To prevent certain parameters from being indexed by Google, you should not rely solely on the robots.txt file. The robots.txt file is primarily used to control crawling access, not indexing.
To address this issue effectively, you should use the "No Index Robots Meta" feature provided by Rank Math. This feature allows you to specify which URLs should not be indexed by search engines, including those with specific parameters like the ones you mentioned (?stock_status, ?add_to_cart, etc.).
Here’s how you can prevent these URLs from being indexed:
- Identify the URLs with Parameters: First, identify the URLs with parameters that you don’t want Google to index.
- Add No Index Robots Meta: In your WordPress admin area, navigate to Rank Math SEO → General Settings → Edit robots.txt. If you don’t see this option, ensure you are using the Advanced Mode in Rank Math.
- Use No Index Robots Meta: Instead of blocking these URLs in robots.txt, add a No Index Robots Meta tag to these specific URLs using Rank Math. This will instruct search engines not to index these pages.
If you need further assistance or detailed instructions on how to implement the No Index Robots Meta using Rank Math, please let us know. You can also provide specific URLs with parameters that need to be excluded from indexing for more tailored guidance.
We are here to help you with any additional questions or concerns you may have.
Thank you for your understanding and cooperation.
Sources:
How to Fix the "Indexed, though blocked by robots.txt" Error » Rank Math
Common robots.txt Issues & How to Fix Them » Rank Math
How to Fix ‘Blocked by robots.txt’ Error in Google Search Console » Rank Math
As mentioned in the Rank Math blog: Preventing a page from being crawled ≠ Preventing a page from being indexed.
I don’t want Google to crawl my parameter URLs. I already have noindex set for my parameter URLs.
Hello,
If Google has already indexed the page – which it has – the rule in the robots.txt file will not remove the page from Google. You will have to use the Google Removal tool to remove the URL from Google’s index.
Then you can use the following filter on your website to set the pages to noindex so that Google doesn’t index them on SERPs:

add_filter( 'rank_math/frontend/robots', function( $robots ) {
    $url = home_url( $_SERVER['REQUEST_URI'] );
    // Check if the URL contains either 'stock_status' or 'add_to_cart' query parameters
    if ( strpos($url, '?stock_status=') !== false || strpos($url, '?add_to_cart') !== false ) {
        $robots["index"] = 'noindex';
        $robots["follow"] = 'nofollow';
    }
    return $robots;
});

Here’s how you can add a filter/hook to your WordPress site: https://rankmath.com/kb/wordpress-hooks-actions-filters/
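One limitation of the filter above is that it looks for the literal text `?stock_status=`, so it only fires when that parameter comes first in the query string. A minimal sketch of a variant that parses the query string instead is shown below. The helper name `url_has_blocked_param` is hypothetical (it is not a Rank Math or WordPress function), and the parameter names are taken from this ticket; only the `rank_math/frontend/robots` filter itself comes from the reply above.

```php
<?php
// Hypothetical helper, not part of Rank Math or WordPress: returns true when
// the request URI carries any of the given query parameters, in any position.
function url_has_blocked_param(string $requestUri, array $blocked): bool
{
    $query = parse_url($requestUri, PHP_URL_QUERY);
    if (!is_string($query) || $query === '') {
        return false; // no query string, or an unparsable URI
    }

    parse_str($query, $params); // decode the query string into an array

    foreach ($blocked as $name) {
        if (array_key_exists($name, $params)) {
            return true;
        }
    }
    return false;
}

// WordPress wiring; skipped when the file is run outside WordPress.
if (function_exists('add_filter')) {
    add_filter('rank_math/frontend/robots', function ($robots) {
        $uri = $_SERVER['REQUEST_URI'] ?? '';
        // Parameter names assumed from this ticket; adjust to your site.
        if (url_has_blocked_param($uri, ['stock_status', 'add_to_cart'])) {
            $robots['index']  = 'noindex';
            $robots['follow'] = 'nofollow';
        }
        return $robots;
    });
}
```

With this variant, a URL such as `/shop?view_mode=grid&stock_status=in_stock` is also set to noindex, which the `strpos` check on `?stock_status=` would miss.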
Hope that helps and please don’t hesitate to let us know if you have any other questions.
Thank you for choosing Rank Math!
Dear Adetayo,
Thank you for your prompt response. I appreciate your assistance. I have already set the pages to “noindex” and “nofollow” to prevent them from appearing in search results. However, I am concerned about preventing Google from crawling these pages altogether in Google Search Console.
Could you please suggest a solution to ensure these pages are not crawled by Google?
Thank you for your support.
Hello,
If you want to block search engines from crawling those URLs with parameters, you can disallow them using your robots.txt:

Disallow: *?stock_status=*
Disallow: *?add_to_cart*

Here’s a guide on how to edit your robots.txt using Rank Math: https://rankmath.com/kb/how-to-edit-robots-txt-with-rank-math/
Looking forward to helping you.
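For reference, the two rules above only match when the parameter comes directly after the `?`. A possible fuller fragment, offered as a sketch rather than a tested configuration for this site, adds a `User-agent` group and a second rule per parameter for the case where it follows other parameters. The parameter names are the ones from this ticket; the commented-out catch-all at the end is an assumption about what "any parameter" could look like and may block more than intended.

```
User-agent: *
# Parameter is the first one in the query string
Disallow: /*?stock_status=
Disallow: /*?add_to_cart
# Parameter follows other parameters
Disallow: /*&stock_status=
Disallow: /*&add_to_cart

# Optional: block every URL that contains a query string.
# This also blocks parameter URLs you may want crawled, so use with care.
# Disallow: /*?
```

Note that a URL blocked this way is not fetched, so Google cannot see a noindex tag placed on it. This is the crawl versus index distinction quoted earlier in the thread.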
Dear Reinelle,
Thank you for your prompt response. I appreciate your assistance. I’ve already followed the robots.txt step, but Search Console shows an error that xyz.com/category?fliter=new is blocked by robots.txt. I am concerned about preventing Google from crawling these pages altogether in Google Search Console. I don’t want Google to crawl any parameter.
I don’t want to see them in Search Console because they are growing rapidly, by more than 10-20 links every day.
Hello,
You should use the URL inspection tool of your GSC account to inspect those URLs and check the referring page where they are coming from.
Once you know their referring page, you can remove or update them so Google won’t discover those kinds of URLs.
Hope that helps.
Dear Reinelle,
Thank you for your prompt response. I appreciate your assistance. I know where these URLs are coming from; I think my query has not been understood.
/shop?filter=high_to_low?sortby_
I guess you are aware of such parameters. I want Google to stop crawling them.
I have already blocked them with robots.txt, but they are still being crawled by Google.
Hello,
Since you’ve already disallowed the pages in your robots.txt file, please allow Google some time to recrawl your site to reflect the changes on Search Console.
If you still face the issue, please share some exact affected URLs in the sensitive data section so that we can check the issue further for you.
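While waiting for a recrawl, it can help to check locally whether an affected URL is actually covered by a Disallow rule. The sketch below is a simplified approximation of wildcard matching (`*` for any run of characters, a trailing `$` for end of URL); it is not Google's parser, and the function name `robots_pattern_matches` is hypothetical.

```php
<?php
// Hypothetical helper: simplified check of a robots.txt path pattern against
// a URL path plus query string. Supports "*" and a trailing "$" only.
function robots_pattern_matches(string $pattern, string $pathAndQuery): bool
{
    // A trailing "$" anchors the pattern to the end of the URL.
    $anchored = substr($pattern, -1) === '$';
    if ($anchored) {
        $pattern = substr($pattern, 0, -1);
    }

    // Escape regex metacharacters, then turn each escaped "*" into ".*".
    $regex = str_replace('\*', '.*', preg_quote($pattern, '#'));

    // Rules match from the start of the path.
    return preg_match('#^' . $regex . ($anchored ? '$' : '') . '#', $pathAndQuery) === 1;
}

// Example URLs are illustrative, based on the parameters in this ticket.
var_dump(robots_pattern_matches('*?stock_status=*', '/shop?stock_status=in_stock'));
var_dump(robots_pattern_matches('*?stock_status=*', '/shop?view_mode=grid&stock_status=in_stock'));
var_dump(robots_pattern_matches('/*&stock_status=', '/shop?view_mode=grid&stock_status=in_stock'));
```

Under this approximation, the rule `*?stock_status=*` covers the first URL but not the second, where `stock_status` follows another parameter; the `&` form of the rule covers that case.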
Let us know how it goes. Looking forward to helping you.
Thank you.
Hello,
Since we did not hear back from you for 15 days, we are assuming that you found the solution. We are closing this support ticket.
If you still need assistance or any other help, please feel free to open a new support ticket, and we will be more than happy to assist.
Thank you.
-
The ticket ‘stop parameters to crawl in google search console wordpress’ is closed to new replies.