Hello,
I have checked the options, cleared the cache, and checked the exceptions. If you still have the issue, it might be the cache on Cloudflare, since the HTML error Google shows occurs when the sitemap is cached.
Looking forward to helping you.
Hey Alberto,
Thank you for the assistance and for the suggestion about the Cloudflare cache. A couple of weeks ago I already added some exclusions for the sitemap to Cloudflare; however, I don’t know if this is enough to make sure Google Search Console won’t give the error anymore. Would you happen to know?
Above is an image of how the Cloudflare rules are currently configured.
I’d love to hear from you,
Kind regards,
Richard on behalf of Smart Options
Hello,
That Cloudflare rule is perfect; you shouldn’t have any issue (at least not from Cloudflare caching the sitemaps).
In fact, I just checked your sitemap_index.xml file, and the Cloudflare response doesn’t contain ‘HIT’ in the cf-cache-status header, so it is not serving a cached result.
Looking forward to helping you.
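For anyone who wants to repeat this check, here is a minimal sketch in Python. It only interprets headers that have already been fetched, and it treats ‘HIT’ as the one value that means a cached copy was served, as described above; Cloudflare can report other values as well. The function names and the sample header values are made up for illustration.

```python
def cache_status(headers):
    """Return the cf-cache-status value in upper case, or None if the header is absent.

    `headers` is any mapping of header names to values; the lookup is
    case-insensitive because HTTP header names are case-insensitive.
    """
    for name, value in headers.items():
        if name.lower() == "cf-cache-status":
            return value.strip().upper()
    return None


def served_from_cache(headers):
    """True only when the response reports a cache HIT."""
    return cache_status(headers) == "HIT"


# Sample headers (illustrative values, not copied from the real site).
sample = {"Content-Type": "text/xml; charset=UTF-8", "CF-Cache-Status": "DYNAMIC"}
print(cache_status(sample), served_from_cache(sample))
```

A response without the header at all gives `None`, which is worth telling apart from an explicit non-HIT value when comparing results over time.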
Hi Alberto,
Thank you for checking that, it indeed makes sense that Cloudflare is not doing anything with the sitemaps now, so we can rule that out. We’ve also contacted WP Rocket as we stated in the first post and they got back to us as well. Their answer was that they basically don’t do anything with the sitemap except use it to preload pages.
We still don’t know what is causing the intermittent vanishing of the sitemap. Currently Google Search Console notes that it found the sitemap index including all the sub sitemaps and is indexing everything correctly. It seems that once in a while it just disappears and Google Search Console can’t find it either, resulting in the massive loss of keywords.
Do you have any more suggestions on where we could look for a solution?
Hello,
Thank you for getting in touch with us.
I checked your main sitemap URL and it seems like it is still being cached by Cloudflare. Please check your exclusion rules again and ensure you have cleared your cache after saving the exclusion rules.
If you have done this and you are able to view your sitemap without any problem, then Rank Math is outputting your sitemap correctly and the error must be coming from Google. You can try a 3rd party sitemap plugin to confirm. Just make sure you have disabled the Sitemap module of Rank Math in WordPress Dashboard > Rank Math > Dashboard when using a 3rd party sitemap plugin.
If the other plugin works, please do let us know.
Thank you.
Hi Michael,
Thank you for the detailed response. I’ve checked the Cloudflare rules again and they seem to be correct. I tested the following:
So it seems that clearing the Cloudflare cache is causing this issue. I’m pretty sure the page rules in Cloudflare are correct, but I’m not sure why it still shows the Cloudflare cache timer in the header.
What could possibly cause this?
Kind regards,
Richard on behalf of Smart Options
Hello,
Could you please check with your host?
They might also be caching your sitemaps.
Looking forward to helping you. Thank you.
Dear Michael,
Thank you for the suggestion. The host doesn’t cache anything by default, so that’s something we can cross off the list. Compression is also disabled server-side.
I’d love to hear from you.
Kind regards,
Richard
Hello,
It looks like the sitemap is still being cached:
However, your sitemap is being shown as an XML sitemap, so we are not entirely certain why Google sometimes reads it as HTML.
We will need to check when that happens as right now, it seems to be working correctly. Please do ping us when the issue returns and do not make any changes until we check.
We noticed you had turned off the sitemaps feature in Rank Math and were using another sitemap plugin. We re-enabled it to check with Rank Math, then reverted to the previous setup.
Hope that helps and please do not hesitate to let us know if you need our assistance with anything else.
Hi support,
Currently our sitemap is not working. The page https://smartoptions.io/sitemap_index.xml forwards to the homepage.
Hi Uzair,
Thank you for the response.
Currently the sitemap doesn’t work. When I go to https://smartoptions.io/sitemap_index.xml it works. But when I go to https://smartoptions.io/post-sitemap1.xml it forwards to the homepage. I guess this is why Google says it’s HTML. Could you take a look?
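The symptom described here (a sitemap URL that ends up on the homepage and comes back as HTML) can be written down as a small check. This is only a sketch in Python: the function name is hypothetical, the accepted content types are an assumption about what a sitemap is normally served as, and the values in the example call are illustrative rather than captured from the site.

```python
from urllib.parse import urlsplit


def diagnose_sitemap_response(requested_url, final_url, status, content_type):
    """Return a list of reasons why a response does not look like an XML sitemap.

    An empty list means nothing suspicious was found.
    """
    problems = []
    if status != 200:
        problems.append(f"unexpected status {status}")
    # A different path after redirects means we did not get the file we asked for.
    if urlsplit(final_url).path != urlsplit(requested_url).path:
        problems.append(f"redirected to {final_url}")
    # Drop parameters such as "; charset=UTF-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ("text/xml", "application/xml"):
        problems.append(f"content type is {media_type or 'missing'}, not XML")
    return problems


# Illustrative values matching the behaviour described above.
print(diagnose_sitemap_response(
    "https://smartoptions.io/post-sitemap1.xml",
    "https://smartoptions.io/",
    200,
    "text/html; charset=UTF-8",
))
```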
Kind regards,
Richard
Hello,
Your sitemap URLs are currently loading perfectly fine from my end.
It seems your sitemap is being preloaded from your cache. This causes your sitemap to result in a 404 page, which then redirects to the homepage according to the fallback behavior settings.
Since the regex rule doesn’t seem to be working, you can exclude each URL individually and check if that stops the URLs from being cached.
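One way to see whether a regex rule covers every sitemap URL before falling back to individual exclusions is to test it offline. The sketch below is in Python and uses a hypothetical pattern, since the actual rule is not shown in this thread; the caching tool that holds the rule may also use a different pattern syntax, so treat this purely as an illustration of the idea.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern (an assumption, not the rule used on the site):
# matches /sitemap.xml, /sitemap_index.xml and /<type>-sitemap<N>.xml.
SITEMAP_PATH = re.compile(r"^/(?:[\w-]+-)?sitemap(?:_index|\d*)\.xml$")


def not_covered(urls, pattern=SITEMAP_PATH):
    """Return the URLs whose path is NOT matched by the exclusion pattern."""
    return [u for u in urls if not pattern.match(urlsplit(u).path)]


urls = [
    "https://smartoptions.io/sitemap_index.xml",
    "https://smartoptions.io/post-sitemap1.xml",
]
# An empty list means the pattern would exclude every listed sitemap URL.
print(not_covered(urls))
```

Any URL that shows up in the result is one the pattern misses, and is a candidate for an individual exclusion.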
Looking forward to helping you. Thank you.
Hi Michael,
Thank you, that makes sense. I’ve excluded all the URLs now. Is there any way we can test this, do you think?
Kind regards,
Richard
Hello,
You can test your sitemap URLs with this tool: https://websniffer.cc/ to get and view the HTTP request and response headers for a URL. For more details on your URL, check under Response headers in detail.
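If you prefer to run the same kind of check from a script, the following is a minimal sketch using only Python’s standard library. The function names, the User-Agent string, and the list of headers to print are assumptions made for illustration; some servers answer HEAD requests differently from GET, so the result may not match a browser exactly.

```python
import urllib.error
import urllib.request


def fetch_headers(url, timeout=10.0):
    """Send a HEAD request and return (status, final_url, headers).

    Redirects are followed, so final_url may differ from url.
    Header names are lower-cased for easy lookup.
    """
    request = urllib.request.Request(
        url,
        method="HEAD",
        headers={"User-Agent": "sitemap-header-check/0.1"},  # hypothetical name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status, final_url, raw = response.status, response.url, response.headers
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses still carry headers worth looking at.
        status, final_url, raw = error.code, error.url, error.headers
    return status, final_url, {name.lower(): value for name, value in raw.items()}


def report(status, final_url, headers,
           interesting=("content-type", "cf-cache-status", "age", "cache-control")):
    """Format the status, final URL and a few cache-related headers as text."""
    lines = [f"status: {status}", f"final url: {final_url}"]
    for name in interesting:
        lines.append(f"{name}: {headers.get(name, '(not present)')}")
    return "\n".join(lines)


# Usage (performs a real network request, so it is left commented out):
# print(report(*fetch_headers("https://smartoptions.io/sitemap_index.xml")))
```

Running it a few times after clearing the cache shows whether the content type, final URL, or cf-cache-status value changes between requests.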
Hope that helps. If you have any further question(s), please let us know.
Thank you.
Hello,
I have updated the sensitive data as requested. Can you please check further?
Thank you.