Filter to eliminate non-canonical post versions from posts sitemap

#307631
  • Resolved Nc
    Rank Math free

    Hi,

    I use WP Multilang as multilanguage plugin. I configured it so that:
    – All site pages are translated in PT, EN, ES
    – All site blog posts are “triplicated” (not translated) in PT, EN, ES.

    The canonical tag that I generate manually in functions.php tells Google which version is the canonical post. For most posts the canonical is the PT version; for 5 of them it is the EN version. None is the ES version.

    These are the sitemaps generated by Rank Math:

    https://www.capitalinvest-group.com/sitemap_index.xml

    https://www.capitalinvest-group.com/pt/page-sitemap.xml

    https://www.capitalinvest-group.com/pt/post-sitemap.xml

    The page sitemap is OK, as it shows all versions of every page (all of which are translated).

    The post sitemap is wrong. It shows all versions of every post, but only 1 of them is the canonical. Therefore the posts sitemap contains both canonical and non-canonical versions of every post.

    Is there a filter I could use on functions.php to manually adjust the posts sitemap generated by rankmath so that the 2 non-canonical versions of each post are suppressed from the sitemap?

    This would mean:
    i) Eliminate the ES versions of all posts (none are in ES)
    ii) Eliminate the PT versions of 5 posts (5 posts are in EN)
    iii) Eliminate the EN version for the rest of the posts (all posts but 5 are in PT)

    Thanks in advance!

Viewing 15 replies - 1 through 15 (of 20 total)
  • Hello,

    Thanks for contacting us and sorry for any inconvenience that might have been caused due to that.

    Rank Math automatically excludes posts from the sitemap if you set the canonical URL from the advanced tab of the meta box. But since you are using a filter to change the canonical in the frontend, our plugin is still including the posts in the sitemap.

    You can use the following filter to exclude the URLs from your sitemap: https://rankmath.com/kb/filters-hooks-api-developer/#change-remove-post-url

    Hope this helps. Let us know if you need any other assistance.

    Nc
    Rank Math free

    Hi,

    I have added (as a test) this simple code to functions.php:

    add_filter( 'rank_math/sitemap/xml_post_url', function( $url, $post ) {
        if ( 'https://www.capitalinvest-group.com/pt/invest-in-brazil-ma-guide/' == $url ) {
            return false;
        }
        return $url;
    }, 10, 2 );

    Unfortunately, the referred post (https://www.capitalinvest-group.com/pt/invest-in-brazil-ma-guide/) is still on https://www.capitalinvest-group.com/pt/post-sitemap.xml

    What am I missing?

    How do I force rankmath to re-generate the post sitemap?

    Could you please provide me a complete example/ code of how to use this filter?

    Thanks in advance!

    Hello,

    Could you please try this modified filter?

    add_filter( 'rank_math/sitemap/xml_post_url', function( $url, $post ) {
        if ( strpos( $url, '/pt/invest-in-brazil-ma-guide/' ) !== false ) {
            return false;
        }
        return $url;
    }, 10, 2 );

    Also, ensure that your sitemaps are not being cached by following these steps:

    1. Ensure you’re using the latest version of your plugins/theme including the Rank Math plugin.

    2. Flush the Sitemap cache by following this video screencast:
    https://i.rankmath.com/xXXhDt

    3. Exclude the Sitemap files of the Rank Math plugin in your caching plugin. The cache could be via a plugin or from the server:
    https://rankmath.com/kb/exclude-sitemaps-from-caching/

    If the issue still persists, we might need to take a closer look at the settings. Please edit the first post on this ticket and include your WordPress & FTP logins in the designated Sensitive Data section.

    Please do take a complete backup of your website before sharing the information with us.
    Sensitive Data Section

    It is completely secure and only our support staff has access to that section. If you want, you can use the below plugin to generate a temporary login URL to your website and share that with us instead:

    https://wordpress.org/plugins/temporary-login-without-password/

    You can use the above plugin in conjunction with the WP Security Audit Log to monitor what changes our staff might make on your website (if any):

    https://wordpress.org/plugins/wp-security-audit-log/

    We really look forward to helping you.

    Thank you.

    Nc
    Rank Math free

    Hello,

    I have updated the sensitive data as requested. Can you please check further?

    Thank you.

    Nc
    Rank Math free

    I gave you access to the staging environment.

    You can test the staging environment at: https://staging-capitalinvest.kinsta.cloud/

    You can check the post sitemap in: https://staging-capitalinvest.kinsta.cloud/pt/post-sitemap.xml

    Please check functions.php line 512.

    The filter is not working. It should suppress non-canonical posts by returning false, but it is not doing so:

    add_filter( 'rank_math/sitemap/xml_post_url', function( $url, $post ) { return false; }, 10, 2 );

    Hello,

    Since you are removing the canonical completely from Rank Math, that filter can’t be used for this purpose.

    That’s because our plugin relies on our very own canonical to decide whether the posts should be included in the sitemap or not.

    Since the posts don’t have any canonical from Rank Math it includes all the posts no matter what the canonical because that can’t be properly detected.

    Alternatively, you can try the following filter to control the URLs before they are added into the sitemap: https://rankmath.com/kb/filters-hooks-api-developer/#filter-sitemap-item

    Here’s an example that removes any URL containing the word review in the URL:

    
    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object ) {
        // Compare strictly against false: strpos() returns 0 for a match at position 0.
        if ( strpos( $url['loc'], 'review' ) !== false && $type == 'post' ) {
            return false;
        }
        return $url;
    }, 10, 3 );
    

    Hope this helps solve your issues.

    Don’t hesitate to get in touch if you have any other questions.

    Nc
    Rank Math free

    Hi, your new filter does not work either.

    Could you please have a look in the staging environment at lines 512 and 521 of functions.php?

    Thanks

    Hello,

    You didn’t make any modifications to the filter to make this work.

    The filter needs to be edited according to your requirements for the type of URL you would like to remove.

    You are simply returning false on that filter without any custom logic to track the posts that you want to remove.

    We even shared an example of the usage of this filter above.

    Further customizations fall outside the scope of Rank Math support and if you require a high level of customization like this you might need to hire a developer.

    Don’t hesitate to get in touch if you have any other questions.

    Nc
    Rank Math free

    Please disregard the previous post (you can delete it) and please consider only this one:

    Hi,

    Thanks for taking the time to check functions.php

    I used the filter you suggested, but in a different, simpler way.

    There is an if/else structure, that works perfectly.

    What is inside the “if clause” only applies to non-canonical posts, including line 512:

    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object ) { return false; }, 10, 3 );

    What is inside the “else clause” only applies to pages and canonical posts, including line 521:

    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object ) { return $url; }, 10, 3 );

    $url is the current link

    It seems the new filter you suggested is also not working.

    Ex: if you check https://staging-capitalinvest.kinsta.cloud/pt/invest-in-brazil-ma-guide/ , you will see on the top left side of the post that it is written: “NON CANONICAL POST!!!” (as written by the wp_head action in the “if clause”), but on https://staging-capitalinvest.kinsta.cloud/pt/post-sitemap.xml, this non-canonical post is still there.

    Other examples of this:
    https://staging-capitalinvest.kinsta.cloud/es/invest-in-brazil-ma-guide/
    https://staging-capitalinvest.kinsta.cloud/en/valuation-avaliar-empresa/
    https://staging-capitalinvest.kinsta.cloud/es/valuation-avaliar-empresa/

    In summary:
    IN THE “IF CLAUSE”, the wp_head action is working, but rank_math/sitemap/entry filter is NOT

    IN THE “ELSE CLAUSE”, the wp_head action is working, but rank_math/sitemap/entry filter is NOT

    Thanks in advance for your precious help!

    Hello,

    We understand that, but you can’t simply return false on that filter and expect it to remove the URL. You need to explicitly state the URL, similarly to what we’ve done in the example shared previously.

    The closure needs to check at least one of the conditions, either the $url or the $type, in order to evaluate and output the result.

    Even if it’s inside a filter that only retrieves canonical or non-canonical pages it requires the steps we mentioned.

    Don’t hesitate to get in touch if you have any other questions.

    Nc
    Rank Math free

    Thanks Miguel,

    Unfortunately, in the example you gave me the logic is inside the filter; I really need the logic outside.

    The idea is to use one version of the filter for non-canonical posts, and a different version of the filter for canonical posts & pages.

    The reason is that this is not the only filter I am using in this external logic (I also use another Rank Math filter precisely to add noindex to non-canonical posts, which by the way works perfectly).

    Would you be so kind as to show me how this filter can work with the logic outside? Ex: can we make a simple comparison inside the filter that is always TRUE?
    Ex: for non-canonical posts (line 512):

    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object ) { if ( $url == $url ) { return false; } return $url; }, 10, 3 );

    Ex: for canonical posts & pages (line 521):

    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object ) { if ( $url == $url ) { return $url; } return false; }, 10, 3 );

    I tried it and it still does NOT work.

    Any other suggestion?

    Could you please be so kind to help!

    Thanks again!

    Hello,

    The only other way you can try to achieve this is by creating a function with all your logic and calling the filter with that function, which is the same procedure we do to exclude hidden language posts in our sitemap.

    For this you would create a function with the logic that returns the variable $url and then call it inside the filter like so:

    
    add_filter('rank_math/sitemap/entry', 'my_custom_sitemap_logic', 10, 3);
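
    As a rough sketch only, such a named callback might look something like the following for the case discussed in this thread. The function name comes from the snippet above. Everything else is an assumption for illustration: the helper my_canonical_lang_for(), the slug list, the idea that each entry's loc carries the language code (pt, en, es) as its first path segment, and the idea that $object is the post object for post entries. None of this is confirmed against what WP Multilang actually passes to the filter.

```php
<?php
// Hypothetical list: slugs of the posts whose canonical version is EN.
// Every other post is assumed to be canonical in PT; none in ES.
const MY_EN_CANONICAL_SLUGS = array( 'invest-in-brazil-ma-guide' );

/**
 * Hypothetical helper: return the canonical language code for a post slug.
 */
function my_canonical_lang_for( $slug ) {
    return in_array( $slug, MY_EN_CANONICAL_SLUGS, true ) ? 'en' : 'pt';
}

/**
 * Callback for rank_math/sitemap/entry: drop non-canonical blog post versions.
 * Assumes $url['loc'] looks like https://example.com/{lang}/{slug}/.
 */
function my_custom_sitemap_logic( $url, $type, $object ) {
    // Assumption: $object is the post object, so post_type tells posts from pages.
    $is_blog_post = 'post' === $type
        && is_object( $object )
        && isset( $object->post_type )
        && 'post' === $object->post_type;

    if ( ! $is_blog_post || empty( $url['loc'] ) ) {
        return $url; // Not a blog post entry: leave it untouched.
    }

    $path     = trim( (string) parse_url( $url['loc'], PHP_URL_PATH ), '/' );
    $segments = explode( '/', $path );
    if ( count( $segments ) < 2 ) {
        return $url; // No language prefix found: keep the entry.
    }

    list( $lang, $slug ) = $segments;
    if ( ! in_array( $lang, array( 'pt', 'en', 'es' ), true ) ) {
        return $url; // Unknown first segment: keep the entry.
    }

    // Keep the entry only when its language is the canonical one for this slug.
    return ( my_canonical_lang_for( $slug ) === $lang ) ? $url : false;
}

// Register only when WordPress is loaded, so the file can also run standalone.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'rank_math/sitemap/entry', 'my_custom_sitemap_logic', 10, 3 );
}
```

    With this shape the decision lives in one place and runs once per sitemap entry, rather than depending on which page happened to be loading when add_filter was called.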
    

    This is the only other way to achieve this, and you’ll need to create the custom code yourself, as such a high level of customization falls outside the scope of Rank Math support.

    Don’t hesitate to get in touch if you have any other questions.

    Nc
    Rank Math free

    Dear Miguel,

    Thanks for your answer which, to be transparent, sounds very strange to me. It did not respond to my previous questions, nor did it solve any issue.

    Did you take 2 minutes to check lines 512 and 521 of functions.php? Did you read any of my questions?

    Could you please answer any of my questions?

    I have a couple of additional very very very simple questions to make a very very very simple test to check if any of your sitemap filters really works:

    A) This filter, when used for all pages (without any outside logic), should eliminate all entries from all sitemaps. Right? Why does it NOT do so?

    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object ) {
        return false;
    }, 10, 3 );

    B) This other filter, when used for all pages (without any outside logic), should also eliminate all entries from the posts sitemap. Right? Why does it NOT do so?

    add_filter( 'rank_math/sitemap/xml_post_url', function( $url, $post ) {
        return false;
    }, 10, 2 );

    Why do you need any logic inside if what you want to test precisely is if the filter works, and really eliminates all sitemap entries?

    Could you please give me a single example (ideally on my functions.php file but not necessarily) that proves that any of those 2 sitemap filters really works?

    I ask you this, because in all the tests we did, simple and complex, that apparently you did not bother to check, none of those above filters ever worked.

    I would be immensely grateful if you could devote 2 minutes to answering what I asked in this post and in the previous posts, so that we can solve the issue (your filters do not seem to work) and close this case.

    Hello,

    Please check your website functions.php file and see that we have removed the posts with the words invest-in-brazil-ma-guide from the post sitemap with this function:

    
    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object ) {
        // Compare strictly against false: strpos() returns 0 for a match at position 0.
        if ( strpos( $url['loc'], 'invest-in-brazil-ma-guide' ) !== false && $type == 'post' ) {
            return false;
        }
        return $url;
    }, 10, 3 );
    

    We also disabled the caching on the sitemaps so they generate correctly with this filter:

    
    add_filter( 'rank_math/sitemap/enable_caching', '__return_false');
    

    Don’t hesitate to get in touch if you have any other questions.

    Nc
    Rank Math free

    Thanks Miguel,

    You are very kind. You did indeed show me an example in which the 3 versions (EN, PT and ES) of a post are excluded from the sitemap.

    I made many many many tests, and I am still not sure that this filter can work for what we want to do (exclude specific language post version), because I have tried to exclude only one language version (and not the 3 of them, as in your example) and it did not work:

    On line 654 is your example.

    When I use:

    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object) {
       if( strpos( $url['loc'], '/pt/invest-in-brazil-ma-guide' ) && $type == 'post' ){ return false; }
       return $url;
    }, 10, 3 );

    The 3 links (PT, EN, ES) versions of invest-in-brazil-ma-guide post are suppressed from the sitemap

    When I use (please note “/en/invest-in…”) :

    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object) {
       if( strpos( $url['loc'], '/en/invest-in-brazil-ma-guide' ) && $type == 'post' ){ return false; }
       return $url;
    }, 10, 3 );

    or when I use (please note “/es/invest-in…”) :

    add_filter( 'rank_math/sitemap/entry', function( $url, $type, $object) {
       if( strpos( $url['loc'], '/es/invest-in-brazil-ma-guide' ) && $type == 'post' ){ return false; }
       return $url;
    }, 10, 3 );

    None of the 3 versions are suppressed.

    Any suggestion on how this filter could work to suppress only 1 version of a chosen post (and not all 3 of them)?

    Any other suggestion of filter we could use for this purpose?

    Please note that in the example you used:
    /en/invest-in-brazil-ma-guide is canonical
    /pt/invest-in-brazil-ma-guide is NOT canonical
    /es/invest-in-brazil-ma-guide is NOT canonical
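
    One way to investigate the behaviour described above might be to log what the filter actually receives for each entry, since the tests suggest the loc values seen inside the callback may not match the language-specific URLs that end up in the rendered sitemap. The sketch below is only a diagnostic under that assumption; the function name my_log_sitemap_entry is hypothetical, and it returns every entry unchanged.

```php
<?php
/**
 * Diagnostic callback (hypothetical name): log each entry's type and loc,
 * then return the entry unchanged so the sitemap itself is not affected.
 */
function my_log_sitemap_entry( $url, $type, $object ) {
    $loc = ( is_array( $url ) && isset( $url['loc'] ) ) ? $url['loc'] : '(no loc)';
    error_log( sprintf( '[sitemap-debug] type=%s loc=%s', $type, $loc ) );
    return $url;
}

// Register only when WordPress is loaded, so the file can also run standalone.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'rank_math/sitemap/entry', 'my_log_sitemap_entry', 10, 3 );
}
```

    The lines go to the PHP error log (or to wp-content/debug.log when WP_DEBUG_LOG is enabled). Comparing them with the URLs in post-sitemap.xml would show which language prefix, if any, is present at the moment the filter runs.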

    Hello,

    Since we did not hear back from you for 15 days, we are assuming that you found the solution. We are closing this support ticket.

    If you still need assistance or any other help, please feel free to open a new support ticket, and we will be more than happy to assist.

    Thank you.


The ticket ‘Filter to eliminate non-canonical post versions from posts sitemap’ is closed to new replies.