Excluding certain URL paths from Site Map

#935878
  • Resolved Maderightmedia
    Rank Math pro
    SEO Course

    Hello,

    First off, I love Rank Math. Thank you for making such a great plugin!

    I have a rather large website with several hundred pages, and it is using an automated translation plugin, TranslatePress.

    The plugin serves a translated version of the website with the language code in the URL. For example, https://homepage.com/es/ would be the Spanish version. This has been fantastic for SEO; however, it’s creating a massive sitemap.

    We would like the Spanish version of the site (anything with /es/ in the path) to be stored in the sitemap, but we would not like the other languages to be stored there. Getting SEO from those languages isn’t relevant for our community, and we need to trim down our sitemap.

    Is there a way to exclude content based on the URL path? Perhaps using some REGEX?

    The website is https://cdh.idaho.gov

    Thank you!

Viewing 7 replies - 1 through 7 (of 7 total)
  • Hello,

    Thank you for contacting us and bringing your concern to our attention.

    We deeply apologize for the unexpected delay in response.

    Can you please confirm if you want to include only the URLs with the /es/ slug on your sitemap? If so, you can use the following filter on your website and see if that works for you:

    add_filter( 'rank_math/sitemap/entry', function ( $url, $type, $object ) {
        // Drop any entry whose URL does not contain the /es/ slug.
        if ( isset( $url['loc'] ) && strpos( $url['loc'], '/es/' ) === false ) {
            return false;
        }

        // Keep entries that contain /es/.
        return $url;
    }, 10, 3 );
    

    Please note that it will be applied to your post, page, and category sitemap.

    Here’s how you can add a filter/hook to your WordPress site: https://rankmath.com/kb/wordpress-hooks-actions-filters/

    Let us know how it goes. Looking forward to helping you.

    Maderightmedia
    Rank Math pro
    SEO Course

    Hello Rakibuzzaman,

    Thank you very much for getting back to me, and sorry for my delayed response. The alert went into my junk mail.

    Actually, I would like to INCLUDE all of the default URLs (which don’t have an extra slug), and the Spanish URLs, which have the extra slug /es/ added to the root URL.

    I would like to EXCLUDE all of the other languages which have the following extra slugs:

    /zh/
    /fr/
    /de/
    /pt/
    /ru/
    /uk/

    Here are examples of what the pages do with their respective slugs:

    https://cdh.idaho.gov/health/
    https://cdh.idaho.gov/es/health/
    https://cdh.idaho.gov/de/health/

    Thank you!

    Hello,

    In this case, you can replace the filter with the following one and see if that works for you:

    add_filter( 'rank_math/sitemap/entry', function ( $url, $type, $object ) {
        // List of slugs to exclude
        $exclude_slugs = [ '/zh/', '/fr/', '/de/', '/pt/', '/ru/', '/uk/' ];
    
        // Check if the URL contains any of the slugs to exclude
        if ( isset( $url['loc'] ) ) {
            foreach ( $exclude_slugs as $slug ) {
                if ( strpos( $url['loc'], $slug ) !== false ) {
                    return false; 
                }
            }
        }
    
        return $url; 
    }, 10, 3 );
    

    Let us know how it goes. Looking forward to helping you.

    Thank you.
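
The filter above matches a slug anywhere in the URL, so a page such as /health/de/info/ would also be dropped. A minimal sketch of a stricter variant follows, which only looks at the first path segment. It assumes the site is served from the domain root (not a subdirectory install), and `example_has_language_prefix` is a hypothetical helper name, not part of Rank Math or TranslatePress.

```php
<?php
/**
 * Hypothetical helper: true when the first path segment of $loc is one of
 * the given language codes, e.g. https://example.com/de/health/ for 'de'.
 */
function example_has_language_prefix( string $loc, array $codes ): bool {
    $path = parse_url( $loc, PHP_URL_PATH );
    if ( ! is_string( $path ) ) {
        return false; // No path component, or the URL could not be parsed.
    }

    // First path segment: "/de/health/" gives "de", "/" gives "".
    $segments = explode( '/', trim( $path, '/' ) );

    return in_array( strtolower( $segments[0] ), $codes, true );
}

// Only register the filter when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'rank_math/sitemap/entry', function ( $url, $type, $object ) {
        // Language codes to exclude, taken from the list in this thread.
        $excluded = [ 'zh', 'fr', 'de', 'pt', 'ru', 'uk' ];

        if ( isset( $url['loc'] ) && example_has_language_prefix( $url['loc'], $excluded ) ) {
            return false; // Drop the entry from the sitemap.
        }

        return $url; // Default-language and /es/ URLs pass through.
    }, 10, 3 );
}
```

With this variant, https://cdh.idaho.gov/de/health/ would be dropped, while https://cdh.idaho.gov/health/ and https://cdh.idaho.gov/es/health/ would be kept. Like the filter above, it can only act on entries that actually pass through `rank_math/sitemap/entry`.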

    Luis Mendoza
    Rank Math free

    Hi.

    Did you manage to make this work?
    I’m trying that too but the URLs with the slugs I’m trying to filter out are still appearing in the sitemap xml file.

    add_filter( 'rank_math/sitemap/entry', function ( $url, $type, $object ) {
        // List of slugs to exclude
        $exclude_slugs = [ '/en/', '/de/' ];
    
        // Check if the URL contains any of the slugs to exclude
        if ( isset( $url['loc'] ) ) {
            foreach ( $exclude_slugs as $slug ) {
                if ( strpos( $url['loc'], $slug ) !== false ) {
                    return false; 
                }
            }
        }
    
        return $url; 
    }, 10, 3 );

    Hello @mendozal,

    Thank you for your query and we are so sorry about the trouble this must have caused.

    After applying the filter code, please follow the steps below:

    1. Flush the Sitemap cache by following this video screencast:
    https://i.rankmath.com/pipRDp

    2. Exclude the Sitemap files of the Rank Math plugin in your caching plugin. The cache could be via a plugin or from the server. For plugins or Cloudflare, please follow this article:
    https://rankmath.com/kb/exclude-sitemaps-from-caching/

    3. Apply the following filter code to your site.

    
    add_filter( 'rank_math/sitemap/enable_caching', '__return_false' );

    Here’s how you can add a filter to your WordPress site: https://rankmath.com/kb/wordpress-hooks-actions-filters/

    Let us know how that goes.

    Looking forward to helping you.

    Luis Mendoza
    Rank Math free

    I did all that but it didn’t work.

    Doing a little troubleshooting, I noticed that URLs with /en/, /de/, or any other language slug are not present in the $url array for this filter. It only receives the main URLs without the language slugs.

    This is why they aren’t filtered out in the first place.

    I’m using TranslatePress; maybe it has something to do with how this plugin works, since pages don’t exist as separate pages created for each language like in WPML. I don’t know how they’re added to the sitemap.
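
One way to repeat the troubleshooting described above is to log every entry that reaches the filter. This is a rough sketch only: `example_describe_sitemap_entry` is a hypothetical helper name, and where `error_log` writes depends on the server configuration. The sitemap has to be regenerated (with its cache flushed or disabled) for the filter to run.

```php
<?php
/**
 * Hypothetical helper: formats one log line for a sitemap entry.
 * $url may be false if an earlier filter already dropped the entry.
 */
function example_describe_sitemap_entry( $url, $type ): string {
    $loc = ( is_array( $url ) && isset( $url['loc'] ) ) ? $url['loc'] : '(no loc)';

    return sprintf( '[sitemap-debug] type=%s loc=%s', (string) $type, $loc );
}

// Only register the filter when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    // Priority 99 so the logger runs after filters registered at priority 10.
    add_filter( 'rank_math/sitemap/entry', function ( $url, $type, $object ) {
        error_log( example_describe_sitemap_entry( $url, $type ) );

        return $url; // Pass the entry through unchanged.
    }, 99, 3 );
}
```

If the translated URLs show up in the sitemap XML but never in the log, they are presumably being added at some other point than this filter, which would match the observation above.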

    Hello,

    Since your TranslatePress setup includes both subdirectory-based and regular slug-based translations, the URLs may not always follow a strict pattern.

    Can you share some of the URLs that were not excluded after implementing the previous filter?

    We look forward to hearing back from you.

    Hello,

    Since we did not hear back from you for 15 days, we are assuming that you found the solution. We are closing this support ticket.

    If you still need assistance or any other help, please feel free to open a new support ticket, and we will be more than happy to assist.

    Thank you.


The ticket ‘Excluding certain URL paths from Site Map’ is closed to new replies.