Sanitize HTML classes in WordPress

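A few days ago, I found out that the WordPress core function sanitize_html_class() does not necessarily return a class name that is valid as a CSS identifier. For example, 123-class, --class, and -1-class are invalid, because an identifier cannot start with a digit, two hyphens, or a hyphen followed by a digit, but sanitize_html_class() will let them pass.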
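To see why these values slip through, here is a minimal sketch of the two stripping steps that core applies before the filter runs, as far as I understand them. The function name core_like_sanitize is hypothetical and the body is an approximation for illustration, not a copy of the core source.

```php
<?php
// Hypothetical stand-in for the stripping done inside sanitize_html_class().
// An approximation for illustration only; it does not call WordPress.
function core_like_sanitize(string $class): string
{
    // Drop percent-encoded octets such as %20.
    $sanitized = (string) preg_replace('/%[a-fA-F0-9]{2}/', '', $class);

    // Keep only A-Z, a-z, 0-9, underscore, and hyphen.
    return (string) preg_replace('/[^A-Za-z0-9_-]/', '', $sanitized);
}

foreach (['123-class', '--class', '-1-class'] as $class) {
    // Every character is in the allowed set, so nothing is removed.
    var_dump(core_like_sanitize($class) === $class); // bool(true)
}
```

Both steps only look at which characters are present, not at where they appear, so a leading digit or a double hyphen is never caught.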

Based on the grammar in the current Selectors Level 3 specification, you can add the following filter callback. It could be shortened, but I tried to keep it as close as possible to the Flex token definitions in the spec.

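add_filter('sanitize_html_class', static function (string $sanitized, string $class, string $fallback): string {
    // Token definitions from the Selectors Level 3 grammar.
    // A literal backslash in the pattern needs four backslashes in a single-quoted PHP string.
    $nonascii = '[^\0-\177]';
    $unicode  = '\\\\[0-9a-f]{1,6}(?:\r\n|[ \n\r\t\f])?';
    $escape   = sprintf('%s|\\\\[^\n\r\f0-9a-f]', $unicode);
    $nmstart  = sprintf('[_a-z]|%s|%s', $nonascii, $escape);
    $nmchar   = sprintf('[_a-z0-9-]|%s|%s', $nonascii, $escape);

    // Unanchored match against the raw $class (not $sanitized): the first valid identifier wins.
    // Escapes and non-ASCII characters survive, so still escape the result (e.g. esc_attr()) on output.
    if (!preg_match(sprintf('/-?(?:%s)(?:%s)*/i', $nmstart, $nmchar), $class, $matches)) {
        return $fallback;
    }

    return $matches[0];
}, 10, 3);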
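If you want to try the pattern without a WordPress install, here is a rough standalone sketch built from the same token definitions. The function name first_css_identifier is hypothetical. In this sketch, the backslash that starts an escape is written as four backslashes inside the single-quoted strings, so that the regex engine receives a literal backslash.

```php
<?php
// Hypothetical standalone helper mirroring the filter callback, so it runs without WordPress.
function first_css_identifier(string $class, string $fallback = ''): string
{
    $nonascii = '[^\0-\177]';
    $unicode  = '\\\\[0-9a-f]{1,6}(?:\r\n|[ \n\r\t\f])?';
    $escape   = sprintf('%s|\\\\[^\n\r\f0-9a-f]', $unicode);
    $nmstart  = sprintf('[_a-z]|%s|%s', $nonascii, $escape);
    $nmchar   = sprintf('[_a-z0-9-]|%s|%s', $nonascii, $escape);

    // Unanchored: returns the first substring that forms a valid identifier.
    $pattern = sprintf('/-?(?:%s)(?:%s)*/i', $nmstart, $nmchar);

    return preg_match($pattern, $class, $matches) === 1 ? $matches[0] : $fallback;
}

var_dump(first_css_identifier('123-class'));       // string(6) "-class"
var_dump(first_css_identifier('--class'));         // string(6) "-class"
var_dump(first_css_identifier('-1-class'));        // string(6) "-class"
var_dump(first_css_identifier('my-class_1'));      // string(10) "my-class_1"
var_dump(first_css_identifier('123', 'fallback')); // string(8) "fallback"
```

Note that the pattern is not anchored, so an invalid value is trimmed down to the first valid identifier it contains instead of being rejected. All three problem values end up as -class, and the fallback is only returned when no identifier can be found at all.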

You can test the RegEx here: https://regex101.com/r/7Kx506/1