seems_utf8( string $str )

Checks whether a string appears to be UTF-8 encoded.


Description

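NOTE: This function also accepts sequences longer than 4 bytes (lead bytes of the form 111110bb and 1111110b), whereas valid UTF-8 byte sequences have a maximum length of 4. A true result therefore means the string fits the UTF-8 byte pattern, not that it is strictly valid UTF-8.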
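
Where strict validation matters, the note above suggests checking the string a second way. The following is a minimal sketch, not part of WordPress: `example_is_strict_utf8()` is a hypothetical helper name, and it assumes that either the mbstring extension or PCRE is available. It rejects a 5-byte sequence that the byte-pattern test in seems_utf8() would accept.

```php
<?php
// Hypothetical helper, not part of WordPress: strict UTF-8 validation.
function example_is_strict_utf8( string $str ): bool {
    if ( function_exists( 'mb_check_encoding' ) ) {
        return mb_check_encoding( $str, 'UTF-8' );
    }
    // Fallback: the "u" modifier makes PCRE validate the subject as UTF-8;
    // preg_match() returns false when the subject is malformed.
    return 1 === preg_match( '//u', $str );
}

// Lead byte 111110bb followed by four continuation bytes (5 bytes in total).
$five_byte = "\xF8\x88\x80\x80\x80";

var_dump( example_is_strict_utf8( "caf\xC3\xA9" ) ); // bool(true)
var_dump( example_is_strict_utf8( $five_byte ) );    // bool(false)
```

The fallback is there only so the sketch does not depend on mbstring; either branch alone would serve the same purpose.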


Parameters

$str

(string) (Required) The string to be checked


Return

(bool) True if $str fits a UTF-8 model, false otherwise.


Source

File: wp-includes/formatting.php

function seems_utf8( $str ) {
    mbstring_binary_safe_encoding();
    $length = strlen( $str );
    reset_mbstring_encoding();
    for ( $i = 0; $i < $length; $i++ ) {
        $c = ord( $str[ $i ] );
        if ( $c < 0x80 ) {
            $n = 0; // 0bbbbbbb
        } elseif ( ( $c & 0xE0 ) == 0xC0 ) {
            $n = 1; // 110bbbbb
        } elseif ( ( $c & 0xF0 ) == 0xE0 ) {
            $n = 2; // 1110bbbb
        } elseif ( ( $c & 0xF8 ) == 0xF0 ) {
            $n = 3; // 11110bbb
        } elseif ( ( $c & 0xFC ) == 0xF8 ) {
            $n = 4; // 111110bb
        } elseif ( ( $c & 0xFE ) == 0xFC ) {
            $n = 5; // 1111110b
        } else {
            return false; // Does not match any model
        }
        for ( $j = 0; $j < $n; $j++ ) { // n bytes matching 10bbbbbb follow ?
            if ( ( ++$i == $length ) || ( ( ord( $str[ $i ] ) & 0xC0 ) != 0x80 ) ) {
                return false;
            }
        }
    }
    return true;
}
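
The bit masks in the source classify each lead byte by its high bits and derive how many continuation bytes (10bbbbbb) must follow. As a rough illustration of that step only, the sketch below uses a hypothetical helper, `example_continuation_count()`, which is not part of WordPress. Unlike seems_utf8(), it assumes the 4-byte limit mentioned in the description and so treats the 5- and 6-byte lead patterns as invalid.

```php
<?php
// Hypothetical helper, not part of WordPress: number of continuation bytes
// implied by a lead byte, or null if the byte cannot start a sequence.
function example_continuation_count( int $c ): ?int {
    if ( $c < 0x80 ) {
        return 0; // 0bbbbbbb
    }
    if ( ( $c & 0xE0 ) === 0xC0 ) {
        return 1; // 110bbbbb
    }
    if ( ( $c & 0xF0 ) === 0xE0 ) {
        return 2; // 1110bbbb
    }
    if ( ( $c & 0xF8 ) === 0xF0 ) {
        return 3; // 11110bbb
    }
    return null; // Continuation byte (10bbbbbb) or a 5-/6-byte lead pattern.
}

var_dump( example_continuation_count( ord( 'A' ) ) ); // int(0)
var_dump( example_continuation_count( 0xC3 ) );       // int(1), first byte of "é"
var_dump( example_continuation_count( 0xE2 ) );       // int(2), first byte of "€"
var_dump( example_continuation_count( 0xF0 ) );       // int(3)
var_dump( example_continuation_count( 0x80 ) );       // NULL
var_dump( example_continuation_count( 0xF8 ) );       // NULL
```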

Changelog

Version Description
1.2.1 Introduced.

