mb_strlen

(PHP 4 >= 4.0.6, PHP 5, PHP 7)

mb_strlen — Get string length

Description

mixed mb_strlen ( string $str [, string $encoding = mb_internal_encoding() ] )

Gets the length of a string.

Parameters

str: The string being checked for length.
encoding: The encoding parameter is the character encoding. If it is omitted, the internal character encoding value will be used.

Return Values

Returns the number of characters in string str having character encoding encoding. A multi-byte character is counted as 1.

Returns FALSE if the given encoding is invalid.

Warning

This function may return Boolean FALSE, but may also return a non-Boolean value which evaluates to FALSE. Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.

User Contributed Notes

Yzmir Ramirez

5 years ago


If you are unsure about what $encoding can be set to, here's a full list of all the encodings supported by this extension:

http://www.php.net/manual/en/mbstring.supported-encodings.php

drake127

8 years ago


Speed of mb_strlen varies a lot according to specified character set.

If you need length of string in bytes (strlen cannot be trusted anymore because of mbstring.func_overload) you should use <?php mb_strlen($string, '8bit'); ?>.
It's the fastest way (still a way slower than strlen, though) to determine byte length of string. Other single byte character sets (ASCII, ISO-8859-1, ...) are several times slower than 8bit.

koala at example dot com

8 years ago


Just did a little benchmarking (1.000.000 times with lorem ipsum text) on the mbs functions

especially mb_strtolower and mb_strtoupper are really slow (up to 100 times slower compared to normal functions). Other functions are alike-ish, but sometimes up to 5 times slower.

just be cautious when using mb_ functions in high frequented scripts.

# test runs: 1000000
# benchmarking strlen vs. mb_strlen
# normal strlen: 3.6795361042023 ms, average: 3.6795361042023E-6 ms
# mb_strlen: 5.5934538841248 ms, average: 5.5934538841248E-6 ms
ok 1 - mb_strlen is slower than strlen
# mb_strlen is 1.52 slower than strlen
#
#
# benchmarking strpos vs. mb_strpos
# normal strpos: 5.5523281097412 ms, average: 5.5523281097412E-6 ms
# mb_strlen: 31.180974960327 ms, average: 3.1180974960327E-5 ms
ok 2 - mb_strlen is slower than strlen
# mb_strpos is 5.62 slower than strpos
#
#
# benchmarking substr vs. mb_substr
# normal substr: 3.4437320232391 ms, average: 3.4437320232391E-6 ms
# mb_strlen: 3.5374181270599 ms, average: 3.5374181270599E-6 ms
ok 3 - mb_strlen is slower than strlen
# mb_substr is 1.03 slower than substr
#
#
# benchmarking strtolower vs. mb_strtolower
# normal strtolower: 4.446839094162 ms, average: 4.446839094162E-6 ms
# mb_strlen: 193.44901108742 ms, average: 0.00019344901108742 ms
ok 4 - mb_strlen is slower than strlen
# mb_strtolower is 43.5 slower than strtolower
#
#
# benchmarking strtoupper vs. mb_strtoupper
# normal strtoupper: 3.0210740566254 ms, average: 3.0210740566254E-6 ms
# mb_strlen: 340.71775603294 ms, average: 0.00034071775603294 ms
ok 5 - mb_strlen is slower than strlen
# mb_strtoupper is 112.78 slower than strtoupper

Ben

7 years ago


If you find yourself without the mb string functions and can't easily change it, a quick hack replacement for mb_strlen for utf8 characters is to use a a PCRE regex with utf8 turned on.

$strlen = preg_match_all("/.{1}/us",$utf8string,$dummy);

This is basically an ugly hack which counts all single character matches, and I'd expect it to be painfully slow on large strings.

Peter Albertsson

11 years ago


If you wish to find the byte length of a multi-byte string when you are using mbstring.func_overload 2 and UTF-8 strings, then you can use the following:

mb_strlen($utf8_string, 'latin1');

motin at demomusic dot nu

9 years ago


Thank you Peter Albertsson for presenting that!

After spending more than eight hours tracking down two specific bugs in my mbstring-func_overloaded environment I have learned a very important lesson:

Many developers rely on strlen to give the amount of bytes in a string. While mb-overloading has very many advantages, the most hard-spotted pitfall must be this issue. 

Two examples (from the two bugs found earlier):

1. Writing a string to a file:

<?php
$str = "string with utf-8 chars åèä - doo-bee doo-bee dooh";
$fp = fopen($this->_file, "wb");
if ($fp) {
  $len = strlen($str);
  fwrite($fp, $str, $len);
}
?>

PS This is found i the PEAR::Cache_Lite package (Lite.php) - Reported

2. Iterating through a string's characters:

<?php
$str = "string with utf-8 chars åèö - doo-bee doo-bee dooh";
$newStr = "";
for ($i = 0; $i < strlen($str); $i++) {
  $newStr .= $str[$i];
}
?>

Both of these situations will fail to save / store the last characters in $str. This can be very hard to spot and can be especially fatal for say serialized strings, xml etc. 

So, try to avoid these situations to support overloaded environments, and remeber Peter Albertssons remark if you find problems under such an environment.

filatei at gmail dot com