System.String Class

Represents text as a series of Unicode characters.

See Also: String Members

Syntax

[System.Runtime.InteropServices.ComVisible(true)]
public sealed class String : IEnumerable<char>, ICloneable, IComparable, IComparable<string>, IConvertible, IEquatable<string>

Remarks

A string is a sequential collection of Unicode characters that is used to represent text. A string object is a sequential collection of char objects that represent a string. The value of the string object is the content of the sequential collection, and that value is immutable (that is, it is read-only). For more information about the immutability of strings, see the Immutability and the StringBuilder class section later in this topic. The maximum size of a string object in memory is 2 GB, or about 1 billion characters.

In this section:

Instantiating a String object
Char objects and Unicode characters
Strings and embedded null characters
Strings and indexes
Null strings and empty strings
Immutability and the StringBuilder class
Ordinal vs. culture-sensitive operations
Normalization
String operations by category

Instantiating a String object

You can instantiate a string object in the following ways:
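As a rough illustration only, the sketch below shows several common ways of obtaining a string: a literal, a constructor call, concatenation, a method that returns a string, and a formatting call. The class name InstantiationSketch and the method Build are hypothetical names chosen for this sketch.

```csharp
using System;

static class InstantiationSketch
{
    // Hypothetical helper: returns strings created in several different ways.
    public static string[] Build()
    {
        // Assign a string literal.
        string literal = "Hello";
        // Call a String constructor with a char array.
        string fromChars = new string(new[] { 'a', 'b', 'c' });
        // Call a String constructor that repeats one character.
        string repeated = new string('-', 5);
        // Concatenate existing strings.
        string joined = literal + ", " + fromChars;
        // Call a method that returns a string.
        string sub = literal.Substring(1, 3);
        // Format a value into a string.
        string formatted = string.Format("{0} items", 3);

        return new[] { literal, fromChars, repeated, joined, sub, formatted };
    }

    static void Main()
    {
        foreach (string s in Build())
            Console.WriteLine(s);
    }
}
```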

Char objects and Unicode characters

Each character in a string is defined by a Unicode scalar value, also called a Unicode code point or the ordinal (numeric) value of the Unicode character. Each code point is encoded by using UTF-16 encoding, and the numeric value of each element of the encoding is represented by a char object.

A single char object usually represents a single code point; that is, the numeric value of the char equals the code point. For example, the code point for the character "a" is U+0061. However, a code point might require more than one encoded element (more than one char object). The Unicode standard defines three types of characters that correspond to multiple char objects: graphemes, Unicode supplementary code points, and characters in the supplementary planes.
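The difference between char objects and code points can be sketched as follows. The example assumes a character outside the Basic Multilingual Plane (U+1D11E), which UTF-16 encodes as a surrogate pair. CodePointSketch and CountCodePoints are hypothetical names used only for this illustration.

```csharp
using System;

static class CodePointSketch
{
    // Counts code points by treating each surrogate pair as a single unit.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            // A high surrogate followed by a low surrogate encodes one code point.
            if (char.IsSurrogatePair(s, i))
                i++;
            count++;
        }
        return count;
    }

    static void Main()
    {
        string clef = "\U0001D11E"; // one supplementary code point
        Console.WriteLine(clef.Length);                               // 2 char objects
        Console.WriteLine(CountCodePoints(clef));                     // 1 code point
        Console.WriteLine("U+{0:X}", char.ConvertToUtf32(clef, 0));   // U+1D11E
    }
}
```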

Strings and embedded null characters

In the .NET Framework, a string object can include embedded null characters, which count as a part of the string's length. However, in some languages such as C and C++, a null character indicates the end of a string; it is not considered a part of the string and is not counted as part of the string's length. This means that the following common assumptions that C and C++ programmers or libraries written in C or C++ might make about strings are not necessarily valid when applied to string objects:

You should ensure that native C and C++ code that instantiates string objects, and code that is passed string objects through platform invoke, do not assume that an embedded null character marks the end of the string.

Embedded null characters in a string are also treated differently when a string is sorted (or compared) and when a string is searched. Null characters are ignored when performing culture-sensitive comparisons between two strings, including comparisons using the invariant culture. They are considered only for ordinal or case-insensitive ordinal comparisons. On the other hand, embedded null characters are always considered when searching a string with methods such as string.Contains(string), string.StartsWith(string), and string.IndexOf(string).
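A minimal sketch of the ordinal side of this behavior follows. It deliberately avoids culture-sensitive calls, because their results can depend on the runtime and on the globalization data that is installed. EmbeddedNullSketch is a hypothetical class name.

```csharp
using System;

static class EmbeddedNullSketch
{
    static void Main()
    {
        string withNull = "abc\0def";

        // The embedded null character counts toward the length.
        Console.WriteLine(withNull.Length);               // 7

        // A character search finds the null character like any other char.
        Console.WriteLine(withNull.IndexOf('\0'));        // 3

        // An ordinal comparison considers the null character, so the strings differ.
        Console.WriteLine(string.Equals(withNull, "abcdef", StringComparison.Ordinal)); // False
    }
}
```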

Strings and indexes

An index is the position of a char object (not a Unicode character) in a string. An index is a zero-based, nonnegative number that starts from the first position in the string, which is index position zero. A number of search methods, such as string.IndexOf(char) and string.LastIndexOf(char), return the index of a character or substring in the string instance.

The string.Chars(int) property lets you access individual char objects by their index position in the string. Because the string.Chars(int) property is the default property (in Visual Basic) or the indexer (in C#), you can access the individual char objects in a string by using code such as the following. This code looks for white space or punctuation characters in a string to determine how many words the string contains.

code reference: System.String.Class#4
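The referenced sample is not reproduced here. One possible sketch of the idea uses the indexer to walk the string and counts a word each time a run of non-separator characters begins. WordCountSketch and CountWords are hypothetical names, and the counting rule is an assumption of this sketch rather than the original sample's logic.

```csharp
using System;

static class WordCountSketch
{
    // Counts runs of characters that are neither white space nor punctuation.
    public static int CountWords(string text)
    {
        int words = 0;
        bool inWord = false;
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i]; // indexer access to a single char object
            bool isSeparator = char.IsWhiteSpace(c) || char.IsPunctuation(c);
            if (!isSeparator && !inWord)
                words++;
            inWord = !isSeparator;
        }
        return words;
    }

    static void Main()
    {
        Console.WriteLine(CountWords("This string consists of a single short sentence.")); // 8
    }
}
```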

Because the string class implements the IEnumerable interface, you can also iterate through the char objects in a string by using a foreach construct, as the following example shows.

code reference: System.String.Class#5
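The referenced sample is not reproduced here. As a small sketch, the following code enumerates the char objects of a string with foreach and counts the uppercase letters. EnumerationSketch and CountUppercase are hypothetical names.

```csharp
using System;

static class EnumerationSketch
{
    // Iterates over the char objects in the string without using an index.
    public static int CountUppercase(string text)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (char.IsUpper(c))
                count++;
        }
        return count;
    }

    static void Main()
    {
        Console.WriteLine(CountUppercase("Hello World")); // 2
    }
}
```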

Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one char object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of char objects, use the System.Globalization.StringInfo and System.Globalization.TextElementEnumerator classes. The following example illustrates the difference between code that works with char objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.

code reference: System.String.Class#6
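The referenced sample is not reproduced here. The sketch below shows the same contrast on a single word that is assumed to contain two combining acute accents (U+0301): counting char objects gives one number, and counting text elements gives another. TextElementSketch and CountTextElements are hypothetical names.

```csharp
using System;
using System.Globalization;

static class TextElementSketch
{
    // Counts text elements (base character plus any combining characters).
    public static int CountTextElements(string s)
    {
        int count = 0;
        TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(s);
        while (elements.MoveNext())
            count++;
        return count;
    }

    static void Main()
    {
        // "resume" with a combining acute accent after each "e".
        string word = "re\u0301sume\u0301";

        Console.WriteLine(word.Length);             // 8 char objects
        Console.WriteLine(CountTextElements(word)); // 6 text elements

        // Starting index of each text element: 0, 1, 3, 4, 5, 6
        int[] starts = StringInfo.ParseCombiningCharacters(word);
        Console.WriteLine(string.Join(", ", Array.ConvertAll(starts, i => i.ToString())));
    }
}
```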

This example works with text elements by using the System.Globalization.StringInfo.GetTextElementEnumerator(string) method and the System.Globalization.TextElementEnumerator class to enumerate all the text elements in a string. You can also retrieve an array that contains the starting index of each text element by calling the System.Globalization.StringInfo.ParseCombiningCharacters(string) method.

For more information about working with units of text rather than individual char values, see the System.Globalization.StringInfo class.

Null strings and empty strings

A string that has been declared but has not been assigned a value is null. Attempting to call methods on that string throws a NullReferenceException. A null string is different from an empty string, which is a string whose value is "" or string.Empty. In some cases, passing either a null string or an empty string as an argument in a method call throws an exception. For example, passing a null string to the int.Parse(string) method throws an ArgumentNullException, and passing an empty string throws a FormatException. In other cases, a method argument can be either a null string or an empty string. For example, if you are providing an IFormattable implementation for a class, you want to equate both a null string and an empty string with the general ("G") format specifier.

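The string class includes two convenience methods that enable you to test whether a string is null or empty: string.IsNullOrEmpty(string), which indicates whether a string is null or equal to string.Empty, and string.IsNullOrWhiteSpace(string), which indicates whether a string is null, empty, or consists only of white-space characters.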

The following example uses the string.IsNullOrEmpty(string) method in the IFormattable.ToString(string, IFormatProvider) implementation of a custom Temperature class. The method supports the "G", "C", "F", and "K" format strings. If an empty format string or a format string whose value is null is passed to the method, its value is changed to the "G" format string.

code reference: System.String.Class.Null#3
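The referenced sample is not reproduced here. A minimal sketch of such an implementation might look like the following. The Temperature class shown is a hypothetical reconstruction: its constructor, the two-decimal output, and the unit suffixes are assumptions made for this illustration.

```csharp
using System;
using System.Globalization;

public class Temperature : IFormattable
{
    private readonly decimal celsius;

    public Temperature(decimal celsius)
    {
        this.celsius = celsius;
    }

    public override string ToString()
    {
        return ToString("G", CultureInfo.CurrentCulture);
    }

    public string ToString(string format, IFormatProvider provider)
    {
        // Treat both a null and an empty format string as the general format.
        if (string.IsNullOrEmpty(format))
            format = "G";

        switch (format.ToUpperInvariant())
        {
            case "G":
            case "C":
                return celsius.ToString("F2", provider) + " °C";
            case "F":
                return (celsius * 9 / 5 + 32).ToString("F2", provider) + " °F";
            case "K":
                return (celsius + 273.15m).ToString("F2", provider) + " K";
            default:
                throw new FormatException(
                    string.Format("The '{0}' format string is not supported.", format));
        }
    }

    static void Main()
    {
        Temperature freezing = new Temperature(0m);
        Console.WriteLine(freezing.ToString(null, CultureInfo.InvariantCulture)); // 0.00 °C
        Console.WriteLine(freezing.ToString("", CultureInfo.InvariantCulture));   // 0.00 °C
        Console.WriteLine(freezing.ToString("F", CultureInfo.InvariantCulture));  // 32.00 °F
        Console.WriteLine(freezing.ToString("K", CultureInfo.InvariantCulture));  // 273.15 K
    }
}
```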

Immutability and the StringBuilder class

A string object is called immutable (read-only), because its value cannot be modified after it has been created. Methods that appear to modify a string object actually return a new string object that contains the modification.
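A short sketch of this behavior: a method that appears to change a string leaves the original untouched and hands back a different object. ImmutabilitySketch is a hypothetical class name.

```csharp
using System;

static class ImmutabilitySketch
{
    static void Main()
    {
        string original = "hello";
        string upper = original.ToUpperInvariant();

        Console.WriteLine(original);                               // hello (unchanged)
        Console.WriteLine(upper);                                  // HELLO (new string)
        Console.WriteLine(object.ReferenceEquals(original, upper)); // False
    }
}
```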

Because strings are immutable, string manipulation routines that perform repeated additions or deletions to what appears to be a single string can exact a significant performance penalty. For example, the following code uses a random number generator to create a string with 1000 characters in the range 0x0001 to 0x052F. Although the code appears to use string concatenation to append a new character to the existing string named str, it actually creates a new string object for each concatenation operation.

code reference: System.String.Class#15

You can use the System.Text.StringBuilder class instead of the string class for operations that make multiple changes to the value of a string. Unlike instances of the string class, System.Text.StringBuilder objects are mutable; when you concatenate, append, or delete substrings from a string, the operations are performed on a single string. When you have finished modifying the value of a System.Text.StringBuilder object, you can call its System.Text.StringBuilder.ToString method to convert it to a string. The following example replaces the string used in the previous example to concatenate 1000 random characters in the range 0x0001 to 0x052F with a System.Text.StringBuilder object.

code reference: System.String.Class#16
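The referenced samples are not reproduced here. A sketch of the StringBuilder approach, under the assumption that a seeded System.Random is an acceptable source of characters, could look like this. BuilderSketch and BuildRandomString are hypothetical names.

```csharp
using System;
using System.Text;

static class BuilderSketch
{
    // Appends 'count' random characters in the range 0x0001 to 0x052F
    // to one mutable buffer, then converts the buffer to a string once.
    public static string BuildRandomString(int count, int seed)
    {
        Random rnd = new Random(seed);
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++)
        {
            // The upper bound of Random.Next is exclusive, hence 0x0530.
            sb.Append((char)rnd.Next(0x0001, 0x0530));
        }
        return sb.ToString();
    }

    static void Main()
    {
        string result = BuildRandomString(1000, 42);
        Console.WriteLine(result.Length); // 1000
    }
}
```

Writing the loop as str += (char)rnd.Next(0x0001, 0x0530) would yield a string of the same length, but each iteration would allocate a new string object.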

Ordinal vs. culture-sensitive operations

Members of the string class perform either ordinal or culture-sensitive (linguistic) operations on a string object. An ordinal operation acts on the numeric value of each char object. A culture-sensitive operation acts on the value of the string object, and takes culture-specific casing, sorting, formatting, and parsing rules into account. Culture-sensitive operations execute in the context of an explicitly declared culture or the implicit current culture. The two kinds of operations can produce very different results when they are performed on the same string.

Note:

If your application makes a security decision about a symbolic identifier such as a file name or named pipe, or about persisted data such as the text-based data in an XML file, the operation should use an ordinal comparison instead of a culture-sensitive comparison. This is because a culture-sensitive comparison can yield different results depending on the culture in effect, whereas an ordinal comparison depends solely on the binary value of the compared characters.

Note:

Most methods that perform string operations include an overload that has a parameter of type StringComparison, which enables you to specify whether the method performs an ordinal or culture-sensitive operation. In general, you should call this overload to make the intent of your method call clear. For best practices and guidance for using ordinal and culture-sensitive operations on strings, see Best Practices for Using Strings in the .NET Framework.

Operations for casing, parsing and formatting, comparison and sorting, and testing for equality can be either ordinal or culture-sensitive. The following sections discuss each category of operation.

Casing

Casing rules determine how to change the capitalization of a Unicode character; for example, from lowercase to uppercase. Often, a casing operation is performed before a string comparison. For example, a string might be converted to uppercase so that it can be compared with another uppercase string. You can convert the characters in a string to lowercase by calling the string.ToLower or string.ToLowerInvariant method, and you can convert them to uppercase by calling the string.ToUpper or string.ToUpperInvariant method. In addition, you can use the System.Globalization.TextInfo.ToTitleCase(string) method to convert a string to title case.

Casing operations can be based on the rules of the current culture, a specified culture, or the invariant culture. Because case mappings can vary depending on the culture used, the result of casing operations can vary based on culture. The actual differences in casing are of three kinds:

The following example illustrates some of the differences in casing rules between cultures when converting strings to uppercase.

code reference: System.String.Class#7
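The referenced sample is not reproduced here. The sketch below uppercases the same text with the invariant culture and with the Turkish (Turkey) culture and prints the resulting code points, so the output does not depend on the console encoding. It assumes that culture data for tr-TR is available; where it is, the lowercase "i" is expected to map to U+0130 rather than U+0049. CasingSketch and UpperCodePoints are hypothetical names.

```csharp
using System;
using System.Globalization;

static class CasingSketch
{
    // Uppercases the text with the given culture and lists the code points.
    public static string UpperCodePoints(string text, CultureInfo culture)
    {
        string upper = text.ToUpper(culture);
        string[] parts = new string[upper.Length];
        for (int i = 0; i < upper.Length; i++)
            parts[i] = "U+" + ((int)upper[i]).ToString("X4");
        return string.Join(" ", parts);
    }

    static void Main()
    {
        // Invariant culture: "fi" becomes U+0046 U+0049.
        Console.WriteLine(UpperCodePoints("fi", CultureInfo.InvariantCulture));

        // Turkish culture (assumes tr-TR data is installed on the system).
        Console.WriteLine(UpperCodePoints("fi", new CultureInfo("tr-TR")));
    }
}
```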

Parsing and formatting

Formatting and parsing are inverse operations. Formatting rules determine how to convert a value, such as a date and time or a number, to its string representation, whereas parsing rules determine how to convert a string representation to a value such as a date and time. Both formatting and parsing rules are dependent on cultural conventions. The following example illustrates the ambiguity that can arise when interpreting a culture-specific date string. Without knowing the conventions of the culture that was used to produce a date string, it is not possible to know whether 03/01/2011, 3/1/2011, and 01/03/2011 represent January 3, 2011 or March 1, 2011.

code reference: System.String.Class#8

Similarly, as the following example shows, a single string can produce different dates depending on the culture whose conventions are used in the parsing operation.

code reference: System.String.Class#9
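The referenced samples are not reproduced here. As a sketch, the code below parses one date string with two cultures and prints each result in an unambiguous format. It assumes that data for the en-US and en-GB cultures is available; with that data, en-US reads the string as month/day and en-GB reads it as day/month. DateParsingSketch and Parse are hypothetical names.

```csharp
using System;
using System.Globalization;

static class DateParsingSketch
{
    // Parses the text by using the conventions of the named culture.
    public static DateTime Parse(string text, string cultureName)
    {
        return DateTime.Parse(text, CultureInfo.GetCultureInfo(cultureName));
    }

    static void Main()
    {
        string text = "03/01/2011";
        foreach (string name in new[] { "en-US", "en-GB" })
        {
            DateTime date = Parse(text, name);
            Console.WriteLine("{0}: {1}", name,
                date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        }
    }
}
```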

String comparison and sorting

Sort rules determine the alphabetic order of Unicode characters and how two strings compare to each other. For example, the string.Compare(string, string, StringComparison) method compares two strings based on the StringComparison parameter. If the parameter value is StringComparison.CurrentCulture, the method performs a linguistic comparison that uses the conventions of the current culture; if the parameter value is StringComparison.Ordinal, the method performs an ordinal comparison. Consequently, as the following example shows, if the current culture is U.S. English, the first call to the string.Compare(string, string, StringComparison) method (using culture-sensitive comparison) considers "a" less than "A", but the second call to the same method (using ordinal comparison) considers "a" greater than "A".

code reference: System.String.Class#10
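The referenced sample is not reproduced here. A sketch of the same comparison follows. To avoid depending on the machine's current culture, it substitutes StringComparison.InvariantCulture for the U.S. English culture mentioned above, which is an assumption of this sketch. The linguistic result also assumes that culture data is available to the runtime. CompareSketch is a hypothetical class name.

```csharp
using System;

static class CompareSketch
{
    static void Main()
    {
        // Linguistic comparison: lowercase typically sorts before uppercase.
        int linguistic = string.Compare("a", "A", StringComparison.InvariantCulture);

        // Ordinal comparison: 'a' (U+0061) has a larger numeric value than 'A' (U+0041).
        int ordinal = string.Compare("a", "A", StringComparison.Ordinal);

        Console.WriteLine("Linguistic: {0}", Math.Sign(linguistic));
        Console.WriteLine("Ordinal:    {0}", Math.Sign(ordinal));
    }
}
```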

The .NET Framework supports word, string, and ordinal sort rules:

A culture-sensitive comparison is any comparison that explicitly or implicitly uses a System.Globalization.CultureInfo object, including the invariant culture that is specified by the System.Globalization.CultureInfo.InvariantCulture property. The implicit culture is the current culture, which is specified by the System.Threading.Thread.CurrentCulture and System.Globalization.CultureInfo.CurrentCulture properties. There is considerable variation in the sort order of alphabetic characters (that is, characters for which the char.IsLetter(char) method returns true) across cultures.

You can specify a culture-sensitive comparison that uses the conventions of a specific culture by supplying a System.Globalization.CultureInfo object to a string comparison method such as string.Compare(string, string, System.Globalization.CultureInfo, System.Globalization.CompareOptions). You can specify a culture-sensitive comparison that uses the conventions of the current culture by supplying StringComparison.CurrentCulture, StringComparison.CurrentCultureIgnoreCase, or any member of the System.Globalization.CompareOptions enumeration other than System.Globalization.CompareOptions.Ordinal or System.Globalization.CompareOptions.OrdinalIgnoreCase to an appropriate overload of the string.Compare(string, string) method.

A culture-sensitive comparison is generally appropriate for sorting, whereas an ordinal comparison is not. An ordinal comparison is generally appropriate for determining whether two strings are equal (that is, for determining identity), whereas a culture-sensitive comparison is not.

Use the following general guidelines to choose an appropriate sorting or string comparison method:

Note:

The culture-sensitive sorting and casing rules used in string comparison depend on the version of the .NET Framework. In the .NET Framework 4.5 running on the Windows 8 operating system, sorting, casing, normalization, and Unicode character information conform to the Unicode 6.0 standard. On other operating systems, they conform to the Unicode 5.0 standard.

For more information about word, string, and ordinal sort rules, see the System.Globalization.CompareOptions topic. For additional recommendations on when to use each rule, see Best Practices for Using Strings in the .NET Framework.

Ordinarily, you do not call string comparison methods such as string.Compare(string, string) directly to determine the sort order of strings. Instead, comparison methods are called by sorting methods such as Array.Sort(Array) or List<T>.Sort. The following example performs four different sorting operations (word sort using the current culture, word sort using the invariant culture, ordinal sort, and string sort using the invariant culture) without explicitly calling a string comparison method. Note that each type of sort produces a unique ordering of strings in its array.

code reference: System.String.Class#12
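The referenced sample is not reproduced here. The reduced sketch below covers only two of the four sorts, ordinal and invariant-culture word sort, by passing a StringComparer to Array.Sort instead of calling a comparison method directly. The word list, SortSketch, and SortedCopy are assumptions of this sketch, and the culture-sensitive ordering assumes that culture data is available to the runtime.

```csharp
using System;

static class SortSketch
{
    // Sorts a copy of the array so that the caller's array is left unchanged.
    public static string[] SortedCopy(string[] items, StringComparer comparer)
    {
        string[] copy = (string[])items.Clone();
        Array.Sort(copy, comparer);
        return copy;
    }

    static void Main()
    {
        string[] words = { "cherry", "Banana", "apple" };

        // Ordinal: uppercase 'B' (U+0042) precedes every lowercase letter.
        Console.WriteLine(string.Join(", ", SortedCopy(words, StringComparer.Ordinal)));

        // Invariant culture: alphabetical order, largely independent of case.
        Console.WriteLine(string.Join(", ", SortedCopy(words, StringComparer.InvariantCulture)));
    }
}
```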

Note:

If your primary purpose in comparing strings is to determine whether they are equal, you should call the string.Equals(object) method. Typically, you should use string.Equals(object) to perform an ordinal comparison. The string.Compare(string, string) method is intended primarily to sort strings.

String search methods, such as string.StartsWith(string) and string.IndexOf(string), can also perform culture-sensitive or ordinal string comparisons. The following example illustrates the differences between ordinal and culture-sensitive comparisons using the string.IndexOf(string) method. A culture-sensitive search in which the current culture is English (United States) considers the substring "oe" to match the ligature "œ". Because a soft hyphen (U+00AD) is a zero-width character, the search treats the soft hyphen as equivalent to string.Empty and finds a match at the beginning of the string. An ordinal search, on the other hand, does not find a match in either case.

code reference: System.String.Class#13

Testing for equality

Use the string.Compare(string, string) method to determine the relationship of two strings in the sort order. Typically, this is a culture-sensitive operation. In contrast, call the string.Equals(string) method to test for equality. Because the test for equality usually compares user input with some known string, such as a valid user name, a password, or a file system path, it is typically an ordinal operation.

Note:

It is possible to test for equality by calling the string.Compare(string, string) method and determining whether the return value is zero. However, this practice is not recommended. To determine whether two strings are equal, you should call one of the overloads of the string.Equals(string) method. The preferred overload to call is either the instance string.Equals(string, StringComparison) method or the static string.Equals(string, string, StringComparison) method, because both methods include a StringComparison parameter that explicitly specifies the type of comparison.

The following example illustrates the danger of performing a culture-sensitive comparison for equality when an ordinal one should be used instead. In this case, the intent of the code is to prohibit file system access from URLs that begin with "FILE://" or "file://" by performing a case-insensitive comparison of the beginning of a URL with the string "FILE://". However, if a culture-sensitive comparison is performed using the Turkish (Turkey) culture on a URL that begins with "file://", the comparison for equality fails, because the Turkish uppercase equivalent of the lowercase "i" is "İ" instead of "I". As a result, file system access is inadvertently permitted. On the other hand, if an ordinal comparison is performed, the comparison for equality succeeds, and file system access is denied.

code reference: System.String.Class#11
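The referenced sample is not reproduced here. A sketch of the two checks follows. The method names, the sample URL, and the class FileUrlSketch are hypothetical, and the culture-sensitive result assumes that data for the tr-TR culture is available.

```csharp
using System;
using System.Globalization;

static class FileUrlSketch
{
    // Ordinal, case-insensitive check: independent of any culture.
    public static bool IsFileUrlOrdinal(string url)
    {
        return url.StartsWith("FILE://", StringComparison.OrdinalIgnoreCase);
    }

    // Culture-sensitive, case-insensitive check: the result depends on the culture's casing rules.
    public static bool IsFileUrlCultureSensitive(string url, CultureInfo culture)
    {
        return url.StartsWith("FILE://", true, culture);
    }

    static void Main()
    {
        string url = "file:///c:/data/report.txt";

        // Expected to be True, so access is denied as intended.
        Console.WriteLine(IsFileUrlOrdinal(url));

        // With Turkish casing rules, "i" does not match "I", so this check can miss the URL.
        Console.WriteLine(IsFileUrlCultureSensitive(url, new CultureInfo("tr-TR")));
    }
}
```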

Normalization

Some Unicode characters have multiple representations. For example, any of the following code points can represent the letter "ắ":

Multiple representations for a single character complicate searching, sorting, matching, and other string operations.

The Unicode standard defines a process called normalization that returns one binary representation of a Unicode character for any of its equivalent binary representations. Normalization can use several algorithms, called normalization forms, that follow different rules. The .NET Framework supports Unicode normalization forms C, D, KC, and KD. When strings have been normalized to the same normalization form, they can be compared by using ordinal comparison.

An ordinal comparison is a binary comparison of the Unicode scalar value of corresponding char objects in each string. The string class includes a number of methods that can perform an ordinal comparison, including the following:

You can determine whether a string is normalized to normalization form C by calling the string.IsNormalized method, or you can call the string.IsNormalized(System.Text.NormalizationForm) method to determine whether a string is normalized to a specified normalization form. You can also call the string.Normalize method to convert a string to normalization form C, or you can call the string.Normalize(System.Text.NormalizationForm) method to convert a string to a specified normalization form. For step-by-step information about normalizing and comparing strings, see the string.Normalize and string.Normalize(System.Text.NormalizationForm) methods.

The following simple example illustrates string normalization. It defines the letter "ố" in three different ways in three different strings, and uses an ordinal comparison for equality to determine that each string differs from the other two strings. It then converts each string to the supported normalization forms, and again performs an ordinal comparison of each string in a specified normalization form. In each case, the second test for equality shows that the strings are equal.

code reference: System.String.Class#14
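The referenced sample is not reproduced here. The following sketch builds the letter in three assumed ways (U+1ED1; U+00F4 followed by U+0301; and "o" followed by U+0302 and U+0301) and compares the strings before and after normalization. NormalizationSketch and OrdinalEqualAfter are hypothetical names.

```csharp
using System;
using System.Text;

static class NormalizationSketch
{
    // Normalizes both strings to the same form, then compares them ordinally.
    public static bool OrdinalEqualAfter(NormalizationForm form, string a, string b)
    {
        return string.Equals(a.Normalize(form), b.Normalize(form), StringComparison.Ordinal);
    }

    static void Main()
    {
        string precomposed = "\u1ED1";        // single code point
        string partial = "\u00F4\u0301";      // o with circumflex, plus combining acute
        string decomposed = "o\u0302\u0301";  // base letter plus two combining marks

        // Before normalization, the ordinal comparisons report differences.
        Console.WriteLine(string.Equals(precomposed, partial, StringComparison.Ordinal));
        Console.WriteLine(string.Equals(partial, decomposed, StringComparison.Ordinal));

        // After normalization to a common form, the strings compare as equal.
        NormalizationForm[] forms =
        {
            NormalizationForm.FormC, NormalizationForm.FormD,
            NormalizationForm.FormKC, NormalizationForm.FormKD
        };
        foreach (NormalizationForm form in forms)
        {
            bool allEqual = OrdinalEqualAfter(form, precomposed, partial)
                         && OrdinalEqualAfter(form, partial, decomposed);
            Console.WriteLine("{0}: {1}", form, allEqual);
        }
    }
}
```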

For more information about normalization and normalization forms, see System.Text.NormalizationForm, as well as Unicode Standard Annex #15 (http://unicode.org/reports/tr15/) and the Normalization FAQ (http://www.unicode.org/faq/normalization.html) on the unicode.org website.

String operations by category

The string class provides members for comparing strings, testing strings for equality, finding characters or substrings in a string, modifying a string, extracting substrings from a string, combining strings, formatting values, copying a string, and normalizing a string.

Comparing strings

You can compare strings to determine their relative position in the sort order by using the following string methods:

Testing strings for equality

You call the string.Equals(string) method to determine whether two strings are equal. The instance string.Equals(string, StringComparison) and the static string.Equals(string, string, StringComparison) overloads let you specify whether the comparison is culture-sensitive or ordinal, and whether case is considered or ignored. Most tests for equality are ordinal, and comparisons for equality that determine access to a system resource (such as a file system object) should always be ordinal.

Finding characters in a string

The string class includes two kinds of search methods:

Note:

If you want to search a string for a particular pattern rather than a specific substring, you should use regular expressions. For more information, see .NET Framework Regular Expressions.

Modifying a string

The string class includes the following methods that appear to modify the value of a string:

Note:

All string modification methods return a new string object. They do not modify the value of the current instance.

Extracting substrings from a string

The string.Split(Char[]) method separates a single string into multiple strings. Overloads of the method allow you to specify multiple delimiters, to determine the maximum number of substrings that the method extracts, and to determine whether empty strings (which occur when delimiters are adjacent) are included among the returned strings.
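A brief sketch of these overloads, using an assumed comma-delimited input in which two delimiters are adjacent. SplitSketch is a hypothetical class name.

```csharp
using System;

static class SplitSketch
{
    static void Main()
    {
        string input = "a,b,,c";
        char[] delimiters = { ',' };

        // Default behavior keeps the empty string between the adjacent delimiters.
        string[] all = input.Split(delimiters);
        Console.WriteLine(all.Length);      // 4

        // RemoveEmptyEntries drops the empty string.
        string[] nonEmpty = input.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        Console.WriteLine(nonEmpty.Length); // 3

        // A count of 2 limits the result to two substrings; the last holds the remainder.
        string[] limited = input.Split(delimiters, 2);
        Console.WriteLine(limited[1]);      // b,,c
    }
}
```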

Combining strings

The following string methods can be used for string concatenation:

Formatting values

The string.Format method uses the composite formatting feature to replace one or more placeholders in a string with the string representation of some object or value. The string.Format method is often used to do the following:

For detailed information about formatting operations and examples, see the string.Format overload summary.
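As a small sketch of composite formatting, the call below combines alignment and a numeric format specifier in one format string. The labels, widths, and the class name FormatSketch are assumptions chosen for illustration; the invariant culture is passed explicitly so that the decimal separator does not vary by machine.

```csharp
using System;
using System.Globalization;

static class FormatSketch
{
    // {0,-10} left-aligns in 10 columns; {1,8:F2} right-aligns in 8 columns with two decimals.
    public static string Row(string label, double amount)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0,-10}|{1,8:F2}|", label, amount);
    }

    static void Main()
    {
        Console.WriteLine(Row("Total", 12.5)); // Total     |   12.50|
    }
}
```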

Copying a string

You can call the following string methods to make a copy of a string:

Normalizing a string

In Unicode, a single character can be represented by more than one sequence of code points. Normalization converts these equivalent representations into the same binary representation. The string.Normalize method performs the normalization, and the string.IsNormalized method determines whether a string is normalized.

Thread Safety

This type is safe for multithreaded operations.

Example

Example 1

The following example demonstrates formatting numeric data types and inserting literal curly brackets into strings.

C# Example

using System;
class StringFormatTest {
    public static void Main() {
        decimal dec = 1.99999m;
        double doub = 1.0000000001;

        string somenums = String.Format("Some formatted numbers: dec={0,15:E} doub={1,20}", dec, doub);
        Console.WriteLine(somenums);

        // String.Format collapses the escaped {{ and }} into single literal braces.
        string curlies = String.Format("Literal curly brackets: {{ and }} and {{0}}");
        Console.WriteLine(curlies);

        object nullObject = null;
        string embeddedNull = String.Format("A null argument looks like: {0}", nullObject);
        Console.WriteLine(embeddedNull);
    }
}


The output is

Example

Some formatted numbers: dec=  1.999990E+000 doub=        1.0000000001
Literal curly brackets: { and } and {0}
A null argument looks like: 
 

Example 2

The following example demonstrates how formatting works if IFormattable is or is not implemented by an argument to the string.Format(string, object) method. Note that the format specifier is ignored if the argument does not implement IFormattable.

C# Example

using System;
class StringFormatTest {
    public class DefaultFormatEleven {
        public override string ToString() {
            return "11 string";
        }
    }
    public class FormattableEleven:IFormattable {
        // The IFormattable ToString implementation.
        public string ToString(string format, IFormatProvider formatProvider) {
            Console.Write("[IFormattable called] ");
            return 11.ToString(format, formatProvider);
        }
        // Override Object.ToString to show that it is not called.
        public override string ToString() {
            return "Formatted 11 string";
        }
    }

    public static void Main() {
        DefaultFormatEleven def11 = new DefaultFormatEleven();
        FormattableEleven for11 = new FormattableEleven();
        string def11string = String.Format("{0}",def11);
        Console.WriteLine(def11string);
        // The format specifier x is ignored.
        def11string = String.Format("{0,15:x}", def11);
        Console.WriteLine(def11string);

        string form11string = String.Format("{0}",for11);
        Console.WriteLine(form11string );
        form11string = String.Format("{0,15:x}",for11);
        Console.WriteLine(form11string);
    }
}

The output is

Example

11 string
      11 string
[IFormattable called] 11
[IFormattable called]               b
 

Example 3

The following example demonstrates searching for an empty string in a non-empty string.

C# Example

using System;
class EmptyStringSearch {
	public static void Main() 	{
		Console.WriteLine("ABCDEF".IndexOf(""));
		Console.WriteLine("ABCDEF".IndexOf("", 2));
		Console.WriteLine("ABCDEF".IndexOf("", 3, 2));
		Console.WriteLine("ABCDEF".LastIndexOf(""));
		Console.WriteLine("ABCDEF".LastIndexOf("", 1));
		Console.WriteLine("ABCDEF".LastIndexOf("", 4, 2));
	}
}

The output is

Example

0
2
3
5
1
4

Requirements

Namespace: System
Assembly: mscorlib (in mscorlib.dll)
Assembly Versions: 1.0.5000.0, 2.0.0.0, 4.0.0.0