Table of Contents
Expressions can be used at several points in
SQL statements, such as in the
ORDER BY
or HAVING
clauses of
SELECT
statements, in the
WHERE
clause of a
SELECT
,
DELETE
, or
UPDATE
statement, or in
SET
statements. Expressions can be written using literal values, column
values, NULL
, built-in functions, stored
functions, user-defined functions, and operators. This chapter
describes the functions and operators that are permitted for writing
expressions in MySQL. Instructions for writing stored functions and
user-defined functions are given in
Section 24.2, “Using Stored Routines (Procedures and Functions)”, and
Section 29.4, “Adding New Functions to MySQL”. See
Section 9.2.4, “Function Name Parsing and Resolution”, for the rules describing how
the server interprets references to different kinds of functions.
An expression that contains NULL
always produces
a NULL
value unless otherwise indicated in the
documentation for a particular function or operator.
By default, there must be no whitespace between a function name and the parenthesis following it. This helps the MySQL parser distinguish between function calls and references to tables or columns that happen to have the same name as a function. However, spaces around function arguments are permitted.
You can tell the MySQL server to accept spaces after function names
by starting it with the
--sql-mode=IGNORE_SPACE
option. (See
Section 5.1.11, “Server SQL Modes”.) Individual client programs can request
this behavior by using the CLIENT_IGNORE_SPACE
option for mysql_real_connect()
. In
either case, all function names become reserved words.
For the sake of brevity, most examples in this chapter display the output from the mysql program in abbreviated form. Rather than showing examples in this format:
mysql> SELECT MOD(29,9);
+-----------+
| mod(29,9) |
+-----------+
| 2 |
+-----------+
1 rows in set (0.00 sec)
This format is used instead:
mysql> SELECT MOD(29,9);
-> 2
Table 12.1 Functions and Operators
Name | Description |
---|---|
ABS() |
Return the absolute value |
ACOS() |
Return the arc cosine |
ADDDATE() |
Add time values (intervals) to a date value |
ADDTIME() |
Add time |
AES_DECRYPT() |
Decrypt using AES |
AES_ENCRYPT() |
Encrypt using AES |
AND , && |
Logical AND |
ANY_VALUE() |
Suppress ONLY_FULL_GROUP_BY value rejection |
ASCII() |
Return numeric value of left-most character |
ASIN() |
Return the arc sine |
= |
Assign a value (as part of a
SET
statement, or as part of the SET clause in an
UPDATE statement)
|
:= |
Assign a value |
ASYMMETRIC_DECRYPT() |
Decrypt ciphertext using private or public key |
ASYMMETRIC_DERIVE() |
Derive symmetric key from asymmetric keys |
ASYMMETRIC_ENCRYPT() |
Encrypt cleartext using private or public key |
ASYMMETRIC_SIGN() |
Generate signature from digest |
ASYMMETRIC_VERIFY() |
Verify that signature matches digest |
ATAN() |
Return the arc tangent |
ATAN2() , ATAN() |
Return the arc tangent of the two arguments |
AVG() |
Return the average value of the argument |
BENCHMARK() |
Repeatedly execute an expression |
BETWEEN ... AND ... |
Check whether a value is within a range of values |
BIN() |
Return a string containing binary representation of a number |
BIN_TO_UUID() |
Convert binary UUID to string |
BINARY |
Cast a string to a binary string |
BIT_AND() |
Return bitwise AND |
BIT_COUNT() |
Return the number of bits that are set |
BIT_LENGTH() |
Return length of argument in bits |
BIT_OR() |
Return bitwise OR |
BIT_XOR() |
Return bitwise XOR |
& |
Bitwise AND |
~ |
Bitwise inversion |
| |
Bitwise OR |
^ |
Bitwise XOR |
CAN_ACCESS_COLUMN() |
Internal use only |
CAN_ACCESS_DATABASE() |
Internal use only |
CAN_ACCESS_TABLE() |
Internal use only |
CAN_ACCESS_VIEW() |
Internal use only |
CASE |
Case operator |
CAST() |
Cast a value as a certain type |
CEIL() |
Return the smallest integer value not less than the argument |
CEILING() |
Return the smallest integer value not less than the argument |
CHAR() |
Return the character for each integer passed |
CHAR_LENGTH() |
Return number of characters in argument |
CHARACTER_LENGTH() |
Synonym for CHAR_LENGTH() |
CHARSET() |
Return the character set of the argument |
COALESCE() |
Return the first non-NULL argument |
COERCIBILITY() |
Return the collation coercibility value of the string argument |
COLLATION() |
Return the collation of the string argument |
COMPRESS() |
Return result as a binary string |
CONCAT() |
Return concatenated string |
CONCAT_WS() |
Return concatenate with separator |
CONNECTION_ID() |
Return the connection ID (thread ID) for the connection |
CONV() |
Convert numbers between different number bases |
CONVERT() |
Cast a value as a certain type |
CONVERT_TZ() |
Convert from one time zone to another |
COS() |
Return the cosine |
COT() |
Return the cotangent |
COUNT() |
Return a count of the number of rows returned |
COUNT(DISTINCT) |
Return the count of a number of different values |
CRC32() |
Compute a cyclic redundancy check value |
CREATE_ASYMMETRIC_PRIV_KEY() |
Create private key |
CREATE_ASYMMETRIC_PUB_KEY() |
Create public key |
CREATE_DH_PARAMETERS() |
Generate shared DH secret |
CREATE_DIGEST() |
Generate digest from string |
CUME_DIST() |
Cumulative distribution value |
CURDATE() |
Return the current date |
CURRENT_DATE() , CURRENT_DATE |
Synonyms for CURDATE() |
CURRENT_ROLE() |
Return the current active roles |
CURRENT_TIME() , CURRENT_TIME |
Synonyms for CURTIME() |
CURRENT_TIMESTAMP() , CURRENT_TIMESTAMP |
Synonyms for NOW() |
CURRENT_USER() , CURRENT_USER |
The authenticated user name and host name |
CURTIME() |
Return the current time |
DATABASE() |
Return the default (current) database name |
DATE() |
Extract the date part of a date or datetime expression |
DATE_ADD() |
Add time values (intervals) to a date value |
DATE_FORMAT() |
Format date as specified |
DATE_SUB() |
Subtract a time value (interval) from a date |
DATEDIFF() |
Subtract two dates |
DAY() |
Synonym for DAYOFMONTH() |
DAYNAME() |
Return the name of the weekday |
DAYOFMONTH() |
Return the day of the month (0-31) |
DAYOFWEEK() |
Return the weekday index of the argument |
DAYOFYEAR() |
Return the day of the year (1-366) |
DECODE() |
Decode a string encrypted using ENCODE() |
DEFAULT() |
Return the default value for a table column |
DEGREES() |
Convert radians to degrees |
DENSE_RANK() |
Rank of current row within its partition, without gaps |
DES_DECRYPT() |
Decrypt a string |
DES_ENCRYPT() |
Encrypt a string |
DIV |
Integer division |
/ |
Division operator |
ELT() |
Return string at index number |
ENCODE() |
Encode a string |
ENCRYPT() |
Encrypt a string |
= |
Equal operator |
<=> |
NULL-safe equal to operator |
EXP() |
Raise to the power of |
EXPORT_SET() |
Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string |
EXTRACT() |
Extract part of a date |
ExtractValue() |
Extract a value from an XML string using XPath notation |
FIELD() |
Index (position) of first argument in subsequent arguments |
FIND_IN_SET() |
Index (position) of first argument within second argument |
FIRST_VALUE() |
Value of argument from first row of window frame |
FLOOR() |
Return the largest integer value not greater than the argument |
FORMAT() |
Return a number formatted to specified number of decimal places |
FORMAT_BYTES() |
Convert byte count to value with units |
FORMAT_PICO_TIME() |
Convert time in picoseconds to value with units |
FOUND_ROWS() |
For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause |
FROM_BASE64() |
Decode base64 encoded string and return result |
FROM_DAYS() |
Convert a day number to a date |
FROM_UNIXTIME() |
Format Unix timestamp as a date |
GeomCollection() |
Construct geometry collection from geometries |
GeometryCollection() |
Construct geometry collection from geometries |
GET_DD_COLUMN_PRIVILEGES() |
Internal use only |
GET_DD_CREATE_OPTIONS() |
Internal use only |
GET_DD_INDEX_SUB_PART_LENGTH() |
Internal use only |
GET_FORMAT() |
Return a date format string |
GET_LOCK() |
Get a named lock |
> |
Greater than operator |
>= |
Greater than or equal operator |
GREATEST() |
Return the largest argument |
GROUP_CONCAT() |
Return a concatenated string |
GROUPING() |
Distinguish super-aggregate ROLLUP rows from regular rows |
GTID_SUBSET() |
Return true if all GTIDs in subset are also in set; otherwise false. |
GTID_SUBTRACT() |
Return all GTIDs in set that are not in subset. |
HEX() |
Hexadecimal representation of decimal or string value |
HOUR() |
Extract the hour |
ICU_VERSION() |
ICU library version |
IF() |
If/else construct |
IFNULL() |
Null if/else construct |
IN() |
Check whether a value is within a set of values |
INET_ATON() |
Return the numeric value of an IP address |
INET_NTOA() |
Return the IP address from a numeric value |
INET6_ATON() |
Return the numeric value of an IPv6 address |
INET6_NTOA() |
Return the IPv6 address from a numeric value |
INSERT() |
Insert substring at specified position up to specified number of characters |
INSTR() |
Return the index of the first occurrence of substring |
INTERNAL_AUTO_INCREMENT() |
Internal use only |
INTERNAL_AVG_ROW_LENGTH() |
Internal use only |
INTERNAL_CHECK_TIME() |
Internal use only |
INTERNAL_CHECKSUM() |
Internal use only |
INTERNAL_DATA_FREE() |
Internal use only |
INTERNAL_DATA_LENGTH() |
Internal use only |
INTERNAL_DD_CHAR_LENGTH() |
Internal use only |
INTERNAL_GET_COMMENT_OR_ERROR() |
Internal use only |
INTERNAL_GET_VIEW_WARNING_OR_ERROR() |
Internal use only |
INTERNAL_INDEX_COLUMN_CARDINALITY() |
Internal use only |
INTERNAL_INDEX_LENGTH() |
Internal use only |
INTERNAL_KEYS_DISABLED() |
Internal use only |
INTERNAL_MAX_DATA_LENGTH() |
Internal use only |
INTERNAL_TABLE_ROWS() |
Internal use only |
INTERNAL_UPDATE_TIME() |
Internal use only |
INTERVAL() |
Return the index of the argument that is less than the first argument |
IS |
Test a value against a boolean |
IS_FREE_LOCK() |
Whether the named lock is free |
IS_IPV4() |
Whether argument is an IPv4 address |
IS_IPV4_COMPAT() |
Whether argument is an IPv4-compatible address |
IS_IPV4_MAPPED() |
Whether argument is an IPv4-mapped address |
IS_IPV6() |
Whether argument is an IPv6 address |
IS NOT |
Test a value against a boolean |
IS NOT NULL |
NOT NULL value test |
IS NULL |
NULL value test |
IS_USED_LOCK() |
Whether the named lock is in use; return connection identifier if true |
IS_UUID() |
Whether argument is a valid UUID |
ISNULL() |
Test whether the argument is NULL |
JSON_ARRAY() |
Create JSON array |
JSON_ARRAY_APPEND() |
Append data to JSON document |
JSON_ARRAY_INSERT() |
Insert into JSON array |
JSON_ARRAYAGG() |
Return result set as a single JSON array |
-> |
Return value from JSON column after evaluating path; equivalent to JSON_EXTRACT(). |
JSON_CONTAINS() |
Whether JSON document contains specific object at path |
JSON_CONTAINS_PATH() |
Whether JSON document contains any data at path |
JSON_DEPTH() |
Maximum depth of JSON document |
JSON_EXTRACT() |
Return data from JSON document |
->> |
Return value from JSON column after evaluating path and unquoting the result; equivalent to JSON_UNQUOTE(JSON_EXTRACT()). |
JSON_INSERT() |
Insert data into JSON document |
JSON_KEYS() |
Array of keys from JSON document |
JSON_LENGTH() |
Number of elements in JSON document |
JSON_MERGE() (deprecated 8.0.3) |
Merge JSON documents, preserving duplicate keys. Deprecated synonym for JSON_MERGE_PRESERVE() |
JSON_MERGE_PATCH() |
Merge JSON documents, replacing values of duplicate keys |
JSON_MERGE_PRESERVE() |
Merge JSON documents, preserving duplicate keys |
JSON_OBJECT() |
Create JSON object |
JSON_OBJECTAGG() |
Return result set as a single JSON object |
JSON_PRETTY() |
Print a JSON document in human-readable format |
JSON_QUOTE() |
Quote JSON document |
JSON_REMOVE() |
Remove data from JSON document |
JSON_REPLACE() |
Replace values in JSON document |
JSON_SEARCH() |
Path to value within JSON document |
JSON_SET() |
Insert data into JSON document |
JSON_STORAGE_FREE() |
Freed space within binary representation of a JSON column value following a partial update |
JSON_STORAGE_SIZE() |
Space used for storage of binary representation of a JSON document |
JSON_TABLE() |
Return data from a JSON expression as a relational table |
JSON_TYPE() |
Type of JSON value |
JSON_UNQUOTE() |
Unquote JSON value |
JSON_VALID() |
Whether JSON value is valid |
LAG() |
Value of argument from row lagging current row within partition |
LAST_DAY |
Return the last day of the month for the argument |
LAST_INSERT_ID() |
Value of the AUTOINCREMENT column for the last INSERT |
LAST_VALUE() |
Value of argument from last row of window frame |
LCASE() |
Synonym for LOWER() |
LEAD() |
Value of argument from row leading current row within partition |
LEAST() |
Return the smallest argument |
LEFT() |
Return the leftmost number of characters as specified |
<< |
Left shift |
LENGTH() |
Return the length of a string in bytes |
< |
Less than operator |
<= |
Less than or equal operator |
LIKE |
Simple pattern matching |
LineString() |
Construct LineString from Point values |
LN() |
Return the natural logarithm of the argument |
LOAD_FILE() |
Load the named file |
LOCALTIME() , LOCALTIME |
Synonym for NOW() |
LOCALTIMESTAMP , LOCALTIMESTAMP() |
Synonym for NOW() |
LOCATE() |
Return the position of the first occurrence of substring |
LOG() |
Return the natural logarithm of the first argument |
LOG10() |
Return the base-10 logarithm of the argument |
LOG2() |
Return the base-2 logarithm of the argument |
LOWER() |
Return the argument in lowercase |
LPAD() |
Return the string argument, left-padded with the specified string |
LTRIM() |
Remove leading spaces |
MAKE_SET() |
Return a set of comma-separated strings that have the corresponding bit in bits set |
MAKEDATE() |
Create a date from the year and day of year |
MAKETIME() |
Create time from hour, minute, second |
MASTER_POS_WAIT() |
Block until the slave has read and applied all updates up to the specified position |
MATCH |
Perform full-text search |
MAX() |
Return the maximum value |
MBRContains() |
Whether MBR of one geometry contains MBR of another |
MBRCoveredBy() |
Whether one MBR is covered by another |
MBRCovers() |
Whether one MBR covers another |
MBRDisjoint() |
Whether MBRs of two geometries are disjoint |
MBREquals() |
Whether MBRs of two geometries are equal |
MBRIntersects() |
Whether MBRs of two geometries intersect |
MBROverlaps() |
Whether MBRs of two geometries overlap |
MBRTouches() |
Whether MBRs of two geometries touch |
MBRWithin() |
Whether MBR of one geometry is within MBR of another |
MD5() |
Calculate MD5 checksum |
MICROSECOND() |
Return the microseconds from argument |
MID() |
Return a substring starting from the specified position |
MIN() |
Return the minimum value |
- |
Minus operator |
MINUTE() |
Return the minute from the argument |
MOD() |
Return the remainder |
% , MOD |
Modulo operator |
MONTH() |
Return the month from the date passed |
MONTHNAME() |
Return the name of the month |
MultiLineString() |
Contruct MultiLineString from LineString values |
MultiPoint() |
Construct MultiPoint from Point values |
MultiPolygon() |
Construct MultiPolygon from Polygon values |
NAME_CONST() |
Cause the column to have the given name |
NOT , ! |
Negates value |
NOT BETWEEN ... AND ... |
Check whether a value is not within a range of values |
!= , <> |
Not equal operator |
NOT IN() |
Check whether a value is not within a set of values |
NOT LIKE |
Negation of simple pattern matching |
NOT REGEXP |
Negation of REGEXP |
NOW() |
Return the current date and time |
NTH_VALUE() |
Value of argument from N-th row of window frame |
NTILE() |
Bucket number of current row within its partition. |
NULLIF() |
Return NULL if expr1 = expr2 |
OCT() |
Return a string containing octal representation of a number |
OCTET_LENGTH() |
Synonym for LENGTH() |
|| , OR |
Logical OR |
ORD() |
Return character code for leftmost character of the argument |
PASSWORD() |
Calculate and return a password string |
PERCENT_RANK() |
Percentage rank value |
PERIOD_ADD() |
Add a period to a year-month |
PERIOD_DIFF() |
Return the number of months between periods |
PI() |
Return the value of pi |
+ |
Addition operator |
Point() |
Construct Point from coordinates |
Polygon() |
Construct Polygon from LineString arguments |
POSITION() |
Synonym for LOCATE() |
POW() |
Return the argument raised to the specified power |
POWER() |
Return the argument raised to the specified power |
PS_CURRENT_THREAD_ID() |
Performance Schema thread ID for current thread |
PS_THREAD_ID() |
Performance Schema thread ID for given thread |
QUARTER() |
Return the quarter from a date argument |
QUOTE() |
Escape the argument for use in an SQL statement |
RADIANS() |
Return argument converted to radians |
RAND() |
Return a random floating-point value |
RANDOM_BYTES() |
Return a random byte vector |
RANK() |
Rank of current row within its partition, with gaps |
REGEXP |
Whether string matches regular expression |
REGEXP_INSTR() |
Starting index of substring matching regular expression |
REGEXP_LIKE() |
Whether string matches regular expression |
REGEXP_REPLACE() |
Replace substrings matching regular expression |
REGEXP_SUBSTR() |
Return substring matching regular expression |
RELEASE_ALL_LOCKS() |
Release all current named locks |
RELEASE_LOCK() |
Release the named lock |
REPEAT() |
Repeat a string the specified number of times |
REPLACE() |
Replace occurrences of a specified string |
REVERSE() |
Reverse the characters in a string |
RIGHT() |
Return the specified rightmost number of characters |
>> |
Right shift |
RLIKE |
Whether string matches regular expression |
ROLES_GRAPHML() |
Return a GraphML document representing memory role subgraphs |
ROUND() |
Round the argument |
ROW_COUNT() |
The number of rows updated |
ROW_NUMBER() |
Number of current row within its partition |
RPAD() |
Append string the specified number of times |
RTRIM() |
Remove trailing spaces |
SCHEMA() |
Synonym for DATABASE() |
SEC_TO_TIME() |
Converts seconds to 'HH:MM:SS' format |
SECOND() |
Return the second (0-59) |
SESSION_USER() |
Synonym for USER() |
SHA1() , SHA() |
Calculate an SHA-1 160-bit checksum |
SHA2() |
Calculate an SHA-2 checksum |
SIGN() |
Return the sign of the argument |
SIN() |
Return the sine of the argument |
SLEEP() |
Sleep for a number of seconds |
SOUNDEX() |
Return a soundex string |
SOUNDS LIKE |
Compare sounds |
SPACE() |
Return a string of the specified number of spaces |
SQRT() |
Return the square root of the argument |
ST_Area() |
Return Polygon or MultiPolygon area |
ST_AsBinary() , ST_AsWKB() |
Convert from internal geometry format to WKB |
ST_AsGeoJSON() |
Generate GeoJSON object from geometry |
ST_AsText() , ST_AsWKT() |
Convert from internal geometry format to WKT |
ST_Buffer() |
Return geometry of points within given distance from geometry |
ST_Buffer_Strategy() |
Produce strategy option for ST_Buffer() |
ST_Centroid() |
Return centroid as a point |
ST_Contains() |
Whether one geometry contains another |
ST_ConvexHull() |
Return convex hull of geometry |
ST_Crosses() |
Whether one geometry crosses another |
ST_Difference() |
Return point set difference of two geometries |
ST_Dimension() |
Dimension of geometry |
ST_Disjoint() |
Whether one geometry is disjoint from another |
ST_Distance() |
The distance of one geometry from another |
ST_Distance_Sphere() |
Minimum distance on earth between two geometries |
ST_EndPoint() |
End Point of LineString |
ST_Envelope() |
Return MBR of geometry |
ST_Equals() |
Whether one geometry is equal to another |
ST_ExteriorRing() |
Return exterior ring of Polygon |
ST_GeoHash() |
Produce a geohash value |
ST_GeomCollFromText() , ST_GeometryCollectionFromText() , ST_GeomCollFromTxt() |
Return geometry collection from WKT |
ST_GeomCollFromWKB() , ST_GeometryCollectionFromWKB() |
Return geometry collection from WKB |
ST_GeometryN() |
Return N-th geometry from geometry collection |
ST_GeometryType() |
Return name of geometry type |
ST_GeomFromGeoJSON() |
Generate geometry from GeoJSON object |
ST_GeomFromText() , ST_GeometryFromText() |
Return geometry from WKT |
ST_GeomFromWKB() , ST_GeometryFromWKB() |
Return geometry from WKB |
ST_InteriorRingN() |
Return N-th interior ring of Polygon |
ST_Intersection() |
Return point set intersection of two geometries |
ST_Intersects() |
Whether one geometry intersects another |
ST_IsClosed() |
Whether a geometry is closed and simple |
ST_IsEmpty() |
Placeholder function |
ST_IsSimple() |
Whether a geometry is simple |
ST_IsValid() |
Whether a geometry is valid |
ST_LatFromGeoHash() |
Return latitude from geohash value |
ST_Latitude() |
Return latitude of Point |
ST_Length() |
Return length of LineString |
ST_LineFromText() , ST_LineStringFromText() |
Construct LineString from WKT |
ST_LineFromWKB() , ST_LineStringFromWKB() |
Construct LineString from WKB |
ST_LongFromGeoHash() |
Return longitude from geohash value |
ST_Longitude() |
Return longitude of Point |
ST_MakeEnvelope() |
Rectangle around two points |
ST_MLineFromText() , ST_MultiLineStringFromText() |
Construct MultiLineString from WKT |
ST_MLineFromWKB() , ST_MultiLineStringFromWKB() |
Construct MultiLineString from WKB |
ST_MPointFromText() , ST_MultiPointFromText() |
Construct MultiPoint from WKT |
ST_MPointFromWKB() , ST_MultiPointFromWKB() |
Construct MultiPoint from WKB |
ST_MPolyFromText() , ST_MultiPolygonFromText() |
Construct MultiPolygon from WKT |
ST_MPolyFromWKB() , ST_MultiPolygonFromWKB() |
Construct MultiPolygon from WKB |
ST_NumGeometries() |
Return number of geometries in geometry collection |
ST_NumInteriorRing() , ST_NumInteriorRings() |
Return number of interior rings in Polygon |
ST_NumPoints() |
Return number of points in LineString |
ST_Overlaps() |
Whether one geometry overlaps another |
ST_PointFromGeoHash() |
Convert geohash value to POINT value |
ST_PointFromText() |
Construct Point from WKT |
ST_PointFromWKB() |
Construct Point from WKB |
ST_PointN() |
Return N-th point from LineString |
ST_PolyFromText() , ST_PolygonFromText() |
Construct Polygon from WKT |
ST_PolyFromWKB() , ST_PolygonFromWKB() |
Construct Polygon from WKB |
ST_Simplify() |
Return simplified geometry |
ST_SRID() |
Return spatial reference system ID for geometry |
ST_StartPoint() |
Start Point of LineString |
ST_SwapXY() |
Return argument with X/Y coordinates swapped |
ST_SymDifference() |
Return point set symmetric difference of two geometries |
ST_Touches() |
Whether one geometry touches another |
ST_Transform() |
Transform coordinates of geometry |
ST_Union() |
Return point set union of two geometries |
ST_Validate() |
Return validated geometry |
ST_Within() |
Whether one geometry is within another |
ST_X() |
Return X coordinate of Point |
ST_Y() |
Return Y coordinate of Point |
STATEMENT_DIGEST() |
Compute statement digest hash value |
STATEMENT_DIGEST_TEXT() |
Compute normalized statement digest |
STD() |
Return the population standard deviation |
STDDEV() |
Return the population standard deviation |
STDDEV_POP() |
Return the population standard deviation |
STDDEV_SAMP() |
Return the sample standard deviation |
STR_TO_DATE() |
Convert a string to a date |
STRCMP() |
Compare two strings |
SUBDATE() |
Synonym for DATE_SUB() when invoked with three arguments |
SUBSTR() |
Return the substring as specified |
SUBSTRING() |
Return the substring as specified |
SUBSTRING_INDEX() |
Return a substring from a string before the specified number of occurrences of the delimiter |
SUBTIME() |
Subtract times |
SUM() |
Return the sum |
SYSDATE() |
Return the time at which the function executes |
SYSTEM_USER() |
Synonym for USER() |
TAN() |
Return the tangent of the argument |
TIME() |
Extract the time portion of the expression passed |
TIME_FORMAT() |
Format as time |
TIME_TO_SEC() |
Return the argument converted to seconds |
TIMEDIFF() |
Subtract time |
* |
Multiplication operator |
TIMESTAMP() |
With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments |
TIMESTAMPADD() |
Add an interval to a datetime expression |
TIMESTAMPDIFF() |
Subtract an interval from a datetime expression |
TO_BASE64() |
Return the argument converted to a base-64 string |
TO_DAYS() |
Return the date argument converted to days |
TO_SECONDS() |
Return the date or datetime argument converted to seconds since Year 0 |
TRIM() |
Remove leading and trailing spaces |
TRUNCATE() |
Truncate to specified number of decimal places |
UCASE() |
Synonym for UPPER() |
- |
Change the sign of the argument |
UNCOMPRESS() |
Uncompress a string compressed |
UNCOMPRESSED_LENGTH() |
Return the length of a string before compression |
UNHEX() |
Return a string containing hex representation of a number |
UNIX_TIMESTAMP() |
Return a Unix timestamp |
UpdateXML() |
Return replaced XML fragment |
UPPER() |
Convert to uppercase |
USER() |
The user name and host name provided by the client |
UTC_DATE() |
Return the current UTC date |
UTC_TIME() |
Return the current UTC time |
UTC_TIMESTAMP() |
Return the current UTC date and time |
UUID() |
Return a Universal Unique Identifier (UUID) |
UUID_SHORT() |
Return an integer-valued universal identifier |
UUID_TO_BIN() |
Convert string UUID to binary |
VALIDATE_PASSWORD_STRENGTH() |
Determine strength of password |
VALUES() |
Define the values to be used during an INSERT |
VAR_POP() |
Return the population standard variance |
VAR_SAMP() |
Return the sample variance |
VARIANCE() |
Return the population standard variance |
VERSION() |
Return a string that indicates the MySQL server version |
WAIT_FOR_EXECUTED_GTID_SET() |
Wait until the given GTIDs have executed on slave. |
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS() |
Wait until the given GTIDs have executed on slave. |
WEEK() |
Return the week number |
WEEKDAY() |
Return the weekday index |
WEEKOFYEAR() |
Return the calendar week of the date (1-53) |
WEIGHT_STRING() |
Return the weight string for a string |
XOR |
Logical XOR |
YEAR() |
Return the year |
YEARWEEK() |
Return the year and week |
When an operator is used with operands of different types, type conversion occurs to make the operands compatible. Some conversions occur implicitly. For example, MySQL automatically converts strings to numbers as necessary, and vice versa.
mysql>SELECT 1+'1';
-> 2 mysql>SELECT CONCAT(2,' test');
-> '2 test'
It is also possible to convert a number to a string explicitly
using the CAST()
function.
Conversion occurs implicitly with the
CONCAT()
function because it
expects string arguments.
mysql>SELECT 38.8, CAST(38.8 AS CHAR);
-> 38.8, '38.8' mysql>SELECT 38.8, CONCAT(38.8);
-> 38.8, '38.8'
See later in this section for information about the character set
of implicit number-to-string conversions, and for modified rules
that apply to CREATE TABLE ... SELECT
statements.
The following rules describe how conversion occurs for comparison operations:
If one or both arguments are NULL
, the
result of the comparison is NULL
, except
for the NULL
-safe
<=>
equality comparison operator. For NULL <=>
NULL
, the result is true. No conversion is needed.
If both arguments in a comparison operation are strings, they are compared as strings.
If both arguments are integers, they are compared as integers.
Hexadecimal values are treated as binary strings if not compared to a number.
If one of the arguments is a
TIMESTAMP
or
DATETIME
column and the other
argument is a constant, the constant is converted to a
timestamp before the comparison is performed. This is done to
be more ODBC-friendly. This is not done for the arguments to
IN()
. To be safe, always use
complete datetime, date, or time strings when doing
comparisons. For example, to achieve best results when using
BETWEEN
with date or time values,
use CAST()
to explicitly
convert the values to the desired data type.
A single-row subquery from a table or tables is not considered
a constant. For example, if a subquery returns an integer to
be compared to a DATETIME
value, the comparison is done as two integers. The integer is
not converted to a temporal value. To compare the operands as
DATETIME
values, use
CAST()
to explicitly convert
the subquery value to DATETIME
.
If one of the arguments is a decimal value, comparison depends on the other argument. The arguments are compared as decimal values if the other argument is a decimal or integer value, or as floating-point values if the other argument is a floating-point value.
In all other cases, the arguments are compared as floating-point (real) numbers.
For information about conversion of values from one temporal type to another, see Section 11.3.6, “Conversion Between Date and Time Types”.
Comparison of JSON values takes place at two levels. The first level of comparison is based on the JSON types of the compared values. If the types differ, the comparison result is determined solely by which type has higher precedence. If the two values have the same JSON type, a second level of comparison occurs using type-specific rules. For comparison of JSON and non-JSON values, the non-JSON value is converted to JSON and the values compared as JSON values. For details, see Comparison and Ordering of JSON Values.
The following examples illustrate conversion of strings to numbers for comparison operations:
mysql>SELECT 1 > '6x';
-> 0 mysql>SELECT 7 > '6x';
-> 1 mysql>SELECT 0 > 'x6';
-> 0 mysql>SELECT 0 = 'x6';
-> 1
For comparisons of a string column with a number, MySQL cannot use
an index on the column to look up the value quickly. If
str_col
is an indexed string column,
the index cannot be used when performing the lookup in the
following statement:
SELECT * FROMtbl_name
WHEREstr_col
=1;
The reason for this is that there are many different strings that
may convert to the value 1
, such as
'1'
, ' 1'
, or
'1a'
.
Comparisons that use floating-point numbers (or values that are converted to floating-point numbers) are approximate because such numbers are inexact. This might lead to results that appear inconsistent:
mysql>SELECT '18015376320243458' = 18015376320243458;
-> 1 mysql>SELECT '18015376320243459' = 18015376320243459;
-> 0
Such results can occur because the values are converted to floating-point numbers, which have only 53 bits of precision and are subject to rounding:
mysql> SELECT '18015376320243459'+0.0;
-> 1.8015376320243e+16
Furthermore, the conversion from string to floating-point and from integer to floating-point do not necessarily occur the same way. The integer may be converted to floating-point by the CPU, whereas the string is converted digit by digit in an operation that involves floating-point multiplications.
The results shown will vary on different systems, and can be
affected by factors such as computer architecture or the compiler
version or optimization level. One way to avoid such problems is
to use CAST()
so that a value is
not converted implicitly to a float-point number:
mysql> SELECT CAST('18015376320243459' AS UNSIGNED) = 18015376320243459;
-> 1
For more information about floating-point comparisons, see Section B.6.4.8, “Problems with Floating-Point Values”.
The server includes dtoa
, a conversion library
that provides the basis for improved conversion between string or
DECIMAL
values and
approximate-value
(FLOAT
/DOUBLE
)
numbers:
Consistent conversion results across platforms, which eliminates, for example, Unix versus Windows conversion differences.
Accurate representation of values in cases where results previously did not provide sufficient precision, such as for values close to IEEE limits.
Conversion of numbers to string format with the best possible
precision. The precision of dtoa
is always
the same or better than that of the standard C library
functions.
Because the conversions produced by this library differ in some
cases from non-dtoa
results, the potential
exists for incompatibilities in applications that rely on previous
results. For example, applications that depend on a specific exact
result from previous conversions might need adjustment to
accommodate additional precision.
The dtoa
library provides conversions with the
following properties. D
represents a
value with a DECIMAL
or string
representation, and F
represents a
floating-point number in native binary (IEEE) format.
F
->
D
conversion is done with the best
possible precision, returning D
as
the shortest string that yields F
when read back in and rounded to the nearest value in native
binary format as specified by IEEE.
D
->
F
conversion is done such that
F
is the nearest native binary
number to the input decimal string
D
.
These properties imply that F
->
D
-> F
conversions are lossless unless F
is
-inf
, +inf
, or
NaN
. The latter values are not supported
because the SQL standard defines them as invalid values for
FLOAT
or
DOUBLE
.
For D
->
F
-> D
conversions, a sufficient condition for losslessness is that
D
uses 15 or fewer digits of precision,
is not a denormal value, -inf
,
+inf
, or NaN
. In some cases,
the conversion is lossless even if D
has more than 15 digits of precision, but this is not always the
case.
Implicit conversion of a numeric or temporal value to string
produces a value that has a character set and collation determined
by the character_set_connection
and collation_connection
system
variables. (These variables commonly are set with
SET NAMES
. For information about
connection character sets, see
Section 10.4, “Connection Character Sets and Collations”.)
This means that such a conversion results in a character
(nonbinary) string (a CHAR
,
VARCHAR
, or
LONGTEXT
value), except in the case
that the connection character set is set to
binary
. In that case, the conversion result is
a binary string (a BINARY
,
VARBINARY
, or
LONGBLOB
value).
For integer expressions, the preceding remarks about expression evaluation apply somewhat differently for expression assignment; for example, in a statement such as this:
CREATE TABLE t SELECT integer_expr
;
In this case, the table in the column resulting from the
expression has type INT
or
BIGINT
depending on the length of
the integer expression. If the maximum length of the expression
does not fit in an INT
,
BIGINT
is used instead. The length
is taken from the max_length
value of the
SELECT
result set metadata (see
Section 28.7.5, “C API Data Structures”). This means that you can
force a BIGINT
rather than
INT
by use of a sufficiently long
expression:
CREATE TABLE t SELECT 000000000000000000000;
Table 12.2 Operators
Name | Description |
---|---|
AND , && |
Logical AND |
= |
Assign a value (as part of a
SET
statement, or as part of the SET clause in an
UPDATE statement)
|
:= |
Assign a value |
BETWEEN ... AND ... |
Check whether a value is within a range of values |
BINARY |
Cast a string to a binary string |
& |
Bitwise AND |
~ |
Bitwise inversion |
| |
Bitwise OR |
^ |
Bitwise XOR |
CASE |
Case operator |
DIV |
Integer division |
/ |
Division operator |
= |
Equal operator |
<=> |
NULL-safe equal to operator |
> |
Greater than operator |
>= |
Greater than or equal operator |
IS |
Test a value against a boolean |
IS NOT |
Test a value against a boolean |
IS NOT NULL |
NOT NULL value test |
IS NULL |
NULL value test |
-> |
Return value from JSON column after evaluating path; equivalent to JSON_EXTRACT(). |
->> |
Return value from JSON column after evaluating path and unquoting the result; equivalent to JSON_UNQUOTE(JSON_EXTRACT()). |
<< |
Left shift |
< |
Less than operator |
<= |
Less than or equal operator |
LIKE |
Simple pattern matching |
- |
Minus operator |
% , MOD |
Modulo operator |
NOT , ! |
Negates value |
NOT BETWEEN ... AND ... |
Check whether a value is not within a range of values |
!= , <> |
Not equal operator |
NOT LIKE |
Negation of simple pattern matching |
NOT REGEXP |
Negation of REGEXP |
|| , OR |
Logical OR |
+ |
Addition operator |
REGEXP |
Whether string matches regular expression |
>> |
Right shift |
RLIKE |
Whether string matches regular expression |
SOUNDS LIKE |
Compare sounds |
* |
Multiplication operator |
- |
Change the sign of the argument |
XOR |
Logical XOR |
Operator precedences are shown in the following list, from highest precedence to the lowest. Operators that are shown together on a line have the same precedence.
INTERVAL BINARY, COLLATE ! - (unary minus), ~ (unary bit inversion) ^ *, /, DIV, %, MOD -, + <<, >> & | = (comparison), <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, IN BETWEEN, CASE, WHEN, THEN, ELSE NOT AND, && XOR OR, || = (assignment), :=
The precedence of =
depends on whether it is
used as a comparison operator
(=
) or as an
assignment operator
(=
). When
used as a comparison operator, it has the same precedence as
<=>
,
>=
,
>
,
<=
,
<
,
<>
,
!=
,
IS
,
LIKE
,
REGEXP
, and
IN
. When used as an assignment
operator, it has the same precedence as
:=
.
Section 13.7.5.1, “SET Syntax for Variable Assignment”, and
Section 9.4, “User-Defined Variables”, explain how MySQL determines
which interpretation of =
should apply.
For operators that occur at the same precedence level within an expression, evaluation proceeds left to right, with the exception that assignments evaluate right to left.
The precedence and meaning of some operators depends on the SQL mode:
By default, ||
is a logical OR
operator. With
PIPES_AS_CONCAT
enabled,
||
is string
concatenation, with a precedence between
^
and
the unary operators.
By default, !
has a higher precedence than NOT
. With
HIGH_NOT_PRECEDENCE
enabled, !
and
NOT
have the same precedence.
See Section 5.1.11, “Server SQL Modes”.
The precedence of operators determines the order of evaluation of terms in an expression. To override this order and group terms explicitly, use parentheses. For example:
mysql>SELECT 1+2*3;
-> 7 mysql>SELECT (1+2)*3;
-> 9
Table 12.3 Comparison Operators
Name | Description |
---|---|
BETWEEN ... AND ... |
Check whether a value is within a range of values |
COALESCE() |
Return the first non-NULL argument |
= |
Equal operator |
<=> |
NULL-safe equal to operator |
> |
Greater than operator |
>= |
Greater than or equal operator |
GREATEST() |
Return the largest argument |
IN() |
Check whether a value is within a set of values |
INTERVAL() |
Return the index of the argument that is less than the first argument |
IS |
Test a value against a boolean |
IS NOT |
Test a value against a boolean |
IS NOT NULL |
NOT NULL value test |
IS NULL |
NULL value test |
ISNULL() |
Test whether the argument is NULL |
LEAST() |
Return the smallest argument |
< |
Less than operator |
<= |
Less than or equal operator |
LIKE |
Simple pattern matching |
NOT BETWEEN ... AND ... |
Check whether a value is not within a range of values |
!= , <> |
Not equal operator |
NOT IN() |
Check whether a value is not within a set of values |
NOT LIKE |
Negation of simple pattern matching |
STRCMP() |
Compare two strings |
Comparison operations result in a value of 1
(TRUE
), 0
(FALSE
), or NULL
. These
operations work for both numbers and strings. Strings are
automatically converted to numbers and numbers to strings as
necessary.
The following relational comparison operators can be used to compare not only scalar operands, but row operands:
= > < >= <= <> !=
The descriptions for those operators later in this section detail how they work with row operands. For additional examples of row comparisons in the context of row subqueries, see Section 13.2.11.5, “Row Subqueries”.
Some of the functions in this section return values other than
1
(TRUE
),
0
(FALSE
), or
NULL
. LEAST()
and GREATEST()
are examples of
such functions; Section 12.2, “Type Conversion in Expression Evaluation”, describes the
rules for comparison operations performed by these and similar
functions for determining their return values.
In previous versions of MySQL, when evaluating an expression
containing LEAST()
or
GREATEST()
, the server attempted to guess
the context in which the function was used, and to coerce the
function's arguments to the data type of the expression
as a whole. For example, the arguments to LEAST("11",
"45", "2")
are evaluated and sorted as strings, so
that this expression returns "11"
. In MySQL
8.0.3 and earlier, when evaluating the expression
LEAST("11", "45", "2") + 0
, the server
converted the arguments to integers (anticipating the addition
of integer 0 to the result) before sorting them, thus
returning 2.
Beginning with MySQL 8.0.4, the server no longer attempts to
infer context in this fashion. Instead, the function is
executed using the arguments as provided, performing data type
conversions to one or more of the arguments if and only if
they are not all of the same type. Any type coercion mandated
by an expression that makes use of the return value is now
performed following function execution. This means that, in
MySQl 8.0.4 and later, LEAST("11", "45", "2") +
0
evaluates to "11" + 0
and thus
to integer 11. (Bug #83895, Bug #25123839)
To convert a value to a specific type for comparison purposes,
you can use the CAST()
function.
String values can be converted to a different character set
using CONVERT()
. See
Section 12.10, “Cast Functions and Operators”.
By default, string comparisons are not case-sensitive and use
the current character set. The default is
utf8mb4
.
Equal:
mysql>SELECT 1 = 0;
-> 0 mysql>SELECT '0' = 0;
-> 1 mysql>SELECT '0.0' = 0;
-> 1 mysql>SELECT '0.01' = 0;
-> 0 mysql>SELECT '.01' = 0.01;
-> 1
For row comparisons, (a, b) = (x, y)
is
equivalent to:
(a = x) AND (b = y)
NULL
-safe equal. This operator performs
an equality comparison like the
=
operator,
but returns 1
rather than
NULL
if both operands are
NULL
, and 0
rather
than NULL
if one operand is
NULL
.
The
<=>
operator is equivalent to the standard SQL IS NOT
DISTINCT FROM
operator.
mysql>SELECT 1 <=> 1, NULL <=> NULL, 1 <=> NULL;
-> 1, 1, 0 mysql>SELECT 1 = 1, NULL = NULL, 1 = NULL;
-> 1, NULL, NULL
For row comparisons, (a, b) <=> (x,
y)
is equivalent to:
(a <=> x) AND (b <=> y)
Not equal:
mysql>SELECT '.01' <> '0.01';
-> 1 mysql>SELECT .01 <> '0.01';
-> 0 mysql>SELECT 'zapp' <> 'zappp';
-> 1
For row comparisons, (a, b) <> (x,
y)
and (a, b) != (x, y)
are
equivalent to:
(a <> x) OR (b <> y)
Less than or equal:
mysql> SELECT 0.1 <= 2;
-> 1
For row comparisons, (a, b) <= (x, y)
is equivalent to:
(a < x) OR ((a = x) AND (b <= y))
Less than:
mysql> SELECT 2 < 2;
-> 0
For row comparisons, (a, b) < (x, y)
is equivalent to:
(a < x) OR ((a = x) AND (b < y))
Greater than or equal:
mysql> SELECT 2 >= 2;
-> 1
For row comparisons, (a, b) >= (x, y)
is equivalent to:
(a > x) OR ((a = x) AND (b >= y))
Greater than:
mysql> SELECT 2 > 2;
-> 0
For row comparisons, (a, b) > (x, y)
is equivalent to:
(a > x) OR ((a = x) AND (b > y))
Tests a value against a boolean value, where
boolean_value
can be
TRUE
, FALSE
, or
UNKNOWN
.
mysql> SELECT 1 IS TRUE, 0 IS FALSE, NULL IS UNKNOWN;
-> 1, 1, 1
Tests a value against a boolean value, where
boolean_value
can be
TRUE
, FALSE
, or
UNKNOWN
.
mysql> SELECT 1 IS NOT UNKNOWN, 0 IS NOT UNKNOWN, NULL IS NOT UNKNOWN;
-> 1, 1, 0
Tests whether a value is NULL
.
mysql> SELECT 1 IS NULL, 0 IS NULL, NULL IS NULL;
-> 0, 0, 1
To work well with ODBC programs, MySQL supports the
following extra features when using IS
NULL
:
If sql_auto_is_null
variable is set to 1, then after a statement that
successfully inserts an automatically generated
AUTO_INCREMENT
value, you can find
that value by issuing a statement of the following form:
SELECT * FROMtbl_name
WHEREauto_col
IS NULL
If the statement returns a row, the value returned is
the same as if you invoked the
LAST_INSERT_ID()
function. For details, including the return value after
a multiple-row insert, see
Section 12.15, “Information Functions”. If no
AUTO_INCREMENT
value was successfully
inserted, the SELECT
statement returns no row.
The behavior of retrieving an
AUTO_INCREMENT
value by using an
IS NULL
comparison can be
disabled by setting
sql_auto_is_null = 0
.
See Section 5.1.8, “Server System Variables”.
The default value of
sql_auto_is_null
is 0.
For DATE
and
DATETIME
columns that are
declared as NOT NULL
, you can find
the special date '0000-00-00'
by
using a statement like this:
SELECT * FROMtbl_name
WHEREdate_column
IS NULL
This is needed to get some ODBC applications to work
because ODBC does not support a
'0000-00-00'
date value.
See
Obtaining Auto-Increment Values,
and the description for the
FLAG_AUTO_IS_NULL
option at
Connector/ODBC Connection Parameters.
Tests whether a value is not NULL
.
mysql> SELECT 1 IS NOT NULL, 0 IS NOT NULL, NULL IS NOT NULL;
-> 1, 1, 0
If expr
is greater than or equal
to min
and
expr
is less than or equal to
max
,
BETWEEN
returns
1
, otherwise it returns
0
. This is equivalent to the expression
(
if all the
arguments are of the same type. Otherwise type conversion
takes place according to the rules described in
Section 12.2, “Type Conversion in Expression Evaluation”, but applied to all the
three arguments.
min
<=
expr
AND
expr
<=
max
)
mysql>SELECT 2 BETWEEN 1 AND 3, 2 BETWEEN 3 and 1;
-> 1, 0 mysql>SELECT 1 BETWEEN 2 AND 3;
-> 0 mysql>SELECT 'b' BETWEEN 'a' AND 'c';
-> 1 mysql>SELECT 2 BETWEEN 2 AND '3';
-> 1 mysql>SELECT 2 BETWEEN 2 AND 'x-3';
-> 0
For best results when using
BETWEEN
with date or time
values, use CAST()
to
explicitly convert the values to the desired data type.
Examples: If you compare a
DATETIME
to two
DATE
values, convert the
DATE
values to
DATETIME
values. If you use a
string constant such as '2001-1-1'
in a
comparison to a DATE
, cast
the string to a DATE
.
This is the same as NOT
(
.
expr
BETWEEN
min
AND
max
)
Returns the first non-NULL
value in the
list, or NULL
if there are no
non-NULL
values.
The return type of COALESCE()
is the aggregated type of the argument types.
mysql>SELECT COALESCE(NULL,1);
-> 1 mysql>SELECT COALESCE(NULL,NULL,NULL);
-> NULL
With two or more arguments, returns the largest
(maximum-valued) argument. The arguments are compared using
the same rules as for
LEAST()
.
mysql>SELECT GREATEST(2,0);
-> 2 mysql>SELECT GREATEST(34.0,3.0,5.0,767.0);
-> 767.0 mysql>SELECT GREATEST('B','A','C');
-> 'C'
GREATEST()
returns
NULL
if any argument is
NULL
.
Returns 1
if
expr
is equal to any of the
values in the IN
list, else returns
0
. If all values are constants, they are
evaluated according to the type of
expr
and sorted. The search for
the item then is done using a binary search. This means
IN
is very quick if the
IN
value list consists entirely of
constants. Otherwise, type conversion takes place according
to the rules described in Section 12.2, “Type Conversion in Expression Evaluation”,
but applied to all the arguments.
mysql>SELECT 2 IN (0,3,5,7);
-> 0 mysql>SELECT 'wefwf' IN ('wee','wefwf','weg');
-> 1
IN
can be used to compare row
constructors:
mysql>SELECT (3,4) IN ((1,2), (3,4));
-> 1 mysql>SELECT (3,4) IN ((1,2), (3,5));
-> 0
You should never mix quoted and unquoted values in an
IN
list because the comparison rules for
quoted values (such as strings) and unquoted values (such as
numbers) differ. Mixing types may therefore lead to
inconsistent results. For example, do not write an
IN
expression like this:
SELECT val1 FROM tbl1 WHERE val1 IN (1,2,'a');
Instead, write it like this:
SELECT val1 FROM tbl1 WHERE val1 IN ('1','2','a');
The number of values in the IN
list is
only limited by the
max_allowed_packet
value.
To comply with the SQL standard, IN
returns NULL
not only if the expression
on the left hand side is NULL
, but also
if no match is found in the list and one of the expressions
in the list is NULL
.
IN()
syntax can also be used to write
certain types of subqueries. See
Section 13.2.11.3, “Subqueries with ANY, IN, or SOME”.
This is the same as NOT
(
.
expr
IN
(value
,...))
If expr
is
NULL
,
ISNULL()
returns
1
, otherwise it returns
0
.
mysql>SELECT ISNULL(1+1);
-> 0 mysql>SELECT ISNULL(1/0);
-> 1
ISNULL()
can be used instead
of =
to test
whether a value is NULL
. (Comparing a
value to NULL
using
=
always
yields NULL
.)
The ISNULL()
function shares
some special behaviors with the
IS NULL
comparison operator. See the description of
IS NULL
.
Returns 0
if N
< N1
, 1
if
N
<
N2
and so on or
-1
if N
is
NULL
. All arguments are treated as
integers. It is required that N1
< N2
<
N3
< ...
< Nn
for this function to work
correctly. This is because a binary search is used (very
fast).
mysql>SELECT INTERVAL(23, 1, 15, 17, 30, 44, 200);
-> 3 mysql>SELECT INTERVAL(10, 1, 10, 100, 1000);
-> 2 mysql>SELECT INTERVAL(22, 23, 30, 44, 200);
-> 0
With two or more arguments, returns the smallest (minimum-valued) argument. The arguments are compared using the following rules:
If any argument is NULL
, the result
is NULL
. No comparison is needed.
If all arguments are integer-valued, they are compared as integers.
If at least one argument is double precision, they are
compared as double-precision values. Otherwise, if at
least one argument is a
DECIMAL
value, they are
compared as DECIMAL
values.
If the arguments comprise a mix of numbers and strings, they are compared as strings.
If any argument is a nonbinary (character) string, the arguments are compared as nonbinary strings.
In all other cases, the arguments are compared as binary strings.
The return type of LEAST()
is
the aggregated type of the comparison argument types.
mysql>SELECT LEAST(2,0);
-> 0 mysql>SELECT LEAST(34.0,3.0,5.0,767.0);
-> 3.0 mysql>SELECT LEAST('B','A','C');
-> 'A'
In SQL, all logical operators evaluate to
TRUE
, FALSE
, or
NULL
(UNKNOWN
). In MySQL,
these are implemented as 1 (TRUE
), 0
(FALSE
), and NULL
. Most of
this is common to different SQL database servers, although some
servers may return any nonzero value for
TRUE
.
MySQL evaluates any nonzero, non-NULL
value
to TRUE
. For example, the following
statements all assess to TRUE
:
mysql>SELECT 10 IS TRUE;
-> 1 mysql>SELECT -10 IS TRUE;
-> 1 mysql>SELECT 'string' IS NOT NULL;
-> 1
Logical NOT. Evaluates to 1
if the
operand is 0
, to 0
if
the operand is nonzero, and NOT NULL
returns NULL
.
mysql>SELECT NOT 10;
-> 0 mysql>SELECT NOT 0;
-> 1 mysql>SELECT NOT NULL;
-> NULL mysql>SELECT ! (1+1);
-> 0 mysql>SELECT ! 1+1;
-> 1
The last example produces 1
because the
expression evaluates the same way as
(!1)+1
.
Logical AND. Evaluates to 1
if all
operands are nonzero and not NULL
, to
0
if one or more operands are
0
, otherwise NULL
is
returned.
mysql>SELECT 1 AND 1;
-> 1 mysql>SELECT 1 AND 0;
-> 0 mysql>SELECT 1 AND NULL;
-> NULL mysql>SELECT 0 AND NULL;
-> 0 mysql>SELECT NULL AND 0;
-> 0
Logical OR. When both operands are
non-NULL
, the result is
1
if any operand is nonzero, and
0
otherwise. With a
NULL
operand, the result is
1
if the other operand is nonzero, and
NULL
otherwise. If both operands are
NULL
, the result is
NULL
.
mysql>SELECT 1 OR 1;
-> 1 mysql>SELECT 1 OR 0;
-> 1 mysql>SELECT 0 OR 0;
-> 0 mysql>SELECT 0 OR NULL;
-> NULL mysql>SELECT 1 OR NULL;
-> 1
Logical XOR. Returns NULL
if either
operand is NULL
. For
non-NULL
operands, evaluates to
1
if an odd number of operands is
nonzero, otherwise 0
is returned.
mysql>SELECT 1 XOR 1;
-> 0 mysql>SELECT 1 XOR 0;
-> 1 mysql>SELECT 1 XOR NULL;
-> NULL mysql>SELECT 1 XOR 1 XOR 1;
-> 1
a XOR b
is mathematically equal to
(a AND (NOT b)) OR ((NOT a) and b)
.
Assignment operator. Causes the user variable on the left
hand side of the operator to take on the value to its right.
The value on the right hand side may be a literal value,
another variable storing a value, or any legal expression
that yields a scalar value, including the result of a query
(provided that this value is a scalar value). You can
perform multiple assignments in the same
SET
statement. You can perform multiple assignments in the same
statement.
Unlike
=
, the
:=
operator is never interpreted as a comparison operator. This
means you can use
:=
in
any valid SQL statement (not just in
SET
statements) to assign a value to a variable.
mysql>SELECT @var1, @var2;
-> NULL, NULL mysql>SELECT @var1 := 1, @var2;
-> 1, NULL mysql>SELECT @var1, @var2;
-> 1, NULL mysql>SELECT @var1, @var2 := @var1;
-> 1, 1 mysql>SELECT @var1, @var2;
-> 1, 1 mysql>SELECT @var1:=COUNT(*) FROM t1;
-> 4 mysql>SELECT @var1;
-> 4
You can make value assignments using
:=
in
other statements besides
SELECT
, such as
UPDATE
, as shown here:
mysql>SELECT @var1;
-> 4 mysql>SELECT * FROM t1;
-> 1, 3, 5, 7 mysql>UPDATE t1 SET c1 = 2 WHERE c1 = @var1:= 1;
Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql>SELECT @var1;
-> 1 mysql>SELECT * FROM t1;
-> 2, 3, 5, 7
While it is also possible both to set and to read the value
of the same variable in a single SQL statement using the
:=
operator, this is not recommended.
Section 9.4, “User-Defined Variables”, explains why you should
avoid doing this.
This operator is used to perform value assignments in two cases, described in the next two paragraphs.
Within a
SET
statement, =
is treated as an assignment
operator that causes the user variable on the left hand side
of the operator to take on the value to its right. (In other
words, when used in a
SET
statement, =
is treated identically to
:=
.)
The value on the right hand side may be a literal value,
another variable storing a value, or any legal expression
that yields a scalar value, including the result of a query
(provided that this value is a scalar value). You can
perform multiple assignments in the same
SET
statement.
In the SET
clause of an
UPDATE
statement,
=
also acts as an assignment operator; in
this case, however, it causes the column named on the left
hand side of the operator to assume the value given to the
right, provided any WHERE
conditions that
are part of the UPDATE
are
met. You can make multiple assignments in the same
SET
clause of an
UPDATE
statement.
In any other context, =
is treated as a
comparison operator.
mysql>SELECT @var1, @var2;
-> NULL, NULL mysql>SELECT @var1 := 1, @var2;
-> 1, NULL mysql>SELECT @var1, @var2;
-> 1, NULL mysql>SELECT @var1, @var2 := @var1;
-> 1, 1 mysql>SELECT @var1, @var2;
-> 1, 1
For more information, see Section 13.7.5.1, “SET Syntax for Variable Assignment”, Section 13.2.12, “UPDATE Syntax”, and Section 13.2.11, “Subquery Syntax”.
CASE
value
WHEN
[compare_value
] THEN
result
[WHEN
[compare_value
] THEN
result
...] [ELSE
result
] END
CASE WHEN
[
condition
] THEN
result
[WHEN
[condition
] THEN
result
...] [ELSE
result
] END
The first CASE
syntax returns the
result
for the first
comparison that is true. The second syntax returns the result
for the first condition that is true. If no comparison or
condition is true, the result after value
=compare_value
ELSE
is
returned, or NULL
if there is no
ELSE
part.
The syntax of the CASE
expr described here differs slightly
from that of the SQL CASE
statement described in
Section 13.6.5.1, “CASE Syntax”, for use inside stored programs. The
CASE
statement cannot have an
ELSE NULL
clause, and it is terminated
with END CASE
instead of
END
.
The return type of a CASE
expression result is the aggregated type of all result values:
If all types are numeric, the aggregated type is also numeric:
If at least one argument is double precision, the result is double precision.
Otherwise, if at least one argument is
DECIMAL
, the result is
DECIMAL
.
Otherwise, the result is an integer type (with one exception):
If all integer types are all signed or all
unsigned, the result is the same sign and the
precision is the highest of all specified integer
types (that is,
TINYINT
,
SMALLINT
,
MEDIUMINT
,
INT
, or
BIGINT
).
If there is a combination of signed and unsigned
integer types, the result is signed and the
precision may be higher. For example, if the types
are signed INT
and
unsigned INT
, the
result is signed
BIGINT
.
The exception is unsigned
BIGINT
combined
with any signed integer type. The result is
DECIMAL
with
sufficient precision and scale 0.
If all types are BIT
, the
result is BIT
. Otherwise,
BIT
arguments are treated
similar to BIGINT
.
If all types are YEAR
, the
result is YEAR
. Otherwise,
YEAR
arguments are treated similar to
INT
.
If all types are character string
(CHAR
or
VARCHAR
), the result is
VARCHAR
with maximum length
determined by the longest character length of the
operands.
If all types are character or binary string, the result is
VARBINARY
.
SET
and
ENUM
are treated similar to
VARCHAR
; the result is
VARCHAR
.
If all types are temporal, the result is temporal:
If all types are GEOMETRY
, the result
is GEOMETRY
.
For all other type combinations, the result is
VARCHAR
.
Literal NULL
operands are ignored for
type aggregation.
mysql>SELECT CASE 1 WHEN 1 THEN 'one'
->WHEN 2 THEN 'two' ELSE 'more' END;
-> 'one' mysql>SELECT CASE WHEN 1>0 THEN 'true' ELSE 'false' END;
-> 'true' mysql>SELECT CASE BINARY 'B'
->WHEN 'a' THEN 1 WHEN 'b' THEN 2 END;
-> NULL
If expr1
is TRUE
(
and expr1
<>
0
), expr1
<> NULLIF()
returns expr2
. Otherwise, it
returns expr3
.
There is also an IF
statement, which differs from the
IF()
function described here. See
Section 13.6.5.2, “IF Syntax”.
If only one of expr2
or
expr3
is explicitly
NULL
, the result type of the
IF()
function is the type of
the non-NULL
expression.
The default return type of IF()
(which may matter when it is stored into a temporary table) is
calculated as follows:
If expr2
or
expr3
produce a string, the
result is a string.
If expr2
and
expr3
are both strings, the
result is case-sensitive if either string is case
sensitive.
If expr2
or
expr3
produce a floating-point
value, the result is a floating-point value.
If expr2
or
expr3
produce an integer, the
result is an integer.
mysql>SELECT IF(1>2,2,3);
-> 3 mysql>SELECT IF(1<2,'yes','no');
-> 'yes' mysql>SELECT IF(STRCMP('test','test1'),'no','yes');
-> 'no'
If expr1
is not
NULL
,
IFNULL()
returns
expr1
; otherwise it returns
expr2
.
mysql>SELECT IFNULL(1,0);
-> 1 mysql>SELECT IFNULL(NULL,10);
-> 10 mysql>SELECT IFNULL(1/0,10);
-> 10 mysql>SELECT IFNULL(1/0,'yes');
-> 'yes'
The default return type of
IFNULL(
is the more “general” of the two expressions, in
the order expr1
,expr2
)STRING
, REAL
,
or INTEGER
. Consider the case of a table
based on expressions or where MySQL must internally store a
value returned by IFNULL()
in a
temporary table:
mysql>CREATE TABLE tmp SELECT IFNULL(1,'test') AS test;
mysql>DESCRIBE tmp;
+-------+--------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+--------------+------+-----+---------+-------+ | test | varbinary(4) | NO | | | | +-------+--------------+------+-----+---------+-------+
In this example, the type of the test
column is VARBINARY(4)
(a
string type).
Returns NULL
if
is true, otherwise
returns expr1
=
expr2
expr1
. This is the same as
CASE WHEN
.
expr1
=
expr2
THEN NULL ELSE
expr1
END
The return value has the same type as the first argument.
mysql>SELECT NULLIF(1,1);
-> NULL mysql>SELECT NULLIF(1,2);
-> 1
MySQL evaluates expr1
twice if
the arguments are not equal.
Table 12.7 String Operators
Name | Description |
---|---|
ASCII() |
Return numeric value of left-most character |
BIN() |
Return a string containing binary representation of a number |
BIT_LENGTH() |
Return length of argument in bits |
CHAR() |
Return the character for each integer passed |
CHAR_LENGTH() |
Return number of characters in argument |
CHARACTER_LENGTH() |
Synonym for CHAR_LENGTH() |
CONCAT() |
Return concatenated string |
CONCAT_WS() |
Return concatenate with separator |
ELT() |
Return string at index number |
EXPORT_SET() |
Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string |
FIELD() |
Index (position) of first argument in subsequent arguments |
FIND_IN_SET() |
Index (position) of first argument within second argument |
FORMAT() |
Return a number formatted to specified number of decimal places |
FROM_BASE64() |
Decode base64 encoded string and return result |
HEX() |
Hexadecimal representation of decimal or string value |
INSERT() |
Insert substring at specified position up to specified number of characters |
INSTR() |
Return the index of the first occurrence of substring |
LCASE() |
Synonym for LOWER() |
LEFT() |
Return the leftmost number of characters as specified |
LENGTH() |
Return the length of a string in bytes |
LIKE |
Simple pattern matching |
LOAD_FILE() |
Load the named file |
LOCATE() |
Return the position of the first occurrence of substring |
LOWER() |
Return the argument in lowercase |
LPAD() |
Return the string argument, left-padded with the specified string |
LTRIM() |
Remove leading spaces |
MAKE_SET() |
Return a set of comma-separated strings that have the corresponding bit in bits set |
MATCH |
Perform full-text search |
MID() |
Return a substring starting from the specified position |
NOT LIKE |
Negation of simple pattern matching |
NOT REGEXP |
Negation of REGEXP |
OCT() |
Return a string containing octal representation of a number |
OCTET_LENGTH() |
Synonym for LENGTH() |
ORD() |
Return character code for leftmost character of the argument |
POSITION() |
Synonym for LOCATE() |
QUOTE() |
Escape the argument for use in an SQL statement |
REGEXP |
Whether string matches regular expression |
REGEXP_INSTR() |
Starting index of substring matching regular expression |
REGEXP_LIKE() |
Whether string matches regular expression |
REGEXP_REPLACE() |
Replace substrings matching regular expression |
REGEXP_SUBSTR() |
Return substring matching regular expression |
REPEAT() |
Repeat a string the specified number of times |
REPLACE() |
Replace occurrences of a specified string |
REVERSE() |
Reverse the characters in a string |
RIGHT() |
Return the specified rightmost number of characters |
RLIKE |
Whether string matches regular expression |
RPAD() |
Append string the specified number of times |
RTRIM() |
Remove trailing spaces |
SOUNDEX() |
Return a soundex string |
SOUNDS LIKE |
Compare sounds |
SPACE() |
Return a string of the specified number of spaces |
STRCMP() |
Compare two strings |
SUBSTR() |
Return the substring as specified |
SUBSTRING() |
Return the substring as specified |
SUBSTRING_INDEX() |
Return a substring from a string before the specified number of occurrences of the delimiter |
TO_BASE64() |
Return the argument converted to a base-64 string |
TRIM() |
Remove leading and trailing spaces |
UCASE() |
Synonym for UPPER() |
UNHEX() |
Return a string containing hex representation of a number |
UPPER() |
Convert to uppercase |
WEIGHT_STRING() |
Return the weight string for a string |
String-valued functions return NULL
if the
length of the result would be greater than the value of the
max_allowed_packet
system
variable. See Section 5.1.1, “Configuring the Server”.
For functions that operate on string positions, the first position is numbered 1.
For functions that take length arguments, noninteger arguments are rounded to the nearest integer.
Returns the numeric value of the leftmost character of the
string str
. Returns
0
if str
is the
empty string. Returns NULL
if
str
is NULL
.
ASCII()
works for 8-bit
characters.
mysql>SELECT ASCII('2');
-> 50 mysql>SELECT ASCII(2);
-> 50 mysql>SELECT ASCII('dx');
-> 100
See also the ORD()
function.
Returns a string representation of the binary value of
N
, where
N
is a longlong
(BIGINT
) number. This is
equivalent to
CONV(
.
Returns N
,10,2)NULL
if
N
is NULL
.
mysql> SELECT BIN(12);
-> '1100'
Returns the length of the string
str
in bits.
mysql> SELECT BIT_LENGTH('text');
-> 32
CHAR(
N
,...
[USING charset_name
])
CHAR()
interprets each argument
N
as an integer and returns a
string consisting of the characters given by the code values
of those integers. NULL
values are skipped.
mysql>SELECT CHAR(77,121,83,81,'76');
-> 'MySQL' mysql>SELECT CHAR(77,77.3,'77.3');
-> 'MMM'
CHAR()
arguments larger than
255 are converted into multiple result bytes. For example,
CHAR(256)
is equivalent to
CHAR(1,0)
, and
CHAR(256*256)
is equivalent to
CHAR(1,0,0)
:
mysql>SELECT HEX(CHAR(1,0)), HEX(CHAR(256));
+----------------+----------------+ | HEX(CHAR(1,0)) | HEX(CHAR(256)) | +----------------+----------------+ | 0100 | 0100 | +----------------+----------------+ mysql>SELECT HEX(CHAR(1,0,0)), HEX(CHAR(256*256));
+------------------+--------------------+ | HEX(CHAR(1,0,0)) | HEX(CHAR(256*256)) | +------------------+--------------------+ | 010000 | 010000 | +------------------+--------------------+
By default, CHAR()
returns a
binary string. To produce a string in a given character set,
use the optional USING
clause:
mysql> SELECT CHARSET(CHAR(X'65')), CHARSET(CHAR(X'65' USING utf8));
+----------------------+---------------------------------+
| CHARSET(CHAR(X'65')) | CHARSET(CHAR(X'65' USING utf8)) |
+----------------------+---------------------------------+
| binary | utf8 |
+----------------------+---------------------------------+
If USING
is given and the result string is
illegal for the given character set, a warning is issued.
Also, if strict SQL mode is enabled, the result from
CHAR()
becomes
NULL
.
Returns the length of the string
str
, measured in characters. A
multibyte character counts as a single character. This means
that for a string containing five 2-byte characters,
LENGTH()
returns
10
, whereas
CHAR_LENGTH()
returns
5
.
CHARACTER_LENGTH()
is a synonym
for CHAR_LENGTH()
.
Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are nonbinary strings, the result is a nonbinary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent nonbinary string form.
CONCAT()
returns
NULL
if any argument is
NULL
.
mysql>SELECT CONCAT('My', 'S', 'QL');
-> 'MySQL' mysql>SELECT CONCAT('My', NULL, 'QL');
-> NULL mysql>SELECT CONCAT(14.3);
-> '14.3'
For quoted strings, concatenation can be performed by placing the strings next to each other:
mysql> SELECT 'My' 'S' 'QL';
-> 'MySQL'
CONCAT_WS(
separator
,str1
,str2
,...)
CONCAT_WS()
stands for
Concatenate With Separator and is a special form of
CONCAT()
. The first argument is
the separator for the rest of the arguments. The separator is
added between the strings to be concatenated. The separator
can be a string, as can the rest of the arguments. If the
separator is NULL
, the result is
NULL
.
mysql>SELECT CONCAT_WS(',','First name','Second name','Last Name');
-> 'First name,Second name,Last Name' mysql>SELECT CONCAT_WS(',','First name',NULL,'Last Name');
-> 'First name,Last Name'
CONCAT_WS()
does not skip empty
strings. However, it does skip any NULL
values after the separator argument.
ELT()
returns the
N
th element of the list of strings:
str1
if
N
= 1
,
str2
if
N
= 2
, and so
on. Returns NULL
if
N
is less than 1
or greater than the number of arguments.
ELT()
is the complement of
FIELD()
.
mysql>SELECT ELT(1, 'Aa', 'Bb', 'Cc', 'Dd');
-> 'Aa' mysql>SELECT ELT(4, 'Aa', 'Bb', 'Cc', 'Dd');
-> 'Dd'
EXPORT_SET(
bits
,on
,off
[,separator
[,number_of_bits
]])
Returns a string such that for every bit set in the value
bits
, you get an
on
string and for every bit not set
in the value, you get an off
string. Bits in bits
are examined
from right to left (from low-order to high-order bits).
Strings are added to the result from left to right, separated
by the separator
string (the
default being the comma character ,
). The
number of bits examined is given by
number_of_bits
, which has a default
of 64 if not specified.
number_of_bits
is silently clipped
to 64 if larger than 64. It is treated as an unsigned integer,
so a value of −1 is effectively the same as 64.
mysql>SELECT EXPORT_SET(5,'Y','N',',',4);
-> 'Y,N,Y,N' mysql>SELECT EXPORT_SET(6,'1','0',',',10);
-> '0,1,1,0,0,0,0,0,0,0'
Returns the index (position) of str
in the str1
,
str2
,
str3
, ...
list.
Returns 0
if str
is not found.
If all arguments to FIELD()
are
strings, all arguments are compared as strings. If all
arguments are numbers, they are compared as numbers.
Otherwise, the arguments are compared as double.
If str
is NULL
,
the return value is 0
because
NULL
fails equality comparison with any
value. FIELD()
is the
complement of ELT()
.
mysql>SELECT FIELD('Bb', 'Aa', 'Bb', 'Cc', 'Dd', 'Ff');
-> 2 mysql>SELECT FIELD('Gg', 'Aa', 'Bb', 'Cc', 'Dd', 'Ff');
-> 0
Returns a value in the range of 1 to
N
if the string
str
is in the string list
strlist
consisting of
N
substrings. A string list is a
string composed of substrings separated by
,
characters. If the first argument is a
constant string and the second is a column of type
SET
, the
FIND_IN_SET()
function is
optimized to use bit arithmetic. Returns 0
if str
is not in
strlist
or if
strlist
is the empty string.
Returns NULL
if either argument is
NULL
. This function does not work properly
if the first argument contains a comma (,
)
character.
mysql> SELECT FIND_IN_SET('b','a,b,c,d');
-> 2
Formats the number X
to a format
like '#,###,###.##'
, rounded to
D
decimal places, and returns the
result as a string. If D
is
0
, the result has no decimal point or
fractional part.
The optional third parameter enables a locale to be specified
to be used for the result number's decimal point, thousands
separator, and grouping between separators. Permissible locale
values are the same as the legal values for the
lc_time_names
system variable
(see Section 10.15, “MySQL Server Locale Support”). If no locale is
specified, the default is 'en_US'
.
mysql>SELECT FORMAT(12332.123456, 4);
-> '12,332.1235' mysql>SELECT FORMAT(12332.1,4);
-> '12,332.1000' mysql>SELECT FORMAT(12332.2,0);
-> '12,332' mysql>SELECT FORMAT(12332.2,2,'de_DE');
-> '12.332,20'
Takes a string encoded with the base-64 encoded rules used by
TO_BASE64()
and returns the
decoded result as a binary string. The result is
NULL
if the argument is
NULL
or not a valid base-64 string. See the
description of TO_BASE64()
for
details about the encoding and decoding rules.
mysql> SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc'));
-> 'JWJj', 'abc'
For a string argument str
,
HEX()
returns a hexadecimal
string representation of str
where
each byte of each character in str
is converted to two hexadecimal digits. (Multibyte characters
therefore become more than two digits.) The inverse of this
operation is performed by the
UNHEX()
function.
For a numeric argument N
,
HEX()
returns a hexadecimal
string representation of the value of
N
treated as a longlong
(BIGINT
) number. This is
equivalent to
CONV(
.
The inverse of this operation is performed by
N
,10,16)CONV(HEX(
.
N
),16,10)
mysql>SELECT X'616263', HEX('abc'), UNHEX(HEX('abc'));
-> 'abc', 616263, 'abc' mysql>SELECT HEX(255), CONV(HEX(255),16,10);
-> 'FF', 255
Returns the string str
, with the
substring beginning at position pos
and len
characters long replaced by
the string newstr
. Returns the
original string if pos
is not
within the length of the string. Replaces the rest of the
string from position pos
if
len
is not within the length of the
rest of the string. Returns NULL
if any
argument is NULL
.
mysql>SELECT INSERT('Quadratic', 3, 4, 'What');
-> 'QuWhattic' mysql>SELECT INSERT('Quadratic', -1, 4, 'What');
-> 'Quadratic' mysql>SELECT INSERT('Quadratic', 3, 100, 'What');
-> 'QuWhat'
This function is multibyte safe.
Returns the position of the first occurrence of substring
substr
in string
str
. This is the same as the
two-argument form of LOCATE()
,
except that the order of the arguments is reversed.
mysql>SELECT INSTR('foobarbar', 'bar');
-> 4 mysql>SELECT INSTR('xbar', 'foobar');
-> 0
This function is multibyte safe, and is case-sensitive only if at least one argument is a binary string.
LCASE()
is a synonym for
LOWER()
.
LCASE()
used in a view is rewritten as
LOWER()
when storing the view's
definition. (Bug #12844279)
Returns the leftmost len
characters
from the string str
, or
NULL
if any argument is
NULL
.
mysql> SELECT LEFT('foobarbar', 5);
-> 'fooba'
This function is multibyte safe.
Returns the length of the string
str
, measured in bytes. A multibyte
character counts as multiple bytes. This means that for a
string containing five 2-byte characters,
LENGTH()
returns
10
, whereas
CHAR_LENGTH()
returns
5
.
mysql> SELECT LENGTH('text');
-> 4
The Length()
OpenGIS spatial function is
named ST_Length()
in MySQL.
Reads the file and returns the file contents as a string. To
use this function, the file must be located on the server
host, you must specify the full path name to the file, and you
must have the FILE
privilege.
The file must be readable by all and its size less than
max_allowed_packet
bytes. If
the secure_file_priv
system
variable is set to a nonempty directory name, the file to be
loaded must be located in that directory.
If the file does not exist or cannot be read because one of
the preceding conditions is not satisfied, the function
returns NULL
.
The character_set_filesystem
system variable controls interpretation of file names that are
given as literal strings.
mysql>UPDATE t
SET blob_col=LOAD_FILE('/tmp/picture')
WHERE id=1;
LOCATE(
,
substr
,str
)LOCATE(
substr
,str
,pos
)
The first syntax returns the position of the first occurrence
of substring substr
in string
str
. The second syntax returns the
position of the first occurrence of substring
substr
in string
str
, starting at position
pos
. Returns 0
if substr
is not in
str
. Returns
NULL
if any argument is
NULL
.
mysql>SELECT LOCATE('bar', 'foobarbar');
-> 4 mysql>SELECT LOCATE('xbar', 'foobar');
-> 0 mysql>SELECT LOCATE('bar', 'foobarbar', 5);
-> 7
This function is multibyte safe, and is case-sensitive only if at least one argument is a binary string.
Returns the string str
with all
characters changed to lowercase according to the current
character set mapping. The default is
utf8mb4
.
mysql> SELECT LOWER('QUADRATICALLY');
-> 'quadratically'
LOWER()
(and
UPPER()
) are ineffective when
applied to binary strings
(BINARY
,
VARBINARY
,
BLOB
). To perform lettercase
conversion, convert the string to a nonbinary string:
mysql>SET @str = BINARY 'New York';
mysql>SELECT LOWER(@str), LOWER(CONVERT(@str USING utf8mb4));
+-------------+------------------------------------+ | LOWER(@str) | LOWER(CONVERT(@str USING utf8mb4)) | +-------------+------------------------------------+ | New York | new york | +-------------+------------------------------------+
For collations of Unicode character sets,
LOWER()
and
UPPER()
work according to the
Unicode Collation Algorithm (UCA) version in the collation
name, if there is one, and UCA 4.0.0 if no version is
specified. For example, utf8mb4_0900_ai_ci
and utf8_unicode_520_ci
work according to
UCA 9.0.0 and 5.2.0, respectively, whereas
utf8_unicode_ci
works according to UCA
4.0.0. See Section 10.10.1, “Unicode Character Sets”.
This function is multibyte safe.
LCASE()
used within views is rewritten as
LOWER()
.
Returns the string str
, left-padded
with the string padstr
to a length
of len
characters. If
str
is longer than
len
, the return value is shortened
to len
characters.
mysql>SELECT LPAD('hi',4,'??');
-> '??hi' mysql>SELECT LPAD('hi',1,'??');
-> 'h'
Returns the string str
with leading
space characters removed.
mysql> SELECT LTRIM(' barbar');
-> 'barbar'
This function is multibyte safe.
Returns a set value (a string containing substrings separated
by ,
characters) consisting of the strings
that have the corresponding bit in
bits
set.
str1
corresponds to bit 0,
str2
to bit 1, and so on.
NULL
values in
str1
,
str2
, ...
are
not appended to the result.
mysql>SELECT MAKE_SET(1,'a','b','c');
-> 'a' mysql>SELECT MAKE_SET(1 | 4,'hello','nice','world');
-> 'hello,world' mysql>SELECT MAKE_SET(1 | 4,'hello','nice',NULL,'world');
-> 'hello' mysql>SELECT MAKE_SET(0,'a','b','c');
-> ''
MID(
is a synonym for
str
,pos
,len
)SUBSTRING(
.
str
,pos
,len
)
Returns a string representation of the octal value of
N
, where
N
is a longlong
(BIGINT
) number. This is
equivalent to
CONV(
.
Returns N
,10,8)NULL
if
N
is NULL
.
mysql> SELECT OCT(12);
-> '14'
OCTET_LENGTH()
is a synonym for
LENGTH()
.
If the leftmost character of the string
str
is a multibyte character,
returns the code for that character, calculated from the
numeric values of its constituent bytes using this formula:
(1st byte code) + (2nd byte code * 256) + (3rd byte code * 256^2) ...
If the leftmost character is not a multibyte character,
ORD()
returns the same value as
the ASCII()
function.
mysql> SELECT ORD('2');
-> 50
POSITION(
is a synonym for
substr
IN str
)LOCATE(
.
substr
,str
)
Quotes a string to produce a result that can be used as a
properly escaped data value in an SQL statement. The string is
returned enclosed by single quotation marks and with each
instance of backslash (\
), single quote
('
), ASCII NUL
, and
Control+Z preceded by a backslash. If the argument is
NULL
, the return value is the word
“NULL” without enclosing single quotation marks.
mysql>SELECT QUOTE('Don\'t!');
-> 'Don\'t!' mysql>SELECT QUOTE(NULL);
-> NULL
For comparison, see the quoting rules for literal strings and within the C API in Section 9.1.1, “String Literals”, and Section 28.7.7.56, “mysql_real_escape_string_quote()”.
Returns a string consisting of the string
str
repeated
count
times. If
count
is less than 1, returns an
empty string. Returns NULL
if
str
or
count
are NULL
.
mysql> SELECT REPEAT('MySQL', 3);
-> 'MySQLMySQLMySQL'
Returns the string str
with all
occurrences of the string from_str
replaced by the string to_str
.
REPLACE()
performs a
case-sensitive match when searching for
from_str
.
mysql> SELECT REPLACE('www.mysql.com', 'w', 'Ww');
-> 'WwWwWw.mysql.com'
This function is multibyte safe.
Returns the string str
with the
order of the characters reversed.
mysql> SELECT REVERSE('abc');
-> 'cba'
This function is multibyte safe.
Returns the rightmost len
characters from the string str
, or
NULL
if any argument is
NULL
.
mysql> SELECT RIGHT('foobarbar', 4);
-> 'rbar'
This function is multibyte safe.
Returns the string str
,
right-padded with the string padstr
to a length of len
characters. If
str
is longer than
len
, the return value is shortened
to len
characters.
mysql>SELECT RPAD('hi',5,'?');
-> 'hi???' mysql>SELECT RPAD('hi',1,'?');
-> 'h'
This function is multibyte safe.
Returns the string str
with
trailing space characters removed.
mysql> SELECT RTRIM('barbar ');
-> 'barbar'
This function is multibyte safe.
Returns a soundex string from str
.
Two strings that sound almost the same should have identical
soundex strings. A standard soundex string is four characters
long, but the SOUNDEX()
function returns an arbitrarily long string. You can use
SUBSTRING()
on the result to
get a standard soundex string. All nonalphabetic characters in
str
are ignored. All international
alphabetic characters outside the A-Z range are treated as
vowels.
When using SOUNDEX()
, you
should be aware of the following limitations:
This function, as currently implemented, is intended to work well with strings that are in the English language only. Strings in other languages may not produce reliable results.
This function is not guaranteed to provide consistent
results with strings that use multibyte character sets,
including utf-8
. See Bug #22638 for
more information.
mysql>SELECT SOUNDEX('Hello');
-> 'H400' mysql>SELECT SOUNDEX('Quadratically');
-> 'Q36324'
This function implements the original Soundex algorithm, not the more popular enhanced version (also described by D. Knuth). The difference is that original version discards vowels first and duplicates second, whereas the enhanced version discards duplicates first and vowels second.
This is the same as
SOUNDEX(
.
expr1
)
= SOUNDEX(expr2
)
Returns a string consisting of N
space characters.
mysql> SELECT SPACE(6);
-> ' '
SUBSTR(
,
str
,pos
)SUBSTR(
,
str
FROM pos
)SUBSTR(
,
str
,pos
,len
)SUBSTR(
str
FROM pos
FOR
len
)
SUBSTR()
is a synonym for
SUBSTRING()
.
SUBSTRING(
,
str
,pos
)SUBSTRING(
,
str
FROM pos
)SUBSTRING(
,
str
,pos
,len
)SUBSTRING(
str
FROM pos
FOR
len
)
The forms without a len
argument
return a substring from string str
starting at position pos
. The forms
with a len
argument return a
substring len
characters long from
string str
, starting at position
pos
. The forms that use
FROM
are standard SQL syntax. It is also
possible to use a negative value for
pos
. In this case, the beginning of
the substring is pos
characters
from the end of the string, rather than the beginning. A
negative value may be used for pos
in any of the forms of this function.
For all forms of SUBSTRING()
,
the position of the first character in the string from which
the substring is to be extracted is reckoned as
1
.
mysql>SELECT SUBSTRING('Quadratically',5);
-> 'ratically' mysql>SELECT SUBSTRING('foobarbar' FROM 4);
-> 'barbar' mysql>SELECT SUBSTRING('Quadratically',5,6);
-> 'ratica' mysql>SELECT SUBSTRING('Sakila', -3);
-> 'ila' mysql>SELECT SUBSTRING('Sakila', -5, 3);
-> 'aki' mysql>SELECT SUBSTRING('Sakila' FROM -4 FOR 2);
-> 'ki'
This function is multibyte safe.
If len
is less than 1, the result
is the empty string.
SUBSTRING_INDEX(
str
,delim
,count
)
Returns the substring from string
str
before
count
occurrences of the delimiter
delim
. If
count
is positive, everything to
the left of the final delimiter (counting from the left) is
returned. If count
is negative,
everything to the right of the final delimiter (counting from
the right) is returned.
SUBSTRING_INDEX()
performs a
case-sensitive match when searching for
delim
.
mysql>SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
-> 'www.mysql' mysql>SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);
-> 'mysql.com'
This function is multibyte safe.
Converts the string argument to base-64 encoded form and
returns the result as a character string with the connection
character set and collation. If the argument is not a string,
it is converted to a string before conversion takes place. The
result is NULL
if the argument is
NULL
. Base-64 encoded strings can be
decoded using the FROM_BASE64()
function.
mysql> SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc'));
-> 'JWJj', 'abc'
Different base-64 encoding schemes exist. These are the
encoding and decoding rules used by
TO_BASE64()
and
FROM_BASE64()
:
The encoding for alphabet value 62 is
'+'
.
The encoding for alphabet value 63 is
'/'
.
Encoded output consists of groups of 4 printable
characters. Each 3 bytes of the input data are encoded
using 4 characters. If the last group is incomplete, it is
padded with '='
characters to a length
of 4.
A newline is added after each 76 characters of encoded output to divide long output into multiple lines.
Decoding recognizes and ignores newline, carriage return, tab, and space.
TRIM([{BOTH | LEADING | TRAILING}
[
,
remstr
] FROM]
str
)TRIM([
remstr
FROM] str
)
Returns the string str
with all
remstr
prefixes or suffixes
removed. If none of the specifiers BOTH
,
LEADING
, or TRAILING
is
given, BOTH
is assumed.
remstr
is optional and, if not
specified, spaces are removed.
mysql>SELECT TRIM(' bar ');
-> 'bar' mysql>SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx');
-> 'barxxx' mysql>SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx');
-> 'bar' mysql>SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz');
-> 'barx'
This function is multibyte safe.
UCASE()
is a synonym for
UPPER()
.
UCASE()
used within views is rewritten as
UPPER()
.
For a string argument str
,
UNHEX(
interprets each pair of characters in the argument as a
hexadecimal number and converts it to the byte represented by
the number. The return value is a binary string.
str
)
mysql>SELECT UNHEX('4D7953514C');
-> 'MySQL' mysql>SELECT X'4D7953514C';
-> 'MySQL' mysql>SELECT UNHEX(HEX('string'));
-> 'string' mysql>SELECT HEX(UNHEX('1267'));
-> '1267'
The characters in the argument string must be legal
hexadecimal digits: '0'
..
'9'
, 'A'
..
'F'
, 'a'
..
'f'
. If the argument contains any
nonhexadecimal digits, the result is NULL
:
mysql> SELECT UNHEX('GG');
+-------------+
| UNHEX('GG') |
+-------------+
| NULL |
+-------------+
A NULL
result can occur if the argument to
UNHEX()
is a
BINARY
column, because values
are padded with 0x00 bytes when stored but those bytes are not
stripped on retrieval. For example, '41'
is
stored into a CHAR(3)
column as
'41 '
and retrieved as
'41'
(with the trailing pad space
stripped), so UNHEX()
for the
column value returns 'A'
. By contrast
'41'
is stored into a
BINARY(3)
column as
'41\0'
and retrieved as
'41\0'
(with the trailing pad
0x00
byte not stripped).
'\0'
is not a legal hexadecimal digit, so
UNHEX()
for the column value
returns NULL
.
For a numeric argument N
, the
inverse of
HEX(
is not performed by N
)UNHEX()
.
Use
CONV(HEX(
instead. See the description of
N
),16,10)HEX()
.
Returns the string str
with all
characters changed to uppercase according to the current
character set mapping. The default is
utf8mb4
.
mysql> SELECT UPPER('Hej');
-> 'HEJ'
See the description of LOWER()
for information that also applies to
UPPER()
. This included
information about how to perform lettercase conversion of
binary strings (BINARY
,
VARBINARY
,
BLOB
) for which these functions
are ineffective, and information about case folding for
Unicode character sets.
This function is multibyte safe.
UCASE()
used within views is rewritten as
UPPER()
.
WEIGHT_STRING(
str
[AS {CHAR|BINARY}(N
)]
[flags
])
This function returns the weight string for the input string. The return value is a binary string that represents the comparison and sorting value of the string. It has these properties:
If
WEIGHT_STRING(
=
str1
)WEIGHT_STRING(
,
then str2
)
(str1
=
str2
str1
and
str2
are considered equal)
If
WEIGHT_STRING(
<
str1
)WEIGHT_STRING(
,
then str2
)
(str1
<
str2
str1
sorts before
str2
)
WEIGHT_STRING()
is a debugging
function intended for internal use. Its behavior can change
without notice between MySQL versions. It can be used for
testing and debugging of collations, especially if you are
adding a new collation. See
Section 10.13, “Adding a Collation to a Character Set”.
This list briefly summarizes the arguments. More details are given in the discussion following the list.
str
: The input string
expression.
AS
clause: Optional; cast the input
string to a given type and length.
flags
: Optional; unused.
The input string, str
, is a string
expression. If the input is a nonbinary (character) string
such as a CHAR
,
VARCHAR
, or
TEXT
value, the return value
contains the collation weights for the string. If the input is
a binary (byte) string such as a
BINARY
,
VARBINARY
, or
BLOB
value, the return value is
the same as the input (the weight for each byte in a binary
string is the byte value). If the input is
NULL
,
WEIGHT_STRING()
returns
NULL
.
Examples:
mysql>SET @s = _utf8mb4 'AB' COLLATE utf8mb4_0900_ai_ci;
mysql>SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+ | @s | HEX(@s) | HEX(WEIGHT_STRING(@s)) | +------+---------+------------------------+ | AB | 4142 | 1C471C60 | +------+---------+------------------------+
mysql>SET @s = _utf8mb4 'ab' COLLATE utf8mb4_0900_ai_ci;
mysql>SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+ | @s | HEX(@s) | HEX(WEIGHT_STRING(@s)) | +------+---------+------------------------+ | ab | 6162 | 1C471C60 | +------+---------+------------------------+
mysql>SET @s = CAST('AB' AS BINARY);
mysql>SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+ | @s | HEX(@s) | HEX(WEIGHT_STRING(@s)) | +------+---------+------------------------+ | AB | 4142 | 4142 | +------+---------+------------------------+
mysql>SET @s = CAST('ab' AS BINARY);
mysql>SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+ | @s | HEX(@s) | HEX(WEIGHT_STRING(@s)) | +------+---------+------------------------+ | ab | 6162 | 6162 | +------+---------+------------------------+
The preceding examples use
HEX()
to display the
WEIGHT_STRING()
result. Because
the result is a binary value,
HEX()
can be especially useful
when the result contains nonprinting values, to display it in
printable form:
mysql>SET @s = CONVERT(X'C39F' USING utf8) COLLATE utf8_czech_ci;
mysql>SELECT HEX(WEIGHT_STRING(@s));
+------------------------+ | HEX(WEIGHT_STRING(@s)) | +------------------------+ | 0FEA0FEA | +------------------------+
For non-NULL
return values, the data type
of the value is VARBINARY
if
its length is within the maximum length for
VARBINARY
, otherwise the data
type is BLOB
.
The AS
clause may be given to cast the
input string to a nonbinary or binary string and to force it
to a given length:
AS CHAR(
casts the string to a nonbinary string and pads it on the
right with spaces to a length of
N
)N
characters.
N
must be at least 1. If
N
is less than the length of
the input string, the string is truncated to
N
characters. No warning occurs
for truncation.
AS BINARY(
is similar but casts the string to a binary string,
N
)N
is measured in bytes (not
characters), and padding uses 0x00
bytes (not spaces).
mysql>SET NAMES 'latin1';
mysql>SELECT HEX(WEIGHT_STRING('ab' AS CHAR(4)));
+-------------------------------------+ | HEX(WEIGHT_STRING('ab' AS CHAR(4))) | +-------------------------------------+ | 41422020 | +-------------------------------------+ mysql>SET NAMES 'utf8';
mysql>SELECT HEX(WEIGHT_STRING('ab' AS CHAR(4)));
+-------------------------------------+ | HEX(WEIGHT_STRING('ab' AS CHAR(4))) | +-------------------------------------+ | 0041004200200020 | +-------------------------------------+
mysql> SELECT HEX(WEIGHT_STRING('ab' AS BINARY(4)));
+---------------------------------------+
| HEX(WEIGHT_STRING('ab' AS BINARY(4))) |
+---------------------------------------+
| 61620000 |
+---------------------------------------+
The flags
clause currently is
unused.
If a string function is given a binary string as an argument, the resulting string is also a binary string. A number converted to a string is treated as a binary string. This affects only comparisons.
Normally, if any expression in a string comparison is case sensitive, the comparison is performed in case-sensitive fashion.
expr
LIKE pat
[ESCAPE
'escape_char
']
Pattern matching using an SQL pattern. Returns
1
(TRUE
) or
0
(FALSE
). If either
expr
or
pat
is NULL
,
the result is NULL
.
The pattern need not be a literal string. For example, it can be specified as a string expression or table column.
Per the SQL standard, LIKE
performs matching on a per-character basis, thus it can
produce results different from the
=
comparison
operator:
mysql>SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+ | 'ä' LIKE 'ae' COLLATE latin1_german2_ci | +-----------------------------------------+ | 0 | +-----------------------------------------+ mysql>SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+ | 'ä' = 'ae' COLLATE latin1_german2_ci | +--------------------------------------+ | 1 | +--------------------------------------+
In particular, trailing spaces are significant, which is not
true for CHAR
or
VARCHAR
comparisons performed
with the =
operator:
mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
| 1 | 0 |
+------------+---------------+
1 row in set (0.00 sec)
With LIKE
you can use the
following two wildcard characters in the pattern:
%
matches any number of characters,
even zero characters.
_
matches exactly one character.
mysql>SELECT 'David!' LIKE 'David_';
-> 1 mysql>SELECT 'David!' LIKE '%D%v%';
-> 1
To test for literal instances of a wildcard character,
precede it by the escape character. If you do not specify
the ESCAPE
character,
\
is assumed.
\%
matches one %
character.
\_
matches one _
character.
mysql>SELECT 'David!' LIKE 'David\_';
-> 0 mysql>SELECT 'David_' LIKE 'David\_';
-> 1
To specify a different escape character, use the
ESCAPE
clause:
mysql> SELECT 'David_' LIKE 'David|_' ESCAPE '|';
-> 1
The escape sequence should be empty or one character long.
The expression must evaluate as a constant at execution
time. If the
NO_BACKSLASH_ESCAPES
SQL
mode is enabled, the sequence cannot be empty.
The following two statements illustrate that string comparisons are not case-sensitive unless one of the operands is case-sensitive (uses a case-sensitive collation or is a binary string):
mysql>SELECT 'abc' LIKE 'ABC';
-> 1 mysql>SELECT 'abc' LIKE _utf8mb4 'ABC' COLLATE utf8mb4_0900_as_cs;
-> 0 mysql>SELECT 'abc' LIKE _utf8mb4 'ABC' COLLATE utf8mb4_bin;
-> 0 mysql>SELECT 'abc' LIKE BINARY 'ABC';
-> 0
As an extension to standard SQL, MySQL permits
LIKE
on numeric expressions.
mysql> SELECT 10 LIKE '1%';
-> 1
Because MySQL uses C escape syntax in strings (for
example, \n
to represent a newline
character), you must double any \
that
you use in LIKE
strings. For
example, to search for \n
, specify it
as \\n
. To search for
\
, specify it as
\\\\
; this is because the backslashes
are stripped once by the parser and again when the pattern
match is made, leaving a single backslash to be matched
against.
Exception: At the end of the pattern string, backslash can
be specified as \\
. At the end of the
string, backslash stands for itself because there is
nothing following to escape. Suppose that a table contains
the following values:
mysql> SELECT filename FROM t1;
+--------------+
| filename |
+--------------+
| C: |
| C:\ |
| C:\Programs |
| C:\Programs\ |
+--------------+
To test for values that end with backslash, you can match the values using either of the following patterns:
mysql>SELECT filename, filename LIKE '%\\' FROM t1;
+--------------+---------------------+ | filename | filename LIKE '%\\' | +--------------+---------------------+ | C: | 0 | | C:\ | 1 | | C:\Programs | 0 | | C:\Programs\ | 1 | +--------------+---------------------+ mysql>SELECT filename, filename LIKE '%\\\\' FROM t1;
+--------------+-----------------------+ | filename | filename LIKE '%\\\\' | +--------------+-----------------------+ | C: | 0 | | C:\ | 1 | | C:\Programs | 0 | | C:\Programs\ | 1 | +--------------+-----------------------+
expr
NOT LIKE pat
[ESCAPE
'escape_char
']
This is the same as NOT
(
.
expr
LIKE
pat
[ESCAPE
'escape_char
'])
Aggregate queries involving NOT
LIKE
comparisons with columns containing
NULL
may yield unexpected results. For
example, consider the following table and data:
CREATE TABLE foo (bar VARCHAR(10)); INSERT INTO foo VALUES (NULL), (NULL);
The query SELECT COUNT(*) FROM foo WHERE bar LIKE
'%baz%';
returns 0
. You might
assume that SELECT COUNT(*) FROM foo WHERE bar
NOT LIKE '%baz%';
would return
2
. However, this is not the case: The
second query returns 0
. This is because
NULL NOT LIKE
always returns
expr
NULL
, regardless of the value of
expr
. The same is true for
aggregate queries involving NULL
and
comparisons using
NOT
RLIKE
or NOT
REGEXP
. In such cases, you must test explicitly
for NOT NULL
using
OR
(and not
AND
), as shown here:
SELECT COUNT(*) FROM foo WHERE bar NOT LIKE '%baz%' OR bar IS NULL;
STRCMP()
returns
0
if the strings are the same,
-1
if the first argument is smaller than
the second according to the current sort order, and
1
otherwise.
mysql>SELECT STRCMP('text', 'text2');
-> -1 mysql>SELECT STRCMP('text2', 'text');
-> 1 mysql>SELECT STRCMP('text', 'text');
-> 0
STRCMP()
performs the
comparison using the collation of the arguments.
mysql>SET @s1 = _utf8mb4 'x' COLLATE utf8mb4_0900_ai_ci;
mysql>SET @s2 = _utf8mb4 'X' COLLATE utf8mb4_0900_ai_ci;
mysql>SET @s3 = _utf8mb4 'x' COLLATE utf8mb4_0900_as_cs;
mysql>SET @s4 = _utf8mb4 'X' COLLATE utf8mb4_0900_as_cs;
mysql>SELECT STRCMP(@s1, @s2), STRCMP(@s3, @s4);
+------------------+------------------+ | STRCMP(@s1, @s2) | STRCMP(@s3, @s4) | +------------------+------------------+ | 0 | -1 | +------------------+------------------+
If the collations are incompatible, one of the arguments must be converted to be compatible with the other. See Section 10.8.4, “Collation Coercibility in Expressions”.
mysql> SET @s1 = _utf8mb4 'x' COLLATE utf8mb4_0900_ai_ci; mysql> SET @s2 = _utf8mb4 'X' COLLATE utf8mb4_0900_ai_ci; mysql> SET @s3 = _utf8mb4 'x' COLLATE utf8mb4_0900_as_cs; mysql> SET @s4 = _utf8mb4 'X' COLLATE utf8mb4_0900_as_cs; --> mysql>SELECT STRCMP(@s1, @s3);
ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_0900_ai_ci,IMPLICIT) and (utf8mb4_0900_as_cs,IMPLICIT) for operation 'strcmp' mysql>SELECT STRCMP(@s1, @s3 COLLATE utf8mb4_0900_ai_ci);
+---------------------------------------------+ | STRCMP(@s1, @s3 COLLATE utf8mb4_0900_ai_ci) | +---------------------------------------------+ | 0 | +---------------------------------------------+
Table 12.9 Regular Expression Functions and Operators
Name | Description |
---|---|
NOT REGEXP |
Negation of REGEXP |
REGEXP |
Whether string matches regular expression |
REGEXP_INSTR() |
Starting index of substring matching regular expression |
REGEXP_LIKE() |
Whether string matches regular expression |
REGEXP_REPLACE() |
Replace substrings matching regular expression |
REGEXP_SUBSTR() |
Return substring matching regular expression |
RLIKE |
Whether string matches regular expression |
A regular expression is a powerful way of specifying a pattern for a complex search. This section discusses the functions and operators available for regular expression matching and illustrates, with examples, some of the special characters and constructs that can be used for regular expression operations. See also Section 3.3.4.7, “Pattern Matching”.
MySQL implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe. (Prior to MySQL 8.0.4, MySQL used Henry Spencer's implementation of regular expressions, which operates in byte-wise fashion and is not multibyte safe. For information about ways in which applications that use regular expressions may be affected by the implementation change, see Regular Expression Compatibility Considerations.)
,
expr
NOT REGEXP pat
expr
NOT RLIKE pat
This is the same as NOT
(
.
expr
REGEXP
pat
)
,
expr
REGEXP pat
expr
RLIKE pat
Returns 1 if the string expr
matches the regular expression specified by the pattern
pat
, 0 otherwise. If
expr
or
pat
is NULL
,
the return value is NULL
.
REGEXP
and
RLIKE
are
synonyms for REGEXP_LIKE()
.
For additional information about how matching occurs, see
the description for
REGEXP_LIKE()
.
mysql>SELECT 'Michael!' REGEXP '.*';
+------------------------+ | 'Michael!' REGEXP '.*' | +------------------------+ | 1 | +------------------------+ mysql>SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
+---------------------------------------+ | 'new*\n*line' REGEXP 'new\\*.\\*line' | +---------------------------------------+ | 0 | +---------------------------------------+ mysql>SELECT 'a' REGEXP '^[a-d]';
+---------------------+ | 'a' REGEXP '^[a-d]' | +---------------------+ | 1 | +---------------------+ mysql>SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
+----------------+-----------------------+ | 'a' REGEXP 'A' | 'a' REGEXP BINARY 'A' | +----------------+-----------------------+ | 1 | 0 | +----------------+-----------------------+
REGEXP_INSTR(
expr
,
pat
[,
pos
[,
occurrence
[,
return_option
[,
match_type
]]]])
Returns the starting index of the substring of the string
expr
that matches the regular
expression specified by the pattern
pat
, 0 if there is no match. If
expr
or
pat
is NULL
,
the return value is NULL
. Character
indexes begin at 1.
REGEXP_INSTR()
takes these
optional arguments:
pos
: The position in
expr
at which to start the
search. If omitted, the default is 1.
occurrence
: Which
occurrence of a match to search for. If omitted, the
default is 1.
return_option
: Which type
of position to return. If this value is 0,
REGEXP_INSTR()
returns
the position of the matched substring's first
character. If this value is 1,
REGEXP_INSTR()
returns
the position following the matched substring. If
omitted, the default is 0.
match_type
: A string that
specifies how to perform matching. The meaning is as
described for
REGEXP_LIKE()
.
For additional information about how matching occurs, see
the description for
REGEXP_LIKE()
.
mysql>SELECT REGEXP_INSTR('dog cat dog', 'dog');
+------------------------------------+ | REGEXP_INSTR('dog cat dog', 'dog') | +------------------------------------+ | 1 | +------------------------------------+ mysql>SELECT REGEXP_INSTR('dog cat dog', 'dog', 2);
+---------------------------------------+ | REGEXP_INSTR('dog cat dog', 'dog', 2) | +---------------------------------------+ | 9 | +---------------------------------------+ mysql>SELECT REGEXP_INSTR('aa aaa aaaa', 'a{2}');
+-------------------------------------+ | REGEXP_INSTR('aa aaa aaaa', 'a{2}') | +-------------------------------------+ | 1 | +-------------------------------------+ mysql>SELECT REGEXP_INSTR('aa aaa aaaa', 'a{4}');
+-------------------------------------+ | REGEXP_INSTR('aa aaa aaaa', 'a{4}') | +-------------------------------------+ | 8 | +-------------------------------------+
REGEXP_LIKE(
expr
,
pat
[,
match_type
])
Returns 1 if the string expr
matches the regular expression specified by the pattern
pat
, 0 otherwise. If
expr
or
pat
is NULL
,
the return value is NULL
.
The pattern can be an extended regular expression, the syntax for which is discussed in Regular Expression Syntax. The pattern need not be a literal string. For example, it can be specified as a string expression or table column.
The optional match_type
argument is a string that may contain any or all the
following characters specifying how to perform matching:
c
: Case sensitive matching.
i
: Case insensitive matching.
m
: Multiple-line mode. Recognize
line terminators within the string. The default
behavior is to match line terminators only at the
start and end of the string expression.
n
: The .
character matches line terminators. The default is for
.
matching to stop at the end of a
line.
u
: Unix-only line endings. Only the
newline character is recognized as a line ending by
the .
, ^
, and
$
match operators.
If characters specifying contradictory options are
specified within match_type
,
the rightmost one takes precedence.
By default, regular expression operations use the
character set and collation of the
expr
and
pat
arguments when deciding the
type of a character and performing the comparison. If the
arguments have different character sets or collations,
coercibility rules apply as described in
Section 10.8.4, “Collation Coercibility in Expressions”.
Arguments may be specified with explicit collation
indicators to change comparison behavior.
mysql>SELECT REGEXP_LIKE('CamelCase', 'CAMELCASE');
+---------------------------------------+ | REGEXP_LIKE('CamelCase', 'CAMELCASE') | +---------------------------------------+ | 1 | +---------------------------------------+ mysql>SELECT REGEXP_LIKE('CamelCase', 'CAMELCASE' COLLATE utf8mb4_0900_as_cs);
+------------------------------------------------------------------+ | REGEXP_LIKE('CamelCase', 'CAMELCASE' COLLATE utf8mb4_0900_as_cs) | +------------------------------------------------------------------+ | 0 | +------------------------------------------------------------------+
match_type
may be specified
with the c
or i
characters to override the default case sensitivity.
Exception: If either argument is a binary string, the
arguments are handled in case-sensitive fashion as binary
strings, even if match_type
contains the i
character.
Because MySQL uses the C escape syntax in strings (for
example, \n
to represent the newline
character), you must double any \
that you use in your expr
and
pat
arguments.
mysql>SELECT REGEXP_LIKE('Michael!', '.*');
+-------------------------------+ | REGEXP_LIKE('Michael!', '.*') | +-------------------------------+ | 1 | +-------------------------------+ mysql>SELECT REGEXP_LIKE('new*\n*line', 'new\\*.\\*line');
+----------------------------------------------+ | REGEXP_LIKE('new*\n*line', 'new\\*.\\*line') | +----------------------------------------------+ | 0 | +----------------------------------------------+ mysql>SELECT REGEXP_LIKE('a', '^[a-d]');
+----------------------------+ | REGEXP_LIKE('a', '^[a-d]') | +----------------------------+ | 1 | +----------------------------+ mysql>SELECT REGEXP_LIKE('a', 'A'), REGEXP_LIKE('a', BINARY 'A');
+-----------------------+------------------------------+ | REGEXP_LIKE('a', 'A') | REGEXP_LIKE('a', BINARY 'A') | +-----------------------+------------------------------+ | 1 | 0 | +-----------------------+------------------------------+
mysql>SELECT REGEXP_LIKE('abc', 'ABC');
+---------------------------+ | REGEXP_LIKE('abc', 'ABC') | +---------------------------+ | 1 | +---------------------------+ mysql>SELECT REGEXP_LIKE('abc', 'ABC', 'c');
+--------------------------------+ | REGEXP_LIKE('abc', 'ABC', 'c') | +--------------------------------+ | 0 | +--------------------------------+
REGEXP_REPLACE(
expr
,
pat
,
repl
[,
pos
[,
occurrence
[,
match_type
]]])
Replaces occurrences in the string
expr
that match the regular
expression specified by the pattern
pat
with the replacement string
repl
, and returns the resulting
string. If expr
,
pat
, or
repl
is
NULL
, the return value is
NULL
.
REGEXP_REPLACE()
takes
these optional arguments:
pos
: The position in
expr
at which to start the
search. If omitted, the default is 1.
occurrence
: Which
occurrence of a match to replace. If omitted, the
default is 0 (which means “replace all
occurrences”).
match_type
: A string that
specifies how to perform matching. The meaning is as
described for
REGEXP_LIKE()
.
For additional information about how matching occurs, see
the description for
REGEXP_LIKE()
.
mysql>SELECT REGEXP_REPLACE('a b c', 'b', 'X');
+-----------------------------------+ | REGEXP_REPLACE('a b c', 'b', 'X') | +-----------------------------------+ | a X c | +-----------------------------------+ mysql>SELECT REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3);
+----------------------------------------------------+ | REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3) | +----------------------------------------------------+ | abc def X | +----------------------------------------------------+
REGEXP_SUBSTR(
expr
,
pat
[,
pos
[,
occurrence
[,
match_type
]]])
Returns the substring of the string
expr
that matches the regular
expression specified by the pattern
pat
, NULL
if
there is no match. If expr
or
pat
is NULL
,
the return value is NULL
.
REGEXP_SUBSTR()
takes these
optional arguments:
pos
: The position in
expr
at which to start the
search. If omitted, the default is 1.
occurrence
: Which
occurrence of a match to search for. If omitted, the
default is 1.
match_type
: A string that
specifies how to perform matching. The meaning is as
described for
REGEXP_LIKE()
.
For additional information about how matching occurs, see
the description for
REGEXP_LIKE()
.
mysql>SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');
+----------------------------------------+ | REGEXP_SUBSTR('abc def ghi', '[a-z]+') | +----------------------------------------+ | abc | +----------------------------------------+ mysql>SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);
+----------------------------------------------+ | REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3) | +----------------------------------------------+ | ghi | +----------------------------------------------+
A regular expression describes a set of strings. The simplest
regular expression is one that has no special characters in
it. For example, the regular expression
hello
matches hello
and
nothing else.
Nontrivial regular expressions use certain special constructs
so that they can match more than one string. For example, the
regular expression hello|world
contains the
|
alternation operator and matches either
the hello
or world
.
As a more complex example, the regular expression
B[an]*s
matches any of the strings
Bananas
, Baaaaas
,
Bs
, and any other string starting with a
B
, ending with an s
, and
containing any number of a
or
n
characters in between.
The following list covers some of the basic special characters and constructs that can be used in regular expressions. For information about the full regular expression syntax supported by the ICU library used to implement regular expression support, visit the International Components for Unicode website.
^
Match the beginning of a string.
mysql>SELECT REGEXP_LIKE('fo\nfo', '^fo$');
-> 0 mysql>SELECT REGEXP_LIKE('fofo', '^fo');
-> 1
$
Match the end of a string.
mysql>SELECT REGEXP_LIKE('fo\no', '^fo\no$');
-> 1 mysql>SELECT REGEXP_LIKE('fo\no', '^fo$');
-> 0
.
Match any character (including carriage return and
newline, although to match these in the middle of a
string, the m
(multiple line)
match-control character or the (?m)
within-pattern modifier must be given).
mysql>SELECT REGEXP_LIKE('fofo', '^f.*$');
-> 1 mysql>SELECT REGEXP_LIKE('fo\r\nfo', '^f.*$');
-> 0 mysql>SELECT REGEXP_LIKE('fo\r\nfo', '^f.*$', 'm');
-> 1 mysql>SELECT REGEXP_LIKE('fo\r\nfo', '(?m)^f.*$');
-> 1
a*
Match any sequence of zero or more a
characters.
mysql>SELECT REGEXP_LIKE('Ban', '^Ba*n');
-> 1 mysql>SELECT REGEXP_LIKE('Baaan', '^Ba*n');
-> 1 mysql>SELECT REGEXP_LIKE('Bn', '^Ba*n');
-> 1
a+
Match any sequence of one or more a
characters.
mysql>SELECT REGEXP_LIKE('Ban', '^Ba+n');
-> 1 mysql>SELECT REGEXP_LIKE('Bn', '^Ba+n');
-> 0
a?
Match either zero or one a
character.
mysql>SELECT REGEXP_LIKE('Bn', '^Ba?n');
-> 1 mysql>SELECT REGEXP_LIKE('Ban', '^Ba?n');
-> 1 mysql>SELECT REGEXP_LIKE('Baan', '^Ba?n');
-> 0
de|abc
Alternation; match either of the sequences
de
or abc
.
mysql>SELECT REGEXP_LIKE('pi', 'pi|apa');
-> 1 mysql>SELECT REGEXP_LIKE('axe', 'pi|apa');
-> 0 mysql>SELECT REGEXP_LIKE('apa', 'pi|apa');
-> 1 mysql>SELECT REGEXP_LIKE('apa', '^(pi|apa)$');
-> 1 mysql>SELECT REGEXP_LIKE('pi', '^(pi|apa)$');
-> 1 mysql>SELECT REGEXP_LIKE('pix', '^(pi|apa)$');
-> 0
(abc)*
Match zero or more instances of the sequence
abc
.
mysql>SELECT REGEXP_LIKE('pi', '^(pi)*$');
-> 1 mysql>SELECT REGEXP_LIKE('pip', '^(pi)*$');
-> 0 mysql>SELECT REGEXP_LIKE('pipi', '^(pi)*$');
-> 1
{1}
, {2,3}
Repetition;
{
and
n
}{
notation provide a more general way of writing regular
expressions that match many occurrences of the previous
atom (or “piece”) of the pattern.
m
,n
}m
and
n
are integers.
a*
Can be written as a{0,}
.
a+
Can be written as a{1,}
.
a?
Can be written as a{0,1}
.
To be more precise,
a{
matches
exactly n
}n
instances of
a
.
a{
matches n
,}n
or more instances of
a
.
a{
matches m
,n
}m
through
n
instances of
a
, inclusive. If both
m
and
n
are given,
m
must be less than or equal to
n
.
mysql>SELECT REGEXP_LIKE('abcde', 'a[bcd]{2}e');
-> 0 mysql>SELECT REGEXP_LIKE('abcde', 'a[bcd]{3}e');
-> 1 mysql>SELECT REGEXP_LIKE('abcde', 'a[bcd]{1,10}e');
-> 1
[a-dX]
, [^a-dX]
Matches any character that is (or is not, if
^
is used) either a
,
b
, c
,
d
or X
. A
-
character between two other
characters forms a range that matches all characters from
the first character to the second. For example,
[0-9]
matches any decimal digit. To
include a literal ]
character, it must
immediately follow the opening bracket
[
. To include a literal
-
character, it must be written first
or last. Any character that does not have a defined
special meaning inside a []
pair
matches only itself.
mysql>SELECT REGEXP_LIKE('aXbc', '[a-dXYZ]');
-> 1 mysql>SELECT REGEXP_LIKE('aXbc', '^[a-dXYZ]$');
-> 0 mysql>SELECT REGEXP_LIKE('aXbc', '^[a-dXYZ]+$');
-> 1 mysql>SELECT REGEXP_LIKE('aXbc', '^[^a-dXYZ]+$');
-> 0 mysql>SELECT REGEXP_LIKE('gheis', '^[^a-dXYZ]+$');
-> 1 mysql>SELECT REGEXP_LIKE('gheisa', '^[^a-dXYZ]+$');
-> 0
[=character_class=]
Within a bracket expression (written using
[
and ]
),
[=character_class=]
represents an
equivalence class. It matches all characters with the same
collation value, including itself. For example, if
o
and (+)
are the
members of an equivalence class,
[[=o=]]
, [[=(+)=]]
,
and [o(+)]
are all synonymous. An
equivalence class may not be used as an endpoint of a
range.
[:character_class:]
Within a bracket expression (written using
[
and ]
),
[:character_class:]
represents a
character class that matches all characters belonging to
that class. The following table lists the standard class
names. These names stand for the character classes defined
in the ctype(3)
manual page. A
particular locale may provide other class names. A
character class may not be used as an endpoint of a range.
Character Class Name | Meaning |
---|---|
alnum |
Alphanumeric characters |
alpha |
Alphabetic characters |
blank |
Whitespace characters |
cntrl |
Control characters |
digit |
Digit characters |
graph |
Graphic characters |
lower |
Lowercase alphabetic characters |
print |
Graphic or space characters |
punct |
Punctuation characters |
space |
Space, tab, newline, and carriage return |
upper |
Uppercase alphabetic characters |
xdigit |
Hexadecimal digit characters |
mysql>SELECT REGEXP_LIKE('justalnums', '[[:alnum:]]+');
-> 1 mysql>SELECT REGEXP_LIKE('!!', '[[:alnum:]]+');
-> 0
To use a literal instance of a special character in a regular
expression, precede it by two backslash (\) characters. The
MySQL parser interprets one of the backslashes, and the
regular expression library interprets the other. For example,
to match the string 1+2
that contains the
special +
character, only the last of the
following regular expressions is the correct one:
mysql>SELECT REGEXP_LIKE('1+2', '1+2');
-> 0 mysql>SELECT REGEXP_LIKE('1+2', '1\+2');
-> 0 mysql>SELECT REGEXP_LIKE('1+2', '1\\+2');
-> 1
REGEXP_LIKE()
and similar
functions use resources that can be controlled by setting
system variables:
The match engine uses memory for its internal stack. To
control the maximum available memory for the stack in
bytes, set the
regexp_stack_limit
system
variable.
The match engine operates in steps. To control the maximum
number of steps performed by the engine (and thus
indirectly the execution time), set the
regexp_time_limit
system
variable. Because this limit is expressed as number of
steps, it affects execution time only indirectly.
Typically, it is on the order of milliseconds.
Prior to MySQL 8.0.4, MySQL used the Henry Spencer regular expression library to support regular expression operations, rather than International Components for Unicode (ICU). The following discussion describes differences between the Spencer and ICU libraries that may affect applications:
With the Spencer library, the
REGEXP
and
RLIKE
operators work in byte-wise fashion, so they are not
multibyte safe and may produce unexpected results with
multibyte character sets. In addition, these operators
compare characters by their byte values and accented
characters may not compare as equal even if a given
collation treats them as equal.
ICU has full Unicode support and is multibyte safe. Its
regular expression functions treat all strings as as
UTF-16
. You should keep in mind that
positional indexes are based on 16-bit chunks and not on
code points. This means that, when passed to such
functions, characters using more than one chunk may
produce unanticipated results, such as those shown here:
mysql>SELECT REGEXP_INSTR('🍣🍣b', 'b');
+--------------------------+ | REGEXP_INSTR('??b', 'b') | +--------------------------+ | 5 | +--------------------------+ 1 row in set (0.00 sec) mysql>SELECT REGEXP_INSTR('🍣🍣bxxx', 'b', 4);
+--------------------------------+ | REGEXP_INSTR('??bxxx', 'b', 4) | +--------------------------------+ | 5 | +--------------------------------+ 1 row in set (0.00 sec)
Characters within the Unicode Basic Multilingual Plane, which includes characters used by most modern languages, are safe in this regard:
mysql>SELECT REGEXP_INSTR('бжb', 'b');
+----------------------------+ | REGEXP_INSTR('бжb', 'b') | +----------------------------+ | 3 | +----------------------------+ 1 row in set (0.00 sec) mysql>SELECT REGEXP_INSTR('עבb', 'b');
+----------------------------+ | REGEXP_INSTR('עבb', 'b') | +----------------------------+ | 3 | +----------------------------+ 1 row in set (0.00 sec) mysql>SELECT REGEXP_INSTR('µå周çб', '周');
+------------------------------------+ | REGEXP_INSTR('µå周çб', '周') | +------------------------------------+ | 3 | +------------------------------------+ 1 row in set (0.00 sec)
Emoji, such as the “sushi” character
🍣
(U+1F363) used in the first two
examples, are not included in the Basic Multilingual
Plane, but rather in Unicode's Supplementary
Multilingual Plane. Another issue can arise with emoji and
other 4-byte characters when
REGEXP_SUBSTR()
or a
similar function begins searching in the middle of a
character. Each of the two statements in the following
example starts from the second 2-byte position in the
first argument. The first statement works on a string
consisting solely of 2-byte (BMP) characters. The second
statement contains 4-byte characters which are incorrectly
interpreted in the result because the first two bytes are
stripped off and so the remainder of the character data is
misaligned.
mysql>SELECT REGEXP_SUBSTR('周周周周', '.*', 2);
+----------------------------------------+ | REGEXP_SUBSTR('周周周周', '.*', 2) | +----------------------------------------+ | 周周周 | +----------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT REGEXP_SUBSTR('🍣🍣🍣🍣', '.*', 2);
+--------------------------------+ | REGEXP_SUBSTR('????', '.*', 2) | +--------------------------------+ | ?㳟揘㳟揘㳟揘 | +--------------------------------+ 1 row in set (0.00 sec)
For the .
operator, the Spencer library
matches line-terminator characters (carriage return,
newline) anywhere in string expressions, including in the
middle. To match line terminator characters in the middle
of strings with ICU, specify the m
match-control character.
The Spencer library supports word-beginning and word-end
boundary markers ([[:<:]]
and
[[:>:]]
notation). ICU does not. For
ICU, you can use \b
to match word
boundaries; double the backslash because MySQL interprets
it as the escape character within strings.
The Spencer library supports collating element bracket
expressions ([.characters.]
notation).
ICU does not.
For repetition counts ({n}
and
{m,n}
notation), the Spencer library
has a maximum of 255. ICU has no such limit, although the
maximum number of match engine steps can be limited by
setting the
regexp_time_limit
system
variable.
ICU interprets parentheses as metacharacters. To specify a
literal open parenthesis (
in a regular
expression, it must be escaped:
mysql>SELECT REGEXP_LIKE('(', '(');
ERROR 3692 (HY000): Mismatched parenthesis in regular expression. mysql>SELECT REGEXP_LIKE('(', '\\(');
+-------------------------+ | REGEXP_LIKE('(', '\\(') | +-------------------------+ | 1 | +-------------------------+
MySQL has many operators and functions that return a string. This section answers the question: What is the character set and collation of such a string?
For simple functions that take string input and return a string
result as output, the output's character set and collation are
the same as those of the principal input value. For example,
UPPER(
returns a string with the same character string and collation as
X
)X
. The same applies for
INSTR()
,
LCASE()
,
LOWER()
,
LTRIM()
,
MID()
,
REPEAT()
,
REPLACE()
,
REVERSE()
,
RIGHT()
,
RPAD()
,
RTRIM()
,
SOUNDEX()
,
SUBSTRING()
,
TRIM()
,
UCASE()
, and
UPPER()
.
The REPLACE()
function, unlike
all other functions, always ignores the collation of the
string input and performs a case-sensitive comparison.
If a string input or function result is a binary string, the
string has the binary
character set and
collation. This can be checked by using the
CHARSET()
and
COLLATION()
functions, both of
which return binary
for a binary string
argument:
mysql> SELECT CHARSET(BINARY 'a'), COLLATION(BINARY 'a');
+---------------------+-----------------------+
| CHARSET(BINARY 'a') | COLLATION(BINARY 'a') |
+---------------------+-----------------------+
| binary | binary |
+---------------------+-----------------------+
For operations that combine multiple string inputs and return a single string output, the “aggregation rules” of standard SQL apply for determining the collation of the result:
If an explicit COLLATE
occurs, use
Y
Y
.
If explicit COLLATE
and Y
COLLATE
occur, raise an
error.
Z
Otherwise, if all collations are
Y
, use
Y
.
Otherwise, the result has no collation.
For example, with CASE ... WHEN a THEN b WHEN b THEN c
COLLATE
, the
resulting collation is X
ENDX
. The same
applies for UNION
,
||
,
CONCAT()
,
ELT()
,
GREATEST()
,
IF()
, and
LEAST()
.
For operations that convert to character data, the character set
and collation of the strings that result from the operations are
defined by the
character_set_connection
and
collation_connection
system
variables that determine the default connection character set
and collation (see Section 10.4, “Connection Character Sets and Collations”). This
applies only to BIN_TO_UUID()
,
CAST()
,
CONV()
,
FORMAT()
,
HEX()
, and
SPACE()
.
An exception to the preceding priniciple occurs for expressions
for virtual generated columns. In such expressions, the table
character set is used for
BIN_TO_UUID()
,
CONV()
, or
HEX()
results, regardless of
connection character set.
If there is any question about the character set or collation of
the result returned by a string function, use the
CHARSET()
or
COLLATION()
function to find out:
mysql>SELECT USER(), CHARSET(USER()), COLLATION(USER());
+----------------+-----------------+-------------------+ | USER() | CHARSET(USER()) | COLLATION(USER()) | +----------------+-----------------+-------------------+ | test@localhost | utf8 | utf8_general_ci | +----------------+-----------------+-------------------+ mysql>SELECT CHARSET(COMPRESS('abc')), COLLATION(COMPRESS('abc'));
+--------------------------+----------------------------+ | CHARSET(COMPRESS('abc')) | COLLATION(COMPRESS('abc')) | +--------------------------+----------------------------+ | binary | binary | +--------------------------+----------------------------+
Table 12.10 Numeric Functions and Operators
Name | Description |
---|---|
ABS() |
Return the absolute value |
ACOS() |
Return the arc cosine |
ASIN() |
Return the arc sine |
ATAN() |
Return the arc tangent |
ATAN2() , ATAN() |
Return the arc tangent of the two arguments |
CEIL() |
Return the smallest integer value not less than the argument |
CEILING() |
Return the smallest integer value not less than the argument |
CONV() |
Convert numbers between different number bases |
COS() |
Return the cosine |
COT() |
Return the cotangent |
CRC32() |
Compute a cyclic redundancy check value |
DEGREES() |
Convert radians to degrees |
DIV |
Integer division |
/ |
Division operator |
EXP() |
Raise to the power of |
FLOOR() |
Return the largest integer value not greater than the argument |
LN() |
Return the natural logarithm of the argument |
LOG() |
Return the natural logarithm of the first argument |
LOG10() |
Return the base-10 logarithm of the argument |
LOG2() |
Return the base-2 logarithm of the argument |
- |
Minus operator |
MOD() |
Return the remainder |
% , MOD |
Modulo operator |
PI() |
Return the value of pi |
+ |
Addition operator |
POW() |
Return the argument raised to the specified power |
POWER() |
Return the argument raised to the specified power |
RADIANS() |
Return argument converted to radians |
RAND() |
Return a random floating-point value |
ROUND() |
Round the argument |
SIGN() |
Return the sign of the argument |
SIN() |
Return the sine of the argument |
SQRT() |
Return the square root of the argument |
TAN() |
Return the tangent of the argument |
* |
Multiplication operator |
TRUNCATE() |
Truncate to specified number of decimal places |
- |
Change the sign of the argument |
The usual arithmetic operators are available. The result is determined according to the following rules:
In the case of
-
,
+
, and
*
, the result
is calculated with BIGINT
(64-bit) precision if both operands are integers.
If both operands are integers and any of them are unsigned,
the result is an unsigned integer. For subtraction, if the
NO_UNSIGNED_SUBTRACTION
SQL mode is enabled, the result is signed even if any
operand is unsigned.
If any of the operands of a
+
,
-
,
/
,
*
,
%
is a real or
string value, the precision of the result is the precision
of the operand with the maximum precision.
In division performed with
/
, the scale
of the result when using two exact-value operands is the
scale of the first operand plus the value of the
div_precision_increment
system variable (which is 4 by default). For example, the
result of the expression 5.05 / 0.014
has
a scale of six decimal places
(360.714286
).
These rules are applied for each operation, such that nested
calculations imply the precision of each component. Hence,
(14620 / 9432456) / (24250 / 9432456)
,
resolves first to (0.0014) / (0.0026)
, with
the final result having 8 decimal places
(0.60288653
).
Because of these rules and the way they are applied, care should be taken to ensure that components and subcomponents of a calculation use the appropriate level of precision. See Section 12.10, “Cast Functions and Operators”.
For information about handling of overflow in numeric expression evaluation, see Section 11.2.6, “Out-of-Range and Overflow Handling”.
Arithmetic operators apply to numbers. For other types of
values, alternative operations may be available. For example, to
add date values, use DATE_ADD()
;
see Section 12.7, “Date and Time Functions”.
Addition:
mysql> SELECT 3+5;
-> 8
Subtraction:
mysql> SELECT 3-5;
-> -2
Unary minus. This operator changes the sign of the operand.
mysql> SELECT - 2;
-> -2
Multiplication:
mysql>SELECT 3*5;
-> 15 mysql>SELECT 18014398509481984*18014398509481984.0;
-> 324518553658426726783156020576256.0 mysql>SELECT 18014398509481984*18014398509481984;
-> out-of-range error
The last expression produces an error because the result of
the integer multiplication exceeds the 64-bit range of
BIGINT
calculations. (See
Section 11.2, “Numeric Types”.)
Division:
mysql> SELECT 3/5;
-> 0.60
Division by zero produces a NULL
result:
mysql> SELECT 102/(1-1);
-> NULL
A division is calculated with
BIGINT
arithmetic only if
performed in a context where its result is converted to an
integer.
Integer division. Discards from the division result any fractional part to the right of the decimal point.
If either operand has a noninteger type, the operands are
converted to DECIMAL
and
divided using DECIMAL
arithmetic before converting the result to
BIGINT
. If the result exceeds
BIGINT
range, an error occurs.
mysql> SELECT 5 DIV 2, -5 DIV 2, 5 DIV -2, -5 DIV -2;
-> 2, -2, -2, 2
Modulo operation. Returns the remainder of
N
divided by
M
. For more information, see the
description for the MOD()
function in Section 12.6.2, “Mathematical Functions”.
Table 12.12 Mathematical Functions
Name | Description |
---|---|
ABS() |
Return the absolute value |
ACOS() |
Return the arc cosine |
ASIN() |
Return the arc sine |
ATAN() |
Return the arc tangent |
ATAN2() , ATAN() |
Return the arc tangent of the two arguments |
CEIL() |
Return the smallest integer value not less than the argument |
CEILING() |
Return the smallest integer value not less than the argument |
CONV() |
Convert numbers between different number bases |
COS() |
Return the cosine |
COT() |
Return the cotangent |
CRC32() |
Compute a cyclic redundancy check value |
DEGREES() |
Convert radians to degrees |
EXP() |
Raise to the power of |
FLOOR() |
Return the largest integer value not greater than the argument |
LN() |
Return the natural logarithm of the argument |
LOG() |
Return the natural logarithm of the first argument |
LOG10() |
Return the base-10 logarithm of the argument |
LOG2() |
Return the base-2 logarithm of the argument |
MOD() |
Return the remainder |
PI() |
Return the value of pi |
POW() |
Return the argument raised to the specified power |
POWER() |
Return the argument raised to the specified power |
RADIANS() |
Return argument converted to radians |
RAND() |
Return a random floating-point value |
ROUND() |
Round the argument |
SIGN() |
Return the sign of the argument |
SIN() |
Return the sine of the argument |
SQRT() |
Return the square root of the argument |
TAN() |
Return the tangent of the argument |
TRUNCATE() |
Truncate to specified number of decimal places |
All mathematical functions return NULL
in the
event of an error.
Returns the absolute value of X
.
mysql>SELECT ABS(2);
-> 2 mysql>SELECT ABS(-32);
-> 32
This function is safe to use with
BIGINT
values.
Returns the arc cosine of X
, that
is, the value whose cosine is X
.
Returns NULL
if
X
is not in the range
-1
to 1
.
mysql>SELECT ACOS(1);
-> 0 mysql>SELECT ACOS(1.0001);
-> NULL mysql>SELECT ACOS(0);
-> 1.5707963267949
Returns the arc sine of X
, that
is, the value whose sine is X
.
Returns NULL
if
X
is not in the range
-1
to 1
.
mysql>SELECT ASIN(0.2);
-> 0.20135792079033 mysql>SELECT ASIN('foo');
+-------------+ | ASIN('foo') | +-------------+ | 0 | +-------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS;
+---------+------+-----------------------------------------+ | Level | Code | Message | +---------+------+-----------------------------------------+ | Warning | 1292 | Truncated incorrect DOUBLE value: 'foo' | +---------+------+-----------------------------------------+
Returns the arc tangent of X
,
that is, the value whose tangent is
X
.
mysql>SELECT ATAN(2);
-> 1.1071487177941 mysql>SELECT ATAN(-2);
-> -1.1071487177941
Returns the arc tangent of the two variables
X
and
Y
. It is similar to calculating
the arc tangent of
, except that the
signs of both arguments are used to determine the quadrant
of the result.
Y
/
X
mysql>SELECT ATAN(-2,2);
-> -0.78539816339745 mysql>SELECT ATAN2(PI(),0);
-> 1.5707963267949
Returns the smallest integer value not less than
X
.
mysql>SELECT CEILING(1.23);
-> 2 mysql>SELECT CEILING(-1.23);
-> -1
For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.
Converts numbers between different number bases. Returns a
string representation of the number
N
, converted from base
from_base
to base
to_base
. Returns
NULL
if any argument is
NULL
. The argument
N
is interpreted as an integer,
but may be specified as an integer or a string. The minimum
base is 2
and the maximum base is
36
. If
from_base
is a negative number,
N
is regarded as a signed number.
Otherwise, N
is treated as
unsigned. CONV()
works with
64-bit precision.
mysql>SELECT CONV('a',16,2);
-> '1010' mysql>SELECT CONV('6E',18,8);
-> '172' mysql>SELECT CONV(-17,10,-18);
-> '-H' mysql>SELECT CONV(10+'10'+'10'+X'0a',10,10);
-> '40'
Returns the cosine of X
, where
X
is given in radians.
mysql> SELECT COS(PI());
-> -1
Returns the cotangent of X
.
mysql>SELECT COT(12);
-> -1.5726734063977 mysql>SELECT COT(0);
-> out-of-range error
Computes a cyclic redundancy check value and returns a
32-bit unsigned value. The result is NULL
if the argument is NULL
. The argument is
expected to be a string and (if possible) is treated as one
if it is not.
mysql>SELECT CRC32('MySQL');
-> 3259397556 mysql>SELECT CRC32('mysql');
-> 2501908538
Returns the argument X
, converted
from radians to degrees.
mysql>SELECT DEGREES(PI());
-> 180 mysql>SELECT DEGREES(PI() / 2);
-> 90
Returns the value of e (the base of
natural logarithms) raised to the power of
X
. The inverse of this function
is LOG()
(using a single
argument only) or LN()
.
mysql>SELECT EXP(2);
-> 7.3890560989307 mysql>SELECT EXP(-2);
-> 0.13533528323661 mysql>SELECT EXP(0);
-> 1
Returns the largest integer value not greater than
X
.
mysql> SELECT FLOOR(1.23), FLOOR(-1.23);
-> 1, -2
For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.
Formats the number X
to a format
like '#,###,###.##'
, rounded to
D
decimal places, and returns the
result as a string. For details, see
Section 12.5, “String Functions”.
This function can be used to obtain a hexadecimal representation of a decimal number or a string; the manner in which it does so varies according to the argument's type. See this function's description in Section 12.5, “String Functions”, for details.
Returns the natural logarithm of
X
; that is, the
base-e logarithm of
X
. If
X
is less than or equal to 0.0E0,
the function returns NULL
and a warning
“Invalid argument for logarithm” is reported.
mysql>SELECT LN(2);
-> 0.69314718055995 mysql>SELECT LN(-2);
-> NULL
This function is synonymous with
LOG(
.
The inverse of this function is the
X
)EXP()
function.
If called with one parameter, this function returns the
natural logarithm of X
. If
X
is less than or equal to 0.0E0,
the function returns NULL
and a warning
“Invalid argument for logarithm” is reported.
The inverse of this function (when called with a single
argument) is the EXP()
function.
mysql>SELECT LOG(2);
-> 0.69314718055995 mysql>SELECT LOG(-2);
-> NULL
If called with two parameters, this function returns the
logarithm of X
to the base
B
. If
X
is less than or equal to 0, or
if B
is less than or equal to 1,
then NULL
is returned.
mysql>SELECT LOG(2,65536);
-> 16 mysql>SELECT LOG(10,100);
-> 2 mysql>SELECT LOG(1,100);
-> NULL
LOG(
is equivalent to
B
,X
)LOG(
.
X
) /
LOG(B
)
Returns the base-2 logarithm of
. If
X
X
is less than or equal to 0.0E0,
the function returns NULL
and a warning
“Invalid argument for logarithm” is reported.
mysql>SELECT LOG2(65536);
-> 16 mysql>SELECT LOG2(-100);
-> NULL
LOG2()
is useful for finding
out how many bits a number requires for storage. This
function is equivalent to the expression
LOG(
.
X
) /
LOG(2)
Returns the base-10 logarithm of
X
. If
X
is less than or equal to 0.0E0,
the function returns NULL
and a warning
“Invalid argument for logarithm” is reported.
mysql>SELECT LOG10(2);
-> 0.30102999566398 mysql>SELECT LOG10(100);
-> 2 mysql>SELECT LOG10(-100);
-> NULL
Modulo operation. Returns the remainder of
N
divided by
M
.
mysql>SELECT MOD(234, 10);
-> 4 mysql>SELECT 253 % 7;
-> 1 mysql>SELECT MOD(29,9);
-> 2 mysql>SELECT 29 MOD 9;
-> 2
This function is safe to use with
BIGINT
values.
MOD()
also works on values
that have a fractional part and returns the exact remainder
after division:
mysql> SELECT MOD(34.5,3);
-> 1.5
MOD(
returns N
,0)NULL
.
Returns the value of π (pi). The default number of decimal places displayed is seven, but MySQL uses the full double-precision value internally.
mysql>SELECT PI();
-> 3.141593 mysql>SELECT PI()+0.000000000000000000;
-> 3.141592653589793116
Returns the value of X
raised to
the power of Y
.
mysql>SELECT POW(2,2);
-> 4 mysql>SELECT POW(2,-2);
-> 0.25
This is a synonym for POW()
.
Returns the argument X
, converted
from degrees to radians. (Note that π radians equals 180
degrees.)
mysql> SELECT RADIANS(90);
-> 1.5707963267949
Returns a random floating-point value
v
in the range
0
<= v
<
1.0
. To obtain a random integer
R
in the range
i
<=
R
<
j
, use the expression
FLOOR(
− i
+ RAND() * (j
.
For example, to obtain a random integer in the range the
range i
))7
<=
R
< 12
, use
the following statement:
SELECT FLOOR(7 + (RAND() * 5));
If an integer argument N
is
specified, it is used as the seed value:
With a constant initializer argument, the seed is initialized once when the statement is prepared, prior to execution.
With a nonconstant initializer argument (such as a
column name), the seed is initialized with the value for
each invocation of
RAND()
.
One implication of this behavior is that for equal argument
values,
RAND(
returns the same value each time, and thus produces a
repeatable sequence of column values. In the following
example, the sequence of values produced by
N
)RAND(3)
is the same both places it
occurs.
mysql>CREATE TABLE t (i INT);
Query OK, 0 rows affected (0.42 sec) mysql>INSERT INTO t VALUES(1),(2),(3);
Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT i, RAND() FROM t;
+------+------------------+ | i | RAND() | +------+------------------+ | 1 | 0.61914388706828 | | 2 | 0.93845168309142 | | 3 | 0.83482678498591 | +------+------------------+ 3 rows in set (0.00 sec) mysql>SELECT i, RAND(3) FROM t;
+------+------------------+ | i | RAND(3) | +------+------------------+ | 1 | 0.90576975597606 | | 2 | 0.37307905813035 | | 3 | 0.14808605345719 | +------+------------------+ 3 rows in set (0.00 sec) mysql>SELECT i, RAND() FROM t;
+------+------------------+ | i | RAND() | +------+------------------+ | 1 | 0.35877890638893 | | 2 | 0.28941420772058 | | 3 | 0.37073435016976 | +------+------------------+ 3 rows in set (0.00 sec) mysql>SELECT i, RAND(3) FROM t;
+------+------------------+ | i | RAND(3) | +------+------------------+ | 1 | 0.90576975597606 | | 2 | 0.37307905813035 | | 3 | 0.14808605345719 | +------+------------------+ 3 rows in set (0.01 sec)
RAND()
in a
WHERE
clause is evaluated for every row
(when selecting from one table) or combination of rows (when
selecting from a multiple-table join). Thus, for optimizer
purposes, RAND()
is not a
constant value and cannot be used for index optimizations.
For more information, see
Section 8.2.1.18, “Function Call Optimization”.
Use of a column with RAND()
values in an ORDER BY
or GROUP
BY
clause may yield unexpected results because for
either clause a RAND()
expression can be evaluated multiple times for the same row,
each time returning a different result. If the goal is to
retrieve rows in random order, you can use a statement like
this:
SELECT * FROM tbl_name
ORDER BY RAND();
To select a random sample from a set of rows, combine
ORDER BY RAND()
with
LIMIT
:
SELECT * FROM table1, table2 WHERE a=b AND c<d ORDER BY RAND() LIMIT 1000;
RAND()
is not meant to be a
perfect random generator. It is a fast way to generate
random numbers on demand that is portable between platforms
for the same MySQL version.
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
. (Bug #49222)
Rounds the argument X
to
D
decimal places. The rounding
algorithm depends on the data type of
X
. D
defaults to 0 if not specified. D
can be negative to cause D
digits
left of the decimal point of the value
X
to become zero.
mysql>SELECT ROUND(-1.23);
-> -1 mysql>SELECT ROUND(-1.58);
-> -2 mysql>SELECT ROUND(1.58);
-> 2 mysql>SELECT ROUND(1.298, 1);
-> 1.3 mysql>SELECT ROUND(1.298, 0);
-> 1 mysql>SELECT ROUND(23.298, -1);
-> 20
The return value has the same type as the first argument (assuming that it is integer, double, or decimal). This means that for an integer argument, the result is an integer (no decimal places):
mysql> SELECT ROUND(150.000,2), ROUND(150,2);
+------------------+--------------+
| ROUND(150.000,2) | ROUND(150,2) |
+------------------+--------------+
| 150.00 | 150 |
+------------------+--------------+
ROUND()
uses the following
rules depending on the type of the first argument:
For exact-value numbers,
ROUND()
uses the
“round half away from zero” or “round
toward nearest” rule: A value with a fractional
part of .5 or greater is rounded up to the next integer
if positive or down to the next integer if negative. (In
other words, it is rounded away from zero.) A value with
a fractional part less than .5 is rounded down to the
next integer if positive or up to the next integer if
negative.
For approximate-value numbers, the result depends on the
C library. On many systems, this means that
ROUND()
uses the
“round to nearest even” rule: A value with
a fractional part exactly halfway between two integers
is rounded to the nearest even integer.
The following example shows how rounding differs for exact and approximate values:
mysql> SELECT ROUND(2.5), ROUND(25E-1);
+------------+--------------+
| ROUND(2.5) | ROUND(25E-1) |
+------------+--------------+
| 3 | 2 |
+------------+--------------+
For more information, see Section 12.25, “Precision Math”.
Returns the sign of the argument as -1
,
0
, or 1
, depending on
whether X
is negative, zero, or
positive.
mysql>SELECT SIGN(-32);
-> -1 mysql>SELECT SIGN(0);
-> 0 mysql>SELECT SIGN(234);
-> 1
Returns the sine of X
, where
X
is given in radians.
mysql>SELECT SIN(PI());
-> 1.2246063538224e-16 mysql>SELECT ROUND(SIN(PI()));
-> 0
Returns the square root of a nonnegative number
X
.
mysql>SELECT SQRT(4);
-> 2 mysql>SELECT SQRT(20);
-> 4.4721359549996 mysql>SELECT SQRT(-16);
-> NULL
Returns the tangent of X
, where
X
is given in radians.
mysql>SELECT TAN(PI());
-> -1.2246063538224e-16 mysql>SELECT TAN(PI()+1);
-> 1.5574077246549
Returns the number X
, truncated
to D
decimal places. If
D
is 0
, the
result has no decimal point or fractional part.
D
can be negative to cause
D
digits left of the decimal
point of the value X
to become
zero.
mysql>SELECT TRUNCATE(1.223,1);
-> 1.2 mysql>SELECT TRUNCATE(1.999,1);
-> 1.9 mysql>SELECT TRUNCATE(1.999,0);
-> 1 mysql>SELECT TRUNCATE(-1.999,1);
-> -1.9 mysql>SELECT TRUNCATE(122,-2);
-> 100 mysql>SELECT TRUNCATE(10.28*100,0);
-> 1028
All numbers are rounded toward zero.
This section describes the functions that can be used to manipulate temporal values. See Section 11.3, “Date and Time Types”, for a description of the range of values each date and time type has and the valid formats in which values may be specified.
Table 12.13 Date and Time Functions
Name | Description |
---|---|
ADDDATE() |
Add time values (intervals) to a date value |
ADDTIME() |
Add time |
CONVERT_TZ() |
Convert from one time zone to another |
CURDATE() |
Return the current date |
CURRENT_DATE() , CURRENT_DATE |
Synonyms for CURDATE() |
CURRENT_TIME() , CURRENT_TIME |
Synonyms for CURTIME() |
CURRENT_TIMESTAMP() , CURRENT_TIMESTAMP |
Synonyms for NOW() |
CURTIME() |
Return the current time |
DATE() |
Extract the date part of a date or datetime expression |
DATE_ADD() |
Add time values (intervals) to a date value |
DATE_FORMAT() |
Format date as specified |
DATE_SUB() |
Subtract a time value (interval) from a date |
DATEDIFF() |
Subtract two dates |
DAY() |
Synonym for DAYOFMONTH() |
DAYNAME() |
Return the name of the weekday |
DAYOFMONTH() |
Return the day of the month (0-31) |
DAYOFWEEK() |
Return the weekday index of the argument |
DAYOFYEAR() |
Return the day of the year (1-366) |
EXTRACT() |
Extract part of a date |
FROM_DAYS() |
Convert a day number to a date |
FROM_UNIXTIME() |
Format Unix timestamp as a date |
GET_FORMAT() |
Return a date format string |
HOUR() |
Extract the hour |
LAST_DAY |
Return the last day of the month for the argument |
LOCALTIME() , LOCALTIME |
Synonym for NOW() |
LOCALTIMESTAMP , LOCALTIMESTAMP() |
Synonym for NOW() |
MAKEDATE() |
Create a date from the year and day of year |
MAKETIME() |
Create time from hour, minute, second |
MICROSECOND() |
Return the microseconds from argument |
MINUTE() |
Return the minute from the argument |
MONTH() |
Return the month from the date passed |
MONTHNAME() |
Return the name of the month |
NOW() |
Return the current date and time |
PERIOD_ADD() |
Add a period to a year-month |
PERIOD_DIFF() |
Return the number of months between periods |
QUARTER() |
Return the quarter from a date argument |
SEC_TO_TIME() |
Converts seconds to 'HH:MM:SS' format |
SECOND() |
Return the second (0-59) |
STR_TO_DATE() |
Convert a string to a date |
SUBDATE() |
Synonym for DATE_SUB() when invoked with three arguments |
SUBTIME() |
Subtract times |
SYSDATE() |
Return the time at which the function executes |
TIME() |
Extract the time portion of the expression passed |
TIME_FORMAT() |
Format as time |
TIME_TO_SEC() |
Return the argument converted to seconds |
TIMEDIFF() |
Subtract time |
TIMESTAMP() |
With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments |
TIMESTAMPADD() |
Add an interval to a datetime expression |
TIMESTAMPDIFF() |
Subtract an interval from a datetime expression |
TO_DAYS() |
Return the date argument converted to days |
TO_SECONDS() |
Return the date or datetime argument converted to seconds since Year 0 |
UNIX_TIMESTAMP() |
Return a Unix timestamp |
UTC_DATE() |
Return the current UTC date |
UTC_TIME() |
Return the current UTC time |
UTC_TIMESTAMP() |
Return the current UTC date and time |
WEEK() |
Return the week number |
WEEKDAY() |
Return the weekday index |
WEEKOFYEAR() |
Return the calendar week of the date (1-53) |
YEAR() |
Return the year |
YEARWEEK() |
Return the year and week |
Here is an example that uses date functions. The following query
selects all rows with a date_col
value
from within the last 30 days:
mysql>SELECT
->something
FROMtbl_name
WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <=
date_col
;
The query also selects rows with dates that lie in the future.
Functions that expect date values usually accept datetime values and ignore the time part. Functions that expect time values usually accept datetime values and ignore the date part.
Functions that return the current date or time each are evaluated
only once per query at the start of query execution. This means
that multiple references to a function such as
NOW()
within a single query always
produce the same result. (For our purposes, a single query also
includes a call to a stored program (stored routine, trigger, or
event) and all subprograms called by that program.) This principle
also applies to CURDATE()
,
CURTIME()
,
UTC_DATE()
,
UTC_TIME()
,
UTC_TIMESTAMP()
, and to any of
their synonyms.
The CURRENT_TIMESTAMP()
,
CURRENT_TIME()
,
CURRENT_DATE()
, and
FROM_UNIXTIME()
functions return
values in the current session time zone, which is available as the
session value of the time_zone
system variable. In addition,
UNIX_TIMESTAMP()
assumes that its
argument is a datetime value in the session time zone. See
Section 5.1.13, “MySQL Server Time Zone Support”.
Some date functions can be used with “zero” dates or
incomplete dates such as '2001-11-00'
, whereas
others cannot. Functions that extract parts of dates typically
work with incomplete dates and thus can return 0 when you might
otherwise expect a nonzero value. For example:
mysql> SELECT DAYOFMONTH('2001-11-00'), MONTH('2005-00-00');
-> 0, 0
Other functions expect complete dates and return
NULL
for incomplete dates. These include
functions that perform date arithmetic or that map parts of dates
to names. For example:
mysql>SELECT DATE_ADD('2006-05-00',INTERVAL 1 DAY);
-> NULL mysql>SELECT DAYNAME('2006-05-00');
-> NULL
Several functions are more strict when passed a
DATE()
function value as their
argument and reject incomplete dates with a day part of zero.
These functions are affected:
CONVERT_TZ()
,
DATE_ADD()
,
DATE_SUB()
,
DAYOFYEAR()
,
LAST_DAY()
(permits a day part of
zero), TIMESTAMPDIFF()
,
TO_DAYS()
,
TO_SECONDS()
,
WEEK()
,
WEEKDAY()
,
WEEKOFYEAR()
,
YEARWEEK()
.
Fractional seconds for TIME
,
DATETIME
, and TIMESTAMP
values are supported, with up to microsecond precision. Functions
that take temporal arguments accept values with fractional
seconds. Return values from temporal functions include fractional
seconds as appropriate.
ADDDATE(
,
date
,INTERVAL
expr
unit
)ADDDATE(
expr
,days
)
When invoked with the INTERVAL
form of the
second argument, ADDDATE()
is a
synonym for DATE_ADD()
. The
related function SUBDATE()
is a
synonym for DATE_SUB()
. For
information on the INTERVAL
unit
argument, see
Temporal Intervals.
mysql>SELECT DATE_ADD('2008-01-02', INTERVAL 31 DAY);
-> '2008-02-02' mysql>SELECT ADDDATE('2008-01-02', INTERVAL 31 DAY);
-> '2008-02-02'
When invoked with the days
form of
the second argument, MySQL treats it as an integer number of
days to be added to expr
.
mysql> SELECT ADDDATE('2008-01-02', 31);
-> '2008-02-02'
ADDTIME()
adds
expr2
to
expr1
and returns the result.
expr1
is a time or datetime
expression, and expr2
is a time
expression.
mysql>SELECT ADDTIME('2007-12-31 23:59:59.999999', '1 1:1:1.000002');
-> '2008-01-02 01:01:01.000001' mysql>SELECT ADDTIME('01:00:00.999999', '02:00:00.999998');
-> '03:00:01.999997'
CONVERT_TZ()
converts a
datetime value dt
from the time
zone given by from_tz
to the time
zone given by to_tz
and returns the
resulting value. Time zones are specified as described in
Section 5.1.13, “MySQL Server Time Zone Support”. This function returns
NULL
if the arguments are invalid.
If the value falls out of the supported range of the
TIMESTAMP
type when converted
from from_tz
to UTC, no conversion
occurs. The TIMESTAMP
range is
described in Section 11.1.2, “Date and Time Type Overview”.
mysql>SELECT CONVERT_TZ('2004-01-01 12:00:00','GMT','MET');
-> '2004-01-01 13:00:00' mysql>SELECT CONVERT_TZ('2004-01-01 12:00:00','+00:00','+10:00');
-> '2004-01-01 22:00:00'
To use named time zones such as 'MET'
or
'Europe/Amsterdam'
, the time zone tables
must be properly set up. For instructions, see
Section 5.1.13, “MySQL Server Time Zone Support”.
Returns the current date as a value in
'YYYY-MM-DD'
or YYYYMMDD
format, depending on whether the function is used in a string
or numeric context.
mysql>SELECT CURDATE();
-> '2008-06-13' mysql>SELECT CURDATE() + 0;
-> 20080613
CURRENT_DATE
and
CURRENT_DATE()
are synonyms for
CURDATE()
.
CURRENT_TIME
,
CURRENT_TIME([
fsp
])
CURRENT_TIME
and
CURRENT_TIME()
are synonyms for
CURTIME()
.
CURRENT_TIMESTAMP
,
CURRENT_TIMESTAMP([
fsp
])
CURRENT_TIMESTAMP
and
CURRENT_TIMESTAMP()
are
synonyms for NOW()
.
Returns the current time as a value in
'HH:MM:SS'
or HHMMSS
format, depending on whether the function is used in a string
or numeric context. The value is expressed in the session time
zone.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
mysql>SELECT CURTIME();
-> '23:50:26' mysql>SELECT CURTIME() + 0;
-> 235026.000000
Extracts the date part of the date or datetime expression
expr
.
mysql> SELECT DATE('2003-12-31 01:02:03');
-> '2003-12-31'
DATEDIFF()
returns
expr1
−
expr2
expressed as a value in days
from one date to the other. expr1
and expr2
are date or date-and-time
expressions. Only the date parts of the values are used in the
calculation.
mysql>SELECT DATEDIFF('2007-12-31 23:59:59','2007-12-30');
-> 1 mysql>SELECT DATEDIFF('2010-11-30 23:59:59','2010-12-31');
-> -31
DATE_ADD(
,
date
,INTERVAL
expr
unit
)DATE_SUB(
date
,INTERVAL
expr
unit
)
These functions perform date arithmetic. The
date
argument specifies the
starting date or datetime value.
expr
is an expression specifying
the interval value to be added or subtracted from the starting
date. expr
is evaluated as a
string; it may start with a -
for negative
intervals. unit
is a keyword
indicating the units in which the expression should be
interpreted.
For more information about temporal interval syntax, including
a full list of unit
specifiers, the
expected form of the expr
argument
for each unit
value, and rules for
operand interpretation in temporal arithmetic, see
Temporal Intervals.
The return value depends on the arguments:
DATE
if the
date
argument is a
DATE
value and your
calculations involve only YEAR
,
MONTH
, and DAY
parts
(that is, no time parts).
DATETIME
if the first
argument is a DATETIME
(or
TIMESTAMP
) value, or if the
first argument is a DATE
and the unit
value uses
HOURS
, MINUTES
, or
SECONDS
.
String otherwise.
To ensure that the result is
DATETIME
, you can use
CAST()
to convert the first
argument to DATETIME
.
mysql>SELECT DATE_ADD('2018-05-01',INTERVAL 1 DAY);
-> '2018-05-02' mysql>SELECT DATE_SUB('2018-05-01',INTERVAL 1 YEAR);
-> '2017-05-01' mysql>SELECT DATE_ADD('2020-12-31 23:59:59',
->INTERVAL 1 SECOND);
-> '2021-01-01 00:00:00' mysql>SELECT DATE_ADD('2018-12-31 23:59:59',
->INTERVAL 1 DAY);
-> '2019-01-01 23:59:59' mysql>SELECT DATE_ADD('2100-12-31 23:59:59',
->INTERVAL '1:1' MINUTE_SECOND);
-> '2101-01-01 00:01:00' mysql>SELECT DATE_SUB('2025-01-01 00:00:00',
->INTERVAL '1 1:1:1' DAY_SECOND);
-> '2024-12-30 22:58:59' mysql>SELECT DATE_ADD('1900-01-01 00:00:00',
->INTERVAL '-1 10' DAY_HOUR);
-> '1899-12-30 14:00:00' mysql>SELECT DATE_SUB('1998-01-02', INTERVAL 31 DAY);
-> '1997-12-02' mysql>SELECT DATE_ADD('1992-12-31 23:59:59.000002',
->INTERVAL '1.999999' SECOND_MICROSECOND);
-> '1993-01-01 00:00:01.000001'
Formats the date
value according to
the format
string.
The specifiers shown in the following table may be used in the
format
string. The
%
character is required before format
specifier characters. The specifiers apply to other functions
as well: STR_TO_DATE()
,
TIME_FORMAT()
,
UNIX_TIMESTAMP()
.
Specifier | Description |
---|---|
%a |
Abbreviated weekday name
(Sun ..Sat ) |
%b |
Abbreviated month name (Jan ..Dec ) |
%c |
Month, numeric (0 ..12 ) |
%D |
Day of the month with English suffix (0th ,
1st , 2nd ,
3rd , …) |
%d |
Day of the month, numeric (00 ..31 ) |
%e |
Day of the month, numeric (0 ..31 ) |
%f |
Microseconds (000000 ..999999 ) |
%H |
Hour (00 ..23 ) |
%h |
Hour (01 ..12 ) |
%I |
Hour (01 ..12 ) |
%i |
Minutes, numeric (00 ..59 ) |
%j |
Day of year (001 ..366 ) |
%k |
Hour (0 ..23 ) |
%l |
Hour (1 ..12 ) |
%M |
Month name (January ..December ) |
%m |
Month, numeric (00 ..12 ) |
%p |
AM or PM |
%r |
Time, 12-hour (hh:mm:ss followed by
AM or PM ) |
%S |
Seconds (00 ..59 ) |
%s |
Seconds (00 ..59 ) |
%T |
Time, 24-hour (hh:mm:ss ) |
%U |
Week (00 ..53 ), where Sunday is the
first day of the week;
WEEK() mode 0 |
%u |
Week (00 ..53 ), where Monday is the
first day of the week;
WEEK() mode 1 |
%V |
Week (01 ..53 ), where Sunday is the
first day of the week;
WEEK() mode 2; used with
%X |
%v |
Week (01 ..53 ), where Monday is the
first day of the week;
WEEK() mode 3; used with
%x |
%W |
Weekday name (Sunday ..Saturday ) |
%w |
Day of the week
(0 =Sunday..6 =Saturday) |
%X |
Year for the week where Sunday is the first day of the week, numeric,
four digits; used with %V |
%x |
Year for the week, where Monday is the first day of the week, numeric,
four digits; used with %v |
%Y |
Year, numeric, four digits |
%y |
Year, numeric (two digits) |
%% |
A literal % character |
% |
x , for any
“x ” not listed
above |
Ranges for the month and day specifiers begin with zero due to
the fact that MySQL permits the storing of incomplete dates
such as '2014-00-00'
.
The language used for day and month names and abbreviations is
controlled by the value of the
lc_time_names
system variable
(Section 10.15, “MySQL Server Locale Support”).
For the %U
, %u
,
%V
, and %v
specifiers,
see the description of the
WEEK()
function for information
about the mode values. The mode affects how week numbering
occurs.
DATE_FORMAT()
returns a string
with a character set and collation given by
character_set_connection
and
collation_connection
so that
it can return month and weekday names containing non-ASCII
characters.
mysql>SELECT DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y');
-> 'Sunday October 2009' mysql>SELECT DATE_FORMAT('2007-10-04 22:23:00', '%H:%i:%s');
-> '22:23:00' mysql>SELECT DATE_FORMAT('1900-10-04 22:23:00',
->'%D %y %a %d %m %b %j');
-> '4th 00 Thu 04 10 Oct 277' mysql>SELECT DATE_FORMAT('1997-10-04 22:23:00',
->'%H %k %I %r %T %S %w');
-> '22 22 10 10:23:00 PM 22:23:00 00 6' mysql>SELECT DATE_FORMAT('1999-01-01', '%X %V');
-> '1998 52' mysql>SELECT DATE_FORMAT('2006-06-00', '%d');
-> '00'
DATE_SUB(
date
,INTERVAL
expr
unit
)
See the description for
DATE_ADD()
.
DAY()
is a synonym for
DAYOFMONTH()
.
Returns the name of the weekday for
date
. The language used for the
name is controlled by the value of the
lc_time_names
system variable
(Section 10.15, “MySQL Server Locale Support”).
mysql> SELECT DAYNAME('2007-02-03');
-> 'Saturday'
Returns the day of the month for
date
, in the range
1
to 31
, or
0
for dates such as
'0000-00-00'
or
'2008-00-00'
that have a zero day part.
mysql> SELECT DAYOFMONTH('2007-02-03');
-> 3
Returns the weekday index for date
(1
= Sunday, 2
= Monday,
…, 7
= Saturday). These index values
correspond to the ODBC standard.
mysql> SELECT DAYOFWEEK('2007-02-03');
-> 7
Returns the day of the year for
date
, in the range
1
to 366
.
mysql> SELECT DAYOFYEAR('2007-02-03');
-> 34
The EXTRACT()
function uses the
same kinds of unit
specifiers as
DATE_ADD()
or
DATE_SUB()
, but extracts parts
from the date rather than performing date arithmetic. For
information on the unit
argument,
see Temporal Intervals.
mysql>SELECT EXTRACT(YEAR FROM '2019-07-02');
-> 2019 mysql>SELECT EXTRACT(YEAR_MONTH FROM '2019-07-02 01:02:03');
-> 201907 mysql>SELECT EXTRACT(DAY_MINUTE FROM '2019-07-02 01:02:03');
-> 20102 mysql>SELECT EXTRACT(MICROSECOND
->FROM '2003-01-02 10:30:00.000123');
-> 123
Given a day number N
, returns a
DATE
value.
mysql> SELECT FROM_DAYS(730669);
-> '2000-07-03'
Use FROM_DAYS()
with caution on
old dates. It is not intended for use with values that precede
the advent of the Gregorian calendar (1582). See
Section 12.8, “What Calendar Is Used By MySQL?”.
FROM_UNIXTIME(
unix_timestamp
[,format
])
Returns a representation of the
unix_timestamp
argument as a value
in 'YYYY-MM-DD HH:MM:SS'
or
YYYYMMDDHHMMSS.uuuuuu
format, depending on
whether the function is used in a string or numeric context.
unix_timestamp
is an internal
timestamp value representing seconds since
'1970-01-01 00:00:00'
UTC, such as produced
by the UNIX_TIMESTAMP()
function.
The return value is expressed in the session time zone.
(Clients can set the session time zone as described in
Section 5.1.13, “MySQL Server Time Zone Support”.) The
format
string, if given, is used to
format the result the same way as described in the entry for
the DATE_FORMAT()
function.
mysql>SELECT FROM_UNIXTIME(1447430881);
-> '2015-11-13 10:08:01' mysql>SELECT FROM_UNIXTIME(1447430881) + 0;
-> 20151113100801 mysql>SELECT FROM_UNIXTIME(1447430881,
->'%Y %D %M %h:%i:%s %x');
-> '2015 13th November 10:08:01 2015'
If you use UNIX_TIMESTAMP()
and FROM_UNIXTIME()
to
convert between values in a non-UTC time zone and Unix
timestamp values, the conversion is lossy because the
mapping is not one-to-one in both directions. For details,
see the description of the
UNIX_TIMESTAMP()
function.
GET_FORMAT({DATE|TIME|DATETIME},
{'EUR'|'USA'|'JIS'|'ISO'|'INTERNAL'})
Returns a format string. This function is useful in
combination with the
DATE_FORMAT()
and the
STR_TO_DATE()
functions.
The possible values for the first and second arguments result
in several possible format strings (for the specifiers used,
see the table in the
DATE_FORMAT()
function
description). ISO format refers to ISO 9075, not ISO 8601.
Function Call | Result |
---|---|
GET_FORMAT(DATE,'USA') |
'%m.%d.%Y' |
GET_FORMAT(DATE,'JIS') |
'%Y-%m-%d' |
GET_FORMAT(DATE,'ISO') |
'%Y-%m-%d' |
GET_FORMAT(DATE,'EUR') |
'%d.%m.%Y' |
GET_FORMAT(DATE,'INTERNAL') |
'%Y%m%d' |
GET_FORMAT(DATETIME,'USA') |
'%Y-%m-%d %H.%i.%s' |
GET_FORMAT(DATETIME,'JIS') |
'%Y-%m-%d %H:%i:%s' |
GET_FORMAT(DATETIME,'ISO') |
'%Y-%m-%d %H:%i:%s' |
GET_FORMAT(DATETIME,'EUR') |
'%Y-%m-%d %H.%i.%s' |
GET_FORMAT(DATETIME,'INTERNAL') |
'%Y%m%d%H%i%s' |
GET_FORMAT(TIME,'USA') |
'%h:%i:%s %p' |
GET_FORMAT(TIME,'JIS') |
'%H:%i:%s' |
GET_FORMAT(TIME,'ISO') |
'%H:%i:%s' |
GET_FORMAT(TIME,'EUR') |
'%H.%i.%s' |
GET_FORMAT(TIME,'INTERNAL') |
'%H%i%s' |
TIMESTAMP
can also be used as
the first argument to
GET_FORMAT()
, in which case the
function returns the same values as for
DATETIME
.
mysql>SELECT DATE_FORMAT('2003-10-03',GET_FORMAT(DATE,'EUR'));
-> '03.10.2003' mysql>SELECT STR_TO_DATE('10.31.2003',GET_FORMAT(DATE,'USA'));
-> '2003-10-31'
Returns the hour for time
. The
range of the return value is 0
to
23
for time-of-day values. However, the
range of TIME
values actually
is much larger, so HOUR
can return values
greater than 23
.
mysql>SELECT HOUR('10:05:03');
-> 10 mysql>SELECT HOUR('272:59:59');
-> 272
Takes a date or datetime value and returns the corresponding
value for the last day of the month. Returns
NULL
if the argument is invalid.
mysql>SELECT LAST_DAY('2003-02-05');
-> '2003-02-28' mysql>SELECT LAST_DAY('2004-02-05');
-> '2004-02-29' mysql>SELECT LAST_DAY('2004-01-01 01:01:01');
-> '2004-01-31' mysql>SELECT LAST_DAY('2003-03-32');
-> NULL
LOCALTIME
and
LOCALTIME()
are synonyms for
NOW()
.
LOCALTIMESTAMP
,
LOCALTIMESTAMP([
fsp
])
LOCALTIMESTAMP
and
LOCALTIMESTAMP()
are synonyms
for NOW()
.
Returns a date, given year and day-of-year values.
dayofyear
must be greater than 0 or
the result is NULL
.
mysql>SELECT MAKEDATE(2011,31), MAKEDATE(2011,32);
-> '2011-01-31', '2011-02-01' mysql>SELECT MAKEDATE(2011,365), MAKEDATE(2014,365);
-> '2011-12-31', '2014-12-31' mysql>SELECT MAKEDATE(2011,0);
-> NULL
Returns a time value calculated from the
hour
,
minute
, and
second
arguments.
The second
argument can have a
fractional part.
mysql> SELECT MAKETIME(12,15,30);
-> '12:15:30'
Returns the microseconds from the time or datetime expression
expr
as a number in the range from
0
to 999999
.
mysql>SELECT MICROSECOND('12:00:00.123456');
-> 123456 mysql>SELECT MICROSECOND('2019-12-31 23:59:59.000010');
-> 10
Returns the minute for time
, in the
range 0
to 59
.
mysql> SELECT MINUTE('2008-02-03 10:05:03');
-> 5
Returns the month for date
, in the
range 1
to 12
for
January to December, or 0
for dates such as
'0000-00-00'
or
'2008-00-00'
that have a zero month part.
mysql> SELECT MONTH('2008-02-03');
-> 2
Returns the full name of the month for
date
. The language used for the
name is controlled by the value of the
lc_time_names
system variable
(Section 10.15, “MySQL Server Locale Support”).
mysql> SELECT MONTHNAME('2008-02-03');
-> 'February'
Returns the current date and time as a value in
'YYYY-MM-DD HH:MM:SS'
or
YYYYMMDDHHMMSS
format, depending on whether
the function is used in a string or numeric context. The value
is expressed in the session time zone.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
mysql>SELECT NOW();
-> '2007-12-15 23:50:26' mysql>SELECT NOW() + 0;
-> 20071215235026.000000
NOW()
returns a constant time
that indicates the time at which the statement began to
execute. (Within a stored function or trigger,
NOW()
returns the time at which
the function or triggering statement began to execute.) This
differs from the behavior for
SYSDATE()
, which returns the
exact time at which it executes.
mysql>SELECT NOW(), SLEEP(2), NOW();
+---------------------+----------+---------------------+ | NOW() | SLEEP(2) | NOW() | +---------------------+----------+---------------------+ | 2006-04-12 13:47:36 | 0 | 2006-04-12 13:47:36 | +---------------------+----------+---------------------+ mysql>SELECT SYSDATE(), SLEEP(2), SYSDATE();
+---------------------+----------+---------------------+ | SYSDATE() | SLEEP(2) | SYSDATE() | +---------------------+----------+---------------------+ | 2006-04-12 13:47:44 | 0 | 2006-04-12 13:47:46 | +---------------------+----------+---------------------+
In addition, the SET TIMESTAMP
statement
affects the value returned by
NOW()
but not by
SYSDATE()
. This means that
timestamp settings in the binary log have no effect on
invocations of SYSDATE()
.
Setting the timestamp to a nonzero value causes each
subsequent invocation of NOW()
to return that value. Setting the timestamp to zero cancels
this effect so that NOW()
once
again returns the current date and time.
See the description for
SYSDATE()
for additional
information about the differences between the two functions.
Adds N
months to period
P
(in the format
YYMM
or YYYYMM
). Returns
a value in the format YYYYMM
. Note that the
period argument P
is
not a date value.
mysql> SELECT PERIOD_ADD(200801,2);
-> 200803
Returns the number of months between periods
P1
and
P2
. P1
and P2
should be in the format
YYMM
or YYYYMM
. Note
that the period arguments P1
and
P2
are not
date values.
mysql> SELECT PERIOD_DIFF(200802,200703);
-> 11
Returns the quarter of the year for
date
, in the range
1
to 4
.
mysql> SELECT QUARTER('2008-04-01');
-> 2
Returns the second for time
, in the
range 0
to 59
.
mysql> SELECT SECOND('10:05:03');
-> 3
Returns the seconds
argument,
converted to hours, minutes, and seconds, as a
TIME
value. The range of the
result is constrained to that of the
TIME
data type. A warning
occurs if the argument corresponds to a value outside that
range.
mysql>SELECT SEC_TO_TIME(2378);
-> '00:39:38' mysql>SELECT SEC_TO_TIME(2378) + 0;
-> 3938
This is the inverse of the
DATE_FORMAT()
function. It
takes a string str
and a format
string format
.
STR_TO_DATE()
returns a
DATETIME
value if the format
string contains both date and time parts, or a
DATE
or
TIME
value if the string
contains only date or time parts. If the date, time, or
datetime value extracted from str
is illegal, STR_TO_DATE()
returns NULL
and produces a warning.
The server scans str
attempting to
match format
to it. The format
string can contain literal characters and format specifiers
beginning with %
. Literal characters in
format
must match literally in
str
. Format specifiers in
format
must match a date or time
part in str
. For the specifiers
that can be used in format
, see the
DATE_FORMAT()
function
description.
mysql>SELECT STR_TO_DATE('01,5,2013','%d,%m,%Y');
-> '2013-05-01' mysql>SELECT STR_TO_DATE('May 1, 2013','%M %d,%Y');
-> '2013-05-01'
Scanning starts at the beginning of
str
and fails if
format
is found not to match. Extra
characters at the end of str
are
ignored.
mysql>SELECT STR_TO_DATE('a09:30:17','a%h:%i:%s');
-> '09:30:17' mysql>SELECT STR_TO_DATE('a09:30:17','%h:%i:%s');
-> NULL mysql>SELECT STR_TO_DATE('09:30:17a','%h:%i:%s');
-> '09:30:17'
Unspecified date or time parts have a value of 0, so
incompletely specified values in
str
produce a result with some or
all parts set to 0:
mysql>SELECT STR_TO_DATE('abc','abc');
-> '0000-00-00' mysql>SELECT STR_TO_DATE('9','%m');
-> '0000-09-00' mysql>SELECT STR_TO_DATE('9','%s');
-> '00:00:09'
Range checking on the parts of date values is as described in Section 11.3.1, “The DATE, DATETIME, and TIMESTAMP Types”. This means, for example, that “zero” dates or dates with part values of 0 are permitted unless the SQL mode is set to disallow such values.
mysql>SELECT STR_TO_DATE('00/00/0000', '%m/%d/%Y');
-> '0000-00-00' mysql>SELECT STR_TO_DATE('04/31/2004', '%m/%d/%Y');
-> '2004-04-31'
If the NO_ZERO_DATE
or
NO_ZERO_IN_DATE
SQL mode is
enabled, zero dates or part of dates are disallowed. In that
case, STR_TO_DATE()
returns
NULL
and generates a warning:
mysql>SET sql_mode = '';
mysql>SELECT STR_TO_DATE('15:35:00', '%H:%i:%s');
+-------------------------------------+ | STR_TO_DATE('15:35:00', '%H:%i:%s') | +-------------------------------------+ | 15:35:00 | +-------------------------------------+ mysql>SET sql_mode = 'NO_ZERO_IN_DATE';
mysql>SELECT STR_TO_DATE('15:35:00', '%h:%i:%s');
+-------------------------------------+ | STR_TO_DATE('15:35:00', '%h:%i:%s') | +-------------------------------------+ | NULL | +-------------------------------------+ mysql>SHOW WARNINGS\G
*************************** 1. row *************************** Level: Warning Code: 1411 Message: Incorrect datetime value: '15:35:00' for function str_to_date
You cannot use format "%X%V"
to convert a
year-week string to a date because the combination of a year
and week does not uniquely identify a year and month if the
week crosses a month boundary. To convert a year-week to a
date, you should also specify the weekday:
mysql> SELECT STR_TO_DATE('200442 Monday', '%X%V %W');
-> '2004-10-18'
SUBDATE(
,
date
,INTERVAL
expr
unit
)SUBDATE(
expr
,days
)
When invoked with the INTERVAL
form of the
second argument, SUBDATE()
is a
synonym for DATE_SUB()
. For
information on the INTERVAL
unit
argument, see the discussion
for DATE_ADD()
.
mysql>SELECT DATE_SUB('2008-01-02', INTERVAL 31 DAY);
-> '2007-12-02' mysql>SELECT SUBDATE('2008-01-02', INTERVAL 31 DAY);
-> '2007-12-02'
The second form enables the use of an integer value for
days
. In such cases, it is
interpreted as the number of days to be subtracted from the
date or datetime expression expr
.
mysql> SELECT SUBDATE('2008-01-02 12:00:00', 31);
-> '2007-12-02 12:00:00'
SUBTIME()
returns
expr1
−
expr2
expressed as a value in the
same format as expr1
.
expr1
is a time or datetime
expression, and expr2
is a time
expression.
mysql>SELECT SUBTIME('2007-12-31 23:59:59.999999','1 1:1:1.000002');
-> '2007-12-30 22:58:58.999997' mysql>SELECT SUBTIME('01:00:00.999999', '02:00:00.999998');
-> '-00:59:59.999999'
Returns the current date and time as a value in
'YYYY-MM-DD HH:MM:SS'
or
YYYYMMDDHHMMSS
format, depending on whether
the function is used in a string or numeric context.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
SYSDATE()
returns the time at
which it executes. This differs from the behavior for
NOW()
, which returns a constant
time that indicates the time at which the statement began to
execute. (Within a stored function or trigger,
NOW()
returns the time at which
the function or triggering statement began to execute.)
mysql>SELECT NOW(), SLEEP(2), NOW();
+---------------------+----------+---------------------+ | NOW() | SLEEP(2) | NOW() | +---------------------+----------+---------------------+ | 2006-04-12 13:47:36 | 0 | 2006-04-12 13:47:36 | +---------------------+----------+---------------------+ mysql>SELECT SYSDATE(), SLEEP(2), SYSDATE();
+---------------------+----------+---------------------+ | SYSDATE() | SLEEP(2) | SYSDATE() | +---------------------+----------+---------------------+ | 2006-04-12 13:47:44 | 0 | 2006-04-12 13:47:46 | +---------------------+----------+---------------------+
In addition, the SET TIMESTAMP
statement
affects the value returned by
NOW()
but not by
SYSDATE()
. This means that
timestamp settings in the binary log have no effect on
invocations of SYSDATE()
.
Because SYSDATE()
can return
different values even within the same statement, and is not
affected by SET TIMESTAMP
, it is
nondeterministic and therefore unsafe for replication if
statement-based binary logging is used. If that is a problem,
you can use row-based logging.
Alternatively, you can use the
--sysdate-is-now
option to
cause SYSDATE()
to be an alias
for NOW()
. This works if the
option is used on both the master and the slave.
The nondeterministic nature of
SYSDATE()
also means that
indexes cannot be used for evaluating expressions that refer
to it.
Extracts the time part of the time or datetime expression
expr
and returns it as a string.
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
.
mysql>SELECT TIME('2003-12-31 01:02:03');
-> '01:02:03' mysql>SELECT TIME('2003-12-31 01:02:03.000123');
-> '01:02:03.000123'
TIMEDIFF()
returns
expr1
−
expr2
expressed as a time value.
expr1
and
expr2
are time or date-and-time
expressions, but both must be of the same type.
The result returned by TIMEDIFF()
is
limited to the range allowed for
TIME
values. Alternatively, you
can use either of the functions
TIMESTAMPDIFF()
and
UNIX_TIMESTAMP()
, both of which
return integers.
mysql>SELECT TIMEDIFF('2000:01:01 00:00:00',
->'2000:01:01 00:00:00.000001');
-> '-00:00:00.000001' mysql>SELECT TIMEDIFF('2008-12-31 23:59:59.000001',
->'2008-12-30 01:01:01.000002');
-> '46:58:57.999999'
TIMESTAMP(
,
expr
)TIMESTAMP(
expr1
,expr2
)
With a single argument, this function returns the date or
datetime expression expr
as a
datetime value. With two arguments, it adds the time
expression expr2
to the date or
datetime expression expr1
and
returns the result as a datetime value.
mysql>SELECT TIMESTAMP('2003-12-31');
-> '2003-12-31 00:00:00' mysql>SELECT TIMESTAMP('2003-12-31 12:00:00','12:00:00');
-> '2004-01-01 00:00:00'
TIMESTAMPADD(
unit
,interval
,datetime_expr
)
Adds the integer expression
interval
to the date or datetime
expression datetime_expr
. The unit
for interval
is given by the
unit
argument, which should be one
of the following values: MICROSECOND
(microseconds), SECOND
,
MINUTE
, HOUR
,
DAY
, WEEK
,
MONTH
, QUARTER
, or
YEAR
.
The unit
value may be specified
using one of keywords as shown, or with a prefix of
SQL_TSI_
. For example,
DAY
and SQL_TSI_DAY
both
are legal.
mysql>SELECT TIMESTAMPADD(MINUTE,1,'2003-01-02');
-> '2003-01-02 00:01:00' mysql>SELECT TIMESTAMPADD(WEEK,1,'2003-01-02');
-> '2003-01-09'
TIMESTAMPDIFF(
unit
,datetime_expr1
,datetime_expr2
)
Returns datetime_expr2
−
datetime_expr1
, where
datetime_expr1
and
datetime_expr2
are date or datetime
expressions. One expression may be a date and the other a
datetime; a date value is treated as a datetime having the
time part '00:00:00'
where necessary. The
unit for the result (an integer) is given by the
unit
argument. The legal values for
unit
are the same as those listed
in the description of the
TIMESTAMPADD()
function.
mysql>SELECT TIMESTAMPDIFF(MONTH,'2003-02-01','2003-05-01');
-> 3 mysql>SELECT TIMESTAMPDIFF(YEAR,'2002-05-01','2001-01-01');
-> -1 mysql>SELECT TIMESTAMPDIFF(MINUTE,'2003-02-01','2003-05-01 12:05:55');
-> 128885
The order of the date or datetime arguments for this
function is the opposite of that used with the
TIMESTAMP()
function when
invoked with 2 arguments.
This is used like the
DATE_FORMAT()
function, but the
format
string may contain format
specifiers only for hours, minutes, seconds, and microseconds.
Other specifiers produce a NULL
value or
0
.
If the time
value contains an hour
part that is greater than 23
, the
%H
and %k
hour format
specifiers produce a value larger than the usual range of
0..23
. The other hour format specifiers
produce the hour value modulo 12.
mysql> SELECT TIME_FORMAT('100:00:00', '%H %k %h %I %l');
-> '100 100 04 04 4'
Returns the time
argument,
converted to seconds.
mysql>SELECT TIME_TO_SEC('22:23:00');
-> 80580 mysql>SELECT TIME_TO_SEC('00:39:38');
-> 2378
Given a date date
, returns a day
number (the number of days since year 0).
mysql>SELECT TO_DAYS(950501);
-> 728779 mysql>SELECT TO_DAYS('2007-10-07');
-> 733321
TO_DAYS()
is not intended for
use with values that precede the advent of the Gregorian
calendar (1582), because it does not take into account the
days that were lost when the calendar was changed. For dates
before 1582 (and possibly a later year in other locales),
results from this function are not reliable. See
Section 12.8, “What Calendar Is Used By MySQL?”, for details.
Remember that MySQL converts two-digit year values in dates to
four-digit form using the rules in
Section 11.3, “Date and Time Types”. For example,
'2008-10-07'
and
'08-10-07'
are seen as identical dates:
mysql> SELECT TO_DAYS('2008-10-07'), TO_DAYS('08-10-07');
-> 733687, 733687
In MySQL, the zero date is defined as
'0000-00-00'
, even though this date is
itself considered invalid. This means that, for
'0000-00-00'
and
'0000-01-01'
,
TO_DAYS()
returns the values
shown here:
mysql>SELECT TO_DAYS('0000-00-00');
+-----------------------+ | to_days('0000-00-00') | +-----------------------+ | NULL | +-----------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS;
+---------+------+----------------------------------------+ | Level | Code | Message | +---------+------+----------------------------------------+ | Warning | 1292 | Incorrect datetime value: '0000-00-00' | +---------+------+----------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT TO_DAYS('0000-01-01');
+-----------------------+ | to_days('0000-01-01') | +-----------------------+ | 1 | +-----------------------+ 1 row in set (0.00 sec)
This is true whether or not the
ALLOW_INVALID_DATES
SQL
server mode is enabled.
Given a date or datetime expr
,
returns the number of seconds since the year 0. If
expr
is not a valid date or
datetime value, returns NULL
.
mysql>SELECT TO_SECONDS(950501);
-> 62966505600 mysql>SELECT TO_SECONDS('2009-11-29');
-> 63426672000 mysql>SELECT TO_SECONDS('2009-11-29 13:43:32');
-> 63426721412 mysql>SELECT TO_SECONDS( NOW() );
-> 63426721458
Like TO_DAYS()
,
TO_SECONDS()
is not intended for use with
values that precede the advent of the Gregorian calendar
(1582), because it does not take into account the days that
were lost when the calendar was changed. For dates before 1582
(and possibly a later year in other locales), results from
this function are not reliable. See
Section 12.8, “What Calendar Is Used By MySQL?”, for details.
Like TO_DAYS()
,
TO_SECONDS()
, converts two-digit year
values in dates to four-digit form using the rules in
Section 11.3, “Date and Time Types”.
In MySQL, the zero date is defined as
'0000-00-00'
, even though this date is
itself considered invalid. This means that, for
'0000-00-00'
and
'0000-01-01'
,
TO_SECONDS()
returns the values
shown here:
mysql>SELECT TO_SECONDS('0000-00-00');
+--------------------------+ | TO_SECONDS('0000-00-00') | +--------------------------+ | NULL | +--------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS;
+---------+------+----------------------------------------+ | Level | Code | Message | +---------+------+----------------------------------------+ | Warning | 1292 | Incorrect datetime value: '0000-00-00' | +---------+------+----------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT TO_SECONDS('0000-01-01');
+--------------------------+ | TO_SECONDS('0000-01-01') | +--------------------------+ | 86400 | +--------------------------+ 1 row in set (0.00 sec)
This is true whether or not the
ALLOW_INVALID_DATES
SQL
server mode is enabled.
If UNIX_TIMESTAMP()
is called
with no date
argument, it returns a
Unix timestamp representing seconds since '1970-01-01
00:00:00'
UTC.
If UNIX_TIMESTAMP()
is called
with a date
argument, it returns
the value of the argument as seconds since
'1970-01-01 00:00:00'
UTC. The server
interprets date
as a value in the
session time zone and converts it to an internal Unix
timestamp value in UTC. (Clients can set the session time zone
as described in Section 5.1.13, “MySQL Server Time Zone Support”.) The
date
argument may be a
DATE
,
DATETIME
, or
TIMESTAMP
string, or a number
in YYMMDD
, YYMMDDHHMMSS
,
YYYYMMDD
, or
YYYYMMDDHHMMSS
format. If the argument
includes a time part, it may optionally include a fractional
seconds part.
The return value is an integer if no argument is given or the
argument does not include a fractional seconds part, or
DECIMAL
if an argument is given
that includes a fractional seconds part.
When the date
argument is a
TIMESTAMP
column,
UNIX_TIMESTAMP()
returns the
internal timestamp value directly, with no implicit
“string-to-Unix-timestamp” conversion.
The valid range of argument values is the same as for the
TIMESTAMP
data type:
'1970-01-01 00:00:01.000000'
UTC to
'2038-01-19 03:14:07.999999'
UTC. If you
pass an out-of-range date to
UNIX_TIMESTAMP()
, it returns
0
.
mysql>SELECT UNIX_TIMESTAMP();
-> 1447431666 mysql>SELECT UNIX_TIMESTAMP('2015-11-13 10:20:19');
-> 1447431619 mysql>SELECT UNIX_TIMESTAMP('2015-11-13 10:20:19.012');
-> 1447431619.012
If you use UNIX_TIMESTAMP()
and
FROM_UNIXTIME()
to convert
between values in a non-UTC time zone and Unix timestamp
values, the conversion is lossy because the mapping is not
one-to-one in both directions. For example, due to conventions
for local time zone changes such as Daylight Saving Time
(DST), it is possible for
UNIX_TIMESTAMP()
to map two
values that are distinct in a non-UTC time zone to the same
Unix timestamp value.
FROM_UNIXTIME()
will map that
value back to only one of the original values. Here is an
example, using values that are distinct in the
MET
time zone:
mysql>SET time_zone = 'MET';
mysql>SELECT UNIX_TIMESTAMP('2005-03-27 03:00:00');
+---------------------------------------+ | UNIX_TIMESTAMP('2005-03-27 03:00:00') | +---------------------------------------+ | 1111885200 | +---------------------------------------+ mysql>SELECT UNIX_TIMESTAMP('2005-03-27 02:00:00');
+---------------------------------------+ | UNIX_TIMESTAMP('2005-03-27 02:00:00') | +---------------------------------------+ | 1111885200 | +---------------------------------------+ mysql>SELECT FROM_UNIXTIME(1111885200);
+---------------------------+ | FROM_UNIXTIME(1111885200) | +---------------------------+ | 2005-03-27 03:00:00 | +---------------------------+
To use named time zones such as 'MET'
or
'Europe/Amsterdam'
, the time zone tables
must be properly set up. For instructions, see
Section 5.1.13, “MySQL Server Time Zone Support”.
If you want to subtract
UNIX_TIMESTAMP()
columns, you
might want to cast them to signed integers. See
Section 12.10, “Cast Functions and Operators”.
Returns the current UTC date as a value in
'YYYY-MM-DD'
or YYYYMMDD
format, depending on whether the function is used in a string
or numeric context.
mysql> SELECT UTC_DATE(), UTC_DATE() + 0;
-> '2003-08-14', 20030814
Returns the current UTC time as a value in
'HH:MM:SS'
or HHMMSS
format, depending on whether the function is used in a string
or numeric context.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
mysql> SELECT UTC_TIME(), UTC_TIME() + 0;
-> '18:07:53', 180753.000000
UTC_TIMESTAMP
,
UTC_TIMESTAMP([
fsp
])
Returns the current UTC date and time as a value in
'YYYY-MM-DD HH:MM:SS'
or
YYYYMMDDHHMMSS
format, depending on whether
the function is used in a string or numeric context.
If the fsp
argument is given to
specify a fractional seconds precision from 0 to 6, the return
value includes a fractional seconds part of that many digits.
mysql> SELECT UTC_TIMESTAMP(), UTC_TIMESTAMP() + 0;
-> '2003-08-14 18:08:04', 20030814180804.000000
This function returns the week number for
date
. The two-argument form of
WEEK()
enables you to specify
whether the week starts on Sunday or Monday and whether the
return value should be in the range from 0
to 53
or from 1
to
53
. If the mode
argument is omitted, the value of the
default_week_format
system
variable is used. See
Section 5.1.8, “Server System Variables”.
The following table describes how the
mode
argument works.
Mode | First day of week | Range | Week 1 is the first week … |
---|---|---|---|
0 | Sunday | 0-53 | with a Sunday in this year |
1 | Monday | 0-53 | with 4 or more days this year |
2 | Sunday | 1-53 | with a Sunday in this year |
3 | Monday | 1-53 | with 4 or more days this year |
4 | Sunday | 0-53 | with 4 or more days this year |
5 | Monday | 0-53 | with a Monday in this year |
6 | Sunday | 1-53 | with 4 or more days this year |
7 | Monday | 1-53 | with a Monday in this year |
For mode
values with a meaning of
“with 4 or more days this year,” weeks are
numbered according to ISO 8601:1988:
If the week containing January 1 has 4 or more days in the new year, it is week 1.
Otherwise, it is the last week of the previous year, and the next week is week 1.
mysql>SELECT WEEK('2008-02-20');
-> 7 mysql>SELECT WEEK('2008-02-20',0);
-> 7 mysql>SELECT WEEK('2008-02-20',1);
-> 8 mysql>SELECT WEEK('2008-12-31',1);
-> 53
If a date falls in the last week of the previous year, MySQL
returns 0
if you do not use
2
, 3
,
6
, or 7
as the optional
mode
argument:
mysql> SELECT YEAR('2000-01-01'), WEEK('2000-01-01',0);
-> 2000, 0
One might argue that WEEK()
should return 52
because the given date
actually occurs in the 52nd week of 1999.
WEEK()
returns
0
instead so that the return value is
“the week number in the given year.” This makes
use of the WEEK()
function
reliable when combined with other functions that extract a
date part from a date.
If you prefer a result evaluated with respect to the year that
contains the first day of the week for the given date, use
0
, 2
,
5
, or 7
as the optional
mode
argument.
mysql> SELECT WEEK('2000-01-01',2);
-> 52
Alternatively, use the
YEARWEEK()
function:
mysql>SELECT YEARWEEK('2000-01-01');
-> 199952 mysql>SELECT MID(YEARWEEK('2000-01-01'),5,2);
-> '52'
Returns the weekday index for date
(0
= Monday, 1
=
Tuesday, … 6
= Sunday).
mysql>SELECT WEEKDAY('2008-02-03 22:23:00');
-> 6 mysql>SELECT WEEKDAY('2007-11-06');
-> 1
Returns the calendar week of the date as a number in the range
from 1
to 53
.
WEEKOFYEAR()
is a compatibility
function that is equivalent to
WEEK(
.
date
,3)
mysql> SELECT WEEKOFYEAR('2008-02-20');
-> 8
Returns the year for date
, in the
range 1000
to 9999
, or
0
for the “zero” date.
mysql> SELECT YEAR('1987-01-01');
-> 1987
YEARWEEK(
,
date
)YEARWEEK(
date
,mode
)
Returns year and week for a date. The year in the result may be different from the year in the date argument for the first and the last week of the year.
The mode
argument works exactly
like the mode
argument to
WEEK()
. For the single-argument
syntax, a mode
value of 0 is used.
Unlike WEEK()
, the value of
default_week_format
does not
influence YEARWEEK()
.
mysql> SELECT YEARWEEK('1987-01-01');
-> 198652
The week number is different from what the
WEEK()
function would return
(0
) for optional arguments
0
or 1
, as
WEEK()
then returns the week in
the context of the given year.
MySQL uses what is known as a proleptic Gregorian calendar.
Every country that has switched from the Julian to the Gregorian calendar has had to discard at least ten days during the switch. To see how this works, consider the month of October 1582, when the first Julian-to-Gregorian switch occurred.
Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
---|---|---|---|---|---|---|
1 | 2 | 3 | 4 | 15 | 16 | 17 |
18 | 19 | 20 | 21 | 22 | 23 | 24 |
25 | 26 | 27 | 28 | 29 | 30 | 31 |
There are no dates between October 4 and October 15. This discontinuity is called the cutover. Any dates before the cutover are Julian, and any dates following the cutover are Gregorian. Dates during a cutover are nonexistent.
A calendar applied to dates when it was not actually in use is
called proleptic. Thus, if
we assume there was never a cutover and Gregorian rules always
rule, we have a proleptic Gregorian calendar. This is what is used
by MySQL, as is required by standard SQL. For this reason, dates
prior to the cutover stored as MySQL
DATE
or
DATETIME
values must be adjusted to
compensate for the difference. It is important to realize that the
cutover did not occur at the same time in all countries, and that
the later it happened, the more days were lost. For example, in
Great Britain, it took place in 1752, when Wednesday September 2
was followed by Thursday September 14. Russia remained on the
Julian calendar until 1918, losing 13 days in the process, and
what is popularly referred to as its “October
Revolution” occurred in November according to the Gregorian
calendar.
MATCH
(
col1
,col2
,...)
AGAINST (expr
[search_modifier
])
search_modifier:
{
IN NATURAL LANGUAGE MODE
| IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
| IN BOOLEAN MODE
| WITH QUERY EXPANSION
}
MySQL has support for full-text indexing and searching:
A full-text index in MySQL is an index of type
FULLTEXT
.
Full-text indexes can be used only with
InnoDB
or
MyISAM
tables, and can be created
only for CHAR
,
VARCHAR
, or
TEXT
columns.
MySQL
provides a built-in full-text ngram
parser that supports Chinese, Japanese, and Korean (CJK), and
an installable MeCab full-text parser plugin for Japanese.
Parsing differences are outlined in
Section 12.9.8, “ngram Full-Text Parser”, and
Section 12.9.9, “MeCab Full-Text Parser Plugin”.
A FULLTEXT
index definition can be given in
the CREATE TABLE
statement when
a table is created, or added later using
ALTER TABLE
or
CREATE INDEX
.
For large data sets, it is much faster to load your data into
a table that has no FULLTEXT
index and then
create the index after that, than to load data into a table
that has an existing FULLTEXT
index.
Full-text searching is performed using
MATCH() ... AGAINST
syntax.
MATCH()
takes a comma-separated
list that names the columns to be searched.
AGAINST
takes a string to search for, and an
optional modifier that indicates what type of search to perform.
The search string must be a string value that is constant during
query evaluation. This rules out, for example, a table column
because that can differ for each row.
There are three types of full-text searches:
A natural language search interprets the search string as a phrase in natural human language (a phrase in free text). There are no special operators, with the exception of double quote (") characters. The stopword list applies. For more information about stopword lists, see Section 12.9.4, “Full-Text Stopwords”.
Full-text searches are natural language searches if the
IN NATURAL LANGUAGE MODE
modifier is given
or if no modifier is given. For more information, see
Section 12.9.1, “Natural Language Full-Text Searches”.
A boolean search interprets the search string using the rules
of a special query language. The string contains the words to
search for. It can also contain operators that specify
requirements such that a word must be present or absent in
matching rows, or that it should be weighted higher or lower
than usual. Certain common words (stopwords) are omitted from
the search index and do not match if present in the search
string. The IN BOOLEAN MODE
modifier
specifies a boolean search. For more information, see
Section 12.9.2, “Boolean Full-Text Searches”.
A query expansion search is a modification of a natural
language search. The search string is used to perform a
natural language search. Then words from the most relevant
rows returned by the search are added to the search string and
the search is done again. The query returns the rows from the
second search. The IN NATURAL LANGUAGE MODE WITH
QUERY EXPANSION
or WITH QUERY
EXPANSION
modifier specifies a query expansion
search. For more information, see
Section 12.9.3, “Full-Text Searches with Query Expansion”.
For information about FULLTEXT
query
performance, see Section 8.3.5, “Column Indexes”.
For more information about InnoDB
FULLTEXT
indexes, see
Section 15.6.2.4, “InnoDB FULLTEXT Indexes”.
Constraints on full-text searching are listed in Section 12.9.5, “Full-Text Restrictions”.
The myisam_ftdump utility dumps the contents of
a MyISAM
full-text index. This may be helpful
for debugging full-text queries. See
Section 4.6.3, “myisam_ftdump — Display Full-Text Index information”.
By default or with the IN NATURAL LANGUAGE
MODE
modifier, the
MATCH()
function performs a
natural language search for a string against a
text collection. A
collection is a set of one or more columns included in a
FULLTEXT
index. The search string is given as
the argument to AGAINST()
. For each row in
the table, MATCH()
returns a
relevance value; that is, a similarity measure between the
search string and the text in that row in the columns named in
the MATCH()
list.
mysql>CREATE TABLE articles (
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title VARCHAR(200),
body TEXT,
FULLTEXT (title,body)
) ENGINE=InnoDB;
Query OK, 0 rows affected (0.08 sec) mysql>INSERT INTO articles (title,body) VALUES
('MySQL Tutorial','DBMS stands for DataBase ...'),
('How To Use MySQL Well','After you went through a ...'),
('Optimizing MySQL','In this tutorial we will show ...'),
('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
('MySQL vs. YourSQL','In the following database comparison ...'),
('MySQL Security','When configured properly, MySQL ...');
Query OK, 6 rows affected (0.01 sec) Records: 6 Duplicates: 0 Warnings: 0 mysql>SELECT * FROM articles
WHERE MATCH (title,body)
AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+ | id | title | body | +----+-------------------+------------------------------------------+ | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 5 | MySQL vs. YourSQL | In the following database comparison ... | +----+-------------------+------------------------------------------+ 2 rows in set (0.00 sec)
By default, the search is performed in case-insensitive fashion.
To perform a case-sensitive full-text search, use a
case-sensitive or binary collation for the indexed columns. For
example, a column that uses the utf8mb4
character set of can be assigned a collation of
utf8mb4_0900_as_cs
or
utf8mb4_bin
to make it case-sensitive for
full-text searches.
When MATCH()
is used in a
WHERE
clause, as in the example shown
earlier, the rows returned are automatically sorted with the
highest relevance first. Relevance values are nonnegative
floating-point numbers. Zero relevance means no similarity.
Relevance is computed based on the number of words in the row
(document), the number of unique words in the row, the total
number of words in the collection, and the number of rows that
contain a particular word.
The term “document” may be used interchangeably with the term “row”, and both terms refer to the indexed part of the row. The term “collection” refers to the indexed columns and encompasses all rows.
To simply count matches, you could use a query like this:
mysql>SELECT COUNT(*) FROM articles
WHERE MATCH (title,body)
AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----------+ | COUNT(*) | +----------+ | 2 | +----------+ 1 row in set (0.00 sec)
You might find it quicker to rewrite the query as follows:
mysql>SELECT
COUNT(IF(MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE), 1, NULL))
AS count
FROM articles;
+-------+ | count | +-------+ | 2 | +-------+ 1 row in set (0.03 sec)
The first query does some extra work (sorting the results by
relevance) but also can use an index lookup based on the
WHERE
clause. The index lookup might make the
first query faster if the search matches few rows. The second
query performs a full table scan, which might be faster than the
index lookup if the search term was present in most rows.
For natural-language full-text searches, the columns named in
the MATCH()
function must be the
same columns included in some FULLTEXT
index
in your table. For the preceding query, note that the columns
named in the MATCH()
function
(title
and body
) are the
same as those named in the definition of the
article
table's FULLTEXT
index. To search the title
or
body
separately, you would create separate
FULLTEXT
indexes for each column.
You can also perform a boolean search or a search with query expansion. These search types are described in Section 12.9.2, “Boolean Full-Text Searches”, and Section 12.9.3, “Full-Text Searches with Query Expansion”.
A full-text search that uses an index can name columns only from
a single table in the MATCH()
clause because an index cannot span multiple tables. For
MyISAM
tables, a boolean search can be done
in the absence of an index (albeit more slowly), in which case
it is possible to name columns from multiple tables.
The preceding example is a basic illustration that shows how to
use the MATCH()
function where
rows are returned in order of decreasing relevance. The next
example shows how to retrieve the relevance values explicitly.
Returned rows are not ordered because the
SELECT
statement includes neither
WHERE
nor ORDER BY
clauses:
mysql>SELECT id, MATCH (title,body)
AGAINST ('Tutorial' IN NATURAL LANGUAGE MODE) AS score
FROM articles;
+----+---------------------+ | id | score | +----+---------------------+ | 1 | 0.22764469683170319 | | 2 | 0 | | 3 | 0.22764469683170319 | | 4 | 0 | | 5 | 0 | | 6 | 0 | +----+---------------------+ 6 rows in set (0.00 sec)
The following example is more complex. The query returns the
relevance values and it also sorts the rows in order of
decreasing relevance. To achieve this result, specify
MATCH()
twice: once in the
SELECT
list and once in the
WHERE
clause. This causes no additional
overhead, because the MySQL optimizer notices that the two
MATCH()
calls are identical and
invokes the full-text search code only once.
mysql>SELECT id, body, MATCH (title,body) AGAINST
('Security implications of running MySQL as root'
IN NATURAL LANGUAGE MODE) AS score
FROM articles WHERE MATCH (title,body) AGAINST
('Security implications of running MySQL as root'
IN NATURAL LANGUAGE MODE);
+----+-------------------------------------+-----------------+ | id | body | score | +----+-------------------------------------+-----------------+ | 4 | 1. Never run mysqld as root. 2. ... | 1.5219271183014 | | 6 | When configured properly, MySQL ... | 1.3114095926285 | +----+-------------------------------------+-----------------+ 2 rows in set (0.00 sec)
A phrase that is enclosed within double quote
("
) characters matches only rows that contain
the phrase literally, as it was typed. The
full-text engine splits the phrase into words and performs a
search in the FULLTEXT
index for the words.
Nonword characters need not be matched exactly: Phrase searching
requires only that matches contain exactly the same words as the
phrase and in the same order. For example, "test
phrase"
matches "test, phrase"
. If
the phrase contains no words that are in the index, the result
is empty. For example, if all words are either stopwords or
shorter than the minimum length of indexed words, the result is
empty.
The MySQL FULLTEXT
implementation regards any
sequence of true word characters (letters, digits, and
underscores) as a word. That sequence may also contain
apostrophes ('
), but not more than one in a
row. This means that aaa'bbb
is regarded as
one word, but aaa''bbb
is regarded as two
words. Apostrophes at the beginning or the end of a word are
stripped by the FULLTEXT
parser;
'aaa'bbb'
would be parsed as
aaa'bbb
.
The built-in FULLTEXT
parser determines where
words start and end by looking for certain delimiter characters;
for example,
(space),
,
(comma), and .
(period).
If words are not separated by delimiters (as in, for example,
Chinese), the built-in FULLTEXT
parser cannot
determine where a word begins or ends. To be able to add words
or other indexed terms in such languages to a
FULLTEXT
index that uses the built-in
FULLTEXT
parser, you must preprocess them so
that they are separated by some arbitrary delimiter.
Alternatively, you can create FULLTEXT
indexes using the ngram parser plugin (for Chinese, Japanese, or
Korean) or the MeCab parser plugin (for Japanese).
It is possible to write a plugin that replaces the built-in
full-text parser. For details, see Section 29.2, “The MySQL Plugin API”.
For example parser plugin source code, see the
plugin/fulltext
directory of a MySQL source
distribution.
Some words are ignored in full-text searches:
Any word that is too short is ignored. The default minimum
length of words that are found by full-text searches is
three characters for InnoDB
search
indexes, or four characters for MyISAM
.
You can control the cutoff by setting a configuration option
before creating the index:
innodb_ft_min_token_size
configuration option for InnoDB
search
indexes, or ft_min_word_len
for MyISAM
.
This behavior does not apply to
FULLTEXT
indexes that use the ngram
parser. For the ngram parser, token length is defined by
the ngram_token_size
option.
Words in the stopword list are ignored. A stopword is a word
such as “the” or “some” that is so
common that it is considered to have zero semantic value.
There is a built-in stopword list, but it can be overridden
by a user-defined list. The stopword lists and related
configuration options are different for
InnoDB
search indexes and
MyISAM
ones. Stopword processing is
controlled by the configuration options
innodb_ft_enable_stopword
,
innodb_ft_server_stopword_table
,
and
innodb_ft_user_stopword_table
for InnoDB
search indexes, and
ft_stopword_file
for
MyISAM
ones.
See Section 12.9.4, “Full-Text Stopwords” to view default stopword lists and how to change them. The default minimum word length can be changed as described in Section 12.9.6, “Fine-Tuning MySQL Full-Text Search”.
Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Thus, a word that is present in many documents has a lower weight, because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row. This technique works best with large collections.
For very small tables, word distribution does not adequately
reflect their semantic value, and this model may sometimes
produce bizarre results for search indexes on
MyISAM
tables. For example, although the
word “MySQL” is present in every row of the
articles
table shown earlier, a search for
the word in a MyISAM
search index produces
no results:
mysql>SELECT * FROM articles
WHERE MATCH (title,body)
AGAINST ('MySQL' IN NATURAL LANGUAGE MODE);
Empty set (0.00 sec)
The search result is empty because the word “MySQL” is present in at least 50% of the rows, and so is effectively treated as a stopword. This filtering technique is more suitable for large data sets, where you might not want the result set to return every second row from a 1GB table, than for small data sets where it might cause poor results for popular terms.
The 50% threshold can surprise you when you first try
full-text searching to see how it works, and makes
InnoDB
tables more suited to
experimentation with full-text searches. If you create a
MyISAM
table and insert only one or two
rows of text into it, every word in the text occurs in at
least 50% of the rows. As a result, no search returns any
results until the table contains more rows. Users who need to
bypass the 50% limitation can build search indexes on
InnoDB
tables, or use the boolean search
mode explained in Section 12.9.2, “Boolean Full-Text Searches”.
MySQL can perform boolean full-text searches using the
IN BOOLEAN MODE
modifier. With this modifier,
certain characters have special meaning at the beginning or end
of words in the search string. In the following query, the
+
and -
operators indicate
that a word must be present or absent, respectively, for a match
to occur. Thus, the query retrieves all the rows that contain
the word “MySQL” but that do
not contain the word
“YourSQL”:
mysql>SELECT * FROM articles WHERE MATCH (title,body)
AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+-----------------------+-------------------------------------+ | id | title | body | +----+-----------------------+-------------------------------------+ | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 2 | How To Use MySQL Well | After you went through a ... | | 3 | Optimizing MySQL | In this tutorial we will show ... | | 4 | 1001 MySQL Tricks | 1. Never run mysqld as root. 2. ... | | 6 | MySQL Security | When configured properly, MySQL ... | +----+-----------------------+-------------------------------------+
In implementing this feature, MySQL uses what is sometimes referred to as implied Boolean logic, in which
+
stands for AND
-
stands for NOT
[no operator] implies
OR
Boolean full-text searches have these characteristics:
They do not automatically sort rows in order of decreasing relevance.
InnoDB
tables require a
FULLTEXT
index on all columns of the
MATCH()
expression to perform
boolean queries. Boolean queries against a
MyISAM
search index can work even without
a FULLTEXT
index, although a search
executed in this fashion would be quite slow.
The minimum and maximum word length full-text parameters
apply to FULLTEXT
indexes created using
the built-in FULLTEXT
parser and MeCab
parser plugin.
innodb_ft_min_token_size
and
innodb_ft_max_token_size
are used for InnoDB
search indexes.
ft_min_word_len
and
ft_max_word_len
are used
for MyISAM
search indexes.
Minimum and maximum word length full-text parameters do not
apply to FULLTEXT
indexes created using
the ngram parser. ngram token size is defined by the
ngram_token_size
option.
The stopword list applies, controlled by
innodb_ft_enable_stopword
,
innodb_ft_server_stopword_table
,
and
innodb_ft_user_stopword_table
for InnoDB
search indexes, and
ft_stopword_file
for
MyISAM
ones.
InnoDB
full-text search does not support
the use of multiple operators on a single search word, as in
this example: '++apple'
. Use of multiple
operators on a single search word returns a syntax error to
standard out. MyISAM full-text search will successfully
process the same search ignoring all operators except for
the operator immediately adjacent to the search word.
InnoDB
full-text search only supports
leading plus or minus signs. For example,
InnoDB
supports
'+apple'
but does not support
'apple+'
. Specifying a trailing plus or
minus sign causes InnoDB
to report a
syntax error.
InnoDB
full-text search does not support
the use of a leading plus sign with wildcard
('+*'
), a plus and minus sign combination
('+-'
), or leading a plus and minus sign
combination ('+-apple'
). These invalid
queries return a syntax error.
InnoDB
full-text search does not support
the use of the @
symbol in boolean
full-text searches. The @
symbol is
reserved for use by the @distance
proximity search operator.
They do not use the 50% threshold that applies to
MyISAM
search indexes.
The boolean full-text search capability supports the following operators:
+
A leading or trailing plus sign indicates that this word
must be present in each row that is
returned. InnoDB
only supports leading
plus signs.
-
A leading or trailing minus sign indicates that this word
must not be present in any of the rows
that are returned. InnoDB
only supports
leading minus signs.
Note: The -
operator acts only to exclude
rows that are otherwise matched by other search terms. Thus,
a boolean-mode search that contains only terms preceded by
-
returns an empty result. It does not
return “all rows except those containing any of the
excluded terms.”
(no operator)
By default (when neither +
nor
-
is specified), the word is optional,
but the rows that contain it are rated higher. This mimics
the behavior of MATCH() ...
AGAINST()
without the IN BOOLEAN
MODE
modifier.
@
distance
This operator works on InnoDB
tables
only. It tests whether two or more words all start within a
specified distance from each other, measured in words.
Specify the search words within a double-quoted string
immediately before the
@
operator, for example, distance
MATCH(col1) AGAINST('"word1
word2 word3" @8' IN BOOLEAN MODE)
> <
These two operators are used to change a word's contribution
to the relevance value that is assigned to a row. The
>
operator increases the contribution
and the <
operator decreases it. See
the example following this list.
( )
Parentheses group words into subexpressions. Parenthesized groups can be nested.
~
A leading tilde acts as a negation operator, causing the
word's contribution to the row's relevance to be negative.
This is useful for marking “noise” words. A row
containing such a word is rated lower than others, but is
not excluded altogether, as it would be with the
-
operator.
*
The asterisk serves as the truncation (or wildcard)
operator. Unlike the other operators, it is
appended to the word to be affected.
Words match if they begin with the word preceding the
*
operator.
If a word is specified with the truncation operator, it is
not stripped from a boolean query, even if it is too short
or a stopword. Whether a word is too short is determined
from the
innodb_ft_min_token_size
setting for InnoDB
tables, or
ft_min_word_len
for
MyISAM
tables. These options are not
applicable to FULLTEXT
indexes that use
the ngram parser.
The wildcarded word is considered as a prefix that must be
present at the start of one or more words. If the minimum
word length is 4, a search for
'+
could return fewer rows than a search for
word
+the*''+
,
because the second query ignores the too-short search term
word
+the'the
.
"
A phrase that is enclosed within double quote
("
) characters matches only rows that
contain the phrase literally, as it was
typed. The full-text engine splits the phrase
into words and performs a search in the
FULLTEXT
index for the words. Nonword
characters need not be matched exactly: Phrase searching
requires only that matches contain exactly the same words as
the phrase and in the same order. For example,
"test phrase"
matches "test,
phrase"
.
If the phrase contains no words that are in the index, the result is empty. The words might not be in the index because of a combination of factors: if they do not exist in the text, are stopwords, or are shorter than the minimum length of indexed words.
The following examples demonstrate some search strings that use boolean full-text operators:
'apple banana'
Find rows that contain at least one of the two words.
'+apple +juice'
Find rows that contain both words.
'+apple macintosh'
Find rows that contain the word “apple”, but rank rows higher if they also contain “macintosh”.
'+apple -macintosh'
Find rows that contain the word “apple” but not “macintosh”.
'+apple ~macintosh'
Find rows that contain the word “apple”, but if
the row also contains the word “macintosh”,
rate it lower than if row does not. This is
“softer” than a search for '+apple
-macintosh'
, for which the presence of
“macintosh” causes the row not to be returned
at all.
'+apple +(>turnover <strudel)'
Find rows that contain the words “apple” and “turnover”, or “apple” and “strudel” (in any order), but rank “apple turnover” higher than “apple strudel”.
'apple*'
Find rows that contain words such as “apple”, “apples”, “applesauce”, or “applet”.
'"some words"'
Find rows that contain the exact phrase “some
words” (for example, rows that contain “some
words of wisdom” but not “some noise
words”). Note that the "
characters that enclose the phrase are operator characters
that delimit the phrase. They are not the quotation marks
that enclose the search string itself.
InnoDB
full-text search is
modeled on the
Sphinx full-text
search engine, and the algorithms used are based on
BM25
and
TF-IDF
ranking algorithms. For these reasons, relevancy rankings for
InnoDB
boolean full-text search may differ
from MyISAM
relevancy rankings.
InnoDB
uses a variation of the “term
frequency-inverse document frequency”
(TF-IDF
) weighting system to rank a
document's relevance for a given full-text search query. The
TF-IDF
weighting is based on how frequently
a word appears in a document, offset by how frequently the
word appears in all documents in the collection. In other
words, the more frequently a word appears in a document, and
the less frequently the word appears in the document
collection, the higher the document is ranked.
The term frequency (TF
) value is the number
of times that a word appears in a document. The inverse
document frequency (IDF
) value of a word is
calculated using the following formula, where
total_records
is the number of records in
the collection, and matching_records
is the
number of records that the search term appears in.
${IDF} = log10( ${total_records} / ${matching_records} )
When a document contains a word multiple times, the IDF value is multiplied by the TF value:
${TF} * ${IDF}
Using the TF
and IDF
values, the relevancy ranking for a document is calculated
using this formula:
${rank} = ${TF} * ${IDF} * ${IDF}
The formula is demonstrated in the following examples.
This example demonstrates the relevancy ranking calculation for a single-word search.
mysql> CREATE TABLE articles ( id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY, title VARCHAR(200), body TEXT, FULLTEXT (title,body) ) ENGINE=InnoDB; Query OK, 0 rows affected (1.04 sec) mysql> INSERT INTO articles (title,body) VALUES ('MySQL Tutorial','This database tutorial ...'), ("How To Use MySQL",'After you went through a ...'), ('Optimizing Your Database','In this database tutorial ...'), ('MySQL vs. YourSQL','When comparing databases ...'), ('MySQL Security','When configured properly, MySQL ...'), ('Database, Database, Database','database database database'), ('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'), ('MySQL Full-Text Indexes', 'MySQL fulltext indexes use a ..'); Query OK, 8 rows affected (0.06 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql> SELECT id, title, body, MATCH (title,body) AGAINST ('database' IN BOOLEAN MODE) AS score FROM articles ORDER BY score DESC; +----+------------------------------+-------------------------------------+---------------------+ | id | title | body | score | +----+------------------------------+-------------------------------------+---------------------+ | 6 | Database, Database, Database | database database database | 1.0886961221694946 | | 3 | Optimizing Your Database | In this database tutorial ... | 0.36289870738983154 | | 1 | MySQL Tutorial | This database tutorial ... | 0.18144935369491577 | | 2 | How To Use MySQL | After you went through a ... | 0 | | 4 | MySQL vs. YourSQL | When comparing databases ... | 0 | | 5 | MySQL Security | When configured properly, MySQL ... | 0 | | 7 | 1001 MySQL Tricks | 1. Never run mysqld as root. 2. ... | 0 | | 8 | MySQL Full-Text Indexes | MySQL fulltext indexes use a .. | 0 | +----+------------------------------+-------------------------------------+---------------------+ 8 rows in set (0.00 sec)
There are 8 records in total, with 3 that match the
“database” search term. The first record
(id 6
) contains the search term 6 times and
has a relevancy ranking of
1.0886961221694946
. This ranking value is
calculated using a TF
value of 6 (the
“database” search term appears 6 times in record
id 6
) and an IDF
value
of 0.42596873216370745, which is calculated as follows (where
8 is the total number of records and 3 is the number of
records that the search term appears in):
${IDF} = log10( 8 / 3 ) = 0.42596873216370745
The TF
and IDF
values
are then entered into the ranking formula:
${rank} = ${TF} * ${IDF} * ${IDF}
Performing the calculation in the MySQL command-line client returns a ranking value of 1.088696164686938.
mysql> SELECT 6*log10(8/3)*log10(8/3); +-------------------------+ | 6*log10(8/3)*log10(8/3) | +-------------------------+ | 1.088696164686938 | +-------------------------+ 1 row in set (0.00 sec)
You may notice a slight difference in the ranking values
returned by the SELECT ... MATCH ...
AGAINST
statement and the MySQL command-line
client (1.0886961221694946
versus
1.088696164686938
). The difference is due
to how the casts between integers and floats/doubles are
performed internally by InnoDB
(along
with related precision and rounding decisions), and how they
are performed elsewhere, such as in the MySQL command-line
client or other types of calculators.
This example demonstrates the relevancy ranking calculation
for a multiple-word full-text search based on the
articles
table and data used in the
previous example.
If you search on more than one word, the relevancy ranking value is a sum of the relevancy ranking value for each word, as shown in this formula:
${rank} = ${TF} * ${IDF} * ${IDF} + ${TF} * ${IDF} * ${IDF}
Performing a search on two terms ('mysql tutorial') returns the following results:
mysql> SELECT id, title, body, MATCH (title,body) AGAINST ('mysql tutorial' IN BOOLEAN MODE) AS score FROM articles ORDER BY score DESC; +----+------------------------------+-------------------------------------+----------------------+ | id | title | body | score | +----+------------------------------+-------------------------------------+----------------------+ | 1 | MySQL Tutorial | This database tutorial ... | 0.7405621409416199 | | 3 | Optimizing Your Database | In this database tutorial ... | 0.3624762296676636 | | 5 | MySQL Security | When configured properly, MySQL ... | 0.031219376251101494 | | 8 | MySQL Full-Text Indexes | MySQL fulltext indexes use a .. | 0.031219376251101494 | | 2 | How To Use MySQL | After you went through a ... | 0.015609688125550747 | | 4 | MySQL vs. YourSQL | When comparing databases ... | 0.015609688125550747 | | 7 | 1001 MySQL Tricks | 1. Never run mysqld as root. 2. ... | 0.015609688125550747 | | 6 | Database, Database, Database | database database database | 0 | +----+------------------------------+-------------------------------------+----------------------+ 8 rows in set (0.00 sec)
In the first record (id 8
), 'mysql' appears
once and 'tutorial' appears twice. There are six matching
records for 'mysql' and two matching records for 'tutorial'.
The MySQL command-line client returns the expected ranking
value when inserting these values into the ranking formula for
a multiple word search:
mysql> SELECT (1*log10(8/6)*log10(8/6)) + (2*log10(8/2)*log10(8/2)); +-------------------------------------------------------+ | (1*log10(8/6)*log10(8/6)) + (2*log10(8/2)*log10(8/2)) | +-------------------------------------------------------+ | 0.7405621541938003 | +-------------------------------------------------------+ 1 row in set (0.00 sec)
The slight difference in the ranking values returned by the
SELECT ... MATCH ... AGAINST
statement
and the MySQL command-line client is explained in the
preceding example.
Full-text search supports query expansion (and in particular, its variant “blind query expansion”). This is generally useful when a search phrase is too short, which often means that the user is relying on implied knowledge that the full-text search engine lacks. For example, a user searching for “database” may really mean that “MySQL”, “Oracle”, “DB2”, and “RDBMS” all are phrases that should match “databases” and should be returned, too. This is implied knowledge.
Blind query expansion (also known as automatic relevance
feedback) is enabled by adding WITH QUERY
EXPANSION
or IN NATURAL LANGUAGE MODE WITH
QUERY EXPANSION
following the search phrase. It works
by performing the search twice, where the search phrase for the
second search is the original search phrase concatenated with
the few most highly relevant documents from the first search.
Thus, if one of these documents contains the word
“databases” and the word “MySQL”, the
second search finds the documents that contain the word
“MySQL” even if they do not contain the word
“database”. The following example shows this
difference:
mysql>SELECT * FROM articles
WHERE MATCH (title,body)
AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+ | id | title | body | +----+-------------------+------------------------------------------+ | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 5 | MySQL vs. YourSQL | In the following database comparison ... | +----+-------------------+------------------------------------------+ 2 rows in set (0.00 sec) mysql>SELECT * FROM articles
WHERE MATCH (title,body)
AGAINST ('database' WITH QUERY EXPANSION);
+----+-----------------------+------------------------------------------+ | id | title | body | +----+-----------------------+------------------------------------------+ | 5 | MySQL vs. YourSQL | In the following database comparison ... | | 1 | MySQL Tutorial | DBMS stands for DataBase ... | | 3 | Optimizing MySQL | In this tutorial we will show ... | | 6 | MySQL Security | When configured properly, MySQL ... | | 2 | How To Use MySQL Well | After you went through a ... | | 4 | 1001 MySQL Tricks | 1. Never run mysqld as root. 2. ... | +----+-----------------------+------------------------------------------+ 6 rows in set (0.00 sec)
Another example could be searching for books by Georges Simenon about Maigret, when a user is not sure how to spell “Maigret”. A search for “Megre and the reluctant witnesses” finds only “Maigret and the Reluctant Witnesses” without query expansion. A search with query expansion finds all books with the word “Maigret” on the second pass.
Because blind query expansion tends to increase noise significantly by returning nonrelevant documents, use it only when a search phrase is short.
The stopword list is loaded and searched for full-text queries
using the server character set and collation (the values of the
character_set_server
and
collation_server
system
variables). False hits or misses might occur for stopword
lookups if the stopword file or columns used for full-text
indexing or searches have a character set or collation different
from character_set_server
or
collation_server
.
Case sensitivity of stopword lookups depends on the server
collation. For example, lookups are case insensitive if the
collation is utf8mb4_0900_ai_ci
, whereas
lookups are case-sensitive if the collation is
utf8mb4_0900_as_cs
or
utf8mb4_bin
.
InnoDB
has a relatively short list of
default stopwords, because documents from technical, literary,
and other sources often use short words as keywords or in
significant phrases. For example, you might search for
“to be or not to be” and expect to get a sensible
result, rather than having all those words ignored.
To see the default InnoDB
stopword list,
query the
INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD
table.
mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD; +-------+ | value | +-------+ | a | | about | | an | | are | | as | | at | | be | | by | | com | | de | | en | | for | | from | | how | | i | | in | | is | | it | | la | | of | | on | | or | | that | | the | | this | | to | | was | | what | | when | | where | | who | | will | | with | | und | | the | | www | +-------+ 36 rows in set (0.00 sec)
To define your own stopword list for all
InnoDB
tables, define a table with the same
structure as the
INNODB_FT_DEFAULT_STOPWORD
table,
populate it with stopwords, and set the value of the
innodb_ft_server_stopword_table
option to a value in the form
before creating the full-text index. The stopword table must
have a single db_name
/table_name
VARCHAR
column
named value
. The following example
demonstrates creating and configuring a new global stopword
table for InnoDB
.
-- Create a new stopword table mysql> CREATE TABLE my_stopwords(value VARCHAR(30)) ENGINE = INNODB; Query OK, 0 rows affected (0.01 sec) -- Insert stopwords (for simplicity, a single stopword is used in this example) mysql> INSERT INTO my_stopwords(value) VALUES ('Ishmael'); Query OK, 1 row affected (0.00 sec) -- Create the table mysql> CREATE TABLE opening_lines ( id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY, opening_line TEXT(500), author VARCHAR(200), title VARCHAR(200) ) ENGINE=InnoDB; Query OK, 0 rows affected (0.01 sec) -- Insert data into the table mysql> INSERT INTO opening_lines(opening_line,author,title) VALUES ('Call me Ishmael.','Herman Melville','Moby-Dick'), ('A screaming comes across the sky.','Thomas Pynchon','Gravity\'s Rainbow'), ('I am an invisible man.','Ralph Ellison','Invisible Man'), ('Where now? Who now? When now?','Samuel Beckett','The Unnamable'), ('It was love at first sight.','Joseph Heller','Catch-22'), ('All this happened, more or less.','Kurt Vonnegut','Slaughterhouse-Five'), ('Mrs. Dalloway said she would buy the flowers herself.','Virginia Woolf','Mrs. Dalloway'), ('It was a pleasure to burn.','Ray Bradbury','Fahrenheit 451'); Query OK, 8 rows affected (0.00 sec) Records: 8 Duplicates: 0 Warnings: 0 -- Set the innodb_ft_server_stopword_table option to the new stopword table mysql> SET GLOBAL innodb_ft_server_stopword_table = 'test/my_stopwords'; Query OK, 0 rows affected (0.00 sec) -- Create the full-text index (which rebuilds the table if no FTS_DOC_ID column is defined) mysql> CREATE FULLTEXT INDEX idx ON opening_lines(opening_line); Query OK, 0 rows affected, 1 warning (1.17 sec) Records: 0 Duplicates: 0 Warnings: 1
Verify that the specified stopword ('Ishmael') does not appear
by querying the words in
INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE
.
By default, words less than 3 characters in length or
greater than 84 characters in length do not appear in an
InnoDB
full-text search index. Maximum
and minimum word length values are configurable using the
innodb_ft_max_token_size
and
innodb_ft_min_token_size
variables. This default behavior does not apply to the ngram
parser plugin. ngram token size is defined by the
ngram_token_size
option.
mysql> SET GLOBAL innodb_ft_aux_table='test/opening_lines'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT word FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE LIMIT 15; +-----------+ | word | +-----------+ | across | | all | | burn | | buy | | call | | comes | | dalloway | | first | | flowers | | happened | | herself | | invisible | | less | | love | | man | +-----------+ 15 rows in set (0.00 sec)
To create stopword lists on a table-by-table basis, create
other stopword tables and use the
innodb_ft_user_stopword_table
option to specify the stopword table that you want to use
before you create the full-text index.
The stopword file is loaded and searched using
latin1
if
character_set_server
is
ucs2
, utf16
,
utf16le
, or utf32
.
To override the default stopword list for MyISAM tables, set
the ft_stopword_file
system
variable. (See Section 5.1.8, “Server System Variables”.) The
variable value should be the path name of the file containing
the stopword list, or the empty string to disable stopword
filtering. The server looks for the file in the data directory
unless an absolute path name is given to specify a different
directory. After changing the value of this variable or the
contents of the stopword file, restart the server and rebuild
your FULLTEXT
indexes.
The stopword list is free-form, separating stopwords with any
nonalphanumeric character such as newline, space, or comma.
Exceptions are the underscore character (_
)
and a single apostrophe ('
) which are
treated as part of a word. The character set of the stopword
list is the server's default character set; see
Section 10.3.2, “Server Character Set and Collation”.
The following list shows the default stopwords for
MyISAM
search indexes. In a MySQL source
distribution, you can find this list in the
storage/myisam/ft_static.c
file.
a's able about above according accordingly across actually after afterwards again against ain't all allow allows almost alone along already also although always am among amongst an and another any anybody anyhow anyone anything anyway anyways anywhere apart appear appreciate appropriate are aren't around as aside ask asking associated at available away awfully be became because become becomes becoming been before beforehand behind being believe below beside besides best better between beyond both brief but by c'mon c's came can can't cannot cant cause causes certain certainly changes clearly co com come comes concerning consequently consider considering contain containing contains corresponding could couldn't course currently definitely described despite did didn't different do does doesn't doing don't done down downwards during each edu eg eight either else elsewhere enough entirely especially et etc even ever every everybody everyone everything everywhere ex exactly example except far few fifth first five followed following follows for former formerly forth four from further furthermore get gets getting given gives go goes going gone got gotten greetings had hadn't happens hardly has hasn't have haven't having he he's hello help hence her here here's hereafter hereby herein hereupon hers herself hi him himself his hither hopefully how howbeit however i'd i'll i'm i've ie if ignored immediate in inasmuch inc indeed indicate indicated indicates inner insofar instead into inward is isn't it it'd it'll it's its itself just keep keeps kept know known knows last lately later latter latterly least less lest let let's like liked likely little look looking looks ltd mainly many may maybe me mean meanwhile merely might more moreover most mostly much must my myself name namely nd near nearly necessary need needs neither never nevertheless new next nine no nobody non none noone nor normally not nothing novel now nowhere obviously of off often oh ok okay old on once one ones only onto or other others otherwise ought our ours ourselves out outside over overall own particular particularly per perhaps placed please plus possible presumably probably provides que quite qv rather rd re really reasonably regarding regardless regards relatively respectively right said same saw say saying says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven several shall she should shouldn't since six so some somebody somehow someone something sometime sometimes somewhat somewhere soon sorry specified specify specifying still sub such sup sure t's take taken tell tends th than thank thanks thanx that that's thats the their theirs them themselves then thence there there's thereafter thereby therefore therein theres thereupon these they they'd they'll they're they've think third this thorough thoroughly those though three through throughout thru thus to together too took toward towards tried tries truly try trying twice two un under unfortunately unless unlikely until unto up upon us use used useful uses using usually value various very via viz vs want wants was wasn't way we we'd we'll we're we've welcome well went were weren't what what's whatever when whence whenever where where's whereafter whereas whereby wherein whereupon wherever whether which while whither who who's whoever whole whom whose why will willing wish with within without won't wonder would wouldn't yes yet you you'd you'll you're you've your yours yourself yourselves zero
Full-text searches are supported for
InnoDB
and
MyISAM
tables only.
Full-text searches are not supported for partitioned tables. See Section 23.6, “Restrictions and Limitations on Partitioning”.
Full-text searches can be used with most multibyte character
sets. The exception is that for Unicode, the
utf8
character set can be used, but not
the ucs2
character set. Although
FULLTEXT
indexes on
ucs2
columns cannot be used, you can
perform IN BOOLEAN MODE
searches on a
ucs2
column that has no such index.
The remarks for utf8
also apply to
utf8mb4
, and the remarks for
ucs2
also apply to
utf16
, utf16le
, and
utf32
.
Ideographic languages such as Chinese and Japanese do not have word delimiters. Therefore, the built-in full-text parser cannot determine where words begin and end in these and other such languages.
A character-based ngram full-text parser that supports
Chinese, Japanese, and Korean (CJK), and a word-based MeCab
parser plugin that supports Japanese are provided for use
with InnoDB
and MyISAM
tables.
Although the use of multiple character sets within a single
table is supported, all columns in a
FULLTEXT
index must use the same
character set and collation.
The MATCH()
column list must
match exactly the column list in some
FULLTEXT
index definition for the table,
unless this MATCH()
is
IN BOOLEAN MODE
on a
MyISAM
table. For
MyISAM
tables, boolean-mode searches can
be done on nonindexed columns, although they are likely to
be slow.
The argument to AGAINST()
must be a
string value that is constant during query evaluation. This
rules out, for example, a table column because that can
differ for each row.
Index hints are more limited for FULLTEXT
searches than for non-FULLTEXT
searches.
See Section 8.9.4, “Index Hints”.
For InnoDB
, all DML operations
(INSERT
,
UPDATE
,
DELETE
) involving columns
with full-text indexes are processed at transaction commit
time. For example, for an INSERT
operation, an inserted string is tokenized and decomposed
into individual words. The individual words are then added
to full-text index tables when the transaction is committed.
As a result, full-text searches only return committed data.
The '%' character is not a supported wildcard character for full-text searches.
MySQL's full-text search capability has few user-tunable parameters. You can exert more control over full-text searching behavior if you have a MySQL source distribution because some changes require source code modifications. See Section 2.9, “Installing MySQL from Source”.
Full-text search is carefully tuned for effectiveness. Modifying the default behavior in most cases can actually decrease effectiveness. Do not alter the MySQL sources unless you know what you are doing.
Most full-text variables described in this section must be set at server startup time. A server restart is required to change them; they cannot be modified while the server is running.
Some variable changes require that you rebuild the
FULLTEXT
indexes in your tables. Instructions
for doing so are given later in this section.
The minimum and maximum lengths of words to be indexed are
defined by the
innodb_ft_min_token_size
and
innodb_ft_max_token_size
for
InnoDB
search indexes, and
ft_min_word_len
and
ft_max_word_len
for
MyISAM
ones.
Minimum and maximum word length full-text parameters do not
apply to FULLTEXT
indexes created using
the ngram parser. ngram token size is defined by the
ngram_token_size
option.
After changing any of these options, rebuild your
FULLTEXT
indexes for the change to take
effect. For example, to make two-character words searchable,
you could put the following lines in an option file:
[mysqld] innodb_ft_min_token_size=2 ft_min_word_len=2
Then restart the server and rebuild your
FULLTEXT
indexes. For
MyISAM
tables, note the remarks regarding
myisamchk in the instructions that follow
for rebuilding MyISAM
full-text indexes.
For MyISAM
search indexes, the 50%
threshold for natural language searches is determined by the
particular weighting scheme chosen. To disable it, look for
the following line in
storage/myisam/ftdefs.h
:
#define GWS_IN_USE GWS_PROB
Change that line to this:
#define GWS_IN_USE GWS_FREQ
Then recompile MySQL. There is no need to rebuild the indexes in this case.
By making this change, you severely
decrease MySQL's ability to provide adequate relevance
values for the MATCH()
function. If you really need to search for such common
words, it would be better to search using IN
BOOLEAN MODE
instead, which does not observe the
50% threshold.
To change the operators used for boolean full-text searches on
MyISAM
tables, set the
ft_boolean_syntax
system
variable. (InnoDB
does not have an
equivalent setting.) This variable can be changed while the
server is running, but you must have privileges sufficient to
set global system variables (see
Section 5.1.9.1, “System Variable Privileges”). No rebuilding
of indexes is necessary in this case.
For the built-in full-text parser, you can change the set of
characters that are considered word characters in several
ways, as described in the following list. After making the
modification, rebuild the indexes for each table that contains
any FULLTEXT
indexes. Suppose that you want
to treat the hyphen character ('-') as a word character. Use
one of these methods:
Modify the MySQL source: In
storage/innobase/handler/ha_innodb.cc
(for InnoDB
), or in
storage/myisam/ftdefs.h
(for
MyISAM
), see the
true_word_char()
and
misc_word_char()
macros. Add
'-'
to one of those macros and
recompile MySQL.
Modify a character set file: This requires no
recompilation. The true_word_char()
macro uses a “character type” table to
distinguish letters and numbers from other characters. .
You can edit the contents of the
<ctype><map>
array in one
of the character set XML files to specify that
'-'
is a “letter.” Then
use the given character set for your
FULLTEXT
indexes. For information about
the <ctype><map>
array
format, see Section 10.12.1, “Character Definition Arrays”.
Add a new collation for the character set used by the indexed columns, and alter the columns to use that collation. For general information about adding collations, see Section 10.13, “Adding a Collation to a Character Set”. For an example specific to full-text indexing, see Section 12.9.7, “Adding a Collation for Full-Text Indexing”.
For the changes to take effect, FULLTEXT
indexes must be rebuilt after modifying any of the following
full-text index variables:
innodb_ft_min_token_size
;
innodb_ft_max_token_size
;
innodb_ft_server_stopword_table
;
innodb_ft_user_stopword_table
;
innodb_ft_enable_stopword
;
ngram_token_size
. Modifying
innodb_ft_min_token_size
,
innodb_ft_max_token_size
, or
ngram_token_size
requires
restarting the server.
To rebuild FULLTEXT
indexes for an
InnoDB
table, use
ALTER TABLE
with the
DROP INDEX
and ADD INDEX
options to drop and re-create each index.
Running OPTIMIZE TABLE
on a
table with a full-text index rebuilds the full-text index,
removing deleted Document IDs and consolidating multiple
entries for the same word, where possible.
To optimize a full-text index, enable
innodb_optimize_fulltext_only
and run OPTIMIZE TABLE
.
mysql> set GLOBAL innodb_optimize_fulltext_only=ON; Query OK, 0 rows affected (0.01 sec) mysql> OPTIMIZE TABLE opening_lines; +--------------------+----------+----------+----------+ | Table | Op | Msg_type | Msg_text | +--------------------+----------+----------+----------+ | test.opening_lines | optimize | status | OK | +--------------------+----------+----------+----------+ 1 row in set (0.01 sec)
To avoid lengthy rebuild times for full-text indexes on large
tables, you can use the
innodb_ft_num_word_optimize
option to perform the optimization in stages. The
innodb_ft_num_word_optimize
option defines
the number of words that are optimized each time
OPTIMIZE TABLE
is run. The
default setting is 2000, which means that 2000 words are
optimized each time OPTIMIZE
TABLE
is run. Subsequent
OPTIMIZE TABLE
operations
continue from where the preceding
OPTIMIZE TABLE
operation ended.
If you modify full-text variables that affect indexing
(ft_min_word_len
,
ft_max_word_len
, or
ft_stopword_file
), or if you
change the stopword file itself, you must rebuild your
FULLTEXT
indexes after making the changes
and restarting the server.
To rebuild the FULLTEXT
indexes for a
MyISAM
table, it is sufficient to do a
QUICK
repair operation:
mysql> REPAIR TABLE tbl_name
QUICK;
Alternatively, use ALTER TABLE
as just described. In some cases, this may be faster than a
repair operation.
Each table that contains any FULLTEXT
index
must be repaired as just shown. Otherwise, queries for the
table may yield incorrect results, and modifications to the
table will cause the server to see the table as corrupt and in
need of repair.
If you use myisamchk to perform an
operation that modifies MyISAM
table
indexes (such as repair or analyze), the
FULLTEXT
indexes are rebuilt using the
default full-text parameter values for
minimum word length, maximum word length, and stopword file
unless you specify otherwise. This can result in queries
failing.
The problem occurs because these parameters are known only by
the server. They are not stored in MyISAM
index files. To avoid the problem if you have modified the
minimum or maximum word length or stopword file values used by
the server, specify the same
ft_min_word_len
,
ft_max_word_len
, and
ft_stopword_file
values for
myisamchk that you use for
mysqld. For example, if you have set the
minimum word length to 3, you can repair a table with
myisamchk like this:
myisamchk --recover --ft_min_word_len=3 tbl_name
.MYI
To ensure that myisamchk and the server use
the same values for full-text parameters, place each one in
both the [mysqld]
and
[myisamchk]
sections of an option file:
[mysqld] ft_min_word_len=3 [myisamchk] ft_min_word_len=3
An alternative to using myisamchk for
MyISAM
table index modification is to use
the REPAIR TABLE
,
ANALYZE TABLE
,
OPTIMIZE TABLE
, or
ALTER TABLE
statements. These
statements are performed by the server, which knows the proper
full-text parameter values to use.
This section describes how to add a new collation for full-text
searches using the built-in full-text parser. The sample
collation is like latin1_swedish_ci
but
treats the '-'
character as a letter rather
than as a punctuation character so that it can be indexed as a
word character. General information about adding collations is
given in Section 10.13, “Adding a Collation to a Character Set”; it is assumed that
you have read it and are familiar with the files involved.
To add a collation for full-text indexing, use the following procedure. The instructions here add a collation for a simple character set, which as discussed in Section 10.13, “Adding a Collation to a Character Set”, can be created using a configuration file that describes the character set properties. For a complex character set such as Unicode, create collations using C source files that describe the character set properties.
Add a collation to the Index.xml
file.
The collation ID must be unused, so choose a value different
from 1000 if that ID is already taken on your system.
<charset name="latin1"> ... <collation name="latin1_fulltext_ci" id="1000"/> </charset>
Declare the sort order for the collation in the
latin1.xml
file. In this case, the
order can be copied from
latin1_swedish_ci
:
<collation name="latin1_fulltext_ci"> <map> 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F 30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 54 55 56 57 58 59 5A 7B 7C 7D 7E 7F 80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F 90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF 41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49 44 4E 4F 4F 4F 4F 5D D7 D8 55 55 55 59 59 DE DF 41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49 44 4E 4F 4F 4F 4F 5D F7 D8 55 55 55 59 59 DE FF </map> </collation>
Modify the ctype
array in
latin1.xml
. Change the value
corresponding to 0x2D (which is the code for the
'-'
character) from 10 (punctuation) to
01 (small letter). In the following array, this is the
element in the fourth row down, third value from the end.
<ctype>
<map>
00
20 20 20 20 20 20 20 20 20 28 28 28 28 28 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
48 10 10 10 10 10 10 10 10 10 10 10 10 01 10 10
84 84 84 84 84 84 84 84 84 84 10 10 10 10 10 10
10 81 81 81 81 81 81 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 01 01 01 01 01 10 10 10 10 10
10 82 82 82 82 82 82 02 02 02 02 02 02 02 02 02
02 02 02 02 02 02 02 02 02 02 02 10 10 10 10 20
10 00 10 02 10 10 10 10 10 10 01 10 01 00 01 00
00 10 10 10 10 10 10 10 10 10 02 10 02 00 02 01
48 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 01 10 01 01 01 01 01 01 01 02
02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
02 02 02 02 02 02 02 10 02 02 02 02 02 02 02 02
</map>
</ctype>
Restart the server.
To employ the new collation, include it in the definition of columns that are to use it:
mysql>DROP TABLE IF EXISTS t1;
Query OK, 0 rows affected (0.13 sec) mysql>CREATE TABLE t1 (
a TEXT CHARACTER SET latin1 COLLATE latin1_fulltext_ci,
FULLTEXT INDEX(a)
) ENGINE=InnoDB;
Query OK, 0 rows affected (0.47 sec)
Test the collation to verify that hyphen is considered as a word character:
mysql>INSERT INTO t1 VALUEs ('----'),('....'),('abcd');
Query OK, 3 rows affected (0.22 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT * FROM t1 WHERE MATCH a AGAINST ('----' IN BOOLEAN MODE);
+------+ | a | +------+ | ---- | +------+ 1 row in set (0.00 sec)
The built-in MySQL full-text parser uses the white space between
words as a delimiter to determine where words begin and end,
which is a limitation when working with ideographic languages
that do not use word delimiters. To address this limitation,
MySQL provides an ngram full-text parser that supports Chinese,
Japanese, and Korean (CJK). The ngram full-text parser is
supported for use with InnoDB
and
MyISAM
.
MySQL also provides a MeCab full-text parser plugin for Japanese, which tokenizes documents into meaningful words. For more information, see Section 12.9.9, “MeCab Full-Text Parser Plugin”.
An ngram is a contiguous sequence of
n
characters from a given sequence of
text. The ngram parser tokenizes a sequence of text into a
contiguous sequence of n
characters.
For example, you can tokenize “abcd” for different
values of n
using the ngram full-text
parser.
n=1: 'a', 'b', 'c', 'd' n=2: 'ab', 'bc', 'cd' n=3: 'abc', 'bcd' n=4: 'abcd'
The ngram full-text parser is a built-in server plugin. As with other built-in server plugins, it is automatically loaded when the server is started.
The full-text search syntax described in
Section 12.9, “Full-Text Search Functions” applies to the ngram parser
plugin. Differences in parsing behavior are described in this
section. Full-text-related configuration options, except for
minimum and maximum word length options
(innodb_ft_min_token_size
,
innodb_ft_max_token_size
,
ft_min_word_len
,
ft_max_word_len
) are also
applicable.
The ngram parser has a default ngram token size of 2 (bigram). For example, with a token size of 2, the ngram parser parses the string “abc def” into four tokens: “ab”, “bc”, “de” and “ef”.
ngram token size is configurable using the
ngram_token_size
configuration
option, which has a minimum value of 1 and maximum value of 10.
Typically, ngram_token_size
is
set to the size of the largest token that you want to search
for. If you only intend to search for single characters, set
ngram_token_size
to 1. A
smaller token size produces a smaller full-text search index,
and faster searches. If you need to search for words comprised
of more than one character, set
ngram_token_size
accordingly.
For example, “Happy Birthday” is
“生日快乐” in
simplified Chinese, where
“生日” is
“birthday”, and
“快乐” translates
as “happy”. To search on two-character words such
as these, set ngram_token_size
to a value of 2 or higher.
As a read-only variable,
ngram_token_size
may only be
set as part of a startup string or in a configuration file:
Startup string:
mysqld --ngram_token_size=2
Configuration file:
[mysqld] ngram_token_size=2
The following minimum and maximum word length configuration
options are ignored for FULLTEXT
indexes
that use the ngram parser:
innodb_ft_min_token_size
,
innodb_ft_max_token_size
,
ft_min_word_len
, and
ft_max_word_len
.
To create a FULLTEXT
index that uses the
ngram parser, specify WITH PARSER ngram
with
CREATE TABLE
,
ALTER TABLE
, or
CREATE INDEX
.
The following example demonstrates creating a table with an
ngram
FULLTEXT
index,
inserting sample data (Simplified Chinese text), and viewing
tokenized data in the
INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
table.
mysql> USE test; mysql> CREATE TABLE articles ( id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY, title VARCHAR(200), body TEXT, FULLTEXT (title,body) WITH PARSER ngram ) ENGINE=InnoDB CHARACTER SET utf8mb4; mysql> SET NAMES utf8mb4; INSERT INTO articles (title,body) VALUES ('数据库管理','在本教程中我将向你展示如何管理数据库'), ('数据库应用开发','学习开发数据库应用程序'); mysql> SET GLOBAL innodb_ft_aux_table="test/articles"; mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE ORDER BY doc_id, position;
To add a FULLTEXT
index to an existing table,
you can use ALTER TABLE
or
CREATE INDEX
. For example:
CREATE TABLE articles ( id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY, title VARCHAR(200), body TEXT ) ENGINE=InnoDB CHARACTER SET utf8; ALTER TABLE articles ADD FULLTEXT INDEX ft_index (title,body) WITH PARSER ngram; # Or: CREATE FULLTEXT INDEX ft_index ON articles (title,body) WITH PARSER ngram;
The ngram parser eliminates spaces when parsing. For example:
“ab cd” is parsed to “ab”, “cd”
“a bc” is parsed to “bc”
The built-in MySQL full-text parser compares words to entries in
the stopword list. If a word is equal to an entry in the
stopword list, the word is excluded from the index. For the
ngram parser, stopword handling is performed differently.
Instead of excluding tokens that are equal to entries in the
stopword list, the ngram parser excludes tokens that
contain stopwords. For example, assuming
ngram_token_size=2
, a document
that contains “a,b” is parsed to “a,”
and “,b”. If a comma (“,”) is defined
as a stopword, both “a,” and “,b” are
excluded from the index because they contain a comma.
By default, the ngram parser uses the default stopword list, which contains a list of English stopwords. For a stopword list applicable to Chinese, Japanese, or Korean, you must create your own. For information about creating a stopword list, see Section 12.9.4, “Full-Text Stopwords”.
Stopwords greater in length than
ngram_token_size
are ignored.
For natural language mode search, the
search term is converted to a union of ngram terms. For example,
the string “abc” (assuming
ngram_token_size=2
) is
converted to “ab bc”. Given two documents, one
containing “ab” and the other containing
“abc”, the search term “ab bc” matches
both documents.
For boolean mode search, the search term is
converted to an ngram phrase search. For example, the string
'abc' (assuming
ngram_token_size=2
) is
converted to '“ab bc”'. Given two documents, one
containing 'ab' and the other containing 'abc', the search
phrase '“ab bc”' only matches the document
containing 'abc'.
Because an ngram FULLTEXT
index contains only
ngrams, and does not contain information about the beginning of
terms, wildcard searches may return unexpected results. The
following behaviors apply to wildcard searches using ngram
FULLTEXT
search indexes:
If the prefix term of a wildcard search is shorter than
ngram token size, the query returns all indexed rows that
contain ngram tokens starting with the prefix term. For
example, assuming
ngram_token_size=2
, a
search on “a*” returns all rows starting with
“a”.
If the prefix term of a wildcard search is longer than ngram
token size, the prefix term is converted to an ngram phrase
and the wildcard operator is ignored. For example, assuming
ngram_token_size=2
, an
“abc*” wildcard search is converted to
“ab bc”.
Phrase searches are converted to ngram phrase searches. For example, The search phrase “abc” is converted to “ab bc”, which returns documents containing “abc” and “ab bc”.
The search phrase “abc def” is converted to “ab bc de ef”, which returns documents containing “abc def” and “ab bc de ef”. A document that contains “abcdef” is not returned.
The built-in MySQL full-text parser uses the white space between
words as a delimiter to determine where words begin and end,
which is a limitation when working with ideographic languages
that do not use word delimiters. To address this limitation for
Japanese, MySQL provides a MeCab full-text parser plugin. The
MeCab full-text parser plugin is supported for use with
InnoDB
and
MyISAM
.
MySQL also provides an ngram full-text parser plugin that supports Japanese. For more information, see Section 12.9.8, “ngram Full-Text Parser”.
The MeCab full-text parser plugin is a full-text parser plugin
for Japanese that tokenizes a sequence of text into meaningful
words. For example, MeCab tokenizes
“データベース管理”
(“Database Management”) into
“データベース”
(“Database”) and
“管理”
(“Management”). By comparison, the
ngram full-text
parser tokenizes text into a contiguous sequence of
n
characters, where
n
represents a number between 1 and
10.
In addition to tokenizing text into meaningful words, MeCab indexes are typically smaller than ngram indexes, and MeCab full-text searches are generally faster. One drawback is that it may take longer for the MeCab full-text parser to tokenize documents, compared to the ngram full-text parser.
The full-text search syntax described in Section 12.9, “Full-Text Search Functions” applies to the MeCab parser plugin. Differences in parsing behavior are described in this section. Full-text related configuration options are also applicable.
For additional information about the MeCab parser, refer to the MeCab: Yet Another Part-of-Speech and Morphological Analyzer project on Github.
The MeCab parser plugin requires mecab
and
mecab-ipadic
.
On supported Fedora, Debian and Ubuntu platforms (except Ubuntu
12.04 where the system mecab
version is too
old), MySQL dynamically links to the system
mecab
installation if it is installed to
the default location. On other supported Unix-like platforms,
libmecab.so
is statically linked in
libpluginmecab.so
, which is located in the
MySQL plugin directory. mecab-ipadic
is
included in MySQL binaries and is located in
.
MYSQL_HOME
\lib\mecab
You can install mecab
and
mecab-ipadic
using a native package
management utility (on Fedora, Debian, and Ubuntu), or you can
build mecab
and
mecab-ipadic
from source. For information
about installing mecab
and
mecab-ipadic
using a native package
management utility, see
Installing MeCab From a
Binary Distribution (Optional). If you want to build
mecab
and mecab-ipadic
from source, see
Building MeCab From
Source (Optional).
On Windows, libmecab.dll
is found in the
MySQL bin
directory.
mecab-ipadic
is located in
.
MYSQL_HOME
/lib/mecab
To install and configure the MeCab parser plugin, perform the following steps:
In the MySQL configuration file, set the
mecab_rc_file
configuration
option to the location of the mecabrc
configuration file, which is the configuration file for
MeCab. If you are using the MeCab package distributed with
MySQL, the mecabrc
file is located in
MYSQL_HOME/lib/mecab/etc/
.
[mysqld] loose-mecab-rc-file=MYSQL_HOME/lib/mecab/etc/mecabrc
The loose
prefix is an
option modifier. The
mecab_rc_file
option is not
recognized by MySQL until the MeCaB parser plugin is
installed but it must be set before attempting to install
the MeCaB parser plugin. The loose
prefix
allows you restart MySQL without encountering an error due
to an unrecognized variable.
If you use your own MeCab installation, or build MeCab from
source, the location of the mecabrc
configuration file may differ.
For information about the MySQL configuration file and its location, see Section 4.2.7, “Using Option Files”.
Also in the MySQL configuration file, set the minimum token
size to 1 or 2, which are the values recommended for use
with the MeCab parser. For InnoDB
tables,
minimum token size is defined by the
innodb_ft_min_token_size
configuration option, which has a default value of 3. For
MyISAM
tables, minimum token size is
defined by ft_min_word_len
,
which has a default value of 4.
[mysqld] innodb_ft_min_token_size=1
Modify the mecabrc
configuration file
to specify the dictionary you want to use. The
mecab-ipadic
package distributed with
MySQL binaries includes three dictionaries
(ipadic_euc-jp
,
ipadic_sjis
, and
ipadic_utf-8
). The
mecabrc
configuration file packaged
with MySQL contains and entry similar to the following:
dicdir = /path/to/mysql/lib/mecab/lib/mecab/dic/ipadic_euc-jp
To use the ipadic_utf-8
dictionary, for
example, modify the entry as follows:
dicdir=MYSQL_HOME
/lib/mecab/dic/ipadic_utf-8
If you are using your own MeCab installation or have built
MeCab from source, the default dicdir
entry in the mecabrc
file will differ,
as will the dictionaries and their location.
After the MeCab parser plugin is installed, you can use
the mecab_charset
status
variable to view the character set used with MeCab. The
three MeCab dictionaries provided with the MySQL binary
support the following character sets.
The ipadic_euc-jp
dictionary
supports the ujis
and
eucjpms
character sets.
The ipadic_sjis
dictionary supports
the sjis
and
cp932
character sets.
The ipadic_utf-8
dictionary
supports the utf8
and
utf8mb4
character sets.
mecab_charset
only
reports the first supported character set. For example,
the ipadic_utf-8
dictionary supports
both utf8
and
utf8mb4
.
mecab_charset
always
reports utf8
when this dictionary is in
use.
Restart MySQL.
Install the MeCab parser plugin:
The MeCab parser plugin is installed using
INSTALL PLUGIN
syntax. The
plugin name is mecab
, and the shared
library name is libpluginmecab.so
. For
additional information about installing plugins, see
Section 5.6.1, “Installing and Uninstalling Plugins”.
INSTALL PLUGIN mecab SONAME 'libpluginmecab.so';
Once installed, the MeCab parser plugin loads at every normal MySQL restart.
Verify that the MeCab parser plugin is loaded using the
SHOW PLUGINS
statement.
mysql> SHOW PLUGINS;
A mecab
plugin should appear in the list
of plugins.
To create a FULLTEXT
index that uses the
mecab parser, specify WITH PARSER ngram
with
CREATE TABLE
,
ALTER TABLE
, or
CREATE INDEX
.
This example demonstrates creating a table with a
mecab
FULLTEXT
index,
inserting sample data, and viewing tokenized data in the
INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
table:
mysql> USE test; mysql> CREATE TABLE articles ( id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY, title VARCHAR(200), body TEXT, FULLTEXT (title,body) WITH PARSER mecab ) ENGINE=InnoDB CHARACTER SET utf8; mysql> SET NAMES utf8; mysql> INSERT INTO articles (title,body) VALUES ('データベース管理','このチュートリアルでは、私はどのようにデータベースを管理する方法を紹介します'), ('データベースアプリケーション開発','データベースアプリケーションを開発することを学ぶ'); mysql> SET GLOBAL innodb_ft_aux_table="test/articles"; mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE ORDER BY doc_id, position;
To add a FULLTEXT
index to an existing table,
you can use ALTER TABLE
or
CREATE INDEX
. For example:
CREATE TABLE articles ( id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY, title VARCHAR(200), body TEXT ) ENGINE=InnoDB CHARACTER SET utf8; ALTER TABLE articles ADD FULLTEXT INDEX ft_index (title,body) WITH PARSER mecab; # Or: CREATE FULLTEXT INDEX ft_index ON articles (title,body) WITH PARSER mecab;
The MeCab parser uses spaces as separators in query strings. For example, the MeCab parser tokenizes データベース管理 as データベース and 管理.
By default, the MeCab parser uses the default stopword list, which contains a short list of English stopwords. For a stopword list applicable to Japanese, you must create your own. For information about creating stopword lists, see Section 12.9.4, “Full-Text Stopwords”.
For natural language mode search, the search term is converted to a union of tokens. For example, データベース管理 is converted to データベース 管理.
SELECT COUNT(*) FROM articles WHERE MATCH(title,body) AGAINST('データベース管理' IN NATURAL LANGUAGE MODE);
For boolean mode search, the search term is converted to a search phrase. For example, データベース管理 is converted to データベース 管理.
SELECT COUNT(*) FROM articles WHERE MATCH(title,body) AGAINST('データベース管理' IN BOOLEAN MODE);
Wildcard search terms are not tokenized. A search on データベース管理* is performed on the prefix, データベース管理.
SELECT COUNT(*) FROM articles WHERE MATCH(title,body) AGAINST('データベース*' IN BOOLEAN MODE);
Phrases are tokenized. For example, データベース管理 is tokenized as データベース 管理.
SELECT COUNT(*) FROM articles WHERE MATCH(title,body) AGAINST('"データベース管理"' IN BOOLEAN MODE);
This section describes how to install mecab
and mecab-ipadic
from a binary distribution
using a native package management utility. For example, on
Fedora, you can use Yum to perform the installation:
yum mecab-devel
On Debian or Ubuntu, you can perform an APT installation:
apt-get install mecab apt-get install mecab-ipadic
If you want to build mecab
and
mecab-ipadic
from source, basic
installation steps are provided below. For additional
information, refer to the MeCab documentation.
Download the tar.gz packages for mecab
and mecab-ipadic
from
http://taku910.github.io/mecab/#download. As
of February, 2016, the latest available packages are
mecab-0.996.tar.gz
and
mecab-ipadic-2.7.0-20070801.tar.gz
.
Install mecab
:
tar zxfv mecab-0.996.tar cd mecab-0.996 ./configure make make check su make install
Install mecab-ipadic
:
tar zxfv mecab-ipadic-2.7.0-20070801.tar cd mecab-ipadic-2.7.0-20070801 ./configure make su make install
Compile MySQL using the
WITH_MECAB
CMake option. Set
the WITH_MECAB
option to
system
if you have installed
mecab
and
mecab-ipadic
to the default location.
-DWITH_MECAB=system
If you defined a custom installation directory, set
WITH_MECAB
to the custom
directory. For example:
-DWITH_MECAB=/path/to/mecab
Cast functions and operators enable conversion of values from one data type to another.
CONVERT()
with a
USING
clause provides a way to convert data
between different character sets:
CONVERT(expr
USINGtranscoding_name
)
In MySQL, transcoding names are the same as the corresponding character set names.
Examples:
SELECT CONVERT(_latin1'Müller' USING utf8); INSERT INTO utf8_table (utf8_column) SELECT CONVERT(latin1_column USING utf8) FROM latin1_table;
You can also use CONVERT()
without
USING
or CAST()
to convert strings between different character sets:
CONVERT(string
, CHAR[(N
)] CHARACTER SETcharset_name
) CAST(string
AS CHAR[(N
)] CHARACTER SETcharset_name
)
Examples:
SELECT CONVERT('test', CHAR CHARACTER SET utf8); SELECT CAST('test' AS CHAR CHARACTER SET utf8);
If you specify CHARACTER SET
as just shown,
the resulting character set and collation are
charset_name
charset_name
and the default collation
of charset_name
. If you omit
CHARACTER SET
, the resulting
character set and collation are defined by the
charset_name
character_set_connection
and
collation_connection
system
variables that determine the default connection character set and
collation (see Section 10.4, “Connection Character Sets and Collations”).
A COLLATE
clause is not permitted within a
CONVERT()
or
CAST()
call, but you can apply it
to the function result. For example, this is legal:
SELECT CAST('test' AS CHAR CHARACTER SET utf8) COLLATE utf8_bin;
But this is illegal:
SELECT CAST('test' AS CHAR CHARACTER SET utf8 COLLATE utf8_bin);
Normally, you cannot compare a BLOB
value or other binary string in case-insensitive fashion because
binary strings use the binary
character set,
which has no collation with the concept of lettercase. To perform
a case-insensitive comparison, use the
CONVERT()
or
CAST()
function to convert the
value to a nonbinary string. Comparisons of the resulting string
use its collation. For example, if the conversion result character
set has a case-insensitive collation, a
LIKE
operation is not case-sensitive:
SELECT 'A' LIKE CONVERT(blob_col
USING latin1) FROMtbl_name
;
To use a different character set, substitute its name for
latin1
in the preceding statement. To specify a
particular collation for the converted string, use a
COLLATE
clause following the
CONVERT()
call:
SELECT 'A' LIKE CONVERT(blob_col
USING latin1) COLLATE latin1_german1_ci FROMtbl_name
;
CONVERT()
and
CAST()
can be used more generally
for comparing strings that are represented in different character
sets. For example, a comparison of these strings results in an
error because they have different character sets:
mysql>SET @s1 = _latin1 'abc', @s2 = _latin2 'abc';
mysql>SELECT @s1 = @s2;
ERROR 1267 (HY000): Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (latin2_general_ci,IMPLICIT) for operation '='
Converting one of the strings to a character set compatible with the other enables the comparison to occur without error:
mysql> SELECT @s1 = CONVERT(@s2 USING latin1);
+---------------------------------+
| @s1 = CONVERT(@s2 USING latin1) |
+---------------------------------+
| 1 |
+---------------------------------+
For string literals, another way to specify the character set is
to use a character set introducer (_latin1
and
_latin2
in the preceding example are instances
of introducers). Unlike conversion functions such as
CAST()
, or
CONVERT()
, which convert a string
from one character set to another, an introducer designates a
string literal as having a particular character set, with no
conversion involved. For more information, see
Section 10.3.8, “Character Set Introducers”.
Character set conversion is also useful preceding lettercase
conversion of binary strings.
LOWER()
and
UPPER()
are ineffective when
applied directly to binary strings because the concept of
lettercase does not apply. To perform lettercase conversion of a
binary string, first convert it to a nonbinary string:
mysql>SET @str = BINARY 'New York';
mysql>SELECT LOWER(@str), LOWER(CONVERT(@str USING utf8mb4));
+-------------+------------------------------------+ | LOWER(@str) | LOWER(CONVERT(@str USING utf8mb4)) | +-------------+------------------------------------+ | New York | new york | +-------------+------------------------------------+
If you convert an indexed column using
BINARY
,
CAST()
, or
CONVERT()
, MySQL may not be able to
use the index efficiently.
The cast functions are useful for creating a column with a
specific type in a
CREATE TABLE ...
SELECT
statement:
mysql>CREATE TABLE new_table SELECT CAST('2000-01-01' AS DATE) AS c1;
mysql>SHOW CREATE TABLE new_table\G
*************************** 1. row *************************** Table: new_table Create Table: CREATE TABLE `new_table` ( `c1` date DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
The cast functions are useful for sorting
ENUM
columns in lexical order.
Normally, sorting of ENUM
columns
occurs using the internal numeric values. Casting the values to
CHAR
results in a lexical sort:
SELECTenum_col
FROMtbl_name
ORDER BY CAST(enum_col
AS CHAR);
CAST()
also changes the result if
you use it as part of a more complex expression such as
CONCAT('Date: ',CAST(NOW() AS
DATE))
.
For temporal values, there is little need to use
CAST()
to extract data in different
formats. Instead, use a function such as
EXTRACT()
,
DATE_FORMAT()
, or
TIME_FORMAT()
. See
Section 12.7, “Date and Time Functions”.
To cast a string to a number, you normally need do nothing other than use the string value in numeric context:
mysql> SELECT 1+'1';
-> 2
That is also true for hexadecimal and bit literals, which are binary strings by default:
mysql>SELECT X'41', X'41'+0;
-> 'A', 65 mysql>SELECT b'1100001', b'1100001'+0;
-> 'a', 97
A string used in an arithmetic operation is converted to a floating-point number during expression evaluation.
A number used in string context is converted to a string:
mysql> SELECT CONCAT('hello you ',2);
-> 'hello you 2'
For information about implicit conversion of numbers to strings, see Section 12.2, “Type Conversion in Expression Evaluation”.
MySQL supports arithmetic with both signed and unsigned 64-bit
values. For numeric operators (such as
+
or
-
) where one of the
operands is an unsigned integer, the result is unsigned by default
(see Section 12.6.1, “Arithmetic Operators”). To override this,
use the SIGNED
or UNSIGNED
cast operator to cast a value to a signed or unsigned 64-bit
integer, respectively.
mysql>SELECT 1 - 2;
-> -1 mysql>SELECT CAST(1 - 2 AS UNSIGNED);
-> 18446744073709551615 mysql>SELECT CAST(CAST(1 - 2 AS UNSIGNED) AS SIGNED);
-> -1
If either operand is a floating-point value, the result is a
floating-point value and is not affected by the preceding rule.
(In this context, DECIMAL
column
values are regarded as floating-point values.)
mysql> SELECT CAST(1 AS UNSIGNED) - 2.0;
-> -1.0
The SQL mode affects the result of conversion operations (see Section 5.1.11, “Server SQL Modes”). Examples:
For conversion of a “zero” date string to a date,
CONVERT()
and
CAST()
return
NULL
and produce a warning when the
NO_ZERO_DATE
SQL mode is
enabled.
For integer subtraction, if the
NO_UNSIGNED_SUBTRACTION
SQL
mode is enabled, the subtraction result is signed even if any
operand is unsigned.
The following list describes the available cast functions and operators:
BINARY
expr
The BINARY
operator converts the
expression to a binary string. A common use for
BINARY
is to force a character
string comparison to be done byte by byte rather than
character by character, in effect becoming case-sensitive. The
BINARY
operator also causes
trailing spaces in comparisons to be significant.
mysql>SELECT 'a' = 'A';
-> 1 mysql>SELECT BINARY 'a' = 'A';
-> 0 mysql>SELECT 'a' = 'a ';
-> 1 mysql>SELECT BINARY 'a' = 'a ';
-> 0
In a comparison, BINARY
affects
the entire operation; it can be given before either operand
with the same result.
For purposes of converting a string expression to a binary string, these constructs are equivalent:
BINARYexpr
CAST(expr
AS BINARY) CONVERT(expr
USING BINARY)
If a value is a string literal, it can be designated as a
binary string without performing any conversion by using the
_binary
character set introducer:
mysql>SELECT 'a' = 'A';
-> 1 mysql>SELECT _binary 'a' = 'A';
-> 0
For information about introducers, see Section 10.3.8, “Character Set Introducers”.
The BINARY
operator in
expressions differs in effect from the
BINARY
attribute in character column
definitions. A character column defined with the
BINARY
attribute is assigned table default
character set and the binary (_bin
)
collation of that character set. Every nonbinary character set
has a _bin
collation. For example, the
binary collation for the utf8
character set
is utf8_bin
, so if the table default
character set is utf8
, these two column
definitions are equivalent:
CHAR(10) BINARY CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin
The use of CHARACTER SET binary
in the
definition of a CHAR
,
VARCHAR
, or
TEXT
column causes the column
to be treated as the corresponding binary string data type.
For example, the following pairs of definitions are
equivalent:
CHAR(10) CHARACTER SET binary BINARY(10) VARCHAR(10) CHARACTER SET binary VARBINARY(10) TEXT CHARACTER SET binary BLOB
The CAST()
function takes an
expression of any type and produces a result value of the
specified type, similar to
CONVERT()
. For more
information, see the description of
CONVERT()
.
CAST()
is standard SQL syntax.
CONVERT(
,
expr
,type
)CONVERT(
expr
USING transcoding_name
)
The CONVERT()
function takes an
expression of any type and produces a result value of the
specified type.
Discussion of
CONVERT(
syntax here also
applies to
expr
,
type
)CAST(
, which is
equivalent.
expr
AS
type
)
CONVERT(... USING ...)
is
standard SQL syntax. The non-USING
form of
CONVERT()
is ODBC syntax.
CONVERT()
with
USING
converts data between different
character sets. In MySQL, transcoding names are the same as
the corresponding character set names. For example, this
statement converts the string 'abc'
in the
default character set to the corresponding string in the
utf8
character set:
SELECT CONVERT('abc' USING utf8);
CONVERT()
without
USING
and
CAST()
take an expression and a
type
value specifying the result
type. These type
values are
permitted:
BINARY[(
N
)]
Produces a string with the
BINARY
data type. See
Section 11.4.2, “The BINARY and VARBINARY Types” for a description of
how this affects comparisons. If the optional length
N
is given,
BINARY(
causes the cast to use no more than
N
)N
bytes of the argument. Values
shorter than N
bytes are padded
with 0x00
bytes to a length of
N
.
CHAR[(
N
)]
[charset_info
]
Produces a string with the
CHAR
data type. If the
optional length N
is given,
CHAR(
causes the cast to use no more than
N
)N
characters of the argument.
No padding occurs for values shorter than
N
characters.
With no charset_info
clause,
CHAR
produces a string with the default
character set. To specify the character set explicitly,
these charset_info
values are
permitted:
CHARACTER SET
:
Produces a string with the given character set.
charset_name
ASCII
: Shorthand for
CHARACTER SET latin1
.
UNICODE
: Shorthand for
CHARACTER SET ucs2
.
In all cases, the string has the default collation for the character set.
DATE
Produces a DATE
value.
DATETIME
Produces a DATETIME
value.
DECIMAL[(
M
[,D
])]
Produces a DECIMAL
value.
If the optional M
and
D
values are given, they
specify the maximum number of digits (the precision) and
the number of digits following the decimal point (the
scale).
JSON
Produces a JSON
value. For
details on the rules for conversion of values between
JSON
and other types, see
Comparison and Ordering of JSON Values.
NCHAR[(
N
)]
Like CHAR
, but produces a string with
the national character set. See
Section 10.3.7, “The National Character Set”.
Unlike CHAR
, NCHAR
does not permit trailing character set information to be
specified.
SIGNED [INTEGER]
Produces a signed integer value.
TIME
Produces a TIME
value.
UNSIGNED [INTEGER]
Produces an unsigned integer value.
Table 12.15 XML Functions
Name | Description |
---|---|
ExtractValue() |
Extract a value from an XML string using XPath notation |
UpdateXML() |
Return replaced XML fragment |
This section discusses XML and related functionality in MySQL.
It is possible to obtain XML-formatted output from MySQL in the
mysql and mysqldump
clients by invoking them with the
--xml
option. See
Section 4.5.1, “mysql — The MySQL Command-Line Client”, and Section 4.5.4, “mysqldump — A Database Backup Program”.
Two functions providing basic XPath 1.0 (XML Path Language, version 1.0) capabilities are available. Some basic information about XPath syntax and usage is provided later in this section; however, an in-depth discussion of these topics is beyond the scope of this manual, and you should refer to the XML Path Language (XPath) 1.0 standard for definitive information. A useful resource for those new to XPath or who desire a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.
These functions remain under development. We continue to improve these and other aspects of XML and XPath functionality in MySQL 8.0 and onwards. You may discuss these, ask questions about them, and obtain help from other users with them in the MySQL XML User Forum.
XPath expressions used with these functions support user variables and local stored program variables. User variables are weakly checked; variables local to stored programs are strongly checked (see also Bug #26518):
User variables (weak checking).
Variables using the syntax
$@
(that is, user variables) are not checked. No warnings or
errors are issued by the server if a variable has the wrong
type or has previously not been assigned a value. This also
means the user is fully responsible for any typographical
errors, since no warnings will be given if (for example)
variable_name
$@myvariable
is used where
$@myvariable
was intended.
Example:
mysql>SET @xml = '<a><b>X</b><b>Y</b></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SET @i =1, @j = 2;
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @i, ExtractValue(@xml, '//b[$@i]');
+------+--------------------------------+ | @i | ExtractValue(@xml, '//b[$@i]') | +------+--------------------------------+ | 1 | X | +------+--------------------------------+ 1 row in set (0.00 sec) mysql>SELECT @j, ExtractValue(@xml, '//b[$@j]');
+------+--------------------------------+ | @j | ExtractValue(@xml, '//b[$@j]') | +------+--------------------------------+ | 2 | Y | +------+--------------------------------+ 1 row in set (0.00 sec) mysql>SELECT @k, ExtractValue(@xml, '//b[$@k]');
+------+--------------------------------+ | @k | ExtractValue(@xml, '//b[$@k]') | +------+--------------------------------+ | NULL | | +------+--------------------------------+ 1 row in set (0.00 sec)
Variables in stored programs (strong checking).
Variables using the syntax
$
can be declared and used with these functions when they are
called inside stored programs. Such variables are local to
the stored program in which they are defined, and are
strongly checked for type and value.
variable_name
Example:
mysql>DELIMITER |
mysql>CREATE PROCEDURE myproc ()
->BEGIN
->DECLARE i INT DEFAULT 1;
->DECLARE xml VARCHAR(25) DEFAULT '<a>X</a><a>Y</a><a>Z</a>';
-> ->WHILE i < 4 DO
->SELECT xml, i, ExtractValue(xml, '//a[$i]');
->SET i = i+1;
->END WHILE;
->END |
Query OK, 0 rows affected (0.01 sec) mysql>DELIMITER ;
mysql>CALL myproc();
+--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 1 | X | +--------------------------+---+------------------------------+ 1 row in set (0.00 sec) +--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 2 | Y | +--------------------------+---+------------------------------+ 1 row in set (0.01 sec) +--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 3 | Z | +--------------------------+---+------------------------------+ 1 row in set (0.01 sec)
Parameters. Variables used in XPath expressions inside stored routines that are passed in as parameters are also subject to strong checking.
Expressions containing user variables or variables local to stored programs must otherwise (except for notation) conform to the rules for XPath expressions containing variables as given in the XPath 1.0 specification.
A user variable used to store an XPath expression is treated as an empty string. Because of this, it is not possible to store an XPath expression as a user variable. (Bug #32911)
ExtractValue(
xml_frag
,
xpath_expr
)
ExtractValue()
takes two string
arguments, a fragment of XML markup
xml_frag
and an XPath expression
xpath_expr
(also known as a
locator); it returns the
text (CDATA
) of the first text node which
is a child of the element or elements matched by the XPath
expression.
Using this function is the equivalent of performing a match
using the xpath_expr
after
appending /text()
. In other words,
ExtractValue('<a><b>Sakila</b></a>',
'/a/b')
and
ExtractValue('<a><b>Sakila</b></a>',
'/a/b/text()')
produce the same result.
If multiple matches are found, the content of the first child text node of each matching element is returned (in the order matched) as a single, space-delimited string.
If no matching text node is found for the expression
(including the implicit /text()
)—for
whatever reason, as long as
xpath_expr
is valid, and
xml_frag
consists of elements which
are properly nested and closed—an empty string is
returned. No distinction is made between a match on an empty
element and no match at all. This is by design.
If you need to determine whether no matching element was found
in xml_frag
or such an element was
found but contained no child text nodes, you should test the
result of an expression that uses the XPath
count()
function. For example, both of
these statements return an empty string, as shown here:
mysql>SELECT ExtractValue('<a><b/></a>', '/a/b');
+-------------------------------------+ | ExtractValue('<a><b/></a>', '/a/b') | +-------------------------------------+ | | +-------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a><c/></a>', '/a/b');
+-------------------------------------+ | ExtractValue('<a><c/></a>', '/a/b') | +-------------------------------------+ | | +-------------------------------------+ 1 row in set (0.00 sec)
However, you can determine whether there was actually a matching element using the following:
mysql>SELECT ExtractValue('<a><b/></a>', 'count(/a/b)');
+-------------------------------------+ | ExtractValue('<a><b/></a>', 'count(/a/b)') | +-------------------------------------+ | 1 | +-------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a><c/></a>', 'count(/a/b)');
+-------------------------------------+ | ExtractValue('<a><c/></a>', 'count(/a/b)') | +-------------------------------------+ | 0 | +-------------------------------------+ 1 row in set (0.01 sec)
ExtractValue()
returns only
CDATA
, and does not return any tags that
might be contained within a matching tag, nor any of their
content (see the result returned as val1
in the following example).
mysql>SELECT
->ExtractValue('<a>ccc<b>ddd</b></a>', '/a') AS val1,
->ExtractValue('<a>ccc<b>ddd</b></a>', '/a/b') AS val2,
->ExtractValue('<a>ccc<b>ddd</b></a>', '//b') AS val3,
->ExtractValue('<a>ccc<b>ddd</b></a>', '/b') AS val4,
->ExtractValue('<a>ccc<b>ddd</b><b>eee</b></a>', '//b') AS val5;
+------+------+------+------+---------+ | val1 | val2 | val3 | val4 | val5 | +------+------+------+------+---------+ | ccc | ddd | ddd | | ddd eee | +------+------+------+------+---------+
This function uses the current SQL collation for making
comparisons with contains()
, performing the
same collation aggregation as other string functions (such as
CONCAT()
), in taking into
account the collation coercibility of their arguments; see
Section 10.8.4, “Collation Coercibility in Expressions”, for an
explanation of the rules governing this behavior.
(Previously, binary—that is, case-sensitive—comparison was always used.)
NULL
is returned if
xml_frag
contains elements which
are not properly nested or closed, and a warning is generated,
as shown in this example:
mysql>SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+ | ExtractValue('<a>c</a><b', '//a') | +-----------------------------------+ | NULL | +-----------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS\G
*************************** 1. row *************************** Level: Warning Code: 1525 Message: Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+ | ExtractValue('<a>c</a><b/>', '//a') | +-------------------------------------+ | c | +-------------------------------------+ 1 row in set (0.00 sec)
UpdateXML(
xml_target
,
xpath_expr
,
new_xml
)
This function replaces a single portion of a given fragment of
XML markup xml_target
with a new
XML fragment new_xml
, and then
returns the changed XML. The portion of
xml_target
that is replaced matches
an XPath expression xpath_expr
supplied by the user.
If no expression matching
xpath_expr
is found, or if multiple
matches are found, the function returns the original
xml_target
XML fragment. All three
arguments should be strings.
mysql>SELECT
->UpdateXML('<a><b>ccc</b><d></d></a>', '/a', '<e>fff</e>') AS val1,
->UpdateXML('<a><b>ccc</b><d></d></a>', '/b', '<e>fff</e>') AS val2,
->UpdateXML('<a><b>ccc</b><d></d></a>', '//b', '<e>fff</e>') AS val3,
->UpdateXML('<a><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val4,
->UpdateXML('<a><d></d><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val5
->\G
*************************** 1. row *************************** val1: <e>fff</e> val2: <a><b>ccc</b><d></d></a> val3: <a><e>fff</e><d></d></a> val4: <a><b>ccc</b><e>fff</e></a> val5: <a><d></d><b>ccc</b><d></d></a>
A discussion in depth of XPath syntax and usage are beyond the scope of this manual. Please see the XML Path Language (XPath) 1.0 specification for definitive information. A useful resource for those new to XPath or who are wishing a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.
Descriptions and examples of some basic XPath expressions follow:
/
tag
Matches
<
if
and only if
tag
/><
is
the root element.
tag
/>
Example: /a
has a match in
<a><b/></a>
because it
matches the outermost (root) tag. It does not match the inner
a
element in
<b><a/></b>
because in
this instance it is the child of another element.
/
tag1
/tag2
Matches
<
if
and only if it is a child of
tag2
/><
,
and
tag1
/><
is
the root element.
tag1
/>
Example: /a/b
matches the
b
element in the XML fragment
<a><b/></a>
because it is
a child of the root element a
. It
does not have a match in
<b><a/></b>
because in
this case, b
is the root element
(and hence the child of no other element). Nor does the XPath
expression have a match in
<a><c><b/></c></a>
;
here, b
is a descendant of
a
, but not actually a child of
a
.
This construct is extendable to three or more elements. For
example, the XPath expression /a/b/c
matches the c
element in the
fragment
<a><b><c/></b></a>
.
//
tag
Matches any instance of
<
.
tag
>
Example: //a
matches the
a
element in any of the following:
<a><b><c/></b></a>
;
<c><a><b/></a></b>
;
<c><b><a/></b></c>
.
//
can be combined with
/
. For example, //a/b
matches the b
element in either of
the fragments <a><b/></a>
or
<c><a><b/></a></c>
.
//
is the
equivalent of
tag
/descendant-or-self::*/
.
A common error is to confuse this with
tag
/descendant-or-self::
,
although the latter expression can actually lead to very
different results, as can be seen here:
tag
mysql>SET @xml = '<a><b><c>w</c><b>x</b><d>y</d>z</b></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @xml;
+-----------------------------------------+ | @xml | +-----------------------------------------+ | <a><b><c>w</c><b>x</b><d>y</d>z</b></a> | +-----------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '//b[1]');
+------------------------------+ | ExtractValue(@xml, '//b[1]') | +------------------------------+ | x z | +------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '//b[2]');
+------------------------------+ | ExtractValue(@xml, '//b[2]') | +------------------------------+ | | +------------------------------+ 1 row in set (0.01 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::*/b[1]');
+---------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::*/b[1]') | +---------------------------------------------------+ | x z | +---------------------------------------------------+ 1 row in set (0.06 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::*/b[2]');
+---------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::*/b[2]') | +---------------------------------------------------+ | | +---------------------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::b[1]');
+-------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::b[1]') | +-------------------------------------------------+ | z | +-------------------------------------------------+ 1 row in set (0.00 sec) mysql>SELECT ExtractValue(@xml, '/descendant-or-self::b[2]');
+-------------------------------------------------+ | ExtractValue(@xml, '/descendant-or-self::b[2]') | +-------------------------------------------------+ | x | +-------------------------------------------------+ 1 row in set (0.00 sec)
The *
operator acts as a
“wildcard” that matches any element. For example,
the expression /*/b
matches the
b
element in either of the XML
fragments <a><b/></a>
or
<c><b/></c>
. However, the
expression does not produce a match in the fragment
<b><a/></b>
because
b
must be a child of some other
element. The wildcard may be used in any position: The
expression /*/b/*
will match any child of a
b
element that is itself not the
root element.
You can match any of several locators using the
|
(UNION
)
operator. For example, the expression
//b|//c
matches all
b
and c
elements in the XML target.
It is also possible to match an element based on the value of
one or more of its attributes. This done using the syntax
.
For example, the expression tag
[@attribute
="value
"]//b[@id="idB"]
matches the second b
element in the
fragment <a><b id="idA"/><c/><b
id="idB"/></a>
. To match against
any element having
,
use the XPath expression
attribute
="value
"//*[
.
attribute
="value
"]
To filter multiple attribute values, simply use multiple
attribute-comparison clauses in succession. For example, the
expression //b[@c="x"][@d="y"]
matches the
element <b c="x" d="y"/>
occurring
anywhere in a given XML fragment.
To find elements for which the same attribute matches any of
several values, you can use multiple locators joined by the
|
operator. For example, to match all
b
elements whose
c
attributes have either of the
values 23 or 17, use the expression
//b[@c="23"]|//b[@c="17"]
. You can also use
the logical or
operator for this purpose:
//b[@c="23" or @c="17"]
.
The difference between or
and
|
is that or
joins
conditions, while |
joins result sets.
XPath Limitations. The XPath syntax supported by these functions is currently subject to the following limitations:
Nodeset-to-nodeset comparison (such as
'/a/b[@c=@d]'
) is not supported.
All of the standard XPath comparison operators are supported. (Bug #22823)
Relative locator expressions are resolved in the context of the root node. For example, consider the following query and result:
mysql>SELECT ExtractValue(
->'<a><b c="1">X</b><b c="2">Y</b></a>',
->'a/b'
->) AS result;
+--------+ | result | +--------+ | X Y | +--------+ 1 row in set (0.03 sec)
In this case, the locator a/b
resolves to
/a/b
.
Relative locators are also supported within predicates. In the
following example, d[../@c="1"]
is resolved
as /a/b[@c="1"]/d
:
mysql>SELECT ExtractValue(
->'<a>
-><b c="1"><d>X</d></b>
-><b c="2"><d>X</d></b>
-></a>',
->'a/b/d[../@c="1"]')
->AS result;
+--------+ | result | +--------+ | X | +--------+ 1 row in set (0.00 sec)
Locators prefixed with expressions that evaluate as scalar values—including variable references, literals, numbers, and scalar function calls—are not permitted, and their use results in an error.
The ::
operator is not supported in
combination with node types such as the following:
axis
::comment()
axis
::text()
axis
::processing-instructions()
axis
::node()
However, name tests (such as
and axis
::name
) are
supported, as shown in these examples:
axis
::*
mysql>SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b');
+-------------------------------------------------------+ | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b') | +-------------------------------------------------------+ | x | +-------------------------------------------------------+ 1 row in set (0.02 sec) mysql>SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*');
+-------------------------------------------------------+ | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*') | +-------------------------------------------------------+ | x y | +-------------------------------------------------------+ 1 row in set (0.01 sec)
“Up-and-down” navigation is not supported in cases where the path would lead “above” the root element. That is, you cannot use expressions which match on descendants of ancestors of a given element, where one or more of the ancestors of the current element is also an ancestor of the root element (see Bug #16321).
The following XPath functions are not supported, or have known issues as indicated:
id()
lang()
local-name()
name()
namespace-uri()
normalize-space()
starts-with()
string()
substring-after()
substring-before()
translate()
The following axes are not supported:
following-sibling
following
preceding-sibling
preceding
XPath expressions passed as arguments to
ExtractValue()
and
UpdateXML()
may contain the colon
character (:
) in element selectors, which
enables their use with markup employing XML namespaces notation.
For example:
mysql>SET @xml = '<a>111<b:c>222<d>333</d><e:f>444</e:f></b:c></a>';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT ExtractValue(@xml, '//e:f');
+-----------------------------+ | ExtractValue(@xml, '//e:f') | +-----------------------------+ | 444 | +-----------------------------+ 1 row in set (0.00 sec) mysql>SELECT UpdateXML(@xml, '//b:c', '<g:h>555</g:h>');
+--------------------------------------------+ | UpdateXML(@xml, '//b:c', '<g:h>555</g:h>') | +--------------------------------------------+ | <a>111<g:h>555</g:h></a> | +--------------------------------------------+ 1 row in set (0.00 sec)
This is similar in some respects to what is permitted by
Apache Xalan and
some other parsers, and is much simpler than requiring namespace
declarations or the use of the namespace-uri()
and local-name()
functions.
Error handling.
For both ExtractValue()
and
UpdateXML()
, the XPath locator
used must be valid and the XML to be searched must consist of
elements which are properly nested and closed. If the locator is
invalid, an error is generated:
mysql> SELECT ExtractValue('<a>c</a><b/>', '/&a');
ERROR 1105 (HY000): XPATH syntax error: '&a'
If xml_frag
does not consist of
elements which are properly nested and closed,
NULL
is returned and a warning is generated, as
shown in this example:
mysql>SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+ | ExtractValue('<a>c</a><b', '//a') | +-----------------------------------+ | NULL | +-----------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS\G
*************************** 1. row *************************** Level: Warning Code: 1525 Message: Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' 1 row in set (0.00 sec) mysql>SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+ | ExtractValue('<a>c</a><b/>', '//a') | +-------------------------------------+ | c | +-------------------------------------+ 1 row in set (0.00 sec)
The replacement XML used as the third argument to
UpdateXML()
is
not checked to determine whether it
consists solely of elements which are properly nested and
closed.
XPath Injection. code injection occurs when malicious code is introduced into the system to gain unauthorized access to privileges and data. It is based on exploiting assumptions made by developers about the type and content of data input from users. XPath is no exception in this regard.
A common scenario in which this can happen is the case of application which handles authorization by matching the combination of a login name and password with those found in an XML file, using an XPath expression like this one:
//user[login/text()='neapolitan' and password/text()='1c3cr34m']/attribute::id
This is the XPath equivalent of an SQL statement like this one:
SELECT id FROM users WHERE login='neapolitan' AND password='1c3cr34m';
A PHP application employing XPath might handle the login process like this:
<?php $file = "users.xml"; $login = $POST["login"]; $password = $POST["password"]; $xpath = "//user[login/text()=$login and password/text()=$password]/attribute::id"; if( file_exists($file) ) { $xml = simplexml_load_file($file); if($result = $xml->xpath($xpath)) echo "You are now logged in as user $result[0]."; else echo "Invalid login name or password."; } else exit("Failed to open $file."); ?>
No checks are performed on the input. This means that a malevolent
user can “short-circuit” the test by entering
' or 1=1
for both the login name and password,
resulting in $xpath
being evaluated as shown
here:
//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id
Since the expression inside the square brackets always evaluates
as true
, it is effectively the same as this
one, which matches the id
attribute of every
user
element in the XML document:
//user/attribute::id
One way in which this particular attack can be circumvented is
simply by quoting the variable names to be interpolated in the
definition of $xpath
, forcing the values passed
from a Web form to be converted to strings:
$xpath = "//user[login/text()='$login' and password/text()='$password']/attribute::id";
This is the same strategy that is often recommended for preventing SQL injection attacks. In general, the practices you should follow for preventing XPath injection attacks are the same as for preventing SQL injection:
Never accepted untested data from users in your application.
Check all user-submitted data for type; reject or convert data that is of the wrong type
Test numeric data for out of range values; truncate, round, or reject values that are out of range. Test strings for illegal characters and either strip them out or reject input containing them.
Do not output explicit error messages that might provide an unauthorized user with clues that could be used to compromise the system; log these to a file or database table instead.
Just as SQL injection attacks can be used to obtain information about database schemas, so can XPath injection be used to traverse XML files to uncover their structure, as discussed in Amit Klein's paper Blind XPath Injection (PDF file, 46KB).
It is also important to check the output being sent back to the
client. Consider what can happen when we use the MySQL
ExtractValue()
function:
mysql>SELECT ExtractValue(
->LOAD_FILE('users.xml'),
->'//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
->) AS id;
+-------------------------------+ | id | +-------------------------------+ | 00327 13579 02403 42354 28570 | +-------------------------------+ 1 row in set (0.01 sec)
Because ExtractValue()
returns
multiple matches as a single space-delimited string, this
injection attack provides every valid ID contained within
users.xml
to the user as a single row of
output. As an extra safeguard, you should also test output before
returning it to the user. Here is a simple example:
mysql>SELECT @id = ExtractValue(
->LOAD_FILE('users.xml'),
->'//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
->);
Query OK, 0 rows affected (0.00 sec) mysql>SELECT IF(
->INSTR(@id, ' ') = 0,
->@id,
->'Unable to retrieve user ID')
->AS singleID;
+----------------------------+ | singleID | +----------------------------+ | Unable to retrieve user ID | +----------------------------+ 1 row in set (0.00 sec)
In general, the guidelines for returning data to users securely are the same as for accepting user input. These can be summed up as:
Always test outgoing data for type and permissible values.
Never permit unauthorized users to view error messages that might provide information about the application that could be used to exploit it.
Bit functions and operators comprise
BIT_COUNT()
,
BIT_AND()
,
BIT_OR()
,
BIT_XOR()
,
&
,
|
,
^
,
~
,
<<
, and
>>
.
(The BIT_AND()
,
BIT_OR()
, and
BIT_XOR()
aggregate functions are
described in Section 12.20.1, “Aggregate (GROUP BY) Function Descriptions”.) Prior to MySQL
8.0, bit functions and operators required
BIGINT
(64-bit integer) arguments
and returned BIGINT
values, so they
had a maximum range of 64 bits.
Non-BIGINT
arguments were converted
to BIGINT
prior to performing the
operation and truncation could occur.
In MySQL 8.0, bit functions and operators permit
binary string type arguments
(BINARY
,
VARBINARY
, and the
BLOB
types) and return a value of
like type, which enables them to take arguments and produce return
values larger than 64 bits. Nonbinary string arguments are
converted to BIGINT
and processed
as such, as before.
An implication of this change in behavior is that bit operations on binary string arguments might produce a different result in MySQL 8.0 than in 5.7. For information about how to prepare in MySQL 5.7 for potential incompatibilities between MySQL 5.7 and 8.0, see Bit Functions and Operators, in MySQL 5.7 Reference Manual.
Bit operations prior to MySQL 8.0 handle only unsigned 64-bit
integer argument and result values (that is, unsigned
BIGINT
values). Conversion of
arguments of other types to
BIGINT
occurs as necessary.
Examples:
This statement operates on numeric literals, treated as unsigned 64-bit integers:
mysql> SELECT 127 | 128, 128 << 2, BIT_COUNT(15);
+-----------+----------+---------------+
| 127 | 128 | 128 << 2 | BIT_COUNT(15) |
+-----------+----------+---------------+
| 255 | 512 | 4 |
+-----------+----------+---------------+
This statement performs to-number conversions on the string
arguments ('127'
to
127
, and so forth) before performing the
same operations as the first statement and producing the
same results:
mysql> SELECT '127' | '128', '128' << 2, BIT_COUNT('15');
+---------------+------------+-----------------+
| '127' | '128' | '128' << 2 | BIT_COUNT('15') |
+---------------+------------+-----------------+
| 255 | 512 | 4 |
+---------------+------------+-----------------+
This statement uses hexadecimal literals for the bit-operation arguments. MySQL by default treats hexadecimal literals as binary strings, but in numeric context evaluates them as numbers (see Section 9.1.4, “Hexadecimal Literals”). Prior to MySQL 8.0, numeric context includes bit operations. Examples:
mysql> SELECT X'7F' | X'80', X'80' << 2, BIT_COUNT(X'0F');
+---------------+------------+------------------+
| X'7F' | X'80' | X'80' << 2 | BIT_COUNT(X'0F') |
+---------------+------------+------------------+
| 255 | 512 | 4 |
+---------------+------------+------------------+
Handling of bit-value literals in bit operations is similar to hexadecimal literals (that is, as numbers).
MySQL 8.0 extends bit operations to handle binary string arguments directly (without conversion) and produce binary string results. (Arguments that are not integers or binary strings are still converted to integers, as before.) This extension enhances bit operations in the following ways:
Bit operations become possible on values longer than 64 bits.
It is easier to perform bit operations on values that are more naturally represented as binary strings than as integers.
For example, consider UUID values and IPv6 addresses, which have human-readable text formats like this:
UUID: 6ccd780c-baba-1026-9564-5b8c656024db IPv6: fe80::219:d1ff:fe91:1a72
It is cumbersome to operate on text strings in those formats. An
alternative is convert them to fixed-length binary strings
without delimiters. UUID_TO_BIN()
and INET6_ATON()
each produce a
value of data type BINARY(16)
, a
binary string 16 bytes (128 bits) long. The following statements
illustrate this (HEX()
is used to produce
displayable values):
mysql>SELECT HEX(UUID_TO_BIN('6ccd780c-baba-1026-9564-5b8c656024db'));
+----------------------------------------------------------+ | HEX(UUID_TO_BIN('6ccd780c-baba-1026-9564-5b8c656024db')) | +----------------------------------------------------------+ | 6CCD780CBABA102695645B8C656024DB | +----------------------------------------------------------+ mysql>SELECT HEX(INET6_ATON('fe80::219:d1ff:fe91:1a72'));
+---------------------------------------------+ | HEX(INET6_ATON('fe80::219:d1ff:fe91:1a72')) | +---------------------------------------------+ | FE800000000000000219D1FFFE911A72 | +---------------------------------------------+
Those binary values are easily manipulable with bit operations to perform actions such as extracting the timestamp from UUID values, or extracting the network and host parts of IPv6 addresses. (For examples, see later in this discussion.)
Arguments that count as binary strings include column values,
routine parameters, local variables, and user-defined variables
that have a binary string type:
BINARY
,
VARBINARY
, or one of the
BLOB
types.
What about hexadecimal literals and bit literals? Recall that those are binary strings by default in MySQL, but numbers in numeric context. How are they handled for bit operations in MySQL 8.0? Does MySQL continue to evaluate them in numeric context, as is done prior to MySQL 8.0? Or do bit operations evaluate them as binary strings, now that binary strings can be handled “natively” without conversion?
Answer: It has been common to specify arguments to bit
operations using hexadecimal literals or bit literals with the
intent that they represent numbers, so MySQL continues to
evaluate bit operations in numeric context when all bit
arguments are hexadecimal or bit literals, for backward
compatility. If you require evaluation as binary strings
instead, that is easily accomplished: Use the
_binary
introducer for at least one literal.
These bit operations evaluate the hexadecimal literals and bit literals as integers:
mysql> SELECT X'40' | X'01', b'11110001' & b'01001111';
+---------------+---------------------------+
| X'40' | X'01' | b'11110001' & b'01001111' |
+---------------+---------------------------+
| 65 | 65 |
+---------------+---------------------------+
These bit operations evaluate the hexadecimal literals and
bit literals as binary strings, due to the
_binary
introducer:
mysql> SELECT _binary X'40' | X'01', b'11110001' & _binary b'01001111';
+-----------------------+-----------------------------------+
| _binary X'40' | X'01' | b'11110001' & _binary b'01001111' |
+-----------------------+-----------------------------------+
| A | A |
+-----------------------+-----------------------------------+
Although the bit operations in both statements produce a result
with a numeric value of 65, the second statement operates in
binary-string context, for which 65 is ASCII
A
.
In numeric evaluation context, permitted values of hexadecimal literal and bit literal arguments have a maximum of 64 bits, as do results. By contrast, in binary-string evaluation context, permitted arguments (and results) can exceed 64 bits:
mysql> SELECT _binary X'4040404040404040' | X'0102030405060708';
+---------------------------------------------------+
| _binary X'4040404040404040' | X'0102030405060708' |
+---------------------------------------------------+
| ABCDEFGH |
+---------------------------------------------------+
There are several ways to refer to a hexadecimal literal or bit literal in a bit operation to cause binary-string evaluation:
_binaryliteral
BINARYliteral
CAST(literal
AS BINARY)
Another way to produce binary-string evaluation of hexadecimal literals or bit literals is to assign them to user-defined variables, which results in variables that have a binary string type:
mysql>SET @v1 = X'40', @v2 = X'01', @v3 = b'11110001', @v4 = b'01001111';
mysql>SELECT @v1 | @v2, @v3 & @v4;
+-----------+-----------+ | @v1 | @v2 | @v3 & @v4 | +-----------+-----------+ | A | A | +-----------+-----------+
In binary-string context, bitwise operation arguments must have
the same length or an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs:
mysql> SELECT _binary X'40' | X'0001';
ERROR 3513 (HY000): Binary operands of bitwise
operators must be of equal length
To satisfy the equal-length requirement, pad the shorter value with leading zero digits or, if the longer value begins with leading zero digits and a shorter result value is acceptable, strip them:
mysql>SELECT _binary X'0040' | X'0001';
+---------------------------+ | _binary X'0040' | X'0001' | +---------------------------+ | A | +---------------------------+ mysql>SELECT _binary X'40' | X'01';
+-----------------------+ | _binary X'40' | X'01' | +-----------------------+ | A | +-----------------------+
Padding or stripping can also be accomplished using functions
such as LPAD()
,
RPAD()
,
SUBSTR()
, or
CAST()
. In such cases, the
expression arguments are no longer all literals and
_binary
becomes unnecessary. Examples:
mysql>SELECT LPAD(X'40', 2, X'00') | X'0001';
+---------------------------------+ | LPAD(X'40', 2, X'00') | X'0001' | +---------------------------------+ | A | +---------------------------------+ mysql>SELECT X'40' | SUBSTR(X'0001', 2, 1);
+-------------------------------+ | X'40' | SUBSTR(X'0001', 2, 1) | +-------------------------------+ | A | +-------------------------------+
The following example illustrates use of bit operations to extract parts of a UUID value, in this case, the timestamp and IEEE 802 node number. This technique requires bitmasks for each extracted part.
Convert the text UUID to the corresponding 16-byte binary value so that it can be manipulated using bit operations in binary-string context:
mysql>SET @uuid = UUID_TO_BIN('6ccd780c-baba-1026-9564-5b8c656024db');
mysql>SELECT HEX(@uuid);
+----------------------------------+ | HEX(@uuid) | +----------------------------------+ | 6CCD780CBABA102695645B8C656024DB | +----------------------------------+
Construct bitmasks for the timestamp and node number parts of the value. The timestamp comprises the first three parts (64 bits, bits 0 to 63) and the node number is the last part (48 bits, bits 80 to 127):
mysql>SET @ts_mask = CAST(X'FFFFFFFFFFFFFFFF' AS BINARY(16));
mysql>SET @node_mask = CAST(X'FFFFFFFFFFFF' AS BINARY(16)) >> 80;
mysql>SELECT HEX(@ts_mask);
+----------------------------------+ | HEX(@ts_mask) | +----------------------------------+ | FFFFFFFFFFFFFFFF0000000000000000 | +----------------------------------+ mysql>SELECT HEX(@node_mask);
+----------------------------------+ | HEX(@node_mask) | +----------------------------------+ | 00000000000000000000FFFFFFFFFFFF | +----------------------------------+
The CAST(... AS BINARY(16))
function is used
here because the masks must be the same length as the UUID value
against which they are applied. The same result can be produced
using other functions to pad the masks to the required length:
SET @ts_mask= RPAD(X'FFFFFFFFFFFFFFFF' , 16, X'00'); SET @node_mask = LPAD(X'FFFFFFFFFFFF', 16, X'00') ;
Use the masks to extract the timestamp and node number parts:
mysql>SELECT HEX(@uuid & @ts_mask) AS 'timestamp part';
+----------------------------------+ | timestamp part | +----------------------------------+ | 6CCD780CBABA10260000000000000000 | +----------------------------------+ mysql>SELECT HEX(@uuid & @node_mask) AS 'node part';
+----------------------------------+ | node part | +----------------------------------+ | 000000000000000000005B8C656024DB | +----------------------------------+
The preceding example uses these bit operations: right shift
(>>
)
and bitwise AND
(&
).
UUID_TO_BIN()
takes a flag that
causes some bit rearrangement in the resulting binary UUID
value. If you use that flag, modify the extraction masks
accordingly.
The next example uses bit operations to extract the network and host parts of an IPv6 address. Suppose that the network part has a length of 80 bits. Then the host part has a length of 128 − 80 = 48 bits. To extract the network and host parts of the address, convert it to a binary string, then use bit operations in binary-string context.
Convert the text IPv6 address to the corresponding binary string:
mysql> SET @ip = INET6_ATON('fe80::219:d1ff:fe91:1a72');
Define the network length in bits:
mysql> SET @net_len = 80;
Construct network and host masks by shifting the all-ones
address left or right. To do this, begin with the address
::
, which is shorthand for all zeros, as you
can see by converting it to a binary string like this:
mysql> SELECT HEX(INET6_ATON('::')) AS 'all zeros';
+----------------------------------+
| all zeros |
+----------------------------------+
| 00000000000000000000000000000000 |
+----------------------------------+
To produce the complementary value (all ones), use the
~
operator to invert the bits:
mysql> SELECT HEX(~INET6_ATON('::')) AS 'all ones';
+----------------------------------+
| all ones |
+----------------------------------+
| FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF |
+----------------------------------+
Shift the all-ones value left or right to produce the network and host masks:
mysql>SET @net_mask = ~INET6_ATON('::') << (128 - @net_len);
mysql>SET @host_mask = ~INET6_ATON('::') >> @net_len;
Display the masks to verify that they cover the correct parts of the address:
mysql>SELECT INET6_NTOA(@net_mask) AS 'network mask';
+----------------------------+ | network mask | +----------------------------+ | ffff:ffff:ffff:ffff:ffff:: | +----------------------------+ mysql>SELECT INET6_NTOA(@host_mask) AS 'host mask';
+------------------------+ | host mask | +------------------------+ | ::ffff:255.255.255.255 | +------------------------+
Extract and display the network and host parts of the address:
mysql>SET @net_part = @ip & @net_mask;
mysql>SET @host_part = @ip & @host_mask;
mysql>SELECT INET6_NTOA(@net_part) AS 'network part';
+-----------------+ | network part | +-----------------+ | fe80::219:0:0:0 | +-----------------+ mysql>SELECT INET6_NTOA(@host_part) AS 'host part';
+------------------+ | host part | +------------------+ | ::d1ff:fe91:1a72 | +------------------+
The preceding example uses these bit operations: Complement
(~
),
left shift
(<<
),
and bitwise AND
(&
).
The remaining discussion provides details on argument handling for each group of bit operations, more information about literal-value handling in bit operations, and potential incompatibilities between MySQL 8.0 and older MySQL versions.
For &
,
|
, and
^
bit
operations, the result type depends on whether the arguments are
evaluated as binary strings or numbers:
Binary-string evaluation occurs when the arguments have a
binary string type, and at least one of them is not a
hexadecimal literal, bit literal, or NULL
literal. Numeric evaluation occurs otherwise, with argument
conversion to unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the
same length as the arguments. If the arguments have unequal
lengths, an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs. Numeric evaluation produces an unsigned 64-bit
integer.
Examples of numeric evaluation:
mysql> SELECT 64 | 1, X'40' | X'01';
+--------+---------------+
| 64 | 1 | X'40' | X'01' |
+--------+---------------+
| 65 | 65 |
+--------+---------------+
Examples of binary-string evaluation:
mysql>SELECT _binary X'40' | X'01';
+-----------------------+ | _binary X'40' | X'01' | +-----------------------+ | A | +-----------------------+ mysql>SET @var1 = X'40', @var2 = X'01';
mysql>SELECT @var1 | @var2;
+---------------+ | @var1 | @var2 | +---------------+ | A | +---------------+
For ~
,
<<
,
and
>>
bit operations, the result type depends on whether the bit
argument is evaluated as a binary string or number:
Binary-string evaluation occurs when the bit argument has a
binary string type, and is not a hexadecimal literal, bit
literal, or NULL
literal. Numeric
evaluation occurs otherwise, with argument conversion to an
unsigned 64-bit integer as necessary.
Binary-string evaluation produces a binary string of the same length as the bit argument. Numeric evaluation produces an unsigned 64-bit integer.
For shift operations, bits shifted off the end of the value are lost without warning, regardless of the argument type. In particular, if the shift count is greater or equal to the number of bits in the bit argument, all bits in the result are 0.
Examples of numeric evaluation:
mysql> SELECT ~0, 64 << 2, X'40' << 2;
+----------------------+---------+------------+
| ~0 | 64 << 2 | X'40' << 2 |
+----------------------+---------+------------+
| 18446744073709551615 | 256 | 256 |
+----------------------+---------+------------+
Examples of binary-string evaluation:
mysql>SELECT HEX(_binary X'1111000022220000' >> 16);
+----------------------------------------+ | HEX(_binary X'1111000022220000' >> 16) | +----------------------------------------+ | 0000111100002222 | +----------------------------------------+ mysql>SELECT HEX(_binary X'1111000022220000' << 16);
+----------------------------------------+ | HEX(_binary X'1111000022220000' << 16) | +----------------------------------------+ | 0000222200000000 | +----------------------------------------+ mysql>SET @var1 = X'F0F0F0F0';
mysql>SELECT HEX(~@var1);
+-------------+ | HEX(~@var1) | +-------------+ | 0F0F0F0F | +-------------+
The BIT_COUNT()
function always
returns an unsigned 64-bit integer, or NULL
if the argument is NULL
.
mysql>SELECT BIT_COUNT(127);
+----------------+ | BIT_COUNT(127) | +----------------+ | 7 | +----------------+ mysql>SELECT BIT_COUNT(b'010101'), BIT_COUNT(_binary b'010101');
+----------------------+------------------------------+ | BIT_COUNT(b'010101') | BIT_COUNT(_binary b'010101') | +----------------------+------------------------------+ | 3 | 3 | +----------------------+------------------------------+
For the BIT_AND()
,
BIT_OR()
, and
BIT_XOR()
bit functions, the
result type depends on whether the function argument values are
evaluated as binary strings or numbers:
Binary-string evaluation occurs when the argument values
have a binary string type, and the argument is not a
hexadecimal literal, bit literal, or NULL
literal. Numeric evaluation occurs otherwise, with argument
value conversion to unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the
same length as the argument values. If argument values have
unequal lengths, an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs. If the argument size exceeds 511 bytes, an
ER_INVALID_BITWISE_AGGREGATE_OPERANDS_SIZE
error occurs. Numeric evaluation produces an unsigned 64-bit
integer.
NULL
values do not affect the result unless
all values are NULL
. In that case, the result
is a neutral value having the same length as the length of the
argument values (all bits 1 for
BIT_AND()
, all bits 0 for
BIT_OR()
, and
BIT_XOR()
).
Example:
mysql>CREATE TABLE t (group_id INT, a VARBINARY(6));
mysql>INSERT INTO t VALUES (1, NULL);
mysql>INSERT INTO t VALUES (1, NULL);
mysql>INSERT INTO t VALUES (2, NULL);
mysql>INSERT INTO t VALUES (2, X'1234');
mysql>INSERT INTO t VALUES (2, X'FF34');
mysql>SELECT HEX(BIT_AND(a)), HEX(BIT_OR(a)), HEX(BIT_XOR(a))
FROM t GROUP BY group_id;
+-----------------+----------------+-----------------+ | HEX(BIT_AND(a)) | HEX(BIT_OR(a)) | HEX(BIT_XOR(a)) | +-----------------+----------------+-----------------+ | FFFFFFFFFFFF | 000000000000 | 000000000000 | | 1234 | FF34 | ED00 | +-----------------+----------------+-----------------+
For backward compatibility, MySQL 8.0 evaluates bit operations
in numeric context when all bit arguments are hexadecimal
literals, bit literals, or NULL
literals.
That is, bit operations on binary-string bit arguments do not
use binary-string evaluation if all bit arguments are unadorned
hexadecimal literals, bit literals, or NULL
literals. (This does not apply to such literals if they are
written with a _binary
introducer,
BINARY
operator, or other way of
specifying them explicitly as binary strings.)
The literal handling just described is the same as prior to MySQL 8.0. Examples:
These bit operations evaluate the literals in numeric
context and produce a BIGINT
result:
b'0001' | b'0010' X'0008' << 8
These bit operations evaluate NULL
in
numeric context and produce a BIGINT
result that has a NULL
value:
NULL & NULL NULL >> 4
In MySQL 8.0, you can cause those operations to evaluate the arguments in binary-string context by indicating explicitly that at least one argument is a binary string:
_binary b'0001' | b'0010' _binary X'0008' << 8 BINARY NULL & NULL BINARY NULL >> 4
The result of the last two expressions is
NULL
, just as without the
BINARY
operator, but the data type of the
result is a binary string type rather than an integer type.
Because bit operations can handle binary string arguments natively in MySQL 8.0, some expressions produce a different result in MySQL 8.0 than in 5.7. The five problematic expression types to watch out for are:
nonliteral_binary
{ & | ^ }binary
binary
{ & | ^ }nonliteral_binary
nonliteral_binary
{ << >> }anything
~nonliteral_binary
AGGR_BIT_FUNC
(nonliteral_binary
)
Those expressions return BIGINT
in MySQL 5.7, binary string in 8.0.
Explanation of notation:
{
: List of
operators that apply to the given expression type.
op1
op2
... }
binary
: Any kind of binary string
argument, including a hexadecimal literal, bit literal, or
NULL
literal.
nonliteral_binary
: An argument
that is a binary string value other than a hexadecimal
literal, bit literal, or NULL
literal.
AGGR_BIT_FUNC
: An aggregate
function that takes bit-value arguments:
BIT_AND()
,
BIT_OR()
,
BIT_XOR()
.
For information about how to prepare in MySQL 5.7 for potential incompatibilities between MySQL 5.7 and 8.0, see Bit Functions and Operators, in MySQL 5.7 Reference Manual.
The following list describes available bit functions and operators:
Bitwise OR.
The result type depends on whether the arguments are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the arguments have
a binary string type, and at least one of them is not a
hexadecimal literal, bit literal, or
NULL
literal. Numeric evaluation
occurs otherwise, with argument conversion to unsigned
64-bit integers as necessary.
Binary-string evaluation produces a binary string of the
same length as the arguments. If the arguments have
unequal lengths, an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs. Numeric evaluation produces an unsigned
64-bit integer.
For more information, see the introductory discussion in this section.
mysql>SELECT 29 | 15;
-> 31 mysql>SELECT _binary X'40404040' | X'01020304';
-> 'ABCD'
Bitwise AND.
The result type depends on whether the arguments are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the arguments have
a binary string type, and at least one of them is not a
hexadecimal literal, bit literal, or
NULL
literal. Numeric evaluation
occurs otherwise, with argument conversion to unsigned
64-bit integers as necessary.
Binary-string evaluation produces a binary string of the
same length as the arguments. If the arguments have
unequal lengths, an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs. Numeric evaluation produces an unsigned
64-bit integer.
For more information, see the introductory discussion in this section.
mysql>SELECT 29 & 15;
-> 13 mysql>SELECT HEX(_binary X'FF' & b'11110000');
-> 'F0'
Bitwise XOR.
The result type depends on whether the arguments are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the arguments have
a binary string type, and at least one of them is not a
hexadecimal literal, bit literal, or
NULL
literal. Numeric evaluation
occurs otherwise, with argument conversion to unsigned
64-bit integers as necessary.
Binary-string evaluation produces a binary string of the
same length as the arguments. If the arguments have
unequal lengths, an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs. Numeric evaluation produces an unsigned
64-bit integer.
For more information, see the introductory discussion in this section.
mysql>SELECT 1 ^ 1;
-> 0 mysql>SELECT 1 ^ 0;
-> 1 mysql>SELECT 11 ^ 3;
-> 8 mysql>SELECT HEX(_binary X'FEDC' ^ X'1111');
-> 'EFCD'
Shifts a longlong (BIGINT
)
number or binary string to the left.
The result type depends on whether the bit argument is evaluated as a binary string or number:
Binary-string evaluation occurs when the bit argument
has a binary string type, and is not a hexadecimal
literal, bit literal, or NULL
literal. Numeric evaluation occurs otherwise, with
argument conversion to an unsigned 64-bit integer as
necessary.
Binary-string evaluation produces a binary string of the same length as the bit argument. Numeric evaluation produces an unsigned 64-bit integer.
Bits shifted off the end of the value are lost without warning, regardless of the argument type. In particular, if the shift count is greater or equal to the number of bits in the bit argument, all bits in the result are 0.
For more information, see the introductory discussion in this section.
mysql>SELECT 1 << 2;
-> 4 mysql>SELECT HEX(_binary X'00FF00FF00FF' << 8);
-> 'FF00FF00FF00'
Shifts a longlong (BIGINT
)
number or binary string to the right.
The result type depends on whether the bit argument is evaluated as a binary string or number:
Binary-string evaluation occurs when the bit argument
has a binary string type, and is not a hexadecimal
literal, bit literal, or NULL
literal. Numeric evaluation occurs otherwise, with
argument conversion to an unsigned 64-bit integer as
necessary.
Binary-string evaluation produces a binary string of the same length as the bit argument. Numeric evaluation produces an unsigned 64-bit integer.
Bits shifted off the end of the value are lost without warning, regardless of the argument type. In particular, if the shift count is greater or equal to the number of bits in the bit argument, all bits in the result are 0.
For more information, see the introductory discussion in this section.
mysql>SELECT 4 >> 2;
-> 1 mysql>SELECT HEX(_binary X'00FF00FF00FF' >> 8);
-> '0000FF00FF00'
Invert all bits.
The result type depends on whether the bit argument is evaluated as a binary string or number:
Binary-string evaluation occurs when the bit argument
has a binary string type, and is not a hexadecimal
literal, bit literal, or NULL
literal. Numeric evaluation occurs otherwise, with
argument conversion to an unsigned 64-bit integer as
necessary.
Binary-string evaluation produces a binary string of the same length as the bit argument. Numeric evaluation produces an unsigned 64-bit integer.
For more information, see the introductory discussion in this section.
mysql>SELECT 5 & ~1;
-> 4 mysql>SELECT HEX(~X'0000FFFF1111EEEE');
-> 'FFFF0000EEEE1111'
Returns the number of bits that are set in the argument
N
as an unsigned 64-bit integer,
or NULL
if the argument is
NULL
.
mysql>SELECT BIT_COUNT(64), BIT_COUNT(BINARY 64);
-> 1, 7 mysql>SELECT BIT_COUNT('64'), BIT_COUNT(_binary '64');
-> 1, 7 mysql>SELECT BIT_COUNT(X'40'), BIT_COUNT(_binary X'40');
-> 1, 1
Table 12.17 Encryption Functions
Name | Description |
---|---|
AES_DECRYPT() |
Decrypt using AES |
AES_ENCRYPT() |
Encrypt using AES |
ASYMMETRIC_DECRYPT() |
Decrypt ciphertext using private or public key |
ASYMMETRIC_DERIVE() |
Derive symmetric key from asymmetric keys |
ASYMMETRIC_ENCRYPT() |
Encrypt cleartext using private or public key |
ASYMMETRIC_SIGN() |
Generate signature from digest |
ASYMMETRIC_VERIFY() |
Verify that signature matches digest |
COMPRESS() |
Return result as a binary string |
CREATE_ASYMMETRIC_PRIV_KEY() |
Create private key |
CREATE_ASYMMETRIC_PUB_KEY() |
Create public key |
CREATE_DH_PARAMETERS() |
Generate shared DH secret |
CREATE_DIGEST() |
Generate digest from string |
DECODE() |
Decode a string encrypted using ENCODE() |
DES_DECRYPT() |
Decrypt a string |
DES_ENCRYPT() |
Encrypt a string |
ENCODE() |
Encode a string |
ENCRYPT() |
Encrypt a string |
MD5() |
Calculate MD5 checksum |
PASSWORD() |
Calculate and return a password string |
RANDOM_BYTES() |
Return a random byte vector |
SHA1() , SHA() |
Calculate an SHA-1 160-bit checksum |
SHA2() |
Calculate an SHA-2 checksum |
STATEMENT_DIGEST() |
Compute statement digest hash value |
STATEMENT_DIGEST_TEXT() |
Compute normalized statement digest |
UNCOMPRESS() |
Uncompress a string compressed |
UNCOMPRESSED_LENGTH() |
Return the length of a string before compression |
VALIDATE_PASSWORD_STRENGTH() |
Determine strength of password |
Many encryption and compression functions return strings for which
the result might contain arbitrary byte values. If you want to
store these results, use a column with a
VARBINARY
or
BLOB
binary string data type. This
will avoid potential problems with trailing space removal or
character set conversion that would change data values, such as
may occur if you use a nonbinary string data type
(CHAR
,
VARCHAR
,
TEXT
).
Some encryption functions return strings of ASCII characters:
MD5()
,
SHA()
,
SHA1()
,
SHA2()
,
STATEMENT_DIGEST()
,
STATEMENT_DIGEST_TEXT()
. Their
return value is a string that has a character set and collation
determined by the
character_set_connection
and
collation_connection
system
variables. This is a nonbinary string unless the character set is
binary
.
If an application stores values from a function such as
MD5()
or
SHA1()
that returns a string of hex
digits, more efficient storage and comparisons can be obtained by
converting the hex representation to binary using
UNHEX()
and storing the result in a
BINARY(
column. Each pair of hexadecimal digits requires one byte in
binary form, so the value of N
)N
depends
on the length of the hex string. N
is
16 for an MD5()
value and 20 for a
SHA1()
value. For
SHA2()
,
N
ranges from 28 to 32 depending on the
argument specifying the desired bit length of the result.
The size penalty for storing the hex string in a
CHAR
column is at least two times,
up to eight times if the value is stored in a column that uses the
utf8
character set (where each character uses 4
bytes). Storing the string also results in slower comparisons
because of the larger values and the need to take character set
collation rules into account.
Suppose that an application stores
MD5()
string values in a
CHAR(32)
column:
CREATE TABLE md5_tbl (md5_val CHAR(32), ...); INSERT INTO md5_tbl (md5_val, ...) VALUES(MD5('abcdef'), ...);
To convert hex strings to more compact form, modify the
application to use UNHEX()
and
BINARY(16)
instead as follows:
CREATE TABLE md5_tbl (md5_val BINARY(16), ...); INSERT INTO md5_tbl (md5_val, ...) VALUES(UNHEX(MD5('abcdef')), ...);
Applications should be prepared to handle the very rare case that a hashing function produces the same value for two different input values. One way to make collisions detectable is to make the hash column a primary key.
Exploits for the MD5 and SHA-1 algorithms have become known. You
may wish to consider using another one-way encryption function
described in this section instead, such as
SHA2()
.
Passwords or other sensitive values supplied as arguments to encryption functions are sent in cleartext to the MySQL server unless an SSL connection is used. Also, such values will appear in any MySQL logs to which they are written. To avoid these types of exposure, applications can encrypt sensitive values on the client side before sending them to the server. The same considerations apply to encryption keys. To avoid exposing these, applications can use stored procedures to encrypt and decrypt values on the server side.
AES_DECRYPT(
crypt_str
,key_str
[,init_vector
])
This function decrypts data using the official AES (Advanced
Encryption Standard) algorithm. For more information, see the
description of AES_ENCRYPT()
.
The optional initialization vector argument,
init_vector
. Statements that use
AES_DECRYPT()
are unsafe for
statement-based replication.
AES_ENCRYPT(
str
,key_str
[,init_vector
])
AES_ENCRYPT()
and
AES_DECRYPT()
implement
encryption and decryption of data using the official AES
(Advanced Encryption Standard) algorithm, previously known as
“Rijndael.” The AES standard permits various key
lengths. By default these functions implement AES with a
128-bit key length. Key lengths of 196 or 256 bits can be
used, as described later. The key length is a trade off
between performance and security.
AES_ENCRYPT()
encrypts the
string str
using the key string
key_str
and returns a binary string
containing the encrypted output.
AES_DECRYPT()
decrypts the
encrypted string crypt_str
using
the key string key_str
and returns
the original cleartext string. If either function argument is
NULL
, the function returns
NULL
.
The str
and
crypt_str
arguments can be any
length, and padding is automatically added to
str
so it is a multiple of a block
as required by block-based algorithms such as AES. This
padding is automatically removed by the
AES_DECRYPT()
function. The
length of crypt_str
can be
calculated using this formula:
16 * (trunc(string_length
/ 16) + 1)
For a key length of 128 bits, the most secure way to pass a
key to the key_str
argument is to
create a truly random 128-bit value and pass it as a binary
value. For example:
INSERT INTO t VALUES (1,AES_ENCRYPT('text',UNHEX('F3229A0B371ED2D9441B830D21A390C3')));
A passphrase can be used to generate an AES key by hashing the passphrase. For example:
INSERT INTO t VALUES (1,AES_ENCRYPT('text', UNHEX(SHA2('My secret passphrase',512))));
Do not pass a password or passphrase directly to
crypt_str
, hash it first. Previous
versions of this documentation suggested the former approach,
but it is no longer recommended as the examples shown here are
more secure.
If AES_DECRYPT()
detects
invalid data or incorrect padding, it returns
NULL
. However, it is possible for
AES_DECRYPT()
to return a
non-NULL
value (possibly garbage) if the
input data or the key is invalid.
AES_ENCRYPT()
and
AES_DECRYPT()
permit control of
the block encryption mode and take an optional
init_vector
initialization vector
argument:
The block_encryption_mode
system variable controls the mode for block-based
encryption algorithms. Its default value is
aes-128-ecb
, which signifies encryption
using a key length of 128 bits and ECB mode. For a
description of the permitted values of this variable, see
Section 5.1.8, “Server System Variables”.
The optional init_vector
argument provides an initialization vector for block
encryption modes that require it.
For modes that require the optional
init_vector
argument, it must be 16
bytes or longer (bytes in excess of 16 are ignored). An error
occurs if init_vector
is missing.
For modes that do not require
init_vector
, it is ignored and a
warning is generated if it is specified.
A random string of bytes to use for the initialization vector
can be produced by calling
RANDOM_BYTES(16)
. For
encryption modes that require an initialization vector, the
same vector must be used for encryption and decryption.
mysql>SET block_encryption_mode = 'aes-256-cbc';
mysql>SET @key_str = SHA2('My secret passphrase',512);
mysql>SET @init_vector = RANDOM_BYTES(16);
mysql>SET @crypt_str = AES_ENCRYPT('text',@key_str,@init_vector);
mysql>SELECT AES_DECRYPT(@crypt_str,@key_str,@init_vector);
+-----------------------------------------------+ | AES_DECRYPT(@crypt_str,@key_str,@init_vector) | +-----------------------------------------------+ | text | +-----------------------------------------------+
The following table lists each permitted block encryption mode, the SSL libraries that support it, and whether the initialization vector argument is required.
Block Encryption Mode | SSL Libraries that Support Mode | Initialization Vector Required |
---|---|---|
ECB | OpenSSL, wolfSSL | No |
CBC | OpenSSL, wolfSSL | Yes |
CFB1 | OpenSSL | Yes |
CFB8 | OpenSSL | Yes |
CFB128 | OpenSSL | Yes |
OFB | OpenSSL | Yes |
Statements that use
AES_ENCRYPT()
or
AES_DECRYPT()
are unsafe for
statement-based replication.
Compresses a string and returns the result as a binary string.
This function requires MySQL to have been compiled with a
compression library such as zlib
.
Otherwise, the return value is always NULL
.
The compressed string can be uncompressed with
UNCOMPRESS()
.
mysql>SELECT LENGTH(COMPRESS(REPEAT('a',1000)));
-> 21 mysql>SELECT LENGTH(COMPRESS(''));
-> 0 mysql>SELECT LENGTH(COMPRESS('a'));
-> 13 mysql>SELECT LENGTH(COMPRESS(REPEAT('a',16)));
-> 15
The compressed string contents are stored the following way:
Empty strings are stored as empty strings.
Nonempty strings are stored as a 4-byte length of the
uncompressed string (low byte first), followed by the
compressed string. If the string ends with space, an extra
.
character is added to avoid problems
with endspace trimming should the result be stored in a
CHAR
or
VARCHAR
column. (However,
use of nonbinary string data types such as
CHAR
or
VARCHAR
to store compressed
strings is not recommended anyway because character set
conversion may occur. Use a
VARBINARY
or
BLOB
binary string column
instead.)
This function was removed in MySQL 8.0.3.
Consider using AES_ENCRYPT()
and AES_DECRYPT()
instead.
DES_DECRYPT(
crypt_str
[,key_str
])
This function was removed in MySQL 8.0.3.
Consider using AES_ENCRYPT()
and AES_DECRYPT()
instead.
DES_ENCRYPT(
str
[,{key_num
|key_str
}])
This function was removed in MySQL 8.0.3.
Consider using AES_ENCRYPT()
and AES_DECRYPT()
instead.
This function was removed in MySQL 8.0.3.
Consider using AES_ENCRYPT()
and AES_DECRYPT()
instead.
This function was removed in MySQL 8.0.3. For one-way hashing,
consider using SHA2()
instead.
Calculates an MD5 128-bit checksum for the string. The value
is returned as a string of 32 hexadecimal digits, or
NULL
if the argument was
NULL
. The return value can, for example, be
used as a hash key. See the notes at the beginning of this
section about storing hash values efficiently.
The return value is a string in the connection character set.
If FIPS mode is enabled,
MD5()
returns
NULL
. See Section 6.6, “FIPS Support”.
mysql> SELECT MD5('testing');
-> 'ae2b1fca515949e5d54fb22b8ed95575'
This is the “RSA Data Security, Inc. MD5 Message-Digest Algorithm.”
See the note regarding the MD5 algorithm at the beginning this section.
This function was removed in MySQL 8.0.11.
This function returns a binary string of
len
random bytes generated using
the random number generator of the SSL library. Permitted
values of len
range from 1 to 1024.
For values outside that range,
RANDOM_BYTES()
generates a
warning and returns NULL
.
RANDOM_BYTES()
can be used to
provide the initialization vector for the
AES_DECRYPT()
and
AES_ENCRYPT()
functions. For
use in that context, len
must be at
least 16. Larger values are permitted, but bytes in excess of
16 are ignored.
RANDOM_BYTES()
generates a
random value, which makes its result nondeterministic.
Consequently, statements that use this function are unsafe for
statement-based replication.
Calculates an SHA-1 160-bit checksum for the string, as
described in RFC 3174 (Secure Hash Algorithm). The value is
returned as a string of 40 hexadecimal digits, or
NULL
if the argument was
NULL
. One of the possible uses for this
function is as a hash key. See the notes at the beginning of
this section about storing hash values efficiently.
SHA()
is
synonymous with SHA1()
.
The return value is a string in the connection character set.
mysql> SELECT SHA1('abc');
-> 'a9993e364706816aba3e25717850c26c9cd0d89d'
SHA1()
can be considered a
cryptographically more secure equivalent of
MD5()
. However, see the note
regarding the MD5 and SHA-1 algorithms at the beginning this
section.
Calculates the SHA-2 family of hash functions (SHA-224,
SHA-256, SHA-384, and SHA-512). The first argument is the
cleartext string to be hashed. The second argument indicates
the desired bit length of the result, which must have a value
of 224, 256, 384, 512, or 0 (which is equivalent to 256). If
either argument is NULL
or the hash length
is not one of the permitted values, the return value is
NULL
. Otherwise, the function result is a
hash value containing the desired number of bits. See the
notes at the beginning of this section about storing hash
values efficiently.
The return value is a string in the connection character set.
mysql> SELECT SHA2('abc', 224);
-> '23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7'
This function works only if MySQL has been configured with SSL support. See Section 6.4, “Using Encrypted Connections”.
SHA2()
can be considered
cryptographically more secure than
MD5()
or
SHA1()
.
Given an SQL statement as a string, returns the statement
digest hash value as a string in the connection character set,
or NULL
if the argument is
NULL
. The related
STATEMENT_DIGEST_TEXT()
function returns the normalized statement digest. For
information about statement digesting, see
Section 26.10, “Performance Schema Statement Digests and Sampling”.
Both functions use the MySQL parser to parse the statement. If parsing fails, an error occurs. The error message includes the parse error only if the statement is provided as a literal string.
The max_digest_length
system
variable determines the maximum number of bytes available to
these functions for computing normalized statement digests.
mysql>SET @stmt = 'SELECT * FROM mytable WHERE cola = 10 AND colb = 20';
mysql>SELECT STATEMENT_DIGEST(@stmt);
+------------------------------------------------------------------+ | STATEMENT_DIGEST(@stmt) | +------------------------------------------------------------------+ | 3bb95eeade896657c4526e74ff2a2862039d0a0fe8a9e7155b5fe492cbd78387 | +------------------------------------------------------------------+ mysql>SELECT STATEMENT_DIGEST_TEXT(@stmt);
+----------------------------------------------------------+ | STATEMENT_DIGEST_TEXT(@stmt) | +----------------------------------------------------------+ | SELECT * FROM `mytable` WHERE `cola` = ? AND `colb` = ? | +----------------------------------------------------------+
STATEMENT_DIGEST_TEXT(
statement
)
Given an SQL statement as a string, returns the normalized
statement digest as a string in the connection character set,
or NULL
if the argument is
NULL
. For additional discussion and
examples, see the description of the related
STATEMENT_DIGEST()
function.
UNCOMPRESS(
string_to_uncompress
)
Uncompresses a string compressed by the
COMPRESS()
function. If the
argument is not a compressed value, the result is
NULL
. This function requires MySQL to have
been compiled with a compression library such as
zlib
. Otherwise, the return value is always
NULL
.
mysql>SELECT UNCOMPRESS(COMPRESS('any string'));
-> 'any string' mysql>SELECT UNCOMPRESS('any string');
-> NULL
UNCOMPRESSED_LENGTH(
compressed_string
)
Returns the length that the compressed string had before being compressed.
mysql> SELECT UNCOMPRESSED_LENGTH(COMPRESS(REPEAT('a',30)));
-> 30
VALIDATE_PASSWORD_STRENGTH(
str
)
Given an argument representing a cleartext password, this function returns an integer to indicate how strong the password is. The return value ranges from 0 (weak) to 100 (strong).
Password assessment by
VALIDATE_PASSWORD_STRENGTH()
is
done by the validate_password
component. If
that component is not installed, the function always returns
0. For information about installing
validate_password
, see
Section 6.5.3, “The Password Validation Component”. To examine or configure
the parameters that affect password testing, check or set the
system variables implemented by
validate_password
. See
Section 6.5.3.2, “Password Validation Options and Variables”.
The password is subjected to increasingly strict tests and the
return value reflects which tests were satisfied, as shown in
the following table. In addition, if the
validate_password.check_user_name
system variable is enabled and the password matches the user
name,
VALIDATE_PASSWORD_STRENGTH()
returns 0 regardless of how other
validate_password
system variables are set.
Password Test | Return Value |
---|---|
Length < 4 | 0 |
Length ≥ 4 and <
validate_password.length |
25 |
Satisfies policy 1 (LOW ) |
50 |
Satisfies policy 2 (MEDIUM ) |
75 |
Satisfies policy 3 (STRONG ) |
100 |
This section describes functions used to manipulate user-level locks.
Table 12.18 Locking Functions
Name | Description |
---|---|
GET_LOCK() |
Get a named lock |
IS_FREE_LOCK() |
Whether the named lock is free |
IS_USED_LOCK() |
Whether the named lock is in use; return connection identifier if true |
RELEASE_ALL_LOCKS() |
Release all current named locks |
RELEASE_LOCK() |
Release the named lock |
Tries to obtain a lock with a name given by the string
str
, using a timeout of
timeout
seconds. A negative
timeout
value means infinite
timeout. The lock is exclusive. While held by one session,
other sessions cannot obtain a lock of the same name.
Returns 1
if the lock was obtained
successfully, 0
if the attempt timed out
(for example, because another client has previously locked the
name), or NULL
if an error occurred (such
as running out of memory or the thread was killed with
mysqladmin kill).
A lock obtained with GET_LOCK()
is released explicitly by executing
RELEASE_LOCK()
or implicitly
when your session terminates (either normally or abnormally).
Locks obtained with GET_LOCK()
are not released when transactions commit or roll back.
GET_LOCK()
is implemented using
the metadata locking (MDL) subsystem. Multiple simultaneous
locks can be acquired and
GET_LOCK()
does not release any
existing locks. For example, suppose that you execute these
statements:
SELECT GET_LOCK('lock1',10); SELECT GET_LOCK('lock2',10); SELECT RELEASE_LOCK('lock2'); SELECT RELEASE_LOCK('lock1');
The second GET_LOCK()
acquires
a second lock and both
RELEASE_LOCK()
calls return 1
(success).
It is even possible for a given session to acquire multiple locks for the same name. Other sessions cannot acquire a lock with that name until the acquiring session releases all its locks for the name.
Uniquely named locks acquired with
GET_LOCK()
appear in the
Performance Schema metadata_locks
table. The OBJECT_TYPE
column says
USER LEVEL LOCK
and the
OBJECT_NAME
column indicates the lock name.
In the case that multiple locks are acquired for the
same name, only the first lock for the
name registers a row in the
metadata_locks
table. Subsequent
locks for the name increment a counter in the lock but do not
acquire additional metadata locks. The
metadata_locks
row for the lock
is deleted when the last lock instance on the name is
released.
The capability of acquiring multiple locks means there is the
possibility of deadlock among clients. When this happens, the
server chooses a caller and terminates its lock-acquisition
request with an
ER_USER_LOCK_DEADLOCK
error.
This error does not cause transactions to roll back.
MySQL enforces a maximum length on lock names of 64 characters.
GET_LOCK()
can be used to
implement application locks or to simulate record locks. Names
are locked on a server-wide basis. If a name has been locked
within one session, GET_LOCK()
blocks any request by another session for a lock with the same
name. This enables clients that agree on a given lock name to
use the name to perform cooperative advisory locking. But be
aware that it also enables a client that is not among the set
of cooperating clients to lock a name, either inadvertently or
deliberately, and thus prevent any of the cooperating clients
from locking that name. One way to reduce the likelihood of
this is to use lock names that are database-specific or
application-specific. For example, use lock names of the form
db_name.str
or
app_name.str
.
If multiple clients are waiting for a lock, the order in which they will acquire it is undefined. Applications should not assume that clients will acquire the lock in the same order that they issued the lock requests.
GET_LOCK()
is unsafe for
statement-based replication. A warning is logged if you use
this function when
binlog_format
is set to
STATEMENT
.
With the capability of acquiring multiple named locks, it is possible for a single statement to acquire a large number of locks. For example:
INSERT INTO ... SELECT GET_LOCK(t1.col_name) FROM t1;
These types of statements may have certain adverse effects. For example, if the statement fails part way through and rolls back, locks acquired up to the point of failure will still exist. If the intent is for there to be a correspondence between rows inserted and locks acquired, that intent will not be satisfied. Also, if it is important that locks are granted in a certain order, be aware that result set order may differ depending on which execution plan the optimizer chooses. For these reasons, it may be best to limit applications to a single lock-acquisition call per statement.
A different locking interface is available as either a plugin
service or a set of user-defined functions. This interface
provides lock namespaces and distinct read and write locks,
unlike the interface provided by
GET_LOCK()
and related
functions. For details, see Section 29.3.1, “The Locking Service”.
Checks whether the lock named str
is free to use (that is, not locked). Returns
1
if the lock is free (no one is using the
lock), 0
if the lock is in use, and
NULL
if an error occurs (such as an
incorrect argument).
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
.
Checks whether the lock named str
is in use (that is, locked). If so, it returns the connection
identifier of the client session that holds the lock.
Otherwise, it returns NULL
.
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
.
Releases all named locks held by the current session and returns the number of locks released (0 if there were none)
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
.
Releases the lock named by the string
str
that was obtained with
GET_LOCK()
. Returns
1
if the lock was released,
0
if the lock was not established by this
thread (in which case the lock is not released), and
NULL
if the named lock did not exist. The
lock does not exist if it was never obtained by a call to
GET_LOCK()
or if it has
previously been released.
The DO
statement is convenient
to use with RELEASE_LOCK()
. See
Section 13.2.3, “DO Syntax”.
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
.
Table 12.19 Information Functions
Name | Description |
---|---|
BENCHMARK() |
Repeatedly execute an expression |
CHARSET() |
Return the character set of the argument |
COERCIBILITY() |
Return the collation coercibility value of the string argument |
COLLATION() |
Return the collation of the string argument |
CONNECTION_ID() |
Return the connection ID (thread ID) for the connection |
CURRENT_ROLE() |
Return the current active roles |
CURRENT_USER() , CURRENT_USER |
The authenticated user name and host name |
DATABASE() |
Return the default (current) database name |
FOUND_ROWS() |
For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause |
ICU_VERSION() |
ICU library version |
LAST_INSERT_ID() |
Value of the AUTOINCREMENT column for the last INSERT |
ROLES_GRAPHML() |
Return a GraphML document representing memory role subgraphs |
ROW_COUNT() |
The number of rows updated |
SCHEMA() |
Synonym for DATABASE() |
SESSION_USER() |
Synonym for USER() |
SYSTEM_USER() |
Synonym for USER() |
USER() |
The user name and host name provided by the client |
VERSION() |
Return a string that indicates the MySQL server version |
The BENCHMARK()
function
executes the expression expr
repeatedly count
times. It may be
used to time how quickly MySQL processes the expression. The
result value is always 0
. The intended use
is from within the mysql client, which
reports query execution times:
mysql> SELECT BENCHMARK(1000000,AES_ENCRYPT('hello','goodbye'));
+---------------------------------------------------+
| BENCHMARK(1000000,AES_ENCRYPT('hello','goodbye')) |
+---------------------------------------------------+
| 0 |
+---------------------------------------------------+
1 row in set (4.74 sec)
The time reported is elapsed time on the client end, not CPU
time on the server end. It is advisable to execute
BENCHMARK()
several times, and
to interpret the result with regard to how heavily loaded the
server machine is.
BENCHMARK()
is intended for
measuring the runtime performance of scalar expressions, which
has some significant implications for the way that you use it
and interpret the results:
Only scalar expressions can be used. Although the
expression can be a subquery, it must return a single
column and at most a single row. For example,
BENCHMARK(10, (SELECT * FROM
t))
will fail if the table t
has more than one column or more than one row.
Executing a SELECT
statement
expr
N
times differs from executing
SELECT BENCHMARK(
in terms of the
amount of overhead involved. The two have very different
execution profiles and you should not expect them to take
the same amount of time. The former involves the parser,
optimizer, table locking, and runtime evaluation
N
,
expr
)N
times each. The latter
involves only runtime evaluation
N
times, and all the other
components just once. Memory structures already allocated
are reused, and runtime optimizations such as local
caching of results already evaluated for aggregate
functions can alter the results. Use of
BENCHMARK()
thus measures
performance of the runtime component by giving more weight
to that component and removing the “noise”
introduced by the network, parser, optimizer, and so
forth.
Returns the character set of the string argument.
mysql>SELECT CHARSET('abc');
-> 'utf8' mysql>SELECT CHARSET(CONVERT('abc' USING latin1));
-> 'latin1' mysql>SELECT CHARSET(USER());
-> 'utf8'
Returns the collation coercibility value of the string argument.
mysql>SELECT COERCIBILITY('abc' COLLATE utf8_swedish_ci);
-> 0 mysql>SELECT COERCIBILITY(USER());
-> 3 mysql>SELECT COERCIBILITY('abc');
-> 4 mysql>SELECT COERCIBILITY(1000);
-> 5
The return values have the meanings shown in the following table. Lower values have higher precedence.
Coercibility | Meaning | Example |
---|---|---|
0 |
Explicit collation | Value with COLLATE clause |
1 |
No collation | Concatenation of strings with different collations |
2 |
Implicit collation | Column value, stored routine parameter or local variable |
3 |
System constant | USER() return value |
4 |
Coercible | Literal string |
5 |
Numeric | Numeric or temporal value |
5 |
Ignorable | NULL or an expression derived from
NULL |
For more information, see Section 10.8.4, “Collation Coercibility in Expressions”.
Returns the collation of the string argument.
mysql>SELECT COLLATION('abc');
-> 'utf8_general_ci' mysql>SELECT COLLATION(_utf8mb4'abc');
-> 'utf8mb4_0900_ai_ci' mysql>SELECT COLLATION(_latin1'abc');
-> 'latin1_swedish_ci'
Returns the connection ID (thread ID) for the connection. Every connection has an ID that is unique among the set of currently connected clients.
The value returned by
CONNECTION_ID()
is the same
type of value as displayed in the ID
column
of the
INFORMATION_SCHEMA.PROCESSLIST
table, the Id
column of
SHOW PROCESSLIST
output, and
the PROCESSLIST_ID
column of the
Performance Schema threads
table.
mysql> SELECT CONNECTION_ID();
-> 23786
Returns a utf8
string containing the
current active roles for the current session, separated by
commas, or NONE
if there are none. The
value reflects the setting of the
sql_quote_show_create
system
variable.
Suppose that an account is granted roles as follows:
GRANT 'r1', 'r2' TO 'u1'@'localhost'; SET DEFAULT ROLE ALL TO 'u1'@'localhost';
In sessions for u1
, the initial
CURRENT_ROLE()
value names the
default account roles. Using SET
ROLE
changes that:
mysql>SELECT CURRENT_ROLE();
+-------------------+ | CURRENT_ROLE() | +-------------------+ | `r1`@`%`,`r2`@`%` | +-------------------+ mysql>SET ROLE 'r1'; SELECT CURRENT_ROLE();
+----------------+ | CURRENT_ROLE() | +----------------+ | `r1`@`%` | +----------------+
Returns the user name and host name combination for the MySQL
account that the server used to authenticate the current
client. This account determines your access privileges. The
return value is a string in the utf8
character set.
The value of CURRENT_USER()
can
differ from the value of
USER()
.
mysql>SELECT USER();
-> 'davida@localhost' mysql>SELECT * FROM mysql.user;
ERROR 1044: Access denied for user ''@'localhost' to database 'mysql' mysql>SELECT CURRENT_USER();
-> '@localhost'
The example illustrates that although the client specified a
user name of davida
(as indicated by the
value of the USER()
function),
the server authenticated the client using an anonymous user
account (as seen by the empty user name part of the
CURRENT_USER()
value). One way
this might occur is that there is no account listed in the
grant tables for davida
.
Within a stored program or view,
CURRENT_USER()
returns the
account for the user who defined the object (as given by its
DEFINER
value) unless defined with the
SQL SECURITY INVOKER
characteristic. In the
latter case, CURRENT_USER()
returns the object's invoker.
Triggers and events have no option to define the SQL
SECURITY
characteristic, so for these objects,
CURRENT_USER()
returns the
account for the user who defined the object. To return the
invoker, use USER()
or
SESSION_USER()
.
The following statements support use of the
CURRENT_USER()
function to take
the place of the name of (and, possibly, a host for) an
affected user or a definer; in such cases,
CURRENT_USER()
is expanded
where and as needed:
For information about the implications that this expansion of
CURRENT_USER()
has for
replication, see
Section 17.4.1.8, “Replication of CURRENT_USER()”.
Returns the default (current) database name as a string in the
utf8
character set. If there is no default
database, DATABASE()
returns
NULL
. Within a stored routine, the default
database is the database that the routine is associated with,
which is not necessarily the same as the database that is the
default in the calling context.
mysql> SELECT DATABASE();
-> 'test'
If there is no default database,
DATABASE()
returns
NULL
.
The SQL_CALC_FOUND_ROWS
query modifier
and accompanying FOUND_ROWS()
function are deprecated as of MySQL 8.0.17 and will be
removed in a future MySQL version. As a replacement,
considering executing your query with
LIMIT
, and then a second query with
COUNT(*)
and without
LIMIT
to determine whether there are
additional rows. For example, instead of these queries:
SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
WHERE id > 100 LIMIT 10;
SELECT FOUND_ROWS();
Use these queries instead:
SELECT * FROM tbl_name
WHERE id > 100 LIMIT 10;
SELECT COUNT(*) WHERE id > 100;
COUNT(*)
is subject to
certain optimizations.
SQL_CALC_FOUND_ROWS
causes some
optimizations to be disabled.
A SELECT
statement may include
a LIMIT
clause to restrict the number of
rows the server returns to the client. In some cases, it is
desirable to know how many rows the statement would have
returned without the LIMIT
, but without
running the statement again. To obtain this row count, include
an SQL_CALC_FOUND_ROWS
option in the
SELECT
statement, and then
invoke FOUND_ROWS()
afterward:
mysql>SELECT SQL_CALC_FOUND_ROWS * FROM
->tbl_name
WHERE id > 100 LIMIT 10;
mysql>SELECT FOUND_ROWS();
The second SELECT
returns a
number indicating how many rows the first
SELECT
would have returned had
it been written without the LIMIT
clause.
In the absence of the SQL_CALC_FOUND_ROWS
option in the most recent successful
SELECT
statement,
FOUND_ROWS()
returns the number
of rows in the result set returned by that statement. If the
statement includes a LIMIT
clause,
FOUND_ROWS()
returns the number
of rows up to the limit. For example,
FOUND_ROWS()
returns 10 or 60,
respectively, if the statement includes LIMIT
10
or LIMIT 50, 10
.
The row count available through
FOUND_ROWS()
is transient and
not intended to be available past the statement following the
SELECT SQL_CALC_FOUND_ROWS
statement. If
you need to refer to the value later, save it:
mysql>SELECT SQL_CALC_FOUND_ROWS * FROM ... ;
mysql>SET @rows = FOUND_ROWS();
If you are using SELECT
SQL_CALC_FOUND_ROWS
, MySQL must calculate how many
rows are in the full result set. However, this is faster than
running the query again without LIMIT
,
because the result set need not be sent to the client.
SQL_CALC_FOUND_ROWS
and
FOUND_ROWS()
can be useful in
situations when you want to restrict the number of rows that a
query returns, but also determine the number of rows in the
full result set without running the query again. An example is
a Web script that presents a paged display containing links to
the pages that show other sections of a search result. Using
FOUND_ROWS()
enables you to
determine how many other pages are needed for the rest of the
result.
The use of SQL_CALC_FOUND_ROWS
and
FOUND_ROWS()
is more complex
for UNION
statements than for
simple SELECT
statements,
because LIMIT
may occur at multiple places
in a UNION
. It may be applied
to individual SELECT
statements
in the UNION
, or global to the
UNION
result as a whole.
The intent of SQL_CALC_FOUND_ROWS
for
UNION
is that it should return
the row count that would be returned without a global
LIMIT
. The conditions for use of
SQL_CALC_FOUND_ROWS
with
UNION
are:
The SQL_CALC_FOUND_ROWS
keyword must
appear in the first SELECT
of the UNION
.
The value of FOUND_ROWS()
is exact only if
UNION ALL
is used. If UNION
without
ALL
is used, duplicate removal occurs
and the value of
FOUND_ROWS()
is only
approximate.
If no LIMIT
is present in the
UNION
,
SQL_CALC_FOUND_ROWS
is ignored and
returns the number of rows in the temporary table that is
created to process the
UNION
.
Beyond the cases described here, the behavior of
FOUND_ROWS()
is undefined (for
example, its value following a
SELECT
statement that fails
with an error).
FOUND_ROWS()
is not
replicated reliably using statement-based replication. This
function is automatically replicated using row-based
replication.
The version of the International Components for Unicode (ICU) library used to support regular expression operations (see Section 12.5.2, “Regular Expressions”). This function is primarily intended for use in test cases.
LAST_INSERT_ID()
,
LAST_INSERT_ID(
expr
)
With no argument,
LAST_INSERT_ID()
returns a
BIGINT UNSIGNED
(64-bit) value representing
the first automatically generated value successfully inserted
for an AUTO_INCREMENT
column as a result of
the most recently executed
INSERT
statement. The value of
LAST_INSERT_ID()
remains
unchanged if no rows are successfully inserted.
With an argument,
LAST_INSERT_ID()
returns an
unsigned integer.
For example, after inserting a row that generates an
AUTO_INCREMENT
value, you can get the value
like this:
mysql> SELECT LAST_INSERT_ID();
-> 195
The currently executing statement does not affect the value of
LAST_INSERT_ID()
. Suppose that
you generate an AUTO_INCREMENT
value with
one statement, and then refer to
LAST_INSERT_ID()
in a
multiple-row INSERT
statement
that inserts rows into a table with its own
AUTO_INCREMENT
column. The value of
LAST_INSERT_ID()
will remain
stable in the second statement; its value for the second and
later rows is not affected by the earlier row insertions.
(However, if you mix references to
LAST_INSERT_ID()
and
LAST_INSERT_ID(
,
the effect is undefined.)
expr
)
If the previous statement returned an error, the value of
LAST_INSERT_ID()
is undefined.
For transactional tables, if the statement is rolled back due
to an error, the value of
LAST_INSERT_ID()
is left
undefined. For manual
ROLLBACK
,
the value of LAST_INSERT_ID()
is not restored to that before the transaction; it remains as
it was at the point of the
ROLLBACK
.
Within the body of a stored routine (procedure or function) or
a trigger, the value of
LAST_INSERT_ID()
changes the
same way as for statements executed outside the body of these
kinds of objects. The effect of a stored routine or trigger
upon the value of
LAST_INSERT_ID()
that is seen
by following statements depends on the kind of routine:
If a stored procedure executes statements that change the
value of LAST_INSERT_ID()
,
the changed value is seen by statements that follow the
procedure call.
For stored functions and triggers that change the value, the value is restored when the function or trigger ends, so following statements will not see a changed value.
The ID that was generated is maintained in the server on a
per-connection basis. This means that the
value returned by the function to a given client is the first
AUTO_INCREMENT
value generated for most
recent statement affecting an
AUTO_INCREMENT
column by that
client. This value cannot be affected by other
clients, even if they generate
AUTO_INCREMENT
values of their own. This
behavior ensures that each client can retrieve its own ID
without concern for the activity of other clients, and without
the need for locks or transactions.
The value of LAST_INSERT_ID()
is not changed if you set the
AUTO_INCREMENT
column of a row to a
non-“magic” value (that is, a value that is not
NULL
and not 0
).
If you insert multiple rows using a single
INSERT
statement,
LAST_INSERT_ID()
returns the
value generated for the first inserted
row only. The reason for this is to
make it possible to reproduce easily the same
INSERT
statement against some
other server.
For example:
mysql>USE test;
mysql>CREATE TABLE t (
id INT AUTO_INCREMENT NOT NULL PRIMARY KEY,
name VARCHAR(10) NOT NULL
);
mysql>INSERT INTO t VALUES (NULL, 'Bob');
mysql>SELECT * FROM t;
+----+------+ | id | name | +----+------+ | 1 | Bob | +----+------+ mysql>SELECT LAST_INSERT_ID();
+------------------+ | LAST_INSERT_ID() | +------------------+ | 1 | +------------------+ mysql>INSERT INTO t VALUES
(NULL, 'Mary'), (NULL, 'Jane'), (NULL, 'Lisa');
mysql>SELECT * FROM t;
+----+------+ | id | name | +----+------+ | 1 | Bob | | 2 | Mary | | 3 | Jane | | 4 | Lisa | +----+------+ mysql>SELECT LAST_INSERT_ID();
+------------------+ | LAST_INSERT_ID() | +------------------+ | 2 | +------------------+
Although the second INSERT
statement inserted three new rows into t
,
the ID generated for the first of these rows was
2
, and it is this value that is returned by
LAST_INSERT_ID()
for the
following SELECT
statement.
If you use INSERT
IGNORE
and the row is ignored, the
LAST_INSERT_ID()
remains
unchanged from the current value (or 0 is returned if the
connection has not yet performed a successful
INSERT
) and, for non-transactional tables,
the AUTO_INCREMENT
counter is not
incremented. For InnoDB
tables, the
AUTO_INCREMENT
counter is incremented if
innodb_autoinc_lock_mode
is
set to 1
or 2
, as
demonstrated in the following example:
mysql>USE test;
mysql>SELECT @@innodb_autoinc_lock_mode;
+----------------------------+ | @@innodb_autoinc_lock_mode | +----------------------------+ | 1 | +----------------------------+ mysql>CREATE TABLE `t` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`val` INT(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `i1` (`val`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
# Insert two rows mysql>INSERT INTO t (val) VALUES (1),(2);
# With auto_increment_offset=1, the inserted rows # result in an AUTO_INCREMENT value of 3 mysql>SHOW CREATE TABLE t\G
*************************** 1. row *************************** Table: t Create Table: CREATE TABLE `t` ( `id` int(11) NOT NULL AUTO_INCREMENT, `val` int(11) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `i1` (`val`) ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1 # LAST_INSERT_ID() returns the first automatically generated # value that is successfully inserted for the AUTO_INCREMENT column mysql>SELECT LAST_INSERT_ID();
+------------------+ | LAST_INSERT_ID() | +------------------+ | 1 | +------------------+ # The attempted insertion of duplicate rows fail but errors are ignored mysql>INSERT IGNORE INTO t (val) VALUES (1),(2);
Query OK, 0 rows affected (0.00 sec) Records: 2 Duplicates: 2 Warnings: 0 # With innodb_autoinc_lock_mode=1, the AUTO_INCREMENT counter # is incremented for the ignored rows mysql>SHOW CREATE TABLE t\G
*************************** 1. row *************************** Table: t Create Table: CREATE TABLE `t` ( `id` int(11) NOT NULL AUTO_INCREMENT, `val` int(11) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `i1` (`val`) ) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1 # The LAST_INSERT_ID is unchanged because the previous insert was unsuccessful mysql>SELECT LAST_INSERT_ID();
+------------------+ | LAST_INSERT_ID() | +------------------+ | 1 | +------------------+
For more information, see Section 15.6.1.4, “AUTO_INCREMENT Handling in InnoDB”.
If expr
is given as an argument to
LAST_INSERT_ID()
, the value of
the argument is returned by the function and is remembered as
the next value to be returned by
LAST_INSERT_ID()
. This can be
used to simulate sequences:
Create a table to hold the sequence counter and initialize it:
mysql>CREATE TABLE sequence (id INT NOT NULL);
mysql>INSERT INTO sequence VALUES (0);
Use the table to generate sequence numbers like this:
mysql>UPDATE sequence SET id=LAST_INSERT_ID(id+1);
mysql>SELECT LAST_INSERT_ID();
The UPDATE
statement
increments the sequence counter and causes the next call
to LAST_INSERT_ID()
to
return the updated value. The
SELECT
statement retrieves
that value. The
mysql_insert_id()
C API
function can also be used to get the value. See
Section 28.7.7.38, “mysql_insert_id()”.
You can generate sequences without calling
LAST_INSERT_ID()
, but the
utility of using the function this way is that the ID value is
maintained in the server as the last automatically generated
value. It is multi-user safe because multiple clients can
issue the UPDATE
statement and
get their own sequence value with the
SELECT
statement (or
mysql_insert_id()
), without
affecting or being affected by other clients that generate
their own sequence values.
Note that mysql_insert_id()
is
only updated after INSERT
and
UPDATE
statements, so you
cannot use the C API function to retrieve the value for
LAST_INSERT_ID(
after executing other SQL statements like
expr
)SELECT
or
SET
.
Returns a utf8
string containing a GraphML
document representing memory role subgraphs. The
ROLE_ADMIN
or
SUPER
privilege is required to
see content in the <graphml>
element.
Otherwise, the result shows only an empty element:
mysql> SELECT ROLES_GRAPHML();
+---------------------------------------------------+
| ROLES_GRAPHML() |
+---------------------------------------------------+
| <?xml version="1.0" encoding="UTF-8"?><graphml /> |
+---------------------------------------------------+
ROW_COUNT()
returns a value as follows:
DDL statements: 0. This applies to statements such as
CREATE TABLE
or
DROP TABLE
.
DML statements other than
SELECT
: The number of
affected rows. This applies to statements such as
UPDATE
,
INSERT
, or
DELETE
(as before), but now
also to statements such as ALTER
TABLE
and LOAD
DATA
.
SELECT
: -1 if the statement
returns a result set, or the number of rows
“affected” if it does not. For example, for
SELECT * FROM t1
,
ROW_COUNT()
returns -1. For
SELECT * FROM t1 INTO OUTFILE
'
,
file_name
'ROW_COUNT()
returns the
number of rows written to the file.
SIGNAL
statements: 0.
For UPDATE
statements, the
affected-rows value by default is the number of rows actually
changed. If you specify the
CLIENT_FOUND_ROWS
flag to
mysql_real_connect()
when
connecting to mysqld, the affected-rows
value is the number of rows “found”; that is,
matched by the WHERE
clause.
For REPLACE
statements, the
affected-rows value is 2 if the new row replaced an old row,
because in this case, one row was inserted after the duplicate
was deleted.
For
INSERT
... ON DUPLICATE KEY UPDATE
statements, the
affected-rows value per row is 1 if the row is inserted as a
new row, 2 if an existing row is updated, and 0 if an existing
row is set to its current values. If you specify the
CLIENT_FOUND_ROWS
flag, the affected-rows
value is 1 (not 0) if an existing row is set to its current
values.
The ROW_COUNT()
value is
similar to the value from the
mysql_affected_rows()
C API
function and the row count that the mysql
client displays following statement execution.
mysql>INSERT INTO t VALUES(1),(2),(3);
Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT ROW_COUNT();
+-------------+ | ROW_COUNT() | +-------------+ | 3 | +-------------+ 1 row in set (0.00 sec) mysql>DELETE FROM t WHERE i IN(1,2);
Query OK, 2 rows affected (0.00 sec) mysql>SELECT ROW_COUNT();
+-------------+ | ROW_COUNT() | +-------------+ | 2 | +-------------+ 1 row in set (0.00 sec)
ROW_COUNT()
is not replicated
reliably using statement-based replication. This function is
automatically replicated using row-based replication.
This function is a synonym for
DATABASE()
.
SESSION_USER()
is a synonym for
USER()
.
SYSTEM_USER()
is a synonym for
USER()
.
Returns the current MySQL user name and host name as a string
in the utf8
character set.
mysql> SELECT USER();
-> 'davida@localhost'
The value indicates the user name you specified when
connecting to the server, and the client host from which you
connected. The value can be different from that of
CURRENT_USER()
.
Returns a string that indicates the MySQL server version. The
string uses the utf8
character set. The
value might have a suffix in addition to the version number.
See the description of the
version
system variable in
Section 5.1.8, “Server System Variables”.
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
.
mysql> SELECT VERSION();
-> '8.0.17-standard'
MySQL provides functions to perform various operations on spatial data. These functions can be grouped into several major categories according to the type of operation they perform:
Functions that create geometries in various formats (WKT, WKB, internal)
Functions that convert geometries between formats
Functions that access qualitative or quantitative properties of a geometry
Functions that describe relations between two geometries
Functions that create new geometries from existing ones
For general background about MySQL support for using spatial data, see Section 11.5, “Spatial Data Types”.
The following table lists each spatial function and provides a short description of each one.
Table 12.20 Spatial Functions
Name | Description |
---|---|
GeomCollection() |
Construct geometry collection from geometries |
GeometryCollection() |
Construct geometry collection from geometries |
LineString() |
Construct LineString from Point values |
MBRContains() |
Whether MBR of one geometry contains MBR of another |
MBRCoveredBy() |
Whether one MBR is covered by another |
MBRCovers() |
Whether one MBR covers another |
MBRDisjoint() |
Whether MBRs of two geometries are disjoint |
MBREquals() |
Whether MBRs of two geometries are equal |
MBRIntersects() |
Whether MBRs of two geometries intersect |
MBROverlaps() |
Whether MBRs of two geometries overlap |
MBRTouches() |
Whether MBRs of two geometries touch |
MBRWithin() |
Whether MBR of one geometry is within MBR of another |
MultiLineString() |
Contruct MultiLineString from LineString values |
MultiPoint() |
Construct MultiPoint from Point values |
MultiPolygon() |
Construct MultiPolygon from Polygon values |
Point() |
Construct Point from coordinates |
Polygon() |
Construct Polygon from LineString arguments |
ST_Area() |
Return Polygon or MultiPolygon area |
ST_AsBinary() , ST_AsWKB() |
Convert from internal geometry format to WKB |
ST_AsGeoJSON() |
Generate GeoJSON object from geometry |
ST_AsText() , ST_AsWKT() |
Convert from internal geometry format to WKT |
ST_Buffer() |
Return geometry of points within given distance from geometry |
ST_Buffer_Strategy() |
Produce strategy option for ST_Buffer() |
ST_Centroid() |
Return centroid as a point |
ST_Contains() |
Whether one geometry contains another |
ST_ConvexHull() |
Return convex hull of geometry |
ST_Crosses() |
Whether one geometry crosses another |
ST_Difference() |
Return point set difference of two geometries |
ST_Dimension() |
Dimension of geometry |
ST_Disjoint() |
Whether one geometry is disjoint from another |
ST_Distance() |
The distance of one geometry from another |
ST_Distance_Sphere() |
Minimum distance on earth between two geometries |
ST_EndPoint() |
End Point of LineString |
ST_Envelope() |
Return MBR of geometry |
ST_Equals() |
Whether one geometry is equal to another |
ST_ExteriorRing() |
Return exterior ring of Polygon |
ST_GeoHash() |
Produce a geohash value |
ST_GeomCollFromText() , ST_GeometryCollectionFromText() , ST_GeomCollFromTxt() |
Return geometry collection from WKT |
ST_GeomCollFromWKB() , ST_GeometryCollectionFromWKB() |
Return geometry collection from WKB |
ST_GeometryN() |
Return N-th geometry from geometry collection |
ST_GeometryType() |
Return name of geometry type |
ST_GeomFromGeoJSON() |
Generate geometry from GeoJSON object |
ST_GeomFromText() , ST_GeometryFromText() |
Return geometry from WKT |
ST_GeomFromWKB() , ST_GeometryFromWKB() |
Return geometry from WKB |
ST_InteriorRingN() |
Return N-th interior ring of Polygon |
ST_Intersection() |
Return point set intersection of two geometries |
ST_Intersects() |
Whether one geometry intersects another |
ST_IsClosed() |
Whether a geometry is closed and simple |
ST_IsEmpty() |
Placeholder function |
ST_IsSimple() |
Whether a geometry is simple |
ST_IsValid() |
Whether a geometry is valid |
ST_LatFromGeoHash() |
Return latitude from geohash value |
ST_Latitude() |
Return latitude of Point |
ST_Length() |
Return length of LineString |
ST_LineFromText() , ST_LineStringFromText() |
Construct LineString from WKT |
ST_LineFromWKB() , ST_LineStringFromWKB() |
Construct LineString from WKB |
ST_LongFromGeoHash() |
Return longitude from geohash value |
ST_Longitude() |
Return longitude of Point |
ST_MakeEnvelope() |
Rectangle around two points |
ST_MLineFromText() , ST_MultiLineStringFromText() |
Construct MultiLineString from WKT |
ST_MLineFromWKB() , ST_MultiLineStringFromWKB() |
Construct MultiLineString from WKB |
ST_MPointFromText() , ST_MultiPointFromText() |
Construct MultiPoint from WKT |
ST_MPointFromWKB() , ST_MultiPointFromWKB() |
Construct MultiPoint from WKB |
ST_MPolyFromText() , ST_MultiPolygonFromText() |
Construct MultiPolygon from WKT |
ST_MPolyFromWKB() , ST_MultiPolygonFromWKB() |
Construct MultiPolygon from WKB |
ST_NumGeometries() |
Return number of geometries in geometry collection |
ST_NumInteriorRing() , ST_NumInteriorRings() |
Return number of interior rings in Polygon |
ST_NumPoints() |
Return number of points in LineString |
ST_Overlaps() |
Whether one geometry overlaps another |
ST_PointFromGeoHash() |
Convert geohash value to POINT value |
ST_PointFromText() |
Construct Point from WKT |
ST_PointFromWKB() |
Construct Point from WKB |
ST_PointN() |
Return N-th point from LineString |
ST_PolyFromText() , ST_PolygonFromText() |
Construct Polygon from WKT |
ST_PolyFromWKB() , ST_PolygonFromWKB() |
Construct Polygon from WKB |
ST_Simplify() |
Return simplified geometry |
ST_SRID() |
Return spatial reference system ID for geometry |
ST_StartPoint() |
Start Point of LineString |
ST_SwapXY() |
Return argument with X/Y coordinates swapped |
ST_SymDifference() |
Return point set symmetric difference of two geometries |
ST_Touches() |
Whether one geometry touches another |
ST_Transform() |
Transform coordinates of geometry |
ST_Union() |
Return point set union of two geometries |
ST_Validate() |
Return validated geometry |
ST_Within() |
Whether one geometry is within another |
ST_X() |
Return X coordinate of Point |
ST_Y() |
Return Y coordinate of Point |
Spatial values, or geometries, have the properties described at Section 11.5.2.2, “Geometry Class”. The following discussion lists general spatial function argument-handling characteristics. Specific functions or groups of functions may have additional argument-handling characteristics, as discussed in the sections where those function descriptions occur.
Spatial functions are defined only for valid geometry values.
The spatial reference identifier (SRID) of a geometry identifies the coordinate space in which the geometry is defined. In MySQL, the SRID value is an integer associated with the geometry value. The maximum usable SRID value is 232−1. If a larger value is given, only the lower 32 bits are used.
SRID 0 represents an infinite flat Cartesian plane with no units assigned to its axes. To ensure SRID 0 behavior, create geometry values using SRID 0. SRID 0 is the default for new geometry values if no SRID is specified.
Geometry values produced by any spatial function inherit the SRID of the geometry arguments.
Spatial functions that take multiple geometry arguments require
those arguments to have the same SRID value (that is, same value
in the lower 32 bits). Assuming equal SRIDs, spatial functions do
nothing with them after performing the equality check; geometry
values are implicitly handled using Cartesian coordinates (SRID
0). If a spatial function returns
ER_GIS_DIFFERENT_SRIDS
, it means
that the geometry arguments did not all have the same SRID. You
must modify them to have the same SRID.
The Open Geospatial Consortium guidelines require that input polygons already be closed, so unclosed polygons are rejected as invalid rather than being closed.
Empty geometry-collection handling is as follows: An empty WKT
input geometry collection may be specified as
'GEOMETRYCOLLECTION()'
. This is also the output
WKT resulting from a spatial operation that produces an empty
geometry collection.
During parsing of a nested geometry collection, the collection is flattened and its basic components are used in various GIS operations to compute results. This provides additional flexibility to users because it is unnecessary to be concerned about the uniqueness of geometry data. Nested geometry collections may be produced from nested GIS function calls without having to be explicitly flattened first.
These functions take as arguments a Well-Known Text (WKT) representation and, optionally, a spatial reference system identifier (SRID). They return the corresponding geometry. For a description of WKT format, see Well-Known Text (WKT) Format.
Functions in this section detect arguments in either Cartesian or geographic spatial reference systems (SRSs), and return results appropriate to the SRS.
ST_GeomFromText()
accepts a WKT
value of any geometry type as its first argument. Other functions
provide type-specific construction functions for construction of
geometry values of each geometry type.
Functions such as
ST_MPointFromText()
and
ST_GeomFromText()
that accept
WKT-format representations of MultiPoint
values
permit individual points within values to be surrounded by
parentheses. For example, both of the following function calls are
valid:
ST_MPointFromText('MULTIPOINT (1 1, 2 2, 3 3)') ST_MPointFromText('MULTIPOINT ((1 1), (2 2), (3 3))')
Functions such as ST_GeomFromText()
that accept WKT geometry collection arguments understand both
OpenGIS 'GEOMETRYCOLLECTION EMPTY'
standard
syntax and MySQL 'GEOMETRYCOLLECTION()'
nonstandard syntax. Functions such as
ST_AsWKT()
that produce WKT values produce 'GEOMETRYCOLLECTION
EMPTY'
standard syntax:
mysql>SET @s1 = ST_GeomFromText('GEOMETRYCOLLECTION()');
mysql>SET @s2 = ST_GeomFromText('GEOMETRYCOLLECTION EMPTY');
mysql>SELECT ST_AsWKT(@s1), ST_AsWKT(@s2);
+--------------------------+--------------------------+ | ST_AsWKT(@s1) | ST_AsWKT(@s2) | +--------------------------+--------------------------+ | GEOMETRYCOLLECTION EMPTY | GEOMETRYCOLLECTION EMPTY | +--------------------------+--------------------------+
Unless otherwise specified, functions in this section handle their arguments as follows:
If any geometry argument is NULL
or is not
a syntactically well-formed geometry, or if the SRID argument
is NULL
, the return value is
NULL
.
By default, geographic coordinates (latitude, longitude) are
interpreted as in the order specified by the spatial reference
system of geometry arguments. An optional
options
argument may be given to
override the default axis order. options
consists of a list of comma-separated
.
The onlypermitted key
=value
key
value is
axis-order
, with permitted values of
lat-long
, long-lat
and
srid-defined
(the default).
If the options
argument is
NULL
, the return value is
NULL
. If the
options
argument is invalid, an
error occurs to indicate why.
If an SRID argument refers to an undefined spatial reference
system (SRS), an
ER_SRS_NOT_FOUND
error occurs.
For geographic SRS geometry arguments, if any argument has a longitude or latitude that is out of range, an error occurs:
If a longitude value is not in the range (−180,
180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If a latitude value is not in the range [−90, 90],
an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. If an SRS uses another unit, the range uses the corresponding values in its unit. The exact range limits deviate slightly due to floating-point arithmetic.
These functions are available for creating geometries from WKT values:
ST_GeomCollFromText(
,
wkt
[,
srid
[,
options
]])ST_GeometryCollectionFromText(
,
wkt
[,
srid
[,
options
]])ST_GeomCollFromTxt(
wkt
[,
srid
[,
options
]])
Constructs a GeometryCollection
value using
its WKT representation and SRID.
These functions handle their arguments as described in the introduction to this section.
mysql>SET @g = "MULTILINESTRING((10 10, 11 11), (9 9, 10 10))";
mysql>SELECT ST_AsText(ST_GeomCollFromText(@g));
+--------------------------------------------+ | ST_AsText(ST_GeomCollFromText(@g)) | +--------------------------------------------+ | MULTILINESTRING((10 10,11 11),(9 9,10 10)) | +--------------------------------------------+
ST_GeomFromText(
,
wkt
[,
srid
[,
options
]])ST_GeometryFromText(
wkt
[,
srid
[,
options
]])
Constructs a geometry value of any type using its WKT representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_LineFromText(
,
wkt
[,
srid
[,
options
]])ST_LineStringFromText(
wkt
[,
srid
[,
options
]])
Constructs a LineString
value using its WKT
representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_MLineFromText(
,
wkt
[,
srid
[,
options
]])ST_MultiLineStringFromText(
wkt
[,
srid
[,
options
]])
Constructs a MultiLineString
value using
its WKT representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_MPointFromText(
,
wkt
[,
srid
[,
options
]])ST_MultiPointFromText(
wkt
[,
srid
[,
options
]])
Constructs a MultiPoint
value using its WKT
representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_MPolyFromText(
,
wkt
[,
srid
[,
options
]])ST_MultiPolygonFromText(
wkt
[,
srid
[,
options
]])
Constructs a MultiPolygon
value using its
WKT representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_PointFromText(
wkt
[,
srid
[,
options
]])
Constructs a Point
value using its WKT
representation and SRID.
ST_PointFromText()
handles its
arguments as described in the introduction to this section.
ST_PolyFromText(
,
wkt
[,
srid
[,
options
]])ST_PolygonFromText(
wkt
[,
srid
[,
options
]])
Constructs a Polygon
value using its WKT
representation and SRID.
These functions handle their arguments as described in the introduction to this section.
These functions take as arguments a
BLOB
containing a Well-Known Binary
(WKB) representation and, optionally, a spatial reference system
identifier (SRID). They return the corresponding geometry. For a
description of WKB format, see Well-Known Binary (WKB) Format.
Functions in this section detect arguments in either Cartesian or geographic spatial reference systems (SRSs), and return results appropriate to the SRS.
ST_GeomFromWKB()
accepts a WKB
value of any geometry type as its first argument. Other functions
provide type-specific construction functions for construction of
geometry values of each geometry type.
Prior to MySQL 8.0, these functions also accepted geometry objects as returned by the functions in Section 12.16.5, “MySQL-Specific Functions That Create Geometry Values”. Geometry arguments are no longer permitted and produce an error. To migrate calls from using geometry arguments to using WKB arguments, follow these guidelines:
Rewrite constructs such as ST_GeomFromWKB(Point(0,
0))
as Point(0, 0)
.
Rewrite constructs such as ST_GeomFromWKB(Point(0,
0), 4326)
as ST_SRID(Point(0, 0),
4326)
or ST_GeomFromWKB(ST_AsWKB(Point(0,
0)), 4326)
.
Unless otherwise specified, functions in this section handle their arguments as follows:
If the WKB or SRID argument is NULL
, the
return value is NULL
.
By default, geographic coordinates (latitude, longitude) are
interpreted as in the order specified by the spatial reference
system of geometry arguments. An optional
options
argument may be given to
override the default axis order. options
consists of a list of comma-separated
.
The onlypermitted key
=value
key
value is
axis-order
, with permitted values of
lat-long
, long-lat
and
srid-defined
(the default).
If the options
argument is
NULL
, the return value is
NULL
. If the
options
argument is invalid, an
error occurs to indicate why.
If an SRID argument refers to an undefined spatial reference
system (SRS), an
ER_SRS_NOT_FOUND
error occurs.
For geographic SRS geometry arguments, if any argument has a longitude or latitude that is out of range, an error occurs:
If a longitude value is not in the range (−180,
180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If a latitude value is not in the range [−90, 90],
an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. If an SRS uses another unit, the range uses the corresponding values in its unit. The exact range limits deviate slightly due to floating-point arithmetic.
These functions are available for creating geometries from WKB values:
ST_GeomCollFromWKB(
,
wkb
[,
srid
[,
options
]])ST_GeometryCollectionFromWKB(
wkb
[,
srid
[,
options
]])
Constructs a GeometryCollection
value using
its WKB representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_GeomFromWKB(
,
wkb
[,
srid
[,
options
]])ST_GeometryFromWKB(
wkb
[,
srid
[,
options
]])
Constructs a geometry value of any type using its WKB representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_LineFromWKB(
,
wkb
[,
srid
[,
options
]])ST_LineStringFromWKB(
wkb
[,
srid
[,
options
]])
Constructs a LineString
value using its WKB
representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_MLineFromWKB(
,
wkb
[,
srid
[,
options
]])ST_MultiLineStringFromWKB(
wkb
[,
srid
[,
options
]])
Constructs a MultiLineString
value using
its WKB representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_MPointFromWKB(
,
wkb
[,
srid
[,
options
]])ST_MultiPointFromWKB(
wkb
[,
srid
[,
options
]])
Constructs a MultiPoint
value using its WKB
representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_MPolyFromWKB(
,
wkb
[,
srid
[,
options
]])ST_MultiPolygonFromWKB(
wkb
[,
srid
[,
options
]])
Constructs a MultiPolygon
value using its
WKB representation and SRID.
These functions handle their arguments as described in the introduction to this section.
ST_PointFromWKB(
wkb
[,
srid
[,
options
]])
Constructs a Point
value using its WKB
representation and SRID.
ST_PointFromWKB()
handles its
arguments as described in the introduction to this section.
ST_PolyFromWKB(
,
wkb
[,
srid
[,
options
]])ST_PolygonFromWKB(
wkb
[,
srid
[,
options
]])
Constructs a Polygon
value using its WKB
representation and SRID.
These functions handle their arguments as described in the introduction to this section.
MySQL provides a set of useful nonstandard functions for creating geometry values. The functions described in this section are MySQL extensions to the OpenGIS specification.
These functions produce geometry objects from either WKB values or
geometry objects as arguments. If any argument is not a proper WKB
or geometry representation of the proper object type, the return
value is NULL
.
For example, you can insert the geometry return value from
Point()
directly into a
POINT
column:
INSERT INTO t1 (pt_col) VALUES(Point(1,2));
Constructs a GeomCollection
value from the
geometry arguments.
GeomCollection()
returns all
the proper geometries contained in the arguments even if a
nonsupported geometry is present.
GeomCollection()
with no
arguments is permitted as a way to create an empty geometry.
Also, functions such as
ST_GeomFromText()
that accept
WKT geometry collection arguments understand both OpenGIS
'GEOMETRYCOLLECTION EMPTY'
standard syntax
and MySQL 'GEOMETRYCOLLECTION()'
nonstandard syntax.
GeomCollection()
and
GeometryCollection()
are
synonymous, with
GeomCollection()
the preferred
function.
GeometryCollection(
g
[, g
] ...)
Constructs a GeomCollection
value from the
geometry arguments.
GeometryCollection()
returns
all the proper geometries contained in the arguments even if a
nonsupported geometry is present.
GeometryCollection()
with no
arguments is permitted as a way to create an empty geometry.
Also, functions such as
ST_GeomFromText()
that accept
WKT geometry collection arguments understand both OpenGIS
'GEOMETRYCOLLECTION EMPTY'
standard syntax
and MySQL 'GEOMETRYCOLLECTION()'
nonstandard syntax.
GeomCollection()
and
GeometryCollection()
are
synonymous, with
GeomCollection()
the preferred
function.
Constructs a LineString
value from a number
of Point
or WKB Point
arguments. If the number of arguments is less than two, the
return value is NULL
.
MultiLineString(
ls
[, ls
] ...)
Constructs a MultiLineString
value using
LineString
or WKB
LineString
arguments.
Constructs a MultiPoint
value using
Point
or WKB Point
arguments.
MultiPolygon(
poly
[, poly
] ...)
Constructs a MultiPolygon
value from a set
of Polygon
or WKB
Polygon
arguments.
Constructs a Point
using its coordinates.
Constructs a Polygon
value from a number of
LineString
or WKB
LineString
arguments. If any argument does
not represent a LinearRing
(that is, not a
closed and simple LineString
), the return
value is NULL
.
MySQL supports the functions listed in this section for converting geometry values from internal geometry format to WKT or WKB format, or for swapping the order of X and Y coordinates.
There are also functions to convert a string from WKT or WKB format to internal geometry format. See Section 12.16.3, “Functions That Create Geometry Values from WKT Values”, and Section 12.16.4, “Functions That Create Geometry Values from WKB Values”.
Functions such as ST_GeomFromText()
that accept WKT geometry collection arguments understand both
OpenGIS 'GEOMETRYCOLLECTION EMPTY'
standard
syntax and MySQL 'GEOMETRYCOLLECTION()'
nonstandard syntax. Another way to produce an empty geometry
collection is by calling
GeometryCollection()
with no
arguments. Functions such as
ST_AsWKT()
that produce WKT values produce 'GEOMETRYCOLLECTION
EMPTY'
standard syntax:
mysql>SET @s1 = ST_GeomFromText('GEOMETRYCOLLECTION()');
mysql>SET @s2 = ST_GeomFromText('GEOMETRYCOLLECTION EMPTY');
mysql>SELECT ST_AsWKT(@s1), ST_AsWKT(@s2);
+--------------------------+--------------------------+ | ST_AsWKT(@s1) | ST_AsWKT(@s2) | +--------------------------+--------------------------+ | GEOMETRYCOLLECTION EMPTY | GEOMETRYCOLLECTION EMPTY | +--------------------------+--------------------------+ mysql>SELECT ST_AsWKT(GeomCollection());
+----------------------------+ | ST_AsWKT(GeomCollection()) | +----------------------------+ | GEOMETRYCOLLECTION EMPTY | +----------------------------+
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
, the return value
is NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument is in an undefined spatial reference
system, the axes are output in the order they appear in the
geometry and an
ER_WARN_SRS_NOT_FOUND_AXIS_ORDER
warning occurs.
By default, geographic coordinates (latitude, longitude) are
interpreted as in the order specified by the spatial reference
system of geometry arguments. An optional
options
argument may be given to
override the default axis order. options
consists of a list of comma-separated
.
The onlypermitted key
=value
key
value is
axis-order
, with permitted values of
lat-long
, long-lat
and
srid-defined
(the default).
If the options
argument is
NULL
, the return value is
NULL
. If the
options
argument is invalid, an
error occurs to indicate why.
Otherwise, the return value is non-NULL
.
These functions are available for format conversions or coordinate swapping:
ST_AsBinary(
,
g
[, options
])ST_AsWKB(
g
[, options
])
Converts a value in internal geometry format to its WKB representation and returns the binary result.
The function return value has geographic coordinates
(latitude, longitude) in the order specified by the spatial
reference system that applies to the geometry argument. An
optional options
argument may be
given to override the default axis order.
ST_AsBinary()
and
ST_AsWKB()
handle their arguments as described in the introduction to
this section.
mysql>SET @g = ST_LineFromText('LINESTRING(0 5,5 10,10 15)', 4326);
mysql>SELECT ST_AsText(ST_GeomFromWKB(ST_AsWKB(@g)));
+-----------------------------------------+ | ST_AsText(ST_GeomFromWKB(ST_AsWKB(@g))) | +-----------------------------------------+ | LINESTRING(5 0,10 5,15 10) | +-----------------------------------------+ mysql>SELECT ST_AsText(ST_GeomFromWKB(ST_AsWKB(@g, 'axis-order=long-lat')));
+----------------------------------------------------------------+ | ST_AsText(ST_GeomFromWKB(ST_AsWKB(@g, 'axis-order=long-lat'))) | +----------------------------------------------------------------+ | LINESTRING(0 5,5 10,10 15) | +----------------------------------------------------------------+ mysql>SELECT ST_AsText(ST_GeomFromWKB(ST_AsWKB(@g, 'axis-order=lat-long')));
+----------------------------------------------------------------+ | ST_AsText(ST_GeomFromWKB(ST_AsWKB(@g, 'axis-order=lat-long'))) | +----------------------------------------------------------------+ | LINESTRING(5 0,10 5,15 10) | +----------------------------------------------------------------+
ST_AsText(
,
g
[,
options
])ST_AsWKT(
g
[, options
])
Converts a value in internal geometry format to its WKT representation and returns the string result.
The function return value has geographic coordinates
(latitude, longitude) in the order specified by the spatial
reference system that applies to the geometry argument. An
optional options
argument may be
given to override the default axis order.
ST_AsText()
and
ST_AsWKT()
handle their arguments as described in the introduction to
this section.
mysql>SET @g = 'LineString(1 1,2 2,3 3)';
mysql>SELECT ST_AsText(ST_GeomFromText(@g));
+--------------------------------+ | ST_AsText(ST_GeomFromText(@g)) | +--------------------------------+ | LINESTRING(1 1,2 2,3 3) | +--------------------------------+
Output for MultiPoint
values includes
parentheses around each point. For example:
mysql> SELECT ST_AsText(ST_GeomFromText(@mp));
+---------------------------------+
| ST_AsText(ST_GeomFromText(@mp)) |
+---------------------------------+
| MULTIPOINT((1 1),(2 2),(3 3)) |
+---------------------------------+
Accepts an argument in internal geometry format, swaps the X and Y values of each coordinate pair within the geometry, and returns the result.
ST_SwapXY()
handles its
arguments as described in the introduction to this section.
mysql>SET @g = ST_LineFromText('LINESTRING(0 5,5 10,10 15)');
mysql>SELECT ST_AsText(@g);
+----------------------------+ | ST_AsText(@g) | +----------------------------+ | LINESTRING(0 5,5 10,10 15) | +----------------------------+ mysql>SELECT ST_AsText(ST_SwapXY(@g));
+----------------------------+ | ST_AsText(ST_SwapXY(@g)) | +----------------------------+ | LINESTRING(5 0,10 5,15 10) | +----------------------------+
Each function that belongs to this group takes a geometry value as
its argument and returns some quantitative or qualitative property
of the geometry. Some functions restrict their argument type. Such
functions return NULL
if the argument is of an
incorrect geometry type. For example, the
ST_Area()
polygon function returns
NULL
if the object type is neither
Polygon
nor MultiPolygon
.
The functions listed in this section do not restrict their argument and accept a geometry value of any type.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
, the return value
is NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument has an SRID value that refers to an
undefined spatial reference system (SRS), an
ER_SRS_NOT_FOUND
error
occurs.
If any SRID argument is not within the range of a 32-bit
unsigned integer, an
ER_DATA_OUT_OF_RANGE
error
occurs.
If any SRID argument refers to an undefined SRS, an
ER_SRS_NOT_FOUND
error
occurs.
Otherwise, the return value is non-NULL
.
These functions are available for obtaining geometry properties:
Returns the inherent dimension of the geometry value
g
. The dimension can be −1,
0, 1, or 2. The meaning of these values is given in
Section 11.5.2.2, “Geometry Class”.
ST_Dimension()
handles its
arguments as described in the introduction to this section.
mysql> SELECT ST_Dimension(ST_GeomFromText('LineString(1 1,2 2)'));
+------------------------------------------------------+
| ST_Dimension(ST_GeomFromText('LineString(1 1,2 2)')) |
+------------------------------------------------------+
| 1 |
+------------------------------------------------------+
Returns the minimum bounding rectangle (MBR) for the
geometry value g
. The result is
returned as a Polygon
value that is
defined by the corner points of the bounding box:
POLYGON((MINX MINY, MAXX MINY, MAXX MAXY, MINX MAXY, MINX MINY))
mysql> SELECT ST_AsText(ST_Envelope(ST_GeomFromText('LineString(1 1,2 2)')));
+----------------------------------------------------------------+
| ST_AsText(ST_Envelope(ST_GeomFromText('LineString(1 1,2 2)'))) |
+----------------------------------------------------------------+
| POLYGON((1 1,2 1,2 2,1 2,1 1)) |
+----------------------------------------------------------------+
If the argument is a point or a vertical or horizontal line
segment, ST_Envelope()
returns the point or the line segment as its MBR rather than
returning an invalid polygon:
mysql> SELECT ST_AsText(ST_Envelope(ST_GeomFromText('LineString(1 1,1 2)')));
+----------------------------------------------------------------+
| ST_AsText(ST_Envelope(ST_GeomFromText('LineString(1 1,1 2)'))) |
+----------------------------------------------------------------+
| LINESTRING(1 1,1 2) |
+----------------------------------------------------------------+
ST_Envelope()
handles its
arguments as described in the introduction to this section,
with this exception:
If the geometry has an SRID value for a geographic
spatial reference system (SRS), an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs.
Returns a binary string indicating the name of the geometry
type of which the geometry instance
g
is a member. The name
corresponds to one of the instantiable
Geometry
subclasses.
ST_GeometryType()
handles its
arguments as described in the introduction to this section.
mysql> SELECT ST_GeometryType(ST_GeomFromText('POINT(1 1)'));
+------------------------------------------------+
| ST_GeometryType(ST_GeomFromText('POINT(1 1)')) |
+------------------------------------------------+
| POINT |
+------------------------------------------------+
This function is a placeholder that returns 1 for an empty geometry collection value or 0 otherwise.
The only valid empty geometry is represented in the form of
an empty geometry collection value. MySQL does not support
GIS EMPTY
values such as POINT
EMPTY
.
ST_IsEmpty()
handles its
arguments as described in the introduction to this section.
Returns 1 if the geometry value g
is simple according to the ISO SQL/MM Part 3:
Spatial standard.
ST_IsSimple()
returns 0 if
the argument is not simple.
The descriptions of the instantiable geometric classes given under Section 11.5.2, “The OpenGIS Geometry Model” include the specific conditions that cause class instances to be classified as not simple.
ST_IsSimple()
handles its
arguments as described in the introduction to this section,
with this exception:
If the geometry has a geographic SRS with a longitude or latitude that is out of range, an error occurs:
If any longitude argument is not in the range
(−180, 180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If any latitude argument is not in the range
[−90, 90], an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. The exact range limits deviate slightly due to floating-point arithmetic.
With a single argument representing a valid geometry object
g
,
ST_SRID()
returns an integer
indicating the ID of the spatial reference system (SRS)
associated with g
.
With the optional second argument representing a valid SRID
value, ST_SRID()
returns an
object with the same type as its first argument with an SRID
value equal to the second argument. This only sets the SRID
value of the object; it does not perform any transformation
of coordinate values.
ST_SRID()
handles its
arguments as described in the introduction to this section,
with this exception:
For the single-argument syntax,
ST_SRID()
returns the
geometry SRID even if it refers to an undefined SRS. An
ER_SRS_NOT_FOUND
error
does not occur.
ST_SRID(
and
g
,
target_srid
)ST_Transform(
differ as
follows:
g
,
target_srid
)
ST_SRID()
changes the
geometry SRID value without transforming its
coordinates.
ST_Transform()
transforms
the geometry coordinates in addition to changing its
SRID value.
mysql>SET @g = ST_GeomFromText('LineString(1 1,2 2)', 0);
mysql>SELECT ST_SRID(@g);
+-------------+ | ST_SRID(@g) | +-------------+ | 0 | +-------------+ mysql>SET @g = ST_SRID(@g, 4326);
mysql>SELECT ST_SRID(@g);
+-------------+ | ST_SRID(@g) | +-------------+ | 4326 | +-------------+
It is possible to create a geometry in a particular SRID by
passing to ST_SRID()
the
result of one of the MySQL-specific functions for creating
spatial values, along with an SRID value. For example:
SET @g1 = ST_SRID(Point(1, 1), 4326);
However, that method creates the geometry in SRID 0, then casts it to SRID 4326 (WGS 84). A preferable alternative is to create the geometry with the correct spatial reference system to begin with. For example:
SET @g1 = ST_PointFromText('POINT(1 1)', 4326); SET @g1 = ST_GeomFromText('POINT(1 1)', 4326);
The two-argument form of
ST_SRID()
is useful for tasks
such as correcting or changing the SRS of geometries that
have an incorrect SRID.
A Point
consists of X and Y coordinates,
which may be obtained using the
ST_X()
and
ST_Y()
functions, respectively.
These functions also permit an optional second argument that
specifies an X or Y coordinate value, in which case the function
result is the Point
object from the first
argument with the appropriate coordinate modified to be equal to
the second argument.
For Point
objects that have a geographic
spatial reference system (SRS), the longitude and latitude may
be obtained using the
ST_Longitude()
and
ST_Latitude()
functions,
respectively. These functions also permit an optional second
argument that specifies a longitude or latitude value, in which
case the function result is the Point
object
from the first argument with the longitude or latitude modified
to be equal to the second argument.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
, the return value
is NULL
.
If any geometry argument is a valid geometry but not a
Point
object, an
ER_UNEXPECTED_GEOMETRY_TYPE
error occurs.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument has an SRID value that refers to an
undefined spatial reference system (SRS), an
ER_SRS_NOT_FOUND
error
occurs.
If an X or Y coordinate argument is provided and the value
is -inf
, +inf
, or
NaN
, an
ER_DATA_OUT_OF_RANGE
error
occurs.
If a longitude or latitude argument is out of range, an error occurs:
If any longitude argument is not in the range
(−180, 180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If any latitude argument is not in the range [−90,
90], an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. The exact range limits deviate slightly due to floating-point arithmetic.
Otherwise, the return value is non-NULL
.
These functions are available for obtaining point properties:
ST_Latitude(
p
[, new_latitude_val
])
With a single argument representing a valid
Point
object p
that has a geographic spatial reference system (SRS),
ST_Latitude()
returns the
latitude value of p
as a
double-precision number.
With the optional second argument representing a valid
latitude value, ST_Latitude()
returns a Point
object like the first
argument with its latitude equal to the second argument.
ST_Latitude()
handles its
arguments as described in the introduction to this section,
with the addition that if the Point
object is valid but does not have a geographic SRS, an
ER_SRS_NOT_GEOGRAPHIC
error
occurs.
mysql>SET @pt = ST_GeomFromText('POINT(45 90)', 4326);
mysql>SELECT ST_Latitude(@pt);
+------------------+ | ST_Latitude(@pt) | +------------------+ | 45 | +------------------+ mysql>SELECT ST_AsText(ST_Latitude(@pt, 10));
+---------------------------------+ | ST_AsText(ST_Latitude(@pt, 10)) | +---------------------------------+ | POINT(10 90) | +---------------------------------+
This function was added in MySQL 8.0.12.
ST_Longitude(
p
[, new_longitude_val
])
With a single argument representing a valid
Point
object p
that has a geographic spatial reference system (SRS),
ST_Longitude()
returns the
longitude value of p
as a
double-precision number.
With the optional second argument representing a valid
longitude value,
ST_Longitude()
returns a
Point
object like the first argument with
its longitude equal to the second argument.
ST_Longitude()
handles its
arguments as described in the introduction to this section,
with the addition that if the Point
object is valid but does not have a geographic SRS, an
ER_SRS_NOT_GEOGRAPHIC
error
occurs.
mysql>SET @pt = ST_GeomFromText('POINT(45 90)', 4326);
mysql>SELECT ST_Longitude(@pt);
+-------------------+ | ST_Longitude(@pt) | +-------------------+ | 90 | +-------------------+ mysql>SELECT ST_AsText(ST_Longitude(@pt, 10));
+----------------------------------+ | ST_AsText(ST_Longitude(@pt, 10)) | +----------------------------------+ | POINT(45 10) | +----------------------------------+
This function was added in MySQL 8.0.12.
With a single argument representing a valid
Point
object
p
,
ST_X()
returns the
X-coordinate value of p
as a
double-precision number. As of MySQL 8.0.12, the X
coordinate is considered to refer to the axis that appears
first in the Point
spatial reference
system (SRS) definition.
With the optional second argument,
ST_X()
returns a
Point
object like the first argument with
its X coordinate equal to the second argument. As of MySQL
8.0.12, if the Point
object has a
geographic SRS, the second argument must be in the proper
range for longitude or latitude values.
ST_X()
handles its arguments
as described in the introduction to this section.
mysql>SELECT ST_X(Point(56.7, 53.34));
+--------------------------+ | ST_X(Point(56.7, 53.34)) | +--------------------------+ | 56.7 | +--------------------------+ mysql>SELECT ST_AsText(ST_X(Point(56.7, 53.34), 10.5));
+-------------------------------------------+ | ST_AsText(ST_X(Point(56.7, 53.34), 10.5)) | +-------------------------------------------+ | POINT(10.5 53.34) | +-------------------------------------------+
With a single argument representing a valid
Point
object
p
,
ST_Y()
returns the
Y-coordinate value of p
as a
double-precision number. As of MySQL 8.0.12, the Y
coordinate is considered to refer to the axis that appears
second in the Point
spatial reference
system (SRS) definition.
With the optional second argument,
ST_Y()
returns a
Point
object like the first argument with
its Y coordinate equal to the second argument. As of MySQL
8.0.12, if the Point
object has a
geographic SRS, the second argument must be in the proper
range for longitude or latitude values.
ST_Y()
handles its arguments
as described in the introduction to this section.
mysql>SELECT ST_Y(Point(56.7, 53.34));
+--------------------------+ | ST_Y(Point(56.7, 53.34)) | +--------------------------+ | 53.34 | +--------------------------+ mysql>SELECT ST_AsText(ST_Y(Point(56.7, 53.34), 10.5));
+-------------------------------------------+ | ST_AsText(ST_Y(Point(56.7, 53.34), 10.5)) | +-------------------------------------------+ | POINT(56.7 10.5) | +-------------------------------------------+
A LineString
consists of
Point
values. You can extract particular
points of a LineString
, count the number of
points that it contains, or obtain its length.
Some functions in this section also work for
MultiLineString
values.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
or any geometry
argument is an empty geometry, the return value is
NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument has an SRID value that refers to an
undefined spatial reference system (SRS), an
ER_SRS_NOT_FOUND
error
occurs.
Otherwise, the return value is non-NULL
.
These functions are available for obtaining linestring properties:
Returns the Point
that is the endpoint of
the LineString
value
ls
.
ST_EndPoint()
handles its
arguments as described in the introduction to this section.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT ST_AsText(ST_EndPoint(ST_GeomFromText(@ls)));
+----------------------------------------------+ | ST_AsText(ST_EndPoint(ST_GeomFromText(@ls))) | +----------------------------------------------+ | POINT(3 3) | +----------------------------------------------+
For a LineString
value
ls
,
ST_IsClosed()
returns 1 if
ls
is closed (that is, its
ST_StartPoint()
and
ST_EndPoint()
values are the
same).
For a MultiLineString
value
ls
,
ST_IsClosed()
returns 1 if
ls
is closed (that is, the
ST_StartPoint()
and
ST_EndPoint()
values are the
same for each LineString
in
ls
).
ST_IsClosed()
returns 0 if
ls
is not closed, and
NULL
if ls
is
NULL
.
ST_IsClosed()
handles its
arguments as described in the introduction to this section,
with this exception:
If the geometry has an SRID value for a geographic
spatial reference system (SRS), an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs.
mysql>SET @ls1 = 'LineString(1 1,2 2,3 3,2 2)';
mysql>SET @ls2 = 'LineString(1 1,2 2,3 3,1 1)';
mysql>SELECT ST_IsClosed(ST_GeomFromText(@ls1));
+------------------------------------+ | ST_IsClosed(ST_GeomFromText(@ls1)) | +------------------------------------+ | 0 | +------------------------------------+ mysql>SELECT ST_IsClosed(ST_GeomFromText(@ls2));
+------------------------------------+ | ST_IsClosed(ST_GeomFromText(@ls2)) | +------------------------------------+ | 1 | +------------------------------------+ mysql>SET @ls3 = 'MultiLineString((1 1,2 2,3 3),(4 4,5 5))';
mysql>SELECT ST_IsClosed(ST_GeomFromText(@ls3));
+------------------------------------+ | ST_IsClosed(ST_GeomFromText(@ls3)) | +------------------------------------+ | 0 | +------------------------------------+
Returns a double-precision number indicating the length of
the LineString
or
MultiLineString
value
ls
in its associated spatial
reference system. The length of a
MultiLineString
value is equal to the sum
of the lengths of its elements.
ST_Length()
computes a result
as follows:
If the geometry is a valid LineString
in a Cartesian SRS, the return value is the Cartesian
length of the geometry.
If the geometry is a valid
MultiLineString
in a Cartesian SRS,
the return value is the sum of the Cartesian lengths of
its elements.
If the geometry is a valid LineString
in a geographic SRS, the return value is the geodetic
length of the geometry in that SRS, in meters.
If the geometry is a valid
MultiLineString
in a geographic SRS,
the return value is the sum of the geodetic lengths of
its elements in that SRS, in meters.
ST_Length()
handles its
arguments as described in the introduction to this section,
with these exceptions:
If the geometry is not a LineString
or MultiLineString
, the return value
is NULL
.
If the geometry is geometrically invalid, either the result is an undefined length (that is, it can be any number), or an error occurs.
If the length computation result is
+inf
, an
ER_DATA_OUT_OF_RANGE
error occurs.
If the geometry has a geographic SRS with a longitude or latitude that is out of range, an error occurs:
If any longitude argument is not in the range
(−180, 180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If any latitude argument is not in the range
[−90, 90], an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. The exact range limits deviate slightly due to floating-point arithmetic.
As of MySQL 8.0.16,
ST_Length()
permits an
optional unit
argument that
specifies the linear unit for the returned length value.
These rules apply:
If a unit is specified but not supported by MySQL, an
ER_UNIT_NOT_FOUND
error
occurs.
If a supported linear unit is specified and the SRID is
0, an
ER_GEOMETRY_IN_UNKNOWN_LENGTH_UNIT
error occurs.
If a supported linear unit is specified and the SRID is not 0, the result is in that unit.
If a unit is not specified, the result is in the unit of the SRS of the geometries, whether Cartesian or geographic. Currently, all MySQL SRSs are expressed in meters.
A unit is supported if it is found in the
INFORMATION_SCHEMA
ST_UNITS_OF_MEASURE
table. See
Section 25.29, “The INFORMATION_SCHEMA ST_UNITS_OF_MEASURE Table”.
mysql>SET @ls = ST_GeomFromText('LineString(1 1,2 2,3 3)');
mysql>SELECT ST_Length(@ls);
+--------------------+ | ST_Length(@ls) | +--------------------+ | 2.8284271247461903 | +--------------------+ mysql>SET @mls = ST_GeomFromText('MultiLineString((1 1,2 2,3 3),(4 4,5 5))');
mysql>SELECT ST_Length(@mls);
+-------------------+ | ST_Length(@mls) | +-------------------+ | 4.242640687119286 | +-------------------+ mysql>SET @ls = ST_GeomFromText('LineString(1 1,2 2,3 3)', 4326);
mysql>SELECT ST_Length(@ls);
+-------------------+ | ST_Length(@ls) | +-------------------+ | 313701.9623204328 | +-------------------+ mysql>SELECT ST_Length(@ls, 'metre');
+-------------------------+ | ST_Length(@ls, 'metre') | +-------------------------+ | 313701.9623204328 | +-------------------------+ mysql>SELECT ST_Length(@ls, 'foot');
+------------------------+ | ST_Length(@ls, 'foot') | +------------------------+ | 1029205.9131247795 | +------------------------+
Returns the number of Point
objects in
the LineString
value
ls
.
ST_NumPoints()
handles its
arguments as described in the introduction to this section.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT ST_NumPoints(ST_GeomFromText(@ls));
+------------------------------------+ | ST_NumPoints(ST_GeomFromText(@ls)) | +------------------------------------+ | 3 | +------------------------------------+
Returns the N
-th
Point
in the
Linestring
value
ls
. Points are numbered beginning
with 1.
ST_PointN()
handles its
arguments as described in the introduction to this section.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT ST_AsText(ST_PointN(ST_GeomFromText(@ls),2));
+----------------------------------------------+ | ST_AsText(ST_PointN(ST_GeomFromText(@ls),2)) | +----------------------------------------------+ | POINT(2 2) | +----------------------------------------------+
Returns the Point
that is the start point
of the LineString
value
ls
.
ST_StartPoint()
handles its
arguments as described in the introduction to this section.
mysql>SET @ls = 'LineString(1 1,2 2,3 3)';
mysql>SELECT ST_AsText(ST_StartPoint(ST_GeomFromText(@ls)));
+------------------------------------------------+ | ST_AsText(ST_StartPoint(ST_GeomFromText(@ls))) | +------------------------------------------------+ | POINT(1 1) | +------------------------------------------------+
Functions in this section return properties of
Polygon
or MultiPolygon
values.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
or any geometry
argument is an empty geometry, the return value is
NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument has an SRID value that refers to an
undefined spatial reference system (SRS), an
ER_SRS_NOT_FOUND
error
occurs.
For functions that take multiple geometry arguments, if
those arguments do not have the same SRID, an
ER_GIS_DIFFERENT_SRIDS
error
occurs.
Otherwise, the return value is non-NULL
.
These functions are available for obtaining polygon properties:
Returns a double-precision number indicating the area of the
Polygon
or
MultiPolygon
argument, as measured in its
spatial reference system.
As of MySQL 8.0.13, ST_Area()
handles its arguments as described in the introduction to
this section, with these exceptions:
If the geometry is geometrically invalid, either the result is an undefined area (that is, it can be any number), or an error occurs.
If the geometry is valid but is not a
Polygon
or
MultiPolygon
object, an
ER_UNEXPECTED_GEOMETRY_TYPE
error occurs.
If the geometry is a valid Polygon
in
a Cartesian SRS, the result is the Cartesian area of the
polygon.
If the geometry is a valid
MultiPolygon
in a Cartesian SRS, the
result is the sum of the Cartesian area of the polygons.
If the geometry is a valid Polygon
in
a geographic SRS, the result is the geodetic area of the
polygon in that SRS, in square meters.
If the geometry is a valid
MultiPolygon
in a geographic SRS, the
result is the sum of geodetic area of the polygons in
that SRS, in square meters.
If an area computation results in
+inf
, an
ER_DATA_OUT_OF_RANGE
error occurs.
If the geometry has a geographic SRS with a longitude or latitude that is out of range, an error occurs:
If any longitude argument is not in the range
(−180, 180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If any latitude argument is not in the range
[−90, 90], an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. The exact range limits deviate slightly due to floating-point arithmetic.
Prior to MySQL 8.0.13,
ST_Area()
handles its
arguments as described in the introduction to this section,
with these exceptions:
For arguments of dimension 0 or 1, the result is 0.
If a geometry is empty, the return value is 0 rather
than NULL
.
For a geometry collection, the result is the sum of the area values of all components. If the geometry collection is empty, its area is returned as 0.
If the geometry has an SRID value for a geographic
spatial reference system (SRS), an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs.
mysql>SET @poly =
'Polygon((0 0,0 3,3 0,0 0),(1 1,1 2,2 1,1 1))';
mysql>SELECT ST_Area(ST_GeomFromText(@poly));
+---------------------------------+ | ST_Area(ST_GeomFromText(@poly)) | +---------------------------------+ | 4 | +---------------------------------+ mysql>SET @mpoly =
'MultiPolygon(((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1)))';
mysql>SELECT ST_Area(ST_GeomFromText(@mpoly));
+----------------------------------+ | ST_Area(ST_GeomFromText(@mpoly)) | +----------------------------------+ | 8 | +----------------------------------+
Returns the mathematical centroid for the
Polygon
or
MultiPolygon
argument as a
Point
. The result is not guaranteed to be
on the MultiPolygon
.
This function processes geometry collections by computing
the centroid point for components of highest dimension in
the collection. Such components are extracted and made into
a single MultiPolygon
,
MultiLineString
, or
MultiPoint
for centroid computation.
ST_Centroid()
handles its
arguments as described in the introduction to this section,
with these exceptions:
The return value is NULL
for the
additional condition that the argument is an empty
geometry collection.
If the geometry has an SRID value for a geographic
spatial reference system (SRS), an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs.
mysql>SET @poly =
ST_GeomFromText('POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7,5 5))');
mysql>SELECT ST_GeometryType(@poly),ST_AsText(ST_Centroid(@poly));
+------------------------+--------------------------------------------+ | ST_GeometryType(@poly) | ST_AsText(ST_Centroid(@poly)) | +------------------------+--------------------------------------------+ | POLYGON | POINT(4.958333333333333 4.958333333333333) | +------------------------+--------------------------------------------+
Returns the exterior ring of the Polygon
value poly
as a
LineString
.
ST_ExteriorRing()
handles its
arguments as described in the introduction to this section.
mysql>SET @poly =
'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
mysql>SELECT ST_AsText(ST_ExteriorRing(ST_GeomFromText(@poly)));
+----------------------------------------------------+ | ST_AsText(ST_ExteriorRing(ST_GeomFromText(@poly))) | +----------------------------------------------------+ | LINESTRING(0 0,0 3,3 3,3 0,0 0) | +----------------------------------------------------+
Returns the N
-th interior ring
for the Polygon
value
poly
as a
LineString
. Rings are numbered beginning
with 1.
ST_InteriorRingN()
handles
its arguments as described in the introduction to this
section.
mysql>SET @poly =
'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
mysql>SELECT ST_AsText(ST_InteriorRingN(ST_GeomFromText(@poly),1));
+-------------------------------------------------------+ | ST_AsText(ST_InteriorRingN(ST_GeomFromText(@poly),1)) | +-------------------------------------------------------+ | LINESTRING(1 1,1 2,2 2,2 1,1 1) | +-------------------------------------------------------+
ST_NumInteriorRing(
,
poly
)ST_NumInteriorRings(
poly
)
Returns the number of interior rings in the
Polygon
value
poly
.
ST_NumInteriorRing()
and ST_NuminteriorRings()
handle their arguments as described in the introduction to
this section.
mysql>SET @poly =
'Polygon((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))';
mysql>SELECT ST_NumInteriorRings(ST_GeomFromText(@poly));
+---------------------------------------------+ | ST_NumInteriorRings(ST_GeomFromText(@poly)) | +---------------------------------------------+ | 1 | +---------------------------------------------+
These functions return properties of
GeometryCollection
values.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
or any geometry
argument is an empty geometry, the return value is
NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument has an SRID value that refers to an
undefined spatial reference system (SRS), an
ER_SRS_NOT_FOUND
error
occurs.
Otherwise, the return value is non-NULL
.
These functions are available for obtaining geometry collection properties:
Returns the N
-th geometry in the
GeometryCollection
value
gc
. Geometries are numbered
beginning with 1.
ST_GeometryN()
handles its
arguments as described in the introduction to this section.
mysql>SET @gc = 'GeometryCollection(Point(1 1),LineString(2 2, 3 3))';
mysql>SELECT ST_AsText(ST_GeometryN(ST_GeomFromText(@gc),1));
+-------------------------------------------------+ | ST_AsText(ST_GeometryN(ST_GeomFromText(@gc),1)) | +-------------------------------------------------+ | POINT(1 1) | +-------------------------------------------------+
Returns the number of geometries in the
GeometryCollection
value
gc
.
ST_NumGeometries()
handles
its arguments as described in the introduction to this
section.
mysql>SET @gc = 'GeometryCollection(Point(1 1),LineString(2 2, 3 3))';
mysql>SELECT ST_NumGeometries(ST_GeomFromText(@gc));
+----------------------------------------+ | ST_NumGeometries(ST_GeomFromText(@gc)) | +----------------------------------------+ | 2 | +----------------------------------------+
OpenGIS proposes a number of functions that can produce geometries. They are designed to implement spatial operators.
These functions support all argument type combinations except those that are inapplicable according to the Open Geospatial Consortium specification.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
, the return value
is NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument has an SRID value that refers to an
undefined spatial reference system (SRS), an
ER_SRS_NOT_FOUND
error occurs.
For functions that take multiple geometry arguments, if those
arguments do not have the same SRID, an
ER_GIS_DIFFERENT_SRIDS
error
occurs.
If any geometry argument has an SRID value for a geographic
SRS, an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs.
Otherwise, the return value is non-NULL
.
These spatial operator functions are available:
ST_Buffer(
g
,
d
[,
strategy1
[,
strategy2
[,
strategy3
]]])
Returns a geometry that represents all points whose distance
from the geometry value g
is less
than or equal to a distance of d
.
If the geometry argument is empty,
ST_Buffer()
returns an empty
geometry.
If the distance is 0,
ST_Buffer()
returns the
geometry argument unchanged:
mysql>SET @pt = ST_GeomFromText('POINT(0 0)');
mysql>SELECT ST_AsText(ST_Buffer(@pt, 0));
+------------------------------+ | ST_AsText(ST_Buffer(@pt, 0)) | +------------------------------+ | POINT(0 0) | +------------------------------+
ST_Buffer()
supports negative
distances for Polygon
and
MultiPolygon
values, and for geometry
collections containing Polygon
or
MultiPolygon
values. The result may be an
empty geometry.
ST_Buffer()
permits up to three
optional strategy arguments following the distance argument.
Strategies influence buffer computation. These arguments are
byte string values produced by the
ST_Buffer_Strategy()
function,
to be used for point, join, and end strategies:
Point strategies apply to Point
and
MultiPoint
geometries. If no point
strategy is specified, the default is
ST_Buffer_Strategy('point_circle',
32)
.
Join strategies apply to LineString
,
MultiLineString
,
Polygon
, and
MultiPolygon
geometries. If no join
strategy is specified, the default is
ST_Buffer_Strategy('join_round',
32)
.
End strategies apply to LineString
and
MultiLineString
geometries. If no end
strategy is specified, the default is
ST_Buffer_Strategy('end_round',
32)
.
Up to one strategy of each type may be specified, and they may be given in any order.
ST_Buffer()
handles its
arguments as described in the introduction to this section,
with these exceptions:
For a negative distance for Point
,
MultiPoint
,
LineString
, and
MultiLineString
values, and for
geometry collections not containing any
Polygon
or
MultiPolygon
values, an
ER_WRONG_ARGUMENTS
error
occurs.
If multiple strategies of a given type are specified, an
ER_WRONG_ARGUMENTS
error
occurs.
mysql>SET @pt = ST_GeomFromText('POINT(0 0)');
mysql>SET @pt_strategy = ST_Buffer_Strategy('point_square');
mysql>SELECT ST_AsText(ST_Buffer(@pt, 2, @pt_strategy));
+--------------------------------------------+ | ST_AsText(ST_Buffer(@pt, 2, @pt_strategy)) | +--------------------------------------------+ | POLYGON((-2 -2,2 -2,2 2,-2 2,-2 -2)) | +--------------------------------------------+
mysql>SET @ls = ST_GeomFromText('LINESTRING(0 0,0 5,5 5)');
mysql>SET @end_strategy = ST_Buffer_Strategy('end_flat');
mysql>SET @join_strategy = ST_Buffer_Strategy('join_round', 10);
mysql>SELECT ST_AsText(ST_Buffer(@ls, 5, @end_strategy, @join_strategy))
+---------------------------------------------------------------+ | ST_AsText(ST_Buffer(@ls, 5, @end_strategy, @join_strategy)) | +---------------------------------------------------------------+ | POLYGON((5 5,5 10,0 10,-3.5355339059327373 8.535533905932738, | | -5 5,-5 0,0 0,5 0,5 5)) | +---------------------------------------------------------------+
ST_Buffer_Strategy(
strategy
[,
points_per_circle
])
This function returns a strategy byte string for use with
ST_Buffer()
to influence buffer
computation.
Information about strategies is available at Boost.org.
The first argument must be a string indicating a strategy option:
For point strategies, permitted values are
'point_circle'
and
'point_square'
.
For join strategies, permitted values are
'join_round'
and
'join_miter'
.
For end strategies, permitted values are
'end_round'
and
'end_flat'
.
If the first argument is 'point_circle'
,
'join_round'
,
'join_miter'
, or
'end_round'
, the
points_per_circle
argument must be
given as a positive numeric value. The maximum
points_per_circle
value is the
value of the
max_points_in_geometry
system
variable.
For examples, see the description of
ST_Buffer()
.
ST_Buffer_Strategy()
handles
its arguments as described in the introduction to this
section, with these exceptions:
If any argument is invalid, an
ER_WRONG_ARGUMENTS
error
occurs.
If the first argument is 'point_square'
or 'end_flat'
, the
points_per_circle
argument must
not be given or an
ER_WRONG_ARGUMENTS
error
occurs.
Returns a geometry that represents the convex hull of the
geometry value g
.
This function computes a geometry's convex hull by first
checking whether its vertex points are colinear. The function
returns a linear hull if so, a polygon hull otherwise. This
function processes geometry collections by extracting all
vertex points of all components of the collection, creating a
MultiPoint
value from them, and computing
its convex hull.
ST_ConvexHull()
handles its
arguments as described in the introduction to this section,
with this exception:
The return value is NULL
for the
additional condition that the argument is an empty
geometry collection.
mysql>SET @g = 'MULTIPOINT(5 0,25 0,15 10,15 25)';
mysql>SELECT ST_AsText(ST_ConvexHull(ST_GeomFromText(@g)));
+-----------------------------------------------+ | ST_AsText(ST_ConvexHull(ST_GeomFromText(@g))) | +-----------------------------------------------+ | POLYGON((5 0,25 0,15 25,5 0)) | +-----------------------------------------------+
Returns a geometry that represents the point set difference of
the geometry values g1
and
g2
.
ST_Difference()
handles its
arguments as described in the introduction to this section.
mysql>SET @g1 = Point(1,1), @g2 = Point(2,2);
mysql>SELECT ST_AsText(ST_Difference(@g1, @g2));
+------------------------------------+ | ST_AsText(ST_Difference(@g1, @g2)) | +------------------------------------+ | POINT(1 1) | +------------------------------------+
Returns a geometry that represents the point set intersection
of the geometry values g1
and
g2
.
ST_Intersection()
handles its
arguments as described in the introduction to this section.
mysql>SET @g1 = ST_GeomFromText('LineString(1 1, 3 3)');
mysql>SET @g2 = ST_GeomFromText('LineString(1 3, 3 1)');
mysql>SELECT ST_AsText(ST_Intersection(@g1, @g2));
+--------------------------------------+ | ST_AsText(ST_Intersection(@g1, @g2)) | +--------------------------------------+ | POINT(2 2) | +--------------------------------------+
Returns a geometry that represents the point set symmetric
difference of the geometry values
g1
and
g2
, which is defined as:
g1
symdifferenceg2
:= (g1
uniong2
) difference (g1
intersectiong2
)
Or, in function call notation:
ST_SymDifference(g1
,g2
) = ST_Difference(ST_Union(g1
,g2
), ST_Intersection(g1
,g2
))
ST_SymDifference()
handles its
arguments as described in the introduction to this section.
mysql>SET @g1 = Point(1,1), @g2 = Point(2,2);
mysql>SELECT ST_AsText(ST_SymDifference(@g1, @g2));
+---------------------------------------+ | ST_AsText(ST_SymDifference(@g1, @g2)) | +---------------------------------------+ | MULTIPOINT((1 1),(2 2)) | +---------------------------------------+
Transforms a geometry from one spatial reference system (SRS)
to another. The return value is a geometry of the same type as
the input geometry with all coordinates transformed to the
target SRID, target_srid
.
Transformation support is limited to geographic SRSs, unless
the SRID of the geometry argument is the same as the target
SRID value, in which case the return value is the input
geometry for any valid SRS.
ST_Transform()
handles its
arguments as described in the introduction to this section,
with these exceptions:
Geometry arguments that have an SRID value for a geographic SRS do not produce an error.
If the geometry or target SRID argument has an SRID value
that refers to an undefined spatial reference system
(SRS), an ER_SRS_NOT_FOUND
error occurs.
If the geometry is in an SRS that
ST_Transform()
cannot
transform from, an
ER_TRANSFORM_SOURCE_SRS_NOT_SUPPORTED
error occurs.
If the target SRID is in an SRS that
ST_Transform()
cannot
transform to, an
ER_TRANSFORM_TARGET_SRS_NOT_SUPPORTED
error occurs.
If the geometry is in an SRS that is not WGS 84 and has no
TOWGS84 clause, an
ER_TRANSFORM_SOURCE_SRS_MISSING_TOWGS84
error occurs.
If the target SRID is in an SRS that is not WGS 84 and has
no TOWGS84 clause, an
ER_TRANSFORM_TARGET_SRS_MISSING_TOWGS84
error occurs.
ST_SRID(
and
g
,
target_srid
)ST_Transform(
differ as
follows:
g
,
target_srid
)
ST_SRID()
changes the
geometry SRID value without transforming its coordinates.
ST_Transform()
transforms
the geometry coordinates in addition to changing its SRID
value.
mysql>SET @p = ST_GeomFromText('POINT(52.381389 13.064444)', 4326);
mysql>SELECT ST_AsText(@p);
+----------------------------+ | ST_AsText(@p) | +----------------------------+ | POINT(52.381389 13.064444) | +----------------------------+ mysql>SET @p = ST_Transform(@p, 4230);
mysql>SELECT ST_AsText(@p);
+---------------------------------------------+ | ST_AsText(@p) | +---------------------------------------------+ | POINT(52.38208611407426 13.065520672345304) | +---------------------------------------------+
Returns a geometry that represents the point set union of the
geometry values g1
and
g2
.
ST_Union()
handles its
arguments as described in the introduction to this section.
mysql>SET @g1 = ST_GeomFromText('LineString(1 1, 3 3)');
mysql>SET @g2 = ST_GeomFromText('LineString(1 3, 3 1)');
mysql>SELECT ST_AsText(ST_Union(@g1, @g2));
+--------------------------------------+ | ST_AsText(ST_Union(@g1, @g2)) | +--------------------------------------+ | MULTILINESTRING((1 1,3 3),(1 3,3 1)) | +--------------------------------------+
In addition, Section 12.16.7, “Geometry Property Functions”, discusses several functions that construct new geometries from existing ones. See that section for descriptions of these functions:
The functions described in this section take two geometries as arguments and return a qualitative or quantitative relation between them.
MySQL implements two sets of functions using function names defined by the OpenGIS specification. One set tests the relationship between two geometry values using precise object shapes, the other set uses object minimum bounding rectangles (MBRs).
The OpenGIS specification defines the following functions to
test the relationship between two geometry values
g1
and g2
, using precise
object shapes. The return values 1 and 0 indicate true and
false, respectively, except for
ST_Distance()
, which returns
distance values.
Functions in this section detect arguments in either Cartesian or geographic spatial reference systems (SRSs), and return results appropriate to the SRS.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
or any geometry
argument is an empty geometry, the return value is
NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument refers to an undefined spatial
reference system (SRS), an
ER_SRS_NOT_FOUND
error
occurs.
For functions that take multiple geometry arguments, if
those arguments do not have the same SRID, an
ER_GIS_DIFFERENT_SRIDS
error
occurs.
If any geometry argument is geometrically invalid, either the result is true or false (it is undefined which), or an error occurs.
For geographic SRS geometry arguments, if any argument has a longitude or latitude that is out of range, an error occurs:
If a longitude value is not in the range (−180,
180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If a latitude value is not in the range [−90, 90],
an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. If an SRS uses another unit, the range uses the corresponding values in its unit. The exact range limits deviate slightly due to floating-point arithmetic.
Otherwise, the return value is non-NULL
.
These object-shape functions are available for testing geometry relationships:
Returns 1 or 0 to indicate whether
g1
completely contains
g2
. This tests the opposite
relationship as ST_Within()
.
ST_Contains()
handles its
arguments as described in the introduction to this section.
Two geometries spatially cross if their spatial relation has the following properties:
Unless g1
and and
g2
are both of dimension 1:
g1
crosses g2
if
the interior of g2
has points in
common with the interior of g1
, but
g2
does not cover the entire interior
of g1
.
If both g1
and g2
are of dimension 1: If the lines cross each other in a
finite number of points (that is, no common line
segments, only single points in common).
This function returns 1 or 0 to indicate whether
g1
spatially crosses
g2
.
ST_Crosses()
handles its
arguments as described in the introduction to this section
except that the return value is NULL
for
these additional conditions:
g1
is of dimension 2
(Polygon
or
MultiPolygon
).
g2
is of dimension 1
(Point
or
MultiPoint
).
Returns 1 or 0 to indicate whether
g1
is spatially disjoint from
(does not intersect) g2
.
ST_Disjoint()
handles its
arguments as described in the introduction to this section.
Returns the distance between g1
and g2
, measured in the length
unit of the spatial reference system (SRS) of the geometry
arguments, or in the unit of the optional
unit
argument if that is
specified.
This function processes geometry collections by returning the shortest distance among all combinations of the components of the two geometry arguments.
ST_Distance()
handles its
geometry arguments as described in the introduction to this
section, with these exceptions:
ST_Distance()
detects
arguments in a geographic (ellipsoidal) spatial
reference system and returns the geodetic distance on
the ellipsoid. The only permitted geographic argument
types are Point
and
Point
, or Point
and MultiPoint
(in any argument
order). If called with other geometry type argument
combinations in a geographic SRS, an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs.
If any argument is geometrically invalid, either the result is an undefined distance (that is, it can be any number), or an error occurs.
If an intermediate or final result produces
NaN
or a negative number, an
ER_GIS_INVALID_DATA
error occurs.
As of MySQL 8.0.14,
ST_Distance()
permits an
optional unit
argument that
specifies the linear unit for the returned distance value.
These rules apply:
If a unit is specified but not supported by MySQL, an
ER_UNIT_NOT_FOUND
error
occurs.
If a supported linear unit is specified and the SRID is
0, an
ER_GEOMETRY_IN_UNKNOWN_LENGTH_UNIT
error occurs.
If a supported linear unit is specified and the SRID is not 0, the result is in that unit.
If a unit is not specified, the result is in the unit of the SRS of the geometries, whether Cartesian or geographic. Currently, all MySQL SRSs are expressed in meters.
A unit is supported if it is found in the
INFORMATION_SCHEMA
ST_UNITS_OF_MEASURE
table. See
Section 25.29, “The INFORMATION_SCHEMA ST_UNITS_OF_MEASURE Table”.
mysql>SET @g1 = Point(1,1);
mysql>SET @g2 = Point(2,2);
mysql>SELECT ST_Distance(@g1, @g2);
+-----------------------+ | ST_Distance(@g1, @g2) | +-----------------------+ | 1.4142135623730951 | +-----------------------+ mysql>SET @g1 = ST_GeomFromText('POINT(1 1)', 4326);
mysql>SET @g2 = ST_GeomFromText('POINT(2 2)', 4326);
mysql>SELECT ST_Distance(@g1, @g2);
+-----------------------+ | ST_Distance(@g1, @g2) | +-----------------------+ | 156874.3859490455 | +-----------------------+ mysql>SELECT ST_Distance(@g1, @g2, 'metre');
+--------------------------------+ | ST_Distance(@g1, @g2, 'metre') | +--------------------------------+ | 156874.3859490455 | +--------------------------------+ mysql>SELECT ST_Distance(@g1, @g2, 'foot');
+-------------------------------+ | ST_Distance(@g1, @g2, 'foot') | +-------------------------------+ | 514679.7439273146 | +-------------------------------+
For the special case of distance calculations on a sphere,
see the ST_Distance_Sphere()
function.
Returns 1 or 0 to indicate whether
g1
is spatially equal to
g2
.
ST_Equals()
handles its
arguments as described in the introduction to this section,
except that it does not return NULL
for
empty geometry arguments.
mysql>SET @g1 = Point(1,1), @g2 = Point(2,2);
mysql>SELECT ST_Equals(@g1, @g1), ST_Equals(@g1, @g2);
+---------------------+---------------------+ | ST_Equals(@g1, @g1) | ST_Equals(@g1, @g2) | +---------------------+---------------------+ | 1 | 0 | +---------------------+---------------------+
Returns 1 or 0 to indicate whether
g1
spatially intersects
g2
.
ST_Intersects()
handles its
arguments as described in the introduction to this section.
Two geometries spatially overlap if they intersect and their intersection results in a geometry of the same dimension but not equal to either of the given geometries.
This function returns 1 or 0 to indicate whether
g1
spatially overlaps
g2
.
ST_Overlaps()
handles its
arguments as described in the introduction to this section
except that the return value is NULL
for
the additional condition that the dimensions of the two
geometries are not equal.
Two geometries spatially touch if their interiors do not intersect, but the boundary of one of the geometries intersects either the boundary or the interior of the other.
This function returns 1 or 0 to indicate whether
g1
spatially touches
g2
.
ST_Touches()
handles its
arguments as described in the introduction to this section
except that the return value is NULL
for
the additional condition that both geometries are of
dimension 0 (Point
or
MultiPoint
).
Returns 1 or 0 to indicate whether
g1
is spatially within
g2
. This tests the opposite
relationship as
ST_Contains()
.
ST_Within()
handles its
arguments as described in the introduction to this section.
MySQL provides several MySQL-specific functions that test the
relationship between minimum bounding rectangles (MBRs) of two
geometries g1
and g2
. The
return values 1 and 0 indicate true and false, respectively.
The bounding box of a point is interpreted as a point that is both boundary and interior.
The bounding box of a straight horizontal or vertical line is interpreted as a line where the interior of the line is also boundary. The endpoints are boundary points.
If any of the parameters are geometry collections, the interior, boundary, and exterior of those parameters are those of the union of all elements in the collection.
Functions in this section detect arguments in either Cartesian or geographic spatial reference systems (SRSs), and return results appropriate to the SRS.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
or an empty
geometry, the return value is NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument refers to an undefined spatial
reference system (SRS), an
ER_SRS_NOT_FOUND
error
occurs.
For functions that take multiple geometry arguments, if
those arguments do not have the same SRID, an
ER_GIS_DIFFERENT_SRIDS
error
occurs.
If any argument is geometrically invalid, either the result is true or false (it is undefined which), or an error occurs.
For geographic SRS geometry arguments, if any argument has a longitude or latitude that is out of range, an error occurs:
If a longitude value is not in the range (−180,
180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If a latitude value is not in the range [−90, 90],
an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. If an SRS uses another unit, the range uses the corresponding values in its unit. The exact range limits deviate slightly due to floating-point arithmetic.
Otherwise, the return value is non-NULL
.
These MBR functions are available for testing geometry relationships:
Returns 1 or 0 to indicate whether the minimum bounding
rectangle of g1
contains the
minimum bounding rectangle of g2
.
This tests the opposite relationship as
MBRWithin()
.
MBRContains()
handles its
arguments as described in the introduction to this section.
mysql>SET @g1 = ST_GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))');
mysql>SET @g2 = ST_GeomFromText('Point(1 1)');
mysql>SELECT MBRContains(@g1,@g2), MBRWithin(@g2,@g1);
+----------------------+--------------------+ | MBRContains(@g1,@g2) | MBRWithin(@g2,@g1) | +----------------------+--------------------+ | 1 | 1 | +----------------------+--------------------+
Returns 1 or 0 to indicate whether the minimum bounding
rectangle of g1
is covered by the
minimum bounding rectangle of g2
.
This tests the opposite relationship as
MBRCovers()
.
MBRCoveredBy()
handles its
arguments as described in the introduction to this section.
mysql>SET @g1 = ST_GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))');
mysql>SET @g2 = ST_GeomFromText('Point(1 1)');
mysql>SELECT MBRCovers(@g1,@g2), MBRCoveredby(@g1,@g2);
+--------------------+-----------------------+ | MBRCovers(@g1,@g2) | MBRCoveredby(@g1,@g2) | +--------------------+-----------------------+ | 1 | 0 | +--------------------+-----------------------+ mysql>SELECT MBRCovers(@g2,@g1), MBRCoveredby(@g2,@g1);
+--------------------+-----------------------+ | MBRCovers(@g2,@g1) | MBRCoveredby(@g2,@g1) | +--------------------+-----------------------+ | 0 | 1 | +--------------------+-----------------------+
Returns 1 or 0 to indicate whether the minimum bounding
rectangle of g1
covers the
minimum bounding rectangle of g2
.
This tests the opposite relationship as
MBRCoveredBy()
. See the
description of MBRCoveredBy()
for examples.
MBRCovers()
handles its
arguments as described in the introduction to this section.
Returns 1 or 0 to indicate whether the minimum bounding
rectangles of the two geometries
g1
and
g2
are disjoint (do not
intersect).
MBRDisjoint()
handles its
arguments as described in the introduction to this section.
Returns 1 or 0 to indicate whether the minimum bounding
rectangles of the two geometries
g1
and
g2
are the same.
MBREquals()
handles its
arguments as described in the introduction to this section,
except that it does not return NULL
for
empty geometry arguments.
Returns 1 or 0 to indicate whether the minimum bounding
rectangles of the two geometries
g1
and
g2
intersect.
MBRIntersects()
handles its
arguments as described in the introduction to this section.
Two geometries spatially overlap if they intersect and their intersection results in a geometry of the same dimension but not equal to either of the given geometries.
This function returns 1 or 0 to indicate whether the minimum
bounding rectangles of the two geometries
g1
and
g2
overlap.
MBROverlaps()
handles its
arguments as described in the introduction to this section.
Two geometries spatially touch if their interiors do not intersect, but the boundary of one of the geometries intersects either the boundary or the interior of the other.
This function returns 1 or 0 to indicate whether the minimum
bounding rectangles of the two geometries
g1
and
g2
touch.
MBRTouches()
handles its
arguments as described in the introduction to this section.
Returns 1 or 0 to indicate whether the minimum bounding
rectangle of g1
is within the
minimum bounding rectangle of g2
.
This tests the opposite relationship as
MBRContains()
.
MBRWithin()
handles its
arguments as described in the introduction to this section.
mysql>SET @g1 = ST_GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))');
mysql>SET @g2 = ST_GeomFromText('Polygon((0 0,0 5,5 5,5 0,0 0))');
mysql>SELECT MBRWithin(@g1,@g2), MBRWithin(@g2,@g1);
+--------------------+--------------------+ | MBRWithin(@g1,@g2) | MBRWithin(@g2,@g1) | +--------------------+--------------------+ | 1 | 0 | +--------------------+--------------------+
Geohash is a system for encoding latitude and longitude
coordinates of arbitrary precision into a text string. Geohash
values are strings that contain only characters chosen from
"0123456789bcdefghjkmnpqrstuvwxyz"
.
The functions in this section enable manipulation of geohash values, which provides applications the capabilities of importing and exporting geohash data, and of indexing and searching geohash values.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
, the return value
is NULL
.
If any argument is invalid, an error occurs.
If any argument has a longitude or latitude that is out of range, an error occurs:
If any longitude argument is not in the range (−180,
180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If any latitude argument is not in the range [−90,
90], an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. The exact range limits deviate slightly due to floating-point arithmetic.
If any point argument does not have SRID 0 or 4326, an
ER_SRS_NOT_FOUND
error occurs.
point
argument SRID validity is not
checked.
If any SRID argument refers to an undefined spatial reference
system (SRS), an
ER_SRS_NOT_FOUND
error occurs.
If any SRID argument is not within the range of a 32-bit
unsigned integer, an
ER_DATA_OUT_OF_RANGE
error
occurs.
Otherwise, the return value is non-NULL
.
These geohash functions are available:
ST_GeoHash(
,
longitude
,
latitude
,
max_length
)ST_GeoHash(
point
,
max_length
)
Returns a geohash string in the connection character set and collation.
For the first syntax, the longitude
must be a number in the range [−180, 180], and the
latitude
must be a number in the
range [−90, 90]. For the second syntax, a
POINT
value is required, where the X and Y
coordinates are in the valid ranges for longitude and
latitude, respectively.
The resulting string is no longer than
max_length
characters, which has an
upper limit of 100. The string might be shorter than
max_length
characters because the
algorithm that creates the geohash value continues until it
has created a string that is either an exact representation of
the location or max_length
characters, whichever comes first.
ST_GeoHash()
handles its
arguments as described in the introduction to this section.
mysql> SELECT ST_GeoHash(180,0,10), ST_GeoHash(-180,-90,15);
+----------------------+-------------------------+
| ST_GeoHash(180,0,10) | ST_GeoHash(-180,-90,15) |
+----------------------+-------------------------+
| xbpbpbpbpb | 000000000000000 |
+----------------------+-------------------------+
ST_LatFromGeoHash(
geohash_str
)
Returns the latitude from a geohash string value, as a double-precision number in the range [−90, 90].
The ST_LatFromGeoHash()
decoding function reads no more than 433 characters from the
geohash_str
argument. That
represents the upper limit on information in the internal
representation of coordinate values. Characters past the 433rd
are ignored, even if they are otherwise illegal and produce an
error.
ST_LatFromGeoHash()
handles its
arguments as described in the introduction to this section.
mysql> SELECT ST_LatFromGeoHash(ST_GeoHash(45,-20,10));
+------------------------------------------+
| ST_LatFromGeoHash(ST_GeoHash(45,-20,10)) |
+------------------------------------------+
| -20 |
+------------------------------------------+
ST_LongFromGeoHash(
geohash_str
)
Returns the longitude from a geohash string value, as a double-precision number in the range [−180, 180].
The remarks in the description of
ST_LatFromGeoHash()
regarding
the maximum number of characters processed from the
geohash_str
argument also apply to
ST_LongFromGeoHash()
.
ST_LongFromGeoHash()
handles
its arguments as described in the introduction to this
section.
mysql> SELECT ST_LongFromGeoHash(ST_GeoHash(45,-20,10));
+-------------------------------------------+
| ST_LongFromGeoHash(ST_GeoHash(45,-20,10)) |
+-------------------------------------------+
| 45 |
+-------------------------------------------+
ST_PointFromGeoHash(
geohash_str
,
srid
)
Returns a POINT
value containing the
decoded geohash value, given a geohash string value.
The X and Y coordinates of the point are the longitude in the range [−180, 180] and the latitude in the range [−90, 90], respectively.
The srid
argument is an 32-bit
unsigned integer.
The remarks in the description of
ST_LatFromGeoHash()
regarding
the maximum number of characters processed from the
geohash_str
argument also apply to
ST_PointFromGeoHash()
.
ST_PointFromGeoHash()
handles
its arguments as described in the introduction to this
section.
mysql>SET @gh = ST_GeoHash(45,-20,10);
mysql>SELECT ST_AsText(ST_PointFromGeoHash(@gh,0));
+---------------------------------------+ | ST_AsText(ST_PointFromGeoHash(@gh,0)) | +---------------------------------------+ | POINT(45 -20) | +---------------------------------------+
This section describes functions for converting between GeoJSON documents and spatial values. GeoJSON is an open standard for encoding geometric/geographical features. For more information, see http://geojson.org. The functions discussed here follow GeoJSON specification revision 1.0.
GeoJSON supports the same geometric/geographic data types that MySQL supports. Feature and FeatureCollection objects are not supported, except that geometry objects are extracted from them. CRS support is limited to values that identify an SRID.
MySQL also supports a native JSON
data type and a set of SQL functions to enable operations on JSON
values. For more information, see Section 11.6, “The JSON Data Type”, and
Section 12.17, “JSON Functions”.
ST_AsGeoJSON(
g
[, max_dec_digits
[,
options
]])
Generates a GeoJSON object from the geometry
g
. The object string has the
connection character set and collation.
If any argument is NULL
, the return value
is NULL
. If any non-NULL
argument is invalid, an error occurs.
max_dec_digits
, if specified,
limits the number of decimal digits for coordinates and causes
rounding of output. If not specified, this argument defaults
to its maximum value of 232 −
1. The minimum is 0.
options
, if specified, is a
bitmask. The following table shows the permitted flag values.
If the geometry argument has an SRID of 0, no CRS object is
produced even for those flag values that request one.
Flag Value | Meaning |
---|---|
0 | No options. This is the default if options is
not specified. |
1 | Add a bounding box to the output. |
2 | Add a short-format CRS URN to the output. The default format is a short
format
(EPSG: ). |
4 | Add a long-format CRS URN
(urn:ogc:def:crs:EPSG:: ).
This flag overrides flag 2. For example, option values
of 5 and 7 mean the same (add a bounding box and a
long-format CRS URN). |
mysql> SELECT ST_AsGeoJSON(ST_GeomFromText('POINT(11.11111 12.22222)'),2);
+-------------------------------------------------------------+
| ST_AsGeoJSON(ST_GeomFromText('POINT(11.11111 12.22222)'),2) |
+-------------------------------------------------------------+
| {"type": "Point", "coordinates": [11.11, 12.22]} |
+-------------------------------------------------------------+
ST_GeomFromGeoJSON(
str
[, options
[,
srid
]])
Parses a string str
representing a
GeoJSON object and returns a geometry.
If any argument is NULL
, the return value
is NULL
. If any non-NULL
argument is invalid, an error occurs.
options
, if given, describes how to
handle GeoJSON documents that contain geometries with
coordinate dimensions higher than 2. The following table shows
the permitted options
values.
Option Value | Meaning |
---|---|
1 | Reject the document and produce an error. This is the default if
options is not specified. |
2, 3, 4 | Accept the document and strip off the coordinates for higher coordinate dimensions. |
options
values of 2, 3, and 4
currently produce the same effect. If geometries with
coordinate dimensions higher than 2 are supported in the
future, these values will produce different effects.
The srid
argument, if given, must
be a 32-bit unsigned integer. If not given, the geometry
return value has an SRID of 4326.
If srid
refers to an undefined
spatial reference system (SRS), an
ER_SRS_NOT_FOUND
error occurs.
For geographic SRS geometry arguments, if any argument has a longitude or latitude that is out of range, an error occurs:
If a longitude value is not in the range (−180,
180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If a latitude value is not in the range [−90, 90],
an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. If an SRS uses another unit, the range uses the corresponding values in its unit. The exact range limits deviate slightly due to floating-point arithmetic.
GeoJSON geometry, feature, and feature collection objects may
have a crs
property. The parsing function
parses named CRS URNs in the
urn:ogc:def:crs:EPSG::
and srid
EPSG:
namespaces, but not CRSs given as link objects. Also,
srid
urn:ogc:def:crs:OGC:1.3:CRS84
is recognized
as SRID 4326. If an object has a CRS that is not understood,
an error occurs, with the exception that if the optional
srid
argument is given, any CRS is
ignored even if it is invalid.
If a crs
member that specifies an SRID
different from the top-level object SRID is found at a lower
level of the GeoJSON document, an
ER_INVALID_GEOJSON_CRS_NOT_TOP_LEVEL
error occurs.
As specified in the GeoJSON specification, parsing is case
sensitive for the type
member of the
GeoJSON input (Point
,
LineString
, and so forth). The
specification is silent regarding case sensitivity for other
parsing, which in MySQL is not case-sensitive.
This example shows the parsing result for a simple GeoJSON object:
mysql>SET @json = '{ "type": "Point", "coordinates": [102.0, 0.0]}';
mysql>SELECT ST_AsText(ST_GeomFromGeoJSON(@json));
+--------------------------------------+ | ST_AsText(ST_GeomFromGeoJSON(@json)) | +--------------------------------------+ | POINT(102 0) | +--------------------------------------+
The functions in this section provide convenience operations on geometry values.
Unless otherwise specified, functions in this section handle their arguments as follows:
If any argument is NULL
, the return value
is NULL
.
If any geometry argument is not a syntactically well-formed
geometry, an
ER_GIS_INVALID_DATA
error
occurs.
If any geometry argument has an SRID value that refers to an
undefined spatial reference system (SRS), an
ER_SRS_NOT_FOUND
error occurs.
For functions that take multiple geometry arguments, if those
arguments do not have the same SRID, an
ER_GIS_DIFFERENT_SRIDS
error
occurs.
Otherwise, the return value is non-NULL
.
These convenience functions are available:
ST_Distance_Sphere(
g1
,
g2
[,
radius
])
Returns the mimimum spherical distance between
Point
or MultiPoint
arguments on a sphere, in meters. (For general-purpose
distance calculations, see the
ST_Distance()
function.) The
optional radius
argument should be
given in meters.
If both geometry parameters are valid Cartesian
Point
or MultiPoint
values in SRID 0, the return value is shortest distance
between the two geometries on a sphere with the provided
radius. If omitted, the default radius is 6,370,986 meters,
Point X and Y coordinates are interpreted as longitude and
latitude, respectively, in degrees.
If both geometry parameters are valid Point
or MultiPoint
values in a geographic
spatial reference system (SRS), the return value is the
shortest distance between the two geometries on a sphere with
the provided radius. If omitted, the default radius is equal
to the mean radius, defined as (2a+b)/3, where a is the
semi-major axis and b is the semi-minor axis of the SRS.
ST_Distance_Sphere()
handles
its arguments as described in the introduction to this
section, with these exceptions:
Supported geometry argument combinations are
Point
and Point
, or
Point
and MultiPoint
(in any argument order). If at least one of the geometries
is neither Point
nor
MultiPoint
, and its SRID is 0, an
ER_NOT_IMPLEMENTED_FOR_CARTESIAN_SRS
error occurs. If at least one of the geometries is neither
Point
nor
MultiPoint
, and its SRID refers to a
geographic SRS, an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs. If any geometry refers to a projected SRS,
an
ER_NOT_IMPLEMENTED_FOR_PROJECTED_SRS
error occurs.
If any argument has a longitude or latitude that is out of range, an error occurs:
If any longitude argument is not in the range
(−180, 180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If any latitude argument is not in the range
[−90, 90], an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. If an SRS uses another unit, the range uses the corresponding values in its unit. The exact range limits deviate slightly due to floating-point arithmetic.
If the radius
argument is
present but not positive, an
ER_NONPOSITIVE_RADIUS
error occurs.
If the distance exceeds the range of a double-precision
number, an
ER_STD_OVERFLOW_ERROR
error occurs.
mysql>SET @pt1 = ST_GeomFromText('POINT(0 0)');
mysql>SET @pt2 = ST_GeomFromText('POINT(180 0)');
mysql>SELECT ST_Distance_Sphere(@pt1, @pt2);
+--------------------------------+ | ST_Distance_Sphere(@pt1, @pt2) | +--------------------------------+ | 20015042.813723423 | +--------------------------------+
Returns 1 if the argument is geometrically valid, 0 if the argument is not geometrically valid. Geometry validity is defined by the OGC specification.
The only valid empty geometry is represented in the form of an
empty geometry collection value.
ST_IsValid()
returns 1 in this
case. MySQL does not support GIS EMPTY
values such as POINT EMPTY
.
ST_IsValid()
handles its
arguments as described in the introduction to this section,
with this exception:
If the geometry has a geographic SRS with a longitude or latitude that is out of range, an error occurs:
If any longitude argument is not in the range
(−180, 180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If any latitude argument is not in the range
[−90, 90], an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. If an SRS uses another unit, the range uses the corresponding values in its unit. The exact range limits deviate slightly due to floating-point arithmetic.
mysql>SET @ls1 = ST_GeomFromText('LINESTRING(0 0,-0.00 0,0.0 0)');
mysql>SET @ls2 = ST_GeomFromText('LINESTRING(0 0, 1 1)');
mysql>SELECT ST_IsValid(@ls1);
+------------------+ | ST_IsValid(@ls1) | +------------------+ | 0 | +------------------+ mysql>SELECT ST_IsValid(@ls2);
+------------------+ | ST_IsValid(@ls2) | +------------------+ | 1 | +------------------+
Returns the rectangle that forms the envelope around two
points, as a Point
,
LineString
, or Polygon
.
Calculations are done using the Cartesian coordinate system rather than on a sphere, spheroid, or on earth.
Given two points pt1
and
pt2
,
ST_MakeEnvelope()
creates the
result geometry on an abstract plane like this:
If pt1
and
pt2
are equal, the result is
the point pt1
.
Otherwise, if (
is a vertical or
horizontal line segment, the result is the line segment
pt1
,
pt2
)(
.
pt1
,
pt2
)
Otherwise, the result is a polygon using
pt1
and
pt2
as diagonal points.
The result geometry has an SRID of 0.
ST_MakeEnvelope()
handles its
arguments as described in the introduction to this section,
with these exceptions:
If the arguments are not Point
values,
an ER_WRONG_ARGUMENTS
error occurs.
An ER_GIS_INVALID_DATA
error occurs for the additional condition that any
coordinate value of the two points is infinite or
NaN
.
If any geometry has an SRID value for a geographic spatial
reference system (SRS), an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs.
mysql>SET @pt1 = ST_GeomFromText('POINT(0 0)');
mysql>SET @pt2 = ST_GeomFromText('POINT(1 1)');
mysql>SELECT ST_AsText(ST_MakeEnvelope(@pt1, @pt2));
+----------------------------------------+ | ST_AsText(ST_MakeEnvelope(@pt1, @pt2)) | +----------------------------------------+ | POLYGON((0 0,1 0,1 1,0 1,0 0)) | +----------------------------------------+
Simplifies a geometry using the Douglas-Peucker algorithm and returns a simplified value of the same type.
The geometry may be any geometry type, although the Douglas-Peucker algorithm may not actually process every type. A geometry collection is processed by giving its components one by one to the simplification algorithm, and the returned geometries are put into a geometry collection as result.
The max_distance
argument is the
distance (in units of the input coordinates) of a vertex to
other segments to be removed. Vertices within this distance of
the simplified linestring are removed.
According to Boost.Geometry, geometries might become invalid
as a result of the simplification process, and the process
might create self-intersections. To check the validity of the
result, pass it to
ST_IsValid()
.
ST_Simplify()
handles its
arguments as described in the introduction to this section,
with this exception:
If the max_distance
argument is
not positive, or is NaN
, an
ER_WRONG_ARGUMENTS
error
occurs.
mysql>SET @g = ST_GeomFromText('LINESTRING(0 0,0 1,1 1,1 2,2 2,2 3,3 3)');
mysql>SELECT ST_AsText(ST_Simplify(@g, 0.5));
+---------------------------------+ | ST_AsText(ST_Simplify(@g, 0.5)) | +---------------------------------+ | LINESTRING(0 0,0 1,1 1,2 3,3 3) | +---------------------------------+ mysql>SELECT ST_AsText(ST_Simplify(@g, 1.0));
+---------------------------------+ | ST_AsText(ST_Simplify(@g, 1.0)) | +---------------------------------+ | LINESTRING(0 0,3 3) | +---------------------------------+
Validates a geometry according to the OGC specification. A
geometry can be syntactically well-formed (WKB value plus
SRID) but geometrically invalid. For example, this polygon is
geometrically invalid: POLYGON((0 0, 0 0, 0 0, 0 0, 0
0))
ST_Validate()
returns the
geometry if it is syntactically well-formed and is
geometrically valid, NULL
if the argument
is not syntactically well-formed or is not geometrically valid
or is NULL
.
ST_Validate()
can be used to
filter out invalid geometry data, although at a cost. For
applications that require more precise results not tainted by
invalid data, this penalty may be worthwhile.
If the geometry argument is valid, it is returned as is,
except that if an input Polygon
or
MultiPolygon
has clockwise rings, those
rings are reversed before checking for validity. If the
geometry is valid, the value with the reversed rings is
returned.
The only valid empty geometry is represented in the form of an
empty geometry collection value.
ST_Validate()
returns it
directly without further checks in this case.
As of MySQL 8.0.13,
ST_Validate()
handles its
arguments as described in the introduction to this section,
with these exceptions:
If the geometry has a geographic SRS with a longitude or latitude that is out of range, an error occurs:
If any longitude argument is not in the range
(−180, 180], an
ER_LONGITUDE_OUT_OF_RANGE
error occurs.
If any latitude argument is not in the range
[−90, 90], an
ER_LATITUDE_OUT_OF_RANGE
error occurs.
Ranges shown are in degrees. The exact range limits deviate slightly due to floating-point arithmetic.
Prior to MySQL 8.0.13,
ST_Validate()
handles its
arguments as described in the introduction to this section,
with these exceptions:
If the geometry is not syntactically well-formed, the
return value is NULL
. An
ER_GIS_INVALID_DATA
error
does not occur.
If the geometry has an SRID value for a geographic spatial
reference system (SRS), an
ER_NOT_IMPLEMENTED_FOR_GEOGRAPHIC_SRS
error occurs.
mysql>SET @ls1 = ST_GeomFromText('LINESTRING(0 0)');
mysql>SET @ls2 = ST_GeomFromText('LINESTRING(0 0, 1 1)');
mysql>SELECT ST_AsText(ST_Validate(@ls1));
+------------------------------+ | ST_AsText(ST_Validate(@ls1)) | +------------------------------+ | NULL | +------------------------------+ mysql>SELECT ST_AsText(ST_Validate(@ls2));
+------------------------------+ | ST_AsText(ST_Validate(@ls2)) | +------------------------------+ | LINESTRING(0 0,1 1) | +------------------------------+
The functions described in this section perform operations on JSON
values. For discussion of the JSON
data type and additional examples showing how to use these
functions, see Section 11.6, “The JSON Data Type”.
For functions that take a JSON argument, an error occurs if the
argument is not a valid JSON value. Arguments parsed as JSON are
indicated by json_doc
; arguments
indicated by val
are not parsed.
A set of spatial functions for operating on GeoJSON values is also available. See Section 12.16.11, “Spatial GeoJSON Functions”.
Table 12.21 JSON Functions
Name | Description |
---|---|
JSON_ARRAY() |
Create JSON array |
JSON_ARRAY_APPEND() |
Append data to JSON document |
JSON_ARRAY_INSERT() |
Insert into JSON array |
-> |
Return value from JSON column after evaluating path; equivalent to JSON_EXTRACT(). |
JSON_CONTAINS() |
Whether JSON document contains specific object at path |
JSON_CONTAINS_PATH() |
Whether JSON document contains any data at path |
JSON_DEPTH() |
Maximum depth of JSON document |
JSON_EXTRACT() |
Return data from JSON document |
->> |
Return value from JSON column after evaluating path and unquoting the result; equivalent to JSON_UNQUOTE(JSON_EXTRACT()). |
JSON_INSERT() |
Insert data into JSON document |
JSON_KEYS() |
Array of keys from JSON document |
JSON_LENGTH() |
Number of elements in JSON document |
JSON_MERGE() (deprecated 8.0.3) |
Merge JSON documents, preserving duplicate keys. Deprecated synonym for JSON_MERGE_PRESERVE() |
JSON_MERGE_PATCH() |
Merge JSON documents, replacing values of duplicate keys |
JSON_MERGE_PRESERVE() |
Merge JSON documents, preserving duplicate keys |
JSON_OBJECT() |
Create JSON object |
JSON_PRETTY() |
Print a JSON document in human-readable format |
JSON_QUOTE() |
Quote JSON document |
JSON_REMOVE() |
Remove data from JSON document |
JSON_REPLACE() |
Replace values in JSON document |
JSON_SEARCH() |
Path to value within JSON document |
JSON_SET() |
Insert data into JSON document |
JSON_STORAGE_FREE() |
Freed space within binary representation of a JSON column value following a partial update |
JSON_STORAGE_SIZE() |
Space used for storage of binary representation of a JSON document |
JSON_TABLE() |
Return data from a JSON expression as a relational table |
JSON_TYPE() |
Type of JSON value |
JSON_UNQUOTE() |
Unquote JSON value |
JSON_VALID() |
Whether JSON value is valid |
MySQL supports two aggregate JSON functions
JSON_ARRAYAGG()
and
JSON_OBJECTAGG()
. See
Section 12.20, “Aggregate (GROUP BY) Functions”, for
descriptions of these.
MySQL also supports “pretty-printing” of JSON values
in an easy-to-read format, using the
JSON_PRETTY()
function. You can see
how much storage space a given JSON value takes up, and how much
space remains for additional storage, using
JSON_STORAGE_SIZE()
and
JSON_STORAGE_FREE()
, respectively.
For complete descriptions of these functions, see
Section 12.17.7, “JSON Utility Functions”.
The functions listed in this section compose JSON values from component elements.
Evaluates a (possibly empty) list of values and returns a JSON array containing those values.
mysql> SELECT JSON_ARRAY(1, "abc", NULL, TRUE, CURTIME());
+---------------------------------------------+
| JSON_ARRAY(1, "abc", NULL, TRUE, CURTIME()) |
+---------------------------------------------+
| [1, "abc", null, true, "11:30:24.000000"] |
+---------------------------------------------+
JSON_OBJECT([
key
,
val
[,
key
,
val
] ...])
Evaluates a (possibly empty) list of key-value pairs and
returns a JSON object containing those pairs. An error occurs
if any key name is NULL
or the number of
arguments is odd.
mysql> SELECT JSON_OBJECT('id', 87, 'name', 'carrot');
+-----------------------------------------+
| JSON_OBJECT('id', 87, 'name', 'carrot') |
+-----------------------------------------+
| {"id": 87, "name": "carrot"} |
+-----------------------------------------+
Quotes a string as a JSON value by wrapping it with double
quote characters and escaping interior quote and other
characters, then returning the result as a
utf8mb4
string. Returns
NULL
if the argument is
NULL
.
This function is typically used to produce a valid JSON string literal for inclusion within a JSON document.
Certain special characters are escaped with backslashes per the escape sequences shown in Table 12.22, “JSON_UNQUOTE() Special Character Escape Sequences”.
mysql>SELECT JSON_QUOTE('null'), JSON_QUOTE('"null"');
+--------------------+----------------------+ | JSON_QUOTE('null') | JSON_QUOTE('"null"') | +--------------------+----------------------+ | "null" | "\"null\"" | +--------------------+----------------------+ mysql>SELECT JSON_QUOTE('[1, 2, 3]');
+-------------------------+ | JSON_QUOTE('[1, 2, 3]') | +-------------------------+ | "[1, 2, 3]" | +-------------------------+
You can also obtain JSON values by casting values of other types
to the JSON
type using
CAST(
; see
Converting between JSON and non-JSON values, for more
information.
value
AS
JSON)
Two aggregate functions generating JSON values are available.
JSON_ARRAYAGG()
returns a result
set as a single JSON array, and
JSON_OBJECTAGG()
returns a result
set as a single JSON object. For more information, see
Section 12.20, “Aggregate (GROUP BY) Functions”.
The functions in this section perform search operations on JSON values to extract data from them, report whether data exists at a location within them, or report the path to data within them.
JSON_CONTAINS(
target
,
candidate
[,
path
])
Indicates by returning 1 or 0 whether a given
candidate
JSON document is
contained within a target
JSON
document, or—if a path
argument was supplied—whether the candidate is found at
a specific path within the target. Returns
NULL
if any argument is
NULL
, or if the path argument does not
identify a section of the target document. An error occurs if
target
or
candidate
is not a valid JSON
document, or if the path
argument
is not a valid path expression or contains a
*
or **
wildcard.
To check only whether any data exists at the path, use
JSON_CONTAINS_PATH()
instead.
The following rules define containment:
A candidate scalar is contained in a target scalar if and
only if they are comparable and are equal. Two scalar
values are comparable if they have the same
JSON_TYPE()
types, with the
exception that values of types INTEGER
and DECIMAL
are also comparable to each
other.
A candidate array is contained in a target array if and only if every element in the candidate is contained in some element of the target.
A candidate nonarray is contained in a target array if and only if the candidate is contained in some element of the target.
A candidate object is contained in a target object if and only if for each key in the candidate there is a key with the same name in the target and the value associated with the candidate key is contained in the value associated with the target key.
Otherwise, the candidate value is not contained in the target document.
mysql>SET @j = '{"a": 1, "b": 2, "c": {"d": 4}}';
mysql>SET @j2 = '1';
mysql>SELECT JSON_CONTAINS(@j, @j2, '$.a');
+-------------------------------+ | JSON_CONTAINS(@j, @j2, '$.a') | +-------------------------------+ | 1 | +-------------------------------+ mysql>SELECT JSON_CONTAINS(@j, @j2, '$.b');
+-------------------------------+ | JSON_CONTAINS(@j, @j2, '$.b') | +-------------------------------+ | 0 | +-------------------------------+ mysql>SET @j2 = '{"d": 4}';
mysql>SELECT JSON_CONTAINS(@j, @j2, '$.a');
+-------------------------------+ | JSON_CONTAINS(@j, @j2, '$.a') | +-------------------------------+ | 0 | +-------------------------------+ mysql>SELECT JSON_CONTAINS(@j, @j2, '$.c');
+-------------------------------+ | JSON_CONTAINS(@j, @j2, '$.c') | +-------------------------------+ | 1 | +-------------------------------+
JSON_CONTAINS_PATH(
json_doc
,
one_or_all
,
path
[,
path
] ...)
Returns 0 or 1 to indicate whether a JSON document contains
data at a given path or paths. Returns NULL
if any argument is NULL
. An error occurs if
the json_doc
argument is not a
valid JSON document, any path
argument is not a valid path expression, or
one_or_all
is not
'one'
or 'all'
.
To check for a specific value at a path, use
JSON_CONTAINS()
instead.
The return value is 0 if no specified path exists within the
document. Otherwise, the return value depends on the
one_or_all
argument:
'one'
: 1 if at least one path exists
within the document, 0 otherwise.
'all'
: 1 if all paths exist within the
document, 0 otherwise.
mysql>SET @j = '{"a": 1, "b": 2, "c": {"d": 4}}';
mysql>SELECT JSON_CONTAINS_PATH(@j, 'one', '$.a', '$.e');
+---------------------------------------------+ | JSON_CONTAINS_PATH(@j, 'one', '$.a', '$.e') | +---------------------------------------------+ | 1 | +---------------------------------------------+ mysql>SELECT JSON_CONTAINS_PATH(@j, 'all', '$.a', '$.e');
+---------------------------------------------+ | JSON_CONTAINS_PATH(@j, 'all', '$.a', '$.e') | +---------------------------------------------+ | 0 | +---------------------------------------------+ mysql>SELECT JSON_CONTAINS_PATH(@j, 'one', '$.c.d');
+----------------------------------------+ | JSON_CONTAINS_PATH(@j, 'one', '$.c.d') | +----------------------------------------+ | 1 | +----------------------------------------+ mysql>SELECT JSON_CONTAINS_PATH(@j, 'one', '$.a.d');
+----------------------------------------+ | JSON_CONTAINS_PATH(@j, 'one', '$.a.d') | +----------------------------------------+ | 0 | +----------------------------------------+
JSON_EXTRACT(
json_doc
,
path
[,
path
] ...)
Returns data from a JSON document, selected from the parts of
the document matched by the path
arguments. Returns NULL
if any argument is
NULL
or no paths locate a value in the
document. An error occurs if the
json_doc
argument is not a valid
JSON document or any path
argument
is not a valid path expression.
The return value consists of all values matched by the
path
arguments. If it is possible
that those arguments could return multiple values, the matched
values are autowrapped as an array, in the order corresponding
to the paths that produced them. Otherwise, the return value
is the single matched value.
mysql>SELECT JSON_EXTRACT('[10, 20, [30, 40]]', '$[1]');
+--------------------------------------------+ | JSON_EXTRACT('[10, 20, [30, 40]]', '$[1]') | +--------------------------------------------+ | 20 | +--------------------------------------------+ mysql>SELECT JSON_EXTRACT('[10, 20, [30, 40]]', '$[1]', '$[0]');
+----------------------------------------------------+ | JSON_EXTRACT('[10, 20, [30, 40]]', '$[1]', '$[0]') | +----------------------------------------------------+ | [20, 10] | +----------------------------------------------------+ mysql>SELECT JSON_EXTRACT('[10, 20, [30, 40]]', '$[2][*]');
+-----------------------------------------------+ | JSON_EXTRACT('[10, 20, [30, 40]]', '$[2][*]') | +-----------------------------------------------+ | [30, 40] | +-----------------------------------------------+
MySQL supports the
->
operator as shorthand for this function as used with 2
arguments where the left hand side is a
JSON
column identifier (not an
expression) and the right hand side is the JSON path to be
matched within the column.
The
->
operator serves as an alias for the
JSON_EXTRACT()
function when
used with two arguments, a column identifier on the left and a
JSON path on the right that is evaluated against the JSON
document (the column value). You can use such expressions in
place of column identifiers wherever they occur in SQL
statements.
The two SELECT
statements shown
here produce the same output:
mysql>SELECT c, JSON_EXTRACT(c, "$.id"), g
>FROM jemp
>WHERE JSON_EXTRACT(c, "$.id") > 1
>ORDER BY JSON_EXTRACT(c, "$.name");
+-------------------------------+-----------+------+ | c | c->"$.id" | g | +-------------------------------+-----------+------+ | {"id": "3", "name": "Barney"} | "3" | 3 | | {"id": "4", "name": "Betty"} | "4" | 4 | | {"id": "2", "name": "Wilma"} | "2" | 2 | +-------------------------------+-----------+------+ 3 rows in set (0.00 sec) mysql>SELECT c, c->"$.id", g
>FROM jemp
>WHERE c->"$.id" > 1
>ORDER BY c->"$.name";
+-------------------------------+-----------+------+ | c | c->"$.id" | g | +-------------------------------+-----------+------+ | {"id": "3", "name": "Barney"} | "3" | 3 | | {"id": "4", "name": "Betty"} | "4" | 4 | | {"id": "2", "name": "Wilma"} | "2" | 2 | +-------------------------------+-----------+------+ 3 rows in set (0.00 sec)
This functionality is not limited to
SELECT
, as shown here:
mysql>ALTER TABLE jemp ADD COLUMN n INT;
Query OK, 0 rows affected (0.68 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql>UPDATE jemp SET n=1 WHERE c->"$.id" = "4";
Query OK, 1 row affected (0.04 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql>SELECT c, c->"$.id", g, n
>FROM jemp
>WHERE JSON_EXTRACT(c, "$.id") > 1
>ORDER BY c->"$.name";
+-------------------------------+-----------+------+------+ | c | c->"$.id" | g | n | +-------------------------------+-----------+------+------+ | {"id": "3", "name": "Barney"} | "3" | 3 | NULL | | {"id": "4", "name": "Betty"} | "4" | 4 | 1 | | {"id": "2", "name": "Wilma"} | "2" | 2 | NULL | +-------------------------------+-----------+------+------+ 3 rows in set (0.00 sec) mysql>DELETE FROM jemp WHERE c->"$.id" = "4";
Query OK, 1 row affected (0.04 sec) mysql>SELECT c, c->"$.id", g, n
>FROM jemp
>WHERE JSON_EXTRACT(c, "$.id") > 1
>ORDER BY c->"$.name";
+-------------------------------+-----------+------+------+ | c | c->"$.id" | g | n | +-------------------------------+-----------+------+------+ | {"id": "3", "name": "Barney"} | "3" | 3 | NULL | | {"id": "2", "name": "Wilma"} | "2" | 2 | NULL | +-------------------------------+-----------+------+------+ 2 rows in set (0.00 sec)
(See Indexing a Generated Column to Provide a JSON Column Index, for the statements used to create and populate the table just shown.)
This also works with JSON array values, as shown here:
mysql>CREATE TABLE tj10 (a JSON, b INT);
Query OK, 0 rows affected (0.26 sec) mysql>INSERT INTO tj10
>VALUES ("[3,10,5,17,44]", 33), ("[3,10,5,17,[22,44,66]]", 0);
Query OK, 1 row affected (0.04 sec) mysql>SELECT a->"$[4]" FROM tj10;
+--------------+ | a->"$[4]" | +--------------+ | 44 | | [22, 44, 66] | +--------------+ 2 rows in set (0.00 sec) mysql>SELECT * FROM tj10 WHERE a->"$[0]" = 3;
+------------------------------+------+ | a | b | +------------------------------+------+ | [3, 10, 5, 17, 44] | 33 | | [3, 10, 5, 17, [22, 44, 66]] | 0 | +------------------------------+------+ 2 rows in set (0.00 sec)
Nested arrays are supported. An expression using
->
evaluates as NULL
if no matching key is found in the target JSON document, as
shown here:
mysql>SELECT * FROM tj10 WHERE a->"$[4][1]" IS NOT NULL;
+------------------------------+------+ | a | b | +------------------------------+------+ | [3, 10, 5, 17, [22, 44, 66]] | 0 | +------------------------------+------+ mysql>SELECT a->"$[4][1]" FROM tj10;
+--------------+ | a->"$[4][1]" | +--------------+ | NULL | | 44 | +--------------+ 2 rows in set (0.00 sec)
This is the same behavior as seen in such cases when using
JSON_EXTRACT()
:
mysql> SELECT JSON_EXTRACT(a, "$[4][1]") FROM tj10;
+----------------------------+
| JSON_EXTRACT(a, "$[4][1]") |
+----------------------------+
| NULL |
| 44 |
+----------------------------+
2 rows in set (0.00 sec)
This is an improved, unquoting extraction operator. Whereas
the ->
operator simply extracts a value,
the ->>
operator in addition unquotes
the extracted result. In other words, given a
JSON
column value
column
and a path expression
path
, the following three
expressions return the same value:
JSON_UNQUOTE(
column
->
path
)
column
->>path
The ->>
operator can be used wherever
JSON_UNQUOTE(JSON_EXTRACT())
would be
allowed. This includes (but is not limited to)
SELECT
lists, WHERE
and
HAVING
clauses, and ORDER
BY
and GROUP BY
clauses.
The next few statements demonstrate some
->>
operator equivalences with other
expressions in the mysql client:
mysql>SELECT * FROM jemp WHERE g > 2;
+-------------------------------+------+ | c | g | +-------------------------------+------+ | {"id": "3", "name": "Barney"} | 3 | | {"id": "4", "name": "Betty"} | 4 | +-------------------------------+------+ 2 rows in set (0.01 sec) mysql>SELECT c->'$.name' AS name
->FROM jemp WHERE g > 2;
+----------+ | name | +----------+ | "Barney" | | "Betty" | +----------+ 2 rows in set (0.00 sec) mysql>SELECT JSON_UNQUOTE(c->'$.name') AS name
->FROM jemp WHERE g > 2;
+--------+ | name | +--------+ | Barney | | Betty | +--------+ 2 rows in set (0.00 sec) mysql>SELECT c->>'$.name' AS name
->FROM jemp WHERE g > 2;
+--------+ | name | +--------+ | Barney | | Betty | +--------+ 2 rows in set (0.00 sec)
See Indexing a Generated Column to Provide a JSON Column Index, for the SQL
statements used to create and populate the
jemp
table in the set of examples just
shown.
This operator can also be used with JSON arrays, as shown here:
mysql>CREATE TABLE tj10 (a JSON, b INT);
Query OK, 0 rows affected (0.26 sec) mysql>INSERT INTO tj10 VALUES
->('[3,10,5,"x",44]', 33),
->('[3,10,5,17,[22,"y",66]]', 0);
Query OK, 2 rows affected (0.04 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql>SELECT a->"$[3]", a->"$[4][1]" FROM tj10;
+-----------+--------------+ | a->"$[3]" | a->"$[4][1]" | +-----------+--------------+ | "x" | NULL | | 17 | "y" | +-----------+--------------+ 2 rows in set (0.00 sec) mysql>SELECT a->>"$[3]", a->>"$[4][1]" FROM tj10;
+------------+---------------+ | a->>"$[3]" | a->>"$[4][1]" | +------------+---------------+ | x | NULL | | 17 | y | +------------+---------------+ 2 rows in set (0.00 sec)
As with
->
,
the ->>
operator is always expanded
in the output of EXPLAIN
, as
the following example demonstrates:
mysql>EXPLAIN SELECT c->>'$.name' AS name
->FROM jemp WHERE g > 2\G
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: jemp partitions: NULL type: range possible_keys: i key: i key_len: 5 ref: NULL rows: 2 filtered: 100.00 Extra: Using where 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS\G
*************************** 1. row *************************** Level: Note Code: 1003 Message: /* select#1 */ select json_unquote(json_extract(`jtest`.`jemp`.`c`,'$.name')) AS `name` from `jtest`.`jemp` where (`jtest`.`jemp`.`g` > 2) 1 row in set (0.00 sec)
This is similar to how MySQL expands the
->
operator in the same circumstances.
Returns the keys from the top-level value of a JSON object as
a JSON array, or, if a path
argument is given, the top-level keys from the selected path.
Returns NULL
if any argument is
NULL
, the
json_doc
argument is not an object,
or path
, if given, does not locate
an object. An error occurs if the
json_doc
argument is not a valid
JSON document or the path
argument
is not a valid path expression or contains a
*
or **
wildcard.
The result array is empty if the selected object is empty. If the top-level value has nested subobjects, the return value does not include keys from those subobjects.
mysql>SELECT JSON_KEYS('{"a": 1, "b": {"c": 30}}');
+---------------------------------------+ | JSON_KEYS('{"a": 1, "b": {"c": 30}}') | +---------------------------------------+ | ["a", "b"] | +---------------------------------------+ mysql>SELECT JSON_KEYS('{"a": 1, "b": {"c": 30}}', '$.b');
+----------------------------------------------+ | JSON_KEYS('{"a": 1, "b": {"c": 30}}', '$.b') | +----------------------------------------------+ | ["c"] | +----------------------------------------------+
JSON_SEARCH(
json_doc
,
one_or_all
,
search_str
[,
escape_char
[,
path
] ...])
Returns the path to the given string within a JSON document.
Returns NULL
if any of the
json_doc
,
search_str
, or
path
arguments are
NULL
; no path
exists within the document; or
search_str
is not found. An error
occurs if the json_doc
argument is
not a valid JSON document, any path
argument is not a valid path expression,
one_or_all
is not
'one'
or 'all'
, or
escape_char
is not a constant
expression.
The one_or_all
argument affects the
search as follows:
'one'
: The search terminates after the
first match and returns one path string. It is undefined
which match is considered first.
'all'
: The search returns all matching
path strings such that no duplicate paths are included. If
there are multiple strings, they are autowrapped as an
array. The order of the array elements is undefined.
Within the search_str
search string
argument, the %
and _
characters work as for the LIKE
operator: %
matches any number of
characters (including zero characters), and
_
matches exactly one character.
To specify a literal %
or
_
character in the search string, precede
it by the escape character. The default is
\
if the
escape_char
argument is missing or
NULL
. Otherwise,
escape_char
must be a constant that
is empty or one character.
For more information about matching and escape character
behavior, see the description of
LIKE
in
Section 12.5.1, “String Comparison Functions”. For escape
character handling, a difference from the
LIKE
behavior is that the escape
character for JSON_SEARCH()
must evaluate to a constant at compile time, not just at
execution time. For example, if
JSON_SEARCH()
is used in a
prepared statement and the
escape_char
argument is supplied
using a ?
parameter, the parameter value
might be constant at execution time, but is not at compile
time.
mysql>SET @j = '["abc", [{"k": "10"}, "def"], {"x":"abc"}, {"y":"bcd"}]';
mysql>SELECT JSON_SEARCH(@j, 'one', 'abc');
+-------------------------------+ | JSON_SEARCH(@j, 'one', 'abc') | +-------------------------------+ | "$[0]" | +-------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', 'abc');
+-------------------------------+ | JSON_SEARCH(@j, 'all', 'abc') | +-------------------------------+ | ["$[0]", "$[2].x"] | +-------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', 'ghi');
+-------------------------------+ | JSON_SEARCH(@j, 'all', 'ghi') | +-------------------------------+ | NULL | +-------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '10');
+------------------------------+ | JSON_SEARCH(@j, 'all', '10') | +------------------------------+ | "$[1][0].k" | +------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '10', NULL, '$');
+-----------------------------------------+ | JSON_SEARCH(@j, 'all', '10', NULL, '$') | +-----------------------------------------+ | "$[1][0].k" | +-----------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '10', NULL, '$[*]');
+--------------------------------------------+ | JSON_SEARCH(@j, 'all', '10', NULL, '$[*]') | +--------------------------------------------+ | "$[1][0].k" | +--------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '10', NULL, '$**.k');
+---------------------------------------------+ | JSON_SEARCH(@j, 'all', '10', NULL, '$**.k') | +---------------------------------------------+ | "$[1][0].k" | +---------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '10', NULL, '$[*][0].k');
+-------------------------------------------------+ | JSON_SEARCH(@j, 'all', '10', NULL, '$[*][0].k') | +-------------------------------------------------+ | "$[1][0].k" | +-------------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '10', NULL, '$[1]');
+--------------------------------------------+ | JSON_SEARCH(@j, 'all', '10', NULL, '$[1]') | +--------------------------------------------+ | "$[1][0].k" | +--------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '10', NULL, '$[1][0]');
+-----------------------------------------------+ | JSON_SEARCH(@j, 'all', '10', NULL, '$[1][0]') | +-----------------------------------------------+ | "$[1][0].k" | +-----------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', 'abc', NULL, '$[2]');
+---------------------------------------------+ | JSON_SEARCH(@j, 'all', 'abc', NULL, '$[2]') | +---------------------------------------------+ | "$[2].x" | +---------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '%a%');
+-------------------------------+ | JSON_SEARCH(@j, 'all', '%a%') | +-------------------------------+ | ["$[0]", "$[2].x"] | +-------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '%b%');
+-------------------------------+ | JSON_SEARCH(@j, 'all', '%b%') | +-------------------------------+ | ["$[0]", "$[2].x", "$[3].y"] | +-------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '%b%', NULL, '$[0]');
+---------------------------------------------+ | JSON_SEARCH(@j, 'all', '%b%', NULL, '$[0]') | +---------------------------------------------+ | "$[0]" | +---------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '%b%', NULL, '$[2]');
+---------------------------------------------+ | JSON_SEARCH(@j, 'all', '%b%', NULL, '$[2]') | +---------------------------------------------+ | "$[2].x" | +---------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '%b%', NULL, '$[1]');
+---------------------------------------------+ | JSON_SEARCH(@j, 'all', '%b%', NULL, '$[1]') | +---------------------------------------------+ | NULL | +---------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '%b%', '', '$[1]');
+-------------------------------------------+ | JSON_SEARCH(@j, 'all', '%b%', '', '$[1]') | +-------------------------------------------+ | NULL | +-------------------------------------------+ mysql>SELECT JSON_SEARCH(@j, 'all', '%b%', '', '$[3]');
+-------------------------------------------+ | JSON_SEARCH(@j, 'all', '%b%', '', '$[3]') | +-------------------------------------------+ | "$[3].y" | +-------------------------------------------+
For more information about the JSON path syntax supported by
MySQL, including rules governing the wildcard operators
*
and **
, see
JSON Path Syntax.
The functions in this section modify JSON values and return the result.
JSON_ARRAY_APPEND(
json_doc
,
path
,
val
[,
path
,
val
] ...)
Appends values to the end of the indicated arrays within a
JSON document and returns the result. Returns
NULL
if any argument is
NULL
. An error occurs if the
json_doc
argument is not a valid
JSON document or any path
argument
is not a valid path expression or contains a
*
or **
wildcard.
The path-value pairs are evaluated left to right. The document produced by evaluating one pair becomes the new value against which the next pair is evaluated.
If a path selects a scalar or object value, that value is autowrapped within an array and the new value is added to that array. Pairs for which the path does not identify any value in the JSON document are ignored.
mysql>SET @j = '["a", ["b", "c"], "d"]';
mysql>SELECT JSON_ARRAY_APPEND(@j, '$[1]', 1);
+----------------------------------+ | JSON_ARRAY_APPEND(@j, '$[1]', 1) | +----------------------------------+ | ["a", ["b", "c", 1], "d"] | +----------------------------------+ mysql>SELECT JSON_ARRAY_APPEND(@j, '$[0]', 2);
+----------------------------------+ | JSON_ARRAY_APPEND(@j, '$[0]', 2) | +----------------------------------+ | [["a", 2], ["b", "c"], "d"] | +----------------------------------+ mysql>SELECT JSON_ARRAY_APPEND(@j, '$[1][0]', 3);
+-------------------------------------+ | JSON_ARRAY_APPEND(@j, '$[1][0]', 3) | +-------------------------------------+ | ["a", [["b", 3], "c"], "d"] | +-------------------------------------+ mysql>SET @j = '{"a": 1, "b": [2, 3], "c": 4}';
mysql>SELECT JSON_ARRAY_APPEND(@j, '$.b', 'x');
+------------------------------------+ | JSON_ARRAY_APPEND(@j, '$.b', 'x') | +------------------------------------+ | {"a": 1, "b": [2, 3, "x"], "c": 4} | +------------------------------------+ mysql>SELECT JSON_ARRAY_APPEND(@j, '$.c', 'y');
+--------------------------------------+ | JSON_ARRAY_APPEND(@j, '$.c', 'y') | +--------------------------------------+ | {"a": 1, "b": [2, 3], "c": [4, "y"]} | +--------------------------------------+ mysql>SET @j = '{"a": 1}';
mysql>SELECT JSON_ARRAY_APPEND(@j, '$', 'z');
+---------------------------------+ | JSON_ARRAY_APPEND(@j, '$', 'z') | +---------------------------------+ | [{"a": 1}, "z"] | +---------------------------------+
In MySQL 5.7, this function was named
JSON_APPEND()
. That name is no longer
supported in MySQL 8.0.
JSON_ARRAY_INSERT(
json_doc
,
path
,
val
[,
path
,
val
] ...)
Updates a JSON document, inserting into an array within the
document and returning the modified document. Returns
NULL
if any argument is
NULL
. An error occurs if the
json_doc
argument is not a valid
JSON document or any path
argument
is not a valid path expression or contains a
*
or **
wildcard or does
not end with an array element identifier.
The path-value pairs are evaluated left to right. The document produced by evaluating one pair becomes the new value against which the next pair is evaluated.
Pairs for which the path does not identify any array in the JSON document are ignored. If a path identifies an array element, the corresponding value is inserted at that element position, shifting any following values to the right. If a path identifies an array position past the end of an array, the value is inserted at the end of the array.
mysql>SET @j = '["a", {"b": [1, 2]}, [3, 4]]';
mysql>SELECT JSON_ARRAY_INSERT(@j, '$[1]', 'x');
+------------------------------------+ | JSON_ARRAY_INSERT(@j, '$[1]', 'x') | +------------------------------------+ | ["a", "x", {"b": [1, 2]}, [3, 4]] | +------------------------------------+ mysql>SELECT JSON_ARRAY_INSERT(@j, '$[100]', 'x');
+--------------------------------------+ | JSON_ARRAY_INSERT(@j, '$[100]', 'x') | +--------------------------------------+ | ["a", {"b": [1, 2]}, [3, 4], "x"] | +--------------------------------------+ mysql>SELECT JSON_ARRAY_INSERT(@j, '$[1].b[0]', 'x');
+-----------------------------------------+ | JSON_ARRAY_INSERT(@j, '$[1].b[0]', 'x') | +-----------------------------------------+ | ["a", {"b": ["x", 1, 2]}, [3, 4]] | +-----------------------------------------+ mysql>SELECT JSON_ARRAY_INSERT(@j, '$[2][1]', 'y');
+---------------------------------------+ | JSON_ARRAY_INSERT(@j, '$[2][1]', 'y') | +---------------------------------------+ | ["a", {"b": [1, 2]}, [3, "y", 4]] | +---------------------------------------+ mysql>SELECT JSON_ARRAY_INSERT(@j, '$[0]', 'x', '$[2][1]', 'y');
+----------------------------------------------------+ | JSON_ARRAY_INSERT(@j, '$[0]', 'x', '$[2][1]', 'y') | +----------------------------------------------------+ | ["x", "a", {"b": [1, 2]}, [3, 4]] | +----------------------------------------------------+
Earlier modifications affect the positions of the following
elements in the array, so subsequent paths in the same
JSON_ARRAY_INSERT()
call should
take this into account. In the final example, the second path
inserts nothing because the path no longer matches anything
after the first insert.
JSON_INSERT(
json_doc
,
path
,
val
[,
path
,
val
] ...)
Inserts data into a JSON document and returns the result.
Returns NULL
if any argument is
NULL
. An error occurs if the
json_doc
argument is not a valid
JSON document or any path
argument
is not a valid path expression or contains a
*
or **
wildcard.
The path-value pairs are evaluated left to right. The document produced by evaluating one pair becomes the new value against which the next pair is evaluated.
A path-value pair for an existing path in the document is ignored and does not overwrite the existing document value. A path-value pair for a nonexisting path in the document adds the value to the document if the path identifies one of these types of values:
A member not present in an existing object. The member is added to the object and associated with the new value.
A position past the end of an existing array. The array is extended with the new value. If the existing value is not an array, it is autowrapped as an array, then extended with the new value.
Otherwise, a path-value pair for a nonexisting path in the document is ignored and has no effect.
For a comparison of
JSON_INSERT()
,
JSON_REPLACE()
, and
JSON_SET()
, see the discussion
of JSON_SET()
.
mysql>SET @j = '{ "a": 1, "b": [2, 3]}';
mysql>SELECT JSON_INSERT(@j, '$.a', 10, '$.c', '[true, false]');
+----------------------------------------------------+ | JSON_INSERT(@j, '$.a', 10, '$.c', '[true, false]') | +----------------------------------------------------+ | {"a": 1, "b": [2, 3], "c": "[true, false]"} | +----------------------------------------------------+
The third and final value listed in the result is a quoted string and not an array like the second one (which is not quoted in the output); no casting of values to the JSON type is performed. To insert the array as an array, you must perform such casts explicitly, as shown here:
mysql> SELECT JSON_INSERT(@j, '$.a', 10, '$.c', CAST('[true, false]' AS JSON));
+------------------------------------------------------------------+
| JSON_INSERT(@j, '$.a', 10, '$.c', CAST('[true, false]' AS JSON)) |
+------------------------------------------------------------------+
| {"a": 1, "b": [2, 3], "c": [true, false]} |
+------------------------------------------------------------------+
1 row in set (0.00 sec)
JSON_MERGE(
json_doc
,
json_doc
[,
json_doc
] ...)
Merges two or more JSON documents. Synonym for
JSON_MERGE_PRESERVE()
; deprecated in MySQL
8.0.3 and subject to removal in a future release.
mysql>SELECT JSON_MERGE('[1, 2]', '[true, false]');
+---------------------------------------+ | JSON_MERGE('[1, 2]', '[true, false]') | +---------------------------------------+ | [1, 2, true, false] | +---------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNINGS\G
*************************** 1. row *************************** Level: Warning Code: 1287 Message: 'JSON_MERGE' is deprecated and will be removed in a future release. \ Please use JSON_MERGE_PRESERVE/JSON_MERGE_PATCH instead 1 row in set (0.00 sec)
For additional examples, see the entry for
JSON_MERGE_PRESERVE()
.
JSON_MERGE_PATCH(
json_doc
,
json_doc
[,
json_doc
] ...)
Performs an RFC 7396 compliant merge of two or more JSON documents and returns the merged result, without preserving members having duplicate keys. Raises an error if at least one of the documents passed as arguments to this function is not valid.
For an explanation and example of the differences between
this function and JSON_MERGE_PRESERVE()
,
see
JSON_MERGE_PATCH() compared with JSON_MERGE_PRESERVE().
JSON_MERGE_PATCH()
performs a merge as
follows:
If the first argument is not an object, the result of the merge is the same as if an empty object had been merged with the second argument.
If the second argument is not an object, the result of the merge is the second argument.
If both arguments are objects, the result of the merge is an object with the following members:
All members of the first object which do not have a corresponding member with the same key in the second object.
All members of the second object which do not have a
corresponding key in the first object, and whose value
is not the JSON null
literal.
All members with a key that exists in both the first
and the second object, and whose value in the second
object is not the JSON null
literal. The values of these members are the results
of recursively merging the value in the first object
with the value in the second object.
For additional information, see Normalization, Merging, and Autowrapping of JSON Values.
mysql>SELECT JSON_MERGE_PATCH('[1, 2]', '[true, false]');
+---------------------------------------------+ | JSON_MERGE_PATCH('[1, 2]', '[true, false]') | +---------------------------------------------+ | [true, false] | +---------------------------------------------+ mysql>SELECT JSON_MERGE_PATCH('{"name": "x"}', '{"id": 47}');
+-------------------------------------------------+ | JSON_MERGE_PATCH('{"name": "x"}', '{"id": 47}') | +-------------------------------------------------+ | {"id": 47, "name": "x"} | +-------------------------------------------------+ mysql>SELECT JSON_MERGE_PATCH('1', 'true');
+-------------------------------+ | JSON_MERGE_PATCH('1', 'true') | +-------------------------------+ | true | +-------------------------------+ mysql>SELECT JSON_MERGE_PATCH('[1, 2]', '{"id": 47}');
+------------------------------------------+ | JSON_MERGE_PATCH('[1, 2]', '{"id": 47}') | +------------------------------------------+ | {"id": 47} | +------------------------------------------+ mysql>SELECT JSON_MERGE_PATCH('{ "a": 1, "b":2 }',
>'{ "a": 3, "c":4 }');
+-----------------------------------------------------------+ | JSON_MERGE_PATCH('{ "a": 1, "b":2 }','{ "a": 3, "c":4 }') | +-----------------------------------------------------------+ | {"a": 3, "b": 2, "c": 4} | +-----------------------------------------------------------+ mysql>SELECT JSON_MERGE_PATCH('{ "a": 1, "b":2 }','{ "a": 3, "c":4 }',
>'{ "a": 5, "d":6 }');
+-------------------------------------------------------------------------------+ | JSON_MERGE_PATCH('{ "a": 1, "b":2 }','{ "a": 3, "c":4 }','{ "a": 5, "d":6 }') | +-------------------------------------------------------------------------------+ | {"a": 5, "b": 2, "c": 4, "d": 6} | +-------------------------------------------------------------------------------+
You can use this function to remove a member by specifying
null
as the value of the same member in the
seond argument, as shown here:
mysql> SELECT JSON_MERGE_PATCH('{"a":1, "b":2}', '{"b":null}');
+--------------------------------------------------+
| JSON_MERGE_PATCH('{"a":1, "b":2}', '{"b":null}') |
+--------------------------------------------------+
| {"a": 1} |
+--------------------------------------------------+
This example shows that the function operates in a recursive fashion; that is, values of members are not limited to scalars, but rather can themselves be JSON documents:
mysql> SELECT JSON_MERGE_PATCH('{"a":{"x":1}}', '{"a":{"y":2}}');
+----------------------------------------------------+
| JSON_MERGE_PATCH('{"a":{"x":1}}', '{"a":{"y":2}}') |
+----------------------------------------------------+
| {"a": {"x": 1, "y": 2}} |
+----------------------------------------------------+
JSON_MERGE_PATCH()
is supported in MySQL
8.0.3 and later.
JSON_MERGE_PATCH() compared with JSON_MERGE_PRESERVE().
The behavior of JSON_MERGE_PATCH()
is the
same as that of
JSON_MERGE_PRESERVE()
, with
the following two exceptions:
JSON_MERGE_PATCH()
removes any member
in the first object with a matching key in the second
object, provided that the value associated with the key in
the second object is not JSON null
.
If the second object has a member with a key matching a
member in the first object,
JSON_MERGE_PATCH()
replaces the value in the first
object with the value in the second object, whereas
JSON_MERGE_PRESERVE()
appends the second value to the first
value.
This example compares the results of merging the same 3 JSON
objects, each having a matching key "a"
,
with each of these two functions:
mysql>SET @x = '{ "a": 1, "b": 2 }',
>@y = '{ "a": 3, "c": 4 }',
>@z = '{ "a": 5, "d": 6 }';
mysql>SELECT JSON_MERGE_PATCH(@x, @y, @z) AS Patch,
->JSON_MERGE_PRESERVE(@x, @y, @z) AS Preserve\G
*************************** 1. row *************************** Patch: {"a": 5, "b": 2, "c": 4, "d": 6} Preserve: {"a": [1, 3, 5], "b": 2, "c": 4, "d": 6}
JSON_MERGE_PRESERVE(
json_doc
,
json_doc
[,
json_doc
] ...)
Merges two or more JSON documents and returns the merged
result. Returns NULL
if any argument is
NULL
. An error occurs if any argument is
not a valid JSON document.
Merging takes place according to the following rules. For additional information, see Normalization, Merging, and Autowrapping of JSON Values.
Adjacent arrays are merged to a single array.
Adjacent objects are merged to a single object.
A scalar value is autowrapped as an array and merged as an array.
An adjacent array and object are merged by autowrapping the object as an array and merging the two arrays.
mysql>SELECT JSON_MERGE_PRESERVE('[1, 2]', '[true, false]');
+------------------------------------------------+ | JSON_MERGE_PRESERVE('[1, 2]', '[true, false]') | +------------------------------------------------+ | [1, 2, true, false] | +------------------------------------------------+ mysql>SELECT JSON_MERGE_PRESERVE('{"name": "x"}', '{"id": 47}');
+----------------------------------------------------+ | JSON_MERGE_PRESERVE('{"name": "x"}', '{"id": 47}') | +----------------------------------------------------+ | {"id": 47, "name": "x"} | +----------------------------------------------------+ mysql>SELECT JSON_MERGE_PRESERVE('1', 'true');
+----------------------------------+ | JSON_MERGE_PRESERVE('1', 'true') | +----------------------------------+ | [1, true] | +----------------------------------+ mysql>SELECT JSON_MERGE_PRESERVE('[1, 2]', '{"id": 47}');
+---------------------------------------------+ | JSON_MERGE_PRESERVE('[1, 2]', '{"id": 47}') | +---------------------------------------------+ | [1, 2, {"id": 47}] | +---------------------------------------------+ mysql>SELECT JSON_MERGE_PRESERVE('{ "a": 1, "b": 2 }',
>'{ "a": 3, "c": 4 }');
+--------------------------------------------------------------+ | JSON_MERGE_PRESERVE('{ "a": 1, "b": 2 }','{ "a": 3, "c":4 }') | +--------------------------------------------------------------+ | {"a": [1, 3], "b": 2, "c": 4} | +--------------------------------------------------------------+ mysql>SELECT JSON_MERGE_PRESERVE('{ "a": 1, "b": 2 }','{ "a": 3, "c": 4 }',
>'{ "a": 5, "d": 6 }');
+----------------------------------------------------------------------------------+ | JSON_MERGE_PRESERVE('{ "a": 1, "b": 2 }','{ "a": 3, "c": 4 }','{ "a": 5, "d": 6 }') | +----------------------------------------------------------------------------------+ | {"a": [1, 3, 5], "b": 2, "c": 4, "d": 6} | +----------------------------------------------------------------------------------+
This function was added in MySQL 8.0.3 as a synonym for
JSON_MERGE()
. The
JSON_MERGE()
function is now deprecated,
and is subject to removal in a future release of MySQL.
This function is similar to but differs from
JSON_MERGE_PATCH()
in
significant respects; see
JSON_MERGE_PATCH() compared with JSON_MERGE_PRESERVE(),
for more information.
JSON_REMOVE(
json_doc
,
path
[,
path
] ...)
Removes data from a JSON document and returns the result.
Returns NULL
if any argument is
NULL
. An error occurs if the
json_doc
argument is not a valid
JSON document or any path
argument
is not a valid path expression or is $
or
contains a *
or **
wildcard.
The path
arguments are evaluated
left to right. The document produced by evaluating one path
becomes the new value against which the next path is
evaluated.
It is not an error if the element to be removed does not exist in the document; in that case, the path does not affect the document.
mysql>SET @j = '["a", ["b", "c"], "d"]';
mysql>SELECT JSON_REMOVE(@j, '$[1]');
+-------------------------+ | JSON_REMOVE(@j, '$[1]') | +-------------------------+ | ["a", "d"] | +-------------------------+
JSON_REPLACE(
json_doc
,
path
,
val
[,
path
,
val
] ...)
Replaces existing values in a JSON document and returns the
result. Returns NULL
if any argument is
NULL
. An error occurs if the
json_doc
argument is not a valid
JSON document or any path
argument
is not a valid path expression or contains a
*
or **
wildcard.
The path-value pairs are evaluated left to right. The document produced by evaluating one pair becomes the new value against which the next pair is evaluated.
A path-value pair for an existing path in the document overwrites the existing document value with the new value. A path-value pair for a nonexisting path in the document is ignored and has no effect.
In MySQL 8.0.4, the optimizer can perform a partial, in-place
update of a JSON
column instead of removing
the old document and writing the new document in its entirety
to the column. This optimization can be performed for an
update statement that uses the
JSON_REPLACE()
function and
meets the conditions outlined in
Partial Updates of JSON Values.
For a comparison of
JSON_INSERT()
,
JSON_REPLACE()
, and
JSON_SET()
, see the discussion
of JSON_SET()
.
mysql>SET @j = '{ "a": 1, "b": [2, 3]}';
mysql>SELECT JSON_REPLACE(@j, '$.a', 10, '$.c', '[true, false]');
+-----------------------------------------------------+ | JSON_REPLACE(@j, '$.a', 10, '$.c', '[true, false]') | +-----------------------------------------------------+ | {"a": 10, "b": [2, 3]} | +-----------------------------------------------------+
JSON_SET(
json_doc
,
path
,
val
[,
path
,
val
] ...)
Inserts or updates data in a JSON document and returns the
result. Returns NULL
if any argument is
NULL
or path
, if
given, does not locate an object. An error occurs if the
json_doc
argument is not a valid
JSON document or any path
argument
is not a valid path expression or contains a
*
or **
wildcard.
The path-value pairs are evaluated left to right. The document produced by evaluating one pair becomes the new value against which the next pair is evaluated.
A path-value pair for an existing path in the document overwrites the existing document value with the new value. A path-value pair for a nonexisting path in the document adds the value to the document if the path identifies one of these types of values:
A member not present in an existing object. The member is added to the object and associated with the new value.
A position past the end of an existing array. The array is extended with the new value. If the existing value is not an array, it is autowrapped as an array, then extended with the new value.
Otherwise, a path-value pair for a nonexisting path in the document is ignored and has no effect.
In MySQL 8.0.4, the optimizer can perform a partial, in-place
update of a JSON
column instead of removing
the old document and writing the new document in its entirety
to the column. This optimization can be performed for an
update statement that uses the
JSON_SET()
function and meets
the conditions outlined in
Partial Updates of JSON Values.
The JSON_SET()
,
JSON_INSERT()
, and
JSON_REPLACE()
functions are
related:
JSON_SET()
replaces
existing values and adds nonexisting values.
JSON_INSERT()
inserts
values without replacing existing values.
JSON_REPLACE()
replaces
only existing values.
The following examples illustrate these differences, using one
path that does exist in the document ($.a
)
and another that does not exist ($.c
):
mysql>SET @j = '{ "a": 1, "b": [2, 3]}';
mysql>SELECT JSON_SET(@j, '$.a', 10, '$.c', '[true, false]');
+-------------------------------------------------+ | JSON_SET(@j, '$.a', 10, '$.c', '[true, false]') | +-------------------------------------------------+ | {"a": 10, "b": [2, 3], "c": "[true, false]"} | +-------------------------------------------------+ mysql>SELECT JSON_INSERT(@j, '$.a', 10, '$.c', '[true, false]');
+----------------------------------------------------+ | JSON_INSERT(@j, '$.a', 10, '$.c', '[true, false]') | +----------------------------------------------------+ | {"a": 1, "b": [2, 3], "c": "[true, false]"} | +----------------------------------------------------+ mysql>SELECT JSON_REPLACE(@j, '$.a', 10, '$.c', '[true, false]');
+-----------------------------------------------------+ | JSON_REPLACE(@j, '$.a', 10, '$.c', '[true, false]') | +-----------------------------------------------------+ | {"a": 10, "b": [2, 3]} | +-----------------------------------------------------+
Unquotes JSON value and returns the result as a
utf8mb4
string. Returns
NULL
if the argument is
NULL
. An error occurs if the value starts
and ends with double quotes but is not a valid JSON string
literal.
Within a string, certain sequences have special meaning unless
the NO_BACKSLASH_ESCAPES
SQL
mode is enabled. Each of these sequences begins with a
backslash (\
), known as the
escape character. MySQL recognizes the
escape sequences shown in
Table 12.22, “JSON_UNQUOTE() Special Character Escape Sequences”. For
all other escape sequences, backslash is ignored. That is, the
escaped character is interpreted as if it was not escaped. For
example, \x
is just x
.
These sequences are case-sensitive. For example,
\b
is interpreted as a backspace, but
\B
is interpreted as B
.
Two simple examples of the use of this function are shown here:
mysql>SET @j = '"abc"';
mysql>SELECT @j, JSON_UNQUOTE(@j);
+-------+------------------+ | @j | JSON_UNQUOTE(@j) | +-------+------------------+ | "abc" | abc | +-------+------------------+ mysql>SET @j = '[1, 2, 3]';
mysql>SELECT @j, JSON_UNQUOTE(@j);
+-----------+------------------+ | @j | JSON_UNQUOTE(@j) | +-----------+------------------+ | [1, 2, 3] | [1, 2, 3] | +-----------+------------------+
The following set of examples shows how
JSON_UNQUOTE
handles escapes with
NO_BACKSLASH_ESCAPES
disabled and enabled:
mysql>SELECT @@sql_mode;
+------------+ | @@sql_mode | +------------+ | | +------------+ mysql>SELECT JSON_UNQUOTE('"\\t\\u0032"');
+------------------------------+ | JSON_UNQUOTE('"\\t\\u0032"') | +------------------------------+ | 2 | +------------------------------+ mysql>SET @@sql_mode = 'NO_BACKSLASH_ESCAPES';
mysql>SELECT JSON_UNQUOTE('"\\t\\u0032"');
+------------------------------+ | JSON_UNQUOTE('"\\t\\u0032"') | +------------------------------+ | \t\u0032 | +------------------------------+ mysql>SELECT JSON_UNQUOTE('"\t\u0032"');
+----------------------------+ | JSON_UNQUOTE('"\t\u0032"') | +----------------------------+ | 2 | +----------------------------+
The functions in this section return attributes of JSON values.
Returns the maximum depth of a JSON document. Returns
NULL
if the argument is
NULL
. An error occurs if the argument is
not a valid JSON document.
An empty array, empty object, or scalar value has depth 1. A nonempty array containing only elements of depth 1 or nonempty object containing only member values of depth 1 has depth 2. Otherwise, a JSON document has depth greater than 2.
mysql>SELECT JSON_DEPTH('{}'), JSON_DEPTH('[]'), JSON_DEPTH('true');
+------------------+------------------+--------------------+ | JSON_DEPTH('{}') | JSON_DEPTH('[]') | JSON_DEPTH('true') | +------------------+------------------+--------------------+ | 1 | 1 | 1 | +------------------+------------------+--------------------+ mysql>SELECT JSON_DEPTH('[10, 20]'), JSON_DEPTH('[[], {}]');
+------------------------+------------------------+ | JSON_DEPTH('[10, 20]') | JSON_DEPTH('[[], {}]') | +------------------------+------------------------+ | 2 | 2 | +------------------------+------------------------+ mysql>SELECT JSON_DEPTH('[10, {"a": 20}]');
+-------------------------------+ | JSON_DEPTH('[10, {"a": 20}]') | +-------------------------------+ | 3 | +-------------------------------+
Returns the length of a JSON document, or, if a
path
argument is given, the length
of the value within the document identified by the path.
Returns NULL
if any argument is
NULL
or the path
argument does not identify a value in the document. An error
occurs if the json_doc
argument is
not a valid JSON document or the
path
argument is not a valid path
expression or contains a *
or
**
wildcard.
The length of a document is determined as follows:
The length of a scalar is 1.
The length of an array is the number of array elements.
The length of an object is the number of object members.
The length does not count the length of nested arrays or objects.
mysql>SELECT JSON_LENGTH('[1, 2, {"a": 3}]');
+---------------------------------+ | JSON_LENGTH('[1, 2, {"a": 3}]') | +---------------------------------+ | 3 | +---------------------------------+ mysql>SELECT JSON_LENGTH('{"a": 1, "b": {"c": 30}}');
+-----------------------------------------+ | JSON_LENGTH('{"a": 1, "b": {"c": 30}}') | +-----------------------------------------+ | 2 | +-----------------------------------------+ mysql>SELECT JSON_LENGTH('{"a": 1, "b": {"c": 30}}', '$.b');
+------------------------------------------------+ | JSON_LENGTH('{"a": 1, "b": {"c": 30}}', '$.b') | +------------------------------------------------+ | 1 | +------------------------------------------------+
Returns a utf8mb4
string indicating the
type of a JSON value. This can be an object, an array, or a
scalar type, as shown here:
mysql>SET @j = '{"a": [10, true]}';
mysql>SELECT JSON_TYPE(@j);
+---------------+ | JSON_TYPE(@j) | +---------------+ | OBJECT | +---------------+ mysql>SELECT JSON_TYPE(JSON_EXTRACT(@j, '$.a'));
+------------------------------------+ | JSON_TYPE(JSON_EXTRACT(@j, '$.a')) | +------------------------------------+ | ARRAY | +------------------------------------+ mysql>SELECT JSON_TYPE(JSON_EXTRACT(@j, '$.a[0]'));
+---------------------------------------+ | JSON_TYPE(JSON_EXTRACT(@j, '$.a[0]')) | +---------------------------------------+ | INTEGER | +---------------------------------------+ mysql>SELECT JSON_TYPE(JSON_EXTRACT(@j, '$.a[1]'));
+---------------------------------------+ | JSON_TYPE(JSON_EXTRACT(@j, '$.a[1]')) | +---------------------------------------+ | BOOLEAN | +---------------------------------------+
JSON_TYPE()
returns
NULL
if the argument is
NULL
:
mysql> SELECT JSON_TYPE(NULL);
+-----------------+
| JSON_TYPE(NULL) |
+-----------------+
| NULL |
+-----------------+
An error occurs if the argument is not a valid JSON value:
mysql> SELECT JSON_TYPE(1);
ERROR 3146 (22032): Invalid data type for JSON data in argument 1
to function json_type; a JSON string or JSON type is required.
For a non-NULL
, non-error result, the
following list describes the possible
JSON_TYPE()
return values:
Purely JSON types:
OBJECT
: JSON objects
ARRAY
: JSON arrays
BOOLEAN
: The JSON true and false
literals
NULL
: The JSON null literal
Numeric types:
Temporal types:
String types:
Binary types:
All other types:
OPAQUE
(raw bits)
Returns 0 or 1 to indicate whether a value is valid JSON.
Returns NULL
if the argument is
NULL
.
mysql>SELECT JSON_VALID('{"a": 1}');
+------------------------+ | JSON_VALID('{"a": 1}') | +------------------------+ | 1 | +------------------------+ mysql>SELECT JSON_VALID('hello'), JSON_VALID('"hello"');
+---------------------+-----------------------+ | JSON_VALID('hello') | JSON_VALID('"hello"') | +---------------------+-----------------------+ | 0 | 1 | +---------------------+-----------------------+
This section contains information about JSON functions that
convert JSON data to tabular data. In MySQL 8.0.4 and later, one
such function—JSON_TABLE()
—is
supported.
JSON_TABLE(
expr
,
path
COLUMNS
(column_list
) [AS]
alias
)
Extracts data from a JSON document and returns it as a relational table having the specified columns. The complete syntax for this function is shown here:
JSON_TABLE(expr
,path
COLUMNS (column_list
) ) [AS]alias
column_list
:column
[,column
][, ...]column
:name
FOR ORDINALITY |name
type
PATHstring path
[on_error
] [on_empty
] |name
type
EXISTS PATHstring path
| NESTED [PATH]path
COLUMNS (column_list
)on_error
: {NULL | ERROR | DEFAULTjson_string
} ON ERRORon_empty
: {NULL | ERROR | DEFAULTjson_string
} ON EMPTY
expr
: This is an expression that
returns JSON data. This can be a constant
('{"a":1}'
), a column
(t1.json_data
, given table
t1
specified prior to
JSON_TABLE()
in the FROM
clause), or a function call
(JSON_EXTRACT(t1,jsn_data,'$.post.comments')
).
path
: A JSON path expression, which
is applied to the data source. We refer to the JSON value
matching the path as the row source; this
is used to generate a row of relational data. The
COLUMNS
clause evaluates the row source,
finds specific JSON values within the row source, and returns
those JSON values as SQL values in individual columns of a row
of relational data.
The alias
is required. The usual
rules for table aliases apply (see
Section 9.2, “Schema Object Names”).
JSON_TABLE()
supports four types of
columns, described in the following list:
: This type enumerates rows in the
name
FOR
ORDINALITYCOLUMNS
clause; the column named
name
is a counter whose type is
UNSIGNED INT
, and whose initial value
is 1. This is equivalent to specifying a column as
AUTO_INCREMENT
in a
CREATE TABLE
statement, and
can be used to distinguish parent rows with the same value
for multiple rows generated by a NESTED
[PATH]
clause.
: Columns
of this type are used to extract values specified by
name
type
PATH
string_path
[on_error
]
[on_empty
]string_path
.
type
is a MySQL data type.
JSON_TABLE()
extracts data as JSON then
coerces it to the column type, using the regular automatic
type conversion applying to JSON data in MySQL. The exact
behavior depends on the column type: If the column type is
an SQL type, then only a scalar value can be saved in the
column. Saving an object or array triggers the
on error
clause; this also
occurs when an error takes place during coercion from the
value saved as JSON to the table column, such as trying to
save the string 'asd'
to an integer
column. A missing value triggers the
on_empty
clause.
The optional on_error
clause
determines what JSON_TABLE()
does when
saving an object or array:
NULL ON ERROR
: The column is set to
NULL
; this is the default behavior.
If an error occurs during type coercion, a warning is
thrown.
ERROR ON ERROR
: An error is thrown.
DEFAULT
: The
json
string
ON ERRORjson_string
is parsed as
JSON (provided that it is valid) and stored instead of
the object or array. A warning is thrown if the error
is caused by type coercion. Column type rules also
apply to the default value.
When a value saved to a column is truncated, such as
saving 3.14159 in a
DECIMAL(10,1)
column, a
warning is issued independently of any ON
ERROR
option. When multiple values are truncated
in a single statement, the warning is issued only once.
The optional on empty
clause
determines what JSON_TABLE()
does in
the event that data is missing (depending on type). This
clause is also triggered on a column in a NESTED
PATH
clause when the latter has no match and a
NULL
complemented row is produced for
it. on empty
takes one of the
following values:
NULL ON EMPTY
: The column is set to
NULL
; this is the default behavior.
ERROR ON EMPTY
: An error is thrown.
DEFAULT
: the provided
json_string
ON
EMPTYjson_string
is parsed as
JSON, as long as it is valid, and stored instead of
the missing value. Column type rules also apply to the
default value.
This query demonstrates the use of the ON
ERROR
and ON EMPTY
options.
The row corresponding to {"b":1}
is
empty for the path "$.a"
, and
attempting to save [1,2]
as a scalar
produces an error; these rows are highlighted in the
output shown.
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"a":"3"},{"a":2},{"b":1},{"a":0},{"a":[1,2]}]',
->"$[*]"
->COLUMNS(
->rowid FOR ORDINALITY,
->ac VARCHAR(100) PATH "$.a" DEFAULT '999' ON ERROR DEFAULT '111' ON EMPTY,
->aj JSON PATH "$.a" DEFAULT '{"x": 333}' ON EMPTY,
->bx INT EXISTS PATH "$.b"
->)
->) AS tt;
+-------+------+------------+------+ | rowid | ac | aj | bx | +-------+------+------------+------+ | 1 | 3 | "3" | 0 | | 2 | 2 | 2 | 0 | | 3 | 111 | {"x": 333} | 1 | | 4 | 0 | 0 | 0 | | 5 | 999 | [1, 2] | 0 | +-------+------+------------+------+ 5 rows in set (0.00 sec)
: This column
returns 1 if any data is present at the location specified
by name
type
EXISTS PATH
path
path
, and 0 otherwise.
type
can be any valid MySQL
data type, but should normally be specified as some
variety of INT
.
NESTED [PATH]
: This
flattens nested objects or arrays in JSON data into a
single row along with the JSON values from the parent
object or array. Using multiple path
COLUMNS
(column_list
)PATH
options allows projection of JSON values from multiple
levels of nesting into a single row.
The path
is relative to the
parent path row path of JSON_TABLE()
,
or the path of the parent NESTED [PATH]
clause in the event of nested paths.
Column names are subject to the usual rules and limitations governing table column names. See Section 9.2, “Schema Object Names”.
All JSON and JSON path expressions are checked for validity; an invalid expression of either type causes an error.
Each match for the path
preceding
the COLUMNS
keyword maps to an individual
row in the result table. For example, the following query
gives the result shown here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"x":2,"y":"8"},{"x":"3","y":"7"},{"x":"4","y":6}]',
->"$[*]" COLUMNS(
->xval VARCHAR(100) PATH "$.x",
->yval VARCHAR(100) PATH "$.y"
->)
->) AS jt1;
+------+------+ | xval | yval | +------+------+ | 2 | 8 | | 3 | 7 | | 4 | 6 | +------+------+
The expression "$[*]"
matches each element
of the array. You can filter the rows in the result by
modifying the path; for example, using
"$[1]"
limits extraction to the second
element of the JSON array used as the source, as shown here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"x":2,"y":"8"},{"x":"3","y":"7"},{"x":"4","y":6}]',
->"$[1]" COLUMNS(
->xval VARCHAR(100) PATH "$.x",
->yval VARCHAR(100) PATH "$.y"
->)
->) AS jt1;
+------+------+ | xval | yval | +------+------+ | 3 | 7 | +------+------+
Within a column definition, "$"
passes the
entire match to the column; "$.x"
and
"$.y"
pass only the values corresponding to
the keys x
and y
,
respectively, within that match. For more information, see
JSON Path Syntax.
NESTED PATH
(or simply
NESTED
; PATH
is
optional) produces a set of records for each match in the
COLUMNS
clause to which it belongs. If
there is no match, all columns of the nested path are set to
NULL
. This implements an outer join between
the topmost clause and NESTED [PATH]
. An
inner join can be emulated by applying a suitable condition in
the WHERE
clause, as shown here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[ {"a": 1, "b": [11,111]}, {"a": 2, "b": [22,222]}, {"a":3}]',
->'$[*]' COLUMNS(
->a INT PATH '$.a',
->NESTED PATH '$.b[*]' COLUMNS (b INT PATH '$')
->)
->) AS jt
->WHERE b IS NOT NULL;
+------+------+ | a | b | +------+------+ | 1 | 11 | | 1 | 111 | | 2 | 22 | | 2 | 222 | +------+------+
Sibling nested paths—that is, two or more instances of
NESTED [PATH]
in the same
COLUMNS
clause—are processed one
after another, one at a time. While one nested path is
producing records, columns of any sibling nested path
expressions are set to NULL
. This means
that the total number of records for a single match within a
single containing COLUMNS
clause is the sum
and not the product of all records produced by NESTED
[PATH]
modifiers, as shown here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"a": 1, "b": [11,111]}, {"a": 2, "b": [22,222]}]',
->'$[*]' COLUMNS(
->a INT PATH '$.a',
->NESTED PATH '$.b[*]' COLUMNS (b1 INT PATH '$'),
->NESTED PATH '$.b[*]' COLUMNS (b2 INT PATH '$')
->)
->) AS jt;
+------+------+------+ | a | b1 | b2 | +------+------+------+ | 1 | 11 | NULL | | 1 | 111 | NULL | | 1 | NULL | 11 | | 1 | NULL | 111 | | 2 | 22 | NULL | | 2 | 222 | NULL | | 2 | NULL | 22 | | 2 | NULL | 222 | +------+------+------+
A FOR ORDINALITY
column enumerates records
produced by the COLUMNS
clause, and can be
used to distinguish parent records of a nested path,
especially if values in parent records are the same, as can be
seen here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"a": "a_val",
'>"b": [{"c": "c_val", "l": [1,2]}]},
'>{"a": "a_val",
'>"b": [{"c": "c_val","l": [11]}, {"c": "c_val", "l": [22]}]}]',
->'$[*]' COLUMNS(
->top_ord FOR ORDINALITY,
->apath VARCHAR(10) PATH '$.a',
->NESTED PATH '$.b[*]' COLUMNS (
->bpath VARCHAR(10) PATH '$.c',
->ord FOR ORDINALITY,
->NESTED PATH '$.l[*]' COLUMNS (lpath varchar(10) PATH '$')
->)
->)
->) as jt;
+---------+---------+---------+------+-------+ | top_ord | apath | bpath | ord | lpath | +---------+---------+---------+------+-------+ | 1 | a_val | c_val | 1 | 1 | | 1 | a_val | c_val | 1 | 2 | | 2 | a_val | c_val | 1 | 11 | | 2 | a_val | c_val | 2 | 22 | +---------+---------+---------+------+-------+
The source document contains an array of two elements; each of
these elements produces two rows. The values of
apath
and bpath
are the
same over the entire result set; this means that they cannot
be used to determine whether lpath
values
came from the same or different parents. The value of the
ord
column remains the same as the set of
records having top_ord
equal to 1, so these
two values are from a single object. The remaining two values
are from different objects, since they have different values
in the ord
column.
This section documents utility functions that act on JSON values,
or strings that can be parsed as JSON values.
JSON_PRETTY()
prints out a JSON
value in a format that is easy to read.
JSON_STORAGE_SIZE()
and
JSON_STORAGE_FREE()
show,
respectively, the amount of storage space used by a given JSON
value and the amount of space remaining in a
JSON
column following a partial update.
Provides pretty-printing of JSON values similar to that
implemented in PHP and by other languages and database
systems. The value supplied must be a JSON value or a valid
string representation of a JSON value. Extraneous whitespaces
and newlines present in this value have no effect on the
output. For a NULL
value, the function
returns NULL
. If the value is not a JSON
document, or if it cannot cannot be parsed as one, the
function fails with an error.
Formatting of the output from this function adheres to the following rules:
Each array element or object member appears on a separate line, indented by one additional level as compared to its parent.
Each level of indentation adds two leading spaces.
A comma separating individual array elements or object members is printed before the newline that separates the two elements or members.
The key and the value of an object member are separated by
a colon followed by a space (':
').
An empty object or array is printed on a single line. No space is printed between the opening and closing brace.
Special characters in string scalars and key names are
escaped employing the same rules used by the
JSON_QUOTE()
function.
mysql>SELECT JSON_PRETTY('123'); # scalar
+--------------------+ | JSON_PRETTY('123') | +--------------------+ | 123 | +--------------------+ mysql>SELECT JSON_PRETTY("[1,3,5]"); # array
+------------------------+ | JSON_PRETTY("[1,3,5]") | +------------------------+ | [ 1, 3, 5 ] | +------------------------+ mysql>SELECT JSON_PRETTY('{"a":"10","b":"15","x":"25"}'); # object
+---------------------------------------------+ | JSON_PRETTY('{"a":"10","b":"15","x":"25"}') | +---------------------------------------------+ | { "a": "10", "b": "15", "x": "25" } | +---------------------------------------------+ mysql>SELECT JSON_PRETTY('["a",1,{"key1":
>"value1"},"5", "77" ,
>{"key2":["value3","valueX",
>"valueY"]},"j", "2" ]')\G # nested arrays and objects
*************************** 1. row *************************** JSON_PRETTY('["a",1,{"key1": "value1"},"5", "77" , {"key2":["value3","valuex", "valuey"]},"j", "2" ]'): [ "a", 1, { "key1": "value1" }, "5", "77", { "key2": [ "value3", "valuex", "valuey" ] }, "j", "2" ]
For a JSON
column value, this
function shows how much storage space was freed in its binary
representation after it was updated in place using
JSON_SET()
,
JSON_REPLACE()
, or
JSON_REMOVE()
. The argument can
also be a valid JSON document or a string which can be parsed
as one—either as a literal value or as the value of a
user variable—in which case the function returns 0. It
returns a positive, nonzero value if the argument is a
JSON
column value which has been updated as
described previously, such that its binary representation
takes up less space than it did prior to the update. For a
JSON
column which has been updated such
that its binary representation is the same as or larger than
before, or if the update was not able to take advantage of a
partial update, it returns 0; it returns
NULL
if the argument is
NULL
.
If json_val
is not
NULL
, and neither is a valid JSON document
nor can be successfully parsed as one, an error results.
In this example, we create a table containing a
JSON
column, then insert a row containing a
JSON object:
mysql>CREATE TABLE jtable (jcol JSON);
Query OK, 0 rows affected (0.38 sec) mysql>INSERT INTO jtable VALUES
->('{"a": 10, "b": "wxyz", "c": "[true, false]"}');
Query OK, 1 row affected (0.04 sec) mysql>SELECT * FROM jtable;
+----------------------------------------------+ | jcol | +----------------------------------------------+ | {"a": 10, "b": "wxyz", "c": "[true, false]"} | +----------------------------------------------+ 1 row in set (0.00 sec)
Now we update the column value using
JSON_SET()
such that a partial update can
be performed; in this case, we replace the value pointed to by
the c
key (the array [true,
false]
) with one that takes up less space (the
integer 1
):
mysql>UPDATE jtable
->SET jcol = JSON_SET(jcol, "$.a", 10, "$.b", "wxyz", "$.c", 1);
Query OK, 1 row affected (0.03 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql>SELECT * FROM jtable;
+--------------------------------+ | jcol | +--------------------------------+ | {"a": 10, "b": "wxyz", "c": 1} | +--------------------------------+ 1 row in set (0.00 sec) mysql>SELECT JSON_STORAGE_FREE(jcol) FROM jtable;
+-------------------------+ | JSON_STORAGE_FREE(jcol) | +-------------------------+ | 14 | +-------------------------+ 1 row in set (0.00 sec)
The effects of successive partial updates on this free space
are cumulative, as shown in this example using
JSON_SET()
to reduce the space taken up by
the value having key b
(and making no other
changes):
mysql>UPDATE jtable
->SET jcol = JSON_SET(jcol, "$.a", 10, "$.b", "wx", "$.c", 1);
Query OK, 1 row affected (0.03 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql>SELECT JSON_STORAGE_FREE(jcol) FROM jtable;
+-------------------------+ | JSON_STORAGE_FREE(jcol) | +-------------------------+ | 16 | +-------------------------+ 1 row in set (0.00 sec)
Updating the column without using
JSON_SET()
,
JSON_REPLACE()
, or
JSON_REMOVE()
means that the optimizer
cannot perform the update in place; in this case,
JSON_STORAGE_FREE()
returns 0, as shown
here:
mysql>UPDATE jtable SET jcol = '{"a": 10, "b": 1}';
Query OK, 1 row affected (0.05 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql>SELECT JSON_STORAGE_FREE(jcol) FROM jtable;
+-------------------------+ | JSON_STORAGE_FREE(jcol) | +-------------------------+ | 0 | +-------------------------+ 1 row in set (0.00 sec)
Partial updates of JSON documents can be performed only on
column values. For a user variable that stores a JSON value,
the value is always completely replaced, even when the update
is performed using JSON_SET()
:
mysql>SET @j = '{"a": 10, "b": "wxyz", "c": "[true, false]"}';
Query OK, 0 rows affected (0.00 sec) mysql>SET @j = JSON_SET(@j, '$.a', 10, '$.b', 'wxyz', '$.c', '1');
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @j, JSON_STORAGE_FREE(@j) AS Free;
+----------------------------------+------+ | @j | Free | +----------------------------------+------+ | {"a": 10, "b": "wxyz", "c": "1"} | 0 | +----------------------------------+------+ 1 row in set (0.00 sec)
For a JSON literal, this function always returns 0:
mysql> SELECT JSON_STORAGE_FREE('{"a": 10, "b": "wxyz", "c": "1"}') AS Free;
+------+
| Free |
+------+
| 0 |
+------+
1 row in set (0.00 sec)
This function returns the number of bytes used to store the
binary representation of a JSON document. When the argument is
a JSON
column, this is the space used to
store the JSON document as it was inserted into the column,
prior to any partial updates that may have been performed on
it afterwards. json_val
must be a
valid JSON document or a string which can be parsed as one. In
the case where it is string, the function returns the amount
of storage space in the JSON binary representation that is
created by parsing the string as JSON and converting it to
binary. It returns NULL
if the argument is
NULL
.
An error results when json_val
is
not NULL
, and is not—or cannot be
successfully parsed as—a JSON document.
To illustrate this function's behavior when used with a
JSON
column as its argument, we create a
table named jtable
containing a
JSON
column jcol
, insert
a JSON value into the table, then obtain the storage space
used by this column with
JSON_STORAGE_SIZE()
, as shown here:
mysql>CREATE TABLE jtable (jcol JSON);
Query OK, 0 rows affected (0.42 sec) mysql>INSERT INTO jtable VALUES
->('{"a": 1000, "b": "wxyz", "c": "[1, 3, 5, 7]"}');
Query OK, 1 row affected (0.04 sec) mysql>SELECT
->jcol,
->JSON_STORAGE_SIZE(jcol) AS Size,
->JSON_STORAGE_FREE(jcol) AS Free
->FROM jtable;
+-----------------------------------------------+------+------+ | jcol | Size | Free | +-----------------------------------------------+------+------+ | {"a": 1000, "b": "wxyz", "c": "[1, 3, 5, 7]"} | 47 | 0 | +-----------------------------------------------+------+------+ 1 row in set (0.00 sec)
According to the output of
JSON_STORAGE_SIZE()
, the JSON document
inserted into the column takes up 47 bytes. We also checked
the amount of space freed by any previous partial updates of
the column using
JSON_STORAGE_FREE()
; since no
updates have yet been performed, this is 0, as expected.
Next we perform an UPDATE
on
the table that should result in a partial update of the
document stored in jcol
, and then test the
result as shown here:
mysql>UPDATE jtable SET jcol =
->JSON_SET(jcol, "$.b", "a");
Query OK, 1 row affected (0.04 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql>SELECT
->jcol,
->JSON_STORAGE_SIZE(jcol) AS Size,
->JSON_STORAGE_FREE(jcol) AS Free
->FROM jtable;
+--------------------------------------------+------+------+ | jcol | Size | Free | +--------------------------------------------+------+------+ | {"a": 1000, "b": "a", "c": "[1, 3, 5, 7]"} | 47 | 3 | +--------------------------------------------+------+------+ 1 row in set (0.00 sec)
The value returned by JSON_STORAGE_FREE()
in the previous query indicates that a partial update of the
JSON document was performed, and that this freed 3 bytes of
space used to store it. The result returned by
JSON_STORAGE_SIZE()
is unchanged by the
partial update.
Partial updates are supported for updates using
JSON_SET()
,
JSON_REPLACE()
, or
JSON_REMOVE()
. The direct
assignment of a value to a JSON
column
cannot be partially updated; following such an update,
JSON_STORAGE_SIZE()
always shows the
storage used for the newly-set value:
mysql>UPDATE jtable
mysql>SET jcol = '{"a": 4.55, "b": "wxyz", "c": "[true, false]"}';
Query OK, 1 row affected (0.04 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql>SELECT
->jcol,
->JSON_STORAGE_SIZE(jcol) AS Size,
->JSON_STORAGE_FREE(jcol) AS Free
->FROM jtable;
+------------------------------------------------+------+------+ | jcol | Size | Free | +------------------------------------------------+------+------+ | {"a": 4.55, "b": "wxyz", "c": "[true, false]"} | 56 | 0 | +------------------------------------------------+------+------+ 1 row in set (0.00 sec)
A JSON user variable cannot be partially updated. This means that this function always shows the space currently used to store a JSON document in a user variable:
mysql>SET @j = '[100, "sakila", [1, 3, 5], 425.05]';
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @j, JSON_STORAGE_SIZE(@j) AS Size;
+------------------------------------+------+ | @j | Size | +------------------------------------+------+ | [100, "sakila", [1, 3, 5], 425.05] | 45 | +------------------------------------+------+ 1 row in set (0.00 sec) mysql>SET @j = JSON_SET(@j, '$[1]', "json");
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @j, JSON_STORAGE_SIZE(@j) AS Size;
+----------------------------------+------+ | @j | Size | +----------------------------------+------+ | [100, "json", [1, 3, 5], 425.05] | 43 | +----------------------------------+------+ 1 row in set (0.00 sec) mysql>SET @j = JSON_SET(@j, '$[2][0]', JSON_ARRAY(10, 20, 30));
Query OK, 0 rows affected (0.00 sec) mysql>SELECT @j, JSON_STORAGE_SIZE(@j) AS Size;
+---------------------------------------------+------+ | @j | Size | +---------------------------------------------+------+ | [100, "json", [[10, 20, 30], 3, 5], 425.05] | 56 | +---------------------------------------------+------+ 1 row in set (0.00 sec)
For a JSON literal, this function always returns the current storage space used:
mysql>SELECT
->JSON_STORAGE_SIZE('[100, "sakila", [1, 3, 5], 425.05]') AS A,
->JSON_STORAGE_SIZE('{"a": 1000, "b": "a", "c": "[1, 3, 5, 7]"}') AS B,
->JSON_STORAGE_SIZE('{"a": 1000, "b": "wxyz", "c": "[1, 3, 5, 7]"}') AS C,
->JSON_STORAGE_SIZE('[100, "json", [[10, 20, 30], 3, 5], 425.05]') AS D;
+----+----+----+----+ | A | B | C | D | +----+----+----+----+ | 45 | 44 | 47 | 56 | +----+----+----+----+ 1 row in set (0.00 sec)
The functions described in this section are used with GTID-based replication. It is important to keep in mind that all of these functions take string representations of GTID sets as arguments. As such, the GTID sets must always be quoted when used with them. See GTID Sets for more information.
The union of two GTID sets is simply their representations as strings, joined together with an interposed comma. In other words, you can define a very simple function for obtaining the union of two GTID sets, similar to that created here:
CREATE FUNCTION GTID_UNION(g1 TEXT, g2 TEXT) RETURNS TEXT DETERMINISTIC RETURN CONCAT(g1,',',g2);
For more information about GTIDs and how these GTID functions are used in practice, see Section 17.1.3, “Replication with Global Transaction Identifiers”.
Table 12.23 GTID Functions
Name | Description |
---|---|
GTID_SUBSET() |
Return true if all GTIDs in subset are also in set; otherwise false. |
GTID_SUBTRACT() |
Return all GTIDs in set that are not in subset. |
WAIT_FOR_EXECUTED_GTID_SET() |
Wait until the given GTIDs have executed on slave. |
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS() |
Wait until the given GTIDs have executed on slave. |
Given two sets of global transaction identifiers
set1
and
set2
, returns true if all GTIDs in
set1
are also in
set2
. Returns false otherwise.
The GTID sets used with this function are represented as strings, as shown in the following examples:
mysql>SELECT GTID_SUBSET('3E11FA47-71CA-11E1-9E33-C80AA9429562:23',
->'3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57')\G
*************************** 1. row *************************** GTID_SUBSET('3E11FA47-71CA-11E1-9E33-C80AA9429562:23', '3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57'): 1 1 row in set (0.00 sec) mysql>SELECT GTID_SUBSET('3E11FA47-71CA-11E1-9E33-C80AA9429562:23-25',
->'3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57')\G
*************************** 1. row *************************** GTID_SUBSET('3E11FA47-71CA-11E1-9E33-C80AA9429562:23-25', '3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57'): 1 1 row in set (0.00 sec) mysql>SELECT GTID_SUBSET('3E11FA47-71CA-11E1-9E33-C80AA9429562:20-25',
->'3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57')\G
*************************** 1. row *************************** GTID_SUBSET('3E11FA47-71CA-11E1-9E33-C80AA9429562:20-25', '3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57'): 0 1 row in set (0.00 sec)
Given two sets of global transaction identifiers
set1
and
set2
, returns only those GTIDs from
set1
that are not in
set2
.
All GTID sets used with this function are represented as strings and must be quoted, as shown in these examples:
mysql>SELECT GTID_SUBTRACT('3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57',
->'3E11FA47-71CA-11E1-9E33-C80AA9429562:21')\G
*************************** 1. row *************************** GTID_SUBTRACT('3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57', '3E11FA47-71CA-11E1-9E33-C80AA9429562:21'): 3e11fa47-71ca-11e1-9e33-c80aa9429562:22-57 1 row in set (0.00 sec) mysql>SELECT GTID_SUBTRACT('3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57',
->'3E11FA47-71CA-11E1-9E33-C80AA9429562:20-25')\G
*************************** 1. row *************************** GTID_SUBTRACT('3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57', '3E11FA47-71CA-11E1-9E33-C80AA9429562:20-25'): 3e11fa47-71ca-11e1-9e33-c80aa9429562:26-57 1 row in set (0.00 sec) mysql>SELECT GTID_SUBTRACT('3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57',
->'3E11FA47-71CA-11E1-9E33-C80AA9429562:23-24')\G
*************************** 1. row *************************** GTID_SUBTRACT('3E11FA47-71CA-11E1-9E33-C80AA9429562:21-57', '3E11FA47-71CA-11E1-9E33-C80AA9429562:23-24'): 3e11fa47-71ca-11e1-9e33-c80aa9429562:21-22:25-57 1 row in set (0.01 sec)
WAIT_FOR_EXECUTED_GTID_SET(
gtid_set
[,
timeout
])
Wait until the server has applied all of the transactions
whose global transaction identifiers are contained in
gtid_set
; that is, until the
condition GTID_SUBSET(gtid_subset
,
@@GLOBAL.gtid_executed
) holds. See
Section 17.1.3.1, “GTID Format and Storage” for a definition
of GTID sets.
If a timeout is specified, and
timeout
seconds elapse before all
of the transactions in the GTID set have been applied, the
function stops waiting. timeout
is
optional, and the default timeout is 0 seconds, in which case
the function always waits until all of the transactions in the
GTID set have been applied.
WAIT_FOR_EXECUTED_GTID_SET()
monitors all
the GTIDs that are applied on the server, including
transactions that arrive from all replication channels and
user clients. It does not take into account whether
replication channels have been started or stopped.
For more information, see Section 17.1.3, “Replication with Global Transaction Identifiers”.
GTID sets used with this function are represented as strings and so must be quoted as shown in the following example:
mysql> SELECT WAIT_FOR_EXECUTED_GTID_SET('3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5');
-> 0
For a syntax description for GTID sets, see Section 17.1.3.1, “GTID Format and Storage”.
For WAIT_FOR_EXECUTED_GTID_SET()
, the
return value is the state of the query, where 0 represents
success, and 1 represents timeout. Any other failures generate
an error.
gtid_mode
cannot be changed
to OFF while any client is using this function to wait for
GTIDs to be applied.
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS(
gtid_set
[,
timeout
][,channel
])
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()
is
similar to WAIT_FOR_EXECUTED_GTID_SET()
in
that it waits until all of the transactions whose global
transaction identifiers are contained in
gtid_set
have been applied, or
until timeout
seconds have elapsed,
whichever occurs first. However,
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()
applies
to a specific replication channel, and stops only after the
transactions have been applied on the specified channel, for
which the applier must be running. In contrast,
WAIT_FOR_EXECUTED_GTID_SET()
stops after
the transactions have been applied, regardless of where they
were applied (on any replication channel or any user client),
and whether or not any replication channels are running.
The channel
option names which
replication channel the function applies to. If no channel is
named and no channels other than the default replication
channel exist, the function applies to the default replication
channel. If multiple replication channels exist, you must
specify a channel as otherwise it is not known which
replication channel the function applies to. See
Section 17.2.3, “Replication Channels” for more
information on replication channels.
Because
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()
applies to a specific replication channel, if an expected
transaction arrives on a different replication channel or
from a user client, for example in a failover or manual
recovery situation, the function can hang indefinitely if no
timeout is set. Use
WAIT_FOR_EXECUTED_GTID_SET()
instead to
ensure correct handling of transactions in these situations.
GTID sets used with
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()
are
represented as strings and must be quoted in the same way as
for WAIT_FOR_EXECUTED_GTID_SET()
. For
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()
, the
return value for the function is an arbitrary positive number.
If GTID-based replication is not active (that is, if the value
of the gtid_mode
variable is
OFF), then this value is undefined and
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS()
returns
NULL. If the slave is not running then the function also
returns NULL.
gtid_mode
cannot be changed
to OFF while any client is using this function to wait for
GTIDs to be applied.
MySQL Enterprise Encryption is an extension included in MySQL Enterprise Edition, a commercial product. To learn more about commercial products, https://www.mysql.com/products/.
MySQL Enterprise Edition includes a set of encryption functions based on the OpenSSL library that expose OpenSSL capabilities at the SQL level. These functions enable Enterprise applications to perform the following operations:
Implement added data protection using public-key asymmetric cryptography
Create public and private keys and digital signatures
Perform asymmetric encryption and decryption
Use cryptographic hashing for digital signing and data verification and validation
MySQL Enterprise Encryption supports the RSA, DSA, and DH cryptographic algorithms.
MySQL Enterprise Encryption is supplied as a user-defined function (UDF) library, from which individual functions can be installed individually.
MySQL Enterprise Encryption functions are located in a user-defined function
(UDF) library file installed in the plugin directory (the
directory named by the
plugin_dir
system variable).
The UDF library base name is openssl_udf
and the suffix is platform dependent. For example, the file name
on Linux or Windows is openssl_udf.so
or
openssl_udf.dll
, respectively.
To install functions from the library file, use the
CREATE
FUNCTION
statement. To load all functions from the
library, use this set of statements (adjust the file name suffix
as necessary):
CREATE FUNCTION asymmetric_decrypt RETURNS STRING SONAME 'openssl_udf.so'; CREATE FUNCTION asymmetric_derive RETURNS STRING SONAME 'openssl_udf.so'; CREATE FUNCTION asymmetric_encrypt RETURNS STRING SONAME 'openssl_udf.so'; CREATE FUNCTION asymmetric_sign RETURNS STRING SONAME 'openssl_udf.so'; CREATE FUNCTION asymmetric_verify RETURNS INTEGER SONAME 'openssl_udf.so'; CREATE FUNCTION create_asymmetric_priv_key RETURNS STRING SONAME 'openssl_udf.so'; CREATE FUNCTION create_asymmetric_pub_key RETURNS STRING SONAME 'openssl_udf.so'; CREATE FUNCTION create_dh_parameters RETURNS STRING SONAME 'openssl_udf.so'; CREATE FUNCTION create_digest RETURNS STRING SONAME 'openssl_udf.so';
Once installed, UDFs remain installed across server restarts. To
unload UDFs, use the
DROP
FUNCTION
statement. For example, to unload the
key-generation functions, do this:
DROP FUNCTION create_asymmetric_priv_key; DROP FUNCTION create_asymmetric_pub_key;
In the
CREATE
FUNCTION
and
DROP
FUNCTION
statements, the function names must be
specified in lowercase. This differs from their use at function
invocation time, for which you can use any lettercase.
The CREATE
FUNCTION
and
DROP
FUNCTION
statements require the
INSERT
and
DROP
privilege, respectively, for
the mysql
database.
To use MySQL Enterprise Encryption in applications, invoke the functions that are appropriate for the operations you wish to perform. This section demonstrates how to carry out some representative tasks:
-- Encryption algorithm; can be 'DSA' or 'DH' instead SET @algo = 'RSA'; -- Key length in bits; make larger for stronger keys SET @key_len = 1024; -- Create private key SET @priv = CREATE_ASYMMETRIC_PRIV_KEY(@algo, @key_len); -- Derive corresponding public key from private key, using same algorithm SET @pub = CREATE_ASYMMETRIC_PUB_KEY(@algo, @priv);
Now you can use the key pair to encrypt and decrypt data, sign and verify data, or generate symmetric keys.
This requires that the members of the key pair be RSA keys.
SET @ciphertext = ASYMMETRIC_ENCRYPT(@algo, 'My secret text', @priv); SET @cleartext = ASYMMETRIC_DECRYPT(@algo, @ciphertext, @pub);
Conversely, you can encrypt using the public key and decrypt using the private key.
SET @ciphertext = ASYMMETRIC_ENCRYPT(@algo, 'My secret text', @pub); SET @cleartext = ASYMMETRIC_DECRYPT(@algo, @ciphertext, @priv);
In either case, the algorithm specified for the encryption and decryption functions must match that used to generate the keys.
-- Digest type; can be 'SHA256', 'SHA384', or 'SHA512' instead SET @dig_type = 'SHA224'; -- Generate digest string SET @dig = CREATE_DIGEST(@dig_type, 'My text to digest');
The key pair can be used to sign data, then verify that the signature matches the digest.
-- Encryption algorithm; could be 'DSA' instead; keys must -- have been created using same algorithm SET @algo = 'RSA'; -- Generate signature for digest and verify signature against digest SET @sig = ASYMMETRIC_SIGN(@algo, @dig, @priv, @dig_type); -- Verify signature against digest SET @verf = ASYMMETRIC_VERIFY(@algo, @dig, @sig, @pub, @dig_type);
This requires DH private/public keys as inputs, created using
a shared symmetric secret. Create the secret by passing the
key length to
CREATE_DH_PARAMETERS()
, then
pass the secret as the “key length” to
CREATE_ASYMMETRIC_PRIV_KEY()
.
-- Generate DH shared symmetric secret SET @dhp = CREATE_DH_PARAMETERS(1024); -- Generate DH key pairs SET @algo = 'DH'; SET @priv1 = CREATE_ASYMMETRIC_PRIV_KEY(@algo, @dhp); SET @pub1 = CREATE_ASYMMETRIC_PUB_KEY(@algo, @priv1); SET @priv2 = CREATE_ASYMMETRIC_PRIV_KEY(@algo, @dhp); SET @pub2 = CREATE_ASYMMETRIC_PUB_KEY(@algo, @priv2); -- Generate symmetric key using public key of first party, -- private key of second party SET @sym1 = ASYMMETRIC_DERIVE(@pub1, @priv2); -- Or use public key of second party, private key of first party SET @sym2 = ASYMMETRIC_DERIVE(@pub2, @priv1);
Key string values can be created at runtime and stored into a
variable or table using
SET
,
SELECT
, or
INSERT
:
SET @priv1 = CREATE_ASYMMETRIC_PRIV_KEY('RSA', 1024); SELECT CREATE_ASYMMETRIC_PRIV_KEY('RSA', 1024) INTO @priv2; INSERT INTO t (key_col) VALUES(CREATE_ASYMMETRIC_PRIV_KEY('RSA', 1024));
Key string values stored in files can be read using the
LOAD_FILE()
function by users
who have the FILE
privilege.
Digest and signature strings can be handled similarly.
The
CREATE_ASYMMETRIC_PRIV_KEY()
and CREATE_DH_PARAMETERS()
encryption functions take a key-length parameter, and the
amount of CPU resources required by these functions increases
as the key length increases. For some installations, this
might result in unacceptable CPU usage if applications
frequently generate excessively long keys.
OpenSSL imposes a minimum key length of 1,024 bits for all
keys. OpenSSL also imposes a maximum key length of 10,000 bits
and 16,384 bits for DSA and RSA keys, respectively, for
CREATE_ASYMMETRIC_PRIV_KEY()
,
and a maximum key length of 10,000 bits for
CREATE_DH_PARAMETERS()
. If
those maximum values are too high, three environment variables
are available to enable MySQL server administrators to set
lower maximum lengths for key generation, and thereby to limit
CPU usage:
MYSQL_OPENSSL_UDF_DSA_BITS_THRESHOLD
:
Maximum DSA key length in bits for
CREATE_ASYMMETRIC_PRIV_KEY()
.
The minimum and maximum values for this variable are 1,024
and 10,000.
MYSQL_OPENSSL_UDF_RSA_BITS_THRESHOLD
:
Maximum RSA key length in bits for
CREATE_ASYMMETRIC_PRIV_KEY()
.
The minimum and maximum values for this variable are 1,024
and 16,384.
MYSQL_OPENSSL_UDF_DH_BITS_THRESHOLD
:
Maximum key length in bits for
CREATE_DH_PARAMETERS()
. The
minimum and maximum values for this variable are 1,024 and
10,000.
To use any of these environment variables, set them in the
environment of the process that starts the server. If set,
their values take precedence over the maximum key lengths
imposed by OpenSSL. For example, to set a maximum key length
of 4,096 bits for DSA and RSA keys for
CREATE_ASYMMETRIC_PRIV_KEY()
,
set these variables:
export MYSQL_OPENSSL_UDF_DSA_BITS_THRESHOLD=4096 export MYSQL_OPENSSL_UDF_RSA_BITS_THRESHOLD=4096
The example uses Bourne shell syntax. The syntax for other shells may differ.
Table 12.24 MySQL Enterprise Encryption Functions
Name | Description |
---|---|
ASYMMETRIC_DECRYPT() |
Decrypt ciphertext using private or public key |
ASYMMETRIC_DERIVE() |
Derive symmetric key from asymmetric keys |
ASYMMETRIC_ENCRYPT() |
Encrypt cleartext using private or public key |
ASYMMETRIC_SIGN() |
Generate signature from digest |
ASYMMETRIC_VERIFY() |
Verify that signature matches digest |
CREATE_ASYMMETRIC_PRIV_KEY() |
Create private key |
CREATE_ASYMMETRIC_PUB_KEY() |
Create public key |
CREATE_DH_PARAMETERS() |
Generate shared DH secret |
CREATE_DIGEST() |
Generate digest from string |
MySQL Enterprise Encryption functions have these general characteristics:
For arguments of the wrong type or an incorrect number of arguments, each function returns an error.
If the arguments are not suitable to permit a function to
perform the requested operation, it returns
NULL
or 0 as appropriate. This occurs,
for example, if a function does not support a specified
algorithm, a key length is too short or long, or a string
expected to be a key string in PEM format is not a valid
key. (OpenSSL imposes its own key-length limits, and server
administrators can impose additional limits on maximum key
length by setting environment variables. See
Section 12.19.2, “MySQL Enterprise Encryption Usage and Examples”.)
The underlying SSL library takes care of randomness initialization.
Several of the functions take an encryption algorithm argument. The following table summarizes the supported algorithms by function.
Table 12.25 Supported Algorithms by Function
Function | Supported Algorithms |
---|---|
ASYMMETRIC_DECRYPT() |
RSA |
ASYMMETRIC_DERIVE() |
DH |
ASYMMETRIC_ENCRYPT() |
RSA |
ASYMMETRIC_SIGN() |
RSA, DSA |
ASYMMETRIC_VERIFY() |
RSA, DSA |
CREATE_ASYMMETRIC_PRIV_KEY() |
RSA, DSA, DH |
CREATE_ASYMMETRIC_PUB_KEY() |
RSA, DSA, DH |
CREATE_DH_PARAMETERS() |
DH |
Although you can create keys using any of the RSA, DSA, or DH
encryption algorithms, other functions that take key arguments
might accept only certain types of keys. For example,
ASYMMETRIC_ENCRYPT()
and
ASYMMETRIC_DECRYPT()
accept
only RSA keys.
The following descriptions describe the calling sequences for MySQL Enterprise Encryption functions. For additional examples and discussion, see Section 12.19.2, “MySQL Enterprise Encryption Usage and Examples”.
ASYMMETRIC_DECRYPT(
algorithm
,
crypt_str
,
key_str
)
Decrypts an encrypted string using the given algorithm and
key string, and returns the resulting cleartext as a binary
string. If decryption fails, the result is
NULL
.
key_str
must be a valid key
string in PEM format. For successful decryption, it must be
the public or private key string corresponding to the
private or public key string used with
ASYMMETRIC_ENCRYPT()
to
produce the encrypted string.
algorithm
indicates the
encryption algorithm used to create the key.
Supported algorithm
values:
'RSA'
For a usage example, see the description of
ASYMMETRIC_ENCRYPT()
.
ASYMMETRIC_DERIVE(
pub_key_str
,
priv_key_str
)
Derives a symmetric key using the private key of one party
and the public key of another, and returns the resulting key
as a binary string. If key derivation fails, the result is
NULL
.
pub_key_str
and
priv_key_str
must be valid key
strings in PEM format. They must be created using the DH
algorithm.
Suppose that you have two pairs of public and private keys:
SET @dhp = CREATE_DH_PARAMETERS(1024); SET @priv1 = CREATE_ASYMMETRIC_PRIV_KEY('DH', @dhp); SET @pub1 = CREATE_ASYMMETRIC_PUB_KEY('DH', @priv1); SET @priv2 = CREATE_ASYMMETRIC_PRIV_KEY('DH', @dhp); SET @pub2 = CREATE_ASYMMETRIC_PUB_KEY('DH', @priv2);
Suppose further that you use the private key from one pair and the public key from the other pair to create a symmetric key string. Then this symmetric key identity relationship holds:
ASYMMETRIC_DERIVE(@pub1, @priv2) = ASYMMETRIC_DERIVE(@pub2, @priv1)
ASYMMETRIC_ENCRYPT(
algorithm
,
str
,
key_str
)
Encrypts a string using the given algorithm and key string,
and returns the resulting ciphertext as a binary string. If
encryption fails, the result is NULL
.
The str
length cannot be greater
than the key_str
length −
11, in bytes
key_str
must be a valid key
string in PEM format. algorithm
indicates the encryption algorithm used to create the key.
Supported algorithm
values:
'RSA'
To encrypt a string, pass a private or public key string to
ASYMMETRIC_ENCRYPT()
. To
recover the original unencrypted string, pass the encrypted
string to
ASYMMETRIC_DECRYPT()
, along
with the public or private key string correponding to the
private or public key string used for encryption.
-- Generate private/public key pair SET @priv = CREATE_ASYMMETRIC_PRIV_KEY('RSA', 1024); SET @pub = CREATE_ASYMMETRIC_PUB_KEY('RSA', @priv); -- Encrypt using private key, decrypt using public key SET @ciphertext = ASYMMETRIC_ENCRYPT('RSA', 'The quick brown fox', @priv); SET @cleartext = ASYMMETRIC_DECRYPT('RSA', @ciphertext, @pub); -- Encrypt using public key, decrypt using private key SET @ciphertext = ASYMMETRIC_ENCRYPT('RSA', 'The quick brown fox', @pub); SET @cleartext = ASYMMETRIC_DECRYPT('RSA', @ciphertext, @priv);
Suppose that:
SET @s = a string to be encrypted SET @priv = a valid private RSA key string in PEM format SET @pub = the corresponding public RSA key string in PEM format
Then these identity relationships hold:
ASYMMETRIC_DECRYPT('RSA', ASYMMETRIC_ENCRYPT('RSA', @s, @priv), @pub) = @s ASYMMETRIC_DECRYPT('RSA', ASYMMETRIC_ENCRYPT('RSA', @s, @pub), @priv) = @s
ASYMMETRIC_SIGN(
algorithm
,
digest_str
,
priv_key_str
,
digest_type
)
Signs a digest string using a private key string, and
returns the signature as a binary string. If signing fails,
the result is NULL
.
digest_str
is the digest string.
It can be generated by calling
CREATE_DIGEST()
.
digest_type
indicates the digest
algorithm used to generate the digest string.
priv_key_str
is the private key
string to use for signing the digest string. It must be a
valid key string in PEM format.
algorithm
indicates the
encryption algorithm used to create the key.
Supported algorithm
values:
'RSA'
, 'DSA'
Supported digest_type
values:
'SHA224'
, 'SHA256'
,
'SHA384'
, 'SHA512'
For a usage example, see the description of
ASYMMETRIC_VERIFY()
.
ASYMMETRIC_VERIFY(
algorithm
,
digest_str
,
sig_str
,
pub_key_str
,
digest_type
)
Verifies whether the signature string matches the digest string, and returns 1 or 0 to indicate whether verification succeeded or failed.
digest_str
is the digest string.
It can be generated by calling
CREATE_DIGEST()
.
digest_type
indicates the digest
algorithm used to generate the digest string.
sig_str
is the signature string.
It can be generated by calling
ASYMMETRIC_SIGN()
.
pub_key_str
is the public key
string of the signer. It corresponds to the private key
passed to ASYMMETRIC_SIGN()
to generate the signature string and must be a valid key
string in PEM format. algorithm
indicates the encryption algorithm used to create the key.
Supported algorithm
values:
'RSA'
, 'DSA'
Supported digest_type
values:
'SHA224'
, 'SHA256'
,
'SHA384'
, 'SHA512'
-- Set the encryption algorithm and digest type SET @algo = 'RSA'; SET @dig_type = 'SHA224'; -- Create private/public key pair SET @priv = CREATE_ASYMMETRIC_PRIV_KEY(@algo, 1024); SET @pub = CREATE_ASYMMETRIC_PUB_KEY(@algo, @priv); -- Generate digest from string SET @dig = CREATE_DIGEST(@dig_type, 'The quick brown fox'); -- Generate signature for digest and verify signature against digest SET @sig = ASYMMETRIC_SIGN(@algo, @dig, @priv, @dig_type); SET @verf = ASYMMETRIC_VERIFY(@algo, @dig, @sig, @pub, @dig_type);
CREATE_ASYMMETRIC_PRIV_KEY(
algorithm
,
{key_len
|dh_secret
})
Creates a private key using the given algorithm and key
length or DH secret, and returns the key as a binary string
in PEM format. If key generation fails, the result is
NULL
.
Supported algorithm
values:
'RSA'
, 'DSA'
,
'DH'
Supported key_len
values: The
minimum key length in bits is 1,024. The maximum key length
depends on the algorithm: 16,384 for RSA and 10,000 for DSA.
These key-length limits are constraints imposed by OpenSSL.
Server administrators can impose additional limits on
maximum key length by setting environment variables. See
Section 12.19.2, “MySQL Enterprise Encryption Usage and Examples”.
For DH keys, pass a shared DH secret instead of a key
length. To create the secret, pass the key length to
CREATE_DH_PARAMETERS()
.
This example creates a 2,048-bit DSA private key, then derives a public key from the private key:
SET @priv = CREATE_ASYMMETRIC_PRIV_KEY('DSA', 2048); SET @pub = CREATE_ASYMMETRIC_PUB_KEY('DSA', @priv);
For an example showing DH key generation, see the
description of
ASYMMETRIC_DERIVE()
.
Some general considerations in choosing key lengths and encryption algorithms:
The strength of encryption for private and public keys increases with the key size, but the time for key generation increases as well.
Generation of DH keys takes much longer than RSA or RSA keys.
Asymmetric encryption functions are slower than
symmetric functions. If performance is an important
factor and the functions are to be used very frequently,
you are better off using symmetric encryption. For
example, consider using
AES_ENCRYPT()
and
AES_DECRYPT()
.
CREATE_ASYMMETRIC_PUB_KEY(
algorithm
,
priv_key_str
)
Derives a public key from the given private key using the
given algorithm, and returns the key as a binary string in
PEM format. If key derivation fails, the result is
NULL
.
priv_key_str
must be a valid key
string in PEM format. algorithm
indicates the encryption algorithm used to create the key.
Supported algorithm
values:
'RSA'
, 'DSA'
,
'DH'
For a usage example, see the description of
CREATE_ASYMMETRIC_PRIV_KEY()
.
Creates a shared secret for generating a DH private/public
key pair and returns a binary string that can be passed to
CREATE_ASYMMETRIC_PRIV_KEY()
.
If secret generation fails, the result is null.
Supported key_len
values: The
minimum and maximum key lengths in bits are 1,024 and
10,000. These key-length limits are constraints imposed by
OpenSSL. Server administrators can impose additional limits
on maximum key length by setting environment variables. See
Section 12.19.2, “MySQL Enterprise Encryption Usage and Examples”.
For an example showing how to use the return value for
generating symmetric keys, see the description of
ASYMMETRIC_DERIVE()
.
SET @dhp = CREATE_DH_PARAMETERS(1024);
CREATE_DIGEST(
digest_type
,
str
)
Creates a digest from the given string using the given
digest type, and returns the digest as a binary string. If
digest generation fails, the result is
NULL
.
Supported digest_type
values:
'SHA224'
, 'SHA256'
,
'SHA384'
, 'SHA512'
SET @dig = CREATE_DIGEST('SHA512', The quick brown fox');
The resulting digest string is suitable for use with
ASYMMETRIC_SIGN()
and
ASYMMETRIC_VERIFY()
.
This section describes group (aggregate) functions that operate on sets of values.
Table 12.26 Aggregate (GROUP BY) Functions
Name | Description |
---|---|
AVG() |
Return the average value of the argument |
BIT_AND() |
Return bitwise AND |
BIT_OR() |
Return bitwise OR |
BIT_XOR() |
Return bitwise XOR |
COUNT() |
Return a count of the number of rows returned |
COUNT(DISTINCT) |
Return the count of a number of different values |
GROUP_CONCAT() |
Return a concatenated string |
JSON_ARRAYAGG() |
Return result set as a single JSON array |
JSON_OBJECTAGG() |
Return result set as a single JSON object |
MAX() |
Return the maximum value |
MIN() |
Return the minimum value |
STD() |
Return the population standard deviation |
STDDEV() |
Return the population standard deviation |
STDDEV_POP() |
Return the population standard deviation |
STDDEV_SAMP() |
Return the sample standard deviation |
SUM() |
Return the sum |
VAR_POP() |
Return the population standard variance |
VAR_SAMP() |
Return the sample variance |
VARIANCE() |
Return the population standard variance |
Unless otherwise stated, group functions ignore
NULL
values.
If you use a group function in a statement containing no
GROUP BY
clause, it is equivalent to grouping
on all rows. For more information, see
Section 12.20.3, “MySQL Handling of GROUP BY”.
Most aggregate functions can be used as window functions. Those
that can be used this way are signified in their syntax
description by
[
,
representing an optional over_clause
]OVER
clause.
over_clause
is described in
Section 12.21.2, “Window Function Concepts and Syntax”, which also includes
other information about window function usage.
For numeric arguments, the variance and standard deviation
functions return a DOUBLE
value.
The SUM()
and
AVG()
functions return a
DECIMAL
value for exact-value
arguments (integer or DECIMAL
),
and a DOUBLE
value for
approximate-value arguments
(FLOAT
or
DOUBLE
).
The SUM()
and
AVG()
aggregate functions do not
work with temporal values. (They convert the values to numbers,
losing everything after the first nonnumeric character.) To work
around this problem, convert to numeric units, perform the
aggregate operation, and convert back to a temporal value.
Examples:
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(time_col
))) FROMtbl_name
; SELECT FROM_DAYS(SUM(TO_DAYS(date_col
))) FROMtbl_name
;
Functions such as SUM()
or
AVG()
that expect a numeric
argument cast the argument to a number if necessary. For
SET
or
ENUM
values, the cast operation
causes the underlying numeric value to be used.
The BIT_AND()
,
BIT_OR()
, and
BIT_XOR()
aggregate functions
perform bit operations. Prior to MySQL 8.0, bit functions and
operators required BIGINT
(64-bit
integer) arguments and returned
BIGINT
values, so they had a
maximum range of 64 bits.
Non-BIGINT
arguments were
converted to BIGINT
prior to
performing the operation and truncation could occur.
In MySQL 8.0, bit functions and operators permit binary string
type arguments (BINARY
,
VARBINARY
, and the
BLOB
types) and return a value of
like type, which enables them to take arguments and produce
return values larger than 64 bits. For discussion about argument
evaluation and result types for bit operations, see the
introductory discussion in Section 12.12, “Bit Functions and Operators”.
AVG([DISTINCT]
expr
)
[over_clause
]
Returns the average value of
. The
expr
DISTINCT
option can be used to return the
average of the distinct values of
expr
.
If there are no matching rows,
AVG()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”; it cannot be used
with DISTINCT
.
mysql>SELECT student_name, AVG(test_score)
FROM student
GROUP BY student_name;
Returns the bitwise AND
of all bits in
expr
.
The result type depends on whether the function argument values are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the argument values
have a binary string type, and the argument is not a
hexadecimal literal, bit literal, or
NULL
literal. Numeric evaluation
occurs otherwise, with argument value conversion to
unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the
same length as the argument values. If argument values
have unequal lengths, an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs. If the argument size exceeds 511 bytes, an
ER_INVALID_BITWISE_AGGREGATE_OPERANDS_SIZE
error occurs. Numeric evaluation produces an unsigned
64-bit integer.
If there are no matching rows,
BIT_AND()
returns a neutral
value (all bits set to 1) having the same length as the
argument values.
NULL
values do not affect the result
unless all values are NULL
. In that case,
the result is a neutral value having the same length as the
argument values.
For more information discussion about argument evaluation and result types, see the introductory discussion in Section 12.12, “Bit Functions and Operators”.
As of MySQL 8.0.12, this function executes as a window
function if over_clause
is
present. over_clause
is as
described in Section 12.21.2, “Window Function Concepts and Syntax”.
Returns the bitwise OR
of all bits in
expr
.
The result type depends on whether the function argument values are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the argument values
have a binary string type, and the argument is not a
hexadecimal literal, bit literal, or
NULL
literal. Numeric evaluation
occurs otherwise, with argument value conversion to
unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the
same length as the argument values. If argument values
have unequal lengths, an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs. If the argument size exceeds 511 bytes, an
ER_INVALID_BITWISE_AGGREGATE_OPERANDS_SIZE
error occurs. Numeric evaluation produces an unsigned
64-bit integer.
If there are no matching rows,
BIT_OR()
returns a neutral
value (all bits set to 0) having the same length as the
argument values.
NULL
values do not affect the result
unless all values are NULL
. In that case,
the result is a neutral value having the same length as the
argument values.
For more information discussion about argument evaluation and result types, see the introductory discussion in Section 12.12, “Bit Functions and Operators”.
As of MySQL 8.0.12, this function executes as a window
function if over_clause
is
present. over_clause
is as
described in Section 12.21.2, “Window Function Concepts and Syntax”.
Returns the bitwise XOR
of all
bits in expr
.
The result type depends on whether the function argument values are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the argument values
have a binary string type, and the argument is not a
hexadecimal literal, bit literal, or
NULL
literal. Numeric evaluation
occurs otherwise, with argument value conversion to
unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the
same length as the argument values. If argument values
have unequal lengths, an
ER_INVALID_BITWISE_OPERANDS_SIZE
error occurs. If the argument size exceeds 511 bytes, an
ER_INVALID_BITWISE_AGGREGATE_OPERANDS_SIZE
error occurs. Numeric evaluation produces an unsigned
64-bit integer.
If there are no matching rows,
BIT_XOR()
returns a neutral
value (all bits set to 0) having the same length as the
argument values.
NULL
values do not affect the result
unless all values are NULL
. In that case,
the result is a neutral value having the same length as the
argument values.
For more information discussion about argument evaluation and result types, see the introductory discussion in Section 12.12, “Bit Functions and Operators”.
As of MySQL 8.0.12, this function executes as a window
function if over_clause
is
present. over_clause
is as
described in Section 12.21.2, “Window Function Concepts and Syntax”.
Returns a count of the number of non-NULL
values of expr
in the rows
retrieved by a SELECT
statement. The result is a
BIGINT
value.
If there are no matching rows,
COUNT()
returns
0
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
mysql>SELECT student.student_name,COUNT(*)
FROM student,course
WHERE student.student_id=course.student_id
GROUP BY student_name;
COUNT(*)
is somewhat
different in that it returns a count of the number of rows
retrieved, whether or not they contain
NULL
values.
For transactional storage engines such as
InnoDB
, storing an exact row count is
problematic. Multiple transactions may be occurring at the
same time, each of which may affect the count.
InnoDB
does not keep an internal count of
rows in a table because concurrent transactions might
“see” different numbers of rows at the same
time. Consequently, SELECT COUNT(*)
statements only count rows visible to the current
transaction.
As of MySQL 8.0.13, SELECT COUNT(*) FROM
query
performance for tbl_name
InnoDB
tables is
optimized for single-threaded workloads if there are no
extra clauses such as WHERE
or
GROUP BY
.
InnoDB
processes SELECT
COUNT(*)
statements by traversing the smallest
available secondary index unless an index or optimizer hint
directs the optimizer to use a different index. If a
secondary index is not present, InnoDB
processes SELECT COUNT(*)
statements by
scanning the clustered index.
Processing SELECT COUNT(*)
statements
takes some time if index records are not entirely in the
buffer pool. For a faster count, create a counter table and
let your application update it according to the inserts and
deletes it does. However, this method may not scale well in
situations where thousands of concurrent transactions are
initiating updates to the same counter table. If an
approximate row count is sufficient, use
SHOW TABLE STATUS
.
InnoDB
handles SELECT
COUNT(*)
and SELECT COUNT(1)
operations in the same way. There is no performance
difference.
For MyISAM
tables,
COUNT(*)
is optimized to
return very quickly if the
SELECT
retrieves from one
table, no other columns are retrieved, and there is no
WHERE
clause. For example:
mysql> SELECT COUNT(*) FROM student;
This optimization only applies to MyISAM
tables, because an exact row count is stored for this
storage engine and can be accessed very quickly.
COUNT(1)
is only subject to the same
optimization if the first column is defined as NOT
NULL
.
COUNT(DISTINCT
expr
,[expr
...])
Returns a count of the number of rows with different
non-NULL
expr
values.
If there are no matching rows,
COUNT(DISTINCT)
returns
0
.
mysql> SELECT COUNT(DISTINCT results) FROM student;
In MySQL, you can obtain the number of distinct expression
combinations that do not contain NULL
by
giving a list of expressions. In standard SQL, you would
have to do a concatenation of all expressions inside
COUNT(DISTINCT ...)
.
This function returns a string result with the concatenated
non-NULL
values from a group. It returns
NULL
if there are no
non-NULL
values. The full syntax is as
follows:
GROUP_CONCAT([DISTINCT]expr
[,expr
...] [ORDER BY {unsigned_integer
|col_name
|expr
} [ASC | DESC] [,col_name
...]] [SEPARATORstr_val
])
mysql>SELECT student_name,
GROUP_CONCAT(test_score)
FROM student
GROUP BY student_name;
Or:
mysql>SELECT student_name,
GROUP_CONCAT(DISTINCT test_score
ORDER BY test_score DESC SEPARATOR ' ')
FROM student
GROUP BY student_name;
In MySQL, you can get the concatenated values of expression
combinations. To eliminate duplicate values, use the
DISTINCT
clause. To sort values in the
result, use the ORDER BY
clause. To sort
in reverse order, add the DESC
(descending) keyword to the name of the column you are
sorting by in the ORDER BY
clause. The
default is ascending order; this may be specified explicitly
using the ASC
keyword. The default
separator between values in a group is comma
(,
). To specify a separator explicitly,
use SEPARATOR
followed by the string
literal value that should be inserted between group values.
To eliminate the separator altogether, specify
SEPARATOR ''
.
The result is truncated to the maximum length that is given
by the group_concat_max_len
system variable, which has a default value of 1024. The
value can be set higher, although the effective maximum
length of the return value is constrained by the value of
max_allowed_packet
. The
syntax to change the value of
group_concat_max_len
at
runtime is as follows, where val
is an unsigned integer:
SET [GLOBAL | SESSION] group_concat_max_len = val
;
The return value is a nonbinary or binary string, depending
on whether the arguments are nonbinary or binary strings.
The result type is TEXT
or
BLOB
unless
group_concat_max_len
is
less than or equal to 512, in which case the result type is
VARCHAR
or
VARBINARY
.
See also CONCAT()
and
CONCAT_WS()
:
Section 12.5, “String Functions”.
JSON_ARRAYAGG(
col_or_expr
)
[over_clause
]
Aggregates a result set as a single
JSON
array whose elements
consist of the rows. The order of elements in this array is
undefined. The function acts on a column or an expression
that evaluates to a single value. Returns
NULL
if the result contains no rows, or
in the event of an error.
As of MySQL 8.0.14, this function executes as a window
function if over_clause
is
present. over_clause
is as
described in Section 12.21.2, “Window Function Concepts and Syntax”.
mysql>SELECT o_id, attribute, value FROM t3;
+------+-----------+--------+ | o_id | attribute | value | +------+-----------+--------+ | 2 | color | red | | 2 | fabric | silk | | 3 | color | green | | 3 | shape | square | +------+-----------+--------+ 4 rows in set (0.00 sec) mysql>SELECT o_id, JSON_ARRAYAGG(attribute) AS attributes
FROM t3 GROUP BY o_id;
+------+---------------------+ | o_id | attributes | +------+---------------------+ | 2 | ["color", "fabric"] | | 3 | ["color", "shape"] | +------+---------------------+ 2 rows in set (0.00 sec)
JSON_OBJECTAGG(
key
,
value
)
[over_clause
]
Takes two column names or expressions as arguments, the
first of these being used as a key and the second as a
value, and returns a JSON object containing key-value pairs.
Returns NULL
if the result contains no
rows, or in the event of an error. An error occurs if any
key name is NULL
or the number of
arguments is not equal to 2.
As of MySQL 8.0.14, this function executes as a window
function if over_clause
is
present. over_clause
is as
described in Section 12.21.2, “Window Function Concepts and Syntax”.
mysql>SELECT o_id, attribute, value FROM t3;
+------+-----------+--------+ | o_id | attribute | value | +------+-----------+--------+ | 2 | color | red | | 2 | fabric | silk | | 3 | color | green | | 3 | shape | square | +------+-----------+--------+ 4 rows in set (0.00 sec) mysql>SELECT o_id, JSON_OBJECTAGG(attribute, value)
FROM t3 GROUP BY o_id;
+------+---------------------------------------+ | o_id | JSON_OBJECTAGG(attribute, value) | +------+---------------------------------------+ | 2 | {"color": "red", "fabric": "silk"} | | 3 | {"color": "green", "shape": "square"} | +------+---------------------------------------+ 2 rows in set (0.00 sec)
When the result of this function is normalized, values
having duplicate keys are discarded. In keeping with the
MySQL JSON
data type
specification that does not permit duplicate keys, only
the last value encountered is used with that key in the
returned object (“last duplicate key wins”).
This means that the result of using this function on
columns from a SELECT
can depend on the
order in which in the rows are returned, which is not
guaranteed. When used as a window function, if there are
duplicate keys within a frame, only the last value for the
key is present in the result. The value for the key from
last row in the frame is deterministic if the
ORDER BY
specification guarantees that
the values have a specific order. If not, the resulting
value of the key is nondeterministic. Consider the
following:
mysql>CREATE TABLE t(c VARCHAR(10), i INT);
Query OK, 0 rows affected (0.03 sec) mysql>INSERT INTO t VALUES
('key', 3), ('key', 4), ('key', 5);
Query OK, 3 rows affected (0.01 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT c, i FROM t;
+------+------+ | c | i | +------+------+ | key | 3 | | key | 4 | | key | 5 | +------+------+ 3 rows in set (0.00 sec) mysql>SELECT JSON_OBJECTAGG(c, i) FROM t;
+----------------------+ | JSON_OBJECTAGG(c, i) | +----------------------+ | {"key": 5} | +----------------------+ 1 row in set (0.00 sec) mysql>DELETE FROM t;
Query OK, 3 rows affected (0.00 sec) mysql>INSERT INTO t VALUES
('key', 3), ('key', 5), ('key', 4);
Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql>SELECT c, i FROM t;
+------+------+ | c | i | +------+------+ | key | 3 | | key | 5 | | key | 4 | +------+------+ 3 rows in set (0.00 sec) mysql>SELECT JSON_OBJECTAGG(c, i) FROM t;
+----------------------+ | JSON_OBJECTAGG(c, i) | +----------------------+ | {"key": 4} | +----------------------+ 1 row in set (0.00 sec)
The key chosen from the last query is nondeterministic. If
you prefer a particular key ordering, you can invoke
JSON_OBJECTAGG()
as a window function by
including an OVER
clause with an
ORDER BY
specification to impose a
particular order on frame rows. The following examples show
what happens with and without ORDER BY
for a few different frame specifications.
Without ORDER BY
, the frame is the entire
partition:
mysql>SELECT JSON_OBJECTAGG(c, i)
OVER () AS json_object FROM t;
+-------------+ | json_object | +-------------+ | {"key": 4} | | {"key": 4} | | {"key": 4} | +-------------+
With ORDER BY
, where the frame is the
default of RANGE BETWEEN UNBOUNDED PRECEDING AND
CURRENT ROW
(in both ascending and descending
order):
mysql>SELECT JSON_OBJECTAGG(c, i)
OVER (ORDER BY i) AS json_object FROM t;
+-------------+ | json_object | +-------------+ | {"key": 3} | | {"key": 4} | | {"key": 5} | +-------------+ mysql>SELECT JSON_OBJECTAGG(c, i)
OVER (ORDER BY i DESC) AS json_object FROM t;
+-------------+ | json_object | +-------------+ | {"key": 5} | | {"key": 4} | | {"key": 3} | +-------------+
With ORDER BY
and an explicit frame of
the entire partition:
mysql>SELECT JSON_OBJECTAGG(c, i)
OVER (ORDER BY i
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
AS json_object
FROM t;
+-------------+ | json_object | +-------------+ | {"key": 5} | | {"key": 5} | | {"key": 5} | +-------------+
To return a particular key value (such as the smallest or
largest), include a LIMIT
clause in the
appropriate query. For example:
mysql>SELECT JSON_OBJECTAGG(c, i)
OVER (ORDER BY i) AS json_object FROM t LIMIT 1;
+-------------+ | json_object | +-------------+ | {"key": 3} | +-------------+ mysql>SELECT JSON_OBJECTAGG(c, i)
OVER (ORDER BY i DESC) AS json_object FROM t LIMIT 1;
+-------------+ | json_object | +-------------+ | {"key": 5} | +-------------+
See Normalization, Merging, and Autowrapping of JSON Values, for additional information and examples.
MAX([DISTINCT]
expr
)
[over_clause
]
Returns the maximum value of
expr
.
MAX()
may take a string
argument; in such cases, it returns the maximum string
value. See Section 8.3.1, “How MySQL Uses Indexes”. The
DISTINCT
keyword can be used to find the
maximum of the distinct values of
expr
, however, this produces the
same result as omitting DISTINCT
.
If there are no matching rows,
MAX()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”; it cannot be used
with DISTINCT
.
mysql>SELECT student_name, MIN(test_score), MAX(test_score)
FROM student
GROUP BY student_name;
For MAX()
, MySQL currently
compares ENUM
and
SET
columns by their string
value rather than by the string's relative position in the
set. This differs from how ORDER BY
compares them.
MIN([DISTINCT]
expr
)
[over_clause
]
Returns the minimum value of
expr
.
MIN()
may take a string
argument; in such cases, it returns the minimum string
value. See Section 8.3.1, “How MySQL Uses Indexes”. The
DISTINCT
keyword can be used to find the
minimum of the distinct values of
expr
, however, this produces the
same result as omitting DISTINCT
.
If there are no matching rows,
MIN()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”; it cannot be used
with DISTINCT
.
mysql>SELECT student_name, MIN(test_score), MAX(test_score)
FROM student
GROUP BY student_name;
For MIN()
, MySQL currently
compares ENUM
and
SET
columns by their string
value rather than by the string's relative position in the
set. This differs from how ORDER BY
compares them.
Returns the population standard deviation of
expr
.
STD()
is a synonym for the
standard SQL function
STDDEV_POP()
, provided as a
MySQL extension.
If there are no matching rows,
STD()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
Returns the population standard deviation of
expr
.
STDDEV()
is a synonym for the
standard SQL function
STDDEV_POP()
, provided for
compatibility with Oracle.
If there are no matching rows,
STDDEV()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
STDDEV_POP(
expr
)
[over_clause
]
Returns the population standard deviation of
expr
(the square root of
VAR_POP()
). You can also use
STD()
or
STDDEV()
, which are
equivalent but not standard SQL.
If there are no matching rows,
STDDEV_POP()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
STDDEV_SAMP(
expr
)
[over_clause
]
Returns the sample standard deviation of
expr
(the square root of
VAR_SAMP()
.
If there are no matching rows,
STDDEV_SAMP()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
SUM([DISTINCT]
expr
)
[over_clause
]
Returns the sum of expr
. If the
return set has no rows, SUM()
returns NULL
. The
DISTINCT
keyword can be used to sum only
the distinct values of expr
.
If there are no matching rows,
SUM()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”; it cannot be used
with DISTINCT
.
Returns the population standard variance of
expr
. It considers rows as the
whole population, not as a sample, so it has the number of
rows as the denominator. You can also use
VARIANCE()
, which is
equivalent but is not standard SQL.
If there are no matching rows,
VAR_POP()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
Returns the sample variance of
expr
. That is, the denominator is
the number of rows minus one.
If there are no matching rows,
VAR_SAMP()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
Returns the population standard variance of
expr
.
VARIANCE()
is a synonym for
the standard SQL function
VAR_POP()
, provided as a
MySQL extension.
If there are no matching rows,
VARIANCE()
returns
NULL
.
This function executes as a window function if
over_clause
is present.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
The GROUP BY
clause permits a WITH
ROLLUP
modifier that causes summary output to include
extra rows that represent higher-level (that is,
super-aggregate) summary operations. ROLLUP
thus enables you to answer questions at multiple levels of
analysis with a single query. For example,
ROLLUP
can be used to provide support for
OLAP (Online Analytical Processing) operations.
Suppose that a sales
table has
year
, country
,
product
, and profit
columns for recording sales profitability:
CREATE TABLE sales ( year INT, country VARCHAR(20), product VARCHAR(32), profit INT );
To summarize table contents per year, use a simple
GROUP BY
like this:
mysql>SELECT year, SUM(profit) AS profit
FROM sales
GROUP BY year;
+------+--------+ | year | profit | +------+--------+ | 2000 | 4525 | | 2001 | 3010 | +------+--------+
The output shows the total (aggregate) profit for each year. To
also determine the total profit summed over all years, you must
add up the individual values yourself or run an additional
query. Or you can use ROLLUP
, which provides
both levels of analysis with a single query. Adding a
WITH ROLLUP
modifier to the GROUP
BY
clause causes the query to produce another
(super-aggregate) row that shows the grand total over all year
values:
mysql>SELECT year, SUM(profit) AS profit
FROM sales
GROUP BY year WITH ROLLUP;
+------+--------+ | year | profit | +------+--------+ | 2000 | 4525 | | 2001 | 3010 | | NULL | 7535 | +------+--------+
The NULL
value in the year
column identifies the grand total super-aggregate line.
ROLLUP
has a more complex effect when there
are multiple GROUP BY
columns. In this case,
each time there is a change in value in any but the last
grouping column, the query produces an extra super-aggregate
summary row.
For example, without ROLLUP
, a summary of the
sales
table based on year
,
country
, and product
might
look like this, where the output indicates summary values only
at the year/country/product level of analysis:
mysql>SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product;
+------+---------+------------+--------+ | year | country | product | profit | +------+---------+------------+--------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | | 2000 | USA | Calculator | 75 | | 2000 | USA | Computer | 1500 | | 2001 | Finland | Phone | 10 | | 2001 | USA | Calculator | 50 | | 2001 | USA | Computer | 2700 | | 2001 | USA | TV | 250 | +------+---------+------------+--------+
With ROLLUP
added, the query produces several
extra rows:
mysql>SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+--------+ | year | country | product | profit | +------+---------+------------+--------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | Finland | NULL | 1600 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | | 2000 | India | NULL | 1350 | | 2000 | USA | Calculator | 75 | | 2000 | USA | Computer | 1500 | | 2000 | USA | NULL | 1575 | | 2000 | NULL | NULL | 4525 | | 2001 | Finland | Phone | 10 | | 2001 | Finland | NULL | 10 | | 2001 | USA | Calculator | 50 | | 2001 | USA | Computer | 2700 | | 2001 | USA | TV | 250 | | 2001 | USA | NULL | 3000 | | 2001 | NULL | NULL | 3010 | | NULL | NULL | NULL | 7535 | +------+---------+------------+--------+
Now the output includes summary information at four levels of analysis, not just one:
Following each set of product rows for a given year and
country, an extra super-aggregate summary row appears
showing the total for all products. These rows have the
product
column set to
NULL
.
Following each set of rows for a given year, an extra
super-aggregate summary row appears showing the total for
all countries and products. These rows have the
country
and products
columns set to NULL
.
Finally, following all other rows, an extra super-aggregate
summary row appears showing the grand total for all years,
countries, and products. This row has the
year
, country
, and
products
columns set to
NULL
.
The NULL
indicators in each super-aggregate
row are produced when the row is sent to the client. The server
looks at the columns named in the GROUP BY
clause following the leftmost one that has changed value. For
any column in the result set with a name that matches any of
those names, its value is set to NULL
. (If
you specify grouping columns by column position, the server
identifies which columns to set to NULL
by
position.)
Because the NULL
values in the
super-aggregate rows are placed into the result set at such a
late stage in query processing, you can test them as
NULL
values only in the select list or
HAVING
clause. You cannot test them as
NULL
values in join conditions or the
WHERE
clause to determine which rows to
select. For example, you cannot add WHERE product IS
NULL
to the query to eliminate from the output all but
the super-aggregate rows.
The NULL
values do appear as
NULL
on the client side and can be tested as
such using any MySQL client programming interface. However, at
this point, you cannot distinguish whether a
NULL
represents a regular grouped value or a
super-aggregate value. To test the distinction, use the
GROUPING()
function, described
later.
Previously, MySQL did not allow the use of
DISTINCT
or ORDER BY
in a
query having a WITH ROLLUP
option. This
restriction is lifted in MySQL 8.0.12 and later. (Bug #87450,
Bug #86311, Bug #26640100, Bug #26073513)
For GROUP BY ... WITH ROLLUP
queries, to test
whether NULL
values in the result represent
super-aggregate values, the
GROUPING()
function is available
for use in the select list, HAVING
clause,
and (as of MySQL 8.0.12) ORDER BY
clause. For
example, GROUPING(year)
returns 1
when NULL
in the year
column occurs in a super-aggregate row, and 0 otherwise.
Similarly, GROUPING(country)
and
GROUPING(product)
return 1 for
super-aggregate NULL
values in the
country
and product
columns, respectively:
mysql>SELECT
year, country, product, SUM(profit) AS profit,
GROUPING(year) AS grp_year,
GROUPING(country) AS grp_country,
GROUPING(product) AS grp_product
FROM sales
GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+--------+----------+-------------+-------------+ | year | country | product | profit | grp_year | grp_country | grp_product | +------+---------+------------+--------+----------+-------------+-------------+ | 2000 | Finland | Computer | 1500 | 0 | 0 | 0 | | 2000 | Finland | Phone | 100 | 0 | 0 | 0 | | 2000 | Finland | NULL | 1600 | 0 | 0 | 1 | | 2000 | India | Calculator | 150 | 0 | 0 | 0 | | 2000 | India | Computer | 1200 | 0 | 0 | 0 | | 2000 | India | NULL | 1350 | 0 | 0 | 1 | | 2000 | USA | Calculator | 75 | 0 | 0 | 0 | | 2000 | USA | Computer | 1500 | 0 | 0 | 0 | | 2000 | USA | NULL | 1575 | 0 | 0 | 1 | | 2000 | NULL | NULL | 4525 | 0 | 1 | 1 | | 2001 | Finland | Phone | 10 | 0 | 0 | 0 | | 2001 | Finland | NULL | 10 | 0 | 0 | 1 | | 2001 | USA | Calculator | 50 | 0 | 0 | 0 | | 2001 | USA | Computer | 2700 | 0 | 0 | 0 | | 2001 | USA | TV | 250 | 0 | 0 | 0 | | 2001 | USA | NULL | 3000 | 0 | 0 | 1 | | 2001 | NULL | NULL | 3010 | 0 | 1 | 1 | | NULL | NULL | NULL | 7535 | 1 | 1 | 1 | +------+---------+------------+--------+----------+-------------+-------------+
Instead of displaying the
GROUPING()
results directly, you
can use GROUPING()
to substitute
labels for super-aggregate NULL
values:
mysql>SELECT
IF(GROUPING(year), 'All years', year) AS year,
IF(GROUPING(country), 'All countries', country) AS country,
IF(GROUPING(product), 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
+-----------+---------------+--------------+--------+ | year | country | product | profit | +-----------+---------------+--------------+--------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | Finland | All products | 1600 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | | 2000 | India | All products | 1350 | | 2000 | USA | Calculator | 75 | | 2000 | USA | Computer | 1500 | | 2000 | USA | All products | 1575 | | 2000 | All countries | All products | 4525 | | 2001 | Finland | Phone | 10 | | 2001 | Finland | All products | 10 | | 2001 | USA | Calculator | 50 | | 2001 | USA | Computer | 2700 | | 2001 | USA | TV | 250 | | 2001 | USA | All products | 3000 | | 2001 | All countries | All products | 3010 | | All years | All countries | All products | 7535 | +-----------+---------------+--------------+--------+
With multiple expression arguments,
GROUPING()
returns a result
representing a bitmask the combines the results for each
expression, with the lowest-order bit corresponding to the
result for the rightmost expression. For example,
GROUPING(year, country, product)
is evaluated like this:
result for GROUPING(product
) + result for GROUPING(country
) << 1 + result for GROUPING(year
) << 2
The result of such a GROUPING()
is nonzero if any of the expressions represents a
super-aggregate NULL
, so you can return only
the super-aggregate rows and filter out the regular grouped rows
like this:
mysql>SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP
HAVING GROUPING(year, country, product) <> 0;
+------+---------+---------+--------+ | year | country | product | profit | +------+---------+---------+--------+ | 2000 | Finland | NULL | 1600 | | 2000 | India | NULL | 1350 | | 2000 | USA | NULL | 1575 | | 2000 | NULL | NULL | 4525 | | 2001 | Finland | NULL | 10 | | 2001 | USA | NULL | 3000 | | 2001 | NULL | NULL | 3010 | | NULL | NULL | NULL | 7535 | +------+---------+---------+--------+
The sales
table contains no
NULL
values, so all NULL
values in a ROLLUP
result represent
super-aggregate values. When the data set contains
NULL
values, ROLLUP
summaries may contain NULL
values not only in
super-aggregate rows, but also in regular grouped rows.
GROUPING()
enables these to be
distinguished. Suppose that table t1
contains
a simple data set with two grouping factors for a set of
quantity values, where NULL
indicates
something like “other” or “unknown”:
mysql> SELECT * FROM t1;
+------+-------+----------+
| name | size | quantity |
+------+-------+----------+
| ball | small | 10 |
| ball | large | 20 |
| ball | NULL | 5 |
| hoop | small | 15 |
| hoop | large | 5 |
| hoop | NULL | 3 |
+------+-------+----------+
A simple ROLLUP
operation produces these
results, in which it is not so easy to distinguish
NULL
values in super-aggregate rows from
NULL
values in regular grouped rows:
mysql>SELECT name, size, SUM(quantity) AS quantity
FROM t1
GROUP BY name, size WITH ROLLUP;
+------+-------+----------+ | name | size | quantity | +------+-------+----------+ | ball | NULL | 5 | | ball | large | 20 | | ball | small | 10 | | ball | NULL | 35 | | hoop | NULL | 3 | | hoop | large | 5 | | hoop | small | 15 | | hoop | NULL | 23 | | NULL | NULL | 58 | +------+-------+----------+
Using GROUPING()
to substitute
labels for the super-aggregate NULL
values
makes the result easier to interpret:
mysql>SELECT
IF(GROUPING(name) = 1, 'All items', name) AS name,
IF(GROUPING(size) = 1, 'All sizes', size) AS size,
SUM(quantity) AS quantity
FROM t1
GROUP BY name, size WITH ROLLUP;
+-----------+-----------+----------+ | name | size | quantity | +-----------+-----------+----------+ | ball | NULL | 5 | | ball | large | 20 | | ball | small | 10 | | ball | All sizes | 35 | | hoop | NULL | 3 | | hoop | large | 5 | | hoop | small | 15 | | hoop | All sizes | 23 | | All items | All sizes | 58 | +-----------+-----------+----------+
The following discussion lists some behaviors specific to the
MySQL implementation of ROLLUP
.
Prior to MySQL 8.0.12, when you use ROLLUP
,
you cannot also use an ORDER BY
clause to
sort the results. In other words, ROLLUP
and ORDER BY
were mutually exclusive in
MySQL. However, you still have some control over sort order.
To work around the restriction that prevents using
ROLLUP
with ORDER BY
and
achieve a specific sort order of grouped results, generate the
grouped result set as a derived table and apply ORDER
BY
to it. For example:
mysql>SELECT * FROM
(SELECT year, SUM(profit) AS profit
FROM sales GROUP BY year WITH ROLLUP) AS dt
ORDER BY year DESC;
+------+--------+ | year | profit | +------+--------+ | 2001 | 3010 | | 2000 | 4525 | | NULL | 7535 | +------+--------+
As of MySQL 8.0.12, ORDER BY
and
ROLLUP
can be used together, which enables
the use of ORDER BY
and
GROUPING()
to achieve a
specific sort order of grouped results. For example:
mysql>SELECT year, SUM(profit) AS profit
FROM sales
GROUP BY year WITH ROLLUP
ORDER BY GROUPING(year) DESC;
+------+--------+ | year | profit | +------+--------+ | NULL | 7535 | | 2000 | 4525 | | 2001 | 3010 | +------+--------+
In both cases, the super-aggregate summary rows sort with the rows from which they are calculated, and their placement depends on sort order (at the end for ascending sort, at the beginning for descending sort).
LIMIT
can be used to restrict the number of
rows returned to the client. LIMIT
is
applied after ROLLUP
, so the limit applies
against the extra rows added by ROLLUP
. For
example:
mysql>SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP
LIMIT 5;
+------+---------+------------+--------+ | year | country | product | profit | +------+---------+------------+--------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | Finland | NULL | 1600 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | +------+---------+------------+--------+
Using LIMIT
with ROLLUP
may produce results that are more difficult to interpret,
because there is less context for understanding the
super-aggregate rows.
A MySQL extension permits a column that does not appear in the
GROUP BY
list to be named in the select
list. (For information about nonaggregated columns and
GROUP BY
, see
Section 12.20.3, “MySQL Handling of GROUP BY”.) In this case, the server
is free to choose any value from this nonaggregated column in
summary rows, and this includes the extra rows added by
WITH ROLLUP
. For example, in the following
query, country
is a nonaggregated column
that does not appear in the GROUP BY
list
and values chosen for this column are nondeterministic:
mysql>SELECT year, country, SUM(profit) AS profit
FROM sales
GROUP BY year WITH ROLLUP;
+------+---------+--------+ | year | country | profit | +------+---------+--------+ | 2000 | India | 4525 | | 2001 | USA | 3010 | | NULL | USA | 7535 | +------+---------+--------+
This behavior is permitted when the
ONLY_FULL_GROUP_BY
SQL mode
is not enabled. If that mode is enabled, the server rejects
the query as illegal because country
is not
listed in the GROUP BY
clause. With
ONLY_FULL_GROUP_BY
enabled,
you can still execute the query by using the
ANY_VALUE()
function for
nondeterministic-value columns:
mysql>SELECT year, ANY_VALUE(country) AS country, SUM(profit) AS profit
FROM sales
GROUP BY year WITH ROLLUP;
+------+---------+--------+ | year | country | profit | +------+---------+--------+ | 2000 | India | 4525 | | 2001 | USA | 3010 | | NULL | USA | 7535 | +------+---------+--------+
SQL-92 and earlier does not permit queries for which the select
list, HAVING
condition, or ORDER
BY
list refer to nonaggregated columns that are not
named in the GROUP BY
clause. For example,
this query is illegal in standard SQL-92 because the
nonaggregated name
column in the select list
does not appear in the GROUP BY
:
SELECT o.custid, c.name, MAX(o.payment) FROM orders AS o, customers AS c WHERE o.custid = c.custid GROUP BY o.custid;
For the query to be legal in SQL-92, the name
column must be omitted from the select list or named in the
GROUP BY
clause.
SQL:1999 and later permits such nonaggregates per optional
feature T301 if they are functionally dependent on
GROUP BY
columns: If such a relationship
exists between name
and
custid
, the query is legal. This would be the
case, for example, were custid
a primary key
of customers
.
MySQL implements detection of functional dependence. If the
ONLY_FULL_GROUP_BY
SQL mode is
enabled (which it is by default), MySQL rejects queries for
which the select list, HAVING
condition, or
ORDER BY
list refer to nonaggregated columns
that are neither named in the GROUP BY
clause
nor are functionally dependent on them.
If ONLY_FULL_GROUP_BY
is
disabled, a MySQL extension to the standard SQL use of
GROUP BY
permits the select list,
HAVING
condition, or ORDER
BY
list to refer to nonaggregated columns even if the
columns are not functionally dependent on GROUP
BY
columns. This causes MySQL to accept the preceding
query. In this case, the server is free to choose any value from
each group, so unless they are the same, the values chosen are
nondeterministic, which is probably not what you want.
Furthermore, the selection of values from each group cannot be
influenced by adding an ORDER BY
clause.
Result set sorting occurs after values have been chosen, and
ORDER BY
does not affect which value within
each group the server chooses. Disabling
ONLY_FULL_GROUP_BY
is useful
primarily when you know that, due to some property of the data,
all values in each nonaggregated column not named in the
GROUP BY
are the same for each group.
You can achieve the same effect without disabling
ONLY_FULL_GROUP_BY
by using
ANY_VALUE()
to refer to the
nonaggregated column.
The following discussion demonstrates functional dependence, the error message MySQL produces when functional dependence is absent, and ways of causing MySQL to accept a query in the absence of functional dependence.
This query might be invalid with
ONLY_FULL_GROUP_BY
enabled
because the nonaggregated address
column in
the select list is not named in the GROUP BY
clause:
SELECT name, address, MAX(age) FROM t GROUP BY name;
The query is valid if name
is a primary key
of t
or is a unique NOT
NULL
column. In such cases, MySQL recognizes that the
selected column is functionally dependent on a grouping column.
For example, if name
is a primary key, its
value determines the value of address
because
each group has only one value of the primary key and thus only
one row. As a result, there is no randomness in the choice of
address
value in a group and no need to
reject the query.
The query is invalid if name
is not a primary
key of t
or a unique NOT
NULL
column. In this case, no functional dependency
can be inferred and an error occurs:
mysql> SELECT name, address, MAX(age) FROM t GROUP BY name;
ERROR 1055 (42000): Expression #2 of SELECT list is not in GROUP
BY clause and contains nonaggregated column 'mydb.t.address' which
is not functionally dependent on columns in GROUP BY clause; this
is incompatible with sql_mode=only_full_group_by
If you know that, for a given data set,
each name
value in fact uniquely determines
the address
value, address
is effectively functionally dependent on
name
. To tell MySQL to accept the query, you
can use the ANY_VALUE()
function:
SELECT name, ANY_VALUE(address), MAX(age) FROM t GROUP BY name;
Alternatively, disable
ONLY_FULL_GROUP_BY
.
The preceding example is quite simple, however. In particular, it is unlikely you would group on a single primary key column because every group would contain only one row. For addtional examples demonstrating functional dependence in more complex queries, see Section 12.20.4, “Detection of Functional Dependence”.
If a query has aggregate functions and no GROUP
BY
clause, it cannot have nonaggregated columns in the
select list, HAVING
condition, or
ORDER BY
list with
ONLY_FULL_GROUP_BY
enabled:
mysql> SELECT name, MAX(age) FROM t;
ERROR 1140 (42000): In aggregated query without GROUP BY, expression
#1 of SELECT list contains nonaggregated column 'mydb.t.name'; this
is incompatible with sql_mode=only_full_group_by
Without GROUP BY
, there is a single group and
it is nondeterministic which name
value to
choose for the group. Here, too,
ANY_VALUE()
can be used, if it is
immaterial which name
value MySQL chooses:
SELECT ANY_VALUE(name), MAX(age) FROM t;
ONLY_FULL_GROUP_BY
also affects handling of
queries that use DISTINCT
and ORDER
BY
. Consider the case of a table t
with three columns c1
, c2
,
and c3
that contains these rows:
c1 c2 c3 1 2 A 3 4 B 1 2 C
Suppose that we execute the following query, expecting the
results to be ordered by c3
:
SELECT DISTINCT c1, c2 FROM t ORDER BY c3;
To order the result, duplicates must be eliminated first. But to
do so, should we keep the first row or the third? This arbitrary
choice influences the retained value of c3
,
which in turn influences ordering and makes it arbitrary as
well. To prevent this problem, a query that has
DISTINCT
and ORDER BY
is
rejected as invalid if any ORDER BY
expression does not satisfy at least one of these conditions:
The expression is equal to one in the select list
All columns referenced by the expression and belonging to the query's selected tables are elements of the select list
Another MySQL extension to standard SQL permits references in
the HAVING
clause to aliased expressions in
the select list. For example, the following query returns
name
values that occur only once in table
orders
:
SELECT name, COUNT(name) FROM orders GROUP BY name HAVING COUNT(name) = 1;
The MySQL extension permits the use of an alias in the
HAVING
clause for the aggregated column:
SELECT name, COUNT(name) AS c FROM orders GROUP BY name HAVING c = 1;
Standard SQL permits only column expressions in GROUP
BY
clauses, so a statement such as this is invalid
because FLOOR(value/100)
is a noncolumn
expression:
SELECT id, FLOOR(value/100)
FROM tbl_name
GROUP BY id, FLOOR(value/100);
MySQL extends standard SQL to permit noncolumn expressions in
GROUP BY
clauses and considers the preceding
statement valid.
Standard SQL also does not permit aliases in GROUP
BY
clauses. MySQL extends standard SQL to permit
aliases, so another way to write the query is as follows:
SELECT id, FLOOR(value/100) AS val
FROM tbl_name
GROUP BY id, val;
The alias val
is considered a column
expression in the GROUP BY
clause.
In the presence of a noncolumn expression in the GROUP
BY
clause, MySQL recognizes equality between that
expression and expressions in the select list. This means that
with ONLY_FULL_GROUP_BY
SQL
mode enabled, the query containing GROUP BY id,
FLOOR(value/100)
is valid because that same
FLOOR()
expression occurs in the
select list. However, MySQL does not try to recognize functional
dependence on GROUP BY
noncolumn expressions,
so the following query is invalid with
ONLY_FULL_GROUP_BY
enabled,
even though the third selected expression is a simple formula of
the id
column and the
FLOOR()
expression in the
GROUP BY
clause:
SELECT id, FLOOR(value/100), id+FLOOR(value/100)
FROM tbl_name
GROUP BY id, FLOOR(value/100);
A workaround is to use a derived table:
SELECT id, F, id+F
FROM
(SELECT id, FLOOR(value/100) AS F
FROM tbl_name
GROUP BY id, FLOOR(value/100)) AS dt;
The following discussion provides several examples of the ways in which MySQL detects functional dependencies. The examples use this notation:
{X
} -> {Y
}
Understand this as “X
uniquely
determines Y
,” which also
means that Y
is functionally
dependent on X
.
The examples use the world
database, which
can be downloaded from
https://dev.mysql.com/doc/index-other.html. You can find details
on how to install the database on the same page.
The following query selects, for each country, a count of spoken languages:
SELECT co.Name, COUNT(*) FROM countrylanguage cl, country co WHERE cl.CountryCode = co.Code GROUP BY co.Code;
co.Code
is a primary key of
co
, so all columns of co
are functionally dependent on it, as expressed using this
notation:
{co.Code} -> {co.*}
Thus, co.name
is functionally dependent on
GROUP BY
columns and the query is valid.
A UNIQUE
index over a NOT
NULL
column could be used instead of a primary key
and the same functional dependence would apply. (This is not
true for a UNIQUE
index that permits
NULL
values because it permits multiple
NULL
values and in that case uniqueness is
lost.)
This query selects, for each country, a list of all spoken languages and how many people speak them:
SELECT co.Name, cl.Language, cl.Percentage * co.Population / 100.0 AS SpokenBy FROM countrylanguage cl, country co WHERE cl.CountryCode = co.Code GROUP BY cl.CountryCode, cl.Language;
The pair (cl.CountryCode
,
cl.Language
) is a two-column composite
primary key of cl
, so that column pair
uniquely determines all columns of cl
:
{cl.CountryCode, cl.Language} -> {cl.*}
Moreover, because of the equality in the
WHERE
clause:
{cl.CountryCode} -> {co.Code}
And, because co.Code
is primary key of
co
:
{co.Code} -> {co.*}
“Uniquely determines” relationships are transitive, therefore:
{cl.CountryCode, cl.Language} -> {cl.*,co.*}
As a result, the query is valid.
As with the previous example, a UNIQUE
key
over NOT NULL
columns could be used instead
of a primary key.
An INNER JOIN
condition can be used instead
of WHERE
. The same functional dependencies
apply:
SELECT co.Name, cl.Language, cl.Percentage * co.Population/100.0 AS SpokenBy FROM countrylanguage cl INNER JOIN country co ON cl.CountryCode = co.Code GROUP BY cl.CountryCode, cl.Language;
Whereas an equality test in a WHERE
condition or INNER JOIN
condition is
symmetric, an equality test in an outer join condition is not,
because tables play different roles.
Assume that referential integrity has been accidentally broken
and there exists a row of countrylanguage
without a corresponding row in country
.
Consider the same query as in the previous example, but with a
LEFT JOIN
:
SELECT co.Name, cl.Language, cl.Percentage * co.Population/100.0 AS SpokenBy FROM countrylanguage cl LEFT JOIN country co ON cl.CountryCode = co.Code GROUP BY cl.CountryCode, cl.Language;
For a given value of cl.CountryCode
, the
value of co.Code
in the join result is
either found in a matching row (determined by
cl.CountryCode
) or is
NULL
-complemented if there is no match
(also determined by cl.CountryCode
). In
each case, this relationship applies:
{cl.CountryCode} -> {co.Code}
cl.CountryCode
is itself functionally
dependent on {cl.CountryCode
,
cl.Language
} which is a primary key.
If in the join result co.Code
is
NULL
-complemented,
co.Name
is as well. If
co.Code
is not
NULL
-complemented, then because
co.Code
is a primary key, it determines
co.Name
. Therefore, in all cases:
{co.Code} -> {co.Name}
Which yields:
{cl.CountryCode, cl.Language} -> {cl.*,co.*}
As a result, the query is valid.
However, suppose that the tables are swapped, as in this query:
SELECT co.Name, cl.Language, cl.Percentage * co.Population/100.0 AS SpokenBy FROM country co LEFT JOIN countrylanguage cl ON cl.CountryCode = co.Code GROUP BY cl.CountryCode, cl.Language;
Now this relationship does not apply:
{cl.CountryCode, cl.Language} -> {cl.*,co.*}
Indeed, all NULL
-complemented rows made for
cl
will be put into a single group (they
have both GROUP BY
columns equal to
NULL
), and inside this group the value of
co.Name
can vary. The query is invalid and
MySQL rejects it.
Functional dependence in outer joins is thus linked to whether
determinant columns belong to the left or right side of the
LEFT JOIN
. Determination of functional
dependence becomes more complex if there are nested outer
joins or the join condition does not consist entirely of
equality comparisons.
Suppose that a view on countries produces their code, their name in uppercase, and how many different official languages they have:
CREATE VIEW Country2 AS SELECT co.Code, UPPER(co.Name) AS UpperName, COUNT(cl.Language) AS OfficialLanguages FROM country AS co JOIN countrylanguage AS cl ON cl.CountryCode = co.Code WHERE cl.isOfficial = 'T' GROUP BY co.Code;
This definition is valid because:
{co.Code} -> {co.*}
In the view result, the first selected column is
co.Code
, which is also the group column and
thus determines all other selected expressions:
{Country2.Code} -> {Country2.*}
MySQL understands this and uses this information, as described following.
This query displays countries, how many different official
languages they have, and how many cities they have, by joining
the view with the city
table:
SELECT co2.Code, co2.UpperName, co2.OfficialLanguages, COUNT(*) AS Cities FROM country2 AS co2 JOIN city ci ON ci.CountryCode = co2.Code GROUP BY co2.Code;
This query is valid because, as seen previously:
{co2.Code} -> {co2.*}
MySQL is able to discover a functional dependency in the
result of a view and use that to validate a query which uses
the view. The same would be true if
country2
were a derived table (or common
table expression), as in:
SELECT co2.Code, co2.UpperName, co2.OfficialLanguages, COUNT(*) AS Cities FROM ( SELECT co.Code, UPPER(co.Name) AS UpperName, COUNT(cl.Language) AS OfficialLanguages FROM country AS co JOIN countrylanguage AS cl ON cl.CountryCode=co.Code WHERE cl.isOfficial='T' GROUP BY co.Code ) AS co2 JOIN city ci ON ci.CountryCode = co2.Code GROUP BY co2.Code;
MySQL supports window functions that, for each row from a query,
perform a calculation using rows related to that row. The
following sections discuss how to use window functions, including
descriptions of the OVER
and
WINDOW
clauses. The first section provides
descriptions of the nonaggregate window functions. For
descriptions of the aggregate window functions, see
Section 12.20.1, “Aggregate (GROUP BY) Function Descriptions”.
For information about optimization and window functions, see Section 8.2.1.19, “Window Function Optimization”.
This section describes nonaggregate window functions that, for each row from a query, perform a calculation using rows related to that row. Most aggregate functions also can be used as window functions; see Section 12.20.1, “Aggregate (GROUP BY) Function Descriptions”.
For window function usage information and examples, and
definitions of terms such as the OVER
clause,
window, partition, frame, and peer, see
Section 12.21.2, “Window Function Concepts and Syntax”.
Table 12.27 Window Functions
Name | Description |
---|---|
CUME_DIST() |
Cumulative distribution value |
DENSE_RANK() |
Rank of current row within its partition, without gaps |
FIRST_VALUE() |
Value of argument from first row of window frame |
LAG() |
Value of argument from row lagging current row within partition |
LAST_VALUE() |
Value of argument from last row of window frame |
LEAD() |
Value of argument from row leading current row within partition |
NTH_VALUE() |
Value of argument from N-th row of window frame |
NTILE() |
Bucket number of current row within its partition. |
PERCENT_RANK() |
Percentage rank value |
RANK() |
Rank of current row within its partition, with gaps |
ROW_NUMBER() |
Number of current row within its partition |
In the following function descriptions,
over_clause
represents the
OVER
clause, described in
Section 12.21.2, “Window Function Concepts and Syntax”. Some window functions
permit a null_treatment
clause that
specifies how to handle NULL
values when
calculating results. This clause is optional. It is part of the
SQL standard, but the MySQL implementation permits only
RESPECT NULLS
(which is also the default).
This means that NULL
values are considered
when calculating results. IGNORE NULLS
is
parsed, but produces an error.
CUME_DIST()
over_clause
Returns the cumulative distribution of a value within a group of values; that is, the percentage of partition values less than or equal to the value in the current row. This represents the number of rows preceding or peer with the current row in the window ordering of the window partition divided by the total number of rows in the window partition. Return values range from 0 to 1.
This function should be used with ORDER
BY
to sort partition rows into the desired order.
Without ORDER BY
, all rows are peers and
have value
N
/N
=
1, where N
is the partition size.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
The following query shows, for the set of values in the
val
column, the
CUME_DIST()
value for each row, as well
as the percentage rank value returned by the similar
PERCENT_RANK()
function. For reference,
the query also displays row numbers using
ROW_NUMBER()
:
mysql>SELECT
val,
ROW_NUMBER() OVER w AS 'row_number',
CUME_DIST() OVER w AS 'cume_dist',
PERCENT_RANK() OVER w AS 'percent_rank'
FROM numbers
WINDOW w AS (ORDER BY val);
+------+------------+--------------------+--------------+ | val | row_number | cume_dist | percent_rank | +------+------------+--------------------+--------------+ | 1 | 1 | 0.2222222222222222 | 0 | | 1 | 2 | 0.2222222222222222 | 0 | | 2 | 3 | 0.3333333333333333 | 0.25 | | 3 | 4 | 0.6666666666666666 | 0.375 | | 3 | 5 | 0.6666666666666666 | 0.375 | | 3 | 6 | 0.6666666666666666 | 0.375 | | 4 | 7 | 0.8888888888888888 | 0.75 | | 4 | 8 | 0.8888888888888888 | 0.75 | | 5 | 9 | 1 | 1 | +------+------------+--------------------+--------------+
DENSE_RANK()
over_clause
Returns the rank of the current row within its partition,
without gaps. Peers are considered ties and receive the same
rank. This function assigns consecutive ranks to peer
groups; the result is that groups of size greater than one
do not produce noncontiguous rank numbers. For an example,
see the RANK()
function
description.
This function should be used with ORDER
BY
to sort partition rows into the desired order.
Without ORDER BY
, all rows are peers.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
FIRST_VALUE(
[expr
)null_treatment
]
over_clause
Returns the value of expr
from
the first row of the window frame.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
null_treatment
is as described in
the section introduction.
The following query demonstrates
FIRST_VALUE()
,
LAST_VALUE()
, and two
instances of NTH_VALUE()
:
mysql>SELECT
time, subject, val,
FIRST_VALUE(val) OVER w AS 'first',
LAST_VALUE(val) OVER w AS 'last',
NTH_VALUE(val, 2) OVER w AS 'second',
NTH_VALUE(val, 4) OVER w AS 'fourth'
FROM observations
WINDOW w AS (PARTITION BY subject ORDER BY time
ROWS UNBOUNDED PRECEDING);
+----------+---------+------+-------+------+--------+--------+ | time | subject | val | first | last | second | fourth | +----------+---------+------+-------+------+--------+--------+ | 07:00:00 | st113 | 10 | 10 | 10 | NULL | NULL | | 07:15:00 | st113 | 9 | 10 | 9 | 9 | NULL | | 07:30:00 | st113 | 25 | 10 | 25 | 9 | NULL | | 07:45:00 | st113 | 20 | 10 | 20 | 9 | 20 | | 07:00:00 | xh458 | 0 | 0 | 0 | NULL | NULL | | 07:15:00 | xh458 | 10 | 0 | 10 | 10 | NULL | | 07:30:00 | xh458 | 5 | 0 | 5 | 10 | NULL | | 07:45:00 | xh458 | 30 | 0 | 30 | 10 | 30 | | 08:00:00 | xh458 | 25 | 0 | 25 | 10 | 30 | +----------+---------+------+-------+------+--------+--------+
Each function uses the rows in the current frame, which, per
the window definition shown, extends from the first
partition row to the current row. For the
NTH_VALUE()
calls, the
current frame does not always include the requested row; in
such cases, the return value is NULL
.
LAG(
[expr
[,
N
[,
default
]])null_treatment
]
over_clause
Returns the value of expr
from
the row that lags (precedes) the current row by
N
rows within its partition. If
there is no such row, the return value is
default
. For example, if
N
is 3, the return value is
default
for the first two rows.
If N
or
default
are missing, the defaults
are 1 and NULL
, respectively.
N
must be a literal nonnegative
integer. If N
is 0,
expr
is evaluated for the current
row.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
null_treatment
is as described in
the section introduction.
LAG()
(and the similar
LEAD()
function) are often
used to compute differences between rows. The following
query shows a set of time-ordered observations and, for each
one, the LAG()
and
LEAD()
values from the
adjoining rows, as well as the differences between the
current and adjoining rows:
mysql>SELECT
t, val,
LAG(val) OVER w AS 'lag',
LEAD(val) OVER w AS 'lead',
val - LAG(val) OVER w AS 'lag diff',
val - LEAD(val) OVER w AS 'lead diff'
FROM series
WINDOW w AS (ORDER BY t);
+----------+------+------+------+----------+-----------+ | t | val | lag | lead | lag diff | lead diff | +----------+------+------+------+----------+-----------+ | 12:00:00 | 100 | NULL | 125 | NULL | -25 | | 13:00:00 | 125 | 100 | 132 | 25 | -7 | | 14:00:00 | 132 | 125 | 145 | 7 | -13 | | 15:00:00 | 145 | 132 | 140 | 13 | 5 | | 16:00:00 | 140 | 145 | 150 | -5 | -10 | | 17:00:00 | 150 | 140 | 200 | 10 | -50 | | 18:00:00 | 200 | 150 | NULL | 50 | NULL | +----------+------+------+------+----------+-----------+
In the example, the LAG()
and
LEAD()
calls use the default
N
and
default
values of 1 and
NULL
, respectively.
The first row shows what happens when there is no previous
row for LAG()
: The function
returns the default
value (in
this case, NULL
). The last row shows the
same thing when there is no next row for
LEAD()
.
LAG()
and
LEAD()
also serve to compute
sums rather than differences. Consider this data set, which
contains the first few numbers of the Fibonacci series:
mysql> SELECT n FROM fib ORDER BY n;
+------+
| n |
+------+
| 1 |
| 1 |
| 2 |
| 3 |
| 5 |
| 8 |
+------+
The following query shows the
LAG()
and
LEAD()
values for the rows
adjacent to the current row. It also uses those functions to
add to the current row value the values from the preceding
and following rows. The effect is to generate the next
number in the Fibonacci series, and the next number after
that:
mysql>SELECT
n,
LAG(n, 1, 0) OVER w AS 'lag',
LEAD(n, 1, 0) OVER w AS 'lead',
n + LAG(n, 1, 0) OVER w AS 'next_n',
n + LEAD(n, 1, 0) OVER w AS 'next_next_n'
FROM fib
WINDOW w AS (ORDER BY n);
+------+------+------+--------+-------------+ | n | lag | lead | next_n | next_next_n | +------+------+------+--------+-------------+ | 1 | 0 | 1 | 1 | 2 | | 1 | 1 | 2 | 2 | 3 | | 2 | 1 | 3 | 3 | 5 | | 3 | 2 | 5 | 5 | 8 | | 5 | 3 | 8 | 8 | 13 | | 8 | 5 | 0 | 13 | 8 | +------+------+------+--------+-------------+
One way to generate the initial set of Fibonacci numbers is to use a recursive common table expression. For an example, see Fibonacci Series Generation.
LAST_VALUE(
[expr
)null_treatment
]
over_clause
Returns the value of expr
from
the last row of the window frame.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
null_treatment
is as described in
the section introduction.
For an example, see the
FIRST_VALUE()
function
description.
LEAD(
[expr
[,
N
[,
default
]])null_treatment
]
over_clause
Returns the value of expr
from
the row that leads (follows) the current row by
N
rows within its partition. If
there is no such row, the return value is
default
. For example, if
N
is 3, the return value is
default
for the last two rows. If
N
or
default
are missing, the defaults
are 1 and NULL
, respectively.
N
must be a literal nonnegative
integer. If N
is 0,
expr
is evaluated for the current
row.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
null_treatment
is as described in
the section introduction.
For an example, see the LAG()
function description.
NTH_VALUE(
[expr
,
N
)from_first_last
]
[null_treatment
]
over_clause
Returns the value of expr
from
the N
-th row of the window frame.
If there is no such row, the return value is
NULL
.
N
must be a literal positive
integer.
from_first_last
is part of the
SQL standard, but the MySQL implementation permits only
FROM FIRST
(which is also the default).
This means that calculations begin at the first row of the
window. FROM LAST
is parsed, but produces
an error. To obtain the same effect as FROM
LAST
(begin calculations at the last row of the
window), use ORDER BY
to sort in reverse
order.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
null_treatment
is as described in
the section introduction.
For an example, see the
FIRST_VALUE()
function
description.
NTILE(
N
)over_clause
Divides a partition into N
groups
(buckets), assigns each row in the partition its bucket
number, and returns the bucket number of the current row
within its partition. For example, if
N
is 4,
NTILE()
divides rows into four buckets.
If N
is 100,
NTILE()
divides rows into 100 buckets.
N
must be a literal positive
integer. Bucket number return values range from 1 to
N
.
This function should be used with ORDER
BY
to sort partition rows into the desired order.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
The following query shows, for the set of values in the
val
column, the percentile values
resulting from dividing the rows into two or four groups.
For reference, the query also displays row numbers using
ROW_NUMBER()
:
mysql>SELECT
val,
ROW_NUMBER() OVER w AS 'row_number',
NTILE(2) OVER w AS 'ntile2',
NTILE(4) OVER w AS 'ntile4'
FROM numbers
WINDOW w AS (ORDER BY val);
+------+------------+--------+--------+ | val | row_number | ntile2 | ntile4 | +------+------------+--------+--------+ | 1 | 1 | 1 | 1 | | 1 | 2 | 1 | 1 | | 2 | 3 | 1 | 1 | | 3 | 4 | 1 | 2 | | 3 | 5 | 1 | 2 | | 3 | 6 | 2 | 3 | | 4 | 7 | 2 | 3 | | 4 | 8 | 2 | 4 | | 5 | 9 | 2 | 4 | +------+------------+--------+--------+
PERCENT_RANK()
over_clause
Returns the the percentage of partition values less than the
value in the current row, excluding the highest value.
Return values range from 0 to 1 and represent the row
relative rank, calculated as the result of this formula,
where rank
is the row rank and
rows
is the number of partition
rows:
(rank
- 1) / (rows
- 1)
This function should be used with ORDER
BY
to sort partition rows into the desired order.
Without ORDER BY
, all rows are peers.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
For an example, see the
CUME_DIST()
function
description.
RANK()
over_clause
Returns the rank of the current row within its partition, with gaps. Peers are considered ties and receive the same rank. This function does not assign consecutive ranks to peer groups if groups of size greater than one exist; the result is noncontiguous rank numbers.
This function should be used with ORDER
BY
to sort partition rows into the desired order.
Without ORDER BY
, all rows are peers.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
The following query shows the difference between
RANK()
, which produces ranks
with gaps, and DENSE_RANK()
,
which produces ranks without gaps. The query shows rank
values for each member of a set of values in the
val
column, which contains some
duplicates. RANK()
assigns
peers (the duplicates) the same rank value, and the next
greater value has a rank higher by the number of peers minus
one. DENSE_RANK()
also
assigns peers the same rank value, but the next higher value
has a rank one greater. For reference, the query also
displays row numbers using
ROW_NUMBER()
:
mysql>SELECT
val,
ROW_NUMBER() OVER w AS 'row_number',
RANK() OVER w AS 'rank',
DENSE_RANK() OVER w AS 'dense_rank'
FROM numbers
WINDOW w AS (ORDER BY val);
+------+------------+------+------------+ | val | row_number | rank | dense_rank | +------+------------+------+------------+ | 1 | 1 | 1 | 1 | | 1 | 2 | 1 | 1 | | 2 | 3 | 3 | 2 | | 3 | 4 | 4 | 3 | | 3 | 5 | 4 | 3 | | 3 | 6 | 4 | 3 | | 4 | 7 | 7 | 4 | | 4 | 8 | 7 | 4 | | 5 | 9 | 9 | 5 | +------+------------+------+------------+
ROW_NUMBER()
over_clause
Returns the number of the current row within its partition. Rows numbers range from 1 to the number of partition rows.
ORDER BY
affects the order in which rows
are numbered. Without ORDER BY
, row
numbering is nondeterministic.
ROW_NUMBER()
assigns peers
different row numbers. To assign peers the same value, use
RANK()
or
DENSE_RANK()
. For an example,
see the RANK()
function
description.
over_clause
is as described in
Section 12.21.2, “Window Function Concepts and Syntax”.
This section describes how to use window functions. Examples use
the same sales information data set as found in the discussion
of the GROUPING()
function in
Section 12.20.2, “GROUP BY Modifiers”:
mysql> SELECT * FROM sales ORDER BY country, year, product;
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2001 | Finland | Phone | 10 |
| 2000 | India | Calculator | 75 |
| 2000 | India | Calculator | 75 |
| 2000 | India | Computer | 1200 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 1500 |
| 2001 | USA | Computer | 1200 |
| 2001 | USA | TV | 150 |
| 2001 | USA | TV | 100 |
+------+---------+------------+--------+
A window function performs an aggregate-like operation on a set of query rows. However, whereas an aggregate operation groups query rows into a single result row, a window function produces a result for each query row:
The row for which function evaluation occurs is called the current row.
The query rows related to the current row over which function evaluation occurs comprise the window for the current row.
For example, using the sales information table, these two queries perform aggregate operations that produce a single global sum for all rows taken as a group, and sums grouped per country:
mysql>SELECT SUM(profit) AS total_profit
FROM sales;
+--------------+ | total_profit | +--------------+ | 7535 | +--------------+ mysql>SELECT country, SUM(profit) AS country_profit
FROM sales
GROUP BY country
ORDER BY country;
+---------+----------------+ | country | country_profit | +---------+----------------+ | Finland | 1610 | | India | 1350 | | USA | 4575 | +---------+----------------+
By contrast, window operations do not collapse groups of query
rows to a single output row. Instead, they produce a result for
each row. Like the preceding queries, the following query uses
SUM()
, but this time as a window
function:
mysql>SELECT
year, country, product, profit,
SUM(profit) OVER() AS total_profit,
SUM(profit) OVER(PARTITION BY country) AS country_profit
FROM sales
ORDER BY country, year, product, profit;
+------+---------+------------+--------+--------------+----------------+ | year | country | product | profit | total_profit | country_profit | +------+---------+------------+--------+--------------+----------------+ | 2000 | Finland | Computer | 1500 | 7535 | 1610 | | 2000 | Finland | Phone | 100 | 7535 | 1610 | | 2001 | Finland | Phone | 10 | 7535 | 1610 | | 2000 | India | Calculator | 75 | 7535 | 1350 | | 2000 | India | Calculator | 75 | 7535 | 1350 | | 2000 | India | Computer | 1200 | 7535 | 1350 | | 2000 | USA | Calculator | 75 | 7535 | 4575 | | 2000 | USA | Computer | 1500 | 7535 | 4575 | | 2001 | USA | Calculator | 50 | 7535 | 4575 | | 2001 | USA | Computer | 1200 | 7535 | 4575 | | 2001 | USA | Computer | 1500 | 7535 | 4575 | | 2001 | USA | TV | 100 | 7535 | 4575 | | 2001 | USA | TV | 150 | 7535 | 4575 | +------+---------+------------+--------+--------------+----------------+
Each window operation in the query is signified by inclusion of
an OVER
clause that specifies how to
partition query rows into groups for processing by the window
function:
The first OVER
clause is empty, which
treats the entire set of query rows as a single partition.
The window function thus produces a global sum, but does so
for each row.
The second OVER
clause partitions rows by
country, producing a sum per partition (per country). The
function produces this sum for each partition row.
Window functions are permitted only in the select list and
ORDER BY
clause. Query result rows are
determined from the FROM
clause, after
WHERE
, GROUP BY
, and
HAVING
processing, and windowing execution
occurs before ORDER BY
,
LIMIT
, and SELECT
DISTINCT
.
The OVER
clause is permitted for many
aggregate functions, which therefore can be used as window or
nonwindow functions, depending on whether the
OVER
clause is present or absent:
AVG() BIT_AND() BIT_OR() BIT_XOR() COUNT() JSON_ARRAYAGG() JSON_OBJECTAGG() MAX() MIN() STDDEV_POP(), STDDEV(), STD() STDDEV_SAMP() SUM() VAR_POP(), VARIANCE() VAR_SAMP()
For details about each aggregate function, see Section 12.20.1, “Aggregate (GROUP BY) Function Descriptions”.
MySQL also supports nonaggregate functions that are used only as
window functions. For these, the OVER
clause
is mandatory:
CUME_DIST() DENSE_RANK() FIRST_VALUE() LAG() LAST_VALUE() LEAD() NTH_VALUE() NTILE() PERCENT_RANK() RANK() ROW_NUMBER()
For details about each nonaggregate function, see Section 12.21.1, “Window Function Descriptions”.
As an example of one of those nonaggregate window functions,
this query uses ROW_NUMBER()
,
which produces the row number of each row within its partition.
In this case, rows are numbered per country. By default,
partition rows are unordered and row numbering is
nondeterministic. To sort partition rows, include an
ORDER BY
clause within the window definition.
The query uses unordered and ordered partitions (the
row_num1
and row_num2
columns) to illustrate the difference between omitting and
including ORDER BY
:
mysql>SELECT
year, country, product, profit,
ROW_NUMBER() OVER(PARTITION BY country) AS row_num1,
ROW_NUMBER() OVER(PARTITION BY country ORDER BY year, product) AS row_num2
FROM sales;
+------+---------+------------+--------+----------+----------+ | year | country | product | profit | row_num1 | row_num2 | +------+---------+------------+--------+----------+----------+ | 2000 | Finland | Computer | 1500 | 2 | 1 | | 2000 | Finland | Phone | 100 | 1 | 2 | | 2001 | Finland | Phone | 10 | 3 | 3 | | 2000 | India | Calculator | 75 | 2 | 1 | | 2000 | India | Calculator | 75 | 3 | 2 | | 2000 | India | Computer | 1200 | 1 | 3 | | 2000 | USA | Calculator | 75 | 5 | 1 | | 2000 | USA | Computer | 1500 | 4 | 2 | | 2001 | USA | Calculator | 50 | 2 | 3 | | 2001 | USA | Computer | 1500 | 3 | 4 | | 2001 | USA | Computer | 1200 | 7 | 5 | | 2001 | USA | TV | 150 | 1 | 6 | | 2001 | USA | TV | 100 | 6 | 7 | +------+---------+------------+--------+----------+----------+
As mentioned previously, to use a window function (or treat an
aggregate function as a window function), include an
OVER
clause following the function call. The
OVER
clause has two forms:
over_clause
: {OVER (window_spec
) | OVERwindow_name
}
Both forms define how the window function should process query
rows. They differ in whether the window is defined directly in
the OVER
clause, or supplied by a reference
to a named window defined elsewhere in the query:
In the first case, the window specification appears directly
in the OVER
clause, between the
parentheses.
In the second case, window_name
is the name for a window specification defined by a
WINDOW
clause elsewhere in the query. For
details, see
Section 12.21.4, “Named Windows”.
For OVER
(
syntax, the
window specification has several parts, all optional:
window_spec
)
window_spec
: [window_name
] [partition_clause
] [order_clause
] [frame_clause
]
If OVER()
is empty, the window consists of
all query rows and the window function computes a result using
all rows. Otherwise, the clauses present within the parentheses
determine which query rows are used to compute the function
result and how they are partitioned and ordered:
window_name
: The name of a window
defined by a WINDOW
clause elsewhere in
the query. If window_name
appears
by itself within the OVER
clause, it
completely defines the window. If partitioning, ordering, or
framing clauses are also given, they modify interpretation
of the named window. For details, see
Section 12.21.4, “Named Windows”.
partition_clause
: A
PARTITION BY
clause indicates how to
divide the query rows into groups. The window function
result for a given row is based on the rows of the partition
that contains the row. If PARTITION BY
is
omitted, there is a single partition consisting of all query
rows.
Partitioning for window functions differs from table partitioning. For information about table partitioning, see Chapter 23, Partitioning.
partition_clause
has this syntax:
partition_clause
: PARTITION BYexpr
[,expr
] ...
Standard SQL requires PARTITION BY
to be
followed by column names only. A MySQL extension is to
permit expressions, not just column names. For example, if a
table contains a TIMESTAMP
column named ts
, standard SQL permits
PARTITION BY ts
but not
PARTITION BY HOUR(ts)
, whereas MySQL
permits both.
order_clause
: An ORDER
BY
clause indicates how to sort rows in each
partition. Partition rows that are equal according to the
ORDER BY
clause are considered peers. If
ORDER BY
is omitted, partition rows are
unordered, with no processing order implied, and all
partition rows are peers.
order_clause
has this syntax:
order_clause
: ORDER BYexpr
[ASC|DESC] [,expr
[ASC|DESC]] ...
Each ORDER BY
expression optionally can
be followed by ASC
or
DESC
to indicate sort direction. The
default is ASC
if no direction is
specified. NULL
values sort first for
ascending sorts, last for descending sorts.
An ORDER BY
in a window definition
applies within individual partitions. To sort the result set
as a whole, include an ORDER BY
at the
query top level.
frame_clause
: A frame is a subset
of the current partition and the frame clause specifies how
to define the subset. The frame clause has many subclauses
of its own. For details, see
Section 12.21.3, “Window Function Frame Specification”.
The definition of a window used with a window function can include a frame clause. A frame is a subset of the current partition and the frame clause specifies how to define the subset.
Frames are determined with respect to the current row, which enables a frame to move within a partition depending on the location of the current row within its partition. Examples:
By defining a frame to be all rows from the partition start to the current row, you can compute running totals for each row.
By defining a frame as extending
N
rows on either side of the
current row, you can compute rolling averages.
The following query demonstrates the use of moving frames to
compute running totals within each group of time-ordered
level
values, as well as rolling averages
computed from the current row and the rows that immediately
precede and follow it:
mysql>SELECT
time, subject, val,
SUM(val) OVER (PARTITION BY subject ORDER BY time
ROWS UNBOUNDED PRECEDING)
AS running_total,
AVG(val) OVER (PARTITION BY subject ORDER BY time
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
AS running_average
FROM observations;
+----------+---------+------+---------------+-----------------+ | time | subject | val | running_total | running_average | +----------+---------+------+---------------+-----------------+ | 07:00:00 | st113 | 10 | 10 | 9.5000 | | 07:15:00 | st113 | 9 | 19 | 14.6667 | | 07:30:00 | st113 | 25 | 44 | 18.0000 | | 07:45:00 | st113 | 20 | 64 | 22.5000 | | 07:00:00 | xh458 | 0 | 0 | 5.0000 | | 07:15:00 | xh458 | 10 | 10 | 5.0000 | | 07:30:00 | xh458 | 5 | 15 | 15.0000 | | 07:45:00 | xh458 | 30 | 45 | 20.0000 | | 08:00:00 | xh458 | 25 | 70 | 27.5000 | +----------+---------+------+---------------+-----------------+
For the running_average
column, there is no
frame row preceding the first one or following the last. In
these cases, AVG()
computes the
average of the rows that are available.
Aggregate functions used as window functions operate on rows in the current row frame, as do these nonaggregate window functions:
FIRST_VALUE() LAST_VALUE() NTH_VALUE()
Standard SQL specifies that window functions that operate on the entire partition should have no frame clause. MySQL permits a frame clause for such functions but ignores it. These functions use the entire partition even if a frame is specified:
CUME_DIST() DENSE_RANK() LAG() LEAD() NTILE() PERCENT_RANK() RANK() ROW_NUMBER()
The frame clause, if given, has this syntax:
frame_clause
:frame_units
frame_extent
frame_units
: {ROWS | RANGE}
In the absence of a frame clause, the default frame depends on
whether an ORDER BY
clause is present, as
described later in this section.
The frame_units
value indicates the
type of relationship between the current row and frame rows:
ROWS
: The frame is defined by beginning
and ending row positions. Offsets are differences in row
numbers from the current row number.
RANGE
: The frame is defined by rows
within a value range. Offsets are differences in row values
from the current row value.
The frame_extent
value indicates the
start and end points of the frame. You can specify just the
start of the frame (in which case the current row is implicitly
the end) or use BETWEEN
to specify both frame
endpoints:
frame_extent
: {frame_start
|frame_between
}frame_between
: BETWEENframe_start
ANDframe_end
frame_start
,frame_end
: { CURRENT ROW | UNBOUNDED PRECEDING | UNBOUNDED FOLLOWING |expr
PRECEDING |expr
FOLLOWING }
With BETWEEN
syntax,
frame_start
must not occur later than
frame_end
.
The permitted frame_start
and
frame_end
values have these meanings:
CURRENT ROW
: For ROWS
,
the bound is the current row. For RANGE
,
the bound is the peers of the current row.
UNBOUNDED PRECEDING
: The bound is the
first partition row.
UNBOUNDED FOLLOWING
: The bound is the
last partition row.
: For expr
PRECEDINGROWS
, the bound
is expr
rows before the current
row. For RANGE
, the bound is the rows
with values equal to the current row value minus
expr
; if the current row value is
NULL
, the bound is the peers of the row.
For
(and
expr
PRECEDING
), expr
FOLLOWINGexpr
can be
a ?
parameter marker (for use in a
prepared statement), a nonnegative numeric literal, or a
temporal interval of the form INTERVAL
. For
val
unit
INTERVAL
expressions,
val
specifies nonnegative
interval value, and unit
is a
keyword indicating the units in which the value should be
interpreted. (For details about the permitted
units
specifiers, see the
description of the DATE_ADD()
function in Section 12.7, “Date and Time Functions”.)
RANGE
on a numeric or temporal
expr
requires ORDER
BY
on a numeric or temporal expression,
respectively.
Examples of valid
and
expr
PRECEDING
indicators:
expr
FOLLOWING
10 PRECEDING INTERVAL 5 DAY PRECEDING 5 FOLLOWING INTERVAL '2:30' MINUTE_SECOND FOLLOWING
: For expr
FOLLOWINGROWS
, the bound
is expr
rows after the current
row. For RANGE
, the bound is the rows
with values equal to the current row value plus
expr
; if the current row value is
NULL
, the bound is the peers of the row.
For permitted values of expr
, see
the description of
.
expr
PRECEDING
The following query demonstrates
FIRST_VALUE()
,
LAST_VALUE()
, and two instances
of NTH_VALUE()
:
mysql>SELECT
time, subject, val,
FIRST_VALUE(val) OVER w AS 'first',
LAST_VALUE(val) OVER w AS 'last',
NTH_VALUE(val, 2) OVER w AS 'second',
NTH_VALUE(val, 4) OVER w AS 'fourth'
FROM observations
WINDOW w AS (PARTITION BY subject ORDER BY time
ROWS UNBOUNDED PRECEDING);
+----------+---------+------+-------+------+--------+--------+ | time | subject | val | first | last | second | fourth | +----------+---------+------+-------+------+--------+--------+ | 07:00:00 | st113 | 10 | 10 | 10 | NULL | NULL | | 07:15:00 | st113 | 9 | 10 | 9 | 9 | NULL | | 07:30:00 | st113 | 25 | 10 | 25 | 9 | NULL | | 07:45:00 | st113 | 20 | 10 | 20 | 9 | 20 | | 07:00:00 | xh458 | 0 | 0 | 0 | NULL | NULL | | 07:15:00 | xh458 | 10 | 0 | 10 | 10 | NULL | | 07:30:00 | xh458 | 5 | 0 | 5 | 10 | NULL | | 07:45:00 | xh458 | 30 | 0 | 30 | 10 | 30 | | 08:00:00 | xh458 | 25 | 0 | 25 | 10 | 30 | +----------+---------+------+-------+------+--------+--------+
Each function uses the rows in the current frame, which, per the
window definition shown, extends from the first partition row to
the current row. For the
NTH_VALUE()
calls, the current
frame does not always include the requested row; in such cases,
the return value is NULL
.
In the absence of a frame clause, the default frame depends on
whether an ORDER BY
clause is present:
With ORDER BY
: The default frame includes
rows from the partition start through the current row,
including all peers of the current row (rows equal to the
current row according to the ORDER BY
clause). The default is equivalent to this frame
specification:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Without ORDER BY
: The default frame
includes all partition rows (because, without ORDER
BY
, all partition rows are peers). The default is
equivalent to this frame specification:
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
Because the default frame differs depending on presence or
absence of ORDER BY
, adding ORDER
BY
to a query to get deterministic results may change
the results. (For example, the values produced by
SUM()
might change.) To obtain
the same results but ordered per ORDER BY
,
provide an explicit frame specification to be used regardless of
whether ORDER BY
is present.
The meaning of a frame specification can be nonobvious when the
current row value is NULL
. Assuming that to
be the case, these examples illustrate how various frame
specifications apply:
ORDER BY X ASC RANGE BETWEEN 10 FOLLOWING AND 15
FOLLOWING
The frame starts at NULL
and stops at
NULL
, thus includes only rows with value
NULL
.
ORDER BY X ASC RANGE BETWEEN 10 FOLLOWING AND
UNBOUNDED FOLLOWING
The frame starts at NULL
and stops at the
end of the partition. Because an ASC
sort
puts NULL
values first, the frame is the
entire partition.
ORDER BY X DESC RANGE BETWEEN 10 FOLLOWING AND
UNBOUNDED FOLLOWING
The frame starts at NULL
and stops at the
end of the partition. Because a DESC
sort
puts NULL
values last, the frame is only
the NULL
values.
ORDER BY X ASC RANGE BETWEEN 10 PRECEDING AND
UNBOUNDED FOLLOWING
The frame starts at NULL
and stops at the
end of the partition. Because an ASC
sort
puts NULL
values first, the frame is the
entire partition.
ORDER BY X ASC RANGE BETWEEN 10 PRECEDING AND 10
FOLLOWING
The frame starts at NULL
and stops at
NULL
, thus includes only rows with value
NULL
.
ORDER BY X ASC RANGE BETWEEN 10 PRECEDING AND 1
PRECEDING
The frame starts at NULL
and stops at
NULL
, thus includes only rows with value
NULL
.
ORDER BY X ASC RANGE BETWEEN UNBOUNDED PRECEDING
AND 10 FOLLOWING
The frame starts at the beginning of the partition and stops
at rows with value NULL
. Because an
ASC
sort puts NULL
values first, the frame is only the NULL
values.
Windows can be defined and given names by which to refer to them
in OVER
clauses. To do this, use a
WINDOW
clause. If present in a query, the
WINDOW
clause falls between the positions of
the HAVING
and ORDER BY
clauses, and has this syntax:
WINDOWwindow_name
AS (window_spec
) [,window_name
AS (window_spec
)] ...
For each window definition,
window_name
is the window name, and
window_spec
is the same type of
window specification as given between the parentheses of an
OVER
clause, as described in
Section 12.21.2, “Window Function Concepts and Syntax”:
window_spec
: [window_name
] [partition_clause
] [order_clause
] [frame_clause
]
A WINDOW
clause is useful for queries in
which multiple OVER
clauses would otherwise
define the same window. Instead, you can define the window once,
give it a name, and refer to the name in the
OVER
clauses. Consider this query, which
defines the same window multiple times:
SELECT val, ROW_NUMBER() OVER (ORDER BY val) AS 'row_number', RANK() OVER (ORDER BY val) AS 'rank', DENSE_RANK() OVER (ORDER BY val) AS 'dense_rank' FROM numbers;
The query can be written more simply by using
WINDOW
to define the window once and
referring to the window by name in the OVER
clauses:
SELECT val, ROW_NUMBER() OVER w AS 'row_number', RANK() OVER w AS 'rank', DENSE_RANK() OVER w AS 'dense_rank' FROM numbers WINDOW w AS (ORDER BY val);
A named window also makes it easier to experiment with the
window definition to see the effect on query results. You need
only modify the window definition in the
WINDOW
clause, rather than multiple
OVER
clause definitions.
If an OVER
clause uses OVER
(
rather
than window_name
...)OVER
, the named
window can be modified by the addition of other clauses. For
example, this query defines a window that includes partitioning,
and uses window_name
ORDER BY
in the
OVER
clauses to modify the window in
different ways:
SELECT DISTINCT year, country, FIRST_VALUE(year) OVER (w ORDER BY year ASC) AS first, FIRST_VALUE(year) OVER (w ORDER BY year DESC) AS last FROM sales WINDOW w AS (PARTITION BY country);
An OVER
clause can only add properties to a
named window, not modify them. If the named window definition
includes a partitioning, ordering, or framing property, the
OVER
clause that refers to the window name
cannot also include the same kind of property or an error
occurs:
This construct is permitted because the window definition
and the referring OVER
clause do not
contain the same kind of properties:
OVER (w ORDER BY country) ... WINDOW w AS (PARTITION BY country)
This construct is not permitted because the
OVER
clause specifies PARTITION
BY
for a named window that already has
PARTITION BY
:
OVER (w PARTITION BY year) ... WINDOW w AS (PARTITION BY country)
The definition of a named window can itself begin with a
window_name
. In such cases, forward
and backward references are permitted, but not cycles:
This is permitted; it contains forward and backward references but no cycles:
WINDOW w1 AS (w2), w2 AS (), w3 AS (w1)
This is not permitted because it contains a cycle:
WINDOW w1 AS (w2), w2 AS (w3), w3 AS (w1)
The SQL standard imposes a constraint on window functions that
they cannot be used in UPDATE
or
DELETE
statements to update rows.
Using such functions in a subquery of these statements (to
select rows) is permitted.
MySQL does not support these window function features:
DISTINCT
syntax for aggregate window
functions.
Nested window functions.
Dynamic frame endpoints that depend on the value of the current row.
The parser recognizes these window constructs which nevertheless are not supported:
The GROUPS
frame units specifier is
parsed, but produces an error. Only ROWS
and RANGE
are supported.
The EXCLUDE
clause for frame
specification is parsed, but produces an error.
IGNORE NULLS
is parsed, but produces an
error. Only RESPECT NULLS
is supported.
FROM LAST
is parsed, but produces an
error. Only FROM FIRST
is supported.
As of MySQL 8.0.16, MySQL includes built-in SQL functions that
format or retrieve Performance Schema data, and that may be used
as equivalents for the corresponding sys
schema
stored functions. The built-in functions can be invoked in any
schema and require no qualifier, unlike the sys
functions, which require either a sys.
schema
qualifier or that sys
be the current schema.
Table 12.28 Performance Schema Functions
Name | Description |
---|---|
FORMAT_BYTES() |
Convert byte count to value with units |
FORMAT_PICO_TIME() |
Convert time in picoseconds to value with units |
PS_CURRENT_THREAD_ID() |
Performance Schema thread ID for current thread |
PS_THREAD_ID() |
Performance Schema thread ID for given thread |
The built-in functions supersede the corresponding
sys
functions, which are deprecated and will be
removed in a future MySQL version. Applications that use the
sys
functions should be adjusted to use the
built-in functions instead, keeping in mind some minor differences
between the sys
functions and the built-in
functions. For details about these differences, see the function
descriptions in this section.
Given a numeric byte count, converts it to human-readable format and returns a string consisting of a value and a units indicator. The string contains the number of bytes rounded to 2 decimal places and a minimum of 3 significant digits. Numbers less than 1024 bytes are represented as whole numbers and are not rounded.
The units indicator depends on the size of the byte-count argument as shown in the following table.
Argument Value | Result Units | Result Units Indicator |
---|---|---|
Up to 1023 | bytes | bytes |
Up to 10242 − 1 | kibibytes | KiB |
Up to 10243 − 1 | mebibytes | MiB |
Up to 10244 − 1 | gibibytes | GiB |
Up to 10245 − 1 | tebibytes | TiB |
Up to 10246 − 1 | pebibytes | PiB |
10246 and up | exbibytes | EiB |
mysql> SELECT FORMAT_BYTES(512), FORMAT_BYTES(18446644073709551615);
+-------------------+------------------------------------+
| FORMAT_BYTES(512) | FORMAT_BYTES(18446644073709551615) |
+-------------------+------------------------------------+
| 512 bytes | 16.00 EiB |
+-------------------+------------------------------------+
FORMAT_BYTES()
was added in
MySQL 8.0.16. It may be used instead of the
sys
schema
format_bytes()
function, keeping
in mind this difference:
FORMAT_BYTES()
uses the
EiB
units indicator.
sys.format_bytes()
does not.
Given a numeric Performance Schema latency or wait time in picoseconds, converts it to human-readable format and returns a string consisting of a value and a units indicator. The string contains the decimal time rounded to 2 decimal places and a minimum of 3 significant digits. Times under 1 nanosecond are represented as whole numbers and are not rounded.
The units indicator depends on the size of the time-value argument as shown in the following table.
Argument Value | Result Units | Result Units Indicator |
---|---|---|
Up to 103 − 1 | picoseconds | ps |
Up to 106 − 1 | nanoseconds | ns |
Up to 109 − 1 | microseconds | us |
Up to 1012 − 1 | milliseconds | ms |
Up to 60×1012 − 1 | seconds | s |
Up to 3.6×1015 − 1 | minutes | min |
Up to 8.64×1016 − 1 | hours | h |
8.64×1016 and up | days | d |
mysql> SELECT FORMAT_PICO_TIME(3501), FORMAT_PICO_TIME(188732396662000);
+------------------------+-----------------------------------+
| FORMAT_PICO_TIME(3501) | FORMAT_PICO_TIME(188732396662000) |
+------------------------+-----------------------------------+
| 3.50 ns | 3.15 min |
+------------------------+-----------------------------------+
FORMAT_PICO_TIME()
was added in
MySQL 8.0.16. It may be used instead of the
sys
schema
format_time()
function, keeping
in mind these differences:
To indicate minutes,
sys.format_time()
uses the
m
units indicator, whereas
FORMAT_PICO_TIME()
uses
min
.
sys.format_time()
uses the
w
(weeks) units indicator.
FORMAT_PICO_TIME()
does
not.
Returns a BIGINT UNSIGNED
value
representing the Performance Schema thread ID assigned to the
current connection.
The thread ID return value is a value of the type given in the
THREAD_ID
column of Performance Schema
tables.
Performance Schema configuration affects
PS_CURRENT_THREAD_ID()
the same
way as for PS_THREAD_ID()
. For
details, see the description of that function.
mysql>SELECT PS_CURRENT_THREAD_ID();
+------------------------+ | PS_CURRENT_THREAD_ID() | +------------------------+ | 52 | +------------------------+ mysql>SELECT PS_THREAD_ID(CONNECTION_ID());
+-------------------------------+ | PS_THREAD_ID(CONNECTION_ID()) | +-------------------------------+ | 52 | +-------------------------------+
PS_CURRENT_THREAD_ID()
was
added in MySQL 8.0.16. It may be used as a shortcut for
invoking the sys
schema
ps_thread_id()
function with an
argument of NULL
or
CONNECTION_ID()
.
Given a connection ID, returns a BIGINT
UNSIGNED
value representing the Performance Schema
thread ID assigned to the connection ID, or
NULL
if no thread ID exists for the
connection ID. The latter can occur for threads that are not
instrumented.
The connection ID argument is a value of the type given in the
PROCESSLIST_ID
column of the Performance
Schema threads
table or the
Id
column of SHOW
PROCESSLIST
output.
The thread ID return value is a value of the type given in the
THREAD_ID
column of Performance Schema
tables.
Performance Schema configuration affects
PS_THREAD_ID()
operation as
follows. (These remarks also apply to
PS_CURRENT_THREAD_ID()
.)
Disabling the thread_instrumentation
consumer disables statistics from being collected and
aggregated at the thread level, but has no effect on
PS_THREAD_ID()
.
If
performance_schema_max_thread_instances
is not 0, the Performance Schema allocates memory for
thread statistics and assigns an internal ID to each
thread for which instance memory is available. If there
are threads for which instance memory is not available,
PS_THREAD_ID()
returns
NULL
; in this case,
Performance_schema_thread_instances_lost
will be nonzero.
If
performance_schema_max_thread_instances
is 0, the Performance Schema allocates no thread memory
and PS_THREAD_ID()
returns
NULL
.
If the Performance Schema itself is disabled,
PS_THREAD_ID()
produces an
error.
mysql> SELECT PS_THREAD_ID(6);
+-----------------+
| PS_THREAD_ID(6) |
+-----------------+
| 45 |
+-----------------+
PS_THREAD_ID()
was added in
MySQL 8.0.16. It may be used instead of the
sys
schema
ps_thread_id()
function, keeping
in mind this difference:
With an argument of NULL
,
sys.ps_thread_id()
returns
the thread ID for the current connection, whereas
PS_THREAD_ID()
returns
NULL
. To obtain the current connection
thread ID, use
PS_CURRENT_THREAD_ID()
instead.
Table 12.29 Internal Functions
Name | Description |
---|---|
CAN_ACCESS_COLUMN() |
Internal use only |
CAN_ACCESS_DATABASE() |
Internal use only |
CAN_ACCESS_TABLE() |
Internal use only |
CAN_ACCESS_VIEW() |
Internal use only |
GET_DD_COLUMN_PRIVILEGES() |
Internal use only |
GET_DD_CREATE_OPTIONS() |
Internal use only |
GET_DD_INDEX_SUB_PART_LENGTH() |
Internal use only |
INTERNAL_AUTO_INCREMENT() |
Internal use only |
INTERNAL_AVG_ROW_LENGTH() |
Internal use only |
INTERNAL_CHECK_TIME() |
Internal use only |
INTERNAL_CHECKSUM() |
Internal use only |
INTERNAL_DATA_FREE() |
Internal use only |
INTERNAL_DATA_LENGTH() |
Internal use only |
INTERNAL_DD_CHAR_LENGTH() |
Internal use only |
INTERNAL_GET_COMMENT_OR_ERROR() |
Internal use only |
INTERNAL_GET_VIEW_WARNING_OR_ERROR() |
Internal use only |
INTERNAL_INDEX_COLUMN_CARDINALITY() |
Internal use only |
INTERNAL_INDEX_LENGTH() |
Internal use only |
INTERNAL_KEYS_DISABLED() |
Internal use only |
INTERNAL_MAX_DATA_LENGTH() |
Internal use only |
INTERNAL_TABLE_ROWS() |
Internal use only |
INTERNAL_UPDATE_TIME() |
Internal use only |
The functions listed in this section are intended only for internal use by the server. Attempts by users to invoke them result in an error.
Table 12.30 Miscellaneous Functions
Name | Description |
---|---|
ANY_VALUE() |
Suppress ONLY_FULL_GROUP_BY value rejection |
BIN_TO_UUID() |
Convert binary UUID to string |
DEFAULT() |
Return the default value for a table column |
GROUPING() |
Distinguish super-aggregate ROLLUP rows from regular rows |
INET_ATON() |
Return the numeric value of an IP address |
INET_NTOA() |
Return the IP address from a numeric value |
INET6_ATON() |
Return the numeric value of an IPv6 address |
INET6_NTOA() |
Return the IPv6 address from a numeric value |
IS_IPV4() |
Whether argument is an IPv4 address |
IS_IPV4_COMPAT() |
Whether argument is an IPv4-compatible address |
IS_IPV4_MAPPED() |
Whether argument is an IPv4-mapped address |
IS_IPV6() |
Whether argument is an IPv6 address |
IS_UUID() |
Whether argument is a valid UUID |
MASTER_POS_WAIT() |
Block until the slave has read and applied all updates up to the specified position |
NAME_CONST() |
Cause the column to have the given name |
RAND() |
Return a random floating-point value |
SLEEP() |
Sleep for a number of seconds |
UUID() |
Return a Universal Unique Identifier (UUID) |
UUID_SHORT() |
Return an integer-valued universal identifier |
UUID_TO_BIN() |
Convert string UUID to binary |
VALUES() |
Define the values to be used during an INSERT |
This function is useful for GROUP BY
queries when the
ONLY_FULL_GROUP_BY
SQL mode
is enabled, for cases when MySQL rejects a query that you know
is valid for reasons that MySQL cannot determine. The function
return value and type are the same as the return value and
type of its argument, but the function result is not checked
for the ONLY_FULL_GROUP_BY
SQL mode.
For example, if name
is a nonindexed
column, the following query fails with
ONLY_FULL_GROUP_BY
enabled:
mysql> SELECT name, address, MAX(age) FROM t GROUP BY name;
ERROR 1055 (42000): Expression #2 of SELECT list is not in GROUP
BY clause and contains nonaggregated column 'mydb.t.address' which
is not functionally dependent on columns in GROUP BY clause; this
is incompatible with sql_mode=only_full_group_by
The failure occurs because address
is a
nonaggregated column that is neither named among
GROUP BY
columns nor functionally dependent
on them. As a result, the address
value for
rows within each name
group is
nondeterministic. There are multiple ways to cause MySQL to
accept the query:
Alter the table to make name
a primary
key or a unique NOT NULL
column. This
enables MySQL to determine that address
is functionally dependent on name
; that
is, address
is uniquely determined by
name
. (This technique is inapplicable
if NULL
must be permitted as a valid
name
value.)
Use ANY_VALUE()
to refer to
address
:
SELECT name, ANY_VALUE(address), MAX(age) FROM t GROUP BY name;
In this case, MySQL ignores the nondeterminism of
address
values within each
name
group and accepts the query. This
may be useful if you simply do not care which value of a
nonaggregated column is chosen for each group.
ANY_VALUE()
is not an
aggregate function, unlike functions such as
SUM()
or
COUNT()
. It simply acts to
suppress the test for nondeterminism.
Disable
ONLY_FULL_GROUP_BY
. This
is equivalent to using
ANY_VALUE()
with
ONLY_FULL_GROUP_BY
enabled, as described in the previous item.
ANY_VALUE()
is also useful if
functional dependence exists between columns but MySQL cannot
determine it. The following query is valid because
age
is functionally dependent on the
grouping column age-1
, but MySQL cannot
tell that and rejects the query with
ONLY_FULL_GROUP_BY
enabled:
SELECT age FROM t GROUP BY age-1;
To cause MySQL to accept the query, use
ANY_VALUE()
:
SELECT ANY_VALUE(age) FROM t GROUP BY age-1;
ANY_VALUE()
can be used for
queries that refer to aggregate functions in the absence of a
GROUP BY
clause:
mysql> SELECT name, MAX(age) FROM t;
ERROR 1140 (42000): In aggregated query without GROUP BY, expression
#1 of SELECT list contains nonaggregated column 'mydb.t.name'; this
is incompatible with sql_mode=only_full_group_by
Without GROUP BY
, there is a single group
and it is nondeterministic which name
value
to choose for the group.
ANY_VALUE()
tells MySQL to
accept the query:
SELECT ANY_VALUE(name), MAX(age) FROM t;
It may be that, due to some property of a given data set, you
know that a selected nonaggregated column is effectively
functionally dependent on a GROUP BY
column. For example, an application may enforce uniqueness of
one column with respect to another. In this case, using
ANY_VALUE()
for the effectively
functionally dependent column may make sense.
For additional discussion, see Section 12.20.3, “MySQL Handling of GROUP BY”.
BIN_TO_UUID(
,
binary_uuid
)BIN_TO_UUID(
binary_uuid
,
swap_flag
)
BIN_TO_UUID()
is the inverse of
UUID_TO_BIN()
. It converts a
binary UUID to a string UUID and returns the result. The
binary value should be a UUID as a
VARBINARY(16)
value. The return
value is a utf8
string of five hexadecimal
numbers separated by dashes. (For details about this format,
see the UUID()
function
description.) If the UUID argument is NULL
,
the return value is NULL
. If any argument
is invalid, an error occurs.
BIN_TO_UUID()
takes one or two
arguments:
The one-argument form takes a binary UUID value. The UUID value is assumed not to have its time-low and time-high parts swapped. The string result is in the same order as the binary argument.
The two-argument form takes a binary UUID value and a swap-flag value:
If swap_flag
is 0, the
two-argument form is equivalent to the one-argument
form. The string result is in the same order as the
binary argument.
If swap_flag
is 1, the UUID
value is assumed to have its time-low and time-high
parts swapped. These parts are swapped back to their
original position in the result value.
For usage examples and information about time-part swapping,
see the UUID_TO_BIN()
function
description.
Returns the default value for a table column. An error results if the column has no default value.
The use of
DEFAULT(
to specify the default value for a named column is permitted
only for columns that have a literal default value, not for
columns that have an expression default value.
col_name
)
mysql> UPDATE t SET i = DEFAULT(i)+1 WHERE id < 100;
Formats the number X
to a format
like '#,###,###.##'
, rounded to
D
decimal places, and returns the
result as a string. For details, see
Section 12.5, “String Functions”.
For GROUP BY
queries that include a
WITH ROLLUP
modifier, the
ROLLUP
operation produces super-aggregate
output rows where NULL
represents the set
of all values. The GROUPING()
function enables you to distinguish NULL
values for super-aggregate rows from NULL
values in regular grouped rows.
GROUPING()
is permitted only in
the select list or HAVING
clause.
Each argument to GROUPING()
must be an expression that exactly matches an expression in
the GROUP BY
clause. The expression cannot
be a positional specifier. For each expression,
GROUPING()
produces 1 if the
expression value in the current row is a
NULL
representing a super-aggregate value.
Otherwise, GROUPING()
produces
0, indicating that the expression value is a
NULL
for a regular result row or is not
NULL
.
Suppose that table t1
contains these rows,
where NULL
indicates something like
“other” or “unknown”:
mysql> SELECT * FROM t1;
+------+-------+----------+
| name | size | quantity |
+------+-------+----------+
| ball | small | 10 |
| ball | large | 20 |
| ball | NULL | 5 |
| hoop | small | 15 |
| hoop | large | 5 |
| hoop | NULL | 3 |
+------+-------+----------+
A summary of the table without WITH ROLLUP
looks like this:
mysql>SELECT name, size, SUM(quantity) AS quantity
FROM t1
GROUP BY name, size;
+------+-------+----------+ | name | size | quantity | +------+-------+----------+ | ball | small | 10 | | ball | large | 20 | | ball | NULL | 5 | | hoop | small | 15 | | hoop | large | 5 | | hoop | NULL | 3 | +------+-------+----------+
The result contains NULL
values, but those
do not represent super-aggregate rows because the query does
not include WITH ROLLUP
.
Adding WITH ROLLUP
produces super-aggregate
summary rows containing additional NULL
values. However, without comparing this result to the previous
one, it is not easy to see which NULL
values occur in super-aggregate rows and which occur in
regular grouped rows:
mysql>SELECT name, size, SUM(quantity) AS quantity
FROM t1
GROUP BY name, size WITH ROLLUP;
+------+-------+----------+ | name | size | quantity | +------+-------+----------+ | ball | NULL | 5 | | ball | large | 20 | | ball | small | 10 | | ball | NULL | 35 | | hoop | NULL | 3 | | hoop | large | 5 | | hoop | small | 15 | | hoop | NULL | 23 | | NULL | NULL | 58 | +------+-------+----------+
To distinguish NULL
values in in
super-aggregate rows from those in regular grouped rows, use
GROUPING()
, which returns 1
only for super-aggregate NULL
values:
mysql>SELECT
name, size, SUM(quantity) AS quantity,
GROUPING(name) AS grp_name,
GROUPING(size) AS grp_size
FROM t1
GROUP BY name, size WITH ROLLUP;
+------+-------+----------+----------+----------+ | name | size | quantity | grp_name | grp_size | +------+-------+----------+----------+----------+ | ball | NULL | 5 | 0 | 0 | | ball | large | 20 | 0 | 0 | | ball | small | 10 | 0 | 0 | | ball | NULL | 35 | 0 | 1 | | hoop | NULL | 3 | 0 | 0 | | hoop | large | 5 | 0 | 0 | | hoop | small | 15 | 0 | 0 | | hoop | NULL | 23 | 0 | 1 | | NULL | NULL | 58 | 1 | 1 | +------+-------+----------+----------+----------+
Common uses for GROUPING()
:
Substitute a label for super-aggregate
NULL
values:
mysql>SELECT
IF(GROUPING(name) = 1, 'All items', name) AS name,
IF(GROUPING(size) = 1, 'All sizes', size) AS size,
SUM(quantity) AS quantity
FROM t1
GROUP BY name, size WITH ROLLUP;
+-----------+-----------+----------+ | name | size | quantity | +-----------+-----------+----------+ | ball | NULL | 5 | | ball | large | 20 | | ball | small | 10 | | ball | All sizes | 35 | | hoop | NULL | 3 | | hoop | large | 5 | | hoop | small | 15 | | hoop | All sizes | 23 | | All items | All sizes | 58 | +-----------+-----------+----------+
Return only super-aggregate lines by filtering out the regular grouped lines:
mysql>SELECT name, size, SUM(quantity) AS quantity
FROM t1
GROUP BY name, size WITH ROLLUP
HAVING GROUPING(name) = 1 OR GROUPING(size) = 1;
+------+------+----------+ | name | size | quantity | +------+------+----------+ | ball | NULL | 35 | | hoop | NULL | 23 | | NULL | NULL | 58 | +------+------+----------+
GROUPING()
permits multiple
expression arguments. In this case, the
GROUPING()
return value
represents a bitmask combined from the results for each
expression, where the lowest-order bit corresponds to the
result for the rightmost expression. For example, with three
expression arguments,
GROUPING(
is evaluated like
this:
expr1
,
expr2
,
expr3
)
result for GROUPING(expr3
) + result for GROUPING(expr2
) << 1 + result for GROUPING(expr1
) << 2
The following query shows how
GROUPING()
results for single
arguments combine for a multiple-argument call to produce a
bitmask value:
mysql>SELECT
name, size, SUM(quantity) AS quantity,
GROUPING(name) AS grp_name,
GROUPING(size) AS grp_size,
GROUPING(name, size) AS grp_all
FROM t1
GROUP BY name, size WITH ROLLUP;
+------+-------+----------+----------+----------+---------+ | name | size | quantity | grp_name | grp_size | grp_all | +------+-------+----------+----------+----------+---------+ | ball | NULL | 5 | 0 | 0 | 0 | | ball | large | 20 | 0 | 0 | 0 | | ball | small | 10 | 0 | 0 | 0 | | ball | NULL | 35 | 0 | 1 | 1 | | hoop | NULL | 3 | 0 | 0 | 0 | | hoop | large | 5 | 0 | 0 | 0 | | hoop | small | 15 | 0 | 0 | 0 | | hoop | NULL | 23 | 0 | 1 | 1 | | NULL | NULL | 58 | 1 | 1 | 3 | +------+-------+----------+----------+----------+---------+
With multiple expression arguments, the
GROUPING()
return value is
nonzero if any expression represents a super-aggregate value.
Multiple-argument GROUPING()
syntax thus provides a simpler way to write the earlier query
that returned only super-aggregate rows, by using a single
multiple-argument GROUPING()
call rather than multiple single-argument calls:
mysql>SELECT name, size, SUM(quantity) AS quantity
FROM t1
GROUP BY name, size WITH ROLLUP
HAVING GROUPING(name, size) <> 0;
+------+------+----------+ | name | size | quantity | +------+------+----------+ | ball | NULL | 35 | | hoop | NULL | 23 | | NULL | NULL | 58 | +------+------+----------+
Use of GROUPING()
is subject to
these limitations:
Do not use subquery GROUP BY
expressions as GROUPING()
arguments because matching might fail. For example,
matching fails for this query:
mysql>SELECT GROUPING((SELECT MAX(name) FROM t1))
FROM t1
GROUP BY (SELECT MAX(name) FROM t1) WITH ROLLUP;
ERROR 3580 (HY000): Argument #1 of GROUPING function is not in GROUP BY
GROUP BY
literal expressions should not
be used within a HAVING
clause as
GROUPING()
arguments. Due
to differences between when the optimizer evaluates
GROUP BY
and HAVING
,
matching may succeed but
GROUPING()
evaluation does
not produce the expected result. Consider this query:
SELECT a AS f1, 'w' AS f2 FROM t GROUP BY f1, f2 WITH ROLLUP HAVING GROUPING(f2) = 1;
GROUPING()
is evaluated
earlier for the literal constant expression than for the
HAVING
clause as a whole and returns 0.
To check whether a query such as this is affected, use
EXPLAIN
and look for
Impossible having
in the
Extra
column.
For more information about WITH ROLLUP
and
GROUPING()
, see
Section 12.20.2, “GROUP BY Modifiers”.
Given the dotted-quad representation of an IPv4 network
address as a string, returns an integer that represents the
numeric value of the address in network byte order (big
endian). INET_ATON()
returns
NULL
if it does not understand its
argument.
mysql> SELECT INET_ATON('10.0.5.9');
-> 167773449
For this example, the return value is calculated as 10×2563 + 0×2562 + 5×256 + 9.
INET_ATON()
may or may not
return a non-NULL
result for short-form IP
addresses (such as '127.1'
as a
representation of '127.0.0.1'
). Because of
this, INET_ATON()
a should not
be used for such addresses.
To store values generated by
INET_ATON()
, use an
INT UNSIGNED
column rather than
INT
, which is signed. If you
use a signed column, values corresponding to IP addresses
for which the first octet is greater than 127 cannot be
stored correctly. See
Section 11.2.6, “Out-of-Range and Overflow Handling”.
Given a numeric IPv4 network address in network byte order,
returns the dotted-quad string representation of the address
as a string in the connection character set.
INET_NTOA()
returns
NULL
if it does not understand its
argument.
mysql> SELECT INET_NTOA(167773449);
-> '10.0.5.9'
Given an IPv6 or IPv4 network address as a string, returns a
binary string that represents the numeric value of the address
in network byte order (big endian). Because numeric-format
IPv6 addresses require more bytes than the largest integer
type, the representation returned by this function has the
VARBINARY
data type:
VARBINARY(16)
for IPv6
addresses and VARBINARY(4)
for
IPv4 addresses. If the argument is not a valid address,
INET6_ATON()
returns
NULL
.
The following examples use
HEX()
to display the
INET6_ATON()
result in
printable form:
mysql>SELECT HEX(INET6_ATON('fdfe::5a55:caff:fefa:9089'));
-> 'FDFE0000000000005A55CAFFFEFA9089' mysql>SELECT HEX(INET6_ATON('10.0.5.9'));
-> '0A000509'
INET6_ATON()
observes several constraints
on valid arguments. These are given in the following list
along with examples.
A trailing zone ID is not permitted, as in
fe80::3%1
or
fe80::3%eth0
.
A trailing network mask is not permitted, as in
2001:45f:3:ba::/64
or
198.51.100.0/24
.
For values representing IPv4 addresses, only classless
addresses are supported. Classful addresses such as
198.51.1
are rejected. A trailing port
number is not permitted, as in
198.51.100.2:8080
. Hexadecimal numbers
in address components are not permitted, as in
198.0xa0.1.2
. Octal numbers are not
supported: 198.51.010.1
is treated as
198.51.10.1
, not
198.51.8.1
. These IPv4 constraints also
apply to IPv6 addresses that have IPv4 address parts, such
as IPv4-compatible or IPv4-mapped addresses.
To convert an IPv4 address expr
represented in numeric form as an
INT
value to an IPv6 address
represented in numeric form as a
VARBINARY
value, use this
expression:
INET6_ATON(INET_NTOA(expr
))
For example:
mysql> SELECT HEX(INET6_ATON(INET_NTOA(167773449)));
-> '0A000509'
Given an IPv6 or IPv4 network address represented in numeric
form as a binary string, returns the string representation of
the address as a string in the connection character set. If
the argument is not a valid address,
INET6_NTOA()
returns
NULL
.
INET6_NTOA()
has these
properties:
It does not use operating system functions to perform conversions, thus the output string is platform independent.
The return string has a maximum length of 39 (4 x 8 + 7). Given this statement:
CREATE TABLE t AS SELECT INET6_NTOA(expr
) AS c1;
The resulting table would have this definition:
CREATE TABLE t (c1 VARCHAR(39) CHARACTER SET utf8 DEFAULT NULL);
The return string uses lowercase letters for IPv6 addresses.
mysql>SELECT INET6_NTOA(INET6_ATON('fdfe::5a55:caff:fefa:9089'));
-> 'fdfe::5a55:caff:fefa:9089' mysql>SELECT INET6_NTOA(INET6_ATON('10.0.5.9'));
-> '10.0.5.9' mysql>SELECT INET6_NTOA(UNHEX('FDFE0000000000005A55CAFFFEFA9089'));
-> 'fdfe::5a55:caff:fefa:9089' mysql>SELECT INET6_NTOA(UNHEX('0A000509'));
-> '10.0.5.9'
Returns 1 if the argument is a valid IPv4 address specified as a string, 0 otherwise.
mysql> SELECT IS_IPV4('10.0.5.9'), IS_IPV4('10.0.5.256');
-> 1, 0
For a given argument, if
IS_IPV4()
returns 1,
INET_ATON()
(and
INET6_ATON()
) will return
non-NULL
. The converse statement is not
true: In some cases,
INET_ATON()
returns
non-NULL
when
IS_IPV4()
returns 0.
As implied by the preceding remarks,
IS_IPV4()
is more strict than
INET_ATON()
about what
constitutes a valid IPv4 address, so it may be useful for
applications that need to perform strong checks against
invalid values. Alternatively, use
INET6_ATON()
to convert IPv4
addresses to internal form and check for a
NULL
result (which indicates an invalid
address). INET6_ATON()
is
equally strong as IS_IPV4()
about checking IPv4 addresses.
This function takes an IPv6 address represented in numeric
form as a binary string, as returned by
INET6_ATON()
. It returns 1 if
the argument is a valid IPv4-compatible IPv6 address, 0
otherwise. IPv4-compatible addresses have the form
::
.
ipv4_address
mysql>SELECT IS_IPV4_COMPAT(INET6_ATON('::10.0.5.9'));
-> 1 mysql>SELECT IS_IPV4_COMPAT(INET6_ATON('::ffff:10.0.5.9'));
-> 0
The IPv4 part of an IPv4-compatible address can also be
represented using hexadecimal notation. For example,
198.51.100.1
has this raw hexadecimal
value:
mysql> SELECT HEX(INET6_ATON('198.51.100.1'));
-> 'C6336401'
Expressed in IPv4-compatible form,
::198.51.100.1
is equivalent to
::c0a8:0001
or (without leading zeros)
::c0a8:1
mysql>SELECT
->IS_IPV4_COMPAT(INET6_ATON('::198.51.100.1')),
->IS_IPV4_COMPAT(INET6_ATON('::c0a8:0001')),
->IS_IPV4_COMPAT(INET6_ATON('::c0a8:1'));
-> 1, 1, 1
This function takes an IPv6 address represented in numeric
form as a binary string, as returned by
INET6_ATON()
. It returns 1 if
the argument is a valid IPv4-mapped IPv6 address, 0 otherwise.
IPv4-mapped addresses have the form
::ffff:
.
ipv4_address
mysql>SELECT IS_IPV4_MAPPED(INET6_ATON('::10.0.5.9'));
-> 0 mysql>SELECT IS_IPV4_MAPPED(INET6_ATON('::ffff:10.0.5.9'));
-> 1
As with IS_IPV4_COMPAT()
the IPv4 part of
an IPv4-mapped address can also be represented using
hexadecimal notation:
mysql>SELECT
->IS_IPV4_MAPPED(INET6_ATON('::ffff:198.51.100.1')),
->IS_IPV4_MAPPED(INET6_ATON('::ffff:c0a8:0001')),
->IS_IPV4_MAPPED(INET6_ATON('::ffff:c0a8:1'));
-> 1, 1, 1
Returns 1 if the argument is a valid IPv6 address specified as a string, 0 otherwise. This function does not consider IPv4 addresses to be valid IPv6 addresses.
mysql> SELECT IS_IPV6('10.0.5.9'), IS_IPV6('::1');
-> 0, 1
For a given argument, if
IS_IPV6()
returns 1,
INET6_ATON()
will return
non-NULL
.
Returns 1 if the argument is a valid string-format UUID, 0 if
the argument is not a valid UUID, and NULL
if the argument is NULL
.
“Valid” means that the value is in a format that can be parsed. That is, it has the correct length and contains only the permitted characters (hexadecimal digits in any lettercase and, optionally, dashes and curly braces). This format is most common:
aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
These other formats are also permitted:
aaaaaaaabbbbccccddddeeeeeeeeeeee {aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee}
For the meanings of fields within the value, see the
UUID()
function description.
mysql>SELECT IS_UUID('6ccd780c-baba-1026-9564-5b8c656024db');
+-------------------------------------------------+ | IS_UUID('6ccd780c-baba-1026-9564-5b8c656024db') | +-------------------------------------------------+ | 1 | +-------------------------------------------------+ mysql>SELECT IS_UUID('6CCD780C-BABA-1026-9564-5B8C656024DB');
+-------------------------------------------------+ | IS_UUID('6CCD780C-BABA-1026-9564-5B8C656024DB') | +-------------------------------------------------+ | 1 | +-------------------------------------------------+ mysql>SELECT IS_UUID('6ccd780cbaba102695645b8c656024db');
+---------------------------------------------+ | IS_UUID('6ccd780cbaba102695645b8c656024db') | +---------------------------------------------+ | 1 | +---------------------------------------------+ mysql>SELECT IS_UUID('{6ccd780c-baba-1026-9564-5b8c656024db}');
+---------------------------------------------------+ | IS_UUID('{6ccd780c-baba-1026-9564-5b8c656024db}') | +---------------------------------------------------+ | 1 | +---------------------------------------------------+ mysql>SELECT IS_UUID('6ccd780c-baba-1026-9564-5b8c6560');
+---------------------------------------------+ | IS_UUID('6ccd780c-baba-1026-9564-5b8c6560') | +---------------------------------------------+ | 0 | +---------------------------------------------+ mysql>SELECT IS_UUID(RAND());
+-----------------+ | IS_UUID(RAND()) | +-----------------+ | 0 | +-----------------+
MASTER_POS_WAIT(
log_name
,log_pos
[,timeout
][,channel
])
This function is useful for control of master/slave
synchronization. It blocks until the slave has read and
applied all updates up to the specified position in the master
log. The return value is the number of log events the slave
had to wait for to advance to the specified position. The
function returns NULL
if the slave SQL
thread is not started, the slave's master information is not
initialized, the arguments are incorrect, or an error occurs.
It returns -1
if the timeout has been
exceeded. If the slave SQL thread stops while
MASTER_POS_WAIT()
is waiting,
the function returns NULL
. If the slave is
past the specified position, the function returns immediately.
On a multithreaded slave, the function waits until expiry of
the limit set by the
slave_checkpoint_group
or
slave_checkpoint_period
system variable, when the checkpoint operation is called to
update the status of the slave. Depending on the setting for
the system variables, the function might therefore return some
time after the specified position was reached.
If a timeout
value is specified,
MASTER_POS_WAIT()
stops waiting
when timeout
seconds have elapsed.
timeout
must be greater than 0; a
zero or negative timeout
means no
timeout.
The optional channel
value enables
you to name which replication channel the function applies to.
See Section 17.2.3, “Replication Channels” for more
information.
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
.
Returns the given value. When used to produce a result set
column, NAME_CONST()
causes the
column to have the given name. The arguments should be
constants.
mysql> SELECT NAME_CONST('myname', 14);
+--------+
| myname |
+--------+
| 14 |
+--------+
This function is for internal use only. The server uses it when writing statements from stored programs that contain references to local program variables, as described in Section 24.7, “Binary Logging of Stored Programs”. You might see this function in the output from mysqlbinlog.
For your applications, you can obtain exactly the same result as in the example just shown by using simple aliasing, like this:
mysql> SELECT 14 AS myname;
+--------+
| myname |
+--------+
| 14 |
+--------+
1 row in set (0.00 sec)
See Section 13.2.10, “SELECT Syntax”, for more information about column aliases.
Sleeps (pauses) for the number of seconds given by the
duration
argument, then returns 0.
The duration may have a fractional part. If the argument is
NULL
or negative,
SLEEP()
produces a warning, or
an error in strict SQL mode.
When sleep returns normally (without interruption), it returns 0:
mysql> SELECT SLEEP(1000);
+-------------+
| SLEEP(1000) |
+-------------+
| 0 |
+-------------+
When SLEEP()
is the only thing
invoked by a query that is interrupted, it returns 1 and the
query itself returns no error. This is true whether the query
is killed or times out:
This statement is interrupted using
KILL QUERY
from another session:
mysql> SELECT SLEEP(1000);
+-------------+
| SLEEP(1000) |
+-------------+
| 1 |
+-------------+
This statement is interrupted by timing out:
mysql> SELECT /*+ MAX_EXECUTION_TIME(1) */ SLEEP(1000);
+-------------+
| SLEEP(1000) |
+-------------+
| 1 |
+-------------+
When SLEEP()
is only part of a
query that is interrupted, the query returns an error:
This statement is interrupted using
KILL QUERY
from another session:
mysql> SELECT 1 FROM t1 WHERE SLEEP(1000);
ERROR 1317 (70100): Query execution was interrupted
This statement is interrupted by timing out:
mysql> SELECT /*+ MAX_EXECUTION_TIME(1000) */ 1 FROM t1 WHERE SLEEP(1000);
ERROR 3024 (HY000): Query execution was interrupted, maximum statement
execution time exceeded
This function is unsafe for statement-based replication. A
warning is logged if you use this function when
binlog_format
is set to
STATEMENT
.
Returns a Universal Unique Identifier (UUID) generated according to RFC 4122, “A Universally Unique IDentifier (UUID) URN Namespace” (http://www.ietf.org/rfc/rfc4122.txt).
A UUID is designed as a number that is globally unique in
space and time. Two calls to
UUID()
are expected to generate
two different values, even if these calls are performed on two
separate devices not connected to each other.
Although UUID()
values are
intended to be unique, they are not necessarily unguessable
or unpredictable. If unpredictability is required, UUID
values should be generated some other way.
UUID()
returns a value that
conforms to UUID version 1 as described in RFC 4122. The value
is a 128-bit number represented as a utf8
string of five hexadecimal numbers in
aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
format:
The first three numbers are generated from the low, middle, and high parts of a timestamp. The high part also includes the UUID version number.
The fourth number preserves temporal uniqueness in case the timestamp value loses monotonicity (for example, due to daylight saving time).
The fifth number is an IEEE 802 node number that provides spatial uniqueness. A random number is substituted if the latter is not available (for example, because the host device has no Ethernet card, or it is unknown how to find the hardware address of an interface on the host operating system). In this case, spatial uniqueness cannot be guaranteed. Nevertheless, a collision should have very low probability.
The MAC address of an interface is taken into account only on FreeBSD and Linux. On other operating systems, MySQL uses a randomly generated 48-bit number.
mysql> SELECT UUID();
-> '6ccd780c-baba-1026-9564-5b8c656024db'
To convert between string and binary UUID values, use the
UUID_TO_BIN()
and
BIN_TO_UUID()
functions. To
check whether a string is a valid UUID value, use the
IS_UUID()
function.
UUID()
does not work with
statement-based replication.
Returns a “short” universal identifier as a
64-bit unsigned integer. Values returned by
UUID_SHORT()
differ from the
string-format 128-bit identifiers returned by the
UUID()
function and have
different uniqueness properties. The value of
UUID_SHORT()
is guaranteed to
be unique if the following conditions hold:
The server_id
value of
the current server is between 0 and 255 and is unique
among your set of master and slave servers
You do not set back the system time for your server host between mysqld restarts
You invoke UUID_SHORT()
on
average fewer than 16 million times per second between
mysqld restarts
The UUID_SHORT()
return value
is constructed this way:
(server_id & 255) << 56 + (server_startup_time_in_seconds << 24) + incremented_variable++;
mysql> SELECT UUID_SHORT();
-> 92395783831158784
UUID_SHORT()
does not work
with statement-based replication.
UUID_TO_BIN(
,
string_uuid
)UUID_TO_BIN(
string_uuid
,
swap_flag
)
Converts a string UUID to a binary UUID and returns the
result. (The IS_UUID()
function
description lists the permitted string UUID formats.) The
return binary UUID is a
VARBINARY(16)
value. If the
UUID argument is NULL
, the return value is
NULL
. If any argument is invalid, an error
occurs.
UUID_TO_BIN()
takes one or two
arguments:
The one-argument form takes a string UUID value. The binary result is in the same order as the string argument.
The two-argument form takes a string UUID value and a flag value:
If swap_flag
is 0, the
two-argument form is equivalent to the one-argument
form. The binary result is in the same order as the
string argument.
If swap_flag
is 1, the
format of the return value differs: The time-low and
time-high parts (the first and third groups of
hexadecimal digits, respectively) are swapped. This
moves the more rapidly varying part to the right and
can improve indexing efficiency if the result is
stored in an indexed column.
Time-part swapping assumes the use of UUID version 1 values,
such as are generated by the
UUID()
function. For UUID
values produced by other means that do not follow version 1
format, time-part swapping provides no benefit. For details
about version 1 format, see the
UUID()
function description.
Suppose that you have the following string UUID value:
mysql> SET @uuid = '6ccd780c-baba-1026-9564-5b8c656024db';
To convert the string UUID to binary with or without time-part
swapping, use UUID_TO_BIN()
:
mysql>SELECT HEX(UUID_TO_BIN(@uuid));
+----------------------------------+ | HEX(UUID_TO_BIN(@uuid)) | +----------------------------------+ | 6CCD780CBABA102695645B8C656024DB | +----------------------------------+ mysql>SELECT HEX(UUID_TO_BIN(@uuid, 0));
+----------------------------------+ | HEX(UUID_TO_BIN(@uuid, 0)) | +----------------------------------+ | 6CCD780CBABA102695645B8C656024DB | +----------------------------------+ mysql>SELECT HEX(UUID_TO_BIN(@uuid, 1));
+----------------------------------+ | HEX(UUID_TO_BIN(@uuid, 1)) | +----------------------------------+ | 1026BABA6CCD780C95645B8C656024DB | +----------------------------------+
To convert a binary UUID returned by
UUID_TO_BIN()
to a string UUID,
use BIN_TO_UUID()
. If you
produce a binary UUID by calling
UUID_TO_BIN()
with a second
argument of 1 to swap time parts, you should also pass a
second argument of 1 to
BIN_TO_UUID()
to unswap the
time parts when converting the binary UUID back to a string
UUID:
mysql>SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid));
+--------------------------------------+ | BIN_TO_UUID(UUID_TO_BIN(@uuid)) | +--------------------------------------+ | 6ccd780c-baba-1026-9564-5b8c656024db | +--------------------------------------+ mysql>SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,0),0);
+--------------------------------------+ | BIN_TO_UUID(UUID_TO_BIN(@uuid,0),0) | +--------------------------------------+ | 6ccd780c-baba-1026-9564-5b8c656024db | +--------------------------------------+ mysql>SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,1),1);
+--------------------------------------+ | BIN_TO_UUID(UUID_TO_BIN(@uuid,1),1) | +--------------------------------------+ | 6ccd780c-baba-1026-9564-5b8c656024db | +--------------------------------------+
If the use of time-part swapping is not the same for the conversion in both directions, the original UUID will not be recovered properly:
mysql>SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,0),1);
+--------------------------------------+ | BIN_TO_UUID(UUID_TO_BIN(@uuid,0),1) | +--------------------------------------+ | baba1026-780c-6ccd-9564-5b8c656024db | +--------------------------------------+ mysql>SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,1),0);
+--------------------------------------+ | BIN_TO_UUID(UUID_TO_BIN(@uuid,1),0) | +--------------------------------------+ | 1026baba-6ccd-780c-9564-5b8c656024db | +--------------------------------------+
In an
INSERT
... ON DUPLICATE KEY UPDATE
statement, you can use
the
VALUES(
function in the col_name
)UPDATE
clause
to refer to column values from the
INSERT
portion of the
statement. In other words,
VALUES(
in the col_name
)UPDATE
clause refers to
the value of col_name
that would be
inserted, had no duplicate-key conflict occurred. This
function is especially useful in multiple-row inserts. The
VALUES()
function is meaningful
only in the ON DUPLICATE KEY UPDATE
clause
of INSERT
statements and
returns NULL
otherwise. See
Section 13.2.6.2, “INSERT ... ON DUPLICATE KEY UPDATE Syntax”.
mysql>INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
->ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
MySQL provides support for precision math: numeric value handling that results in extremely accurate results and a high degree control over invalid values. Precision math is based on these two features:
SQL modes that control how strict the server is about accepting or rejecting invalid data.
The MySQL library for fixed-point arithmetic.
These features have several implications for numeric operations and provide a high degree of compliance with standard SQL:
Precise calculations: For
exact-value numbers, calculations do not introduce
floating-point errors. Instead, exact precision is used. For
example, MySQL treats a number such as .0001
as an exact value rather than as an approximation, and summing
it 10,000 times produces a result of exactly
1
, not a value that is merely
“close” to 1.
Well-defined rounding behavior:
For exact-value numbers, the result of
ROUND()
depends on its argument,
not on environmental factors such as how the underlying C
library works.
Platform independence: Operations on exact numeric values are the same across different platforms such as Windows and Unix.
Control over handling of invalid
values: Overflow and division by zero are detectable
and can be treated as errors. For example, you can treat a value
that is too large for a column as an error rather than having
the value truncated to lie within the range of the column's data
type. Similarly, you can treat division by zero as an error
rather than as an operation that produces a result of
NULL
. The choice of which approach to take is
determined by the setting of the server SQL mode.
The following discussion covers several aspects of how precision math works, including possible incompatibilities with older applications. At the end, some examples are given that demonstrate how MySQL handles numeric operations precisely. For information about controlling the SQL mode, see Section 5.1.11, “Server SQL Modes”.
The scope of precision math for exact-value operations includes
the exact-value data types (integer and
DECIMAL
types) and exact-value
numeric literals. Approximate-value data types and numeric
literals are handled as floating-point numbers.
Exact-value numeric literals have an integer part or fractional
part, or both. They may be signed. Examples: 1
,
.2
, 3.4
,
-5
, -6.78
,
+9.10
.
Approximate-value numeric literals are represented in scientific
notation with a mantissa and exponent. Either or both parts may be
signed. Examples: 1.2E3
,
1.2E-3
, -1.2E3
,
-1.2E-3
.
Two numbers that look similar may be treated differently. For
example, 2.34
is an exact-value (fixed-point)
number, whereas 2.34E0
is an approximate-value
(floating-point) number.
The DECIMAL
data type is a
fixed-point type and calculations are exact. In MySQL, the
DECIMAL
type has several synonyms:
NUMERIC
,
DEC
,
FIXED
. The integer types also are
exact-value types.
The FLOAT
and
DOUBLE
data types are
floating-point types and calculations are approximate. In MySQL,
types that are synonymous with
FLOAT
or
DOUBLE
are
DOUBLE PRECISION
and
REAL
.
This section discusses the characteristics of the
DECIMAL
data type (and its
synonyms), with particular regard to the following topics:
Maximum number of digits
Storage format
Storage requirements
The nonstandard MySQL extension to the upper range of
DECIMAL
columns
The declaration syntax for a
DECIMAL
column is
DECIMAL(
.
The ranges of values for the arguments are as follows:
M
,D
)
M
is the maximum number of digits
(the precision). It has a range of 1 to 65.
D
is the number of digits to the
right of the decimal point (the scale). It has a range of 0 to
30 and must be no larger than M
.
If D
is omitted, the default is 0. If
M
is omitted, the default is 10.
The maximum value of 65 for M
means
that calculations on DECIMAL
values
are accurate up to 65 digits. This limit of 65 digits of precision
also applies to exact-value numeric literals, so the maximum range
of such literals differs from before.
Values for DECIMAL
columns are
stored using a binary format that packs nine decimal digits into 4
bytes. The storage requirements for the integer and fractional
parts of each value are determined separately. Each multiple of
nine digits requires 4 bytes, and any remaining digits left over
require some fraction of 4 bytes. The storage required for
remaining digits is given by the following table.
Leftover Digits | Number of Bytes |
---|---|
0 | 0 |
1–2 | 1 |
3–4 | 2 |
5–6 | 3 |
7–9 | 4 |
For example, a DECIMAL(18,9)
column has nine
digits on either side of the decimal point, so the integer part
and the fractional part each require 4 bytes. A
DECIMAL(20,6)
column has fourteen integer
digits and six fractional digits. The integer digits require four
bytes for nine of the digits and 3 bytes for the remaining five
digits. The six fractional digits require 3 bytes.
DECIMAL
columns do not store a
leading +
character or -
character or leading 0
digits. If you insert
+0003.1
into a DECIMAL(5,1)
column, it is stored as 3.1
. For negative
numbers, a literal -
character is not stored.
DECIMAL
columns do not permit
values larger than the range implied by the column definition. For
example, a DECIMAL(3,0)
column supports a range
of -999
to 999
. A
DECIMAL(
column permits up to M
,D
)M
-
D
digits to the left of the decimal
point.
The SQL standard requires that the precision of
NUMERIC(
be exactly M
,D
)M
digits. For
DECIMAL(
,
the standard requires a precision of at least
M
,D
)M
digits but permits more. In MySQL,
DECIMAL(
and
M
,D
)NUMERIC(
are the same, and both have a precision of exactly
M
,D
)M
digits.
For a full explanation of the internal format of
DECIMAL
values, see the file
strings/decimal.c
in a MySQL source
distribution. The format is explained (with an example) in the
decimal2bin()
function.
With precision math, exact-value numbers are used as given
whenever possible. For example, numbers in comparisons are used
exactly as given without a change in value. In strict SQL mode,
for INSERT
into a column with an
exact data type (DECIMAL
or
integer), a number is inserted with its exact value if it is
within the column range. When retrieved, the value should be the
same as what was inserted. (If strict SQL mode is not enabled,
truncation for INSERT
is
permissible.)
Handling of a numeric expression depends on what kind of values the expression contains:
If any approximate values are present, the expression is approximate and is evaluated using floating-point arithmetic.
If no approximate values are present, the expression contains
only exact values. If any exact value contains a fractional
part (a value following the decimal point), the expression is
evaluated using DECIMAL
exact
arithmetic and has a precision of 65 digits. The term
“exact” is subject to the limits of what can be
represented in binary. For example, 1.0/3.0
can be approximated in decimal notation as
.333...
, but not written as an exact
number, so (1.0/3.0)*3.0
does not evaluate
to exactly 1.0
.
Otherwise, the expression contains only integer values. The
expression is exact and is evaluated using integer arithmetic
and has a precision the same as
BIGINT
(64 bits).
If a numeric expression contains any strings, they are converted to double-precision floating-point values and the expression is approximate.
Inserts into numeric columns are affected by the SQL mode, which
is controlled by the sql_mode
system variable. (See Section 5.1.11, “Server SQL Modes”.) The following
discussion mentions strict mode (selected by the
STRICT_ALL_TABLES
or
STRICT_TRANS_TABLES
mode values)
and ERROR_FOR_DIVISION_BY_ZERO
.
To turn on all restrictions, you can simply use
TRADITIONAL
mode, which includes
both strict mode values and
ERROR_FOR_DIVISION_BY_ZERO
:
SET sql_mode='TRADITIONAL';
If a number is inserted into an exact type column
(DECIMAL
or integer), it is
inserted with its exact value if it is within the column range and
precision.
If the value has too many digits in the fractional part, rounding occurs and a note is generated. Rounding is done as described in Section 12.25.4, “Rounding Behavior”. Truncation due to rounding of the fractional part is not an error, even in strict mode.
If the value has too many digits in the integer part, it is too large (out of range) and is handled as follows:
If strict mode is not enabled, the value is truncated to the nearest legal value and a warning is generated.
If strict mode is enabled, an overflow error occurs.
Underflow is not detected, so underflow handling is undefined.
For inserts of strings into numeric columns, conversion from string to number is handled as follows if the string has nonnumeric contents:
A string that does not begin with a number cannot be used as a number and produces an error in strict mode, or a warning otherwise. This includes the empty string.
A string that begins with a number can be converted, but the trailing nonnumeric portion is truncated. If the truncated portion contains anything other than spaces, this produces an error in strict mode, or a warning otherwise.
By default, division by zero produces a result of
NULL
and no warning. By setting the SQL mode
appropriately, division by zero can be restricted.
With the
ERROR_FOR_DIVISION_BY_ZERO
SQL
mode enabled, MySQL handles division by zero differently:
If strict mode is not enabled, a warning occurs.
If strict mode is enabled, inserts and updates involving division by zero are prohibited, and an error occurs.
In other words, inserts and updates involving expressions that
perform division by zero can be treated as errors, but this
requires
ERROR_FOR_DIVISION_BY_ZERO
in
addition to strict mode.
Suppose that we have this statement:
INSERT INTO t SET i = 1/0;
This is what happens for combinations of strict and
ERROR_FOR_DIVISION_BY_ZERO
modes.
sql_mode Value |
Result |
---|---|
'' (Default) |
No warning, no error; i is set to
NULL . |
strict | No warning, no error; i is set to
NULL . |
ERROR_FOR_DIVISION_BY_ZERO |
Warning, no error; i is set to
NULL . |
strict,ERROR_FOR_DIVISION_BY_ZERO |
Error condition; no row is inserted. |
This section discusses precision math rounding for the
ROUND()
function and for inserts
into columns with exact-value types
(DECIMAL
and integer).
The ROUND()
function rounds
differently depending on whether its argument is exact or
approximate:
For exact-value numbers,
ROUND()
uses the “round
half up” rule: A value with a fractional part of .5 or
greater is rounded up to the next integer if positive or down
to the next integer if negative. (In other words, it is
rounded away from zero.) A value with a fractional part less
than .5 is rounded down to the next integer if positive or up
to the next integer if negative. (In other words, it is
rounded toward zero.)
For approximate-value numbers, the result depends on the C
library. On many systems, this means that
ROUND()
uses the “round
to nearest even” rule: A value with a fractional part
exactly half way between two integers is rounded to the
nearest even integer.
The following example shows how rounding differs for exact and approximate values:
mysql> SELECT ROUND(2.5), ROUND(25E-1);
+------------+--------------+
| ROUND(2.5) | ROUND(25E-1) |
+------------+--------------+
| 3 | 2 |
+------------+--------------+
For inserts into a DECIMAL
or
integer column, the target is an exact data type, so rounding uses
“round half away from zero,” regardless of whether
the value to be inserted is exact or approximate:
mysql>CREATE TABLE t (d DECIMAL(10,0));
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO t VALUES(2.5),(2.5E0);
Query OK, 2 rows affected, 2 warnings (0.00 sec) Records: 2 Duplicates: 0 Warnings: 2 mysql>SHOW WARNINGS;
+-------+------+----------------------------------------+ | Level | Code | Message | +-------+------+----------------------------------------+ | Note | 1265 | Data truncated for column 'd' at row 1 | | Note | 1265 | Data truncated for column 'd' at row 2 | +-------+------+----------------------------------------+ 2 rows in set (0.00 sec) mysql>SELECT d FROM t;
+------+ | d | +------+ | 3 | | 3 | +------+ 2 rows in set (0.00 sec)
The SHOW WARNINGS
statement
displays the notes that are generated by truncation due to
rounding of the fractional part. Such truncation is not an error,
even in strict SQL mode (see
Section 12.25.3, “Expression Handling”).
This section provides some examples that show precision math query results in MySQL. These examples demonstrate the principles described in Section 12.25.3, “Expression Handling”, and Section 12.25.4, “Rounding Behavior”.
Example 1. Numbers are used with their exact value as given when possible:
mysql> SELECT (.1 + .2) = .3;
+----------------+
| (.1 + .2) = .3 |
+----------------+
| 1 |
+----------------+
For floating-point values, results are inexact:
mysql> SELECT (.1E0 + .2E0) = .3E0;
+----------------------+
| (.1E0 + .2E0) = .3E0 |
+----------------------+
| 0 |
+----------------------+
Another way to see the difference in exact and approximate value
handling is to add a small number to a sum many times. Consider
the following stored procedure, which adds
.0001
to a variable 1,000 times.
CREATE PROCEDURE p () BEGIN DECLARE i INT DEFAULT 0; DECLARE d DECIMAL(10,4) DEFAULT 0; DECLARE f FLOAT DEFAULT 0; WHILE i < 10000 DO SET d = d + .0001; SET f = f + .0001E0; SET i = i + 1; END WHILE; SELECT d, f; END;
The sum for both d
and f
logically should be 1, but that is true only for the decimal
calculation. The floating-point calculation introduces small
errors:
+--------+------------------+ | d | f | +--------+------------------+ | 1.0000 | 0.99999999999991 | +--------+------------------+
Example 2. Multiplication is
performed with the scale required by standard SQL. That is, for
two numbers X1
and
X2
that have scale
S1
and S2
,
the scale of the result is
:
S1
+ S2
mysql> SELECT .01 * .01;
+-----------+
| .01 * .01 |
+-----------+
| 0.0001 |
+-----------+
Example 3. Rounding behavior for exact-value numbers is well-defined:
Rounding behavior (for example, with the
ROUND()
function) is independent of
the implementation of the underlying C library, which means that
results are consistent from platform to platform.
Rounding for exact-value columns
(DECIMAL
and integer) and
exact-valued numbers uses the “round half away from
zero” rule. A value with a fractional part of .5 or
greater is rounded away from zero to the nearest integer, as
shown here:
mysql> SELECT ROUND(2.5), ROUND(-2.5);
+------------+-------------+
| ROUND(2.5) | ROUND(-2.5) |
+------------+-------------+
| 3 | -3 |
+------------+-------------+
Rounding for floating-point values uses the C library, which on many systems uses the “round to nearest even” rule. A value with a fractional part exactly half way between two integers is rounded to the nearest even integer:
mysql> SELECT ROUND(2.5E0), ROUND(-2.5E0);
+--------------+---------------+
| ROUND(2.5E0) | ROUND(-2.5E0) |
+--------------+---------------+
| 2 | -2 |
+--------------+---------------+
Example 4. In strict mode, inserting a value that is out of range for a column causes an error, rather than truncation to a legal value.
When MySQL is not running in strict mode, truncation to a legal value occurs:
mysql>SET sql_mode='';
Query OK, 0 rows affected (0.00 sec) mysql>CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.01 sec) mysql>INSERT INTO t SET i = 128;
Query OK, 1 row affected, 1 warning (0.00 sec) mysql>SELECT i FROM t;
+------+ | i | +------+ | 127 | +------+ 1 row in set (0.00 sec)
However, an error occurs if strict mode is in effect:
mysql>SET sql_mode='STRICT_ALL_TABLES';
Query OK, 0 rows affected (0.00 sec) mysql>CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO t SET i = 128;
ERROR 1264 (22003): Out of range value adjusted for column 'i' at row 1 mysql>SELECT i FROM t;
Empty set (0.00 sec)
Example 5: In strict mode and
with ERROR_FOR_DIVISION_BY_ZERO
set, division by zero causes an error, not a result of
NULL
.
In nonstrict mode, division by zero has a result of
NULL
:
mysql>SET sql_mode='';
Query OK, 0 rows affected (0.01 sec) mysql>CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO t SET i = 1 / 0;
Query OK, 1 row affected (0.00 sec) mysql>SELECT i FROM t;
+------+ | i | +------+ | NULL | +------+ 1 row in set (0.03 sec)
However, division by zero is an error if the proper SQL modes are in effect:
mysql>SET sql_mode='STRICT_ALL_TABLES,ERROR_FOR_DIVISION_BY_ZERO';
Query OK, 0 rows affected (0.00 sec) mysql>CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec) mysql>INSERT INTO t SET i = 1 / 0;
ERROR 1365 (22012): Division by 0 mysql>SELECT i FROM t;
Empty set (0.01 sec)
Example 6. Exact-value literals are evaluated as exact values.
Approximate-value literals are evaluated using floating point, but
exact-value literals are handled as
DECIMAL
:
mysql>CREATE TABLE t SELECT 2.5 AS a, 25E-1 AS b;
Query OK, 1 row affected (0.01 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql>DESCRIBE t;
+-------+-----------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------------------+------+-----+---------+-------+ | a | decimal(2,1) unsigned | NO | | 0.0 | | | b | double | NO | | 0 | | +-------+-----------------------+------+-----+---------+-------+ 2 rows in set (0.01 sec)
Example 7. If the argument to an aggregate function is an exact numeric type, the result is also an exact numeric type, with a scale at least that of the argument.
Consider these statements:
mysql>CREATE TABLE t (i INT, d DECIMAL, f FLOAT);
mysql>INSERT INTO t VALUES(1,1,1);
mysql>CREATE TABLE y SELECT AVG(i), AVG(d), AVG(f) FROM t;
The result is a double only for the floating-point argument. For exact type arguments, the result is also an exact type:
mysql> DESCRIBE y;
+--------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------------+------+-----+---------+-------+
| AVG(i) | decimal(14,4) | YES | | NULL | |
| AVG(d) | decimal(14,4) | YES | | NULL | |
| AVG(f) | double | YES | | NULL | |
+--------+---------------+------+-----+---------+-------+
The result is a double only for the floating-point argument. For exact type arguments, the result is also an exact type.