|
大多数的SQL数据库引擎(我们已知的除SQL以外的所有数据库引擎)使用静态、严格的分类。在使用静态分类中,值的数据类型是由它的容器决定的——存储这个值的那一列。
Most SQL database engines (every SQL database engine other than SQLite,
as far as we know) uses static, rigid typing. With static typing, the datatype
of a value is determined by its container - the particular column in
which the value is stored.
SQLite使用了一个更普通的动态类型系统。在SQLite中,值的数据类型是由这个值自身决定的,而不是它的容器。
SQLite的动态类型系统向后兼容其他数据库引擎中更为通用的静态类型系统,这意味着在静态类型数据库上运行的SQL语句都同样可以在SQLite上运行。不过,SQLite中的动态类型允许做的一些事情在传统严格类型数据库上可能是无法运行的。
SQLite uses a more general dynamic type system. In SQLite, the datatype
of a value is associated with the value itself, not with its container.
The dynamic type system of SQLite is backwards
compatible with the more common static type systems of other database engines
in the sense that SQL statements that work on statically typed databases should
work the same way in SQLite. However, the dynamic typing in SQLite allows
it to do things which are not possible in traditional rigidly typed
databases.
每个存储在SQLite数据库中(或数据库引擎处理的)的值都属于下列的存储类之一:
Each value stored in an SQLite database (or manipulated by the
database engine) has one of the following storage classes:
NULL.
NULL值
The value is a NULL value.
INTEGER.
有符号整数,根据值的量级存储在1、2、3、4、6或8个字节中。
The value is a signed integer, stored in 1,
2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
REAL. 浮点数值,存储为一个8字节的IEEE浮点数。
The value is a floating point value, stored as
an 8-byte IEEE floating point number.
TEXT.文本字符串,使用数据库编码存储(UTF-8, UTF-16BE or UTF-16LE)。
The value is a text string, stored using the
database encoding (UTF-8, UTF-16BE or UTF-16LE).
BLOB. 二进制数据,完全按照输入内容存储。
The value is a blob of data, stored exactly as
it was input.
注意,存储类比数据类型要稍微通用一些。例如INTEGER存储类,包含6种不同长度的整数数据类型。这会导致在磁盘上的区别,但是一旦将INTEGER值从磁盘上读取到内存中以备处理,就会被转换为更为通用的数据类型(8字节有符号整数)。所以对于大多数情况,是无法区分"存储类(storage class)"与"数据类型(datatype)"的,并且这两者是可以互换使用的。
Note that a storage class is slightly more general than a datatype.
The INTEGER storage class, for example, includes 6 different integer
datatypes of different lengths. This makes a difference on disk. But
as soon as INTEGER values are read off of disk and into memory for processing,
they are converted to the most general datatype (8-byte signed integer).
And so for the most part, "storage class" is indistinguishable from
"datatype" and the two terms can be used interchangeably.
在SQLite第三版数据库中除了INTEGER PRIMARY KEY列以外的任何列都可以用来存储任何存储类型的值。
Any column in an SQLite version 3 database,
except an INTEGER PRIMARY KEY column, may be used to store a value
of any storage class.
在SQL语句中的所有值,无论是内嵌在SQL语句文本中的还是绑定在预编译SQL语句中的参数都有一个隐式的存储类。在下面描述的情形中,数据库引擎可能会在执行查询的过程中将值在数字存储类型(INTEGER和REAL)和TEXT类型之间进行转换。
All values in SQL statements, whether they are literals embedded in SQL
statement text or parameters bound to
precompiled SQL statements
have an implicit storage class.
Under circumstances described below, the
database engine may convert values between numeric storage classes
(INTEGER and REAL) and TEXT during query execution.
SQLite没有一个单独的布尔存储类。而是吧布尔值存储为一个整数0(false)和1(true)。
SQLite does not have a separate Boolean storage class.
Instead, Boolean values are stored as integers 0 (false) and 1 (true).
SQLite没有为日期和时间类型使用单独的存储类。而是使用SQLite内置的日期和时间函数将日期和时间存储为一个TEXT、REAL或者INTEGER值:
SQLite does not have a storage class set aside for storing
dates and/or times.
Instead, the built-in Date And Time Functions of SQLite are capable of
storing dates and times as TEXT, REAL, or INTEGER values:
应用程序可以选择按照这些格式中的任意一种来存储日期和时间,并且可以使用内置的日期时间函数任意的相互转换。
Applications can chose to store dates and times in any of these
formats and freely convert between formats using the built-in date
and time functions.
为了尽可能的兼容SQLite和其他数据库引擎,SQLite在列上支持"类型亲和性(type affinity)"的概念。一列的类型亲和性是指这一列中存储的数据的推荐类型。这里最重要的概念是这个类型是推荐的而不是必须的。任何列已然可以存储任何类型的数据。这只是允许列优先选择一个存储类。这个优先选择的存储类就叫做“亲和性”。
In order to maximize compatibility between SQLite and other database
engines, SQLite supports the concept of "type affinity" on columns.
The type affinity of a column is the recommended type for data stored
in that column. The important idea here is that the type is recommended, not
required. Any column can still store any type of data.
It is just that some columns, given the choice, will prefer to use
one storage class over another. The preferred storage class for
a column is called its "affinity".
SQLite3数据库中的每一列都会被指定下列类型之一的亲和性:
Each column in an SQLite 3 database is assigned one of the
following type affinities:
TEXT亲和性的列使用NULL、TEXT或BLOB存储类来存储所有的数据。如果向一个TEXT亲和性的列插入一个数值数据,那么会在存储之前被转换为文本。
A column with TEXT affinity stores all data using storage classes
NULL, TEXT or BLOB. If numerical data is inserted into a column with
TEXT affinity it is converted into text form before being stored.
NUMERIC亲和性的列可以包含全部五个存储类的数据。当将一个文本插入到一个NUMERIC列中时,如果可以将数据无损、可逆的转换为INTEGER或REAL(按顺序优先),那么将把文本的存储类转换为这两者。
对于TEXT到REAL存储类的转换,SQLite认为如果15个有效位可以覆盖这个数字,那么这个转换就是无损、可逆的。如果无法做到TEXT到INTEGER或REAL的无损转换,那么这个值会使用TEXT存储类来保存。NULL和BLOB值是无需尝试转换的。
A column with NUMERIC affinity may contain values using all five
storage classes. When text data is inserted into a NUMERIC column, the
storage class of the text is converted to INTEGER or REAL (in order of
preference) if such conversion is lossless and reversible.
For conversions between TEXT and REAL storage classes, SQLite considers
the conversion to be lossless and reversible if the first 15 significant
decimal digits of the number are preserved.
If the lossless conversion of TEXT to INTEGER or REAL is not possible then
the value is stored using the TEXT storage class. No
attempt is made to convert NULL or BLOB values.
如果一个字符串类似一个带小数点和指数的浮点数,但是只要这个值可以使用一个整数来表示,那么NUMERIC亲和性就会将其转换为整数。因此,将字符串'3.0e+5'存储在一个NUMERIC亲和性的列中时会被转换为整数300000,而不是浮点数300000.0。
A string might look like a floating-point literal with
a decimal point and/or exponent notation but as long as
the value can be expressed as an integer, the NUMERIC affinity will convert
it into an integer. Hence, the string '3.0e+5' is stored in a
column with NUMERIC affinity as the integer 300000, not as the floating
point value 300000.0.
使用INTEGER亲和性的列行为与NUMERIC亲和性的列是一样的。INTEGER亲和性和NUMERIC亲和性唯一区别是在CAST 表达式中。
A column that uses INTEGER affinity behaves the same as a column
with NUMERIC affinity. The difference between INTEGER and NUMERIC affinity
is only evident in a CAST expression.
REAL亲和性的列行为与NUMERIC亲和性的列类似,不过会强制使用浮点数表示整数值。(一个内部优化,没有小数部分的小浮点数在存储到REAL亲和性的列中时,会被作为整数写入磁盘,这是为了减少空间占用,并且会在读出时自动转换回浮点数。
这个优化在SQL层面是完全不可见的,并且只有在查看数据库文件原始字节的时候才会发现。
A column with REAL affinity behaves like a column with NUMERIC
affinity except that it forces integer values into floating point
representation. (As an internal optimization, small floating point
values with no fractional component and stored in columns with REAL
affinity are written to disk as integers in order to take up less
space and are automatically converted back into floating point as
the value is read out.
This optimization is completely invisible at the SQL level and can only
be detected by examining the raw bits of the database file.)
NONE亲和性的列没有优先选择的存储类,所以也不会尝试将一个存储类转换成另一个。
A column with affinity NONE does not prefer one storage class over
another and no attempt is made to coerce data from one storage class into
another.
列的亲和性是通过列的定义类型按照下列规则的顺序来决定的。
The affinity of a column is determined by the declared type
of the column, according to the following rules in the order shown:
如果定义类型包含字符串"INT"那么会指定INTEGER亲和性。
If the declared type contains the string "INT" then it
is assigned INTEGER affinity.
如果列定义类型包含"CHAR"、"CLOB"或"TEXT"这几个字符串,那么会指定TEXT亲和性。注意类型VARCHAR包含了字符串"CHAR",所以指定TEXT亲和性。
If the declared type of the column contains any of the strings
"CHAR", "CLOB", or "TEXT" then that
column has TEXT affinity. Notice that the type VARCHAR contains the
string "CHAR" and is thus assigned TEXT affinity.
如果列定义类型包含字符串"BLOB",或者没有指定类型,那么列的亲和性为NONE。
If the declared type for a column
contains the string "BLOB" or if
no type is specified then the column has affinity NONE.
如果列定义类型包含"REAL"、"FLOA"或"DOUB"这些字符串,那么列亲和性为REAL。
If the declared type for a column
contains any of the strings "REAL", "FLOA",
or "DOUB" then the column has REAL affinity.
否则,就是NUMERIC亲和性。
Otherwise, the affinity is NUMERIC.
注意确定列亲和性的规则顺序是非常重要的。如果一列的定义类型为"CHARINT",这会同时满足规则1和2,但是第一个规则优先,所以列亲和性为INTEGER。
Note that the order of the rules for determining column affinity
is important. A column whose declared type is "CHARINT" will match
both rules 1 and 2 but the first rule takes precedence and so the
column affinity will be INTEGER.
下面的表格展示了来自许多传统SQL实现的大量通用数据类型名称通过上一节的五条规则转换成的亲和性。这个表格只展示了SQLite所能支持的数据类型名称的很小一部分。注意,类型名后面小括号中的数字参数(例如:"VARCHAR(255)")在SQLite中会被忽略——SQLite未在字符串、BLOB、数字值上强加长度限制(除了SQLITE_MAX_LENGTH全局限制)。
The following table shows how many common datatype names from
more traditional SQL implementations are converted into affinities by the five rules of the
previous section. This table shows only a small subset of the
datatype names that SQLite will accept. Note that numeric arguments
in parentheses that following the type name (ex: "VARCHAR(255)") are
ignored by SQLite - SQLite does not impose any length restrictions
(other than the large global SQLITE_MAX_LENGTH limit) on the length of
strings, BLOBs or numeric values.
来自
CREATE TABLE 语句
或CAST表达式的示例类型名
Example Typenames From The
CREATE TABLE Statement
or CAST Expression最终亲和性
Resulting Affinity确定亲和性使用的规则
Rule Used To Determine AffinityINT
INTEGER
TINYINT
SMALLINT
MEDIUMINT
BIGINT
UNSIGNED BIG INT
INT2
INT8INTEGER 1 CHARACTER(20)
VARCHAR(255)
VARYING CHARACTER(255)
NCHAR(55)
NATIVE CHARACTER(70)
NVARCHAR(100)
TEXT
CLOBTEXT 2 BLOB
未指定类型
no datatype specifiedNONE 3 REAL
DOUBLE
DOUBLE PRECISION
FLOATREAL 4 NUMERIC
DECIMAL(10,5)
BOOLEAN
DATE
DATETIMENUMERIC 5
注意,声明"FLOATING POINT"类型会指定INTEGER亲和性,而不是REAL亲和性,这是由于"POINT"的末尾是"INT"。还有申明"STRING"类型会指定亲和性为NUMERIC,而不是TEXT。
Note that a declared type of "FLOATING POINT" would give INTEGER
affinity, not REAL affinity, due to the "INT" at the end of "POINT".
And the declared type of "STRING" has an affinity of NUMERIC, not TEXT.
下面的SQL展示了SQLite在将值插入到表格时时如何使用列亲和性进行类型转换的。
The following SQL demonstrates how SQLite uses column affinity
to do type conversions when values are inserted into a table.
CREATE TABLE t1( t TEXT, -- 规则2,TEXT亲和性。text affinity by rule 2 nu NUMERIC, -- 规则5,NUMERIC亲和性。numeric affinity by rule 5 i INTEGER, -- 规则1,INTEGER亲和性。integer affinity by rule 1 r REAL, -- 规则4,REAL亲和性。real affinity by rule 4 no BLOB -- 规则3,NONE亲和性。no affinity by rule 3 ); --值作为TEXT、INTEGER、INTEGER、REAL、TEXT存储。 -- Values stored as TEXT, INTEGER, INTEGER, REAL, TEXT. INSERT INTO t1 VALUES('500.0', '500.0', '500.0', '500.0', '500.0'); SELECT typeof(t), typeof(nu), typeof(i), typeof(r), typeof(no) FROM t1; text|integer|integer|real|text --值作为TEXT、INTEGER、INTEGER、REAL、REAL存储。 -- Values stored as TEXT, INTEGER, INTEGER, REAL, REAL. DELETE FROM t1; INSERT INTO t1 VALUES(500.0, 500.0, 500.0, 500.0, 500.0); SELECT typeof(t), typeof(nu), typeof(i), typeof(r), typeof(no) FROM t1; text|integer|integer|real|real --值作为TEXT、INTEGER、INTEGER、REAL、INTEGER存储。 -- Values stored as TEXT, INTEGER, INTEGER, REAL, INTEGER. DELETE FROM t1; INSERT INTO t1 VALUES(500, 500, 500, 500, 500); SELECT typeof(t), typeof(nu), typeof(i), typeof(r), typeof(no) FROM t1; text|integer|integer|real|integer --无论列亲和性是什么,BLOB总是被存储为BLOB。 -- BLOBs are always stored as BLOBs regardless of column affinity. DELETE FROM t1; INSERT INTO t1 VALUES(x'0500', x'0500', x'0500', x'0500', x'0500'); SELECT typeof(t), typeof(nu), typeof(i), typeof(r), typeof(no) FROM t1; blob|blob|blob|blob|blob --NULL不会受到亲和性的影响。 -- NULLs are also unaffected by affinity DELETE FROM t1; INSERT INTO t1 VALUES(NULL,NULL,NULL,NULL,NULL); SELECT typeof(t), typeof(nu), typeof(i), typeof(r), typeof(no) FROM t1; null|null|null|null|null
SQLite3中包含了常见的SQL比较操作符,包括:"="、 "=="、"<"、"<="、">"、">="、"!="、"<>"、"IN"、"NOT IN"、"BETWEEN"、"IS"和"IS NOT"。
SQLite version 3 has the usual set of SQL comparison operators
including "=", "==", "<", "<=", ">", ">=", "!=", "<>",
"IN", "NOT IN", "BETWEEN", "IS", and "IS NOT", .
比较的结果依赖于操作数的存储类型,按照如下的规则:
The results of a comparison depend on the storage classes of the
operands, according to the following rules:
一个NULL存储类的值会被当做比其它值都小(包括NULL存储类的其它值)。
A value with storage class NULL is considered less than any
other value (including another value with storage class NULL).
INTEGER或REAL值小于TEXT或BLOB值。当在两个INTEGER或REAL值之间做比较时会执行数值比较。
An INTEGER or REAL value is less than any TEXT or BLOB value.
When an INTEGER or REAL is compared to another INTEGER or REAL, a
numerical comparison is performed.
TEXT值小于BLOB值。当两个TEXT值比较时,会使用一个合适的排序序列来决定结果。
A TEXT value is less than a BLOB value. When two TEXT values
are compared an appropriate collating sequence is used to determine
the result.
当两个BLOB值进行比较时,结果由memcmp()来决定。
When two BLOB values are compared, the result is
determined using memcmp().
SQLite在执行比较操作前可能会尝试将值在INTEGER、REAL和TEXT存储类之间进行转换。不管怎样,在比较发生前的所有尝试的转换都依赖于操作数的亲和性。操作数的亲和性是由下列规则来决定的:
SQLite may attempt to convert values between the storage classes
INTEGER, REAL, and/or TEXT before performing a comparison.
Whether or not any conversions are attempted before the comparison takes
place depends on the affinity of the operands.
Operand affinity is determined by the following rules:
一个表达式只是简单的引用了一列的值,那么亲和性与该列一样。注意,如果X和Y.Z是列名,那么+X和+Y.Z 被认为是用于决定亲和性的表达式。
An expression that is a simple reference to a column value
has the same affinity as the column.
Note that if X and Y.Z
are column names, then +X and +Y.Z are considered expressions for the
purpose of determining affinity.
"CAST(expr AS type)"形式的表达式的亲和性与定义类型为 "type"的列一样。
An expression of the form "CAST(expr AS type)"
has an affinity that is the same as a column with a declared
type of "type".
否则,表达式就是NONE亲和性。
Otherwise, an expression has NONE affinity.
"应用亲和性"的是说,只有在转换是无损和可逆的情况下才可以将操作数转换为指定的存储类型。在执行操作符比较之前对操作数应用亲和性,这是按照下列规则顺序执行的:
To "apply affinity" means to convert an operand to a particular storage
class if and only if the conversion is lossless and reversible.
Affinity is applied to operands of a comparison operator prior to
the comparison according to the following rules in the order shown:
如果一个操作数是INTEGER、REAL或NUMERIC亲和性,另一个操作数是TEXT或NONE亲和性,那么会将NUMERIC亲和性应用于另一个操作数。
If one operand has INTEGER, REAL or NUMERIC affinity
and the other operand as TEXT or NONE affinity
then NUMERIC affinity is applied to other operand.
如果一个操作数是TEXT亲和性,另一个是NONE亲和性,那么会将TEXT亲和性应用于另一个操作数。
If one operand has TEXT affinity and the other has NONE affinity,
then TEXT affinity is applied to the other operand.
否则,不会对比较的两个操作数应用亲和性。
Otherwise, no affinity is applied and both operands are compared
as is.
表达式"a BETWEEN b AND c"会被当做两个分开的二元比较操作"a >= b AND a <= c"来对待,这可能会使在两个比较操作中应用于'a'的亲和性不一样。
在"x IN (SELECT y ...)"这种形式的比较中的数据类型转换会当做"x=y"比较来处理。表达式"a IN (x, y, z, ...)"等价于"a = +x OR a = +y OR a = +z OR ..."。换句话说,IN操作符右边的值(例子中的"x"、"y"和"z"值)会被认为是没有亲和性,即使它们来自于列值或者CAST表达式。
The expression "a BETWEEN b AND c" is treated as two separate
binary comparisons "a >= b AND a <= c", even if that means
different affinities are applied to 'a' in each of the comparisons.
Datatype conversions in comparisons of the
form "x IN (SELECT y ...)" are handled is if
the comparison were really "x=y".
The expression "a IN (x, y, z, ...)" is equivalent to "a = +x OR
a = +y OR a = +z OR ...".
In other words, the values to the right of the IN operator (the "x", "y",
and "z" values in this example) are considered to have no affinity,
even if they happen to be column values or CAST expressions.
CREATE TABLE t1( a TEXT, -- TEXT亲和性 text affinity b NUMERIC, -- NUMERIC亲和性 numeric affinity c BLOB, -- NONE亲和性 no affinity d -- NONE亲和性 no affinity ); -- 值会分别存储为TEXT、INTEGER、TEXT和INTEGER。 -- Values will be stored as TEXT, INTEGER, TEXT, and INTEGER respectively INSERT INTO t1 VALUES('500', '500', '500', 500); SELECT typeof(a), typeof(b), typeof(c), typeof(d) FROM t1; text|integer|text|integer -- 由于列"a"是TEXT亲和性,在进行比较前,比较符右边的NUMERIC值会被转换为TEXT。 -- Because column "a" has text affinity, numeric values on the -- right-hand side of the comparisons are converted to text before -- the comparison occurs. SELECT a < 40, a < 60, a < 600 FROM t1; 0|1|1 -- TEXT 亲和性会应用于右边的操作数,但是由于它们本来就是TEXT,所以无需操作,不会进行转换。 -- Text affinity is applied to the right-hand operands but since -- they are already TEXT this is a no-op; no conversions occur. SELECT a < '40', a < '60', a < '600' FROM t1; 0|1|1 -- 列"b"是NUMERIC 亲和性,因此会对右边的操作数应用NUMERIC亲和性。由于操作数都已经是NUMERIC了,所以不会应用亲和性,也不会发生转换。所有的值都是按照数值来比较的。 -- Column "b" has numeric affinity and so numeric affinity is applied -- to the operands on the right. Since the operands are already numeric, -- the application of affinity is a no-op; no conversions occur. All -- values are compared numerically. SELECT b < 40, b < 60, b < 600 FROM t1; 0|0|1 -- 右边的操作数会应用NUMERIC亲和性,将其从TEXT转换为INTEGER。然后再进行数值比较操作。 -- Numeric affinity is applied to operands on the right, converting them -- from text to integers. Then a numeric comparison occurs. SELECT b < '40', b < '60', b < '600' FROM t1; 0|0|1 -- 不会发生亲和性转换。右边的值全都是INTEGER存储类,这总是小于左边的TEXT值。 -- No affinity conversions occur. Right-hand side values all have -- storage class INTEGER which are always less than the TEXT values -- on the left. SELECT c < 40, c < 60, c < 600 FROM t1; 0|0|0 -- 不会发生亲和性转换。值会按照TEXT进行比较。 -- No affinity conversions occur. Values are compared as TEXT. SELECT c < '40', c < '60', c < '600' FROM t1; 0|1|1 -- 不会发生亲和性转换。右边的值全都是INTEGER存储类,会直接与左边的INTEGER值进行数值比较。 -- No affinity conversions occur. Right-hand side values all have -- storage class INTEGER which compare numerically with the INTEGER -- values on the left. SELECT d < 40, d < 60, d < 600 FROM t1; 0|0|1 -- 不会发生亲和性转换。左边的INTEGER值总是小于右边的TEXT值。 -- No affinity conversions occur. INTEGER values on the left are -- always less than TEXT values on the right. SELECT d < '40', d < '60', d < '600' FROM t1; 1|1|1
如果交换比较,所有示例的结果都是一样的。例如将表达式"a<40"改为"40>a"。
All of the result in the example are the same if the comparisons are
commuted - if expressions of the form "a<40" are rewritten
as "40>a".
所有数学操作符 (+、 -、 *、 /、 %、 <<、 >>、 & 和 |)都会在进行计算之前将两个操作数转换为NUMERIC存储类。这个转换即使是有损和不可逆的,也会进行转换。
如果数学操作符的操作数是NULL那么会得出NULL的结果。
如果数学操作符的一个操作数任何方式都无法转换成数字,并且不是NULL,那么会被转换为0或0.0。
All mathematical operators (+, -, *, /, %, <<, >>,
&, and |)
cast both operands to the NUMERIC storage class prior to being carried out.
The cast is carried through even if it is lossy and irreversible.
A NULL operand on a mathematical operator yields a NULL result.
An operand on a mathematical operator that does not look in any way
numeric and is not NULL is converted to 0 or 0.0.
当使用ORDER BY子句对查询结果进行排序时,首先是NULL存储类的值,其次是按照数字排序的INTEGER和REAL值,然后是依照排序序列排列TEXT值,最后是按照memcmp()排序的BLOB值。在排序前不会发生存储类的转换。
When query results are sorted by an ORDER BY clause, values with storage
class NULL come first, followed by INTEGER and REAL values
interspersed in numeric order, followed by TEXT values in collating
sequence order, and finally BLOB values in memcmp() order. No storage
class conversions occur before the sort.
当使用GROUP BY子句对值进行分组时,不同存储类的值会被单独分开,除了INTEGER和REAL值,当二者的值在数值上是相等的,则会被当做相同值。在GROUP BY子句的结果中不会对任何值应用亲和性。
When grouping values with the GROUP BY clause values with
different storage classes are considered distinct, except for INTEGER
and REAL values which are considered equal if they are numerically
equal. No affinities are applied to any values as the result of a
GROUP by clause.
复合SELECT操作符UNION、INTERSECT和EXCEPT操作是隐含了值之间的比较操作的。
在UNION、INTERSECT或EXCEPT这些隐式的比较操作中,是不会对操作数应用亲和性的——意思就是说值的比较是与is相同的。
The compound SELECT operators UNION
INTERSECT and EXCEPT perform implicit comparisons between values.
No affinity is applied to comparison operands for the implicit
comparisons associated with UNION, INTERSECT, or EXCEPT - the values
are compared as is.
当SQLite比较两个字符串时,会使用一个排序序列或者排序方法(同一东西的不同说法)来决定那个字符串更大或者两个字符串相同。
SQLite有三个内置的排序函数:BINARY、NOCASE和RTRIM。
When SQLite compares two strings, it uses a collating sequence or
collating function (two words for the same thing) to determine which
string is greater or if the two strings are equal.
SQLite has three built-in collating functions: BINARY, NOCASE, and
RTRIM.
应用程序可以使用sqlite3_create_collation()接口来注册附加的排序函数。
An application can register additional collating functions using
the sqlite3_create_collation() interface.
每个表的每一列都会被分配一个排序函数。如果没有明确定义排序函数,那么会默认使用BINARY排序函数。列定义中的COLLATE子句是用来定义该列的排序函数的。
Every column of every
table has an associated collating function. If no collating function
is explicitly defined, then the collating function defaults to BINARY.
The COLLATE clause of the column definition is used
to define alternative collating functions for a column.
决定对于一个二元比较操作符(=、 <、 >、 <=、 >=、 !=、 IS 和 IS NOT)使用哪个排序函数是按照下列规则顺序来进行的。
The rules for determining which collating function to use for a
binary comparison operator (=, <, >, <=, >=, !=, IS, and
IS NOT) are as follows and in the order shown:
如果某个操作数都使用COLLATE 操作符后缀明确指定了排序函数,那么使用指定的排序函数来进行比较。选择时会优先使用左边操作数指定的排序函数。
If either operand has an explicit collating function assignment
using the postfix COLLATE operator, then the explicit collating function
is used for comparison, with precedence to the collating function of the
left operand.
如果某个操作数是一个列,那么会依照左侧操作数优先的原则使用这一列的排序函数。由于历史语句的原因,一个前缀了一个或多个一元"+"操作符的列名依然被当做为一个列名。
If either operand is a column, then the collating function of
that column is used with precedence to the left operand.
For the purposes of the previous sentence, a column name
preceded by one or more unary "+" operators is still considered a column name.
否则使用BINARY排序函数进行比较。
Otherwise, the BINARY collating function is used for comparison.
如果比较操作中操作数的任意字表达式使用了COLLATE 操作符后缀,那么就认为这个操作数明确指定了排序函数(规则1)。
因此,只要在比较表达式的任意地方使用了COLLATE 操作符,那么,不管表达式中是否包含表列,都会使用该操作符定义的排序函数来进行字符串比较。
如果在比较表达式中出现了两个或多个COLLATE 操作符子表达式,那么,总是会使用最左边操作符指定的排序函数,而不关心这个操作符在表达式中被嵌套了多少次或者包含了括号等。
An operand of a comparison is considered to have an explicit
collating function assignment (rule 1 above)
if any subexpression of the operand uses
the postfix COLLATE operator. Thus, if a COLLATE operator is used
anywhere in a comparision expression, the collating function defined
by that operator is used for string comparison regardless of what
table columns might be a part of that expression. If two or more
COLLATE operator subexpressions appear anywhere in a comparison, the
left most explicit collating function is used regardless of how deeply the
COLLATE operators are nested in the expression and regardless of
how the expression is parenthesized.
表达式"x BETWEEN y and z"在逻辑上等于两个比较操作"x >= y AND x <= z" ,并且在执行是处理排序函数也是当做两个独立比较来对待的。
表达式"x IN (SELECT y ...)"处理选择排序序列的的方式与表达式"x = "是一样的。
用于"x IN (y, z, ...)"这种形式表达式的排序序列是来自x的排序序列。
The expression "x BETWEEN y and z" is logically
equivalent to two comparisons "x >= y AND x <= z" and works with
respect to collating functions as if it were two separate comparisons.
The expression "x IN (SELECT y ...)" is handled in the same way as the
expression "x = y" for the purposes of determining the collating sequence.
The collating sequence used for expressions of the form
"x IN (y, z, ...)" is the collating sequence of x.
SELECT语句中的ORDER BY部分也可以通过COLLATE 操作符来制定一个排序序列,这里指定的排序序列是用于排序的。
否则,如果ORDER BY子句中排列的表达式是一个列,那么使用这一列的排序序列来决定排序的顺序。
如果表达式不是一个列,并且没有COLLATE子句,那么使用BINARY排序序列。
Terms of the ORDER BY clause that is part of a SELECT
statement may be assigned a collating sequence using the
COLLATE operator, in which case the specified collating function is
used for sorting.
Otherwise, if the expression sorted by an ORDER BY clause is
a column, then the collating sequence of the column is used to
determine sort order. If the expression is not a column and has no
COLLATE clause, then the BINARY collating sequence is used.
排序序列标志下的例子将使用其来决定各种SQL语句中执行的文本排序的结果。
注意,在示例NUMERIC、BLOB或NULL值中文本比较操作可能不是必须的,并且没有使用排序序列。
The examples below identify the collating sequences that would be used to
determine the results of text comparisons that may be performed by various
SQL statements. Note that a text comparison may not be required, and no
collating sequence used, in the case of numeric, blob or NULL values.
CREATE TABLE t1( x INTEGER PRIMARY KEY, a, /* BINARY 排序序列 collating sequence BINARY */ b COLLATE BINARY, /* BINARY 排序序列 collating sequence BINARY */ c COLLATE RTRIM, /* RTRIM 排序序列 collating sequence RTRIM */ d COLLATE NOCASE /* NOCASE 排序序列 collating sequence NOCASE */ ); /* x a b c d */ INSERT INTO t1 VALUES(1,'abc','abc', 'abc ','abc'); INSERT INTO t1 VALUES(2,'abc','abc', 'abc', 'ABC'); INSERT INTO t1 VALUES(3,'abc','abc', 'abc ', 'Abc'); INSERT INTO t1 VALUES(4,'abc','abc ','ABC', 'abc'); /* 使用BINARY排序序列来执行文本比较a=b。 ** Text comparison a=b is performed using the BINARY collating sequence. */ SELECT x FROM t1 WHERE a = b ORDER BY x; --result 1 2 3 /* 使用RTRIM排序序列来执行文本比较a=b。 ** Text comparison a=b is performed using the RTRIM collating sequence. */ SELECT x FROM t1 WHERE a = b COLLATE RTRIM ORDER BY x; --result 1 2 3 4 /* 使用NOCASE排序序列来执行文本比较d=a ** Text comparison d=a is performed using the NOCASE collating sequence. */ SELECT x FROM t1 WHERE d = a ORDER BY x; --result 1 2 3 4 /* 使用BINARY排序序列来执行文本比较a=d。 ** Text comparison a=d is performed using the BINARY collating sequence. */ SELECT x FROM t1 WHERE a = d ORDER BY x; --result 1 4 /* 使用RTRIM排序序列来执行文本比较'abc'=c。 ** Text comparison 'abc'=c is performed using the RTRIM collating sequence. */ SELECT x FROM t1 WHERE 'abc' = c ORDER BY x; --result 1 2 3 /* 使用RTRIM排序序列来执行文本比较c='abc'。 ** Text comparison c='abc' is performed using the RTRIM collating sequence. */ SELECT x FROM t1 WHERE c = 'abc' ORDER BY x; --result 1 2 3 /* 使用NOCASE排序序列来执行分组(值'abc'、'ABC'和'Abc'都会分在同一组里)。 ** Grouping is performed using the NOCASE collating sequence (Values ** 'abc', 'ABC', and 'Abc' are placed in the same group). */ SELECT count(*) FROM t1 GROUP BY d ORDER BY 1; --result 4 /* 使用BINARY排序序列来执行分组(值'abc'、'ABC'和'Abc'都会分在不同组里)。 ** Grouping is performed using the BINARY collating sequence. 'abc' and ** 'ABC' and 'Abc' form different groups */ SELECT count(*) FROM t1 GROUP BY (d || '') ORDER BY 1; --result 1 1 2 /* 使用RTRIM排序序列来执行c列排序。 ** Sorting of column c is performed using the RTRIM collating sequence. */ SELECT x FROM t1 ORDER BY c, x; --result 4 1 2 3 /* 使用BINARY排序序列来执行(c||'')排序 ** Sorting of (c||'') is performed using the BINARY collating sequence. */ SELECT x FROM t1 ORDER BY (c||''), x; --result 4 2 3 1 /* 使用NOCASE排序序列来执行c列排序。 ** Sorting of column c is performed using the NOCASE collating sequence. */ SELECT x FROM t1 ORDER BY c COLLATE NOCASE, x; --result 2 4 3 1