MySQL分区如何处理NULL
在MySQL中进行分区并不能NULL
作为分区表达式的值,无论它是列值还是用户提供的表达式的值。即使允许将其NULL
用作必须以其他方式产生整数的表达式的值,但切记它NULL
不是数字,这一点很重要。MySQL的分区实现NULL
就像对待任何非NULL
值一样对待ORDER BY
。
这意味着NULL
在不同类型的分区之间对的处理会有所不同,并且如果您没有为此做好准备,可能会产生您不希望的行为。在这种情况下,我们将在本节中讨论每种MySQL分区类型NULL
在确定应存储行的分区时如何处理值,并提供每种示例。
使用RANGE分区处理NULL。如果将行插入到按分区的表中RANGE
,则用于确定该分区的列值是NULL
,则该行将插入到最低的分区中。在名为的数据库中考虑以下两个表p
:
mysql>CREATE TABLE t1 ( -> c1 INT, -> c2 VARCHAR(20) -> ) ->PARTITION BY RANGE (c1) ( ->PARTITION p0VALUES LESS THAN (0), ->PARTITION p1VALUES LESS THAN (10), ->PARTITION p2VALUES LESS THAN MAXVALUE -> ); Query OK, 0 rows affected (0.09 sec) mysql>CREATE TABLE t2 ( -> c1 INT, -> c2 VARCHAR(20) -> ) ->PARTITION BY RANGE (c1) ( ->PARTITION p0VALUES LESS THAN (-5), ->PARTITION p1VALUES LESS THAN (0), ->PARTITION p2VALUES LESS THAN (10), ->PARTITION p3VALUES LESS THAN MAXVALUE -> ); Query OK, 0 rows affected (0.09 sec)
您可以对数据库CREATE TABLE
中的PARTITIONS
表使用以下查询来参见由这两个语句创建的分区INFORMATION_SCHEMA
:
mysql>SELECT TABLE_NAME , PARTITION_NAME, TABLE_ROWS,AVG_ROW_LENGTH , DATA_LENGTH >FROM INFORMATION_SCHEMA.PARTITIONS >WHERE TABLE_SCHEMA = 'p' ANDTABLE_NAME LIKE 't_'; +------------ +---------------- +------------ +---------------- +------------- + | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------ +---------------- +------------ +---------------- +------------- + | t1 | p0 | 0 | 0 | 0 | | t1 | p1 | 0 | 0 | 0 | | t1 | p2 | 0 | 0 | 0 | | t2 | p0 | 0 | 0 | 0 | | t2 | p1 | 0 | 0 | 0 | | t2 | p2 | 0 | 0 | 0 | | t2 | p3 | 0 | 0 | 0 | +------------ +---------------- +------------ +---------------- +------------- + 7 rows in set (0.00 sec)
(有关此表的更多信息,请参见“ INFORMATION_SCHEMA PARTITIONS表”。)现在,让我们使用包含NULL
在用作分区键的列中的一行来填充每个表,并验证是否插入了行使用一对SELECT
语句:
mysql>INSERT INTO t1VALUES (NULL, 'mothra'); Query OK, 1 row affected (0.00 sec) mysql>INSERT INTO t2VALUES (NULL, 'mothra'); Query OK, 1 row affected (0.00 sec) mysql>SELECT *FROM t1; +------ +-------- + | id | name | +------ +-------- + | NULL | mothra | +------ +-------- + 1 row in set (0.00 sec) mysql>SELECT *FROM t2; +------ +-------- + | id | name | +------ +-------- + | NULL | mothra | +------ +-------- + 1 row in set (0.00 sec)
您可以通过针对之前的查询重新运行INFORMATION_SCHEMA.PARTITIONS
并检查输出,来参见哪些分区用于存储插入的行:
mysql>SELECT TABLE_NAME , PARTITION_NAME, TABLE_ROWS,AVG_ROW_LENGTH , DATA_LENGTH >FROM INFORMATION_SCHEMA.PARTITIONS >WHERE TABLE_SCHEMA = 'p' ANDTABLE_NAME LIKE 't_'; +------------ +---------------- +------------ +---------------- +------------- + | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------ +---------------- +------------ +---------------- +------------- + | t1 | p0 | 1 | 20 | 20 | | t1 | p1 | 0 | 0 | 0 | | t1 | p2 | 0 | 0 | 0 | | t2 | p0 | 1 | 20 | 20 | | t2 | p1 | 0 | 0 | 0 | | t2 | p2 | 0 | 0 | 0 | | t2 | p3 | 0 | 0 | 0 | +------------ +---------------- +------------ +---------------- +------------- + 7 rows in set (0.01 sec)
您还可以通过删除以下分区,然后重新运行以下SELECT
语句,来证明这些行存储在每个表的编号最小的分区中:
mysql>ALTER TABLE t1DROP PARTITION p0; Query OK, 0 rows affected (0.16 sec) mysql>ALTER TABLE t2DROP PARTITION p0; Query OK, 0 rows affected (0.16 sec) mysql>SELECT *FROM t1; Empty set (0.00 sec) mysql>SELECT *FROM t2; Empty set (0.00 sec)
(有关的更多信息ALTER TABLE ... DROP PARTITION
,请参见“ ALTER TABLE语句”。)
NULL
对于使用SQL函数的表达式进行分区也以这种方式处理。假设我们使用如下CREATE TABLE
语句来定义表:
CREATE TABLE tndate ( id INT, dt DATE )PARTITION BY RANGE ( YEAR(dt) ) (PARTITION p0VALUES LESS THAN (1990),PARTITION p1VALUES LESS THAN (2000),PARTITION p2VALUES LESS THAN MAXVALUE );
与其他MySQL函数一样,YEAR(NULL)
return NULL
。用一排dt
的列值NULL
被视为虽然分隔表情评估的值小于任何其它的值,所以被插入到分区p0
。
使用LIST分区处理NULL。当且仅当使用包含的值列表定义其分区之一时,由分区的表才LIST
允许输入NULL
值NULL
。与此相反的是,LIST
未NULL
在值列表中显式使用的分区表将拒绝产生NULL
分区表达式值的行,如本示例所示:
mysql>CREATE TABLE ts1 ( -> c1 INT, -> c2 VARCHAR(20) -> ) ->PARTITION BY LIST (c1) ( ->PARTITION p0VALUES IN (0, 3, 6), ->PARTITION p1VALUES IN (1, 4, 7), ->PARTITION p2VALUES IN (2, 5, 8) -> ); Query OK, 0 rows affected (0.01 sec) mysql>INSERT INTO ts1VALUES (9, 'mothra'); ERROR 1504 (HY000): Table has no partition for value 9 mysql>INSERT INTO ts1VALUES (NULL, 'mothra'); ERROR 1504 (HY000): Table has no partition for value NULL
只能将c1
值介于0
和之间的行8
插入ts1
。NULL
就像数字一样落在这个范围之外9
。我们可以创建表ts2
并使ts3
值列表包含NULL
,如下所示:
mysql>CREATE TABLE ts2 ( -> c1 INT, -> c2 VARCHAR(20) -> ) ->PARTITION BY LIST (c1) ( ->PARTITION p0VALUES IN (0, 3, 6), ->PARTITION p1VALUES IN (1, 4, 7), ->PARTITION p2VALUES IN (2, 5, 8), ->PARTITION p3VALUES IN (NULL) -> ); Query OK, 0 rows affected (0.01 sec) mysql>CREATE TABLE ts3 ( -> c1 INT, -> c2 VARCHAR(20) -> ) ->PARTITION BY LIST (c1) ( ->PARTITION p0VALUES IN (0, 3, 6), ->PARTITION p1VALUES IN (1, 4, 7, NULL), ->PARTITION p2VALUES IN (2, 5, 8) -> ); Query OK, 0 rows affected (0.01 sec)
在定义用于分区的值列表时,您可以(并且应该)像对待NULL
其他任何值一样对待。例如,无论是VALUES IN(NULL)
和VALUES IN(1, 4, 7, NULL)
是有效的,因为是VALUES IN(1, NULL, 4, 7)
,VALUES IN(NULL, 1, 4, 7)
等。您可以在每个表和中插入具有NULL
for列的行:c1
ts2
ts3
mysql>INSERT INTO ts2VALUES (NULL, 'mothra'); Query OK, 1 row affected (0.00 sec) mysql>INSERT INTO ts3VALUES (NULL, 'mothra'); Query OK, 1 row affected (0.00 sec)
通过针对发出适当的查询INFORMATION_SCHEMA.PARTITIONS
,您可以确定使用了哪些分区来存储刚插入的行(与前面的示例一样,我们假设分区表是在p
数据库中创建的):
mysql>SELECT TABLE_NAME , PARTITION_NAME, TABLE_ROWS,AVG_ROW_LENGTH , DATA_LENGTH >FROM INFORMATION_SCHEMA.PARTITIONS >WHERE TABLE_SCHEMA = 'p' ANDTABLE_NAME LIKE 'ts_'; +------------ +---------------- +------------ +---------------- +------------- + | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------ +---------------- +------------ +---------------- +------------- + | ts2 | p0 | 0 | 0 | 0 | | ts2 | p1 | 0 | 0 | 0 | | ts2 | p2 | 0 | 0 | 0 | | ts2 | p3 | 1 | 20 | 20 | | ts3 | p0 | 0 | 0 | 0 | | ts3 | p1 | 1 | 20 | 20 | | ts3 | p2 | 0 | 0 | 0 | +------------ +---------------- +------------ +---------------- +------------- + 7 rows in set (0.01 sec)
如本节前面所示,您还可以通过删除分区,然后执行来验证哪些分区用于存储行SELECT
。
使用HASH和KEY分区处理NULL。NULL
对于由HASH
或分区的表,其处理方式有所不同KEY
。在这些情况下,任何产生NULL
值的分区表达式都将被视为其返回值为零。我们可以通过检查创建文件分区对文件系统的影响来验证此行为,该表由分区HASH
并用包含适当值的记录填充该表。假设您有一个使用以下语句创建的表th
(也在p
数据库中):
mysql>CREATE TABLE th ( -> c1 INT, -> c2 VARCHAR(20) -> ) ->PARTITION BY HASH (c1) ->PARTITIONS 2; Query OK, 0 rows affected (0.00 sec)
可以使用此处显示的查询参见属于该表的分区:
mysql>SELECT TABLE_NAME ,PARTITION_NAME,TABLE_ROWS,AVG_ROW_LENGTH ,DATA_LENGTH >FROM INFORMATION_SCHEMA.PARTITIONS >WHERE TABLE_SCHEMA = 'p' ANDTABLE_NAME ='th'; +------------ +---------------- +------------ +---------------- +------------- + | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------ +---------------- +------------ +---------------- +------------- + | th | p0 | 0 | 0 | 0 | | th | p1 | 0 | 0 | 0 | +------------ +---------------- +------------ +---------------- +------------- + 2 rows in set (0.00 sec)
TABLE_ROWS
每个分区的值为0。现在将两行th
的c1
列值分别为NULL
和0,并验证是否插入了这些行,如下所示:
mysql>INSERT INTO thVALUES (NULL, 'mothra'), (0, 'gigan'); Query OK, 1 row affected (0.00 sec) mysql>SELECT *FROM th; +------ +--------- + | c1 | c2 | +------ +--------- + | NULL | mothra | +------ +--------- + | 0 | gigan | +------ +--------- + 2 rows in set (0.01 sec)
回想一下,对于任何整数N
,的值始终为。对于以或进行分区的表,将此结果用于确定正确的分区为。再次检查表,我们可以看到两行都插入了partition :NULL MOD N
NULL
HASH
KEY
0
INFORMATION_SCHEMA.PARTITIONS
p0
mysql>SELECT TABLE_NAME , PARTITION_NAME, TABLE_ROWS,AVG_ROW_LENGTH , DATA_LENGTH >FROM INFORMATION_SCHEMA.PARTITIONS >WHERE TABLE_SCHEMA = 'p' ANDTABLE_NAME ='th'; +------------ +---------------- +------------ +---------------- +------------- + | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | +------------ +---------------- +------------ +---------------- +------------- + | th | p0 | 2 | 20 | 20 | | th | p1 | 0 | 0 | 0 | +------------ +---------------- +------------ +---------------- +------------- + 2 rows in set (0.00 sec)
通过重复使用最后一个例子PARTITION BY KEY
代替PARTITION BY HASH
在表的定义,你可以验证NULL
也像对待0这种类型的划分。