Percona XtraDB Cluster 8.0.25 has introduced a new option to perform online schema modifications: NBO (Non Blocking Operation).
When using PXC the cluster relies on wsrep_OSU_method parameter to define the Online Schema Upgrade (OSU) method the node uses to replicate DDL statements.
Until now we normally have 3 options:
- Use Total Isolation Order (TOI, the default)
- Use Rolling Schema Upgrade (RSU)
- Use Percona’s online schema change tool (TOI + PTOSC)
Each method has some positive and negative aspects. TOI will lock the whole cluster from being able to accept data modifications for the entire time it takes to perform the DDL operation. RSU will misalign the schema definition between the nodes, and in any case the node performing the DDL operation is still locked. Finally TOI+PTOSC will rely on creating triggers and copying data, so in some cases this can be very impactful.
The new Non Blocking Operation (NBO) method is to help to reduce the impact on the cluster and make it easier to perform some DDL operations.
At the moment we only support a limited set of operations with NBO like:
- ALTER INDEX
- CREATE INDEX
- DROP INDEX
Any other command will result in an error message ER_NOT_SUPPORTED_YET.
But let us see how it works and what is the impact while we will also compare it with the default method TOI.
What we will do is working with 4 connections:
1 to perform ddl
2 to perform insert data in the table being altered
3 to perform insert data on a different table
4-5 checking the other two nodes operations
PXC must be at least Ver 8.0.25-15.1.
The table we will modify is :
DC1-1(root@localhost) [windmills_s]>show create table windmills_test\G *************************** 1. row *************************** Table: windmills_test Create Table: CREATE TABLE `windmills_test` ( `id` bigint NOT NULL AUTO_INCREMENT, `uuid` char(36) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL, `millid` smallint NOT NULL, `kwatts_s` int NOT NULL, `date` date NOT NULL, `location` varchar(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL, `active` tinyint NOT NULL DEFAULT '1', `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `strrecordtype` char(3) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL, PRIMARY KEY (`id`), KEY `IDX_millid` (`millid`,`active`), KEY `IDX_active` (`id`,`active`), KEY `kuuid_x` (`uuid`), KEY `millid_x` (`millid`), KEY `active_x` (`active`) ) ENGINE=InnoDB AUTO_INCREMENT=8199260 DEFAULT CHARSET=utf8mb3 COLLATE=utf8_bin ROW_FORMAT=DYNAMIC 1 row in set (0.00 sec)
And contains ~5 million rows.
DC1-1(root@localhost) [windmills_s]>select count(*) from windmills_test; +----------+ | count(*) | +----------+ | 5002909 | +----------+ 1 row in set (0.44 sec)
The commands.
Connection 1:
ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE; ALTER TABLE windmills_test drop INDEX idx_1, ALGORITHM=INPLACE;
Connection 2:
while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/PXC8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "insert into windmills_test select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills7 limit 1;" -e "select count(*) from windmills_s.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done
Connection 3:
while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/PXC8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "insert into windmills8 select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills7 limit 1;" -e "select count(*) from windmills_s.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done
Connections 4-5:
while [ 1 = 1 ];do echo "$(date +'%T.%3N')";/opt/mysql_templates/PXC8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "show full processlist;"|egrep -i -e "(windmills_test|windmills_s)"|grep -i -v localhost;sleep 1;done
Operations:
- start inserts from connections
- start commands in connections 4 - 5 on the other nodes
- execute:
- for TOI
-
DC1-1(root@localhost) [windmills_s]>SET SESSION wsrep_OSU_method=TOI;
-
- for NBO
-
DC1-1(root@localhost) [windmills_s]>SET SESSION wsrep_OSU_method=NBO;
-
- For both
-
DC1-1(root@localhost) [windmills_s]>ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=shared;
-
- for TOI
Let us run it
Altering a table with TOI.
DC1-1(root@localhost) [windmills_s]>ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE; Query OK, 0 rows affected (1 min 4.74 sec) Records: 0 Duplicates: 0 Warnings: 0
Inserts in the altering table (connection 2):
.450 .492 64.993 <--- Alter blocks all inserts on the table we are altering .788 .609
Inserts on the other table (connection 3):
.455 .461 64.161 <--- Alter blocks all inserts on all the other tables as well .641 .483
On the other nodes at the same time of the ALTER we can see:
Id User db Command Time State Info Time_ms Rows_sent Rows_examined 15 system user windmills_s Query 102 altering table ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE 102238 0 0 <--- time from start
So in short we have the whole cluster locked for ~64 seconds. During this period of time, all the operations to modify data or structure were on hold.
Let us now try with NBO
Inserts in the altering table:
.437 .487 120.758 <---- Execution time increase .617 .510
Inserts on the other table:
.468 .485 25.061 <---- still a metalock, but not locking the other tables for the whole duration .494 .471
On the other nodes at the same time of the ALTER we can see:
Id User db Command Time State Info Time_ms Rows_sent Rows_examined 110068 system user windmills_s Connect 86 altering table ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE 120420 0 0
In this case what is also interesting to note is that:
- We have a moment of metalock:
-
110174 pmm 127.0.0.1:42728 NULL Query 2 Waiting for table metadata lock SELECT x FROM information_schema.tables WHERE TABLE_SCHEMA = 'windmills_s' 1486 10 0
-
110068 system user connecting host windmills_s Connect 111 closing tables ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE 111000 0 0
-
- The execution time is longer
Summarizing:
TOI NBO Time on hold for insert for altering table ~64 sec ~120 sec Time on hold for insert for another table ~64 sec ~25 sec metalock whole time only at the end
What is happening, what are the differences and why takes longer with NBO?
Let see at very high level how the two works:
- TOI: when you issue a DDL like ADD INDEX a metadata lock is taken on the table and it will be released only at the end of the operation. During this time, you cannot:
- Perform DMLs on any cluster node
- Alter another table in the cluster
- NBO: the metadata lock is taken at the start and at the end for a very brief period of time. The ADD INDEX operation will then work on each node independently. The lock taken at the end is to have all the nodes agree on the operation and commit or roll back (using cluster error voting). This final phase costs a bit more in time and is what adds a few seconds to the operation execution. But during the operation:
- You can alter another table (using NBO)
- You can continue to insert data, except in the table(s) you are altering.
- On node crash the operation will continue on the other nodes, and if successful it will persist.
In short the cluster server behavior changes significantly when using NBO, offering significant flexibility compared to TOI. The cost in time should not linearly increase with the dimension of the table, but more in relation to the single node efficiency in performing the ALTER operation.
Conclusion
NBO can be significantly helpful to reduce the impact of DDL on the cluster, for now limited to the widely used creation/modification/drop of an index. But in the future … we may expand it.
The feature is still a technology preview, so do not trust in production, but test it and let us know what you think.
Final comment. Another distribution has introduced NBO, but only if you buy the enterprise version.
Percona, which is truly open source with facts not just words, has implemented NBO in standard PXC, and the code is fully open source. This is not the first one, but just another of the many features Percona is offering for free, while others ask you to buy the enterprise version.
Enjoy the product and let us have your feedback!
Great MySQL to all!