The other day, I use pt-online-schema-change to add a column and index to a table, the table is about 1 million rows and the .ibd file size is about 5G Bytes.
I tried several times each time i got an error says the process is aborted duo to the critical load is reached (which is threads_running=60, i always use this and never failed before)
I found something strange using show full process list: there are so many copy threads there like:REPLACE INTO `db`.`_some_table_new` …
running for a long time (tens and hundreds of seconds) and in a status of ‘update’.
As I know the pt-online-schema-change use only one copy threads and I use this tool many times and each time I can only observe one thread chunking and copying but this time everything changed, there were so many copy threads blocking there.
I searched the documents in percona.com and can not find any information about the option about multi-threaded copying.
The table I altered this time is accessed very frequently, but I think this will not cause a problem as long as there is a single copying thread.
Could anyone give me some help, thanks!
The REPLACE statements are run by the triggers created by pt-online-schema-change.
pt-online-schema-change runs only one thread. But the triggers might be executed by many other client threads that are doing INSERT/UPDATE/DELETE on your original table.
pt-online-schema-change is not a magical way to do alter table for free. It has a cost. Every update to that table becomes two updates.
It sounds like your server can’t handle the extra load of running pt-online-schema-change while you have heavy traffic running updates against your original table. The server can’t keep doing these updates as fast as they are executed by your clients. So they fall behind and queue up.
You could upgrade to a more powerful server. More CPU cores and faster I/O system might help. But that will be very expensive.
The simpler solution is to schedule your table changes to a later time when you have lighter traffic.