How to Implement ProxySQL with AWS Aurora
In this post, we'll look at how to implement ProxySQL with AWS Aurora. Recently, there have been a few discussions and customer requests that focused on AWS Aurora and how to make the various architectures and solutions more flexible. Flexible how, you may ask? Well, there are the usual expectations:
- How do you improve resource utilization?
- How can I filter (or block) things?
- Can I shard with Aurora?
- What is the best way to implement query caching?
- … and more.
The inclusion of ProxySQL solves many of the points above. We in Consulting design solutions for our customers by applying the different functionalities to better match their needs. Whenever we dealt with Aurora, we had to exclude ProxySQL because of some limitations in the software. Now, however, ProxySQL 2.0 supports Aurora, and it does it amazingly well. This article will show you HOW to implement ProxySQL with AWS Aurora. The next article, Leveraging AWS Aurora Performance, will show you WHY.
The Problem
ProxySQL has two different ways to deal with backend servers. One uses replication mechanisms, such as standard asynchronous replication and Group Replication. The other uses the scheduler, as in the case of Percona XtraDB Cluster, MariaDB Cluster, etc.

While we could have used the scheduler for Aurora, it is not as immediate and well-integrated as the embedded support for replication, so we normally opted not to use it in this specific case. But what WAS the problem with Aurora? An Aurora cluster distinguishes the Writer from the Readers using the innodb_read_only variable. So, where is the problem? Actually, no problem at all, except that ProxySQL, up to version 2, only supported the generic variable read_only for replication. As such, it was not able to correctly identify the Writer/Reader set.
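You can see the distinction directly on the instances. A minimal sketch, assuming the instance endpoints used later in this post and a user that exists on the backends; the writer reports innodb_read_only = 0, while the replicas report 1:

```bash
# Check which instances report themselves as read-only at the InnoDB level.
for host in proxysqltestdb proxysqltestdb2 proxysqltestdb-eu-central-1b; do
  mysql -h "${host}.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com" \
        -u m8_test -ptest \
        -e "SELECT @@hostname, @@innodb_read_only;"
done
```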
The Solution
In October 2017, this issue was opened (https://github.com/sysown/proxysql/issues/1195), and the result was, as usual, a quite simple and flexible solution:
“Brainstorming, a possible solution could be to add another column in mysql_replication_hostgroups to specify what needs to be checked, either read_only or innodb_read_only, or even super_read_only”
This led to the ProxySQL team delivering (commit fe2f16d6df15252f0107a6a224dad7b1efdb13f6):
Added support for innodb_read_only and super_read_only

```sql
-- New definition of mysql_replication_hostgroups
-- (C macro MYHGM_MYSQL_REPLICATION_HOSTGROUPS in the ProxySQL source):
CREATE TABLE mysql_replication_hostgroups (
    writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY,
    reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>=0),
    check_type VARCHAR CHECK (LOWER(check_type) IN ('read_only','innodb_read_only','super_read_only')) NOT NULL DEFAULT 'read_only',
    comment VARCHAR NOT NULL DEFAULT '',
    UNIQUE (reader_hostgroup)
)
```
In short, they added a new column, check_type, to the mysql_replication_hostgroups table. ProxySQL continues to behave exactly the same and manages the servers and the replication groups as usual. No need for scripts or other crazy stuff.
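If you want to confirm that your ProxySQL build carries the new column, you can inspect the table from the admin interface. A quick sketch, assuming the default admin credentials and port:

```bash
# Show the current definition of mysql_replication_hostgroups,
# including the check_type column added in ProxySQL 2.0.
mysql -u admin -padmin -h 127.0.0.1 -P 6032 \
      -e "SHOW CREATE TABLE mysql_replication_hostgroups\G"
```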
Implementation
Here we are, the HOW TO part. The first thing to keep in mind is that when you implement a new Aurora cluster, you should always consider having at least two instances in the same AZ and another instance in a remote AZ. To implement ProxySQL, you must refer directly to the instances, NOT to the cluster entry-point. To be clear, you need to collect the endpoint of each instance.
The information is available in the web-admin interface, under the instance, or using the command:

```bash
aws rds describe-db-instances
```

And filter the result for:

```json
"Endpoint": {
    "Port": 3306,
    "Address": "proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com"
},
```
To run ProxySQL with RDS in general, you need to install it on an intermediate server or on the application box. Once you decide which one fits your setup better, download or git clone ProxySQL v2.0+. DO NOT use v1.4.x; it does not contain these new features and will not work as expected. Once you have all the Aurora instances up, it is time to configure ProxySQL.
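A minimal sketch of building from source, assuming a Debian/Ubuntu host; the prerequisite package list is approximate, so check the project README for your platform:

```bash
# Build ProxySQL 2.0+ from source; v1.4.x lacks the check_type support.
sudo apt-get install -y git make gcc g++ automake libtool libssl-dev
git clone https://github.com/sysown/proxysql.git
cd proxysql
git checkout v2.0.0   # or any 2.0+ tag/branch
make
sudo make install
```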
Below is an example of all the commands used during the installation:
```sql
-- On the Aurora writer: create the user ProxySQL's monitor module will use.
GRANT USAGE, REPLICATION CLIENT ON *.* TO monitor@'%' IDENTIFIED BY 'monitor';

-- From here on, run everything against the ProxySQL admin interface (port 6032).
-- Define the servers: hostgroup 70 = writer, hostgroup 71 = readers.
DELETE FROM mysql_servers WHERE hostgroup_id IN (70,71);
DELETE FROM mysql_replication_hostgroups WHERE writer_hostgroup=70;
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',70,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb2.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb-eu-central-1b.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1,2000);

-- The key step: check_type='innodb_read_only' tells ProxySQL how to
-- identify the Aurora writer.
INSERT INTO mysql_replication_hostgroups (writer_hostgroup,reader_hostgroup,comment,check_type) VALUES (70,71,'aws-aurora','innodb_read_only');
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;

-- Query rules implementing the W/R split for the user m8_test.
DELETE FROM mysql_query_rules WHERE rule_id IN (50,51,52);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES (50,6033,'m8_test',70,0,3,'.',1);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES (51,6033,'m8_test',70,1,3,'^SELECT.*FOR UPDATE',1);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES (52,6033,'m8_test',71,1,3,'^SELECT.*$',1);
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;

-- The application user.
DELETE FROM mysql_users WHERE username='m8_test';
INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) VALUES ('m8_test','test',1,70,'mysql',1);
LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;

-- A couple of global settings.
UPDATE global_variables SET variable_value='67108864' WHERE variable_name='mysql-max_allowed_packet';
UPDATE global_variables SET variable_value=0 WHERE variable_name='mysql-hostgroup_manager_verbose';
LOAD MYSQL VARIABLES TO RUNTIME; SAVE MYSQL VARIABLES TO DISK;
```
The above will give you a ready-to-go ProxySQL setup that supports an Aurora cluster, performing all the usual operations ProxySQL does, including a proper W/R split and more, for a user named 'm8_test'. The key is passing the value 'innodb_read_only' for the check_type column in the mysql_replication_hostgroups table. To check the status of your ProxySQL, you can use this command (which gives you a live snapshot of what is happening):
```bash
watch -n 1 'mysql --defaults-file=~/.my.cnf -h 127.0.0.1 -P 6032 -t -e "select b.weight, c.* from stats_mysql_connection_pool c left JOIN runtime_mysql_servers b ON c.hostgroup=b.hostgroup_id and c.srv_host=b.hostname and c.srv_port = b.port where hostgroup in (50,52,70,71) order by hostgroup,srv_host desc;" -e "select srv_host,command,avg(time_ms), count(ThreadID) from stats_mysql_processlist group by srv_host,command;" -e "select * from stats_mysql_users;"; mysql --defaults-file=~/.my.cnf -h 127.0.0.1 -P 6032 -t -e "select * from stats_mysql_global "|egrep -i "(mirror|memory|stmt|processor)"'
```

The output will look something like this:

```
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host                                                                 | srv_port | STATUS | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 1000   | 70        | proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com               | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 5491       |
| 1000   | 71        | proxysqltestdb2.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com              | 3306     | ONLINE | 0        | 5        | 5      | 0       | 5           | 73      | 0                 | 5483            | 28442           | 881        |
| 1000   | 71        | proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com               | 3306     | ONLINE | 0        | 5        | 5      | 0       | 5           | 82      | 0                 | 6203            | 32217           | 5491       |
| 1      | 71        | proxysqltestdb-eu-central-1b.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 1593       |
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
+----------+----------------------+--------------------------+
| username | frontend_connections | frontend_max_connections |
+----------+----------------------+--------------------------+
| m8_test  | 0                    | 10000                    |
+----------+----------------------+--------------------------+
| Query_Processor_time_nsec  | 0       |
| Com_backend_stmt_prepare   | 0       |
| Com_backend_stmt_execute   | 0       |
| Com_backend_stmt_close     | 0       |
| Com_frontend_stmt_prepare  | 0       |
| Com_frontend_stmt_execute  | 0       |
| Com_frontend_stmt_close    | 0       |
| Mirror_concurrency         | 0       |
| Mirror_queue_length        | 0       |
| SQLite3_memory_bytes       | 2652288 |
| ConnPool_memory_bytes      | 712720  |
| Stmt_Client_Active_Total   | 0       |
| Stmt_Client_Active_Unique  | 0       |
| Stmt_Server_Active_Total   | 0       |
| Stmt_Server_Active_Unique  | 0       |
| Stmt_Max_Stmt_id           | 1       |
| Stmt_Cached                | 0       |
| Query_Cache_Memory_bytes   | 0       |
```
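To double-check how ProxySQL classified the instances, you can also query the runtime tables from the admin interface. A small sketch, assuming the default admin credentials on port 6032:

```bash
# Hostgroup 70 should contain only the current writer;
# hostgroup 71 should list the readers.
mysql -u admin -padmin -h 127.0.0.1 -P 6032 -t \
      -e "SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers ORDER BY hostgroup_id;"
```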
At this point, you can connect your application and see how ProxySQL allows you to perform much better than with the native cluster entry point. This will be expanded in the next article, Leveraging AWS Aurora Performance.
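Pointing an application at ProxySQL is just a matter of using ProxySQL's MySQL-facing port (6033 in this setup) with the user defined above. A quick sanity check from the shell; given query rule 52, this SELECT should land on a reader:

```bash
# SELECTs are routed to hostgroup 71 (readers) by rule 52,
# so this should return a replica's hostname and innodb_read_only = 1.
mysql -u m8_test -ptest -h 127.0.0.1 -P 6033 \
      -e "SELECT @@hostname, @@innodb_read_only;"
```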
Conclusions
I had my first issue with the native Aurora connector a long time ago, but I had nothing to replace it with. ProxySQL is a very good alternative to standard cluster access: it offers more options and controls, and it allows us to perform close-to-application caching, which is much more efficient than remote caching in MySQL (http://www.proxysql.com/blog/scaling-with-proxysql-query-cache). In the next article I will illustrate how, in a simple setup, ProxySQL can help achieve better results than using the default Aurora cluster endpoint.