Scaling MySQL in the Web Environment
What is “sharding”?
Today’s websites like MySpace and Facebook serve as many as 150k requests per second and thus require a well engineered architecture that is scalable and highly available. This is often achieved using a combination of custom designed applications, modified kernels, proxies, specialized network hardware, deployment systems, and creative database configurations. Although all of these components can be challenging to design and implement, the Achilles heel of any large web deployment is scaling the database layer. Scaling MySQL can become very challenging especially with write-intensive applications. Various MySQL replication schemes facilitate scaling database reads quite well. However, scaling writes leaves much to be desired since MySQL replication doesn’t allow writing to multiple database master servers. The MySQL NDB Cluster enables having numerous master servers that can be written to. Its performance, though, is subpar and this product is simply not ready for prime time in a demanding web environment. This leaves architects looking for other solutions to satisfy the write hungry web apps of today.
A more promising alternative employs MySQL “sharding”, a way to distribute MySQL data across numerous redundant database nodes. Each node may consist of many database servers using replication and hosting multiple shards. The web application can access the needed data for read and write queries using a database abstraction layer such as Hive-DB, Hibernate Shards or a custom solution. The abstraction layer sits between the application and database servers and is designed to make the numerous nodes and shards transparent to the application. It also determines where to allocate new data as well as where to retrieve existing data. The “sharding” model enables unlimited scalability on the database layer and is used by many high traffic web sites. Below, I describe a highly available and scalable architecture employing this model that can be used in a production web environment.
Scalable and Highly Available Web Cluster using “sharding”:
Application and Database layers:
The user’s http traffic comes in through the “web” load balancers and gets distributed to the application servers. The application servers are running Apache fronted with the Varnish cache proxy. The Varnish Proxy is used to offload traffic from Apache and serve cached content much faster via the proxy. By using the proxy, we significantly increase the number of requests per second that the application servers can handle and decrease page load times. By monitoring server and network usage with tools like Munin, we can determine when we are getting close to capacity on the application layer and add additional application servers whenever necessary. This architecture gives us the ability to easily scale the application layer and provide fault tolerance if any of the application servers fail.
A typical social networking site has millions of users whose information is kept in a MySQL database along with their posts and comments. This large user base and unpredictable growth potential requires scalability on the database layer. The need for growth on the database layer is fulfilled by using many database nodes where the different users, posts, and comments are distributed using “sharding”.
Every node consists of an “active” and “passive” database master server in a failover configuration using DRBD and Heartbeat. Each node also consists of numerous slave servers that use replication to stay in sync with the master server. Usage of DRBD and Heartbeat creates an active passive configuration where the failed Active Master server will be automatically replaced by the Passive Master server.
All the database nodes are positioned behind the “database” load balancers and each node has a read VIP and a write VIP. The write VIP is used to send queries to the active master server and the read VIP is used to load-balance read queries to the numerous slave servers. This architecture gives capability to scale database reads easily by adding additional slave servers to an existing node or scaling database writes by creating new nodes. Fault tolerance is achieved using DRB/Heartbeat and load balancing to eliminate single points of failure.
“Sharding” and the application:
Now that we have the application servers and numerous database nodes in place we need to find a way for them to work together. The most challenging part of the overall design is facilitating data access, such as allocating new user data and finding existing user data, complicated by the addition of numerous database nodes. We use a database abstraction layer to make these nodes transparent to the application as if there was only one database server. Hive-DB and Hibernate Shards are open source options providing database abstraction layer functionality. One of these or a custom solution enables the application to issue queries while the abstraction layer handles locating the data on the numerous nodes. All the nodes have the same database schema, but different data is distributed throughout all the nodes. Data belonging to different users may be found on different nodes. The abstraction layers is aware of the read and write VIPs for all the nodes and indexes certain values like user-ids in a directory database so that it can use them to locate the records on the different nodes. When “sharding” is implemented correctly it can make scaling on the database layer seamless and allow for moving shards around different nodes as well as rebalancing them.
Although “sharding” is great for scaling on the database layer, there is no out of the box solution that will work with your application. You will need to invest quite a bit of time and effort in customizing the abstraction layer to make it work with your application. With this investment, you will eliminate many technical growing pains and let your business grow.
by Pavel Pragin / pavel@clearscale.net

Recent Comments