Introducing TenDB Cluster - Daniel Ye - MariaDB Cloud MiniFest 2021
Jun 9, 2021 18:17 · 3281 words · 16 minute read
Hi everyone, this is Daniel Ye from Tencent Games. First up, I want to thank MariaDB Corporation for reaching out and inviting us to make this presentation. Today my topic is to introduce our TenDB Cluster, a distributed database solution based on MariaDB and Spider. First, I want to briefly introduce our team and our company. I'm a database engineer on Tencent Games' cross DBA team. Our team is responsible for providing database services for most of Tencent Games.
We have more than 10 years' experience using and developing MySQL, and more than five years with MariaDB. Over the years we have been actively contributing to and advocating for both MySQL and MariaDB. As for Tencent Games, we're the world's largest game company by revenue; in 2020 our annual revenue reached 24 billion US dollars. These are the contents of my presentation today. First I'll be talking about what TenDB Cluster actually is, and next up are some statistics about TenDB Cluster.
Then I'll do a simple demonstration of deploying and using TenDB Cluster with Docker Compose to give you a better sense of what it does. Then I'll introduce how we're bringing TenDB Cluster to the cloud, namely by running it on Kubernetes. Finally, there is some contact info for our team. So what's TenDB Cluster? Before I introduce TenDB Cluster, I want to first introduce Spider in MariaDB, because it's the basis of TenDB Cluster's architecture. Spider is a database sharding and proxying solution.
It's built into MariaDB as a pluggable storage engine, so it's also called the Spider storage engine. When we create a Spider table, Spider creates table links to one or multiple remote servers, also known as backend data nodes. The records in the Spider table are not actually stored in the Spider instance. Instead, they are stored in the remote servers, as you can see from the picture on the right: one Spider instance in the middle and two remote servers at the bottom.
A DML or DDL operation sent to Spider gets distributed to the backend data nodes for actual execution. As you can see, Spider essentially acts like a proxy server: when you create records in a Spider table, it feels like you're handling a single table on a single instance, but in reality the records could come from multiple remote servers. So Spider offers built-in database sharding and proxying features with a very intuitive interface.
It has proved to be a great solution for load balancing under huge data volumes and heavy write traffic. This is basically the reason why we chose Spider as the core of TenDB Cluster's architecture.
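To make this more concrete, here is a minimal sketch of a plain MariaDB Spider setup (the server name, host, credentials, and table here are placeholders I made up for illustration):

```sql
-- Register a remote backend as a server object.
CREATE SERVER backend1 FOREIGN DATA WRAPPER mysql
OPTIONS (HOST '192.168.0.11', DATABASE 'app',
         USER 'spider_user', PASSWORD 'spider_pass', PORT 3306);

-- Create a Spider table: it stores no rows locally, and reads and
-- writes are forwarded to the table of the same name on backend1.
CREATE TABLE t1 (
  id  INT PRIMARY KEY,
  msg TEXT
) ENGINE=SPIDER
COMMENT='wrapper "mysql", srv "backend1", table "t1"';
```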
03:01 - Now I'm going to talk about what TenDB Cluster actually is and what it can do. As you can see in the picture, a typical TenDB Cluster is made up of three parts. At the center is TSpider, which is in charge of database sharding and query proxying. TSpider is our fork of the Spider storage engine in MariaDB 10.3.7. Over the years we've added tons of features and performance optimizations to TSpider. As I said before, Spider is actually a storage engine inside MariaDB, so it naturally supports the MySQL and MariaDB protocols and can process basically any request using standard MySQL APIs.
When TSpider receives a query from the application or a user, it first looks into its routing tables and then rewrites and distributes the query to the underlying backend data nodes; in our case these are the TenDB instances at the bottom. After all the data nodes finish executing the query, TSpider collects and packs the results and sends them back to the application or user. Since TSpider itself doesn't store actual data records, only database and table structures, it's almost stateless. This allows us to horizontally scale TSpider with very little trouble.
At the bottom is TenDB. This is where the data are actually stored. Note that in TenDB Cluster you don't necessarily have to choose TenDB for your data nodes. It's fine to use MariaDB, MySQL, or even Percona Server, as long as your chosen database understands SQL. Still, we strongly recommend trying TenDB out, because we've developed a lot of useful features for it. Take instant ADD COLUMN, for example: on TenDB you're able to add a column to a large table almost instantly, without waiting several hours for the change to take effect.
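As a hedged sketch of what that looks like (the table and column names are made up; on recent MySQL and MariaDB releases the equivalent is the INSTANT algorithm):

```sql
-- Request an instant column add explicitly; the statement fails
-- rather than silently falling back to a slow table copy.
ALTER TABLE big_table
  ADD COLUMN note VARCHAR(64) DEFAULT '',
  ALGORITHM=INSTANT;
```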
This feature has been merged into recent versions of MySQL and MariaDB. Beyond this, we also have features like blob compression and binlog compression, which can be really helpful in production and for database management. Lastly, on the right is Tdbctl, whose job is to make sure the cluster works correctly. One crucial part of its job is configuring routing within the cluster, that is, maintaining the links between TSpider and the data nodes. Another important task is monitoring the cluster status, making sure things like privileges, routing, and table structures in the cluster are healthy, and reporting errors if they're not.
Now you may be wondering why you should use it. Actually, there are a lot of benefits that come with TenDB Cluster's architecture. The first one is convenience: TenDB Cluster offers great transparency by design. When using TenDB Cluster you don't need to write a complex query specifying the locations of all the shards to create a table. Instead, you can just do it as you would on a single instance, as the sketch below shows. The sharding is done automatically by the cluster, so it's nothing you need to worry about.
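A minimal sketch of that transparency (this example table is mine, not from the talk):

```sql
-- Written exactly as on a single instance; TSpider shards it
-- across the backend data nodes behind the scenes.
CREATE TABLE players (
  player_id BIGINT PRIMARY KEY,
  nickname  VARCHAR(64)
);
```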
Additionally, for sharding strategies, you can choose from various options like hash, range, or list, and you're also allowed to design your own strategy. What's also great about TenDB Cluster is that it supports dynamic scaling. This means the cluster's operation does not necessarily get interrupted while you perform scaling. For TSpider, depending on whether you're scaling up or down, all you need to do is add or remove TSpider nodes, both in your application's name server and in the cluster's routing table.
For TenDB we support vertical scaling. In this case you need to take a hot backup, restore the instances on your new machine, and then modify and flush the cluster routing. As for high availability, it's also easy and simple for TenDB Cluster to recover from failures. For TenDB, if your deployment is a master-slave pair, you only need to redirect the link from TSpider to the healthy slave node in the cluster routing. In the case of GR, that is, group replication, broken nodes are automatically kicked out.
This is also the case for Tdbctl, since Tdbctl is usually deployed with GR. For TSpider, simply cutting access to broken nodes by removing them from the name server and the cluster routing is enough. For now, a heartbeat detection mechanism is integrated into Tdbctl to find and report broken nodes. In future releases Tdbctl will be able to fix them and maintain the cluster routing itself, without needing extra tools.
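To make the failover step concrete, here is a hedged sketch, assuming the routing lives in server objects like the mysql.servers entries shown in the demo later (the server name and host are placeholders):

```sql
-- Repoint the route for a failed master to its healthy slave.
ALTER SERVER backend0 OPTIONS (HOST '192.168.0.21', PORT 20000);
-- Assumption: a flush is needed so Spider drops cached connections
-- and picks up the changed route.
FLUSH TABLES;
```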
Now, here are some statistics about TenDB Cluster. Here at Tencent Games, our TenDB Cluster solution is widely applied to various applications, mostly games. At present there are more than 100 games using our solution, and more than a thousand clusters are deployed. Within those clusters there are thousands of TSpider instances, tens of thousands of TenDB instances, and up to two petabytes of stored data. All these statistics are solid proof of the usability and reliability of our solution, and also motivation for us to keep improving and innovating.
08:50 - Outside Tencent, there are also lots of companies actively adopting our solution. They are mostly companies that Tencent has invested in, covering fields like gaming, finance, and insurance. We sincerely welcome every individual or company to join our community and use our product. It would be a great honor for us, and also a great opportunity to improve and make it better. You can see our official site and GitHub repo at the bottom left.
09:24 - To help you better understand what TenDB Cluster actually is and what it does, we additionally added support for deploying a TenDB Cluster with Docker Compose, to let you get quick hands-on experience. So now I'm going to demonstrate how to deploy TenDB Cluster with Docker Compose, and I'll also perform some simple operations on the cluster. The first thing you want to do is go to TenDB Cluster's main GitHub repository at github.com/tencent/Tendbcluster-tspider. Here you can see all the source code of TSpider, and if you look into the README, there is a quick start section. If you click on the link, it leads you to our official website, where there's a tutorial guiding you through quickly deploying TenDB Cluster with Docker Compose.
10:22 - My demonstration here basically follows this tutorial, and if you have any questions or don't understand any step in the following demo, you can just go to our official website, tendbcluster.com, and check out this tutorial for detailed info. Also, please note that deploying a TenDB Cluster with Docker Compose is just for demonstration purposes: it deploys the whole cluster on a single machine, so it has no practical use in production.
So please don't use it for production. The next thing you want to do is make sure Docker and Docker Compose are properly installed on your machine. Here in my test environment I have Docker version 18.09.7 and Docker Compose 1.28.6. Next I'm going to clone our TenDB Cluster Docker Compose repository. You can see the address here, so I'll save myself the typing, and now I'm going to clone it.
11:37 - It should be done in a few seconds. Here we go. Now let's go into this repository.
11:51 - In this directory, let's zoom in a little bit. You can see several files, and if you look into the config directory, you can see three configuration files, for TenDB, TSpider, and Tdbctl. Among these files the most important one is the Docker Compose file. Let's take a look at it. This file contains all the configuration for our cluster: you can see it contains specs for each of our cluster's nodes. The first one is the spec for our first TenDB instance, followed by the second, third, and fourth instances.
There are four TenDB instances in our cluster, and there are also specs for TSpider. There are two TSpider nodes, and the last one is Tdbctl. Among these specs you can see some important ones, for example the image: this is the image path for our TenDB instance, and you can obtain the image from hub.docker.com. This is the container name; within our Docker network you can actually reach these nodes using the container name as the host. There's also the port forwarding: as you can see, we can access this TenDB instance on port 20000 of our local machine.
Also, here you can see some file mappings, done under the volumes section. First, it maps the TenDB configuration file I showed you before into the Docker container, so when the TenDB instance starts, it can read that file and use it to configure the instance. There's also a mapping for the data directory. When we access the cluster, we use a user called tendbcluster, and the corresponding password is tendbclusterpass.
It's a very simple demonstration, so we're storing them in plain text here, and that's it for this YAML file.
14:25 - Now it's time to start our cluster, and it's very simple to do. All you have to do is type in docker-compose up. Docker Compose will then read the configuration file I introduced before, pull the images, create the containers, start the instances, and configure the routing within the cluster. Now all you have to do is wait for the configuration to finish. You can check the container status with the docker ps command, and when all the containers are healthy like this, the cluster is basically ready to go. Now you can log into the TSpider instance, which is at port 25000, and check the server table, like this. This is the routing within the cluster: as you can see, it contains a unique identifier for each node in the cluster, plus the host and the port. When TSpider receives a query, it looks up this table to find the routes to its backend data nodes and then distributes the query.
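In SQL, that check looks roughly like this (assuming the standard mysql.servers table, which is where Spider keeps its server objects):

```sql
-- On the TSpider node (port 25000 in this demo):
SELECT Server_name, Host, Port FROM mysql.servers;
-- Each row is one node in the cluster; TSpider resolves a shard's
-- route here before forwarding the query.
```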
15:53 - Now I'm going to perform some simple operations on the cluster. First up, I want to create a simple database called mytest, then use it and create a simple table, mytable, which has an INT primary key and a TEXT field. Let's see the table's definition. As you can see, I didn't specify any partition info in my CREATE TABLE query, yet TSpider does the partitioning itself, and the number of partitions depends on the number of backend data nodes in your cluster.
In my case, there are four TenDB instances at the bottom, so TSpider creates four partitions, one on each of the TenDB instances. Now I'm going to insert a record into this table…
17:00 - and then a few more… and then I'll do a select to see the full set of records, and you can see all four of them. All these operations feel almost no different from operating on a single instance, but in fact none of these records are stored on the TSpider instance. Instead, they're stored separately on the backend data nodes, and we can verify that by logging into one of our TenDB instances and checking.
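Reconstructed as SQL, the session on TSpider so far looks roughly like this (the database and table names are from the demo; the column names and inserted values are my assumptions):

```sql
CREATE DATABASE mytest;
USE mytest;

-- An ordinary table definition; TSpider adds the partitioning itself,
-- one partition per backend TenDB instance (four in this demo).
CREATE TABLE mytable (
  id  INT PRIMARY KEY,
  msg TEXT
);

INSERT INTO mytable VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
SELECT * FROM mytable;  -- returns all four rows, as if from one instance
```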
17:42 - Now I’m going to log into one of the TenDB instances that’s at port 20000…
17:52 - Let's see the databases. You can see mytest with an underscore zero; that means this is the first shard among those TenDB instances. Let's see its contents…
18:11 - and then we do a select… As you can see, there's only one record on this shard, which apparently means the other three records are on the other TenDB instances. Now we go back to our TSpider instance.
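Before we continue, for reference, the shard-side check we just ran looks roughly like this (the _0 suffix is from the demo; that the shard-local table keeps the name mytable is my assumption):

```sql
-- On the TenDB instance at port 20000:
SHOW DATABASES;        -- lists mytest_0, the first shard's database
USE mytest_0;
SELECT * FROM mytable; -- only one of the four rows lives on this shard
```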
18:39 - Now we can perform some other DML operations. For example, I can delete the record where the key is 4, and I can update a record, setting the second field to "hello world" where the key is 3. Then we do a select…
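In SQL (with the column names assumed above), those statements are roughly:

```sql
DELETE FROM mytable WHERE id = 4;
UPDATE mytable SET msg = 'hello world' WHERE id = 3;
SELECT * FROM mytable;  -- three rows remain, with row 3 updated
```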
19:06 - As you can see, these operations were successfully executed on our backend data nodes; otherwise we wouldn't be able to see the changes from TSpider. We can also do some DDL operations, like adding a column. Here I'm going to add a FLOAT column to this table with a default value of the famous pi.
19:35 - It’s executed successfully and now let’s see the table’s definition.
19:47 - There you go, and now let’s do a select to see if the records have the default value.
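That DDL, reconstructed (the column name and the truncation of pi are my assumptions):

```sql
ALTER TABLE mytable ADD COLUMN pi FLOAT DEFAULT 3.14159;
-- The new column shows up in the table definition...
SHOW CREATE TABLE mytable\G
-- ...and existing rows return the default value.
SELECT * FROM mytable;
```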
19:57 - And there it is. So that's it for the demonstration part. What I demonstrated was merely the tip of the iceberg of what TenDB Cluster can do, so we strongly recommend you try deploying TenDB Cluster with Docker Compose like I did, because it's the easiest way to understand what it does, and whether or not it does the job you want it to do. So if you're interested, please make sure to check out our official website
tendbcluster.com for full tutorials and manuals. And there's one more thing. We're a team that never stops innovating. We sometimes look carefully at TenDB Cluster and wonder: can we make it better, or can we use it better? Then we propose lots of what-ifs. Like: what if we could deploy a multi-machine, production-ready cluster with just one command? What if we could scale our cluster at will with just one command? What if we could allocate computing and storage resources in just about the amounts the application demands?
What if the cluster could automatically decide which nodes get deployed on which machines to provide the best performance and reliability? And what if we could achieve fully automatic high availability and let the cluster recover from a crash by itself? These are very practical questions, and solving them could really boost the efficiency of our production. At present there's a wonderful solution to these questions: the cloud. To put TenDB Cluster on the cloud, we decided to bring it to Kubernetes using our own TenDB Cluster operator.
Since our presentation time is limited, I'll skip introducing Kubernetes and assume you know what it is and what its main components are. Basically, our TenDB Cluster operator does everything needed to make our cluster work well on Kubernetes, whether that's creating a cluster or scaling one. All we need to do is pass a YAML file describing the configuration to one command and let the operator do the rest. With modern containers, we're able to precisely allocate resources like CPU cores or disk space to fulfill different clusters' needs.
With Kubernetes' container orchestration abilities, we're able to design rules so our operator finds the best way to deploy nodes. It's also easier for us to automate crash recovery in our cluster with Kubernetes' ability to monitor and manage containers.
22:34 - Advanced resource management, a high degree of automation, and cutting cluster management costs as much as possible are the main reasons we're bringing TenDB Cluster to Kubernetes. At present, the TenDB Cluster operator supports basic cluster operations, including deployment, scaling, and crash recovery. In the near future, we'll focus on completing and improving the TenDB Cluster operator, making TenDB Cluster on Kubernetes a reliable solution for production.
23:06 - Again, you're always welcome to join our community and give TenDB Cluster a try. Your opinions and feedback are exactly what we're looking for. There are many ways to get info on our product and contact us. You can visit our official website, tendbcluster.com, for full documentation, tutorials, guides, and frequently asked questions. TenDB Cluster is fully open source, so you can visit our GitHub repo to get the source code, try it out, and submit issues and PRs if you find problems or improvements that can be made.
Also, you can join our QQ group chat to talk directly to our team members, ask any questions, or just chat if you want. Lastly, thank you all for listening and sticking with me to the end; I hope I didn't bore you. It's been a pleasure to present at this event, and I really appreciate MariaDB Corporation giving us this opportunity. Thank you, and hope to see you soon.