This article talks about MySQL partitioning in an easy and understandable way and is equally suitable for both database beginners and professionals.
Additionally, the readers of this article are going to learn one of the best database tools available to achieve the objective of partitioning in MySQL.
The article highlights the importance of large data partitioning to relieve disk space and improve the overall efficiency of your MySQL database with the help of examples that you can try yourself along with professional life tips.
Introduction to Partitioning
Let us first discuss MySQL partitioning and its basics that you need to know before we move to the practical implementation of it (later in the article).
What is Partitioning?
Partitioning in general term is a way to turn a big thing into small chunks to make it easy to handle.
What is Data Partitioning?
The data partitioning is a design technique to turn your data present (stored) in a single table (the storage structure that holds the data) into several small tables as if we have broken down a big unit of data (table) into smaller units of data (tables) but when joined together forms the same big one table we started off with.
In other terms, data partitioning is a method of splitting or sub-dividing data into smaller parts based on a criterion in order to make the database more efficient.
What is Database Partitioning?
Data partitioning term is often used interchangeably with another well-known term called Database partitioning.
In most cases both database partitioning and data partitioning mean the same thing.
Simple Example of Data Partitioning
A very simple example of understanding the concept of data partitioning is to think about a shopping scenario where you have just gone to a super market to buy bottles of milk. Now considering you have one big bag to carry, so, you decide to put all the three bottles of milk in it. The bag gets very heavy and not easy to handle and the next time you visit the same shopping place you bring three small bags with you and you put each bottle of milk in each of the bags there by distributing the overall load this time to three different places and find it easy to carry in both hands. That’s how partitioning works.
Technical Example of Data Partitioning
You have stored a large table of data about football players and the table got so big that it is costing you a lot of storage such as disk space and has also slowed down. Now you partition the big table into small sub-tables based on the age range of the players.
For example, you decide for easy access to subdivide players’ data into three partitions:
- Partition 1: All the players aged less than 30
- Partition 2: All the players where age is more than 30 and less than 39
- Partition 3: All the players having age more than 40
So, you end up by creating three partitions of the data, it could be understood in terms of distributing the data (table) into small portions (tables) as well.
MySQL Partitioning
Now that we know what is partitioning let us focus on MySQL partitioning and its types and uses.
What is MySQL Partitioning?
Partitioning of data in MySQL to distribute the load and release the disk space is termed as MySQL partitioning.
Please note that the partitioned objects (sub-tables) can be stored and managed individually.
Horizontal vs. Vertical Partitioning
There are two main ways to look at partitioning and that is in terms of horizontal partitioning and vertical partitioning.
Now when we talk about horizontal partitioning think of a horizontal (straight) line that covers everything related to the table specifically the columns.
In other words, horizontal partitioning means slicing a big table along with all of its columns into small slices of data having the same number of columns.
For example, if you have a large table of one million records that stores Name, Age and Rank of football players then horizonal partitioning of that table means you can slice those one million rows (records) into five small tables having two hundred thousand rows each representing the same Name, Age and Rank columns.
On the other side vertical partitioning is opposite of horizontal partitioning that means you can sub-divide the columns into two different partitions keeping a key column used to distinguish records across all partitions. If you take the same example of the table with one million records about Name, Age and Rank of the football players then vertical partitioning means we can turn into three small vertical tables where table 1 contains all the Name(a), table 2 has all the Age values and table 3 contains Rank values while an Id is used in all the three partitions to be able to uniquely identify a record in a sub table and also across all the vertical partitions.
Please note that vertical partitioning is not currently supported by MySQL.
Types of MySQL Partitioning
There are many types of partitioning methods available in MySQL.
You must choose wisely what type suits your requirements but more or less all types follow the same philosophy of relieving the load from one big table of data by distributing it across several smaller tables that can be managed separately.
In this section we are going to discuss the types of MySQL partitioning.
Range Partitioning
Range partitioning as the name indicates is the partition in which the data is partitioned based on some range of values against a column.
For example, we can use city_id column (values) to create three range partitions in the following order:
- Partition P0: City_id is less than 5
- Partition P1: City_id is less than 10
- Partition P2: City_id is less than 15
List Partitioning
List partitioning is also a form of range partitioning except it offers partitioning by range which can have user defined values like a list.
For example, we can use city_id column to create four list partitioning in the following order:
- Partition PSouth: City_id (1,3,5,7)
- Partition PWest: City_id (2,4,6,8)
- Partition PEast: City_id (9,11,15)
- Partition PNorth: City_id (10,12,13,14)
Hash Partitioning
Hash partitioning distributes the data based on a Hash value (a value generated by the system based) but by using a column or expression passed to it by the user.
In other words, unlike Range or List partitioning we do not need to specify a set of values or range for hash partitioning to take into account rather we only need to specify a column or expression while Hash partitioning will take care of the rest.
Technically speaking hashing uses modulus of the hashing function’s value.
For example, PARTITION BY HASH(city_id), PARTITIONS 4 instructions create four partitions of data based on city_id hash value.
Linear Hash Partitioning
Linear Hash partitioning is a slightly complex form of hash partitioning but uses the same principle where we need by apply linear hash to a column or expression but that expression must return an integer.
Unlike hashing partitioning, it uses a linear power-of-two algorithm.
For example: We can linear partition a table based on a column
LINEAR HASH (region_id)
PARTITION BY LINEAR HASH( YEAR(registrationdate) ) PARTITIONS 4;
The above partitioning is going to form four partitions based on the linear hashing calculation based on the year of registrationdate column.
Key Partitioning
Key partitioning is more advanced form of Hash partitioning while hash partitioning does demand a column or expression to be used to generate hashing value key partitioning is even free from that and totally uses its own internal algorithms (hashing function) to partition the data.
For example: PARTITION BY KEY() PARTITIONS 3; uses primary key to partition the data into three partitions but we don’t need to specify the which column values be used to form those partitions.
Sub Partitioning
Please remember you can sub partition a partition as well which is known as sub partitioning or composite partitioning.
For example, you can partition a table by range partition and then that range partition can be sub partitioned as well.
Let us look at the example below:
CREATE TABLE Sale (id INT, orderdate DATE)
PARTITION BY RANGE( YEAR(orderdate) )
SUBPARTITION BY HASH( TO_DAYS(orderdate) )
SUBPARTITIONS 2 (
PARTITION partition0 VALUES LESS THAN (2010),
PARTITION partition1 VALUES LESS THAN (2020),
PARTITION partition2 VALUES LESS THAN MAXVALUE
);
In the above example, a table called Sale is being created with three Range partitions based on orderdate column that is further partitioned in two sub partitions.
Partition Pruning
Partition pruning means your query selection criteria as a result of partitioning becomes smart enough to only search for the required results in the desired partitions.
For example, if you have partitioned a table by range based on registration date of employees and then you search for all the employees whose registration data lies between 01 Jan 2020 and 31 Dec 2021 then only that partition is going to be used to find your results while rest of the partitions won’t be searched by the query.
So, partition pruning is an advantage that you get with partitioning to speed up your query response time after you partition your data.
How to Check If Your Installed MySQL Version Supports Partitioning
There are two ways to check if partitioning is enabled for your installed MySQL version:
- Run SHOW PLUGINS; statement
- Query INFORMATION_SCHEMA.PLUGINS to find out
SELECT
PLUGIN_NAME as PluginName,
PLUGIN_VERSION as PluginVersion,
PLUGIN_STATUS as PluginStatus
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_TYPE=’STORAGE ENGINE’;
Output :
PluginName | PluginVersion | PluginStatus |
partition | 1.0 | ACTIVE |
Key Benefits of Partitioning
Partitioning has many benefits including the following:
- Your data storage capacity of the table increases enormously
- You query response times improve a lot (using partitioning pruning)
- You can easily remove unimportant data (or the data no more required) by simply removing a partition
- You can easily add new data by creating a partitioning and storing it there
- Last but not least you only start seeing benefits of partitioning when you have large amount of data in your tables (millions or billions of records) with small tables it does not make much difference
Improving Performance with Partitioning
The performance gets improved by the use of partitioning especially keeping in mind the above topic of partition pruning.
However, you have to keep in mind that the column that you choose to base your partition plays a vital role in the query performance factor.
Examples of using MySQL partitioning with dbForge Studio for MySQL
DbForge Studio for MySQL is one of the most suitable tools available in the market that can make your job of partitioning data in MySQL easy and you don’t need to write those long scripts for partitioning your data in MySQL.
If you would like to test run the examples coming a head then please download the latest version by clicking the link below:
MySQL GUI Tool – Front End Client with IDE for Windows (devart.com).
Create a Demo Database (SportsDemo)
Let us quickly build a demo database called SportsDemo using dbForge Studio for MySQL.
These examples assume that you have already installed MySQL server locally on your machine.
Please open dbForge Studio and right click on localhost and click New Database.
Name the database as SportsDemo and click Apply Changes:
Please refresh the Database Explorer to see your created blank SportsDemo database:
Create Player table in the database (Sportsdemo)
Please create a new table called Player with the following columns
- PlayerId
- Name
- Score
- Ranking
- RegistrationDate
Right click on sportsdemo and click on New Object => Table:
Create the above mentioend fields with their types in the Columns tab:
Partition by Range Example (Score based partitioning)
Now we want to create partitions of the table Player based on Score range of four categories.
Let us do it by switching to the Partitioning tab and selecting Range Partition type and typing in Score in Partitioning expression.
Right click anywhere in the Partitions and click Add partition:
Name the partition as p0ScoreBelow1000 and set the Value to 1000 as shown below:
Please add two more partitions in the following order:
- p1ScoreBelow2000 Value:2000
- p2Score2000AndAbove Value:MAXVALUE
This can be illustrated as follows:
Click Apply changes and then add the following record into the Player table:
We added a new record for a player named Asif with score 1000, ranking 60 and registration date 01 July 2019.
Testing the partition
Run the following query to test if the record exists in the partition p0ScoreBelow1000:
— Check if record (Asif Score:1000) exists in p0ScoreBelow100 partition
SELECT * FROM Player PARTITION (p0ScoreBelow1000)
No record exists because the only record belongs to the player named Asif whose score is not below 1000.
Now check the partition p1ScoreBelow2000:
We can clearly see the record exists in p1ScoreBelow2000 partition because the player Asif’s score is 1000.
Partitioning by List Example
Similarly, you can remove the partition by Range and create a Partition by list based on Ranking column as follows:
The automatically generated code by dbForge Studio for MySQL for the above partitioning is as follows:
CREATE TABLE sportsdemo.player (
PlayerId int NOT NULL,
Name varchar(50) DEFAULT NULL,
Score int NOT NULL,
Ranking smallint DEFAULT NULL,
RegistrationDate date DEFAULT NULL,
PRIMARY KEY (Score)
)
ENGINE = INNODB,
CHARACTER SET utf8mb4,
COLLATE utf8mb4_0900_ai_ci
PARTITION BY LIST (Ranking)
(
PARTITION p0RankTop10 VALUES IN (1, 2, 3, 4, 5, 6, 7, 8, 9,10)
ENGINE = INNODB,
PARTITION p1RankTop20 VALUES IN (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
ENGINE = INNODB
);
Conclusion
Partitioning is one of the best ways to optimise your database access and managing large amount of data but at the same time it enforces certain limitations on your database and you have to compare the benefits against the limitation to decide whether to go for it or not as the scenario varies from organisation to organization. dbForge Studio for MySQL can make your MySQL data partitioning easier and without writing of long scripts that significantly saves your time.
Author – Haroon Ashraf
Haroon started his professional career as a programmer over 10 years ago. His work and interests are in database-centric architectures. His experience includes database and reporting design, development, testing, implementation and migration, and database lifecycle management.