Copying table structure to manage large number of rows in MySQL - Stack Overflow
I'm finishing a web application and preparing for the possibility of 10 million active users each year. I'm looking at one table in MySQL that will have hundreds of rows per user. I'm a programmer, not a DBA, and thought duplicating that table structure 50 times could work. Each user would be assigned a permanent table decided by their user_id. Those tables would not JOIN or connect with each other, but would JOIN with a few other tables. This is only to manage the very large number of rows expected. Is this a valid design, or am I missing something? While researching I see partitioning mentioned. Should I be looking into that? Below illustrates what I have in mind. Thank you.
User
=======
1 John
2 Adam
3 Jane
TableNum = ceiling( user_id )
Table1 ... Table50
======= =======
asked Jan 21 at 22:20 by user1742494
- 3 Don't do this. Use table partitioning. – Barmar Commented Jan 21 at 22:23
- 1 Doing it your way will require you to use dynamic SQL for all your queries, so you can select from the correct table. – Barmar Commented Jan 21 at 22:24
- 1 If you think your app will have tens of millions of users and other tables containing potentially billions of rows, then please do yourself a favour and include somebody who understands databases in your project! MySQL can be scaled up to deal with that many rows even without partitioning. Yes, you may need powerful infrastructure and fine-tuning of the database, and may have to think about caching and running multiple replicas to serve all the demand. Unfortunately, we can't tell you exactly what you need, as we are not aware of your requirements. But please forget this idea! – Shadow Commented Jan 22 at 1:38
- Idea forgotten. I now see the wisdom of partitions. Thank you for your help! – user1742494 Commented Jan 22 at 4:21
- @Barmar - Partitioning is unlikely to provide any performance benefits. – Rick James Commented Jan 22 at 23:40
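Barmar's point about dynamic SQL can be made concrete: SQL placeholders can bind values, but not table names, so the 50-table scheme forces every query to build its table name as a string. A minimal Python sketch of the routing (the modulo rule and `Table1`..`Table50` names are hypothetical, since the question's `ceiling( user_id )` formula is incomplete):

```python
# Hypothetical routing for the proposed 50-table scheme.
# Placeholders cannot parameterize identifiers, so the table
# name must be interpolated into every query string.

NUM_TABLES = 50

def table_for(user_id: int) -> str:
    """Map a user_id to one of Table1..Table50 (assumed modulo rule)."""
    return f"Table{(user_id - 1) % NUM_TABLES + 1}"

def select_rows_sql(user_id: int) -> str:
    # The table name is string-built; only user_id can be a bound parameter.
    return f"SELECT * FROM {table_for(user_id)} WHERE user_id = %s"

print(table_for(1))    # Table1
print(table_for(51))   # Table1 again: users share tables
```

Every data-access function in the application would need this extra routing step, which is exactly the maintenance burden the comments warn about.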
1 Answer

Plan A: Single server
A single server can handle 10M users with a total of 1B rows of data. I will give you these tips:
CREATE TABLE users (
    user_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name ...,
    etc...
) ENGINE=InnoDB;

CREATE TABLE the_billion_row_table (
    user_id INT UNSIGNED NOT NULL,
    id BIGINT AUTO_INCREMENT,
    etc...
    PRIMARY KEY (user_id, id), -- to 'cluster' rows for a user
    INDEX(id),                 -- to keep AUTO_INCREMENT happy
    etc...
) ENGINE=InnoDB;
This slightly-odd combination of PKs will help optimize all the queries that include
WHERE user_id = ? AND ...
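The effect of clustering on `(user_id, id)` can be sketched with Python's standard-library sqlite3 as a stand-in for MySQL/InnoDB (table and column names here are illustrative, not from the answer; SQLite's `WITHOUT ROWID` makes the table physically ordered by its PRIMARY KEY, roughly analogous to InnoDB's clustered index):

```python
# Sketch of the per-user access pattern, assuming an InnoDB-style
# clustered PK. SQLite's WITHOUT ROWID orders rows by PRIMARY KEY.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE big_table (
        user_id INTEGER NOT NULL,
        id      INTEGER NOT NULL,
        payload TEXT,
        PRIMARY KEY (user_id, id)
    ) WITHOUT ROWID
""")

# Rows for many users arrive interleaved on insert...
conn.executemany(
    "INSERT INTO big_table VALUES (?, ?, ?)",
    [(u, i, f"row {i} of user {u}") for i in range(3) for u in (1, 2, 3)],
)

# ...but a per-user query walks one contiguous PK range.
rows = conn.execute(
    "SELECT id, payload FROM big_table WHERE user_id = ? ORDER BY id",
    (2,),
).fetchall()
print(rows)  # three rows, all for user 2
```

Because all of a user's rows sit together in PK order, a `WHERE user_id = ?` query touches a contiguous range of pages instead of scattering reads across the whole table.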
Splitting the data into 50 tables is not beneficial; do not do it. Ditto for PARTITION.
Plan B: "Sharding" across multiple servers.
When the traffic is too much for a single server, and you have done other optimizations (such as in Plan A), "sharding" is likely to be the solution.
This is where some of the users are on each of several servers. (You are unlikely to need 50, but I will use that number for discussion.)
To find where the user is, there must be some kind of "proxy". This can either be a proxy server that is available for free or purchase, or it can be homegrown, or it can be built into the client code. This is similar to your "pick which of 50 tables" but it is now "which of 50 servers".
Sharding makes it very difficult to do queries across multiple users. If you need that, we need to start over in the discussion.
Since one user may grow and another may shrink, there is a "balancing" problem. This can be solved by moving users between shards (lots of hand-crafted code) or by not filling shards completely and instead preemptively spinning up a new shard.
Mapping user_id to shard number can be done in several ways. A purely "algorithmic" (modulo or hash) mapping has the problem of exacerbating the "balancing" problem. A "dictionary" approach requires some form of lookup to discover which server contains the data for the user as they login.
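The two mapping approaches described above can be sketched in a few lines of Python (shard count of 50 follows the discussion; the directory contents are made-up for illustration):

```python
# Two ways to map user_id -> shard number, per the answer.

NUM_SHARDS = 50

def shard_algorithmic(user_id: int) -> int:
    """Pure modulo mapping: no lookup needed, but a user cannot be
    moved to rebalance load without changing the formula for everyone."""
    return user_id % NUM_SHARDS

# "Dictionary" mapping: one extra lookup per login, but individual
# users can be reassigned when a shard fills up. (Example entries.)
shard_directory = {101: 7, 102: 7, 103: 12}

def shard_dictionary(user_id: int) -> int:
    return shard_directory[user_id]

# Rebalancing under the dictionary approach is a directory update
# plus a data move; the formula-based approach has no such escape.
shard_directory[102] = 30
```

This is why the answer notes that the algorithmic approach exacerbates the balancing problem: the only way to move load is to change the formula, which remaps many users at once.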
Further discussion of sharding is beyond the scope of this Q&A.