query optimization - MySQL: How to optimize a JOIN on a primary key with Indexes - Stack Overflow

IT技术

更新时间：2025-03-190

admin管理员组
文章数量:1332351

I am looking to optimize a JOIN between tables that uses a small subset of columns in the second table.

CREATE TABLE `employees` (
  `employee_id` bigint NOT NULL,
  `manager_id` bigint NOT NULL,
  `_id` bigint NOT NULL,
  `union_id` bigint NOT NULL
  ...
  PRIMARY KEY (employee_id),
  INDEX (union_id)
);

CREATE TABLE `managers` (
  `manager_id` bigint NOT NULL,
  `_id` bigint NOT NULL,
  `some_condition` boolean NOT NULL,
  PRIMARY KEY (manager_id)    
);

Now I want to optimize two types of queries. Both join the tables together on manager_id and _id, and optionally apply a filter to the some_condition column.

SELECT employees.* 
FROM employees 
JOIN managers
ON (employees.manager_id = managers.manager_id AND employees_id = managers_id)
WHERE (employees.union_id = ? AND managers.some_condition);

SELECT employees.* 
    FROM employees 
    JOIN managers
    ON (employees.manager_id = managers.manager_id AND employees_id = managers_id)
    WHERE (employees.union_id = ?);

Assuming these are very large tables and employees > manager. I am trying to create an index on managers that will speed up the query. Right now the query is slow because for each row it has to read _id and some_condition. I want to avoid going to disk if possible.

So far I have two indexes that might work:

INDEX `join_index` (`_id`,`some_condition`)
INDEX `id_join_index` (`manager_id`, `_id`, `some_condition`)

My main issue is that MySQL does not use either index in the EXPLAIN statement unless I force it to with use index (...).

Which index (if either) will speed up my query, and do I need manager_id in the index to speed up the join if I do not filter on some_condition?

I am looking to optimize a JOIN between tables that uses a small subset of columns in the second table.

CREATE TABLE `employees` (
  `employee_id` bigint NOT NULL,
  `manager_id` bigint NOT NULL,
  `_id` bigint NOT NULL,
  `union_id` bigint NOT NULL
  ...
  PRIMARY KEY (employee_id),
  INDEX (union_id)
);

CREATE TABLE `managers` (
  `manager_id` bigint NOT NULL,
  `_id` bigint NOT NULL,
  `some_condition` boolean NOT NULL,
  PRIMARY KEY (manager_id)    
);

Now I want to optimize two types of queries. Both join the tables together on manager_id and _id, and optionally apply a filter to the some_condition column.

SELECT employees.* 
FROM employees 
JOIN managers
ON (employees.manager_id = managers.manager_id AND employees._id = managers._id)
WHERE (employees.union_id = ? AND managers.some_condition);

SELECT employees.* 
    FROM employees 
    JOIN managers
    ON (employees.manager_id = managers.manager_id AND employees._id = managers._id)
    WHERE (employees.union_id = ?);

Assuming these are very large tables and employees > manager. I am trying to create an index on managers that will speed up the query. Right now the query is slow because for each row it has to read _id and some_condition. I want to avoid going to disk if possible.

So far I have two indexes that might work:

INDEX `join_index` (`_id`,`some_condition`)
INDEX `id_join_index` (`manager_id`, `_id`, `some_condition`)

My main issue is that MySQL does not use either index in the EXPLAIN statement unless I force it to with use index (...).

Which index (if either) will speed up my query, and do I need manager_id in the index to speed up the join if I do not filter on some_condition?

Share Improve this question edited Nov 21, 2024 at 0:32 Bill Karwin 563k87 gold badges702 silver badges861 bronze badges asked Nov 20, 2024 at 23:33 kingrich123 12 bronze badges

Do you really need _id in the JOIN? Can the manager be in a different anization than their employees? – Barmar Commented Nov 20, 2024 at 23:37
You should declare employees.manager_id to be a foreign key. This will automatically add an index. – Barmar Commented Nov 20, 2024 at 23:38
Aren't managers also employees? Do you really need a separate table for them? – Barmar Commented Nov 20, 2024 at 23:39
@Barmar This is for Vitess, _id is the shard key so it is required in the join. I changed the names to general ones as our business specifics wouldn't make much sense. – kingrich123 Commented Nov 21, 2024 at 1:01

Add a comment |

2 Answers 2

Sorted by: Reset to default 0

Here are my test results：

mysql> alter table managers add index test_1(`some_condition`);
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql>
mysql>
mysql>
mysql>
mysql> explain SELECT employees.*
    -> FROM employees
    -> JOIN managers
    -> ON (employees.manager_id = managers.manager_id AND employees._id = managers._id)
    -> WHERE (employees.union_id = "" and managers.some_condition="");
+----+-------------+-----------+------------+--------+----------------+----------+---------+----------------------------+------+----------+-----------------------+
| id | select_type | table     | partitions | type   | possible_keys  | key      | key_len | ref                        | rows | filtered | Extra                 |
+----+-------------+-----------+------------+--------+----------------+----------+---------+----------------------------+------+----------+-----------------------+
|  1 | SIMPLE      | employees | NULL       | ref    | union_id       | union_id | 8       | const                      |    1 |   100.00 | Using index condition |
|  1 | SIMPLE      | managers  | NULL       | eq_ref | PRIMARY,test_1 | PRIMARY  | 8       | zbdba.employees.manager_id |    1 |   100.00 | Using where           |
+----+-------------+-----------+------------+--------+----------------+----------+---------+----------------------------+------+----------+-----------------------+
2 rows in set, 1 warning (0.00 sec)

Which seem to be the same as yours. The "manager" table here uses the primary key index. Why doesn't it use the secondary index we created? Because the secondary index only contains data of specified columns. Since the fields in your SQL Join include the primary key "manager_id" field of the "manager" table, if MySQL used the secondary index you created, it would still need to perform a table lookup to query the primary key index to obtain the "manager_id" field. Therefore, MySQL directly chose the primary key index of the "manager" table. Because the primary key index contains a complete row of records, including the "_id" and "some_condition" fields, it only needs to compare and filter them one by one.

The Optimizer will decide which table to "start with". If it picks employees, then these indexes will be optimal:

employees:  INDEX(union_id,    -- first
                manager_id,  _id)  -- either order
managers: INDEX(manager_id,  _id,  -- either order
              some_condition)  -- last

But, since manager_id is the PRIMARY KEY, the above index into managers is useless.

The above indexes are reasonably good for either of your queries.

Adding the following may speed up the first query (if the optimizer would prefer to start with managers):

managers:  INDEX(some_condition, -- first
             manager_id, _id)  -- either order

Rationale:

The Optimizer will look up rows in one table, then reach into the other table -- one row at a time.
The Optimizer usually wants to start with something in the WHERE clause. Query 2's WHERE references only employees, so it will start with that table. Query 1 references both tables, so it will do some analysis to guestimate which table is better to start with.
When reaching into the "other" table, all tests in ON and WHERE (for the second table) are fair game -- that is, they should probably be in a composite index.

本文标签： query optimizationMySQL How to optimize a JOIN on a primary key with IndexesStack Overflow

版权声明：本文标题：query optimization - MySQL: How to optimize a JOIN on a primary key with Indexes - Stack Overflow 内容由网友自发贡献，该文观点仅代表作者本人，转载请联系作者并注明出处：http://www.betaflare.com/web/1742322063a2452987.html，本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容，一经查实，本站将立刻删除。

编程频道|软件玩家 - 软件改变生活！

query optimization - MySQL: How to optimize a JOIN on a primary key with Indexes - Stack Overflow

2 Answers 2

更多相关文章