admin管理员组文章数量:1244331
I have a complexe query to make statistics on few tables.
I need, for each party and each thematic the number of vote between a member and a party + the number of vote with the same value (member_vote.value = party_statistic.vote_result
)
The vote_thematic_category has a hierarchy structure and each parent has to count the number of vote matching they children
This query take 25 secondes, I try to optimise it, if somewone know how. Sorry this is a huge question.
This is the query I made
WITH MemberVotes AS (
-- Get the vote_ids and values for the specific member.
SELECT mv.vote_id, mv.value
FROM member_vote mv
WHERE mv.member_id = 262
),
PartyVoteResults AS (
-- Retrieve party vote results for only the votes that the specified member voted on.
SELECT ps.party_id, ps.vote_id, ps.vote_result
FROM party_statistic ps
WHERE EXISTS (SELECT 1 FROM MemberVotes mv WHERE mv.vote_id = ps.vote_id) -- Filter by vote_ids from MemberVotes
),
MatchedVotes AS (
-- Find votes where the member s vote matches the party s vote, and get the thematic category.
SELECT pvr.party_id, vht.vote_thematic_category_id, mv.vote_id
FROM PartyVoteResults pvr
JOIN MemberVotes mv ON pvr.vote_id = mv.vote_id AND pvr.vote_result = mv.value -- Match on vote_id and value
JOIN vote_has_thematic vht ON pvr.vote_id = vht.vote_id -- Get thematic category
),
AggregatedCategories AS (
-- Create a hierarchy of thematic categories, including parent categories.
SELECT id AS category_id, id AS aggregated_category_id FROM vote_thematic_category
UNION -- Combine root categories and child categories
SELECT id AS category_id, parent_id AS aggregated_category_id FROM vote_thematic_category WHERE parent_id IS NOT NULL -- Get child/parent link
),
AggregatedMatchedVotes AS (
-- Associate matched votes with their aggregated thematic category.
SELECT mv.party_id, ac.aggregated_category_id AS thematic_category_id, mv.vote_id
FROM MatchedVotes mv
JOIN AggregatedCategories ac ON mv.vote_thematic_category_id = ac.category_id -- Use the aggregated category
),
AggregatedCounts AS (
-- Count the number of distinct matching votes for each party and aggregated thematic category.
SELECT party_id, thematic_category_id, COUNT(DISTINCT vote_id) AS matching_votes
FROM AggregatedMatchedVotes
GROUP BY party_id, thematic_category_id -- Group by party and thematic category
),
AllVotesPerCategory AS (
-- Count the total number of distinct votes for each party and aggregated thematic category.
SELECT pvr.party_id, ac.aggregated_category_id AS thematic_category_id, COUNT(DISTINCT pvr.vote_id) AS total_votes
FROM PartyVoteResults pvr
JOIN vote_has_thematic vht ON pvr.vote_id = vht.vote_id
JOIN AggregatedCategories ac ON vht.vote_thematic_category_id = ac.category_id
GROUP BY pvr.party_id, ac.aggregated_category_id -- Group by party and thematic category
)
SELECT
ps.party_id,
tc.id AS thematic_category_id,
COALESCE(ac.matching_votes, 0) AS matching_votes, -- Ensure 0 for no matches
COALESCE(avpc.total_votes, 0) AS total_votes -- Ensure 0 for no votes
FROM (SELECT DISTINCT party_id FROM party_statistic) ps
CROSS JOIN (SELECT id FROM vote_thematic_category) tc -- Include all thematic categories
LEFT JOIN AggregatedCounts ac ON ps.party_id = ac.party_id AND tc.id = ac.thematic_category_id
LEFT JOIN AllVotesPerCategory avpc ON ps.party_id = avpc.party_id AND tc.id = avpc.thematic_category_id
ORDER BY ps.party_id, tc.id;
This is the tables :
member_vote
+------+-----------+---------+-------+
| id | member_id | vote_id | value |
+------+-----------+---------+-------+
| 262 | 262 | 1 | FOR |
| 1014 | 262 | 2 | FOR |
| 1765 | 262 | 3 | FOR |
| 2516 | 262 | 4 | FOR |
| 3267 | 262 | 5 | FOR |
+------+-----------+---------+-------+
vote
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
vote_thematic_category
+----+-----------+
| id | parent_id |
+----+-----------+
| 1 | NULL |
| 2 | NULL |
| 14 | 1 |
| 15 | 1 |
| 24 | 2 |
| 25 | 2 |
+----+-----------+
vote_has_thematic
+---------+---------------------------+
| vote_id | vote_thematic_category_id |
+---------+---------------------------+
| 1 | 14 |
| 1 | 15 |
| 2 | 1 |
| 2 | 14 |
| 2 | 15 |
| 3 | 1 |
+---------+---------------------------+
party_statistic;
+-----+----------+---------+--------------+
| id | party_id | vote_id | vote_result |
+-----+----------+---------+--------------+
| 1 | 1 | 1 | NULL |
| 2 | 2 | 1 | FOR |
| 220 | 1 | 2 | NULL |
| 221 | 2 | 2 | FOR |
| 439 | 1 | 3 | NULL |
| 440 | 2 | 3 | FOR |
| 658 | 1 | 4 | FOR |
| 659 | 2 | 4 | DID_NOT_VOTE |
| 877 | 1 | 5 | FOR |
| 878 | 2 | 5 | FOR |
+-----+----------+---------+--------------+
result
+----------+----------------------+----------------+-------------+
| party_id | thematic_category_id | matching_votes | total_votes |
+----------+----------------------+----------------+-------------+
| 1 | 1 | 0 | 3 |
| 1 | 2 | 0 | 0 |
| 1 | 14 | 0 | 2 |
| 1 | 15 | 0 | 2 |
| 1 | 24 | 0 | 0 |
| 1 | 25 | 0 | 0 |
| 2 | 1 | 3 | 3 |
| 2 | 2 | 0 | 0 |
| 2 | 14 | 2 | 2 |
| 2 | 15 | 2 | 2 |
| 2 | 24 | 0 | 0 |
| 2 | 25 | 0 | 0 |
+----------+----------------------+----------------+-------------+
The create tables:
| member_vote | CREATE TABLE `member_vote` (
`id` int NOT NULL AUTO_INCREMENT,
`member_id` int DEFAULT NULL,
`vote_id` int DEFAULT NULL,
`value` varchar(12) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_A698A9387597D3FE` (`member_id`),
KEY `IDX_A698A93872DCDAFC` (`vote_id`),
KEY `IDX_A698A9381D775834` (`value`),
KEY `IDX_A698A9387597D3FE72DCDAFC` (`member_id`,`vote_id`),
CONSTRAINT `FK_A698A93872DCDAFC` FOREIGN KEY (`vote_id`) REFERENCES `vote` (`id`) ON DELETE CASCADE,
CONSTRAINT `FK_A698A9387597D3FE` FOREIGN KEY (`member_id`) REFERENCES `member` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1319272 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| vote | CREATE TABLE `vote` (
`id` int NOT NULL AUTO_INCREMENT,
`official_id` int NOT NULL,
`is_featured` tinyint(1) NOT NULL,
`title` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
`reference` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`procedure_reference` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`description` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`summary_link` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`vote_date` datetime NOT NULL COMMENT '(DC2Type:datetime_immutable)',
PRIMARY KEY (`id`),
KEY `IDX_5A108564A9BA2FD0` (`is_featured`)
) ENGINE=InnoDB AUTO_INCREMENT=1868 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| vote_thematic_category | CREATE TABLE `vote_thematic_category` (
`id` int NOT NULL AUTO_INCREMENT,
`parent_id` int DEFAULT NULL,
`label` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`prompt` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_942FC12A727ACA70` (`parent_id`),
KEY `IDX_942FC12ABF396750727ACA70` (`id`,`parent_id`),
CONSTRAINT `FK_942FC12A727ACA70` FOREIGN KEY (`parent_id`) REFERENCES `vote_thematic_category` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=109 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| vote_has_thematic | CREATE TABLE `vote_has_thematic` (
`vote_id` int NOT NULL,
`vote_thematic_category_id` int NOT NULL,
PRIMARY KEY (`vote_id`,`vote_thematic_category_id`),
KEY `IDX_1CF3D9D72DCDAFC` (`vote_id`),
KEY `IDX_1CF3D9D2AABDDC4` (`vote_thematic_category_id`),
CONSTRAINT `FK_1CF3D9D2AABDDC4` FOREIGN KEY (`vote_thematic_category_id`) REFERENCES `vote_thematic_category` (`id`) ON DELETE CASCADE,
CONSTRAINT `FK_1CF3D9D72DCDAFC` FOREIGN KEY (`vote_id`) REFERENCES `vote` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| party_statistic | CREATE TABLE `party_statistic` (
`id` int NOT NULL AUTO_INCREMENT,
`party_id` int DEFAULT NULL,
`vote_id` int DEFAULT NULL,
`vote_result` varchar(12) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`stats` json NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_725B27D1213C1059` (`party_id`),
KEY `IDX_725B27D172DCDAFC` (`vote_id`),
KEY `IDX_725B27D172DCDAFC213C1059` (`vote_id`,`party_id`),
CONSTRAINT `FK_725B27D1213C1059` FOREIGN KEY (`party_id`) REFERENCES `party` (`id`) ON DELETE CASCADE,
CONSTRAINT `FK_725B27D172DCDAFC` FOREIGN KEY (`vote_id`) REFERENCES `vote` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=396566 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
And the explain query
+----+--------------+------------------------+------------+--------+---------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------------------+------------+--------+---------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------+------+----------+----------------------------------------------+
| 1 | PRIMARY | vote_thematic_category | NULL | index | NULL | IDX_942FC12A727ACA70 | 5 | NULL | 6 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using join buffer (hash join) |
| 1 | PRIMARY | <derived4> | NULL | ref | <auto_key0> | <auto_key0> | 10 | ps.party_id,vote.vote_thematic_category.id | 2 | 100.00 | NULL |
| 1 | PRIMARY | <derived14> | NULL | ref | <auto_key0> | <auto_key0> | 10 | ps.party_id,vote.vote_thematic_category.id | 2 | 100.00 | NULL |
| 14 | DERIVED | vht | NULL | index | PRIMARY,IDX_1CF3D9D72DCDAFC,IDX_1CF3D9D2AABDDC4 | IDX_1CF3D9D72DCDAFC | 4 | NULL | 6 | 100.00 | Using index; Using temporary; Using filesort |
| 14 | DERIVED | mv | NULL | ref | IDX_A698A9387597D3FE,IDX_A698A93872DCDAFC,IDX_A698A9387597D3FE72DCDAFC | IDX_A698A93872DCDAFC | 5 | vote.vht.vote_id | 1 | 100.00 | Using where; Start temporary |
| 14 | DERIVED | ps | NULL | ref | IDX_725B27D172DCDAFC,IDX_725B27D172DCDAFC213C1059 | IDX_725B27D172DCDAFC213C1059 | 5 | vote.vht.vote_id | 1 | 100.00 | Using index; End temporary |
| 14 | DERIVED | <derived11> | NULL | ref | <auto_key1> | <auto_key1> | 4 | vote.vht.vote_thematic_category_id | 2 | 100.00 | NULL |
| 4 | DERIVED | mv | NULL | ref | IDX_A698A9387597D3FE,IDX_A698A93872DCDAFC,IDX_A698A9387597D3FE72DCDAFC,IDX_A698A9381D775834 | IDX_A698A9387597D3FE | 5 | const | 5 | 100.00 | Using where; Using temporary; Using filesort |
| 4 | DERIVED | ps | NULL | ref | IDX_725B27D172DCDAFC,IDX_725B27D172DCDAFC213C1059 | IDX_725B27D172DCDAFC213C1059 | 5 | vote.mv.vote_id | 1 | 10.00 | Using where |
| 4 | DERIVED | mv | NULL | ref | IDX_A698A9387597D3FE,IDX_A698A93872DCDAFC,IDX_A698A9387597D3FE72DCDAFC | IDX_A698A93872DCDAFC | 5 | vote.mv.vote_id | 1 | 100.00 | Using where; FirstMatch(ps) |
| 4 | DERIVED | <derived11> | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using join buffer (hash join) |
| 4 | DERIVED | vht | NULL | eq_ref | PRIMARY,IDX_1CF3D9D72DCDAFC,IDX_1CF3D9D2AABDDC4 | PRIMARY | 8 | vote.mv.vote_id,ac.category_id | 1 | 100.00 | Using index |
| 11 | DERIVED | vote_thematic_category | NULL | index | NULL | IDX_942FC12A727ACA70 | 5 | NULL | 6 | 100.00 | Using index |
| 12 | UNION | vote_thematic_category | NULL | range | IDX_942FC12A727ACA70,IDX_942FC12ABF396750727ACA70 | IDX_942FC12A727ACA70 | 5 | NULL | 4 | 100.00 | Using where; Using index |
| 13 | UNION RESULT | <union11,12> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
| 2 | DERIVED | party_statistic | NULL | index | IDX_725B27D1213C1059,IDX_725B27D172DCDAFC213C1059 | IDX_725B27D1213C1059 | 5 | NULL | 10 | 100.00 | Using index |
+----+--------------+------------------------+------------+--------+---------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------+------+----------+----------------------------------------------+
And the EXPLAIN ANALYZE
| -> Sort: ps.party_id, tc.thematic_category_id (actual time=34567..34568 rows=30456 loops=1)
-> Stream results (cost=110e+6 rows=0) (actual time=34509..34558 rows=30456 loops=1)
-> Nested loop left join (cost=110e+6 rows=0) (actual time=34509..34553 rows=30456 loops=1)
-> Nested loop left join (cost=307832 rows=0) (actual time=13395..13418 rows=30456 loops=1)
-> Inner hash join (no condition) (cost=3409 rows=33912) (actual time=3.39..4.07 rows=30456 loops=1)
-> Table scan on ps (cost=245..251 rows=314) (actual time=3.26..3.28 rows=282 loops=1)
-> Materialize (cost=245..245 rows=314) (actual time=3.25..3.25 rows=282 loops=1)
-> Covering index skip scan for deduplication on party_statistic using IDX_725B27D1213C1059 (cost=173 rows=314) (actual time=0.0835..3.19 rows=282 loops=1)
-> Hash
-> Covering index scan on vote_thematic_category using IDX_942FC12A727ACA70 (cost=11.6 rows=108) (actual time=0.0741..0.094 rows=108 loops=1)
-> Index lookup on ac using <auto_key0> (party_id=ps.party_id, thematic_category_id=vote_thematic_category.id) (cost=0.25..8.98 rows=35.9) (actual time=0.44..0.44 rows=0.972 loops=30456)
-> Materialize CTE aggregatedcounts (cost=0..0 rows=0) (actual time=13391..13391 rows=29598 loops=1)
-> Group aggregate: count(distinct member_vote.vote_id) (actual time=12320..13375 rows=29598 loops=1)
-> Sort: aggregatedmatchedvotes.party_id, aggregatedmatchedvotes.thematic_category_id (actual time=12320..12541 rows=9.44e+6 loops=1)
-> Stream results (cost=14.7e+6 rows=145e+6) (actual time=4.29..5196 rows=9.44e+6 loops=1)
-> Nested loop inner join (cost=14.7e+6 rows=145e+6) (actual time=4.28..3878 rows=9.44e+6 loops=1)
-> Nested loop inner join (cost=187692 rows=713469) (actual time=3.82..1367 rows=5.11e+6 loops=1)
-> Nested loop inner join (cost=116340 rows=33053) (actual time=3.75..389 rows=258763 loops=1)
-> Nested loop inner join (cost=654 rows=1867) (actual time=3.1..24.3 rows=1867 loops=1)
-> Filter: (`<subquery8>`.vote_id is not null) (cost=618..0.0963 rows=1867) (actual time=3..3.39 rows=1867 loops=1)
-> Table scan on <subquery8> (cost=618..644 rows=1867) (actual time=3..3.29 rows=1867 loops=1)
-> Materialize with deduplication (cost=618..618 rows=1867) (actual time=3..3 rows=1867 loops=1)
-> Filter: (mv.vote_id is not null) (cost=188 rows=1867) (actual time=0.0642..1.96 rows=1867 loops=1)
-> Covering index lookup on mv using IDX_A698A9387597D3FE72DCDAFC (member_id=262) (cost=188 rows=1867) (actual time=0.0617..1.63 rows=1867 loops=1)
-> Index lookup on mv using IDX_A698A9387597D3FE72DCDAFC (member_id=262, vote_id=`<subquery8>`.vote_id) (cost=0.25 rows=1) (actual time=0.0107..0.0111 rows=1 loops=1867)
-> Filter: (ps.vote_result = mv.`value`) (cost=82635 rows=17.7) (actual time=0.081..0.19 rows=139 loops=1867)
-> Index lookup on ps using IDX_725B27D172DCDAFC213C1059 (vote_id=`<subquery8>`.vote_id) (cost=82635 rows=177) (actual time=0.08..0.178 rows=212 loops=1867)
-> Covering index lookup on vht using IDX_1CF3D9D72DCDAFC (vote_id=`<subquery8>`.vote_id) (cost=0.379 rows=21.6) (actual time=0.00153..0.00302 rows=19.7 loops=258763)
-> Index lookup on ac using <auto_key1> (category_id=vht.vote_thematic_category_id) (cost=78.1..82.7 rows=10.2) (actual time=256e-6..370e-6 rows=1.85 loops=5.11e+6)
-> Materialize union CTE aggregatedcategories if needed with deduplication (cost=77.6..77.6 rows=203) (actual time=0.449..0.449 rows=203 loops=1)
-> Covering index scan on vote_thematic_category using IDX_942FC12A727ACA70 (cost=11.6 rows=108) (actual time=0.0554..0.0915 rows=108 loops=1)
-> Filter: (vote_thematic_category.parent_id is not null) (cost=19.3 rows=95) (actual time=0.0146..0.0986 rows=95 loops=1)
-> Covering index range scan on vote_thematic_category using IDX_942FC12A727ACA70 over (NULL < parent_id) (cost=19.3 rows=95) (actual time=0.0133..0.0847 rows=95 loops=1)
-> Index lookup on avpc using <auto_key0> (party_id=ps.party_id, thematic_category_id=vote_thematic_category.id) (cost=0.25..89.8 rows=359) (actual time=0.694..0.694 rows=0.99 loops=30456)
-> Materialize CTE allvotespercategory (cost=0..0 rows=0) (actual time=21114..21114 rows=30155 loops=1)
-> Group aggregate: count(distinct party_statistic.vote_id) (actual time=19313..21091 rows=30155 loops=1)
-> Sort: pvr.party_id, ac.aggregated_category_id (actual time=19313..19690 rows=14.8e+6 loops=1)
-> Stream results (cost=146e+6 rows=1.45e+9) (actual time=0.557..6634 rows=14.8e+6 loops=1)
-> Nested loop inner join (cost=146e+6 rows=1.45e+9) (actual time=0.555..5529 rows=14.8e+6 loops=1)
-> Nested loop inner join (cost=747175 rows=7.13e+6) (actual time=0.551..1591 rows=8e+6 loops=1)
-> Nested loop inner join (cost=33660 rows=330534) (actual time=0.542..72.8 rows=396565 loops=1)
-> Filter: (`<subquery16>`.vote_id is not null) (cost=618..0.00963 rows=1867) (actual time=0.525..1.22 rows=1867 loops=1)
-> Table scan on <subquery16> (cost=618..644 rows=1867) (actual time=0.524..1.06 rows=1867 loops=1)
-> Materialize with deduplication (cost=618..618 rows=1867) (actual time=0.524..0.524 rows=1867 loops=1)
-> Filter: (mv.vote_id is not null) (cost=188 rows=1867) (actual time=0.0429..0.352 rows=1867 loops=1)
-> Covering index lookup on mv using IDX_A698A9387597D3FE72DCDAFC (member_id=262) (cost=188 rows=1867) (actual time=0.0417..0.28 rows=1867 loops=1)
-> Covering index lookup on ps using IDX_725B27D172DCDAFC213C1059 (vote_id=`<subquery16>`.vote_id) (cost=625 rows=177) (actual time=0.00961..0.0306 rows=212 loops=1867)
-> Covering index lookup on vht using IDX_1CF3D9D72DCDAFC (vote_id=`<subquery16>`.vote_id) (cost=0.27 rows=21.6) (actual time=0.00153..0.00305 rows=20.2 loops=396565)
-> Index lookup on ac using <auto_key1> (category_id=vht.vote_thematic_category_id) (cost=78.1..82.7 rows=10.2) (actual time=256e-6..370e-6 rows=1.85 loops=8e+6)
-> Materialize union CTE aggregatedcategories if needed (query plan printed elsewhere) with deduplication (cost=77.6..77.6 rows=203) (never executed)
I have a complexe query to make statistics on few tables.
I need, for each party and each thematic the number of vote between a member and a party + the number of vote with the same value (member_vote.value = party_statistic.vote_result
)
The vote_thematic_category has a hierarchy structure and each parent has to count the number of vote matching they children
This query take 25 secondes, I try to optimise it, if somewone know how. Sorry this is a huge question.
This is the query I made
WITH MemberVotes AS (
-- Get the vote_ids and values for the specific member.
SELECT mv.vote_id, mv.value
FROM member_vote mv
WHERE mv.member_id = 262
),
PartyVoteResults AS (
-- Retrieve party vote results for only the votes that the specified member voted on.
SELECT ps.party_id, ps.vote_id, ps.vote_result
FROM party_statistic ps
WHERE EXISTS (SELECT 1 FROM MemberVotes mv WHERE mv.vote_id = ps.vote_id) -- Filter by vote_ids from MemberVotes
),
MatchedVotes AS (
-- Find votes where the member s vote matches the party s vote, and get the thematic category.
SELECT pvr.party_id, vht.vote_thematic_category_id, mv.vote_id
FROM PartyVoteResults pvr
JOIN MemberVotes mv ON pvr.vote_id = mv.vote_id AND pvr.vote_result = mv.value -- Match on vote_id and value
JOIN vote_has_thematic vht ON pvr.vote_id = vht.vote_id -- Get thematic category
),
AggregatedCategories AS (
-- Create a hierarchy of thematic categories, including parent categories.
SELECT id AS category_id, id AS aggregated_category_id FROM vote_thematic_category
UNION -- Combine root categories and child categories
SELECT id AS category_id, parent_id AS aggregated_category_id FROM vote_thematic_category WHERE parent_id IS NOT NULL -- Get child/parent link
),
AggregatedMatchedVotes AS (
-- Associate matched votes with their aggregated thematic category.
SELECT mv.party_id, ac.aggregated_category_id AS thematic_category_id, mv.vote_id
FROM MatchedVotes mv
JOIN AggregatedCategories ac ON mv.vote_thematic_category_id = ac.category_id -- Use the aggregated category
),
AggregatedCounts AS (
-- Count the number of distinct matching votes for each party and aggregated thematic category.
SELECT party_id, thematic_category_id, COUNT(DISTINCT vote_id) AS matching_votes
FROM AggregatedMatchedVotes
GROUP BY party_id, thematic_category_id -- Group by party and thematic category
),
AllVotesPerCategory AS (
-- Count the total number of distinct votes for each party and aggregated thematic category.
SELECT pvr.party_id, ac.aggregated_category_id AS thematic_category_id, COUNT(DISTINCT pvr.vote_id) AS total_votes
FROM PartyVoteResults pvr
JOIN vote_has_thematic vht ON pvr.vote_id = vht.vote_id
JOIN AggregatedCategories ac ON vht.vote_thematic_category_id = ac.category_id
GROUP BY pvr.party_id, ac.aggregated_category_id -- Group by party and thematic category
)
SELECT
ps.party_id,
tc.id AS thematic_category_id,
COALESCE(ac.matching_votes, 0) AS matching_votes, -- Ensure 0 for no matches
COALESCE(avpc.total_votes, 0) AS total_votes -- Ensure 0 for no votes
FROM (SELECT DISTINCT party_id FROM party_statistic) ps
CROSS JOIN (SELECT id FROM vote_thematic_category) tc -- Include all thematic categories
LEFT JOIN AggregatedCounts ac ON ps.party_id = ac.party_id AND tc.id = ac.thematic_category_id
LEFT JOIN AllVotesPerCategory avpc ON ps.party_id = avpc.party_id AND tc.id = avpc.thematic_category_id
ORDER BY ps.party_id, tc.id;
This is the tables :
member_vote
+------+-----------+---------+-------+
| id | member_id | vote_id | value |
+------+-----------+---------+-------+
| 262 | 262 | 1 | FOR |
| 1014 | 262 | 2 | FOR |
| 1765 | 262 | 3 | FOR |
| 2516 | 262 | 4 | FOR |
| 3267 | 262 | 5 | FOR |
+------+-----------+---------+-------+
vote
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
vote_thematic_category
+----+-----------+
| id | parent_id |
+----+-----------+
| 1 | NULL |
| 2 | NULL |
| 14 | 1 |
| 15 | 1 |
| 24 | 2 |
| 25 | 2 |
+----+-----------+
vote_has_thematic
+---------+---------------------------+
| vote_id | vote_thematic_category_id |
+---------+---------------------------+
| 1 | 14 |
| 1 | 15 |
| 2 | 1 |
| 2 | 14 |
| 2 | 15 |
| 3 | 1 |
+---------+---------------------------+
party_statistic;
+-----+----------+---------+--------------+
| id | party_id | vote_id | vote_result |
+-----+----------+---------+--------------+
| 1 | 1 | 1 | NULL |
| 2 | 2 | 1 | FOR |
| 220 | 1 | 2 | NULL |
| 221 | 2 | 2 | FOR |
| 439 | 1 | 3 | NULL |
| 440 | 2 | 3 | FOR |
| 658 | 1 | 4 | FOR |
| 659 | 2 | 4 | DID_NOT_VOTE |
| 877 | 1 | 5 | FOR |
| 878 | 2 | 5 | FOR |
+-----+----------+---------+--------------+
result
+----------+----------------------+----------------+-------------+
| party_id | thematic_category_id | matching_votes | total_votes |
+----------+----------------------+----------------+-------------+
| 1 | 1 | 0 | 3 |
| 1 | 2 | 0 | 0 |
| 1 | 14 | 0 | 2 |
| 1 | 15 | 0 | 2 |
| 1 | 24 | 0 | 0 |
| 1 | 25 | 0 | 0 |
| 2 | 1 | 3 | 3 |
| 2 | 2 | 0 | 0 |
| 2 | 14 | 2 | 2 |
| 2 | 15 | 2 | 2 |
| 2 | 24 | 0 | 0 |
| 2 | 25 | 0 | 0 |
+----------+----------------------+----------------+-------------+
The create tables:
| member_vote | CREATE TABLE `member_vote` (
`id` int NOT NULL AUTO_INCREMENT,
`member_id` int DEFAULT NULL,
`vote_id` int DEFAULT NULL,
`value` varchar(12) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_A698A9387597D3FE` (`member_id`),
KEY `IDX_A698A93872DCDAFC` (`vote_id`),
KEY `IDX_A698A9381D775834` (`value`),
KEY `IDX_A698A9387597D3FE72DCDAFC` (`member_id`,`vote_id`),
CONSTRAINT `FK_A698A93872DCDAFC` FOREIGN KEY (`vote_id`) REFERENCES `vote` (`id`) ON DELETE CASCADE,
CONSTRAINT `FK_A698A9387597D3FE` FOREIGN KEY (`member_id`) REFERENCES `member` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1319272 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| vote | CREATE TABLE `vote` (
`id` int NOT NULL AUTO_INCREMENT,
`official_id` int NOT NULL,
`is_featured` tinyint(1) NOT NULL,
`title` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
`reference` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`procedure_reference` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`description` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`summary_link` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`vote_date` datetime NOT NULL COMMENT '(DC2Type:datetime_immutable)',
PRIMARY KEY (`id`),
KEY `IDX_5A108564A9BA2FD0` (`is_featured`)
) ENGINE=InnoDB AUTO_INCREMENT=1868 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| vote_thematic_category | CREATE TABLE `vote_thematic_category` (
`id` int NOT NULL AUTO_INCREMENT,
`parent_id` int DEFAULT NULL,
`label` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`prompt` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_942FC12A727ACA70` (`parent_id`),
KEY `IDX_942FC12ABF396750727ACA70` (`id`,`parent_id`),
CONSTRAINT `FK_942FC12A727ACA70` FOREIGN KEY (`parent_id`) REFERENCES `vote_thematic_category` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=109 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| vote_has_thematic | CREATE TABLE `vote_has_thematic` (
`vote_id` int NOT NULL,
`vote_thematic_category_id` int NOT NULL,
PRIMARY KEY (`vote_id`,`vote_thematic_category_id`),
KEY `IDX_1CF3D9D72DCDAFC` (`vote_id`),
KEY `IDX_1CF3D9D2AABDDC4` (`vote_thematic_category_id`),
CONSTRAINT `FK_1CF3D9D2AABDDC4` FOREIGN KEY (`vote_thematic_category_id`) REFERENCES `vote_thematic_category` (`id`) ON DELETE CASCADE,
CONSTRAINT `FK_1CF3D9D72DCDAFC` FOREIGN KEY (`vote_id`) REFERENCES `vote` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| party_statistic | CREATE TABLE `party_statistic` (
`id` int NOT NULL AUTO_INCREMENT,
`party_id` int DEFAULT NULL,
`vote_id` int DEFAULT NULL,
`vote_result` varchar(12) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`stats` json NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_725B27D1213C1059` (`party_id`),
KEY `IDX_725B27D172DCDAFC` (`vote_id`),
KEY `IDX_725B27D172DCDAFC213C1059` (`vote_id`,`party_id`),
CONSTRAINT `FK_725B27D1213C1059` FOREIGN KEY (`party_id`) REFERENCES `party` (`id`) ON DELETE CASCADE,
CONSTRAINT `FK_725B27D172DCDAFC` FOREIGN KEY (`vote_id`) REFERENCES `vote` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=396566 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
And the explain query
+----+--------------+------------------------+------------+--------+---------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------------------+------------+--------+---------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------+------+----------+----------------------------------------------+
| 1 | PRIMARY | vote_thematic_category | NULL | index | NULL | IDX_942FC12A727ACA70 | 5 | NULL | 6 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using join buffer (hash join) |
| 1 | PRIMARY | <derived4> | NULL | ref | <auto_key0> | <auto_key0> | 10 | ps.party_id,vote.vote_thematic_category.id | 2 | 100.00 | NULL |
| 1 | PRIMARY | <derived14> | NULL | ref | <auto_key0> | <auto_key0> | 10 | ps.party_id,vote.vote_thematic_category.id | 2 | 100.00 | NULL |
| 14 | DERIVED | vht | NULL | index | PRIMARY,IDX_1CF3D9D72DCDAFC,IDX_1CF3D9D2AABDDC4 | IDX_1CF3D9D72DCDAFC | 4 | NULL | 6 | 100.00 | Using index; Using temporary; Using filesort |
| 14 | DERIVED | mv | NULL | ref | IDX_A698A9387597D3FE,IDX_A698A93872DCDAFC,IDX_A698A9387597D3FE72DCDAFC | IDX_A698A93872DCDAFC | 5 | vote.vht.vote_id | 1 | 100.00 | Using where; Start temporary |
| 14 | DERIVED | ps | NULL | ref | IDX_725B27D172DCDAFC,IDX_725B27D172DCDAFC213C1059 | IDX_725B27D172DCDAFC213C1059 | 5 | vote.vht.vote_id | 1 | 100.00 | Using index; End temporary |
| 14 | DERIVED | <derived11> | NULL | ref | <auto_key1> | <auto_key1> | 4 | vote.vht.vote_thematic_category_id | 2 | 100.00 | NULL |
| 4 | DERIVED | mv | NULL | ref | IDX_A698A9387597D3FE,IDX_A698A93872DCDAFC,IDX_A698A9387597D3FE72DCDAFC,IDX_A698A9381D775834 | IDX_A698A9387597D3FE | 5 | const | 5 | 100.00 | Using where; Using temporary; Using filesort |
| 4 | DERIVED | ps | NULL | ref | IDX_725B27D172DCDAFC,IDX_725B27D172DCDAFC213C1059 | IDX_725B27D172DCDAFC213C1059 | 5 | vote.mv.vote_id | 1 | 10.00 | Using where |
| 4 | DERIVED | mv | NULL | ref | IDX_A698A9387597D3FE,IDX_A698A93872DCDAFC,IDX_A698A9387597D3FE72DCDAFC | IDX_A698A93872DCDAFC | 5 | vote.mv.vote_id | 1 | 100.00 | Using where; FirstMatch(ps) |
| 4 | DERIVED | <derived11> | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using join buffer (hash join) |
| 4 | DERIVED | vht | NULL | eq_ref | PRIMARY,IDX_1CF3D9D72DCDAFC,IDX_1CF3D9D2AABDDC4 | PRIMARY | 8 | vote.mv.vote_id,ac.category_id | 1 | 100.00 | Using index |
| 11 | DERIVED | vote_thematic_category | NULL | index | NULL | IDX_942FC12A727ACA70 | 5 | NULL | 6 | 100.00 | Using index |
| 12 | UNION | vote_thematic_category | NULL | range | IDX_942FC12A727ACA70,IDX_942FC12ABF396750727ACA70 | IDX_942FC12A727ACA70 | 5 | NULL | 4 | 100.00 | Using where; Using index |
| 13 | UNION RESULT | <union11,12> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
| 2 | DERIVED | party_statistic | NULL | index | IDX_725B27D1213C1059,IDX_725B27D172DCDAFC213C1059 | IDX_725B27D1213C1059 | 5 | NULL | 10 | 100.00 | Using index |
+----+--------------+------------------------+------------+--------+---------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------+------+----------+----------------------------------------------+
And the EXPLAIN ANALYZE
| -> Sort: ps.party_id, tc.thematic_category_id (actual time=34567..34568 rows=30456 loops=1)
-> Stream results (cost=110e+6 rows=0) (actual time=34509..34558 rows=30456 loops=1)
-> Nested loop left join (cost=110e+6 rows=0) (actual time=34509..34553 rows=30456 loops=1)
-> Nested loop left join (cost=307832 rows=0) (actual time=13395..13418 rows=30456 loops=1)
-> Inner hash join (no condition) (cost=3409 rows=33912) (actual time=3.39..4.07 rows=30456 loops=1)
-> Table scan on ps (cost=245..251 rows=314) (actual time=3.26..3.28 rows=282 loops=1)
-> Materialize (cost=245..245 rows=314) (actual time=3.25..3.25 rows=282 loops=1)
-> Covering index skip scan for deduplication on party_statistic using IDX_725B27D1213C1059 (cost=173 rows=314) (actual time=0.0835..3.19 rows=282 loops=1)
-> Hash
-> Covering index scan on vote_thematic_category using IDX_942FC12A727ACA70 (cost=11.6 rows=108) (actual time=0.0741..0.094 rows=108 loops=1)
-> Index lookup on ac using <auto_key0> (party_id=ps.party_id, thematic_category_id=vote_thematic_category.id) (cost=0.25..8.98 rows=35.9) (actual time=0.44..0.44 rows=0.972 loops=30456)
-> Materialize CTE aggregatedcounts (cost=0..0 rows=0) (actual time=13391..13391 rows=29598 loops=1)
-> Group aggregate: count(distinct member_vote.vote_id) (actual time=12320..13375 rows=29598 loops=1)
-> Sort: aggregatedmatchedvotes.party_id, aggregatedmatchedvotes.thematic_category_id (actual time=12320..12541 rows=9.44e+6 loops=1)
-> Stream results (cost=14.7e+6 rows=145e+6) (actual time=4.29..5196 rows=9.44e+6 loops=1)
-> Nested loop inner join (cost=14.7e+6 rows=145e+6) (actual time=4.28..3878 rows=9.44e+6 loops=1)
-> Nested loop inner join (cost=187692 rows=713469) (actual time=3.82..1367 rows=5.11e+6 loops=1)
-> Nested loop inner join (cost=116340 rows=33053) (actual time=3.75..389 rows=258763 loops=1)
-> Nested loop inner join (cost=654 rows=1867) (actual time=3.1..24.3 rows=1867 loops=1)
-> Filter: (`<subquery8>`.vote_id is not null) (cost=618..0.0963 rows=1867) (actual time=3..3.39 rows=1867 loops=1)
-> Table scan on <subquery8> (cost=618..644 rows=1867) (actual time=3..3.29 rows=1867 loops=1)
-> Materialize with deduplication (cost=618..618 rows=1867) (actual time=3..3 rows=1867 loops=1)
-> Filter: (mv.vote_id is not null) (cost=188 rows=1867) (actual time=0.0642..1.96 rows=1867 loops=1)
-> Covering index lookup on mv using IDX_A698A9387597D3FE72DCDAFC (member_id=262) (cost=188 rows=1867) (actual time=0.0617..1.63 rows=1867 loops=1)
-> Index lookup on mv using IDX_A698A9387597D3FE72DCDAFC (member_id=262, vote_id=`<subquery8>`.vote_id) (cost=0.25 rows=1) (actual time=0.0107..0.0111 rows=1 loops=1867)
-> Filter: (ps.vote_result = mv.`value`) (cost=82635 rows=17.7) (actual time=0.081..0.19 rows=139 loops=1867)
-> Index lookup on ps using IDX_725B27D172DCDAFC213C1059 (vote_id=`<subquery8>`.vote_id) (cost=82635 rows=177) (actual time=0.08..0.178 rows=212 loops=1867)
-> Covering index lookup on vht using IDX_1CF3D9D72DCDAFC (vote_id=`<subquery8>`.vote_id) (cost=0.379 rows=21.6) (actual time=0.00153..0.00302 rows=19.7 loops=258763)
-> Index lookup on ac using <auto_key1> (category_id=vht.vote_thematic_category_id) (cost=78.1..82.7 rows=10.2) (actual time=256e-6..370e-6 rows=1.85 loops=5.11e+6)
-> Materialize union CTE aggregatedcategories if needed with deduplication (cost=77.6..77.6 rows=203) (actual time=0.449..0.449 rows=203 loops=1)
-> Covering index scan on vote_thematic_category using IDX_942FC12A727ACA70 (cost=11.6 rows=108) (actual time=0.0554..0.0915 rows=108 loops=1)
-> Filter: (vote_thematic_category.parent_id is not null) (cost=19.3 rows=95) (actual time=0.0146..0.0986 rows=95 loops=1)
-> Covering index range scan on vote_thematic_category using IDX_942FC12A727ACA70 over (NULL < parent_id) (cost=19.3 rows=95) (actual time=0.0133..0.0847 rows=95 loops=1)
-> Index lookup on avpc using <auto_key0> (party_id=ps.party_id, thematic_category_id=vote_thematic_category.id) (cost=0.25..89.8 rows=359) (actual time=0.694..0.694 rows=0.99 loops=30456)
-> Materialize CTE allvotespercategory (cost=0..0 rows=0) (actual time=21114..21114 rows=30155 loops=1)
-> Group aggregate: count(distinct party_statistic.vote_id) (actual time=19313..21091 rows=30155 loops=1)
-> Sort: pvr.party_id, ac.aggregated_category_id (actual time=19313..19690 rows=14.8e+6 loops=1)
-> Stream results (cost=146e+6 rows=1.45e+9) (actual time=0.557..6634 rows=14.8e+6 loops=1)
-> Nested loop inner join (cost=146e+6 rows=1.45e+9) (actual time=0.555..5529 rows=14.8e+6 loops=1)
-> Nested loop inner join (cost=747175 rows=7.13e+6) (actual time=0.551..1591 rows=8e+6 loops=1)
-> Nested loop inner join (cost=33660 rows=330534) (actual time=0.542..72.8 rows=396565 loops=1)
-> Filter: (`<subquery16>`.vote_id is not null) (cost=618..0.00963 rows=1867) (actual time=0.525..1.22 rows=1867 loops=1)
-> Table scan on <subquery16> (cost=618..644 rows=1867) (actual time=0.524..1.06 rows=1867 loops=1)
-> Materialize with deduplication (cost=618..618 rows=1867) (actual time=0.524..0.524 rows=1867 loops=1)
-> Filter: (mv.vote_id is not null) (cost=188 rows=1867) (actual time=0.0429..0.352 rows=1867 loops=1)
-> Covering index lookup on mv using IDX_A698A9387597D3FE72DCDAFC (member_id=262) (cost=188 rows=1867) (actual time=0.0417..0.28 rows=1867 loops=1)
-> Covering index lookup on ps using IDX_725B27D172DCDAFC213C1059 (vote_id=`<subquery16>`.vote_id) (cost=625 rows=177) (actual time=0.00961..0.0306 rows=212 loops=1867)
-> Covering index lookup on vht using IDX_1CF3D9D72DCDAFC (vote_id=`<subquery16>`.vote_id) (cost=0.27 rows=21.6) (actual time=0.00153..0.00305 rows=20.2 loops=396565)
-> Index lookup on ac using <auto_key1> (category_id=vht.vote_thematic_category_id) (cost=78.1..82.7 rows=10.2) (actual time=256e-6..370e-6 rows=1.85 loops=8e+6)
-> Materialize union CTE aggregatedcategories if needed (query plan printed elsewhere) with deduplication (cost=77.6..77.6 rows=203) (never executed)
Share
Improve this question
edited Feb 17 at 9:44
lala
asked Feb 17 at 9:24
lalalala
1337 bronze badges
4
- Check the explain plan and identify where the bottlenecks are...asking us to digest your entire plan and provide an answer is too broad a question IMHO – Tim Biegeleisen Commented Feb 17 at 9:25
- 1 Can you please explain what AggregatedCategories CTE is supposed to do? Based on description in the question and the comments around the CTE, it is supposed to be a recursive one, but it does not seem to be. – Shadow Commented Feb 17 at 11:45
- @Shadow AggregatedCategories flat the thematic_category tree. – lala Commented Feb 17 at 12:04
- make sure table statistics are up to date. – ticktalk Commented Feb 17 at 12:56
1 Answer
Reset to default 0(Not yet an Answer.)
If I read it correctly, the two "Group aggregate count(distinct...)" steps consume nearly all the time.
Group aggregate: count(distinct member_vote.vote_id)
Group aggregate: count(distinct party_statistic.vote_id)
Can you explain what is going on there?
Is member_vote
a many-to-many mapping table? If so, get rid of id
and have
PRIMARY KEY(member_id, vote_id),
INDEX(vote_id, member_id)
本文标签: Mysql slow recursive query with many joinStack Overflow
版权声明:本文标题:Mysql slow recursive query with many join - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1740170560a2235475.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论