admin管理员组文章数量:1122846
I've been having some problems on my wordpress site recently whereby it seems to become unavailable intraday requiring a reboot of the server. Prior to this I'd been seeing high CPU alerts showing from our monitoring software. Upon superficial investigation I could see that 200% of the CPU was showing against mysqld so I started to investigate.
We're running wordpress version 6.5.5 on Ubuntu 20.04 with mysql version 8.0.37-0ubuntu0.20.04.3.
A lot of what I have read so far suggested looking at slow running queries so I enabled slow query logging in the mysql config and ran pt-query-digest to get the top 3 queries. The number 1 query in the list seems to be the vast majority of the 'time' so I've pasted that query here. The full output is below though if it's of interest.
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'
The query seems completely innocent and I've no idea why such a query would apparently be a problem so I decided to run it manually. No surprise it took 0.11 seconds to run and returned 891 rows. I'm not sure whether that's a lot but it certainly doesn't strike me as something that's the root cause of 200% CPU usage.
Other things I've done.
- Ran 'Show full Processlist'
This listed all the processes running, all of which were sleeping apart from the event_scheduler daemon and the query for the processlist.
I pulled the top wait events from the server (see at the bottom of the thread). I don't know whether anything stands out in that list but it's in picoseconds so they don't look super alarming to me...
Strangely top vs atop shows a different figure for CPU usage. ATop will show 200% CPU while top shows 51% CPU. Despite the large discrepancy 51% CPU usage by a database which should basically be quiet as a mouse seems unreasonable.
I ran perf on the process which seemed to show 10% of something being used MYSQLparse which doesn't seem particularly terrifying....
I'm afraid I'm about out of diagnostic things to run and I'm quite far from being a dba so I was hoping someone had come across a similar problem in the past and might give me a steer. Any help is very much appreciated.
# 1.1s user time, 360ms system time, 30.10M rss, 37.80M vsz
# Current date: Sat Jul 6 00:49:59 2024
# Hostname: vmi1329245.contaboserver
# Files: /var/log/mysql/mysql-slow.log
# Overall: 222 total, 6 unique, 0.01 QPS, 0.03x concurrency ______________
# Time range: 2024-07-05T16:13:10 to 2024-07-05T22:49:18
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 597s 2s 6s 3s 4s 648ms 2s
# Lock time 8s 1us 2s 37ms 14us 271ms 2us
# Rows sent 170.54k 0 891 786.65 874.75 281.19 874.75
# Rows examine 170.56k 0 891 786.72 874.75 281.00 874.75
# Query size 24.28k 71 7.25k 112.00 118.34 473.87 69.19
# Profile
# Rank Query ID Response time Calls R/Call V/M I
# ==== =============================== ============== ===== ====== ===== =
# 1 0x665212F6AA7A477A92F3B1B361... 537.8294 90.0% 196 2.7440 0.16 SELECT wp_options
# 2 0x1D1DA379D51206DB94350EA2AB... 29.2167 4.9% 13 2.2474 0.03 UPDATE wp_options
# 3 0xA756477EF057381E989E693FD5... 16.7384 2.8% 8 2.0923 0.01 SELECT wp_options
# MISC 0xMISC 13.5024 2.3% 5 2.7005 0.0 <3 ITEMS>
# Query 1: 0.01 QPS, 0.02x concurrency, ID 0x665212F6AA7A477A92F3B1B361FB0E5B at byte 62125
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.16
# Time range: 2024-07-05T16:13:10 to 2024-07-05T22:49:18
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 88 196
# Exec time 90 538s 2s 6s 3s 4s 663ms 3s
# Lock time 0 1ms 1us 31us 5us 13us 4us 3us
# Rows sent 99 170.54k 890 891 890.96 874.75 0 874.75
# Rows examine 99 170.54k 890 891 890.96 874.75 0 874.75
# Query size 55 13.59k 71 71 71 71 0 71
# String:
# Databases wordpress
# Hosts localhost
# Users wordpress
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+
# Tables
# SHOW TABLE STATUS FROM `wordpress` LIKE 'wp_options'\G
# SHOW CREATE TABLE `wordpress`.`wp_options`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G
mysql> SHOW FULL PROCESSLIST;
+-------+-----------------+-----------+-----------+---------+-------+------------------------+-----------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-----------------+-----------+-----------+---------+-------+------------------------+-----------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 24632 | Waiting on empty queue | NULL |
| 253 | root | localhost | wordpress | Query | 0 | init | SHOW FULL PROCESSLIST |
| 91836 | wordpress | localhost | wordpress | Sleep | 4 | | NULL |
| 91851 | wordpress | localhost | wordpress | Sleep | 1 | | NULL |
| 91854 | wordpress | localhost | wordpress | Sleep | 2 | | NULL |
| 91869 | wordpress | localhost | wordpress | Sleep | 7 | | NULL |
This is the output from the top wait events:
+---------------------------------------+------------------+
| EVENT_NAME | SUM_TIMER_WAIT |
+---------------------------------------+------------------+
| idle | 3498378321113000 |
| wait/io/table/sql/handler | 45837978934832 |
| wait/io/file/innodb/innodb_data_file | 4457937654128 |
| wait/io/file/innodb/innodb_temp_file | 795657355640 |
| wait/io/file/innodb/innodb_log_file | 543968177872 |
| wait/io/file/innodb/innodb_dblwr_file | 155007627600 |
| wait/io/file/sql/binlog | 41626354056 |
| wait/lock/table/sql/handler | 22670087944 |
| wait/io/file/sql/casetest | 2522910488 |
| wait/io/file/sql/slow_log | 2211966008 |
+---------------------------------------+------------------+
10 rows in set (0.11 sec)
9.51% mysqld [.] MYSQLparse
1.87% libc-2.31.so [.] __libc_malloc
1.72% mysqld [.] Lex_hash::get_hash_symbol
1.71% libstdc++.so.6.0.28 [.] std::_Rb_tree_increment
1.71% mysqld [.] dispatch_command
I've been having some problems on my wordpress site recently whereby it seems to become unavailable intraday requiring a reboot of the server. Prior to this I'd been seeing high CPU alerts showing from our monitoring software. Upon superficial investigation I could see that 200% of the CPU was showing against mysqld so I started to investigate.
We're running wordpress version 6.5.5 on Ubuntu 20.04 with mysql version 8.0.37-0ubuntu0.20.04.3.
A lot of what I have read so far suggested looking at slow running queries so I enabled slow query logging in the mysql config and ran pt-query-digest to get the top 3 queries. The number 1 query in the list seems to be the vast majority of the 'time' so I've pasted that query here. The full output is below though if it's of interest.
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'
The query seems completely innocent and I've no idea why such a query would apparently be a problem so I decided to run it manually. No surprise it took 0.11 seconds to run and returned 891 rows. I'm not sure whether that's a lot but it certainly doesn't strike me as something that's the root cause of 200% CPU usage.
Other things I've done.
- Ran 'Show full Processlist'
This listed all the processes running, all of which were sleeping apart from the event_scheduler daemon and the query for the processlist.
I pulled the top wait events from the server (see at the bottom of the thread). I don't know whether anything stands out in that list but it's in picoseconds so they don't look super alarming to me...
Strangely top vs atop shows a different figure for CPU usage. ATop will show 200% CPU while top shows 51% CPU. Despite the large discrepancy 51% CPU usage by a database which should basically be quiet as a mouse seems unreasonable.
I ran perf on the process which seemed to show 10% of something being used MYSQLparse which doesn't seem particularly terrifying....
I'm afraid I'm about out of diagnostic things to run and I'm quite far from being a dba so I was hoping someone had come across a similar problem in the past and might give me a steer. Any help is very much appreciated.
# 1.1s user time, 360ms system time, 30.10M rss, 37.80M vsz
# Current date: Sat Jul 6 00:49:59 2024
# Hostname: vmi1329245.contaboserver.net
# Files: /var/log/mysql/mysql-slow.log
# Overall: 222 total, 6 unique, 0.01 QPS, 0.03x concurrency ______________
# Time range: 2024-07-05T16:13:10 to 2024-07-05T22:49:18
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 597s 2s 6s 3s 4s 648ms 2s
# Lock time 8s 1us 2s 37ms 14us 271ms 2us
# Rows sent 170.54k 0 891 786.65 874.75 281.19 874.75
# Rows examine 170.56k 0 891 786.72 874.75 281.00 874.75
# Query size 24.28k 71 7.25k 112.00 118.34 473.87 69.19
# Profile
# Rank Query ID Response time Calls R/Call V/M I
# ==== =============================== ============== ===== ====== ===== =
# 1 0x665212F6AA7A477A92F3B1B361... 537.8294 90.0% 196 2.7440 0.16 SELECT wp_options
# 2 0x1D1DA379D51206DB94350EA2AB... 29.2167 4.9% 13 2.2474 0.03 UPDATE wp_options
# 3 0xA756477EF057381E989E693FD5... 16.7384 2.8% 8 2.0923 0.01 SELECT wp_options
# MISC 0xMISC 13.5024 2.3% 5 2.7005 0.0 <3 ITEMS>
# Query 1: 0.01 QPS, 0.02x concurrency, ID 0x665212F6AA7A477A92F3B1B361FB0E5B at byte 62125
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.16
# Time range: 2024-07-05T16:13:10 to 2024-07-05T22:49:18
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 88 196
# Exec time 90 538s 2s 6s 3s 4s 663ms 3s
# Lock time 0 1ms 1us 31us 5us 13us 4us 3us
# Rows sent 99 170.54k 890 891 890.96 874.75 0 874.75
# Rows examine 99 170.54k 890 891 890.96 874.75 0 874.75
# Query size 55 13.59k 71 71 71 71 0 71
# String:
# Databases wordpress
# Hosts localhost
# Users wordpress
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+
# Tables
# SHOW TABLE STATUS FROM `wordpress` LIKE 'wp_options'\G
# SHOW CREATE TABLE `wordpress`.`wp_options`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G
mysql> SHOW FULL PROCESSLIST;
+-------+-----------------+-----------+-----------+---------+-------+------------------------+-----------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-----------------+-----------+-----------+---------+-------+------------------------+-----------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 24632 | Waiting on empty queue | NULL |
| 253 | root | localhost | wordpress | Query | 0 | init | SHOW FULL PROCESSLIST |
| 91836 | wordpress | localhost | wordpress | Sleep | 4 | | NULL |
| 91851 | wordpress | localhost | wordpress | Sleep | 1 | | NULL |
| 91854 | wordpress | localhost | wordpress | Sleep | 2 | | NULL |
| 91869 | wordpress | localhost | wordpress | Sleep | 7 | | NULL |
This is the output from the top wait events:
+---------------------------------------+------------------+
| EVENT_NAME | SUM_TIMER_WAIT |
+---------------------------------------+------------------+
| idle | 3498378321113000 |
| wait/io/table/sql/handler | 45837978934832 |
| wait/io/file/innodb/innodb_data_file | 4457937654128 |
| wait/io/file/innodb/innodb_temp_file | 795657355640 |
| wait/io/file/innodb/innodb_log_file | 543968177872 |
| wait/io/file/innodb/innodb_dblwr_file | 155007627600 |
| wait/io/file/sql/binlog | 41626354056 |
| wait/lock/table/sql/handler | 22670087944 |
| wait/io/file/sql/casetest | 2522910488 |
| wait/io/file/sql/slow_log | 2211966008 |
+---------------------------------------+------------------+
10 rows in set (0.11 sec)
9.51% mysqld [.] MYSQLparse
1.87% libc-2.31.so [.] __libc_malloc
1.72% mysqld [.] Lex_hash::get_hash_symbol
1.71% libstdc++.so.6.0.28 [.] std::_Rb_tree_increment
1.71% mysqld [.] dispatch_command
Share
Improve this question
asked Jul 5, 2024 at 23:18
Phill AshPhill Ash
101
5
- So I thought I had identified the issue. One of our wordpress backup tools updraftplus had been attempting to backup our site. It looked like the backup hadn't happened for several months and I discovered that the remote storage wasn't accessible. I renewed the fee payment for it (it was on an old payment card) backed the site up manually and the CPU seemed to drop to normal. I just had a look at our monitoring today though and I can see several high CPU alerts. It's possible that this is the backup happening since it's been reinstated... investigating. – Phill Ash Commented Jul 6, 2024 at 12:25
- I have a feeling I might have found what's really creating the problem with the database. 14886 0.37s 3.91s -3084K 0K 0K 0K -- - 51 S 3 170% mysqld 23900 0.27s 2.30s 2048K 1940K 0K 0K -- - 1 R 2 100% apache2 24106 0.38s 1.96s 0K 4K 0K 0K -- - 1 R 3 93% apache2 23893 0.12s 2.08s 0K 0K 0K 4K -- - 1 S 1 87% apache2 23988 0.30s 1.74s 0K 0K 0K 0K -- - 1 R 3 81% apache2 There are another 10 apache processes all running 81% CPU. – Phill Ash Commented Jul 6, 2024 at 12:31
- I found this thread serverfault.com/questions/823121/… and followed it's guidance. I now have apache using php-fpm. Let's see if this helps. I'm avoiding changing it to the answer because other users have suggest it's merely kicking the can down the road whilst acknowledging it should be done anyway. Others suggest simply moving off apache onto nginx – Phill Ash Commented Jul 6, 2024 at 13:07
- Sadly it appears that the problem has just moved to php-fpm. Now I see mysqld using 200% CPU and several (5) php-fpm instances using 100% CPU. – Phill Ash Commented Jul 7, 2024 at 19:47
- Looks like some kind of nefarious activities... I could see a large number of connections on port 80. 200 odd at a time when the site isn't doing anything... Sadly just as I went to check what they were my lil one woke up so I had to sort that and by the time I was back CPU was back to normal and all the connections were gone. I'll have to be faster off the mark tomorrow. – Phill Ash Commented Jul 7, 2024 at 22:36
1 Answer
Reset to default 0So the issue was actually a DDOS attack on our site. I used urlsnarf to see what requests were being placed and there were basically hundreds of POST request from about 20-30 or so IP addresses. Seems quite small scale for what I understood is a typical DDOS attack. It also explains why it took a few days before the site ground to a halt. We have quite a few cores running on this host and it was maxing out 7 of them. I think in a larger scale attack it would have taken out the site a lot faster... We're behind cloudflare so maybe it was setup at that level to avoid their automatic detection...
As soon as I turned on the 'We're under attack mode' the POST requests stopped coming in and the server returned to normal.
Now I just need to figure out an early detection mechanism and automatic switch of the cloudflare under attack functionality.
本文标签:
版权声明:本文标题:performance - mysql running over 200% CPU according to atop - pt-query-digest suggests slow running query is a lookup in wp_opti 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1736301445a1931180.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论