admin管理员组

文章数量:1122846

I've been having some problems on my wordpress site recently whereby it seems to become unavailable intraday requiring a reboot of the server. Prior to this I'd been seeing high CPU alerts showing from our monitoring software. Upon superficial investigation I could see that 200% of the CPU was showing against mysqld so I started to investigate.

We're running wordpress version 6.5.5 on Ubuntu 20.04 with mysql version 8.0.37-0ubuntu0.20.04.3.

A lot of what I have read so far suggested looking at slow running queries so I enabled slow query logging in the mysql config and ran pt-query-digest to get the top 3 queries. The number 1 query in the list seems to be the vast majority of the 'time' so I've pasted that query here. The full output is below though if it's of interest.

SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'

The query seems completely innocent and I've no idea why such a query would apparently be a problem so I decided to run it manually. No surprise it took 0.11 seconds to run and returned 891 rows. I'm not sure whether that's a lot but it certainly doesn't strike me as something that's the root cause of 200% CPU usage.

Other things I've done.

  1. Ran 'Show full Processlist'

This listed all the processes running, all of which were sleeping apart from the event_scheduler daemon and the query for the processlist.

  1. I pulled the top wait events from the server (see at the bottom of the thread). I don't know whether anything stands out in that list but it's in picoseconds so they don't look super alarming to me...

  2. Strangely top vs atop shows a different figure for CPU usage. ATop will show 200% CPU while top shows 51% CPU. Despite the large discrepancy 51% CPU usage by a database which should basically be quiet as a mouse seems unreasonable.

  3. I ran perf on the process which seemed to show 10% of something being used MYSQLparse which doesn't seem particularly terrifying....

I'm afraid I'm about out of diagnostic things to run and I'm quite far from being a dba so I was hoping someone had come across a similar problem in the past and might give me a steer. Any help is very much appreciated.


# 1.1s user time, 360ms system time, 30.10M rss, 37.80M vsz
# Current date: Sat Jul  6 00:49:59 2024
# Hostname: vmi1329245.contaboserver
# Files: /var/log/mysql/mysql-slow.log
# Overall: 222 total, 6 unique, 0.01 QPS, 0.03x concurrency ______________
# Time range: 2024-07-05T16:13:10 to 2024-07-05T22:49:18
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time           597s      2s      6s      3s      4s   648ms      2s
# Lock time             8s     1us      2s    37ms    14us   271ms     2us
# Rows sent        170.54k       0     891  786.65  874.75  281.19  874.75
# Rows examine     170.56k       0     891  786.72  874.75  281.00  874.75
# Query size        24.28k      71   7.25k  112.00  118.34  473.87   69.19

# Profile
# Rank Query ID                        Response time  Calls R/Call V/M   I
# ==== =============================== ============== ===== ====== ===== =
#    1 0x665212F6AA7A477A92F3B1B361... 537.8294 90.0%   196 2.7440  0.16 SELECT wp_options
#    2 0x1D1DA379D51206DB94350EA2AB...  29.2167  4.9%    13 2.2474  0.03 UPDATE wp_options
#    3 0xA756477EF057381E989E693FD5...  16.7384  2.8%     8 2.0923  0.01 SELECT wp_options
# MISC 0xMISC                           13.5024  2.3%     5 2.7005   0.0 <3 ITEMS>

# Query 1: 0.01 QPS, 0.02x concurrency, ID 0x665212F6AA7A477A92F3B1B361FB0E5B at byte 62125
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.16
# Time range: 2024-07-05T16:13:10 to 2024-07-05T22:49:18
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count         88     196
# Exec time     90    538s      2s      6s      3s      4s   663ms      3s
# Lock time      0     1ms     1us    31us     5us    13us     4us     3us
# Rows sent     99 170.54k     890     891  890.96  874.75       0  874.75
# Rows examine  99 170.54k     890     891  890.96  874.75       0  874.75
# Query size    55  13.59k      71      71      71      71       0      71
# String:
# Databases    wordpress
# Hosts        localhost
# Users        wordpress
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms
# 100ms
#    1s  ################################################################
#  10s+
# Tables
#    SHOW TABLE STATUS FROM `wordpress` LIKE 'wp_options'\G
#    SHOW CREATE TABLE `wordpress`.`wp_options`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G

mysql> SHOW FULL PROCESSLIST;
+-------+-----------------+-----------+-----------+---------+-------+------------------------+-----------------------+
| Id    | User            | Host      | db        | Command | Time  | State                  | Info                  |
+-------+-----------------+-----------+-----------+---------+-------+------------------------+-----------------------+
|     5 | event_scheduler | localhost | NULL      | Daemon  | 24632 | Waiting on empty queue | NULL                  |
|   253 | root            | localhost | wordpress | Query   |     0 | init                   | SHOW FULL PROCESSLIST |
| 91836 | wordpress       | localhost | wordpress | Sleep   |     4 |                        | NULL                  |
| 91851 | wordpress       | localhost | wordpress | Sleep   |     1 |                        | NULL                  |
| 91854 | wordpress       | localhost | wordpress | Sleep   |     2 |                        | NULL                  |
| 91869 | wordpress       | localhost | wordpress | Sleep   |     7 |                        | NULL                  |

This is the output from the top wait events: 

+---------------------------------------+------------------+
| EVENT_NAME                            | SUM_TIMER_WAIT   |
+---------------------------------------+------------------+
| idle                                  | 3498378321113000 |
| wait/io/table/sql/handler             |   45837978934832 |
| wait/io/file/innodb/innodb_data_file  |    4457937654128 |
| wait/io/file/innodb/innodb_temp_file  |     795657355640 |
| wait/io/file/innodb/innodb_log_file   |     543968177872 |
| wait/io/file/innodb/innodb_dblwr_file |     155007627600 |
| wait/io/file/sql/binlog               |      41626354056 |
| wait/lock/table/sql/handler           |      22670087944 |
| wait/io/file/sql/casetest             |       2522910488 |
| wait/io/file/sql/slow_log             |       2211966008 |
+---------------------------------------+------------------+
10 rows in set (0.11 sec)

   9.51%  mysqld                          [.] MYSQLparse
   1.87%  libc-2.31.so                    [.] __libc_malloc
   1.72%  mysqld                          [.] Lex_hash::get_hash_symbol
   1.71%  libstdc++.so.6.0.28             [.] std::_Rb_tree_increment
   1.71%  mysqld                          [.] dispatch_command

I've been having some problems on my wordpress site recently whereby it seems to become unavailable intraday requiring a reboot of the server. Prior to this I'd been seeing high CPU alerts showing from our monitoring software. Upon superficial investigation I could see that 200% of the CPU was showing against mysqld so I started to investigate.

We're running wordpress version 6.5.5 on Ubuntu 20.04 with mysql version 8.0.37-0ubuntu0.20.04.3.

A lot of what I have read so far suggested looking at slow running queries so I enabled slow query logging in the mysql config and ran pt-query-digest to get the top 3 queries. The number 1 query in the list seems to be the vast majority of the 'time' so I've pasted that query here. The full output is below though if it's of interest.

SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'

The query seems completely innocent and I've no idea why such a query would apparently be a problem so I decided to run it manually. No surprise it took 0.11 seconds to run and returned 891 rows. I'm not sure whether that's a lot but it certainly doesn't strike me as something that's the root cause of 200% CPU usage.

Other things I've done.

  1. Ran 'Show full Processlist'

This listed all the processes running, all of which were sleeping apart from the event_scheduler daemon and the query for the processlist.

  1. I pulled the top wait events from the server (see at the bottom of the thread). I don't know whether anything stands out in that list but it's in picoseconds so they don't look super alarming to me...

  2. Strangely top vs atop shows a different figure for CPU usage. ATop will show 200% CPU while top shows 51% CPU. Despite the large discrepancy 51% CPU usage by a database which should basically be quiet as a mouse seems unreasonable.

  3. I ran perf on the process which seemed to show 10% of something being used MYSQLparse which doesn't seem particularly terrifying....

I'm afraid I'm about out of diagnostic things to run and I'm quite far from being a dba so I was hoping someone had come across a similar problem in the past and might give me a steer. Any help is very much appreciated.


# 1.1s user time, 360ms system time, 30.10M rss, 37.80M vsz
# Current date: Sat Jul  6 00:49:59 2024
# Hostname: vmi1329245.contaboserver.net
# Files: /var/log/mysql/mysql-slow.log
# Overall: 222 total, 6 unique, 0.01 QPS, 0.03x concurrency ______________
# Time range: 2024-07-05T16:13:10 to 2024-07-05T22:49:18
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time           597s      2s      6s      3s      4s   648ms      2s
# Lock time             8s     1us      2s    37ms    14us   271ms     2us
# Rows sent        170.54k       0     891  786.65  874.75  281.19  874.75
# Rows examine     170.56k       0     891  786.72  874.75  281.00  874.75
# Query size        24.28k      71   7.25k  112.00  118.34  473.87   69.19

# Profile
# Rank Query ID                        Response time  Calls R/Call V/M   I
# ==== =============================== ============== ===== ====== ===== =
#    1 0x665212F6AA7A477A92F3B1B361... 537.8294 90.0%   196 2.7440  0.16 SELECT wp_options
#    2 0x1D1DA379D51206DB94350EA2AB...  29.2167  4.9%    13 2.2474  0.03 UPDATE wp_options
#    3 0xA756477EF057381E989E693FD5...  16.7384  2.8%     8 2.0923  0.01 SELECT wp_options
# MISC 0xMISC                           13.5024  2.3%     5 2.7005   0.0 <3 ITEMS>

# Query 1: 0.01 QPS, 0.02x concurrency, ID 0x665212F6AA7A477A92F3B1B361FB0E5B at byte 62125
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.16
# Time range: 2024-07-05T16:13:10 to 2024-07-05T22:49:18
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count         88     196
# Exec time     90    538s      2s      6s      3s      4s   663ms      3s
# Lock time      0     1ms     1us    31us     5us    13us     4us     3us
# Rows sent     99 170.54k     890     891  890.96  874.75       0  874.75
# Rows examine  99 170.54k     890     891  890.96  874.75       0  874.75
# Query size    55  13.59k      71      71      71      71       0      71
# String:
# Databases    wordpress
# Hosts        localhost
# Users        wordpress
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms
# 100ms
#    1s  ################################################################
#  10s+
# Tables
#    SHOW TABLE STATUS FROM `wordpress` LIKE 'wp_options'\G
#    SHOW CREATE TABLE `wordpress`.`wp_options`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G

mysql> SHOW FULL PROCESSLIST;
+-------+-----------------+-----------+-----------+---------+-------+------------------------+-----------------------+
| Id    | User            | Host      | db        | Command | Time  | State                  | Info                  |
+-------+-----------------+-----------+-----------+---------+-------+------------------------+-----------------------+
|     5 | event_scheduler | localhost | NULL      | Daemon  | 24632 | Waiting on empty queue | NULL                  |
|   253 | root            | localhost | wordpress | Query   |     0 | init                   | SHOW FULL PROCESSLIST |
| 91836 | wordpress       | localhost | wordpress | Sleep   |     4 |                        | NULL                  |
| 91851 | wordpress       | localhost | wordpress | Sleep   |     1 |                        | NULL                  |
| 91854 | wordpress       | localhost | wordpress | Sleep   |     2 |                        | NULL                  |
| 91869 | wordpress       | localhost | wordpress | Sleep   |     7 |                        | NULL                  |

This is the output from the top wait events: 

+---------------------------------------+------------------+
| EVENT_NAME                            | SUM_TIMER_WAIT   |
+---------------------------------------+------------------+
| idle                                  | 3498378321113000 |
| wait/io/table/sql/handler             |   45837978934832 |
| wait/io/file/innodb/innodb_data_file  |    4457937654128 |
| wait/io/file/innodb/innodb_temp_file  |     795657355640 |
| wait/io/file/innodb/innodb_log_file   |     543968177872 |
| wait/io/file/innodb/innodb_dblwr_file |     155007627600 |
| wait/io/file/sql/binlog               |      41626354056 |
| wait/lock/table/sql/handler           |      22670087944 |
| wait/io/file/sql/casetest             |       2522910488 |
| wait/io/file/sql/slow_log             |       2211966008 |
+---------------------------------------+------------------+
10 rows in set (0.11 sec)

   9.51%  mysqld                          [.] MYSQLparse
   1.87%  libc-2.31.so                    [.] __libc_malloc
   1.72%  mysqld                          [.] Lex_hash::get_hash_symbol
   1.71%  libstdc++.so.6.0.28             [.] std::_Rb_tree_increment
   1.71%  mysqld                          [.] dispatch_command
Share Improve this question asked Jul 5, 2024 at 23:18 Phill AshPhill Ash 101 5
  • So I thought I had identified the issue. One of our wordpress backup tools updraftplus had been attempting to backup our site. It looked like the backup hadn't happened for several months and I discovered that the remote storage wasn't accessible. I renewed the fee payment for it (it was on an old payment card) backed the site up manually and the CPU seemed to drop to normal. I just had a look at our monitoring today though and I can see several high CPU alerts. It's possible that this is the backup happening since it's been reinstated... investigating. – Phill Ash Commented Jul 6, 2024 at 12:25
  • I have a feeling I might have found what's really creating the problem with the database. 14886 0.37s 3.91s -3084K 0K 0K 0K -- - 51 S 3 170% mysqld 23900 0.27s 2.30s 2048K 1940K 0K 0K -- - 1 R 2 100% apache2 24106 0.38s 1.96s 0K 4K 0K 0K -- - 1 R 3 93% apache2 23893 0.12s 2.08s 0K 0K 0K 4K -- - 1 S 1 87% apache2 23988 0.30s 1.74s 0K 0K 0K 0K -- - 1 R 3 81% apache2 There are another 10 apache processes all running 81% CPU. – Phill Ash Commented Jul 6, 2024 at 12:31
  • I found this thread serverfault.com/questions/823121/… and followed it's guidance. I now have apache using php-fpm. Let's see if this helps. I'm avoiding changing it to the answer because other users have suggest it's merely kicking the can down the road whilst acknowledging it should be done anyway. Others suggest simply moving off apache onto nginx – Phill Ash Commented Jul 6, 2024 at 13:07
  • Sadly it appears that the problem has just moved to php-fpm. Now I see mysqld using 200% CPU and several (5) php-fpm instances using 100% CPU. – Phill Ash Commented Jul 7, 2024 at 19:47
  • Looks like some kind of nefarious activities... I could see a large number of connections on port 80. 200 odd at a time when the site isn't doing anything... Sadly just as I went to check what they were my lil one woke up so I had to sort that and by the time I was back CPU was back to normal and all the connections were gone. I'll have to be faster off the mark tomorrow. – Phill Ash Commented Jul 7, 2024 at 22:36
Add a comment  | 

1 Answer 1

Reset to default 0

So the issue was actually a DDOS attack on our site. I used urlsnarf to see what requests were being placed and there were basically hundreds of POST request from about 20-30 or so IP addresses. Seems quite small scale for what I understood is a typical DDOS attack. It also explains why it took a few days before the site ground to a halt. We have quite a few cores running on this host and it was maxing out 7 of them. I think in a larger scale attack it would have taken out the site a lot faster... We're behind cloudflare so maybe it was setup at that level to avoid their automatic detection...

As soon as I turned on the 'We're under attack mode' the POST requests stopped coming in and the server returned to normal.

Now I just need to figure out an early detection mechanism and automatic switch of the cloudflare under attack functionality.

本文标签: