
I have a large table in ClickHouse (MergeTree engine) with a few hundred million rows and around a hundred columns. When I select a few dozen columns based on a complex condition that returns several dozen million rows, everything works fine - I get the results back and can process them in my code.

However, when I added a single "ORDER BY" clause:

SELECT <... my 20 columns ...>
FROM mylogs
WHERE <... my complex filter>
ORDER BY log_name_with_timestamp;

the query fails. The error message is:

Code: 241. DB::Exception: Received from localhost:9000. DB::Exception: (total) memory limit exceeded: 
would use 56.36 GiB (attempt to allocate chunk of 6628255 bytes), 
current RSS 48.65 GiB, maximum: 56.36 GiB. 
OvercommitTracker decision: Query was selected to stop by OvercommitTracker: 
While executing BufferingToFileTransform. (MEMORY_LIMIT_EXCEEDED)
  • I tried setting max_bytes_before_external_sort = 10000000000 (i.e., ~10 GB), but it didn't help; I still get the same error (see the sketch at the end of the question).

  • I tried to add an index to this field, but it didn't help:

    
       ┌─database─┬─table──┬─name─────────┬─type───┬─type_full─┬─expr─────┬─granularity─┬─data_compressed_bytes─┬─data_uncompressed_bytes─┬─marks─┐
    1. │ default  │ mylogs │ idx_fname_ts │ minmax │ minmax    │ fname_ts │           1 │                236431 │                 2573020 │ 49574 │
       └──────────┴────────┴──────────────┴────────┴───────────┴──────────┴─────────────┴───────────────────────┴─────────────────────────┴───────┘
    

What's wrong, and what am I missing to make the sorting work? Unfortunately, I can't remove the "ORDER BY" clause, but I'm willing to accept a performance cost.

I am sure I run out of memory on sorting: I can clearly see in the CLI that all rows in the table get processed, and the query runs fine without the ORDER BY clause. I suspect the issue is within BufferingToFileTransform, but I don't fully grasp how to fix it.
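For reference, this is roughly how I applied the external-sort setting at query level. The column list and filter are the same placeholders as above, and the numeric values for max_memory_usage and max_threads are only illustrative examples, not recommendations:

SELECT <... my 20 columns ...>
FROM mylogs
WHERE <... my complex filter>
ORDER BY log_name_with_timestamp
SETTINGS
    max_bytes_before_external_sort = 10000000000, -- ~10 GB: spill sort blocks to disk beyond this
    max_memory_usage = 50000000000,               -- example per-query cap; must be above the external-sort threshold
    max_threads = 4;                              -- fewer threads -> fewer sort blocks held in memory at once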

  • What version are you using? And if you reduce the number of columns, do you still exceed the memory limit? – jsc0218 Commented Feb 10 at 21:53
  • @jsc0218 ClickHouse server version 25.2.1. If I reduce the number of columns I request, it also works just fine, even with "ORDER BY". But I do need all the columns. – CuriousD Commented Feb 10 at 23:33
  • How about max_memory_usage? max_bytes_before_external_sort should be less than it. Also, how much RAM is allocated? BTW, you can also try setting a lower max_threads. – jsc0218 Commented Feb 11 at 14:47
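A quick way to check the limits mentioned in the comments (a sketch using the standard system.settings table; the setting names are standard ClickHouse settings):

SELECT name, value, changed
FROM system.settings
WHERE name IN ('max_memory_usage', 'max_bytes_before_external_sort', 'max_threads');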

2 Answers


Try using a Common Table Expression to split the query into two parts:

  1. Select resulting data
  2. Sort result

This should reduce memory requirements for this query. Something like this pseudo-code:

WITH result AS (
    SELECT <... my 20 columns ...>
    FROM mylogs
    WHERE <... my complex filter>
)
SELECT * FROM result
ORDER BY log_name_with_timestamp;

Unfortunately, the only viable solution I found was to create a projection ordered by this field. With the projection in place, the query avoids the sort entirely, because the data is already stored pre-sorted in the immutable parts.
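A rough sketch of what such a projection could look like, using the table and sort column from the question (the projection name here is made up):

ALTER TABLE mylogs
    ADD PROJECTION prj_order_by_log_name_ts
    (
        SELECT *
        ORDER BY log_name_with_timestamp
    );

-- Rebuild existing parts so they contain the projection as well:
ALTER TABLE mylogs MATERIALIZE PROJECTION prj_order_by_log_name_ts;

Note that a SELECT * projection stores an additional sorted copy of the selected columns inside each part, so it roughly doubles the disk usage for this table.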
