admin管理员组文章数量:1122826
It is well known that using ORDER BY in a subquery in PostgreSQL is not meaningful, and the DBMS will only respect the outer most ORDER BY.
However, if we include LIMIT (alongside ORDER BY) in the sub query, is this still the case?
It is well known that using ORDER BY in a subquery in PostgreSQL is not meaningful, and the DBMS will only respect the outer most ORDER BY.
However, if we include LIMIT (alongside ORDER BY) in the sub query, is this still the case?
Share Improve this question edited Nov 21, 2024 at 10:29 Joseph Young asked Nov 21, 2024 at 9:35 Joseph YoungJoseph Young 2,79514 silver badges23 bronze badges 8 | Show 3 more comments3 Answers
Reset to default 2Using LIMIT
in a subquery will affect the intermediate result set generated by that subquery. Consider the following two queries:
SELECT *
FROM
(
SELECT *
FROM yourTable
ORDER BY col1
) t
ORDER BY col2;
In this query, the inner ORDER BY
will (or maybe will not) execute, but since there is no LIMIT
, the ordering of that intermediate result has no bearing on the ordering of the final result. The order of the final result will only depend on the outer ORDER BY
clause.
However, in this query:
SELECT *
FROM
(
SELECT *
FROM yourTable
ORDER BY col1
LIMIT 10 -- only keep first 10 records
) t
ORDER BY col2;
The subquery aliased as t
would only return 10 records. Again, the final ordering would only depend on the outer ORDER BY
clause, but there would only be 10 records in the result set, i.e. that result set might be different than the first query.
It is well known that using ORDER BY in a subquery in PostgreSQL is not meaningful, and the DBMS will only respect the outer most ORDER BY.
Like many things that are "well known", this is not true.
In Postgres, ORDER BY
in a subquery does exactly what it says. The sort order just may change again in outer query levels. If you append a LIMIT
right after ORDER BY
on the same subquery level, the sort order takes effect immediately. (Remaining rows may still be re-ordered in outer query levels.)
It also becomes evident if you aggregate rows on the next level. The manual:
[...] supplying the input values from a sorted subquery will usually work. For example:
SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
Related:
- Create two arrays for two fields, keeping sort order of arrays in sync (without subquery)
Or for locking rows in the right order. See:
- SELECT FOR UPDATE subquery not respecting LIMIT clause under load
Or for any other operation where sort order is meaningful ...
The Postgres query planner is smart about this: The manual:
According to the SQL standard, the expressions in the output list should be computed before applying
DISTINCT
,ORDER BY
, orLIMIT
. This is obviously necessary when usingDISTINCT
, since otherwise it's not clear what values are being made distinct. However, in many cases it is convenient if output expressions are computed afterORDER BY
andLIMIT
; particularly if the output list contains any volatile or expensive functions. With that behavior, the order of function evaluations is more intuitive and there will not be evaluations corresponding to rows that never appear in the output. PostgreSQL will effectively evaluate output expressions after sorting and limiting, so long as those expressions are not referenced inDISTINCT
,ORDER BY
orGROUP BY
. (As a counterexample,SELECT f(x) FROM tab ORDER BY 1
clearly must evaluatef(x)
before sorting.) Output expressions that contain set-returning functions are effectively evaluated after sorting and before limiting, so thatLIMIT
will act to cut off the output from a set-returning function.
From the docs:
https://www.postgresql.org/docs/current/sql-select.html#SQL-LIMIT
The query planner takes LIMIT into account when generating a query plan, so you are very likely to get different plans (yielding different row orders) depending on what you use for LIMIT and OFFSET. Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.
It is even possible for repeated executions of the same LIMIT query to return different subsets of the rows of a table, if there is not an ORDER BY to enforce selection of a deterministic subset. Again, this is not a bug; determinism of the results is simply not guaranteed in such a case.
So yes using LIMIT
and ORDER BY
together is defined.
本文标签: Is using ORDER BY and LIMIT in a PostgreSQL subquery defined behaviourStack Overflow
版权声明:本文标题:Is using ORDER BY and LIMIT in a PostgreSQL subquery defined behaviour? - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1736312104a1934978.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
LIMIT
withoutORDER BY
- orLIMIT
withORDER BY
? – Dai Commented Nov 21, 2024 at 9:37LIMIT
andORDER BY
will be impacted by theORDER BY
. – Tim Biegeleisen Commented Nov 21, 2024 at 9:41