sql - How to combine LEFT and LOWER in a multicolumn index for leading text patterns? - Stack Overflow


I create an index this way:

CREATE INDEX rep_tval_idx ON public.rep USING btree (t, lower(left(val, 127)));

Then I run a SELECT with matching filter:

explain
select * from rep
where t=3 and lower(left(val, 127)) like 'operation%';

According to EXPLAIN, the index is used only for the first condition (t = 3); the pattern condition is applied as a filter. How do I make both conditions use the index? The column type is text, and I do not want to store more than the first 127 characters of its content in the index.

EXPLAIN ANALYZE results:

Index Scan using rep_tval_idx on rep  (cost=0.14..3.67 rows=1 width=56) (actual time=0.044..0.045 rows=0 loops=1)
  Index Cond: (t = 3)
  Filter: (lower("left"(val, 127)) ~~ 'operation%'::text)
  Rows Removed by Filter: 16
Planning Time: 0.112 ms
Execution Time: 0.069 ms

There is a single table with this structure. When I use LOWER or LEFT separately, each works fine with the index.

CREATE TABLE public.rep (
    id bigserial NOT NULL,
    up int8 NOT NULL,
    t int8 NOT NULL,
    val text NULL,
    CONSTRAINT rep_pk PRIMARY KEY (id)
);

CREATE INDEX rep_tval_idx ON public.rep USING btree (t, lower("left"(val, 127)));
CREATE INDEX rep_upt_idx ON public.rep USING btree (up, t);

Update

The problem was, I suppose, caused by testing on a small table. There are actually two separate instances, each with its own table: one has 125 rows, the other 310466 rows. In the bigger table the index works as expected. The execution time is still 0.064 ms, but that is a feature of the quintet data model.

Index Scan using rep_tval_idx on rep  (cost=0.42..121.50 rows=120 width=56) (actual time=0.028..0.037 rows=8 loops=1)
  Index Cond: ((t = 3) AND (lower("left"(val, 127)) >= 'operation'::text) AND (lower("left"(val, 127)) < 'operatioo'::text))
  Filter: (lower("left"(val, 127)) ~~ 'operation%'::text)
Planning Time: 0.102 ms
Execution Time: 0.064 ms

asked Jan 8 at 19:40 by Alexey Sam, edited Jan 9 at 1:05 by Erwin Brandstetter

1) How many rows are in table? 2) I don't see how Execution Time: 0.069 ms is an issue. – Adrian Klaver Commented Jan 8 at 19:51
There are actually 2 different instances with a table in each (it's QDM). One is 125 rows large, while another is 310466 rows. Seems, you solved it for me, thanks! In the bigger table I see that index working as it was expected to do. – Alexey Sam Commented Jan 8 at 20:37
Add updates as properly formatted text addition to original question text. – Adrian Klaver Commented Jan 8 at 20:59
For the record, the second query plan is only possible with the displayed index if you are running your database with LC_COLLATE = "C" - which is rather uncommon. (But then the first query plan wouldn't make sense.) – Erwin Brandstetter Commented Jan 9 at 1:12

3 Answers

By default, btree indexes are great for <, >, or =, but in a database with a non-C collation they can't be used for pattern matching.

You must create the index with the text_pattern_ops operator class, which compares strings character by character. See it in action in this blog.
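
As a minimal sketch of that suggestion, assuming the table from the question (the index name rep_tval_tpo_idx is made up for illustration), the operator class is attached to the text expression only, while the int8 column t keeps its default operator class:

```sql
-- Hypothetical index name; text_pattern_ops applies to the text expression only.
CREATE INDEX rep_tval_tpo_idx ON public.rep
USING btree (t, lower(left(val, 127)) text_pattern_ops);

-- The original query, unchanged. With the index above, the planner may use
-- both index columns for the prefix match.
EXPLAIN
SELECT * FROM rep
WHERE  t = 3
AND    lower(left(val, 127)) LIKE 'operation%';
```

Whether the planner actually picks the second column still depends on table size and statistics.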

If the planner thinks that only one row will match t=3, then it might think there is no value in also checking the 2nd "column" against the index, since doing so saves no work over checking that one row against the table itself (which it needs to do anyway, as the shortcut pattern match might return false positives).

In the plan you show, it actually removed 16 rows based on the filter. However, that doesn't mean the planner thought it was going to need to remove 16 rows. The plan itself does not contain enough information to reconstruct the planner's thought process here. If you want to double-check what the planner was thinking, you could capture another plan for a similar query with only "where t=3" to see how many rows it expected that condition to return.
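
That check could look like the following sketch. Compare the rows= estimate in its output with the 16 rows that the filter actually removed in the plan from the question:

```sql
-- Same query without the pattern condition:
-- shows the planner's row estimate for t = 3 alone.
EXPLAIN
SELECT * FROM rep
WHERE  t = 3;
```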

It is frustrating to analyze these situations. The choice of whether to use one or more than one column from a multicolumn index when it thinks only one row will match on the first column does not seem to be very stable.

Use a COLLATE "C" index. Much like the traditional (obsolete, really) text_pattern_ops index, just simpler and more versatile. See:

Is there a difference between text_pattern_ops and COLLATE "C"?

CREATE INDEX rep_tval_c_idx ON rep USING btree (t, lower(left(val, 127)) COLLATE "C");

Then your current query works as is, with index support on both columns.
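
To verify this on your own instance, one might look at the database's default collation and re-run EXPLAIN once the index exists. This is only a sketch; the plan you get depends on table size and statistics:

```sql
-- Default libc collation of the current database.
SELECT datcollate
FROM   pg_database
WHERE  datname = current_database();

-- With rep_tval_c_idx in place, check whether the Index Cond now
-- includes range bounds on the expression in addition to t = 3.
EXPLAIN
SELECT * FROM rep
WHERE  t = 3
AND    lower(left(val, 127)) LIKE 'operation%';
```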

Better

For your leading search pattern, you can simplify using the "starts with" operator ^@, which can use a btree index with COLLATE "C" in Postgres 15+. See:

PostgreSQL LIKE query performance variations

SELECT * FROM rep
WHERE  t = 3
AND    lower(left(val, 127)) ^@ 'operation';  -- NO added wild card!

Typically, you don't need 127 leading characters to be selective. A much shorter string will do and make the index smaller, faster, and cheaper to maintain. Find the sweet spot for your data distribution and expected filter input. Typically, 10 to 20 characters are plenty. Then filter on the shortened string and on the full string like:

CREATE INDEX rep_tval_c_idx ON rep USING btree (t, lower(left(val, 10)) COLLATE "C");

SELECT * FROM rep
WHERE  t = 3
AND    lower(left(val, 10)) ^@ lower(left($1, 10))  -- bring in idx (logically redundant)
AND    val ^@ $1;  -- preserve full selectivity
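
One way to look for that sweet spot, sketched here with arbitrarily chosen candidate lengths (10, 20, 127), is to count how many distinct prefixes each length yields for a given t. If a short prefix gives nearly as many distinct values as the long one, the shorter index expression loses little selectivity:

```sql
-- Candidate lengths are illustrative; adjust them to your data.
SELECT count(DISTINCT lower(left(val, 10)))  AS distinct_10
     , count(DISTINCT lower(left(val, 20)))  AS distinct_20
     , count(DISTINCT lower(left(val, 127))) AS distinct_127
FROM   rep
WHERE  t = 3;
```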
