
I have the following table (in PostgreSQL 14.6):

create table waste_trajectory
(
    id               uuid default uuid_generate_v4() not null primary key,
    observation_id   uuid not null,
    tank_id          varchar(30),
    stored_on        timestamp with time zone default now(),
    score            numeric(14, 10)
);

... with this index:

CREATE INDEX IF NOT EXISTS idx_wt_stored_on_desc
    ON waste_trajectory (stored_on desc);

The table contains a large amount of data. If I run the following query:

SELECT
    id,
    observation_id,
    tank_id,
    -- stored_on::text,
    score
    FROM waste_trajectory
ORDER BY stored_on DESC
LIMIT 1;

I get results in a reasonable amount of time:

[2025-05-12 14:02:46] completed in 969 ms

But if I un-comment that stored_on::text line, the query takes over 10 minutes:

[2025-05-12 14:14:39] completed in 11 m 39 s 250 ms

The EXPLAIN ANALYSE for the un-commented version of the query shows:

|Limit  (cost=24469520.63..24469520.75 rows=1 width=1035) (actual time=736266.363..736286.177 rows=1 loops=1)                                                                        |
|  ->  Gather Merge  (cost=24469520.63..46550572.58 rows=184416572 width=1035) (actual time=736211.570..736231.382 rows=1 loops=1)                                                   |
|        Workers Planned: 4                                                                                                                                                          |
|        Workers Launched: 4                                                                                                                                                         |
|        ->  Sort  (cost=24468520.57..24583780.93 rows=46104143 width=1035) (actual time=736147.303..736147.305 rows=1 loops=5)                                                      |
|              Sort Key: ((stored_on)::text) DESC                                                                                                                                    |
|              Sort Method: top-N heapsort  Memory: 31kB                                                                                                                             |
|              Worker 0:  Sort Method: top-N heapsort  Memory: 26kB                                                                                                                  |
|              Worker 1:  Sort Method: top-N heapsort  Memory: 26kB                                                                                                                  |
|              Worker 2:  Sort Method: top-N heapsort  Memory: 28kB                                                                                                                  |
|              Worker 3:  Sort Method: top-N heapsort  Memory: 26kB                                                                                                                  |
|              ->  Parallel Seq Scan on waste_trajectory  (cost=0.00..24237999.86 rows=46104143 width=1035) (actual time=1032.453..712735.895 rows=36882681 loops=5)|
|Planning Time: 0.104 ms                                                                                                                                                             |
|JIT:                                                                                                                                                                                |
|  Functions: 11                                                                                                                                                                     |
|  Options: Inlining true, Optimization true, Expressions true, Deforming true                                                                                                       |
|  Timing: Generation 2.080 ms, Inlining 397.283 ms, Optimization 186.396 ms, Emission 124.416 ms, Total 710.175 ms                                                                  |
|Execution Time: 736286.614 ms                                                                                                                                                       

Unfortunately, I don't have the liberty of changing the query, since it's in a component I don't control. Is there a way to handle this strictly using a different index?

I tried:

CREATE INDEX IF NOT EXISTS idx_wt_stored_on_desc_text
    ON waste_trajectory (cast(stored_on as text) desc);

... but this returns:

[42P17] ERROR: functions in index expression must be marked IMMUTABLE

Can I improve the performance of this query without resorting to a custom IMMUTABLE function for converting the timestamp to text?

    In addition to @Charlieface's correct answer: the DESC index is somewhat worse than an ASC index if you insert ever-increasing values. PostgreSQL's B-tree indexes are optimized for inserting at the upper end. Commented May 13 at 7:30
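    A minimal sketch of what this comment suggests: a plain ascending index, which Postgres can also scan backward to satisfy ORDER BY stored_on DESC. The index name idx_wt_stored_on is hypothetical, and CONCURRENTLY is an optional choice to avoid blocking writes while a large table is indexed (it cannot run inside a transaction block). Whether the planner actually picks this index depends on your data and settings.

```sql
-- Hypothetical ascending replacement for idx_wt_stored_on_desc.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_wt_stored_on
    ON waste_trajectory (stored_on);

-- B-tree indexes can be read in either direction, so a descending
-- ORDER BY on the table column can still be served from this index
-- (typically reported as "Index Scan Backward" in the plan).
EXPLAIN
SELECT id
FROM waste_trajectory
ORDER BY stored_on DESC
LIMIT 1;
```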

2 Answers


If you can't change the query then there is nothing you can do.

It can't use the index because it's sorting on the converted text value rather than on the timestamp column. There is basically a bug in the query here:

ORDER BY stored_on DESC

In an ORDER BY, an unqualified name is matched against the output column names in the SELECT list first. Only if no output column has that name does it fall back to a column from the FROM tables.

Since you have stored_on::text in the SELECT list, and Postgres names an unaliased cast of a column after the column itself, ORDER BY stored_on resolves to that text expression. What the query should have done is qualify every column with a table alias:

SELECT
    wt.id,
    wt.observation_id,
    wt.tank_id,
    wt.stored_on::text AS stored_on,
    wt.score
FROM waste_trajectory wt
ORDER BY
    wt.stored_on DESC
LIMIT 1;

Then it would have still used your normal index.
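
One way to confirm which expression the planner sorts on is to compare the two forms with plain EXPLAIN, which only plans the query and does not run it. This is a sketch against the table from the question, trimmed to two output columns for brevity; the exact plan shape and costs will depend on your data and settings.

```sql
-- Unqualified name: ORDER BY stored_on resolves to the output column,
-- i.e. the stored_on::text expression in the SELECT list.
-- The plan in the question reports this as:
--   Sort Key: ((stored_on)::text) DESC
EXPLAIN
SELECT id, stored_on::text
FROM waste_trajectory
ORDER BY stored_on DESC
LIMIT 1;

-- Qualified name: wt.stored_on can only mean the table column,
-- so the planner is free to consider idx_wt_stored_on_desc.
EXPLAIN
SELECT wt.id, wt.stored_on::text AS stored_on
FROM waste_trajectory wt
ORDER BY wt.stored_on DESC
LIMIT 1;
```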

Even creating an IMMUTABLE function won't help here. The planner won't use an index built on that function, because the indexed expression doesn't match the cast expression that the ORDER BY actually sorts on. And as you noted, you can't create an index on the cast directly because it's not IMMUTABLE. You'd have to change the query to use your new function, and at that point you might as well do the proper fix above.
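
As background on why the cast is not IMMUTABLE: the text form of a timestamp with time zone depends on session settings such as TimeZone and DateStyle, so the same stored value can produce different strings. A small illustration, using an arbitrary literal rather than data from the table; the outputs in the comments assume the default ISO DateStyle.

```sql
SET TimeZone = 'UTC';
SELECT '2025-05-12 14:02:46+00'::timestamptz::text;
-- 2025-05-12 14:02:46+00

SET TimeZone = 'America/New_York';
SELECT '2025-05-12 14:02:46+00'::timestamptz::text;
-- 2025-05-12 10:02:46-04

-- Restore the session default afterwards.
RESET TimeZone;
```

An index built on such an expression could disagree with results computed under a different session setting, which is why Postgres rejects it.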

  • Luckily, I was able to work with the owner of the component with the select statement to modify it in the way described above, and the fix worked. Query returns in under 1 second. Commented May 16 at 14:43

Note that in addition to the accepted answer, the following more minimal change also resulted in the index being used:

SELECT
     id,
     observation_id,
     tank_id,
     stored_on::text as stored_time,
     score
 FROM waste_trajectory
 ORDER BY stored_on DESC
 LIMIT 1;

Simply returning stored_on::text under a name other than stored_on resulted in the index being used: with no output column called stored_on, the ORDER BY falls back to the table column.
