TLDR

How to do a regex-match query with a column value ('input' ~ t.somecolumn), where just a known subset of rows has a valid regex in that column?

Full example

  • there is a blocked_items table including two varchar columns: type and value,
  • one of the types is DOMAIN_REGEX, and then the value always includes a correct regex,
  • but: for other types value doesn't need to be a regex and can cause errors when treated as one.

To check if a domain is blocked, I run this query, passing the URL in question as the $1 parameter:

SELECT 1 FROM blocked_items WHERE type = 'DOMAIN_REGEX' AND $1 ~ value LIMIT 1

The problem: on some database instances the query fails if rows with another type have a value that's not a valid regex. On one instance the query runs correctly; on another it throws, regardless of the input: invalid regular expression: quantifier operand invalid.

Example test data:

| type         | value               |
|--------------+---------------------|
| EMAIL        | [email protected]   |
| DOMAIN_REGEX | test\d\.com         |
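
A minimal setup sketch for reproducing the situation, assuming PostgreSQL with default string-literal settings. The EMAIL value below is a hypothetical stand-in, not the original data: it starts with a quantifier, which is the kind of pattern the regex engine is expected to reject.

```sql
-- Table layout as described above: two varchar columns.
CREATE TABLE blocked_items (
    type  varchar,
    value varchar
);

INSERT INTO blocked_items (type, value) VALUES
    -- Hypothetical non-regex value (assumption): a leading '*' has no operand.
    ('EMAIL',        '*@example.com'),
    -- Valid regex taken from the example data.
    ('DOMAIN_REGEX', 'test\d\.com');
```

Whether the failing query actually errors on this data still depends on the plan the instance picks, as described in the question.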

Question

I know the reason for my error is that the db engine can choose to check the second condition ($1 ~ value) first -- I've checked the EXPLAIN for my query and indeed it's different on these two database instances.

Is there a way I can

  • force the db to check the type column first, so the regex filter is always valid?
  • form the query differently to ignore the error for non-regex value entries? Or check if it's a valid regex first?
  • work around this issue in another way?

// I know changing the schema or using LIKE instead will probably suffice, but now that I stumbled upon this I'm curious if there is a solution using regexes like this :)

2 Answers

3

You should be able to force the order of evaluation using CASE:

SELECT 1
FROM blocked_items
WHERE (CASE WHEN type <> 'DOMAIN_REGEX' THEN false
            ELSE $1 ~ value
       END)
LIMIT 1;

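In general, SQL (and Postgres) gives little control over the order in which expressions are evaluated: conditions joined with AND are not guaranteed to run left to right, and the planner may reorder them. CASE is the construct that provides that control in most circumstances, because a WHEN branch is only evaluated if the earlier conditions did not match. The main exception is constant subexpressions, which can be evaluated at planning time; that does not apply here, since $1 ~ value depends on each row.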
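
Another way to keep the regex away from the other rows, sketched here under the assumption of PostgreSQL 12 or later, is a materialized CTE. MATERIALIZED makes the CTE an optimization fence, so the outer regex condition is not pushed down into the scan of blocked_items. The statement name is_blocked_cte is hypothetical and only there to make the $1 parameter runnable in a plain session.

```sql
-- Hypothetical prepared statement; $1 is the URL being checked.
PREPARE is_blocked_cte(text) AS
WITH regex_rows AS MATERIALIZED (
    -- Only rows that are known to hold a valid regex leave this CTE.
    SELECT value
    FROM blocked_items
    WHERE type = 'DOMAIN_REGEX'
)
SELECT 1
FROM regex_rows
WHERE $1 ~ value   -- evaluated only against the materialized rows
LIMIT 1;

EXECUTE is_blocked_cte('test1.com');
```

The trade-off is that the CTE result is computed in full before the regex filter runs, which is fine for a small block list but may matter for a large one.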

1 Comment

Thanks! I'll accept @dssjoblom's answer since they were first and it looks like they need the points more than you do ;)
1

You are right, the schema is not great. If you still really have to keep the schema, you could try CASE/WHEN, https://www.postgresqltutorial.com/postgresql-case/
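
If the schema has to stay, the question's other idea (ignoring the error for non-regex values) can be sketched with a small PL/pgSQL wrapper that traps the error. The function name regex_match_or_false and the statement name is_blocked_safe are hypothetical; invalid_regular_expression is the PostgreSQL condition name for SQLSTATE 2201B.

```sql
-- Hypothetical helper: returns false instead of raising when the
-- pattern is not a valid regular expression.
CREATE OR REPLACE FUNCTION regex_match_or_false(input text, pattern text)
RETURNS boolean
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN input ~ pattern;
EXCEPTION
    WHEN invalid_regular_expression THEN
        RETURN false;
END;
$$;

-- With the wrapper, the evaluation order of the two conditions no longer matters.
PREPARE is_blocked_safe(text) AS
SELECT 1
FROM blocked_items
WHERE type = 'DOMAIN_REGEX'
  AND regex_match_or_false($1, value)
LIMIT 1;

EXECUTE is_blocked_safe('test1.com');
```

A block with an EXCEPTION clause is more expensive to enter and exit than a plain one, so this is a fallback for small tables and not a replacement for the CASE approach.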
