TLDR
How to do a regex-match query with a column value ('input' ~ t.somecolumn), where just a known subset of rows has a valid regex in that column?
Full example
- there is a
blocked_itemstable including two varchar columns:typeandvalue, - one of the types is
DOMAIN_REGEX, and then thevaluealways includes a correct regex, - but: for other types
valuedoesn't need to be a regex and can cause errors when treated as one.
To check if a domain is blocked, I'm calling this query and passing the URL in question as $1 parameter:
SELECT 1 FROM blocked_items WHERE type = 'DOMAIN_REGEX' AND $1 ~ value LIMIT 1
The problem: on some database instances the query fails if rows with another type have value that's not a valid regex. On one database this query runs correctly, and on another instance, regardless of the input, throws: invalid regular expression: quantifier operand invalid.
Example test data:
| type | value |
|--------------+---------------------|
| EMAIL | [email protected] |
| DOMAIN_REGEX | test\d\.com |
Question
I know the reason for my error is that the db engine can choose to check the second condition ($1 ~ value) first -- I've checked the EXPLAIN for my query and indeed it's different on these two database instances.
Is there a way I can
- force the db to check the
typecolumn first, so the regex filter is always valid? - form the query differently to ignore the error for non-regex
valueentries? Or check if it's a valid regex first? - work around this issue in another way?
// I know changing the schema or using LIKE instead will probably suffice, but now that I stumbled upon this I'm curious if there is a solution using regexes like this :)