9

I have a column of type text, that contains random values. Some of them are numbers, some other texts, and some other mixed text and numbers. I'm trying to discriminate them as:

myfield   is_numeric
____________________
-4        true
0004      true
4.00      true
dog       false
D04       false
04f       false

Whereas others are names and other strings. I was using the regular expression

SELECT id,
       myfield 
       (myfield::varchar~ '^-?[0-9]*.?[0-9]*$') is_numeric 
FROM mytable

To tell wether the row contains a valid number or not. However, I noticed that values like D04 and 04c return true for that regex, which, for my use case, is a false positive.

Why is this happening? It seems that ^ is matching not necessarily the whole value, but any valid substring of the value. However, values like D04f do return false, so even if there's a numeric substring in the field, the combination of ^ and $ operators is doing its job.

I have temporarily resorted to using:

SELECT id, 
       myfield
       (myfield::varchar ~ '^-?[0-9]*.?[0-9]*$' 
       AND myfield::varchar !~ '[^0-9\-\.]') is_numeric 
FROM mytable

But this seem inneficient (and does not rule out double dots), and I'm still wondering why the regex is correctly ruling out strings that start and end with a non-numeric character, while incorrecly returning true for string that contain only a trailing or leading non numeric character.

1
  • 1
    here: '^-?[0-9]*.?[0-9]*$' this part .? you want to be any character or character . ? If it is the second it should be scaped Commented Feb 13, 2017 at 16:34

2 Answers 2

18

Would this work for you?

^-?[0-9]+\.?[0-9]*$

I'm asuming -0.07.5 is invalid (double dot present).

D04 will return false as well.

The problem in your original regex is that you're not escaping the dot, so it will match any characher, including D in your D04.

Hope it helps.

Sign up to request clarification or add additional context in comments.

3 Comments

OMG, I went over and over that regex and I didn't see the flaw. Thank you very much.
Actually this is not good enough select '.5' ~ '^-?[0-9]+\.?[0-9]*$' is false
It accepts 5. but not .5, doesn't accept NaN or inifinity, doesn't support digit grouping or formats that use , (or · or ' as alternative decimal points), non-decimal systems.
2

Any solution based on regular expressions will at best reinvent PostgreSQL's own logic used for casting text to numeric (or any other type).

In order to stay consistent with PostgreSQL's own definition of the type, it's easiest to let the db use it to run the validation. In PostgreSQL 16 and above, there's pg_input_is_valid():

pg_input_is_valid ( string text, type text ) → boolean
Tests whether the given string is valid input for the specified data type, returning true or false. This function will only work as desired if the data type's input function has been updated to report invalid input as a “soft” error. Otherwise, invalid input will abort the transaction, just as if the string had been cast to the type directly.

pg_input_is_valid('42', 'integer') → t
pg_input_is_valid('42000000000', 'integer') → f
pg_input_is_valid('1234.567', 'numeric(7,4)') → f

In PostgreSQL 15 and earlier, you can build your own:

create or replace function my_input_is_valid(arg text,arg_type regtype) 
returns boolean stable leakproof called on null input parallel safe 
as $f$
begin
    execute format('select %L::%s',arg,arg_type);
    return true;
exception when others then 
    return false;
end $f$ language plpgsql;

Note that it checks if the type you're asking about exists, by accepting it as a regtype, which leads to its lookup in pg_type. If it doesn't find it, you get an adequate error:

select my_input_is_valid('1','bleh');
ERROR:  type "bleh" does not exist
LINE 1: select my_input_is_valid('1','bleh');
                                     ^

Demo

4 Comments

Couldn't this function be declared as immutable?
Both it and the built-in pg_input_is_valid() are only stable: dbfiddle.uk/3n9GAIbG
Oh yes, I didn't think about this classic problem of date and time types… stable would be a nice addition though.
Good point, thanks. Added.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.