
I have been importing data from a Pandas DataFrame into a Postgres DB using my own custom import script. Unfortunately, my data is not tidy, which caused every one of my columns to be parsed as text. Is there any way to get the entries where a certain column's value is NOT a number? My plan is to delete those records and convert the column to a numeric type.

Thanks!


1 Answer


To get the records which contain no digits at all, you would use the following regex:

DELETE FROM myrecords WHERE record ~ '^[^0-9]+$';

Here, the ^ character outside of square brackets means the beginning of the field, the $ character means the end of the field, and we require that all characters in between are non-digits. The + indicates that there should be at least one such character. If we also wanted to allow empty strings, the regex would be ^[^0-9]*$.
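To check which values the two anchored patterns above would catch before running a DELETE, here is a minimal sketch in Python. It assumes that Python's `re` module treats these simple patterns the same way as the Postgres `~` operator (true for plain character classes and anchors on values without newlines); the sample values and the `matches` helper are hypothetical.

```python
import re

# Hypothetical sample values standing in for the text column `record`.
samples = ["abc", "12", "a1", "", "n/a", "3.5"]

def matches(pattern, values):
    # re.search is unanchored, like the PostgreSQL `~` operator;
    # any anchoring comes only from ^ and $ inside the pattern.
    return [v for v in values if re.search(pattern, v)]

# At least one character, and every character is a non-digit.
no_digits = matches(r"^[^0-9]+$", samples)

# Same, but the empty string is accepted as well.
no_digits_or_empty = matches(r"^[^0-9]*$", samples)

print(no_digits)           # ['abc', 'n/a']
print(no_digits_or_empty)  # ['abc', '', 'n/a']
```

Note that mixed values such as "a1" and "3.5" are not matched by either pattern, because they contain at least one digit.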

If you want the records which contain at least one digit or lower-case letter, then I would expect a regex like:

DELETE FROM myrecords WHERE record ~ '[0-9a-z]';
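Because the pattern above has no ^ or $ anchors, a single matching character anywhere in the value is enough. A small Python sketch of that behaviour, again assuming `re.search` behaves like Postgres `~` for this simple character class (the sample values are made up for illustration):

```python
import re

# Hypothetical sample values standing in for the text column `record`.
samples = ["ABC", "abc", "42", "A-1", "---", ""]

# Unanchored: one digit or lower-case letter anywhere is sufficient.
pattern = re.compile(r"[0-9a-z]")

hits = [v for v in samples if pattern.search(v)]

print(hits)  # ['abc', '42', 'A-1']
```

"ABC", "---" and the empty string are left alone, since none of them contains a digit or a lower-case letter.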

4 Comments

the first regex should be '^[0-9]+$' or you could shorten it to '^\d+$'
@HaleemurAli yes, that works as well; \d is a metacharacter that matches any digit, which is identical to [0-9]. @Macterror you are welcome, glad it helps
A single non-numeric character suffices to determine that a string is not a number. Therefore it would seem more appropriate to use [^0-9]+ (or [^\d]+) — that is, check if at least one character is not a number.
@HaleemurAli NO ! With your regex the query would delete all (and only) actual numbers…
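The difference between the patterns discussed in these comments can be seen on a few made-up values. This is only a sketch, assuming Python's `re.search` mirrors Postgres `~` for these patterns, and assuming "number" means an unsigned integer written with ASCII digits only:

```python
import re

# Hypothetical sample values standing in for the text column `record`.
samples = ["123", "12a", "abc", "4.5", ""]

# '^[0-9]+$' matches values made up entirely of digits,
# so using it in the DELETE would remove the actual numbers.
pure_numbers = [v for v in samples if re.search(r"^[0-9]+$", v)]

# '[^0-9]' matches values with at least one non-digit character,
# i.e. the values that are not plain digit strings.
has_non_digit = [v for v in samples if re.search(r"[^0-9]", v)]

print(pure_numbers)   # ['123']
print(has_non_digit)  # ['12a', 'abc', '4.5']
```

Under this assumption "4.5" is flagged as non-numeric, and the empty string is matched by neither pattern, so decimals and empty values would need separate handling.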
