Using SPLIT_PART in PostgreSQL

Question

I have a column that pulls in UTM tags from Google Ads containing the IDs below. Each value holds the campaign ID (the part before the "___") followed by the ad group ID. In some cases we only have a campaign ID, and those are strings, which is why I am casting with ::TEXT.

This is what the UTM tags look like when pulled in.

835783587___42385125483
eu
968720083___47551372269
en_usa_search_brand
648594695___38174608372
886097479___45386492795
en_trust_control
competitors
es
en_esp_search_route
1072851000___55370810634

I'm trying to split the IDs apart, remove the underscores, and then push them to another table.

umc.campaign is the column that contains the UTM tag.

I'm creating this temp table to then push to the final table below.

CREATE TABLE reports.tmp_sem_attribution AS (
    SELECT DISTINCT ON (umc.user_id)
        umc.user_id,
        umc.source,
        umc.campaign::TEXT,
        (SPLIT_PART(REPLACE(campaign, '__', '_'), '__', 1))::TEXT AS campaign_id,
        (SPLIT_PART(REPLACE(campaign, '__', '_'), '__', 2))::TEXT AS adgroup_id,

When I use the query below to check the results, I can see that some of the ad group IDs are empty or contain a space.

reports.sem_attribution_v2 is the table where I am pushing the IDs into two different columns.

SELECT * FROM reports.sem_attribution_v2 WHERE adgroup_id =''

**RESULT**
Campaign_ID                     AdGroup ID
eu  
1560591282  
en_usa_search_brand 
1560608121  
en_trust_control    
1560591282  
en_fra_search_generic_manual    
990427417   
eu

If you could shed some light on how I could approach this differently, or on whether this query is incorrect, that would be much appreciated.

Thanks.

Can you provide the source text for campaign IDs 1560608121, 1560591282 and 990427417?
– Adam, Commented Dec 17, 2018 at 13:38
@Adam For those 3 IDs the source text is exactly the same as above: it only contains the campaign ID, with no brackets or ad group IDs.
– Tarik, Commented Dec 17, 2018 at 14:13

Kaushik Nayak · Accepted Answer · 2018-12-17 14:58:27Z

1

You may use REGEXP_REPLACE

SELECT   REGEXP_REPLACE(campaign,'(\d+)___\d+','\1') as campaign_id,
         REGEXP_REPLACE(campaign,'\d+___(\d+)','\1') as adgroup_id
                 FROM t;
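
To see what these two expressions return for both kinds of value, here is a small self-contained check. The inline VALUES list stands in for the table t and is only an assumption for illustration; the sample values are taken from the question. Note that when the pattern does not match, REGEXP_REPLACE returns the input unchanged, so a campaign-only value appears in both columns.

```sql
-- Hypothetical inline sample standing in for table t
WITH t(campaign) AS (
    VALUES ('835783587___42385125483'),
           ('en_usa_search_brand'),
           ('eu')
)
SELECT campaign,
       REGEXP_REPLACE(campaign, '(\d+)___\d+', '\1') AS campaign_id,
       REGEXP_REPLACE(campaign, '\d+___(\d+)', '\1') AS adgroup_id
FROM t;
-- 835783587___42385125483 -> campaign_id 835783587, adgroup_id 42385125483
-- en_usa_search_brand     -> both columns en_usa_search_brand (no match, input returned as-is)
-- eu                      -> both columns eu
```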

Or SUBSTRING with a CASE condition:

SELECT CASE
         WHEN campaign ~ '(\d+)___(\d+)' THEN
         substring(campaign FROM '(\d+)___')  --extracts the digits before "___"
         ELSE campaign                        --same string when pattern not found
       END AS campaign_id,
       CASE
         WHEN campaign ~ '(\d+)___(\d+)' THEN
         substring(campaign FROM '___(\d+)')  --extracts the digits after "___"
         ELSE campaign                        --same string when pattern not found
       END AS adgroup_id
FROM   t;
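
If the ad group column should stay empty for campaign-only values rather than repeat the campaign string, one possible variant relies on the fact that substring with a regular expression returns NULL when the pattern does not match. This is a sketch and not part of the original answer; the ^ and $ anchors and the inline VALUES list standing in for t are assumptions.

```sql
-- Hypothetical inline sample standing in for table t
WITH t(campaign) AS (
    VALUES ('835783587___42385125483'),
           ('en_usa_search_brand')
)
SELECT campaign,
       -- digits before "___", or the whole value when the pattern does not match
       COALESCE(substring(campaign FROM '^(\d+)___\d+$'), campaign) AS campaign_id,
       -- digits after "___", or NULL when the pattern does not match
       substring(campaign FROM '^\d+___(\d+)$') AS adgroup_id
FROM t;
```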


edited Dec 17, 2018 at 14:58

answered Dec 17, 2018 at 14:37


Tarik Over a year ago

Thanks Kaushik, the REGEXP_REPLACE statement has done the job.

Adam · Answer · 2018-12-19 07:17:13Z

0

The SPLIT_PART function returns an empty string when the split produces fewer fields than the one requested. For example, when there is only one field and you ask for the second, you get an empty string. This is correct behavior and works fine for your approach.
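
A quick way to confirm this behavior, using literal values taken from the question (the column aliases are made up for illustration):

```sql
SELECT SPLIT_PART('835783587___42385125483', '___', 1) AS first_field,   -- 835783587
       SPLIT_PART('835783587___42385125483', '___', 2) AS second_field,  -- 42385125483
       SPLIT_PART('eu', '___', 1) AS only_field,                          -- eu
       SPLIT_PART('eu', '___', 2) = '' AS missing_is_empty;               -- true
```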

You can simplify your query because the REPLACE part is not necessary:

(SPLIT_PART(campaign, '___', 1))::TEXT AS campaign_id,
(SPLIT_PART(campaign, '___', 2))::TEXT AS adgroup_id

Another improvement could be to replace empty strings with NULL values. You can do it while inserting data into reports.sem_attribution_v2 table:

CASE WHEN adgroup_id = '' THEN NULL ELSE adgroup_id END
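
The same conversion can also be written with NULLIF, which returns NULL when its two arguments are equal. A minimal sketch, with an inline VALUES list as a hypothetical stand-in for the source table:

```sql
-- Hypothetical inline sample; src is not a table from the question
WITH src(campaign) AS (
    VALUES ('968720083___47551372269'),
           ('competitors')
)
SELECT SPLIT_PART(campaign, '___', 1)             AS campaign_id,
       NULLIF(SPLIT_PART(campaign, '___', 2), '') AS adgroup_id  -- NULL for 'competitors'
FROM src;
```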

edited Dec 19, 2018 at 7:17

answered Dec 17, 2018 at 14:29


Tarik Over a year ago

Thanks for the help Adam
