I have a list that contains table names such as table A, table B, table C, etc. The list may change depending on some conditions: sometimes it has 4 tables, sometimes 6.

tables = [table A, table B , table C, table D]

The schema of every table in the list is the same.

I want to generate a query like the one below.

select col A from table A
union distinct
select col A from table B
union distinct
select col A from table C
union distinct
select col A from table D

The size of the query may increase or decrease depending on the tables in the list.

Once the query is prepared, it needs to be executed in BigQuery via Python.

1 Answer

Given the question, I am not sure about the overall setup or where the final query needs to be executed.

Since the title of the question has "dynamic" in it, I am providing a solution using f-strings, which let the table names be inserted dynamically.

Here is a way using f-strings and the pandas library for BigQuery.

import pandas as pd
from pandas.io import gbq

### create your final sql string to be executed to which you will append
final_sql_string = """
"""

tables = ['table_A', 'table_B', 'table_C', 'table_D']

last_index = len(tables) - 1

for index, table in enumerate(tables):

    ### build out the select for this table via f string
    sql_string = f"""select col_A from `your_project.your_dataset.{table}`"""

    ### every table except the last needs a trailing union
    ### (compare by position so a repeated table name cannot end the chain early)
    if index != last_index:
        sql_string += " union distinct "

    ### join via newline to your final sql query string
    final_sql_string = "\n".join((final_sql_string, sql_string))


### optional - check what your sql looks like        
print(final_sql_string)

### submit to bq via pandas to get a pandas dataframe back
### on first try will ask to authorize against your project    
pandas_dataframe_from_sql = gbq.read_gbq(final_sql_string,
                                         project_id = 'your_project_id')

Sample output of the print statement above:

select col_A from `your_project.your_dataset.table_A` union distinct 
select col_A from `your_project.your_dataset.table_B` union distinct 
select col_A from `your_project.your_dataset.table_C` union distinct 
select col_A from `your_project.your_dataset.table_D`

If you are executing via the Google Cloud SDK, you can use this SQL there as well.
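As an alternative sketch, the same string can be built with str.join, which removes the need to special-case the last table. The helper names build_union_query and run_query are hypothetical, as are the placeholder project and dataset names. The name check is an assumption about what your table names look like, added because table names are interpolated into the SQL text rather than passed as query parameters. run_query assumes the google-cloud-bigquery package is installed and that credentials are already configured.

```python
import re


def build_union_query(tables, column="col_A", dataset="your_project.your_dataset"):
    """Return one select per table, joined with union distinct (hypothetical helper)."""
    if not tables:
        raise ValueError("tables must not be empty")
    for name in tables:
        # Assumption: table names are plain identifiers. Reject anything else
        # before it is interpolated into the SQL text.
        if not re.fullmatch(r"[A-Za-z0-9_]+", name):
            raise ValueError(f"unexpected table name: {name!r}")
    selects = [f"select {column} from `{dataset}.{name}`" for name in tables]
    # join puts the separator only between items, so no trailing union is left over
    return "\nunion distinct\n".join(selects)


def run_query(sql, project_id="your_project_id"):
    """Run the SQL with the google-cloud-bigquery client (hypothetical helper)."""
    # Imported here so the query can be built without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    return client.query(sql).result()  # iterable of result rows


tables = ["table_A", "table_B", "table_C", "table_D"]
final_sql_string = build_union_query(tables)
print(final_sql_string)
```

Calling run_query(final_sql_string) would then submit the query. Because the list can hold any number of tables, the same call works for 4 tables or 6.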
