11

I have a few cells in my Jupyter notebook that is querying data from Google's BigQuery. Parts of these queries share the same variables. I am looking for the cleanest way to factor these variables so I don't have to manually set them individually every time I want to make a change.

Here is an example of what I have:

%%bigquery df
select * from `project.dataset.table`
where date_str like '2019-02-%'

Here is what I am looking for:

date_str_query = '2019-02-%' # this variable would be in the python environment

%%bigquery df
select * from `project.dataset.table`
where date_str like date_str_query

1 Answer 1

15

##### UPDATE #####

New documentation has a detailed description of various inputs to the magic command. It can be found here: https://cloud.google.com/python/docs/reference/bigquery/latest/magics

Example:

%%bigquery --params {"corpus_name": "hamlet", "limit": 10}
SELECT word, SUM(word_count) as count
FROM `bigquery-public-data.samples.shakespeare`
WHERE corpus = @corpus_name
GROUP BY word
ORDER BY count DESC
LIMIT @limit

##### ORIGINAL POST #####

Unfortunately, this part is missing from the official documentation, and I had to scrape through the code to finally find a solution. I'm documenting my findings here in hopes to help people with the same issue. There are two ways to do this.

Way 1 [not using the magic command at all]:

date_str_query = '2019-02-%'
sql = """
select * from `project.dataset.table`
where date_str like {0}
""".format(date_str_query)
df = client.query(sql).to_dataframe()

Way 2 [magic command rules!]:

params = {'date_str_query': '%-2019-02-%'}

%%bigquery df --params $params
select * from `project.dataset.table`
where date_str like @date_str_query

I prefer Way 2 because you get SQL syntax highlighting and a bunch of other features with the magic command.

References:

Google's Documentation: https://cloud.google.com/bigquery/docs/visualize-jupyter

The source code: https://github.com/googleapis/google-cloud-python/blob/master/bigquery/google/cloud/bigquery/magics.py

Sign up to request clarification or add additional context in comments.

1 Comment

I struggled to make this work with a float parameter value, eg. params={"rate": 0.1} getting an error about non-json-serializable values. Eventually I found that I could work around by passing as a string with cast(@rate as float64) in the query. Confusing!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.