I have a query that runs and returns results in the BigQuery web interface. I now want to run the same query from a Python script, but it fails. Details follow.

Assume this is the query used in the web interface.

SELECT
  MIN(visitStartTime)
FROM (TABLE_DATE_RANGE([123456789.ga_sessions_], TIMESTAMP('2017-02-22'), TIMESTAMP('2017-05-22')))
GROUP BY
  visitId,
  fullVisitorId
LIMIT
  1000

Next, I want to use this very query from Python. First, here are two utility functions (based on Google's references):

import time
import uuid

from google.cloud import bigquery


def async_query(query, project='ga---big-query', max_results=1000):
    client = bigquery.Client(project)
    query_job = client.run_async_query(str(uuid.uuid4()), query)
    query_job.use_legacy_sql = False
    query_job.begin()

    wait_for_job(query_job)

    rows = query_job.results().fetch_data(max_results)
    return rows


def wait_for_job(job):
    while True:
        job.reload()  # Refreshes the state via a GET request.
        if job.state == 'DONE':
            if job.error_result:
                raise RuntimeError(job.errors)
            return
        time.sleep(1)

Lastly, here is the call that runs the query:

query = """SELECT
  MIN(visitStartTime)
FROM (TABLE_DATE_RANGE([94860076.ga_sessions_], TIMESTAMP('2017-02-22'), TIMESTAMP('2017-05-22')))
GROUP BY
  visitId,
  fullVisitorId
LIMIT
  1000
"""

res = async_query(query)

This returns the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-64-1573194bda70> in <module>()
     27 # query = 'SELECT visitId FROM `94860076.ga_sessions_20170802`'
     28 
---> 29 res = async_query(query)

<ipython-input-33-e8addf14673a> in async_query(query, project, max_results)
      5     query_job.begin()
      6 
----> 7     wait_for_job(query_job)
      8 
      9     rows = query_job.results().fetch_data(max_results)

<ipython-input-33-e8addf14673a> in wait_for_job(job)
     16         if job.state == 'DONE':
     17             if job.error_result:
---> 18                 raise RuntimeError(job.errors)
     19             return
     20         time.sleep(1)

RuntimeError: [{'reason': 'invalidQuery', 'location': 'query', 'message': 'Syntax error: Expected "," or "]" but got identifier "ga_sessions_" at [3:34]'}]

I suspect that the problem is with the naming of the tables, but I don't know how to solve it. I did manage to port SELECT visitId FROM [94860076.ga_sessions_20170802] to query = 'SELECT visitId FROM `94860076.ga_sessions_20170802`'

1 Answer

The problem is happening at this line:

query_job.use_legacy_sql = False

Your query is written in Legacy SQL (TABLE_DATE_RANGE and bracketed table names are Legacy SQL syntax), so it should be:

query_job.use_legacy_sql = True

Or you could leave it unassigned, since with this client API the default value is True.
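
If the same helper has to accept queries in both dialects, one option is to guess the dialect from the query text. This is only a rough sketch: looks_like_legacy_sql is a hypothetical helper (not part of the BigQuery client), and it assumes that TABLE_DATE_RANGE, TABLE_QUERY and a bracketed table name right after FROM or JOIN only ever appear in Legacy SQL queries.

```python
import re

# Heuristic markers of Legacy SQL (an assumption, not an exhaustive list):
#  - the TABLE_DATE_RANGE / TABLE_DATE_RANGE_STRICT / TABLE_QUERY functions
#  - a bracketed table reference directly after FROM or JOIN
_LEGACY_HINTS = re.compile(
    r"\bTABLE_(?:DATE_RANGE(?:_STRICT)?|QUERY)\s*\("
    r"|\b(?:FROM|JOIN)\s+\[[^\]]+\]",
    re.IGNORECASE,
)


def looks_like_legacy_sql(query):
    """Return True if the query text contains typical Legacy SQL syntax."""
    return bool(_LEGACY_HINTS.search(query))
```

Inside async_query this could replace the hard-coded value, for example query_job.use_legacy_sql = looks_like_legacy_sql(query). Being a text heuristic, it can misjudge unusual queries, so setting the flag explicitly remains the safer choice.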

Still, I highly recommend that you start using Standard SQL: it is more powerful and more stable, and it is the dialect recommended by the BigQuery team.

The Standard SQL version of your query would be something like:

SELECT
  MIN(visitStartTime)
FROM `94860076.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20170222' AND '20170522'
GROUP BY
  visitId,
  fullVisitorId
LIMIT
  1000
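
Note that _TABLE_SUFFIX is compared as a string, so the dates have to be written as YYYYMMDD rather than as the TIMESTAMP('2017-02-22') values used by TABLE_DATE_RANGE. If you build the query in Python, a small helper can do that conversion. A minimal sketch, where build_sessions_query and its parameters are hypothetical names chosen for illustration:

```python
from datetime import date

# Same Standard SQL query as above, with the variable parts left as placeholders.
QUERY_TEMPLATE = """SELECT
  MIN(visitStartTime)
FROM `{dataset}.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
GROUP BY
  visitId,
  fullVisitorId
LIMIT
  {limit}
"""


def build_sessions_query(dataset, start, end, limit=1000):
    """Fill the template, formatting both dates as YYYYMMDD table suffixes."""
    return QUERY_TEMPLATE.format(
        dataset=dataset,
        start=start.strftime("%Y%m%d"),
        end=end.strftime("%Y%m%d"),
        limit=limit,
    )


query = build_sessions_query("94860076", date(2017, 2, 22), date(2017, 5, 22))
```

The resulting string can then be passed to async_query with use_legacy_sql set to False.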

5 Comments

Great catch! Can you give a hint on how to translate the query I have to the standard version?
Amazing! If I understand correctly, in the web interface, the dialect is the legacy one. Is that correct?
In the web UI you can choose which dialect you want under the Show Options button. I use BigQuery Mate and have Standard SQL set as the default.
Do you know how to port MAX(hits.transaction.transactionId) to Standard SQL?
This might be a bit more complex, as hits is an ARRAY field, so it has to be unnested with UNNEST. I recommend asking a new question describing what you need and what you have tried so far so we can better help you.
