BigQuery: Avoid too long (10 seconds) interval for BigQuery API to get results #7342

Open
wants to merge 2 commits into
base: master

yoshiokatsuneo
Contributor

What type of PR is this?

  • Feature

Description

In the BigQuery runner, query results are fetched by calling BigQuery's getQueryResults API:

https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/getQueryResults

As the default timeout for the API is 10 seconds, the Redash BigQuery runner calls the API again after a 10-second timeout. However, the runner sleeps for another 10 seconds before making that next call.

So the timeline looks like this:
(Please note that the API call is synchronous. The HTTP connection is kept open for up to 10 seconds.)

  • 0-10 seconds: API call
  • 10-20 seconds: sleep
  • 20-30 seconds: API call
  • 30-40 seconds: sleep
  • 40-50 seconds: API call
  • 50-60 seconds: sleep
    ...

So, if the query finishes within 0-10 seconds, everything is fine and the results are fetched on time.
But if the query finishes within 10-20 seconds, the Redash runner always waits until the 20-second mark (even if the query takes just 11 seconds).
The same pattern repeats after that.
If the query finishes within 20-30 seconds, everything is fine and the results are fetched on time.
If the query finishes within 30-40 seconds, the Redash runner always waits until the 40-second mark.
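
The timing above can be checked with a small model. This is only an illustrative sketch, not code from the Redash runner: the function name `result_time` and its parameters are hypothetical, and it assumes that an API call returns as soon as the query finishes (or immediately, if the query already finished) and otherwise blocks for the full timeout.

```python
def result_time(query_seconds, api_timeout=10, sleep_seconds=10):
    """Return the time (in seconds) at which the results become available
    to the runner, under the simplified call/sleep model described above.

    Hypothetical helper for illustration only; not part of Redash.
    """
    call_start = 0
    while True:
        # The call blocks for up to api_timeout seconds. If the query
        # finishes before the call times out, the call returns at that
        # moment, or right away if the query was already finished.
        if query_seconds <= call_start + api_timeout:
            return max(call_start, query_seconds)
        # Otherwise the call times out and the runner sleeps before retrying.
        call_start += api_timeout + sleep_seconds


# Compare the current 10-second sleep with the proposed 1-second sleep.
for q in (5, 11, 14, 25, 35):
    before = result_time(q, sleep_seconds=10)
    after = result_time(q, sleep_seconds=1)
    print(f"query={q}s  sleep=10s -> {before}s  sleep=1s -> {after}s")
```

Under these assumptions, an 11-second query is picked up at 20 seconds with the 10-second sleep and at 11 seconds with a 1-second sleep.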

This PR fixes the issue by changing sleep time from 10 seconds to 1 second.
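
As a rough sketch of the idea (not the actual diff in this PR), a polling loop with a short sleep could look like the following. The names `wait_for_results`, `fetch`, and `poll_interval` are hypothetical; `fetch` stands in for whatever performs the getQueryResults request, and the loop assumes the response carries the API's `jobComplete` field.

```python
import time
from typing import Any, Callable, Dict


def wait_for_results(
    fetch: Callable[[], Dict[str, Any]],
    sleep: Callable[[float], None] = time.sleep,
    poll_interval: float = 1.0,
) -> Dict[str, Any]:
    """Call fetch() until the response reports jobComplete.

    Hypothetical helper for illustration only. fetch() is assumed to block
    for up to the API timeout (10 seconds by default) on its own, so only a
    short pause is needed between attempts.
    """
    while True:
        response = fetch()
        if response.get("jobComplete"):
            return response
        # A short pause keeps the gap between two calls small.
        sleep(poll_interval)
```

Passing `sleep` as a parameter is only a convenience for testing the loop without real delays.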

How is this tested?

  • Manually

I ran a query that takes about 14 seconds and checked the worker log.

Before PR:
(worker log)
Getting the query results took more than 20 seconds, as the second API call was made only after 20 seconds.

[2025-02-22 15:28:29,576][PID:928][INFO][rq.job.redash.tasks.queries.execution] job.func_name=redash.tasks.queries.execution.execute_query job.id=xxx job=execute_query state=executing_query query_hash=3266ea82a8d2a0ec37b50ac4bacd7839 type=bigquery ds_id=2 job_id=xxx queue=queries query_id=4 username=xxxx
[2025-02-22 15:28:29,599][PID:928][INFO][googleapiclient.discovery] URL being requested: GET https://www.googleapis.com/discovery/v1/apis/bigquery/v2/rest
[2025-02-22 15:28:29,899][PID:928][INFO][googleapiclient.discovery] URL being requested: POST https://bigquery.googleapis.com/bigquery/v2/projects/xxxx/jobs?alt=json
[2025-02-22 15:28:30,675][PID:928][INFO][googleapiclient.discovery] URL being requested: GET https://bigquery.googleapis.com/bigquery/v2/projects/xxxx/queries/job_xxx?startIndex=0&alt=json
...
[2025-02-22 15:28:50,694][PID:928][INFO][googleapiclient.discovery] URL being requested: GET https://bigquery.googleapis.com/bigquery/v2/projects/xxxx/queries/job_xxx?startIndex=0&alt=json
[2025-02-22 15:28:51,266][PID:928][INFO][googleapiclient.discovery] URL being requested: GET https://bigquery.googleapis.com/bigquery/v2/projects/xxxx/queries/job_xxx?startIndex=828&alt=json
[2025-02-22 15:28:51,637][PID:928][INFO][rq.job.redash.tasks.queries.execution] job.func_name=redash.tasks.queries.execution.execute_query job.id=xxx job=execute_query query_hash=3266ea82a8d2a0ec37b50ac4bacd7839 ds_id=2 data_length=2408509 error=[None]
[2025-02-22 15:28:51,638][PID:928][INFO][root] Inserted query (3266ea82a8d2a0ec37b50ac4bacd7839) data; id=None

After PR:
(worker log)
Getting the query results took just 14 seconds.

[2025-02-22 15:55:17,463][PID:1078][INFO][rq.job.redash.tasks.queries.execution] job.func_name=redash.tasks.queries.execution.execute_query job.id=xxx job=execute_query state=executing_query query_hash=2a0bde1df5391857f2bd160769d8c867 type=bigquery ds_id=2 job_id=xxx queue=queries query_id=4 username=xxx
[2025-02-22 15:55:17,486][PID:1078][INFO][googleapiclient.discovery] URL being requested: GET https://www.googleapis.com/discovery/v1/apis/bigquery/v2/rest
[2025-02-22 15:55:17,649][PID:1078][INFO][googleapiclient.discovery] URL being requested: POST https://bigquery.googleapis.com/bigquery/v2/projects/xxx/jobs?alt=json
[2025-02-22 15:55:18,402][PID:1078][INFO][googleapiclient.discovery] URL being requested: GET https://bigquery.googleapis.com/bigquery/v2/projects/xxx/queries/job_xxx?startIndex=0&alt=json
...
[2025-02-22 15:55:29,441][PID:1078][INFO][googleapiclient.discovery] URL being requested: GET https://bigquery.googleapis.com/bigquery/v2/projects/paiza-project-1531392112673/queries/job_xxx?startIndex=0&alt=json
[2025-02-22 15:55:32,284][PID:1078][INFO][googleapiclient.discovery] URL being requested: GET https://bigquery.googleapis.com/bigquery/v2/projects/paiza-project-1531392112673/queries/job_xxx?startIndex=820&alt=json
[2025-02-22 15:55:32,761][PID:1078][INFO][rq.job.redash.tasks.queries.execution] job.func_name=redash.tasks.queries.execution.execute_query job.id=xxx job=execute_query query_hash=2a0bde1df5391857f2bd160769d8c867 ds_id=2 data_length=2386333 error=[None]
[2025-02-22 15:55:32,762][PID:1078][INFO][root] Inserted query (2a0bde1df5391857f2bd160769d8c867) data; id=None
