Avatar

Accessing GCP BigQuery: a Golang kata

← Back to list
Posted on 24.10.2023
Last updated on 04.12.2024
Image by AI on Midjourney
Refill!

Table of contents

Recently I got a task of scraping some data from one of our BigQuery tables. It could be easily done through the GCP console, however, my personal account hasn't had the necessary access. On the other hand, a service account that was at my disposal, had it.

So, an obvious solution was to make a small Golang script that makes a query to the desired BQ table. No sooner said than done, behold!

# The script

I hate writing raw SQL queries, especially if a language does not support proper string templating syntax. In case of Go, it is either ugly concatenation of strings, or fmt.Sprintf(). That's why I've decided to make use of a query builder, such as the famous gocraft/dbr. So I make a fake connection, then I make the select statement, and finally I interpolate it for the Postgres dialect.

Also, I always clamp the results using the limit clause. It is automatically attached to the query at the end. This is necessary, because unlike CloudSQL, BigQuery charges you based on the amount of data processed. That's why you probably don't want to grab unnecessary data you are not gonna put into any use.

The table name is also auto-prefixed with the namespace, if any. The service account is loaded from the JSON file, placed locally.

👉 📃  cmd/main.go
package main
import (
"context"
"database/sql"
"fmt"
"log"
"os"
"cloud.google.com/go/bigquery"
"github.com/gocraft/dbr/v2"
"github.com/gocraft/dbr/v2/dialect"
"github.com/lib/pq"
"github.com/luna-duclos/instrumentedsql"
"google.golang.org/api/iterator"
"google.golang.org/api/option"
)
const (
QueryLimit = 10
)
type Record struct {
SKU string `bigquery:"sku"`
Title string `bigquery:"title"`
}
type DB struct {
*dbr.Connection
}
func main() {
projectName := os.Getenv("PROJECT_NAME")
credentialsFile := os.Getenv("CREDENTIALS_FILE")
ctx := context.Background()
sess, closeConn, err := makeFakeSession()
if err != nil {
log.Fatalf("creation error: %v", err)
}
defer closeConn()
query, err := interpolateBQQuery(
sess.Select("sku, title").
From(prefTable("products")),
QueryLimit,
)
if err != nil {
log.Fatalf("interpolation failure: %v", err)
}
fmt.Printf("EXECUTING: %s\n", query)
client, err := bigquery.NewClient(ctx, projectName, option.WithCredentialsFile(credentialsFile))
if err != nil {
log.Fatalf("new bigquery client creation error: %v", err)
}
defer client.Close()
result, err := client.Query(query).Read(ctx)
if err != nil {
log.Fatalf("bigquery read error: %v", err)
}
for {
var row Record
err := result.Next(&row)
if err == iterator.Done {
break
}
if err != nil {
log.Fatalf("error iterating through results: %v", err)
}
fmt.Printf("%v\n", row)
fmt.Println("--------==============================--------")
}
}
func prefTable(sql string) string {
tablePrefix := os.Getenv("TABLE_PREFIX")
return fmt.Sprintf("%s%s", tablePrefix, sql)
}
func makeFakeSession() (*dbr.Session, func() error, error) {
sql.Register("instrumented-postgres", instrumentedsql.WrapDriver(&pq.Driver{}))
masterDB, err := sql.Open("instrumented-postgres", "host= port= user= password= dbname= sslmode=disable")
if err != nil {
return nil, nil, err
}
masterConn := &DB{
Connection: &dbr.Connection{
DB: masterDB,
Dialect: dialect.PostgreSQL,
EventReceiver: &dbr.NullEventReceiver{},
},
}
sess := masterConn.NewSession(nil)
return sess, masterDB.Close, nil
}
func interpolateSelectStatement(statement *dbr.SelectStmt) (string, error) {
buf := dbr.NewBuffer()
err := statement.Build(statement.Dialect, buf)
if err != nil {
return "", err
}
return dbr.InterpolateForDialect(buf.String(), buf.Value(), statement.Dialect)
}
func interpolateBQQuery(statement *dbr.SelectStmt, limit uint64) (string, error) {
statement.Limit(limit)
sqlQuery, err := interpolateSelectStatement(statement)
if err != nil {
return "", err
}
return sqlQuery, nil
}
The code is licensed under the MIT license

As you can see, the script is not rocket science at all, but I am glad I had a chance to work with BQ using Go.

And that's it. The code is available here.


Avatar

Sergei Gannochenko

Business-oriented fullstack engineer, in ❤️ with Tech.
Golang, React, TypeScript, Docker, AWS, Jamstack.
20+ years in dev.