February 20, 2013

How to easily use BigQuery's Querys & Charts with Google Apps Script! (and our FREE library)


BigQuery's ability to digest billions of rows, trillions of records stored in a 1Tb Database in seconds sounds really appealing for massive data-analysis! Using it you can, for example, digest all the sales in the last 20 years of a super market and make a chart with the evolution of the sales of the different brands of cereal inside such period... all of them or just a single one... Put it all inside some graphs and send them attached as a PDF report via Gmail....now that kind of power screams "Google" right? and hopefully might draw a smile on some of your client's face.

Google Apps Script certainly facilitates the use of Big Query! but still we need to get (and wait) for the results and iterate through them, so we (the team at MyWorkers) decided to create a library to facilitate working with BigQuery. So far it helps you with the 3 basic things you need to get started:
  • Sends your query, wait for and retrieve the results. 
  • Iterate through the results. 
  • Build a data table that can be drawn by any appropriate Google Chart. 

This functions are called and works independently, so you can make reuse them many times in your codes to save you a lot of time coding and debugging on the data handing so you can focus more on your charts and the rest of your super cool WebApp or automatic report.

This makes using Big Query VERY easy and straight forward! But before you can start using it, you need to make sure you have the following:



First you need to have (or create) a project in the Google APIS console and take note of the project number.



Also (if you haven't already) you need to enable the Big Query service in the Services tag of the Google APIS console.

To use BigQuery, you also need, well... a query. BigQuery queries are written using a variation of the standard SQL SELECT statement, you can read more details about how to make them in this Query Reference and/or using this Query Cook-book. We will be using one of the example query's (and free public tables) that Google provides:

select TOP(word, 10), COUNT(*) as word_count from publicdata:samples.shakespeare WHERE LENGTH(word) > 10;

Now you are ready to start using BigQuery via Google Apps Script! Let's check it out:



* Create a new script at script.google.com (let's call it bigQueryToCharts) and go to the Resources menu and click the Use Google APIs tag. From there, turn on the Big Query API.



* Then go to Manage libraries (also in the Resources menu) and:

  1. Go to "Find library:" and type: ME6_s9CBqVzsOfhZryNM0vzfIqMXwq4Jm .
  2. Click Select.
  3. In Version click  8 getQueryResults function added.
  4. Turn on development mode for setupBigQueryResults.

It's time to start putting some code, first lets put what we know inside some variables:

var projectNumber = '<ID>'; // The ID of your project.
var sql = 'select TOP(word, 10), COUNT(*) as word_count from publicdata:samples.shakespeare;';


IMPORTANT NOTE: I suggest using the same names for the variables, it's not necessary  but  this way the auto-complete of Apps Script will fill the names of the variables needed by the library when we call it.

Now let's use the library (setupBigQueryResults) to send our SQL to BigQuery, wait for the results and deliver them back, it will offer you options to autocomplete, choose getQueryResults and the line should look as follow:

 var queryResults = setupBigQueryResults.getQueryResults(projectNumber, sql);

To continue, we need to know 2 things: the resultCount (to iterate results) and the resultSchema to build our charts data table, both values are passed by setupBigQueryResults inside queryResults, to get them type as follows:

var resultCount queryResults.getTotalRows();
var resultSquema queryResults.getSchema();


Now lets hand those values back to our library, it's done this way to give more flexibility on the use of the library, (E.g. when processing multiple instances.

var resultValues setupBigQueryResults.iterateResults(queryResults, resultCount);
var bigQueryDataTable setupBigQueryResults.buildChartsDataTable(resultValues, resultSchema);


And that's it! We are ready to pass our data to any (appropriate) chart! Let's see our full example (with the required code to build a basic UI):

function doGet(e) {
  var projectNumber = '<YourProjectsNumber>'; 
  var sql = 'select TOP(word, 30), COUNT(*) as word_count from publicdata:samples.shakespeare WHERE LENGTH(word) > 10;';
  var queryResults = setupBigQueryResults.getQueryResults(projectNumber, sql);
  var resultCount = queryResults.getTotalRows();
  var resultSchema = queryResults.getSchema(); 
  var resultValues = setupBigQueryResults.iterateResults(queryResults, resultCount); 
  var bigQueryDataTable = setupBigQueryResults.buildChartsDataTable(resultValues, resultSchema);
  var chart = Charts.newBarChart()
    .setDataTable(bigQueryDataTable)
    .setDimensions(500, 700)
    .setYAxisTitle('Word')
    .setXAxisTitle('Count')
    .setLegendPosition(Charts.Position.BOTTOM)
    .build();
  var uiApp = UiApp.createApplication();
  uiApp.add(chart);
  return uiApp;
}


You can read all the details about what you can do with BigQuery, along with the original version of this example (without our library) in the Big Query class documentation.

Feel free to try this example, already published as a WebApp and to post your ideas, suggestions, requests, etc at the bottom of this article.

________________________________________________________________________
Important Note: The library provided in this example is setup to run as Me (César Antón Dorantes) and only gives access to the free datasets for Google BigQuery, to use it to process YOUR information (with the Billing activated on your account) you must use a library that runs as YOU. To do so, use the following library:

 "Project key" to import the library from "Resources" -> "Manage Libraries":


MsZBZDQxT88E-QS2OFguwejfIqMXwq4Jm
It will require you to authorize the script to use your BigQuery access, this is only done ONE TIME, to do it, simply run it from the following address:

https://script.google.com/macros/s/AKfycbyTD13yqIya75jntPkdubgq0f3DRsL-eTUDI8ZC9uGx5lYjHiM/exec

The FIRST time you run it, it SHOULD pop up an alert asking you to authorize the library. Once authorized, it should give this output (when run directly from the link):

"Unknown macro doGet".

________________________________________________________________________
Update: Version 3 of the library now available! It now includes Documentation and GNU License. Please Update your library to Version 3 so that your code get's the License for this Library. You can read the Library Documentation in the following link:
https://script.google.com/macros/library/d/MsZBZDQxT88E-QS2OFguwejfIqMXwq4Jm/3
________________________________________________________________________
Update: Version 4 of the library now available! It now incorporates the new method changes described on this articlePlease Update your library to Version 4 so that your code works again with the new methods.
________________________________________________________________________

15 comments:

  1. Hi,

    I have an error here :

    queryResults = BigQuery.Jobs.query(projectNumber, sql, {'timeoutMs':10000}); // TimeOut de [10 seg]

    I created a project from Google API Console and activate the API BigQuery. The log information is :

    Exception: Bad Request

    Are you an idea of this problem. Thank you.

    ReplyDelete
    Replies
    1. If you just tried to run the library's getQueryResults from WITHIN the library, it just won't work since it has no data (Valid project number & SQL query), without those, its simply a library filled with code to be recycled inside many other bigger scripts.

      Try calling it from a doGet() function like the one described at the end of the article.

      Remember you will need to use your own project number.

      Hope this fixes your problem, happy coding ;)

      Delete
  2. When I run the function doGet(e), I have this error

    TypeError: Impossible to call this method "getTotalRows" undefined. [line 13, file "Code": var resultCount = queryResults.getTotalRows();]

    With debugger and run step by step the function doGet() and I have error given in my first email.

    See screenshot
    https://docs.google.com/file/d/0B6UEJFMrvusLVjJnZFNGX1NqaTg/edit?usp=sharing
    https://docs.google.com/file/d/0B6UEJFMrvusLeThpQ3VGcFpYVXc/edit?usp=sharing

    ReplyDelete
    Replies
    1. If you used the exact code and imported the library (plus having activated the BigQuery API) it should work, try checking where the data goes missing by verifying the content of the vars using the Logger, check the outputs in the following order (remember to place the logger right after the variable get's a value:

      Delete
    2. // ******
      Logger.log(sql);
      select TOP(word, 30), COUNT(*) as word_count from publicdata:samples.shakespeare WHERE LENGTH(word) > 10;
      // ---

      Delete
    3. Logger.log(queryResults);
      {"schema":{"fields":[{"name":"f0_","type":"STRING","mode":"NULLABLE"},{"name":"word_count","type":"INTEGER","mode":"REQUIRED"}]},"totalRows":"30","jobReference":{"jobId":"job_0feccabba31f4649b93ecb28a6ad4cd4","projectId":"myworkerscloudmanagement"},"jobComplete":true,"kind":"bigquery#queryResponse","rows":[{"f":[{"v":"counterfeit"},{"v":"28"}]},{"f":[{"v":"remembrance"},{"v":"24"}]},{"f":[{"v":"countenance"},{"v":"24"}]},{"f":[{"v":"acquaintance"},{"v":"23"}]},{"f":[{"v":"satisfaction"},{"v":"20"}]},{"f":[{"v":"entertainment"},{"v":"20"}]},{"f":[{"v":"displeasure"},{"v":"20"}]},{"f":[{"v":"sovereignty"},{"v":"19"}]},{"f":[{"v":"imagination"},{"v":"19"}]},{"f":[{"v":"disposition"},{"v":"19"}]},{"f":[{"v":"treacherous"},{"v":"18"}]},{"f":[{"v":"gentlewoman"},{"v":"18"}]},{"f":[{"v":"circumstance"},{"v":"18"}]},{"f":[{"v":"preparation"},{"v":"17"}]},{"f":[{"v":"notwithstanding"},{"v":"17"}]},{"f":[{"v":"intelligence"},{"v":"17"}]},{"f":[{"v":"everlasting"},{"v":"17"}]},{"f":[{"v":"commonwealth"},{"v":"17"}]},{"f":[{"v":"instruments"},{"v":"16"}]},{"f":[{"v":"expectation"},{"v":"15"}]},{"f":[{"v":"destruction"},{"v":"15"}]},{"f":[{"v":"threatening"},{"v":"14"}]},{"f":[{"v":"superfluous"},{"v":"14"}]},{"f":[{"v":"proclamation"},{"v":"14"}]},{"f":[{"v":"consequence"},{"v":"14"}]},{"f":[{"v":"proceedings"},{"v":"13"}]},{"f":[{"v":"description"},{"v":"13"}]},{"f":[{"v":"apprehension"},{"v":"13"}]},{"f":[{"v":"acknowledge"},{"v":"13"}]},{"f":[{"v":"performance"},{"v":"12"}]}]}
      // ---

      Delete
    4. Logger.log(resultCount);
      30
      // ---

      Delete
    5. Logger.log(resultSchema)
      {"fields":[{"name":"f0_","type":"STRING","mode":"NULLABLE"},{"name":"word_count","type":"INTEGER","mode":"REQUIRED"}]}
      // ---

      Delete
    6. Logger.log(resultValues);
      [[counterfeit, 28], [remembrance, 24], [countenance, 24], [acquaintance, 23], [satisfaction, 20], [entertainment, 20], [displeasure, 20], [sovereignty, 19], [imagination, 19], [disposition, 19], [treacherous, 18], [gentlewoman, 18], [circumstance, 18], [preparation, 17], [notwithstanding, 17], [intelligence, 17], [everlasting, 17], [commonwealth, 17], [instruments, 16], [expectation, 15], [destruction, 15], [threatening, 14], [superfluous, 14], [proclamation, 14], [consequence, 14], [proceedings, 13], [description, 13], [apprehension, 13], [acknowledge, 13], [performance, 12]]
      // ---

      Delete
    7. Logger.log(bigQueryDataTable);
      DataTableBuilder
      // ******

      Delete
    8. Also make sure the library it's working:

      When you type:
      var queryResults = setupBigQueryResults.

      Right after the "." it should automatically give you the 3 functions of the library to autocomplete.

      buildChartsDataTable, getQueryResults & iterateResults.

      Happy coding! =)

      Delete
  3. Hi Cesar,

    Thank you for your help. I don't understand, because I have the 3 functions if the library, when I use CTRL + space (autocomplete). The code is the same. In the log file, I have this info :

    select TOP(word, 30), COUNT(*) as word_count from publicdata:samples.shakespeare WHERE LENGTH(word) > 10;
    Exception: Bad Request

    All information is provided as attachments (links):
    https://docs.google.com/file/d/0B6UEJFMrvusLNU9Cd3NVYmpFSGc/edit?usp=sharing
    https://docs.google.com/file/d/0B6UEJFMrvusLMWdSTEVSM1pIYUE/edit?usp=sharing
    https://docs.google.com/file/d/0B6UEJFMrvusLVmxEOThrYTRVak0/edit?usp=sharing
    https://docs.google.com/file/d/0B6UEJFMrvusLY19zMWlXd1REdWc/edit?usp=sharing


    ReplyDelete
    Replies
    1. Well, it sounds that the BigQuery API it's missing some authorization. Just made a lil adjustment, please try it one more time, if it fails again do the following:

      That library is setup to run as me so it works to the people using the example on the webpage. You can make a copy of your own and give it all the proper authorizations, or, you can use this other one, this one is setup to run as the user accessing it (you), it will need to be done this way if you want to do something that its not inside the free examples anyway:

      https://script.google.com/d/143HERierTapVf40EYOGcQmI4RLht0MzUk7fc7HTejXZFlvfNGAN3sL1Z/edit?usp=sharing

      So this would be the new "Project key" to import the library:
      MsZBZDQxT88E-QS2OFguwejfIqMXwq4Jm

      It will require you to authorize the script to use your BigQuery access, this is only done ONE TIME, to do it, simply run it from the following address:

      https://script.google.com/macros/s/AKfycbyTD13yqIya75jntPkdubgq0f3DRsL-eTUDI8ZC9uGx5lYjHiM/exec

      The FIRST time you run it, it SHOULD pop up a windows asking you to authorize the library. Once authorized, it should give this output (when run directly from the link):

      "Unknown macro doGet"

      Thank you for your feedback! I haven't noticed that the users WILL need a library that runs as them to use it for some real stuff outside the free examples of Google BigQuery. Sorry for any inconveniences.

      Delete
  4. Hi Cesar,

    I solved my problem. First, I inserted the function doGet and updated the script code setupBigQueryResults. It worked (OK histogram). Then I picked up the script code bigQueryToCharts. A systematic error "ReferenceError BigQuery is not defined". I reread the bigquery_tutorial and I updated "Under the Services tab, turn on the BigQuery API.". Then no problem.

    Thank you again for your help.

    ReplyDelete
    Replies
    1. You're welcome. I'm glad It's finally working for you.

      Happy coding.

      Delete

Do your have any questions or suggestions about this blog? Let me know! I'm always happy to receive feedback.