Automating Snowflake deployment using SnowSQL

Snowflake has a great - but underused - command line tool to automate operations.

One of the features we like best about Snowflake, the cloud data platform is its capability to fit into automated deployment pipelines. This blog covers basic ideas on how to run SQL code using automated deployment technology.


This is particulary useful in a number of scenario’s,  amongst which :

  • We want to separate development practices from the actual execution of the code against production data
  • We want to keep a version history of all code that ran against a database
  • We want to automate a number of actions once a development team hits a milestone

The scenario we’ll use to illustrate the idea,

  • Assumes teams or individuals commit code to a central version repository, in our case GIT.
  • Assumes contributions to the repository trigger automated events to create an ad-hoc computing environment outside of Snowflake
  • The ad-hoc computing environment is provisioned with both the committed code as well as the client interface tooling to execute this code against a target database.

We’ll also introduce good code hygiene, by witholding user credentials from our contributed code, instead storing these credentials as

Introducing SnowSQL

Snowflake offers a command line interface client, SnowSQL . The client runs on Windows, MacOS and a variety of Linux distributions. This makes it a great choice to manage our connections to Snowflake .


After installing the client, we can  pass on a parameter to the CLI to execute a series of SQL commands .

snowsql -f code.sql;

The code.sql file contains one straightforward piece of code:

select current_date()

Let’s start from a code versioning system

For this example, we’ll use Github.com  as it combines both of the features we need for the case:

  • Managing code versions through their implementation of GIT.
  • Providing continuous integration and continuous deployment capabilities through their Github Actions feature, which we’ll use to execute SQL code upon committing.

Any standard repository on github will be appropriate for the job. The code that needs execution will be committed to this repo.
Apart from one or more files containing SQL code,  we need to provide Github with instructions on how to execute the SQL.
This can be done by creating a yml file in the .github/workflows directory. This yml file contains Github Actions syntax that will

  • setup a temporary environment
  • install a SnowSQL interface
  • provide a secure environment to shield sensitive information, such as passwords.

A script that can provide all of this functionality using SnowSQL can be boiled down to:

name: SnowSQL
env:
  SNOWSQL_DEST: ~/snowflake
  SNOWSQL_ACCOUNT: tropos_zts.eu-west-1
  SNOWSQL_USER: github
  SNOWSQL_PWD: ${{ secrets.SF_PASSWORD }}
  
on: push                                                  
jobs:                         
  executequery:                           
    name: Install SnowSQL                          
    runs-on: ubuntu-latest                           
    steps:
    - name: Checkout
      uses: actions/checkout@master
    - name: Download SnowSQL
      run:  curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/linux_x86_64/snowsql-1.2.9-linux_x86_64.bash
    - name: Install SnowSQL
      run: SNOWSQL_DEST=~/snowflake SNOWSQL_LOGIN_SHELL=~/.profile bash snowsql-1.2.9-linux_x86_64.bash
    - name: Test installation
      run:  ~/snowflake/snowsql -v
    - name: Execute SQL against Snowflake
      run:  ~/snowflake/snowsql -f code.sql;


Let’s have a look at the major focus areas here

Setting up a temporary environment

First, we decide when to build the environment. To keep things straightforward, we configure the job to run every time one contributes code to the central repository.


The environment will be built every time one contributes code to the repository. We’ll use a version of Ubuntu as a base.

on: push                                                  
jobs:                         
  executequery:                           
    name: Install SnowSQL                          
    runs-on: ubuntu-latest   

Install SnowSQL

As there’s no package-based installation available for Ubuntu, we’ll download an installer from Snowflake’s public website .Alongside this, the checkout@master command copies all files from the repository to the actual run environment.

   steps:
    - name: Checkout
      uses: actions/checkout@master
    - name: Download SnowSQL
      run:  curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/linux_x86_64/snowsql-1.2.9-linux_x86_64.bash
    - name: Install SnowSQL
      run: SNOWSQL_DEST=~/snowflake SNOWSQL_LOGIN_SHELL=~/.profile bash snowsql-1.2.9-linux_x86_64.bash

To make sure out version of SnowSQL is installed, we run a test

    - name: Test installation
      run:  ~/snowflake/snowsql -v

Executing the code

Once the installation is verified, the content in. the code.sql file can be executed. – name: Execute SQL against Snowflake

    - name: Execute SQL against Snowflake
      run:  ~/snowflake/snowsql -f code.sql;

Keeping secrets at bay

SnowSQL takes its configuration parameters from environment variables. As per best practice, we don’t check sensitive information such as access credentials in code repositories – and should actually prevent anyone from doing so.
To make sure SnowSQL is able to access Snowflake anyhow, we make use of the secrets feature in Github Actions.

These are protected variables that are injected in our envionment at run time. Their values aren’t logged neither displayed.

env:
  SNOWSQL_DEST: ~/snowflake
  SNOWSQL_ACCOUNT: my_snowflake_account.eu-west-1
  SNOWSQL_USER: ${{ secrets.SF_USER }}
  SNOWSQL_PWD: ${{ secrets.SF_PASSWORD }}

Conclusion

In a few lines of code, we are able to split development and execution of SQL code using a simplified version of a deployment pipeline.Real-world examples would be far more elaborate, with collaboration strategies, automated quality control and approval mechanisms built in. Nonetheless, this quick setup will ensure an isolated production environment of Snowflake can exist without anyone having access to the privileged access users.

Next steps

This article summarizes the basic concept of GitOps, i.e. deploying code authorized through a central version repository. This is merely an illustration of how we typically run operations in project delivery context. We have packed our experience in building CI/CD (continuous integration, continuous deployment) pipelines, automating data testing and aligning them with development processes. Reach out if you like to know more!

Celine Meulemans

Celine Meulemans

Related articles

In just 6 weeks, Jacob had the opportunity to learn and grow through a series of courses designed to equip him with the skills and knowledge necessary to succeed in the data industry.

Revisiting my 6 weeks onboarding training

If you’re working in a hands-on data role using Snowflake, Databricks, or Bigquery, chances are you’ve encountered dbt as a companion technology. 🎉 On April 3rd, 2023, dbt Labs announced that Tropos.io became one of the 5 premier partners worldwide.

Exclusive! We Are Excited To Be A Dbt Premier Partner in 2023

The how-to guide to interpreting Snowflake's usage-based pricing model.

Calculate your Snowflake monthly cost now using this expert method

Scroll to Top