Automating Snowflake deployment using SnowSQL

Snowflake has a great - but underused - command line tool to automate operations.
Celine Meulemans

One of the features we like best about Snowflake, the cloud data platform, is its capability to fit into automated deployment pipelines. This blog post covers the basic ideas behind running SQL code using automated deployment technology.

This is particularly useful in a number of scenarios, among which:

We want to separate development practices from the actual execution of the code against production data
We want to keep a version history of all code that ran against a database
We want to automate a number of actions once a development team hits a milestone

The scenario we’ll use to illustrate the idea makes the following assumptions:

Teams or individuals commit code to a central version repository, in our case Git.
Contributions to the repository trigger automated events that create an ad-hoc computing environment outside of Snowflake.
The ad-hoc computing environment is provisioned with both the committed code and the client tooling needed to execute this code against a target database.

We’ll also introduce good code hygiene by withholding user credentials from our contributed code, instead storing these credentials as secrets in GitHub Actions.

Introducing SnowSQL

Snowflake offers a command line interface client, SnowSQL. The client runs on Windows, macOS and a variety of Linux distributions. This makes it a great choice to manage our connections to Snowflake.

After installing the client, we can pass the -f parameter to the CLI to execute a series of SQL commands stored in a file.

snowsql -f code.sql;

The code.sql file contains one straightforward piece of code:

select current_date();

Let’s start from a code versioning system

For this example, we’ll use GitHub.com as it combines both of the features we need for the case:

Managing code versions through their implementation of Git.
Providing continuous integration and continuous deployment capabilities through their GitHub Actions feature, which we’ll use to execute SQL code upon committing.

Any standard repository on GitHub will be appropriate for the job. The code that needs execution will be committed to this repo.
Apart from one or more files containing SQL code, we need to provide GitHub with instructions on how to execute the SQL.
This can be done by creating a YAML (.yml) file in the .github/workflows directory. This file contains GitHub Actions syntax that will:

set up a temporary environment
install the SnowSQL client
provide a secure environment to shield sensitive information, such as passwords.

A script that can provide all of this functionality using SnowSQL can be boiled down to:

name: SnowSQL
env:
  SNOWSQL_DEST: ~/snowflake
  SNOWSQL_ACCOUNT: tropos_zts.eu-west-1
  SNOWSQL_USER: github
  SNOWSQL_PWD: ${{ secrets.SF_PASSWORD }}
  
on: push                                                  
jobs:                         
  executequery:                           
    name: Install SnowSQL                          
    runs-on: ubuntu-latest                           
    steps:
    - name: Checkout
      uses: actions/checkout@master
    - name: Download SnowSQL
      run:  curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/linux_x86_64/snowsql-1.2.9-linux_x86_64.bash
    - name: Install SnowSQL
      run: SNOWSQL_DEST=~/snowflake SNOWSQL_LOGIN_SHELL=~/.profile bash snowsql-1.2.9-linux_x86_64.bash
    - name: Test installation
      run:  ~/snowflake/snowsql -v
    - name: Execute SQL against Snowflake
      run:  ~/snowflake/snowsql -f code.sql;

Let’s have a look at the major focus areas.

Setting up a temporary environment

First, we decide when to build the environment. To keep things straightforward, we configure the job to run every time someone pushes code to the central repository.

The environment is built from scratch on each run. We’ll use the latest Ubuntu runner as a base.

on: push                                                  
jobs:                         
  executequery:                           
    name: Install SnowSQL                          
    runs-on: ubuntu-latest
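
Running on every push is the simplest trigger. As a minimal sketch of how a step could limit itself to a single branch, the hypothetical helper below compares the ref of the push with a chosen deployment branch. The function name and the DEPLOY_BRANCH variable are assumptions made for illustration; GITHUB_REF is the variable GitHub Actions sets to the ref that triggered the run (for example refs/heads/main).

```shell
# Hypothetical helper: succeed only when the ref that triggered the run
# is the branch we want to deploy from (DEPLOY_BRANCH, default "main").
should_deploy() {
  local ref="${1:-}"
  [ "$ref" = "refs/heads/${DEPLOY_BRANCH:-main}" ]
}

# Example use inside a workflow step:
#   should_deploy "$GITHUB_REF" || { echo "Not the deployment branch, skipping."; exit 0; }
```

Keeping the check in a small function makes it easy to try out locally before wiring it into the workflow.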

Install SnowSQL

As there’s no package-based installation available for Ubuntu, we’ll download an installer from Snowflake’s public website. Before that, the checkout@master action copies all files from the repository to the run environment.

    steps:
    - name: Checkout
      uses: actions/checkout@master
    - name: Download SnowSQL
      run:  curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/linux_x86_64/snowsql-1.2.9-linux_x86_64.bash
    - name: Install SnowSQL
      run: SNOWSQL_DEST=~/snowflake SNOWSQL_LOGIN_SHELL=~/.profile bash snowsql-1.2.9-linux_x86_64.bash
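
The version number appears several times in the download and install steps. A small sketch of how the installer URL could be derived from a single version string is shown below. The function name is hypothetical, and it assumes that the bootstrap directory in the URL is always the version without its last component (1.2.9 becomes 1.2), which is what the URL above suggests but is not verified for other releases.

```shell
# Hypothetical helper: build the installer URL for a given SnowSQL version.
# Assumption: the bootstrap directory is the version without its last
# component (1.2.9 -> 1.2), as in the URL used in the workflow.
snowsql_installer_url() {
  local version="$1"
  local bootstrap="${version%.*}"
  printf 'https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/%s/linux_x86_64/snowsql-%s-linux_x86_64.bash\n' \
    "$bootstrap" "$version"
}

# Example use in a run step:
#   curl -O "$(snowsql_installer_url 1.2.9)"
```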

To make sure our version of SnowSQL is installed, we run a test:

    - name: Test installation
      run:  ~/snowflake/snowsql -v

Executing the code

Once the installation is verified, the content of the code.sql file can be executed.

    - name: Execute SQL against Snowflake
      run:  ~/snowflake/snowsql -f code.sql;
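
The workflow runs a single file. As a rough sketch of how the same step could handle several SQL files, the hypothetical helper below runs every .sql file in a directory in alphabetical order and stops at the first failure. The function name and the SNOWSQL_BIN variable are assumptions for illustration. exit_on_error is a SnowSQL option that makes the client exit at the first error it encounters; the sketch assumes SnowSQL then returns a non-zero exit code.

```shell
# Path to the SnowSQL binary; defaults to the install location used above.
SNOWSQL_BIN="${SNOWSQL_BIN:-$HOME/snowflake/snowsql}"

# Hypothetical helper: run every .sql file in a directory, in glob
# (alphabetical) order, and stop as soon as one of them fails.
run_sql_dir() {
  local dir="$1" file
  for file in "$dir"/*.sql; do
    # An unmatched glob stays literal, so check the file really exists.
    if [ ! -f "$file" ]; then
      echo "No .sql files found in $dir" >&2
      return 1
    fi
    echo "Running $file"
    "$SNOWSQL_BIN" -f "$file" -o exit_on_error=true || return 1
  done
}

# Example use in a run step:
#   run_sql_dir sql
```

Prefixing file names with a sequence number (01_, 02_, and so on) is one way to make the execution order explicit.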

Keeping secrets at bay

SnowSQL can take its connection parameters from environment variables. As per best practice, we don’t check sensitive information such as access credentials into code repositories, and we should actually prevent anyone from doing so.
To make sure SnowSQL is still able to access Snowflake, we make use of the secrets feature in GitHub Actions.

These are protected variables that are injected into our environment at run time. Their values are neither logged nor displayed.

env:
  SNOWSQL_DEST: ~/snowflake
  SNOWSQL_ACCOUNT: my_snowflake_account.eu-west-1
  SNOWSQL_USER: ${{ secrets.SF_USER }}
  SNOWSQL_PWD: ${{ secrets.SF_PASSWORD }}
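
If a secret is missing or misnamed, the variable simply arrives empty and the connection fails later with a less obvious error. A minimal sketch of a pre-flight check that could run before the SQL step is shown below. The function name is hypothetical; the three variable names are the ones used in the env block above.

```shell
# Hypothetical pre-flight check: fail early when a connection variable is
# missing, and never print the values themselves.
check_snowsql_env() {
  local name value missing=0
  for name in SNOWSQL_ACCOUNT SNOWSQL_USER SNOWSQL_PWD; do
    # Indirect lookup of the variable called $name (names are fixed above).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use in a run step:
#   check_snowsql_env && ~/snowflake/snowsql -f code.sql
```

Only the variable names are reported, so the check is safe to leave in the job output.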

Conclusion

In a few lines of code, we are able to split development and execution of SQL code using a simplified version of a deployment pipeline. Real-world examples would be far more elaborate, with collaboration strategies, automated quality control and approval mechanisms built in. Nonetheless, this quick setup ensures that an isolated production environment of Snowflake can exist without anyone needing direct access to its privileged users.

Next steps

This article summarizes the basic concept of GitOps, i.e. deploying code authorized through a central version repository. This is merely an illustration of how we typically run operations in a project delivery context. We have built up experience in building CI/CD (continuous integration, continuous deployment) pipelines, automating data testing and aligning them with development processes. Reach out if you’d like to know more!
