Snowflake Ecosystem Podcast

S01E04 Governing Data Transformation in Snowflake (with dbt)

In episode 4, Hope Watson (dbt Labs) and Joris Van den Borre (Tropos.io) discuss governing data transformations in a modular, cloud-first data platform.

About this episode

If you’ve been planning to roll out Snowflake, you may have noticed that organizing data transformations looks totally different compared to previous technology generations. dbt seems to be the default choice when it comes to governing business rules in modern data platforms. Hope Watson (dbt Labs) and Joris Van den Borre (Tropos.io) spent half an hour together discussing the rise and ubiquity of the combination.

Key takeaways from the session

The Do’s

✅ Be smart in how you deliver. Get rid of overhead in your data integration practice by taking proven practices from software engineering and applying them to a data context. One of those is “continuous integration”, a practice often used to watch over the quality of your deliverables so the pace of delivery can stay high. Writing code instead of using low-code principles is, counterintuitively, often a more reliable and productive way to speed up the time-to-market for new data products;

✅ Keep your ecosystem efficient. Data transformations are a “spiky” workload, so keep a fit-for-purpose focus for every component in your tech ecosystem. The ecosystem is rich, and a smart mix-and-match between components keeps the total cost of ownership in check whilst making the most of your Snowflake budget;

✅ From a process perspective, it makes sense to consolidate the responsibilities to transform, test and document data. But no one likes to do that, right? And if it happens, it often happens at the very last moment or as part of a technical debt reduction effort. We’re now at a point where the ideal responsibilities for an engineering team can be matched with a way of working such as the one dbt proposes.
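
To make the “test” part of that responsibility concrete, here is a minimal sketch in Python. It is not how dbt is implemented; it only mimics the idea behind dbt's `not_null` and `unique` generic tests with plain SQL against an in-memory SQLite database. The `orders` table, its columns and the `failing_rows` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def failing_rows(conn, table, column, check):
    """Count violations of a simple data-quality check (hypothetical helper)."""
    queries = {
        # Rows where the column is missing.
        "not_null": f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL",
        # Distinct non-null values that occur more than once.
        "unique": (
            f"SELECT COUNT(*) FROM (SELECT {column} FROM {table} "
            f"WHERE {column} IS NOT NULL "
            f"GROUP BY {column} HAVING COUNT(*) > 1)"
        ),
    }
    return conn.execute(queries[check]).fetchone()[0]


# Illustrative data only: one duplicated key and one missing key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "acme"), (2, "globex"), (2, "initech"), (None, "umbrella")],
)

results = {
    check: failing_rows(conn, "orders", "order_id", check)
    for check in ("not_null", "unique")
}
print(results)
```

In a continuous integration setup, a job along these lines could fail the build whenever any count is greater than zero, so that broken data never reaches production unnoticed.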

The Don’ts

❌ Don’t try to reinvent the wheel. Open source is great for experimenting, innovating and validating use cases. However, when projects become really successful, it’s often the innovators who become the helpdesk. Make sure there’s a stable support model, and hence a company, behind the open source software that made your project successful;

❌ Don’t underestimate SQL, the programming language for databases. Really, don’t. SQL code can be hard to scale across teams, regions or projects, but templating engines such as dbt do a great job of managing bits and pieces of complex business logic. By sticking to SQL, teams can remove degrees of freedom that other programming languages offer but that often aren’t strictly necessary to deliver business value. This reduces the complexity of managing your platform, and makes Snowflake a perfect outsourcing partner;

❌ Don’t build processes from scratch. None of them. Yes, it might be enticing to go full-blown from the start on managing your code, checking your quality, going to production and scaling your platform. But it has been done before, and the lessons learned are out there. Make sure you can just copy and paste the bare minimum, and preferably get some guardrails in place from the start.
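
As a rough illustration of the templating idea mentioned above, the sketch below defines one business rule once and reuses it across several SQL queries. dbt itself uses Jinja templates; Python's standard-library `string.Template` is used here only as a simple stand-in. The `NET_REVENUE` rule, the table names and the `render` helper are all hypothetical.

```python
from string import Template

# Hypothetical business rule, defined once and reused by every query.
NET_REVENUE = "(gross_amount - discount - tax)"

# A SQL "model" with placeholders for the rule and the source table.
MODEL = Template(
    "SELECT customer_id, SUM($net_revenue) AS net_revenue\n"
    "FROM $source_table\n"
    "GROUP BY customer_id"
)


def render(source_table):
    """Fill in the template for one source table (illustrative helper)."""
    return MODEL.substitute(net_revenue=NET_REVENUE, source_table=source_table)


eu_sql = render("sales_eu")
us_sql = render("sales_us")
print(eu_sql)
```

If the definition of net revenue changes, only `NET_REVENUE` has to be edited, and every rendered query picks up the new logic. That is the sense in which templating helps plain SQL scale across teams and projects.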
