A public institution has a fixed yearly budget to organize training for Belgian citizens. The training covers a wide range of topics, from those aimed at a broad audience to specific niches. To make the best use of this limited budget, the institution needs to identify the best locations to organize these trainings, based on factors such as local interest, trends in education, the proximity of public transportation, and total addressable audience.
Therefore, they need to analyze where each enrolled citizen lives and compare this to the training they enrolled in. This information is analyzed within the specific context of each citizen. To avoid any privacy concerns, the client doesn't want to analyze this use case on a per-citizen basis. Instead, individual context is aggregated to a level that preserves anonymity but is still specific enough to derive meaningful conclusions.
The core challenge in this exercise is to take the residences of previously enrolled attendees and project them onto a geographical map. The exercise needed to be repeatable, and our client wanted a hands-on approach for their own process analysts, which meant avoiding a complex technical setup.
Organizing the majority of data work in Snowflake
The core challenge was solved using Snowflake technology, which includes powerful geospatial analytics features to determine reasonable proximity between attendees and training locations. To cover the use case, our team needed to solve two challenges:
Find coordinates for addresses in AWS
The first challenge was to translate the residences to map coordinates. We did so by feeding the addresses in text format into Amazon Location Service, a service provided by Amazon Web Services to cover that exact use case. The service, an API, is meant for application developers and is therefore not specifically designed to support analytical processing in the way our use case requires.
To bridge the gap between address data stored in Snowflake and the Amazon Location Service, we made use of Snowflake's external functions, a capability to reach virtually any kind of third-party service that is accessible through an API (application programming interface). By integrating this capability, we extended Snowflake's SQL language with capabilities for address processing, leveraging Amazon Web Services service offerings.
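To make the bridge more concrete, here is a minimal sketch of what the remote side of such an external function could look like, assuming it is an AWS Lambda function behind an API Gateway proxy integration. Snowflake sends rows in batches as a JSON body of the form {"data": [[row_number, value], ...]} and expects the same shape back. The place index name, the function names, and the shape of the returned value are hypothetical and chosen for illustration; this is not the setup used in the project.

```python
import json
from typing import Callable, Optional, Tuple

# (longitude, latitude), or None when no match was found
Coordinates = Optional[Tuple[float, float]]

# Hypothetical place index name; a real one is created in Amazon Location Service.
PLACE_INDEX = "example-place-index"


def geocode_with_amazon_location(address: str) -> Coordinates:
    """Look up one free-text address through Amazon Location Service."""
    import boto3  # imported lazily so the batch logic can run without AWS access

    client = boto3.client("location")
    response = client.search_place_index_for_text(
        IndexName=PLACE_INDEX, Text=address, MaxResults=1
    )
    results = response.get("Results", [])
    if not results:
        return None
    lon, lat = results[0]["Place"]["Geometry"]["Point"]
    return (lon, lat)


def handle_batch(event: dict, geocode: Callable[[str], Coordinates]) -> dict:
    """Translate one Snowflake external-function batch into coordinates."""
    rows = json.loads(event["body"])["data"]
    output = []
    for row_number, address in rows:
        coords = geocode(address)
        value = None if coords is None else {"lon": coords[0], "lat": coords[1]}
        # Each output row must carry the row number Snowflake sent.
        output.append([row_number, value])
    return {"statusCode": 200, "body": json.dumps({"data": output})}


def lambda_handler(event, context):
    """Lambda entry point (assumed API Gateway proxy integration)."""
    return handle_batch(event, geocode_with_amazon_location)
```

Passing the geocoder in as a parameter keeps the batch handling separate from the AWS call, so the row bookkeeping can be checked locally with a stub before any cloud resources exist.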
Governing data privacy through geographical mapping
To protect individual attendees' data, we used Snowflake's out-of-the-box geospatial capabilities to map individual address coordinates to statistical sectors, a mapping system used in Belgium to analyze demographics in a statistically sound way.
As a result, our team was able to present an analytical dashboard capable of on-the-fly, segmented analytics based on various determining factors in a GDPR-compliant way. After its release to internal analysts, the governmental agency was able to better allocate funds for niche training, as well as promote better-located alternatives for well-attended training based on determining factors such as public infrastructure.
In total, the process took about 6 weeks from idea to deliverable, thanks to our library of prebuilt components that cover similar use cases.
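The idea behind this mapping can be illustrated outside Snowflake with a small Python sketch. It assigns each coordinate to the polygon that contains it, using a standard ray-casting point-in-polygon test, and then reports only a count per sector. The sector codes, the toy square polygons, and the min_group_size threshold are assumptions made for illustration; real statistical sectors have irregular boundaries, and inside Snowflake this work would be done by its geospatial functions rather than by hand-written code.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)
Polygon = List[Point]        # vertices in order, without repeating the first


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sector_for(point: Point, sectors: Dict[str, Polygon]) -> Optional[str]:
    """Return the code of the first sector containing the point, if any."""
    for code, polygon in sectors.items():
        if point_in_polygon(point, polygon):
            return code
    return None


def attendees_per_sector(
    points: List[Point], sectors: Dict[str, Polygon], min_group_size: int = 1
) -> Dict[str, int]:
    """Aggregate to sector level; drop sectors below the (assumed) minimum size."""
    counts = Counter(sector_for(p, sectors) for p in points)
    counts.pop(None, None)  # ignore points that fall outside every sector
    return {code: n for code, n in counts.items() if n >= min_group_size}


# Toy data: two adjacent unit squares standing in for statistical sectors.
sectors = {
    "SECTOR_A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "SECTOR_B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
residences = [(0.5, 0.5), (0.2, 0.8), (1.5, 0.5), (5.0, 5.0)]
print(attendees_per_sector(residences, sectors))
```

Once the counts are produced, individual coordinates are no longer needed downstream, which is the point of aggregating to sector level. Raising min_group_size would additionally suppress sectors with very few attendees.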