  • Tim Burns

AWS and Snowflake - DevSecOps Style



I find many articles whose purpose is to check a box on a feature checklist. Yes (check), Snowflake does integrate with Glue. But if you follow the DevSecOps style to create accounts, use secrets, and automate the ETL process in Glue, you can get the most out of your Snowflake data warehouse at minimal cost on the ETL provider front.


However, before starting down that path, start with security. Many articles skip this important step, and the developer risks exposing a password. No bueno! Take a DevSecOps approach and put security first.


First order of business: establish MFA on Snowflake and AWS so that privileged operations are properly identified.

Create the Snowflake Application User and Store Securely

Avoid issues with leaked passwords by making the Snowflake application user throw-away. Decouple the user from the application by regenerating the user often with SnowSQL and by using AWS Secrets Manager to encapsulate the credentials.


!set variable_substitution=true;
-- Values such as &{user} are passed in with snowsql -D name=value.
-- Recreate the throw-away application user with a fresh password.
create or replace user &{user}
    identified by '&{randomPassword}'
    default_role = &{defaultRole}
    default_namespace = &{database};
alter user &{user} set default_warehouse = &{warehouse};
alter user &{user} set default_namespace = &{database};
alter user &{user} set default_role = &{defaultRole};
-- Grant the role to the user, then give the role access to the database.
grant role &{defaultRole} to user &{user};
grant all privileges on database &{database} to role &{defaultRole};

Then run the Makefile entry with an MFA-protected administrator account.

# Generate a random password, then create the user with SnowSQL.
# Note: this prints the password to the terminal. Recipe lines must start with a tab.
create-app-user:
	$(eval randomPassword := $(shell openssl rand -base64 20))
	echo $(randomPassword)
	snowsql -f snowflake/access/createAppUser.sql -D database=${SNOWFLAKE_APP_DATABASE} -D user=${SNOWFLAKE_APP_USER} \
		-D randomPassword='$(randomPassword)' -D defaultRole=${SNOWFLAKE_APP_ROLE} \
		-D warehouse=${SNOWFLAKE_APP_WAREHOUSE}

Again, with MFA, set up the account's credentials as a secret in Secrets Manager on AWS.
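If you would rather script this step than click through the console, here is a minimal sketch. It assumes you pass in a Secrets Manager client such as `boto3.client("secretsmanager")`. The function name, the secret name, and the JSON key names are my own choices for illustration, not something Snowflake or Glue requires.

```python
import json


def store_snowflake_secret(client, secret_name, user, password, account):
    """Store Snowflake credentials as one JSON secret and return its ARN.

    `client` is expected to behave like boto3.client("secretsmanager").
    The JSON key names below are an assumed layout, not a required one.
    """
    secret_string = json.dumps(
        {"user": user, "password": password, "account": account}
    )
    response = client.create_secret(
        Name=secret_name,
        Description="Throw-away Snowflake application user",
        SecretString=secret_string,
    )
    # create_secret returns the ARN of the new secret in its response.
    return response["ARN"]
```

Note that `create_secret` fails if a secret with that name already exists, so when you regenerate the user you would write the new password with `put_secret_value` instead.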

After creating the secret, store the secret's ARN in an environment variable to use when giving the AWS Glue job access for loading the data.
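To make "giving access" concrete, here is a rough sketch that reads the ARN from an environment variable and builds a least-privilege IAM policy document for the Glue job's role. The variable name `SNOWFLAKE_SECRET_ARN` and both function names are hypothetical. Attaching the policy to the role is left to whatever tooling you already use.

```python
import json
import os


def glue_secret_policy(secret_arn):
    """Build an IAM policy document that allows reading one secret only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": secret_arn,
            }
        ],
    }


def policy_from_env(env=os.environ, var="SNOWFLAKE_SECRET_ARN"):
    """Read the ARN from the environment and return the policy as JSON.

    SNOWFLAKE_SECRET_ARN is a hypothetical variable name for this sketch.
    """
    return json.dumps(glue_secret_policy(env[var]), indent=2)
```

Scoping `Resource` to the single ARN, rather than `*`, means a compromised Glue job can read this one secret and nothing else. Some caching clients also call `DescribeSecret`; add that action if yours does.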


When you load the password, use secret caching and the ARN to get the credentials securely.
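As a sketch of what cached loading looks like, the class below fetches the secret by ARN once and reuses it until a time-to-live expires. It is a hand-rolled stand-in that needs only the standard library. AWS also publishes a caching client for Python (`aws-secretsmanager-caching`), which may be the better choice in a real job. The class name, the TTL default, and the assumption that the secret is a JSON string are all mine.

```python
import json
import time


class CachedSecret:
    """Fetch a secret by ARN and reuse it until the TTL expires."""

    def __init__(self, client, secret_arn, ttl_seconds=3600, clock=time.monotonic):
        # `client` is expected to behave like boto3.client("secretsmanager").
        self._client = client
        self._secret_arn = secret_arn
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = 0.0

    def get(self):
        """Return the parsed secret, calling AWS only when the cache is stale."""
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            response = self._client.get_secret_value(SecretId=self._secret_arn)
            # Assumes the secret was stored as a JSON string.
            self._value = json.loads(response["SecretString"])
            self._expires_at = now + self._ttl
        return self._value
```

Because the user is throw-away, keep the TTL short enough that a regenerated password is picked up without redeploying the job.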

The details are a bit subtle; follow along here:


https://aws.amazon.com/blogs/big-data/use-aws-glue-to-run-etl-jobs-against-non-native-jdbc-data-sources/


