How To Load Data From Azure Blob Storage Into Snowflake

This simple Snowflake project details how to load data from Azure Blob Storage into the Snowflake data warehouse. You do not need any prior experience, and there is no tool to download, to complete this tutorial.

Snowflake is a leading cloud data warehouse, and it continues to gain popularity in the ever-growing data industry. I recommend that every data engineer and analyst – beginner or expert – who is yet to learn about this platform do so soon.

If you’re already familiar with other popular relational databases like SQL Server, MySQL, or Oracle, you will find it easier to understand the Snowflake data warehouse (especially the ANSI SQL queries).

In this tutorial, I will show you how to:

  1. Create your Azure Blob Storage account
  2. Create your Snowflake account
  3. Create Snowflake database objects such as tables, file formats, stages
  4. Load data from the Azure Blob Storage account
  5. Connect to Snowflake from Power BI
  6. Create a dashboard with Power BI

Let’s get started.

1. Set up your Azure Blob Storage account

Here’s a video tutorial to help you complete this step.

Here’s an overview:

  1. Visit the Azure website to create a free account (if you don’t already have one).
  2. Create a resource group
  3. Create a storage account
  4. Create a container
  5. Upload your file

If done correctly, you should have something like this.

[Screenshot: Azure sample superstore directory]

Here are the Azure services/objects I created and their names:

Azure Service or Object | Name
Resource group          | Training
Storage account         | abcdef
Container               | sample-data
File                    | sample-superstore-2022.csv

Make the resource types accessible via SAS

  1. Go to your Storage account page on Azure
  2. Under the Security + Networking section on the left side of the page, click Shared access signature.
  3. Select the checkboxes Service, Container, and Object.

[Screenshot: Azure storage account - Shared access signature settings]

At the bottom of the page, click the button Generate SAS and connection string.

[Screenshot: Azure storage account - generated SAS token]

Great job! Your Azure account is ready.

2. Create your Snowflake account

[Screenshot: Snowflake sign-up page]

Visit the Snowflake website. Click the START FOR FREE button.

Sign in, or sign up for a new account if you don’t have one already.

For the purpose of this tutorial, use the ACCOUNTADMIN role.

To do this, on the top-left corner of your page, click your name, then click current role, and then select the ACCOUNTADMIN role.

[Screenshot: Snowflake Snowsight homepage]

You can learn more about the Snowflake access control framework and roles here.
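
If you prefer to stay in the worksheet, you can also switch roles with a single statement (assuming your user has been granted ACCOUNTADMIN):

-- Switch the active role for the current session
USE ROLE ACCOUNTADMIN;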

3. Create Snowflake database objects

On the top-right corner of the page, click the Worksheet button to create a new worksheet.

This worksheet will be used to create our Snowflake objects.

1. Rename your worksheet

To do this, click the timestamp on the top-left of your page.

[Screenshot: Rename worksheet]

Select the Databases tab on the left pane. This will allow us to see the database we are about to create.

2. Create your database

CREATE DATABASE SAMPLE_SUPERSTORE_2022

Type the query above into your worksheet and click the play button on the top-right corner of the page.

If your query ran successfully, you should get status feedback like the one shown below.

[Screenshot: Create database status]

3. Create your schema

In the “spirit of SQL Server”, I will name my schema DBO.

CREATE SCHEMA DBO

4. Create your warehouse

Now, create your warehouse. I’ll call mine MYWH.

CREATE WAREHOUSE MYWH

[Screenshot: Create objects]
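
If you want more control over compute cost, Snowflake also lets you set a size and auto-suspend behavior when creating the warehouse. Here is a sketch with illustrative settings (the size and timeout are just examples, not requirements):

CREATE OR REPLACE WAREHOUSE MYWH
    WAREHOUSE_SIZE = 'XSMALL'  -- smallest (and cheapest) size
    AUTO_SUSPEND = 60          -- suspend after 60 seconds of inactivity
    AUTO_RESUME = TRUE;        -- resume automatically when a query arrives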

At this point, the editor pane has changed its scope to that of your newly created database, schema, and warehouse.
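
If the worksheet context is ever reset (for example, after you reopen the worksheet), you can also set it explicitly:

USE DATABASE SAMPLE_SUPERSTORE_2022;
USE SCHEMA DBO;
USE WAREHOUSE MYWH;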

5. Create your table

Now, create your table. We will be importing our CSV data into this table shortly.

CREATE OR REPLACE TABLE ORDERS (
    ROW_ID INTEGER,
    ORDER_ID STRING,
    ORDER_DATE DATE,
    SHIP_DATE DATE,
    SHIP_MODE STRING,
    CUSTOMER_ID STRING,
    CUSTOMER_NAME STRING,
    SEGMENT STRING,
    COUNTRY_REGION STRING,
    CITY STRING,
    STATE_PROVINCE STRING,
    POSTAL_CODE STRING,
    REGION STRING,
    PRODUCT_ID STRING,
    CATEGORY STRING,
    SUB_CATEGORY STRING,
    PRODUCT_NAME STRING,
    SALES FLOAT,
    QUANTITY INTEGER,
    DISCOUNT FLOAT,
    PROFIT FLOAT
);

Since this is a tutorial, I’m using the maximum length for the data types. For instance, STRING is the same as VARCHAR(16777216) in Snowflake and varchar(max) in SQL Server.

To be more precise, feel free to use VARCHAR(<length>) or NUMBER(<precision>, <scale>). To learn more about Snowflake data types, read the documentation here.
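
For example, a more tightly typed version of the first few columns might look like the sketch below. The lengths and precisions here are assumptions for illustration, not values derived from the dataset.

-- Illustrative only: pick lengths/precisions that fit your own data
CREATE OR REPLACE TABLE ORDERS_TYPED (
    ROW_ID NUMBER(10, 0),
    ORDER_ID VARCHAR(20),
    ORDER_DATE DATE,
    SALES NUMBER(12, 2)
    -- ...remaining columns follow the same pattern
);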

6. Create a CSV File Format object

This file format object contains important information about the data to be loaded from Azure.

CREATE FILE FORMAT AZURE_CSV
    TYPE = CSV
    SKIP_HEADER = 1
    EMPTY_FIELD_AS_NULL = TRUE
    ENCODING = 'ISO-8859-1'
    FIELD_OPTIONALLY_ENCLOSED_BY = '"';

  • TYPE = CSV: The file to be loaded is in CSV format.
  • SKIP_HEADER = 1: Skip the first row, which contains the column headers.
  • EMPTY_FIELD_AS_NULL = TRUE: Insert SQL NULL for empty fields in the file.
  • ENCODING = 'ISO-8859-1': Handles non-ASCII characters, such as those in “Barry Französisch”, contained in the data.
  • FIELD_OPTIONALLY_ENCLOSED_BY = '"': Avoid issues with quoted strings that contain commas.

Read more on creating file formats and other format type options.
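
To double-check the options Snowflake stored (including the defaults you did not set), you can describe the file format object:

-- Inspect the file format, including default options you did not override
DESCRIBE FILE FORMAT AZURE_CSV;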

7. Create a stage

A stage is a named location that points Snowflake to where your data files live – in this case, your Azure container – before they are loaded into your table.

CREATE OR REPLACE STAGE STAGE_ORDERS
    url='azure://grionic.blob.core.windows.net/sample-data'
    credentials=(azure_sas_token='?sv=2021-06-08&ss=bfqt&srt=sco&sp=rwdlacupiytfx&se=2022-11-21T13:38:34Z&st=2022-11-19T05:38:34Z&spr=https&sig=rHstIKlzWVH1WSjkm9N06GP6Ph%2FdX21IAQTe%2FsKjAuw%3D')
    file_format = AZURE_CSV;

Test your stage to make sure it’s set up correctly.

list '@stage_orders/sample-superstore-2022.csv'

[Screenshot: Test stage output]

Next, preview the staged data.

SELECT 
    s.$1 AS ROW_ID, 
    s.$2 AS ORDER_ID, 
    s.$3 AS ORDER_DATE,
    s.$4 AS SHIP_DATE, 
    s.$5 AS SHIP_MODE,
    s.$6 AS CUSTOMER_ID, 
    s.$7 AS CUSTOMER_NAME,
    s.$8 AS SEGMENT,
    s.$9 AS COUNTRY_REGION,
    s.$10 AS CITY,
    s.$11 AS STATE_PROVINCE,
    s.$12 AS POSTAL_CODE,
    s.$13 AS REGION,
    s.$14 AS PRODUCT_ID,
    s.$15 AS CATEGORY,
    s.$16 AS SUB_CATEGORY,
    s.$17 AS PRODUCT_NAME,
    s.$18 AS SALES,
    s.$19 AS QUANTITY,
    s.$20 AS DISCOUNT,
    s.$21 AS PROFIT
FROM '@stage_orders/sample-superstore-2022.csv' AS s
LIMIT 100;

[Screenshot: List data output]

4. Load data from the Azure Blob Storage account into Snowflake

Now is the time for the real magic – loading your data from Azure into Snowflake.

COPY INTO ORDERS FROM '@STAGE_ORDERS/sample-superstore-2022.csv'

Your result should look like this.

[Screenshot: Copy data result]

You can now use select statements to query your table.

SELECT * FROM ORDERS LIMIT 100
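
You can also confirm what was loaded using Snowflake's COPY_HISTORY table function. The one-hour lookback below is just an example window:

-- Review the most recent load into ORDERS (example: last hour)
SELECT *
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
    TABLE_NAME => 'ORDERS',
    START_TIME => DATEADD('hour', -1, CURRENT_TIMESTAMP())
));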

Great job!

5. Connect to Snowflake from Power BI

Open your Power BI Desktop app.

Click Get Data and search for "Snowflake". Click the Connect button.

[Screenshot: Power BI Get Data - Snowflake connector]

You will be prompted to input your server name and warehouse.

  • Your Server name can be found in your activation email. This is the first email you received when you created your account. Do not add “https://” to the server name.
  • Your Warehouse is the one you created earlier.

Add the required information and click OK.

[Screenshot: Power BI Snowflake connection prompt]

You may be required to add your Snowflake account username and password for additional verification.

[Screenshot: Power BI load data]

If you were able to get to this point, good job! Congratulations on completing this Snowflake project.

6. Create a dashboard with Power BI

Finally, you can create a simple dashboard with Power BI Desktop to complete this project.

I went above and beyond on this dashboard to brush up on my Power BI formatting skills. This is the #WOW2022 Week 43 Power BI challenge.

Here’s a snapshot of my dashboard.

[Screenshot: Superstore dashboard with advanced formatting]

Bonus

Let’s write and run some simple SQL queries to draw some insights.

Query 1. Annual Sales, Quantity & Profits

What are the total sales, quantity, and profit by year? Create a basic bar chart to highlight these KPIs.

SELECT 
      YEAR(ORDER_DATE) AS ORDER_YEAR
      ,CAST(YEAR(ORDER_DATE) AS STRING) AS ORDER_YEAR_STR
      ,SUM(SALES) AS TOTAL_SALES
      ,SUM(QUANTITY) AS TOTAL_QUANTITY
      ,SUM(PROFIT) AS TOTAL_PROFIT
FROM ORDERS
GROUP BY ORDER_YEAR
ORDER BY ORDER_YEAR;

[Screenshot: Annual sales, quantity, and profit results]

You can also play around with the charts available on Snowflake.

[Screenshot: Annual sales, quantity, and profit chart]

Query 2. Date with the largest sale

On what date did the Superstore company record its largest single sale?

SELECT 
      ORDER_DATE 
FROM ORDERS 
WHERE SALES = (SELECT MAX(SALES) FROM ORDERS)
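
Note that this query returns the date of the single largest sale. If you instead want the date with the highest total sales across all of its orders, a variant like this would work:

SELECT 
      ORDER_DATE
      ,SUM(SALES) AS TOTAL_SALES
FROM ORDERS
GROUP BY ORDER_DATE
ORDER BY TOTAL_SALES DESC
LIMIT 1;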

Query 3: Most expensive and least expensive technology products

For each technology sub-category, what are the highest and lowest sale amounts, and what is the difference between them?

SELECT 
      SUB_CATEGORY
      ,MAX(SALES) AS MAX_SALES
      ,MIN(SALES) AS MIN_SALES
      ,MAX(SALES) - MIN(SALES) AS SALES_DIFF
FROM ORDERS
WHERE CATEGORY = 'Technology'
GROUP BY SUB_CATEGORY
ORDER BY SALES_DIFF DESC

Your result should look like this.

[Screenshot: Max and min sale difference results]
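
If you also want to see which technology products those maximum and minimum sale amounts belong to, Snowflake's MIN_BY and MAX_BY aggregate functions offer one way to do it:

-- Return the product names tied to the highest and lowest sale amounts
SELECT 
      MAX_BY(PRODUCT_NAME, SALES) AS MOST_EXPENSIVE_PRODUCT
      ,MIN_BY(PRODUCT_NAME, SALES) AS LEAST_EXPENSIVE_PRODUCT
FROM ORDERS
WHERE CATEGORY = 'Technology';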

Conclusion

I hope you enjoyed this tutorial on how to load data from Azure Blob Storage into Snowflake.

In this tutorial, you learned:

  • how to create an Azure account and upload the Sample Superstore CSV dataset
  • how to set up a Snowflake data warehouse and load data from Azure
  • and how to connect to the Snowflake data warehouse from Power BI to create a dashboard.

I hope this tutorial was helpful.

Happy Analyzing!
