Add Azure Blob Partitions to Azure SQL Table


I have partitioned parquet files in Azure Blob that I am copying to Azure SQL. How do I get the partition name into the SQL table?

I've figured out how to get the full file path into the SQL table by adding an Additional Column in the source data section of the Copy Activity (image 1 & 2), but I'm trying to figure out how to regex the full file path down to just the partition name (202105).

In the data preview for the source data in the Copy Activity, it shows the time_period column with just the partition name (image 3). But when it shows up in SQL, it is NULL for all rows (or it's the full file path, depending on whether I added Additional Columns in the source data section of the Copy Activity).

Image 1: using $$FILEPATH as an Additional Column

Image 2: setting the parameter for the file path

Image 3: the time_period column showing up in the source data preview

I've tried changing the data type for time_period to an INT in Azure SQL. I've tried parsing the $$FILEPATH, but nothing I've tried has worked.

I'm basically starting from scratch, as I'm sure there's a better way. Extra background here and possibly here.

Similar to this

1 Answer

As explained here in the MS docs, you can utilize the enablePartitionDiscovery feature.

Source: partitioned files:


Source Dataset:

Just mention the container name and leave the directory and file fields empty. We will filter the files using wildcard paths in the Copy Activity.

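For reference, the dataset JSON ends up looking roughly like this. The dataset and linked service names (ParquetPartitioned, AzureBlobStorageLS) are just placeholders from my setup; the point is that the only location property filled in is the container:

{
    "name": "ParquetPartitioned",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLS",
            "type": "LinkedServiceReference"
        },
        "type": "Parquet",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "sink"
            }
        },
        "schema": []
    }
}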

Configure the source in the Copy Activity with respect to your file path:

Note: you can skip step 4, i.e. the additional column with $$FILEPATH; it is shown only for reference. You can drop it, since enablePartitionDiscovery already gives you the ready-made column.
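If you do keep it (or if you go with the stored-procedure alternative at the end), the additional column is just an entry under additionalColumns in the Copy Activity source; the column name (file_path here, matching my run) is whatever you typed in the UI, and $$FILEPATH is the reserved variable:

"source": {
    "type": "ParquetSource",
    "additionalColumns": [
        {
            "name": "file_path",
            "value": "$$FILEPATH"
        }
    ]
}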


To pick up a single folder, set the wildcard path as below.

Wildcard paths: sink/columnparts/time_period=202105/*.parquet

For multiple folders (time_period=202105, time_period=202106, ... as seen in the previous snip), set it as below; ** takes the place of any folder under the parent folder columnparts.

Wildcard paths: sink/columnparts/**/*.parquet

Partition root path: this should point to the parent folder under which all the partition folders sit.

In my example: sink/columnparts

The partition root path must be provided when you enable partition discovery.

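For reference, here is a rough sketch of how these settings end up in the Copy Activity source JSON. The container (sink) comes from the dataset, so the wildcard folder path is relative to it, and the partition root path is written the way the UI shows it (sink/columnparts); treat the exact strings as illustrative for my folder layout rather than something to copy verbatim:

"source": {
    "type": "ParquetSource",
    "storeSettings": {
        "type": "AzureBlobStorageReadSettings",
        "recursive": true,
        "wildcardFolderPath": "columnparts/**",
        "wildcardFileName": "*.parquet",
        "enablePartitionDiscovery": true,
        "partitionRootPath": "sink/columnparts"
    },
    "formatSettings": {
        "type": "ParquetReadSettings"
    }
}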

Sink: optionally update an existing table or just auto-create a new one.

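Nothing special is needed on the sink side; a minimal sketch of the sink JSON, assuming you let the copy auto-create the table, is:

"sink": {
    "type": "AzureSqlSink",
    "writeBehavior": "insert",
    "tableOption": "autoCreate"
}

If you write into an existing table instead, make sure it has a time_period column and that the mapping has been refreshed (see the error note further down).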

View from SQL DB: time_period column holds the value 202105

time_period=202105/part-00004-fcbe0bf5-2c93-45f5-9bb2-2f9089a3e83a-c000.snappy.parquet


If you see an error at this point, you most likely have a mapping that is not updated. In the mapping section, you can clear or reset the schema and import the schema again just to be sure.

In my case, the culprit was the additional column file_path.


--OR--

$$FILEPATH is a reserved variable; you cannot use it in the expression builder or manipulate it with functions.

Instead, you can incorporate a step after the copy to the SQL DB, i.e. a stored procedure as below.

Here the column path holds the full file path received from $$FILEPATH, as you have already managed, and StoreParquetTest is the table created in the SQL sink.

CREATE PROCEDURE trimpath
AS
    -- Keep only the text between '=' and the first '/',
    -- e.g. 'time_period=202105/part-00004-...snappy.parquet' -> '202105'
    UPDATE StoreParquetTest
    SET path = SUBSTRING(path, CHARINDEX('=', path) + 1, CHARINDEX('/', path) - CHARINDEX('=', path) - 1);
GO

Now you can use the Stored Procedure activity in the pipeline after the Copy Activity.
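A sketch of how that activity sits in the pipeline JSON, chained after the copy; the activity name CopyToSql and the linked service name AzureSqlDatabaseLS are placeholders for whatever yours are called:

{
    "name": "TrimPath",
    "type": "SqlServerStoredProcedure",
    "dependsOn": [
        {
            "activity": "CopyToSql",
            "dependencyConditions": [ "Succeeded" ]
        }
    ],
    "linkedServiceName": {
        "referenceName": "AzureSqlDatabaseLS",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "storedProcedureName": "[dbo].[trimpath]"
    }
}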
