OPENROWSET Bulk Load multiple JSON files from Azure Blob Storage

269 Views Asked by At

I am trying to read from an Azure Blob Storage containing multiple nested-JSON files, eg:

0000001.json
0000002.json
0000003.json
...

I created an External Data Source.

So far, I managed to successfully read each file one by one, using OPENROWSET and BULK, but explicitly stating the name of the file, like in following SQL code:

SELECT Obj1.*
FROM OPENROWSET(
    BULK '00000002.json', 
    DATA_SOURCE = 'MyDataSource', 
    SINGLE_CLOB
) as DataFile   
CROSS APPLY OPENJSON(BulkColumn) 
...

To get the list of all files I tried:

SELECT Obj1.*
FROM OPENROWSET(
    BULK '*.json', 
    DATA_SOURCE = 'MyDataSource',
    SINGLE_CLOB
) as DIR    

but I get an error

Cannot bulk load. The file "*.json" does not exist or you don't have file access rights.

I am using a SAS key with full listing and reading access, hence I don't think it's permissions issue.

How can I read the list of files?

1

There are 1 best solutions below

0
On

Cannot bulk load. The file "*.json" does not exist, or you don't have file access rights.

Since the code works when you give the file name directly, the issue does not seem to be of access rights. Wildcard characters are not supported for pattern matching when used in the file path option. It will consider *.json as a file name itself, and it gives you the above error. To solve this, you can try using an external table instead of OPENROWSET in SQL Server. This external table approach is applicable only to SQL Server 2016 or above.

Steps to create an external table:

  1. Create Database Scoped Credential
CREATE DATABASE SCOPED CREDENTIAL <Credential Name>
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'xxxxx';
  1. Create external Data Source
CREATE EXTERNAL DATA SOURCE <DataSourceName>
WITH (LOCATION = 'https://<container name>.blob.core.windows.net/<blob name>',
   CREDENTIAL= <credential name>);
  1. Create external file format
CREATE EXTERNAL FILE FORMAT <file_format_name>  
WITH 
(  
FORMAT_TYPE = JSON
);
  1. Create external table
CREATE EXTERNAL TABLE <table_name>  
( <column name> <data_type> [,...n] )  
WITH (  
LOCATION = 'folder_or_filepath',  
DATA_SOURCE = <DataSourceName>,  
FILE_FORMAT = <file_format_name>)

You will get the data stored under the folder that you have given in the location option of the external table.