Upload a sample PySpark DataFrame to Azure Blob after converting it to Excel format


I'm trying to upload a sample PySpark DataFrame to Azure Blob after converting it to Excel format, but I'm getting the error below. A snippet of my sample code is also below.

If there is another way to do this, please let me know.

from pyspark.sql.types import StructType,StructField, StringType, IntegerType

import pandas as pd
#%pip install xlwt
#%pip install openpyxl
#%pip install fsspec

my_data = [
            ("A","1","M",3000),
            ("B","2","F",4000),
            ("C","3","M",4000)
          ]

schema = StructType([
    StructField("firstname", StringType(), True),
    StructField("id", StringType(), True),
    StructField("gender", StringType(), True),
    StructField("salary", IntegerType(), True)
])

df = spark.createDataFrame(data=my_data, schema=schema)

pandasDF = df.toPandas()

pandasDF.to_excel("wasbs://[email protected]/output_file.xlsx")

ValueError: Protocol not known: wasbs

1 Answer

You are using the Python library pandas directly to write the data, but pandas does not recognize the wasbs:// protocol, so it cannot work this way. You need to first mount the Azure Blob Storage container and then write the data.

To mount it, use the following command:

dbutils.fs.mount(
  source = "wasbs://<container-name>@<storage-account-name>.blob.core.windows.net",
  mount_point = "/mnt/<mount-name>",
  extra_configs = {"<conf-key>":dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>")})
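
Here <conf-key> is typically fs.azure.account.key.<storage-account-name>.blob.core.windows.net when authenticating with a storage account key, or fs.azure.sas.<container-name>.<storage-account-name>.blob.core.windows.net when using a SAS token; the secret itself comes from a Databricks secret scope.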

To write the data, use the following commands:

(df.write
    .mode("overwrite")
    .option("header", "true")
    .csv("dbfs:/mnt/azurestorage/filename.csv"))
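
Note that the snippet above writes CSV rather than Excel. If you specifically need an .xlsx file, here is a minimal sketch of one way to do it, assuming the mount above succeeded (the mount name azurestorage and the file name are placeholders) and that openpyxl is installed: a mount is also exposed through the local /dbfs path, so pandas can write to it like a regular file.

# A sketch, assuming the container is mounted at /mnt/azurestorage
# and openpyxl is installed (%pip install openpyxl).
pandasDF = df.toPandas()

# Mounted storage is also visible under the local /dbfs FUSE path,
# so pandas can write the Excel file there like a normal local file.
pandasDF.to_excel("/dbfs/mnt/azurestorage/output_file.xlsx", index=False)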