Transfer large datasets to Azure Blob Storage from Amazon S3


I want to transfer large datasets from Amazon S3 to Azure Blob Storage. Can anyone help me modify my code to handle large datasets? Below is my code in Java:

        try {
            storageAccount = new CloudStorageAccount(new StorageCredentialsAccountAndKey(azureCredentialsDto.getStorageAccountName(), azureCredentialsDto.getStorageAccountKey()), true);
            blobClient = storageAccount.createCloudBlobClient();
            container = blobClient.getContainerReference(azureCredentialsDto.getBlobContainerName());

            log.info("Creating Container: " + container.getName());
            container.createIfNotExists(BlobContainerPublicAccessType.CONTAINER, new BlobRequestOptions(), new OperationContext());
            CloudBlockBlob blob = container.getBlockBlobReference(destinationFileName);
            URI blockBlobUrl = blob.getUri();
            log.info("Blob URI: " + blockBlobUrl);
            // sourceFileUrl is the URL of the Amazon S3 file I want to copy
            blob.startCopy(new URI(sourceFileUrl));
            log.info("Copy Started...");
        } catch (Exception e) {
            log.error("Copy failed", e);
        }
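Note that startCopy only schedules an asynchronous, server-side copy, so it returns before any data has actually moved. For a large object the copy state has to be polled until the service reports completion. Below is a minimal sketch using the same legacy com.microsoft.azure.storage SDK; it assumes it runs inside the same try/catch as the code above and reuses the blob and log variables:

            // startCopy is asynchronous: poll the blob's copy state until the service finishes
            blob.downloadAttributes();   // refreshes the blob's properties, including its copy state
            CopyState copyState = blob.getCopyState();
            while (copyState.getStatus() == CopyStatus.PENDING) {
                Thread.sleep(5000);      // wait between status checks
                blob.downloadAttributes();
                copyState = blob.getCopyState();
            }
            if (copyState.getStatus() == CopyStatus.SUCCESS) {
                log.info("Copy completed. Bytes copied: " + copyState.getBytesCopied());
            } else {
                log.error("Copy ended with status " + copyState.getStatus() + ": " + copyState.getStatusDescription());
            }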

1 Answer

If you just need to transfer large files, the best option is to use the Copy activity in Azure Data Factory (ADF).

AzCopy is a command-line utility that you can use to copy blobs or files to or from a storage account. AzCopy v10 (Preview) supports Amazon Web Services (AWS) S3 as a data source, so you can copy an entire AWS S3 bucket, or even multiple buckets, to Azure Blob Storage.
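A rough sketch of that AzCopy invocation is below; the bucket name, storage account, container, and SAS token are placeholders, and AzCopy reads the AWS credentials from environment variables:

    # AzCopy picks up the AWS credentials from environment variables
    export AWS_ACCESS_KEY_ID=<aws-access-key>
    export AWS_SECRET_ACCESS_KEY=<aws-secret-key>

    # Copy an entire S3 bucket into an Azure Blob container (the SAS token authorizes the destination)
    azcopy copy 'https://s3.amazonaws.com/<bucket-name>' 'https://<storage-account>.blob.core.windows.net/<container>?<sas-token>' --recursive=true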

AzCopy and ADF are the two best approaches when you need to move large files.

To use AzCopy, refer to Move your data from AWS S3 to Azure Storage using AzCopy.

To accomplish this using ADF, refer to the links below:

https://www.youtube.com/watch?v=9uXDt0DP9cs&ab_channel=TechBrothersIT

Azure Data Factory V2 Pipelines for Copying Large AWS S3 Buckets to Azure Storage