What's the difference between Sources “S3 (through Hadoop)” and “S3 (Direct)” in Foundry Data Connection?

124 Views Asked by Adil B At 31 July 2025 at 21:48

What's the difference between the two S3 source options that are available in Foundry Data Connection?

S3 (through Hadoop)
S3 (Direct)

Is one preferred for ingesting Parquet files?

There is 1 best solution below

Andrew St P On 22 September 2020 at 19:20

S3 (through Hadoop) is currently the best-tested and most flexible S3 option, but its performance on large numbers of files is very poor.

S3 (Direct) reads from S3 using the Amazon S3 SDK directly and performs significantly better than the Hadoop option, as it requires O(1) rather than O(number of files) network calls.

We recommend using the S3 (Direct) source instead where possible.
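
To make the call-count difference concrete, here is a toy model in Python. It is only a sketch: `FakeBucket`, `scan_per_file`, and `scan_bulk` are hypothetical names invented for illustration, the "bucket" is an in-memory dictionary, and nothing here reflects the actual internals of either Foundry source. It simply counts simulated network calls when file metadata is fetched one file at a time versus in a single bulk request.

```python
class FakeBucket:
    """In-memory stand-in for a bucket; counts simulated network calls."""

    def __init__(self, sizes):
        self._sizes = dict(sizes)  # key -> size in bytes
        self.calls = 0

    def head(self, key):
        """Simulate one metadata request for a single file."""
        self.calls += 1
        return self._sizes[key]

    def list_all(self):
        """Simulate one bulk request returning every key and its size."""
        self.calls += 1
        return dict(self._sizes)


def scan_per_file(bucket, keys):
    # One simulated call per file: O(number of files).
    return {key: bucket.head(key) for key in keys}


def scan_bulk(bucket):
    # One simulated call in total: O(1) in this toy model.
    return bucket.list_all()


# Hypothetical file layout, used only for the demonstration.
keys = [f"data/part-{i:05d}.parquet" for i in range(1000)]
sizes = {key: 1024 for key in keys}

per_file_bucket = FakeBucket(sizes)
per_file_result = scan_per_file(per_file_bucket, keys)

bulk_bucket = FakeBucket(sizes)
bulk_result = scan_bulk(bulk_bucket)

print(per_file_bucket.calls, bulk_bucket.calls)  # prints: 1000 1
```

Both scans return the same metadata, but the per-file scan pays one round trip per file, which is why latency dominates once a source contains many small files.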
