Understanding result set memory usage with the Impala JDBC driver


I have written the code below, which fetches a result set using a prepared statement. We are using an Impala database with its JDBC driver.

  private void writeToCSV(File resultFile, ResultSet resultSet) throws IOException, SQLException {

    LOG.info("Result set fetch size : {}", resultSet.getFetchSize());

    try (BufferedWriter csvWriter = new BufferedWriter(new FileWriter(resultFile))) {

      // Cache the metadata instead of calling getMetaData() for every column
      ResultSetMetaData metaData = resultSet.getMetaData();
      int columnCount = metaData.getColumnCount();

      // Write headers
      for (int i = 1; i <= columnCount; i++) {
        csvWriter.write(metaData.getColumnName(i));
        if (i < columnCount) {
          csvWriter.write(",");
        }
      }
      csvWriter.write("\n");

      // Write data
      int batchSize = 500; // rows between writer flushes (not the JDBC fetch size)
      int currentCount = 0;

      while (resultSet.next()) {
        for (int i = 1; i <= columnCount; i++) {
          String value = resultSet.getString(i);
          csvWriter.write(value != null ? value : ""); // guard against SQL NULL
          if (i < columnCount) {
            csvWriter.write(",");
          }
        }
        csvWriter.write("\n");

        currentCount++;

        // Flush the writer periodically
        if (currentCount % batchSize == 0) {
          csvWriter.flush();
        }
      }

      // Flush any remaining data
      csvWriter.flush();
    }
  }
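As an aside, writing raw values with bare commas will corrupt the CSV whenever a field itself contains a comma, quote, or newline. A small helper (hypothetical name `escapeCsv`, quoting per RFC 4180) could wrap each value in the loop above:

```java
public class CsvEscape {

    // Quote a field when it contains a comma, double quote, CR, or LF;
    // embedded double quotes are doubled, per RFC 4180.
    static String escapeCsv(String value) {
        if (value == null) {
            return ""; // represent SQL NULL as an empty field
        }
        if (value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(escapeCsv("plain"));      // plain
        System.out.println(escapeCsv("a,b"));        // "a,b"
        System.out.println(escapeCsv("say \"hi\"")); // "say ""hi"""
    }
}
```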
1. I want to understand: with a million records, will the result set load all of them from the Impala DB at once, or in batches? And if it loads the next batch, do the previous records remain in memory or are they released?

2. Do I need to set the fetch size myself? `resultSet.getFetchSize()` always returns zero.

I tried executing this with 1 million records and saw a huge spike in memory. If the result set fetched rows in batches, we should not have seen such a huge spike.
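For reference, a minimal sketch of requesting a fetch size explicitly before executing the query. The connection URL, table name, and fetch size here are placeholders, and whether the hint is honored (and rows are streamed rather than fully buffered) depends on the Impala JDBC driver and its configuration:

```java
// Sketch: set a fetch-size hint on the statement before executing.
try (Connection conn = DriverManager.getConnection("jdbc:impala://host:21050/db");
     PreparedStatement stmt = conn.prepareStatement("SELECT * FROM big_table")) {

    // Hint to the driver: fetch roughly this many rows per round trip.
    // getFetchSize() returning 0 just means "driver default".
    stmt.setFetchSize(1000);

    try (ResultSet rs = stmt.executeQuery()) {
        while (rs.next()) {
            // process each row; only the current batch should be held in memory
        }
    }
}
```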
