I'm trying to read an Excel file with over 500,000 rows in my Spring Boot application. The problem is that I get a java.net.SocketTimeoutException: Read timed out after all rows have already been read. The problem occurs in the following method (no error or exception is shown in the console):
private List<XYZEntry> importFile(XYZ file) throws ServiceException {
    var xyzToImport = new HashSet<XYZ>();
    List<XYZEntry> invalidXYZ = new ArrayList<>();
    try(var wb = WorkbookFactory.create(new ByteArrayInputStream(file.getContent()))) {
        for(var row: wb.getSheetAt(0)) {
            if(row.getRowNum() == 0) {
                // Skip header
                continue;
            }
            DataFormatter formatter = new DataFormatter();
            var potentialXYZ = formatter.formatCellValue( row.getCell( 0 ) ).trim();
            if(isXYZValidToImport(potentialXYZ)) {
                xyzToImport.add(new XYZ(potentialXYZ));
            } else {
                invalidXYZ.add(new XYZEntry(potentialXYZ, row.getRowNum() + 1));
            }
        }
        if(CollectionUtils.isNotEmpty(invalidXYZ)) {
            file.setInvalidEntries(invalidXYZ.size());
            log.info(
                "While importing the file {} | {} invalid XYZs were skipped: {}",
                file.getFilename(),
                invalidXYZ.size(),
                String.join(",", invalidXYZ.toString())
            );
        }
        if(CollectionUtils.isEmpty(xyzToImport)) {
            log.warn(
                "No valid XYZ was found in File {}, nothing imported.",
                file.getFilename()
            );
            file.setStatus( XYZStateEnum.NO_IMPORTS );
        } else {
            transactionHelper.executeInNewTransaction(() -> {
                xyzsRepository.deleteAll();
                xyzsRepository.saveAll(xyzToImport);
                // this return is not reached anymore
                return null;
            });
            file.setValidEntries(xyzToImport.size());
            file.setStatus(XYZStateEnum.IMPORTED);
        }
    } catch(IOException e) {
        file.setStatus(XYZStateEnum.ERROR);
        filesRepository.save(file);
        throw new ServiceException("XYZ Import failed.", e);
    }
    filesRepository.save(file);
    return invalidXYZ;
}
I found out that the code runs until it reaches the saveAll() call and stops there.
The java.net.SocketTimeoutException: Read timed out error usually occurs when a network connection is established, but the remote endpoint (in this case, the database) takes too long to respond, causing the timeout.
Option 1: Increase timeout
It's possible that the saveAll() call takes longer than the configured read timeout, which causes the exception to be thrown. To address this, you can try increasing the timeout for database operations. In a Spring application this is configured in the database connection settings, and the exact property depends on the database and driver you are using.
Example for MySQL (increase timeout up to 5 seconds):
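A minimal sketch of what such a setting could look like, assuming the MySQL Connector/J driver, whose connectTimeout and socketTimeout URL properties are given in milliseconds. The helper class MysqlTimeoutUrl and the host, port and database name used with it are placeholders, not something from the question.

```java
// Hypothetical helper for illustration only.
class MysqlTimeoutUrl {

    // Appends the Connector/J timeout properties (both in milliseconds)
    // to an existing JDBC URL, using '?' or '&' as appropriate.
    static String withTimeouts(String jdbcUrl, int connectTimeoutMs, int socketTimeoutMs) {
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator
                + "connectTimeout=" + connectTimeoutMs
                + "&socketTimeout=" + socketTimeoutMs;
    }
}
```

For the placeholder URL jdbc:mysql://localhost:3306/mydb and 5000 ms for both values, this yields jdbc:mysql://localhost:3306/mydb?connectTimeout=5000&socketTimeout=5000. In a Spring Boot application that string would typically be set as spring.datasource.url.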
However, keep in mind that this might not be the optimal solution, especially if the data set continues to grow larger in the future.
Option 2: Save in batches.
Another approach is to batch the saveAll() operation by splitting the xyzToImport collection into smaller chunks and saving them incrementally. This way, you reduce the load on the database and minimize the chances of a timeout occurring.
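The chunking could be sketched roughly as follows. BatchSaver, partition and saveInBatches are hypothetical names introduced for illustration, and the Consumer stands in for the repository call from the question; a suitable batch size is an assumption that would need to be tuned.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only.
class BatchSaver {

    // Splits the items into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(Collection<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>(batchSize);
        for (T item : items) {
            current.add(item);
            if (current.size() == batchSize) {
                batches.add(current);
                current = new ArrayList<>(batchSize);
            }
        }
        // Keep the last, possibly smaller, chunk.
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    // Passes each chunk to saveBatch and returns the number of items handed over.
    static <T> int saveInBatches(Collection<T> items, int batchSize, Consumer<List<T>> saveBatch) {
        int saved = 0;
        for (List<T> batch : partition(items, batchSize)) {
            saveBatch.accept(batch);
            saved += batch.size();
        }
        return saved;
    }
}
```

In the method from the question, the call might then look like BatchSaver.saveInBatches(xyzToImport, 1000, batch -> xyzsRepository.saveAll(batch)). Whether each chunk should run in its own transaction depends on whether a partially completed import is acceptable. If the repository is backed by Hibernate, setting spring.jpa.properties.hibernate.jdbc.batch_size may additionally help group the inserts into JDBC batches.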