How to read numeric values from an Excel file using Spring Batch Excel


I am reading values from an .xlsx file using Spring Batch Excel and Apache POI. The numeric values are printed in a different format from the original values in the .xlsx file.

How can I print the values exactly as they appear in the .xlsx file? The details are below.

In my Excel file the values are as follows:

[image not available]

The values are printed as below:

[image not available]

My code is as below

public ItemReader<DataObject> fileItemReader(InputStream inputStream) {
    PoiItemReader<DataObject> reader = new PoiItemReader<>();
    reader.setLinesToSkip(1); // skip the first row
    reader.setResource(new InputStreamResource(inputStream));
    reader.setRowMapper(excelRowMapper());
    reader.open(new ExecutionContext());
    return reader;
}

private RowMapper<DataObject> excelRowMapper() {
    return new MyRowMapper();
}


public class MyRowMapper implements RowMapper<DataObject> {

    @Override
    public DataObject mapRow(RowSet rowSet) throws Exception {
        DataObject dataObj = new DataObject();

        // Each column value arrives as the String produced by the reader
        dataObj.setFieldOne(rowSet.getColumnValue(0));
        dataObj.setFieldTwo(rowSet.getColumnValue(1));
        dataObj.setFieldThree(rowSet.getColumnValue(2));
        dataObj.setFieldFour(rowSet.getColumnValue(3));

        return dataObj;
    }
}

There are 2 best solutions below


I had the same problem. Its root is the class org.springframework.batch.item.excel.poi.PoiSheet, which PoiItemReader uses internally. The problem occurs in the method public String[] getRow(final int rowNumber), which gets an org.apache.poi.ss.usermodel.Row object and converts it to an array of Strings after detecting the type of each cell in the row. This method contains the following code:

switch (cellType) {
    case NUMERIC:
        if (DateUtil.isCellDateFormatted(cell)) {
            Date date = cell.getDateCellValue();
            cells.add(String.valueOf(date.getTime()));
        } else {
            cells.add(String.valueOf(cell.getNumericCellValue()));
        }
        break;
    case BOOLEAN:
        cells.add(String.valueOf(cell.getBooleanCellValue()));
        break;
    case STRING:
    case BLANK:
        cells.add(cell.getStringCellValue());
        break;
    case ERROR:
        cells.add(FormulaError.forInt(cell.getErrorCellValue()).getString());
        break;
    default:
        throw new IllegalArgumentException("Cannot handle cells of type '" + cell.getCellTypeEnum() + "'");
}

The NUMERIC case is handled by cells.add(String.valueOf(cell.getNumericCellValue())). This line reads the cell value as a double (cell.getNumericCellValue()) and then converts that double to a String (String.valueOf()). The problem lies in String.valueOf(), which switches to scientific notation when the number is too large (>= 10000000) or too small (< 0.001) and appends ".0" to integer values.
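The behaviour described above can be reproduced without any Excel file. The following is a minimal sketch using only the JDK; the class name DoubleToStringDemo and the helper viaValueOf are made up for illustration and are not part of Spring Batch Excel or POI.

```java
// Illustrative only: shows how String.valueOf(double) renders typical cell values.
final class DoubleToStringDemo {

    // Mirrors the conversion step used for NUMERIC cells in the snippet above.
    static String viaValueOf(double value) {
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(viaValueOf(5));        // 5.0   (".0" appended to an integer)
        System.out.println(viaValueOf(9999999));  // 9999999.0
        System.out.println(viaValueOf(10000000)); // 1.0E7 (scientific notation from 10^7 upwards)
        System.out.println(viaValueOf(0.001));    // 0.001
        System.out.println(viaValueOf(0.0001));   // 1.0E-4 (scientific notation below 10^-3)
    }
}
```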

As an alternative to the line cells.add(String.valueOf(cell.getNumericCellValue())), you could use

DataFormatter formatter = new DataFormatter();
cells.add(formatter.formatCellValue(cell));

which returns each cell value as a String formatted the way the cell's number format displays it. However, this also means that your decimal numbers become locale dependent: DataFormatter formats with a locale (the JVM default unless you pass one to its constructor), so the same cell can come back as "2.5" under a UK or India locale and as "2,5" under a France or Brazil locale.
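The locale effect itself can be seen without a workbook. The sketch below uses the JDK's java.text.NumberFormat as a stand-in, not POI's DataFormatter, so it only illustrates how the decimal separator changes with the locale; the class name LocaleDecimalDemo and the helper format are hypothetical.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Illustrative only: the same double rendered under two different locales.
final class LocaleDecimalDemo {

    static String format(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(2.5, Locale.UK));     // 2.5
        System.out.println(format(2.5, Locale.FRANCE)); // 2,5
    }
}
```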

To avoid this dependency, we can use the solution presented at https://stackoverflow.com/a/25307973/9184574:

DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
df.setMaximumFractionDigits(340);
cells.add(df.format(cell.getNumericCellValue()));

That will convert the cell to a double and then format it using the English pattern, without scientific notation and without adding ".0" to integers.
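To check what this formatter produces for typical values, here is a small standalone sketch. It wraps the three lines above in a hypothetical helper, toPlainString, inside a made-up class PlainNumberFormatDemo, and feeds it plain doubles instead of POI cells.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Illustrative only: same DecimalFormat setup as above, applied to plain doubles.
final class PlainNumberFormatDemo {

    static String toPlainString(double value) {
        // English symbols give "." as the decimal separator regardless of the JVM locale.
        DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
        // 340 is the largest fraction-digit count DecimalFormat honours for doubles.
        df.setMaximumFractionDigits(340);
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(toPlainString(5.0));    // 5
        System.out.println(toPlainString(1.0E7));  // 10000000
        System.out.println(toPlainString(2.5));    // 2.5
        System.out.println(toPlainString(1.0E-4)); // 0.0001
    }
}
```

A DecimalFormat instance is not thread-safe, so creating it per call (as here and in the answer's code) is the simple choice; caching one per thread is a possible optimisation if the cost matters.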

My implementation of CustomPoiSheet (a small adaptation of the original PoiSheet) was:

class CustomPoiSheet implements Sheet {

    protected final org.apache.poi.ss.usermodel.Sheet delegate;
    private final int numberOfRows;
    private final String name;

    private FormulaEvaluator evaluator;

    /**
     * Constructor which takes the delegate sheet.
     *
     * @param delegate the apache POI sheet
     */
    CustomPoiSheet(final org.apache.poi.ss.usermodel.Sheet delegate) {
        super();
        this.delegate = delegate;
        this.numberOfRows = this.delegate.getLastRowNum() + 1;
        this.name=this.delegate.getSheetName();
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public int getNumberOfRows() {
        return this.numberOfRows;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public String getName() {
        return this.name;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public String[] getRow(final int rowNumber) {
        final Row row = this.delegate.getRow(rowNumber);
        if (row == null) {
            return null;
        }
        final List<String> cells = new LinkedList<>();
        final int numberOfColumns = row.getLastCellNum();

        for (int i = 0; i < numberOfColumns; i++) {
            Cell cell = row.getCell(i);
            CellType cellType = cell.getCellType();
            if (cellType == CellType.FORMULA) {
                FormulaEvaluator evaluator = getFormulaEvaluator();
                if (evaluator == null) {
                    cells.add(cell.getCellFormula());
                } else {
                    cellType = evaluator.evaluateFormulaCell(cell);
                }
            }

            switch (cellType) {
                case NUMERIC:
                    if (DateUtil.isCellDateFormatted(cell)) {
                        Date date = cell.getDateCellValue();
                        cells.add(String.valueOf(date.getTime()));
                    } else {
                        // Format the numeric value in English notation, as close as possible to the stored value:
                        // integers come out without decimal places, other values without trailing zeros,
                        // and scientific notation is suppressed.
                        // Credit: https://stackoverflow.com/a/25307973/9184574
                        DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
                        df.setMaximumFractionDigits(340);
                        cells.add(df.format(cell.getNumericCellValue()));
                        // Alternatives discussed above:
                        // DataFormatter formatter = new DataFormatter();
                        // cells.add(formatter.formatCellValue(cell));
                        // cells.add(String.valueOf(cell.getNumericCellValue()));
                    }
                    break;
                case BOOLEAN:
                    cells.add(String.valueOf(cell.getBooleanCellValue()));
                    break;
                case STRING:
                case BLANK:
                    cells.add(cell.getStringCellValue());
                    break;
                case ERROR:
                    cells.add(FormulaError.forInt(cell.getErrorCellValue()).getString());
                    break;
                default:
                    throw new IllegalArgumentException("Cannot handle cells of type '" + cellType + "'");
            }
        }
        return cells.toArray(new String[0]);
    }

    private FormulaEvaluator getFormulaEvaluator() {
        if (this.evaluator == null) {
            this.evaluator = delegate.getWorkbook().getCreationHelper().createFormulaEvaluator();
        }
        return this.evaluator;
    }
}

And my implementation of CustomPoiItemReader (a small adaptation of the original PoiItemReader), which calls CustomPoiSheet:

public class CustomPoiItemReader<T> extends AbstractExcelItemReader<T> {

    private Workbook workbook;

    @Override
    protected Sheet getSheet(final int sheet) {
        return new CustomPoiSheet(this.workbook.getSheetAt(sheet));
    }
    
    public CustomPoiItemReader(){
        super();
    }
    
    @Override
    protected int getNumberOfSheets() {
        return this.workbook.getNumberOfSheets();
    }

    @Override
    protected void doClose() throws Exception {
        super.doClose();
        if (this.workbook != null) {
            this.workbook.close();
        }

        this.workbook=null;
    }

    /**
     * Open the underlying file using the {@code WorkbookFactory}. We keep track of the used {@code InputStream} so that
     * it can be closed cleanly on the end of reading the file. This to be able to release the resources used by
     * Apache POI.
     *
     * @param inputStream the {@code InputStream} pointing to the Excel file.
     * @throws Exception is thrown for any errors.
     */
    @Override
    protected void openExcelFile(final InputStream inputStream) throws Exception {

        this.workbook = WorkbookFactory.create(inputStream);
        this.workbook.setMissingCellPolicy(Row.MissingCellPolicy.CREATE_NULL_AS_BLANK);
    }

}

Alternatively, just change your code like this when reading the data from Excel:

dataObj.setField(Float.valueOf(rowSet.getColumnValue(idx)).intValue());

This only works for columns A, B and C.
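A variation on the same idea, offered as a sketch rather than as part of the answer: float carries only about 7 significant decimal digits, so larger values can lose precision when parsed with Float.valueOf. Assuming the reader hands the row mapper strings such as "5.0" or "1.0E7", java.math.BigDecimal can turn them back into plain notation. The class ColumnValueNormalizer and the method normalize are hypothetical names.

```java
import java.math.BigDecimal;

// Illustrative only: converts strings like "1.0E7" or "5.0" into plain decimal notation.
final class ColumnValueNormalizer {

    static String normalize(String raw) {
        BigDecimal value = new BigDecimal(raw.trim());
        if (value.signum() == 0) {
            return "0"; // avoid any scale oddities for zero
        }
        // Drop trailing zeros ("5.0" -> "5") and expand the exponent ("1.0E7" -> "10000000").
        return value.stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("5.0"));          // 5
        System.out.println(normalize("1.0E7"));        // 10000000
        System.out.println(normalize("1.23456789E8")); // 123456789
        System.out.println(normalize("2.5"));          // 2.5
    }
}
```

In a row mapper this would be used as, for example, dataObj.setFieldOne(ColumnValueNormalizer.normalize(rowSet.getColumnValue(0))), and only for columns known to hold numbers, since non-numeric text makes the BigDecimal constructor throw NumberFormatException.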