scndry

Posted on Apr 16 • Edited on May 18

Reading Excel to Java POJOs — A Modern Alternative to Apache POI (2026)

#java #excel #jackson #apachepoi

Reading Excel into Java POJOs with Apache POI is normally a multi-step exercise: open the workbook, iterate rows, fetch cells by column index, cast each to its type, and build the object yourself. jackson-dataformat-spreadsheet is a modern alternative: Jackson already solves this problem for JSON, and the library applies the same data-binding pattern to XLSX/XLS. A typical POI read loop looks like this:

// Classic POI: every cell is fetched by index and converted by hand
try (Workbook wb = new XSSFWorkbook(file)) {   // close the workbook when done
    Sheet sheet = wb.getSheetAt(0);
    for (Row row : sheet) {
        String name = row.getCell(0).getStringCellValue();
        int qty = (int) row.getCell(1).getNumericCellValue();
        // ... 20 more fields
    }
}

Cell by cell. Column index by column index. Cast by cast. For every single spreadsheet your application touches.

What if you could do this instead?

SpreadsheetMapper mapper = new SpreadsheetMapper();
List<Employee> employees = mapper.readValues(file, Employee.class);

That's it. The API mirrors Jackson's ObjectMapper, because a spreadsheet row is, in effect, a flat JSON object: the column headers are the keys and the cells are the values.

Introducing jackson-dataformat-spreadsheet

A Jackson extension module that treats XLSX/XLS as just another data format — like JSON, CSV, or XML.

GitHub: jackson-dataformat-spreadsheet
Listed as a community data format module in the official FasterXML Jackson repository.

Reading

@DataGrid
public class Employee {
    private String name;
    private String department;
    private int salary;
    // getters, setters
}

SpreadsheetMapper mapper = new SpreadsheetMapper();

// Single row
Employee first = mapper.readValue(file, Employee.class);

// All rows
List<Employee> all = mapper.readValues(file, Employee.class);

// Specific sheet
SheetInput<File> input = SheetInput.source(file, "Payroll");
List<Employee> payroll = mapper.readValues(input, Employee.class);

Writing

List<Employee> employees = ...;
mapper.writeValue(file, employees, Employee.class);

That produces a proper XLSX file with headers and typed cells.

Nested Objects — The Killer Feature

Spreadsheets are flat. POJOs are not. Most Excel libraries force you to flatten everything manually. This library does it automatically:

┌─────┬──────┬─────────┬────────────────┬─────────────┬────────┐
│ ID  │ NAME │ ZIPCODE │ ADDRESS LINE 1 │ DESIGNATION │ SALARY │
├─────┼──────┼─────────┼────────────────┼─────────────┼────────┤
│ 1   │ John │ 12345   │ 123 Main St.   │ CEO         │ 300000 │
└─────┴──────┴─────────┴────────────────┴─────────────┴────────┘

@DataGrid
class Employee {
    int id;
    String name;
    Address address;
    Employment employment;
}

class Address {
    String zipcode;
    String addressLine1;
}

class Employment {
    String designation;
    long salary;
}

No configuration needed. The nested POJO structure defines the column layout. Read and write — both directions work.
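To make the idea concrete, here is a minimal, JDK-only sketch of how a nested class structure can be walked to derive a flat column order. It is an illustration under my own assumptions, not the library's implementation: the FlattenSketch class and its helper methods are hypothetical, it uses raw field names rather than the header names shown in the table above, and it relies on getDeclaredFields() returning fields in declaration order, which HotSpot does in practice but the JDK documentation does not guarantee.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical helper, not part of jackson-dataformat-spreadsheet.
class FlattenSketch {

    static class Address {
        String zipcode;
        String addressLine1;
    }

    static class Employment {
        String designation;
        long salary;
    }

    static class Employee {
        int id;
        String name;
        Address address;
        Employment employment;
    }

    /** Walks the class depth-first and collects one column name per leaf field. */
    static List<String> columns(Class<?> type) {
        List<String> out = new ArrayList<>();
        collect(type, out);
        return out;
    }

    private static void collect(Class<?> type, List<String> out) {
        // Field order is unspecified by the JDK docs; HotSpot returns declaration order.
        for (Field f : type.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            if (isLeaf(f.getType())) {
                out.add(f.getName());
            } else {
                // Nested POJO: its fields become adjacent columns.
                // (No cycle detection; self-referencing types would recurse forever.)
                collect(f.getType(), out);
            }
        }
    }

    private static boolean isLeaf(Class<?> t) {
        return t.isPrimitive() || t == String.class
                || Number.class.isAssignableFrom(t)
                || t == Boolean.class || t == Character.class;
    }

    public static void main(String[] args) {
        System.out.println(columns(Employee.class));
    }
}
```

The depth-first walk is what keeps the columns of one nested object next to each other, matching the ZIPCODE / ADDRESS LINE 1 and DESIGNATION / SALARY grouping in the table.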

How It's Built

This isn't a POI wrapper. It extends Jackson's streaming layer directly:

SheetParser extends ParserMinimalBase — pulls tokens from StAX
SheetGenerator extends GeneratorBase — streaming cell writer
SpreadsheetFactory extends JsonFactory — creates parsers/generators

The default XLSX path bypasses POI's cell model entirely. The read path parses OOXML XML directly via StAX — no XMLBeans, no SAX callbacks, no intermediate DOM. The write path builds a POI skeleton for package metadata, then streams worksheet and shared strings via StringBuilder directly to ZipOutputStream.

Jackson (pull)       SheetParser (pull)      StAX (pull)
    │                      │                      │
    ├─ nextToken() ───────►├─ next() ────────────►├─ next()
    │◄─ VALUE_STRING ──────┤◄─ CELL_VALUE ────────┤◄─ START_ELEMENT
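
For readers who have not used StAX, the pull style in the diagram can be sketched with the JDK's javax.xml.stream API alone. This is a simplified illustration, not the library's SheetParser: the StaxPullSketch class is hypothetical, and the XML is a stripped-down worksheet fragment (real sheets carry namespaces, cell references, styles, and shared-string indexes).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustrative only: a hypothetical reader, not the library's SheetParser.
class StaxPullSketch {

    // Simplified worksheet fragment with inline strings and numeric values.
    static final String SHEET =
          "<worksheet><sheetData>"
        + "<row><c t=\"inlineStr\"><is><t>John</t></is></c><c><v>300000</v></c></row>"
        + "<row><c t=\"inlineStr\"><is><t>Jane</t></is></c><c><v>250000</v></c></row>"
        + "</sheetData></worksheet>";

    /** Pulls events one at a time and collects the text of each cell, row by row. */
    static List<List<String>> readRows(String xml) {
        List<List<String>> rows = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            List<String> current = null;
            while (r.hasNext()) {
                int event = r.next(); // the caller asks for the next event: pull, not callback
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = r.getLocalName();
                    if (name.equals("row")) {
                        current = new ArrayList<>();
                    } else if (current != null && (name.equals("v") || name.equals("t"))) {
                        // getElementText() reads the text and leaves the cursor on the end tag
                        current.add(r.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && r.getLocalName().equals("row")) {
                    rows.add(current);
                    current = null;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed sheet XML", e);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(readRows(SHEET));
    }
}
```

Because the loop only ever holds the current row, memory stays flat regardless of sheet size, which is the property a streaming parser built on this style inherits.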

Performance

Benchmarked against popular alternatives on realistic data (100K rows, mixed types, JMH):

Read:

┌────────────────────────────┬───────────┬─────────┐
│          Library           │ Read Time │ Memory  │
├────────────────────────────┼───────────┼─────────┤
│ jackson-spreadsheet        │ 190 ms    │ 360 MB  │
├────────────────────────────┼───────────┼─────────┤
│ FastExcel                  │ 208 ms    │ 407 MB  │
├────────────────────────────┼───────────┼─────────┤
│ Fesod (formerly EasyExcel) │ 266 ms    │ 381 MB  │
├────────────────────────────┼───────────┼─────────┤
│ Poiji                      │ 809 ms    │ 2739 MB │
├────────────────────────────┼───────────┼─────────┤
│ Apache POI                 │ 1173 ms   │ 2227 MB │
└────────────────────────────┴───────────┴─────────┘

Write:

┌───────────────────────────┬────────────┬─────────┐
│          Library          │ Write Time │ Memory  │
├───────────────────────────┼────────────┼─────────┤
│ jackson-spreadsheet       │ 138 ms     │ 125 MB  │
├───────────────────────────┼────────────┼─────────┤
│ FastExcel                 │ 152 ms     │ 149 MB  │
├───────────────────────────┼────────────┼─────────┤
│ Apache POI                │ 269 ms     │ 204 MB  │
├───────────────────────────┼────────────┼─────────┤
│ Fesod                     │ 323 ms     │ 458 MB  │
└───────────────────────────┴────────────┴─────────┘

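Fastest read and write times in this benchmark: about 6.2x faster than Apache POI on read (190 ms vs 1173 ms) and about 9% less write time than FastExcel (138 ms vs 152 ms). Memory use was also the lowest in both tables. You don't trade performance for convenience; you get both.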
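
As a quick check on how the headline ratios follow from the tables, here is a small sketch of the arithmetic. The SpeedupSketch class and its method names are my own; only the millisecond figures come from the benchmark tables above.

```java
import java.util.Locale;

// Illustrative arithmetic only; the timings are copied from the tables above.
class SpeedupSketch {

    /** How many times faster the candidate is than the baseline. */
    static double speedup(double baselineMs, double candidateMs) {
        return baselineMs / candidateMs;
    }

    /** Percentage of the baseline's time that the candidate saves. */
    static double timeSavedPercent(double baselineMs, double candidateMs) {
        return 100.0 * (baselineMs - candidateMs) / baselineMs;
    }

    public static void main(String[] args) {
        // Read: Apache POI 1173 ms vs jackson-spreadsheet 190 ms
        System.out.println(String.format(Locale.ROOT,
                "read speedup vs POI: %.2fx", speedup(1173, 190)));
        // Write: FastExcel 152 ms vs jackson-spreadsheet 138 ms
        System.out.println(String.format(Locale.ROOT,
                "write time saved vs FastExcel: %.1f%%", timeSavedPercent(152, 138)));
    }
}
```

Note that the roughly 9% figure is a reduction in elapsed time; expressed as a ratio, 152 / 138 is about 1.10.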

Annotations

Control the schema with @DataGrid and @DataColumn:

@DataGrid
class Product {
    @DataColumn("Product Name")
    String name;

    @DataColumn(value = "Price", style = "currency")
    double price;

    @DataColumn(merge = OptBoolean.TRUE)
    String category;
}
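
If you are curious how annotation-driven column naming works in general, the mechanism can be sketched with plain reflection. Everything here is hypothetical and kept JDK-only so it compiles on its own: @Header is a local stand-in, not the library's @DataColumn, and falling back to the field name when no annotation is present is my assumption about a sensible default, not documented library behavior.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows the general mechanism, not the library's schema generator.
class HeaderSketch {

    // Hypothetical stand-in for @DataColumn, defined locally.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Header {
        String value() default "";
    }

    static class Product {
        @Header("Product Name")
        String name;

        @Header("Price")
        double price;

        String category; // no annotation: the sketch falls back to the field name
    }

    /** Builds the header row: annotation value if present, otherwise the field name. */
    static List<String> headers(Class<?> type) {
        List<String> out = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic()) {
                continue;
            }
            Header h = f.getAnnotation(Header.class);
            out.add(h != null && !h.value().isEmpty() ? h.value() : f.getName());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(headers(Product.class));
    }
}
```

The annotation must have RUNTIME retention, otherwise getAnnotation() returns null and every column silently falls back to its field name.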

Getting Started

Available on Maven Central:

<dependency>
    <groupId>io.github.scndry</groupId>
    <artifactId>jackson-dataformat-spreadsheet</artifactId>
    <version>1.6.4</version>
</dependency>

Runnable end-to-end examples — single-row read, streaming 100K rows, date handling, nested objects, custom serializers, multi-sheet — are in jackson-spreadsheet-examples. Each example is one JUnit test file under src/main/java.

Requirements

Java 8+
Jackson 2.14.0+
Apache POI 4.1.1+ (Strict OOXML requires 5.1.0+)

Links

Feedback, issues, and stars welcome. If you've ever cursed at row.getCell(17).getStringCellValue(), this is for you.
