<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: scndry</title>
    <description>The latest articles on DEV Community by scndry (@scndry).</description>
    <link>https://dev.to/scndry</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F991167%2F70ad5086-50d4-420c-a1f7-07ada3343d4b.jpeg</url>
      <title>DEV Community: scndry</title>
      <link>https://dev.to/scndry</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/scndry"/>
    <language>en</language>
    <item>
      <title>Writing 100K XLSX Rows in 150ms — Streaming Beyond SXSSF (2026)</title>
      <dc:creator>scndry</dc:creator>
      <pubDate>Thu, 30 Apr 2026 22:46:05 +0000</pubDate>
      <link>https://dev.to/scndry/splitting-correctness-from-throughput-a-hybrid-approach-to-xlsx-streaming-writes-21n0</link>
      <guid>https://dev.to/scndry/splitting-correctness-from-throughput-a-hybrid-approach-to-xlsx-streaming-writes-21n0</guid>
      <description>&lt;p&gt;Java's standard library for writing XLSX (Apache POI) hits a tradeoff at scale: SXSSF streams cells but still loads the OOXML scaffolding in memory; raw XSSF buffers everything.&lt;br&gt;
jackson-dataformat-spreadsheet separates these two concerns — POI generates the scaffold once, then a StringBuilder streams cells directly into the worksheet zip entry. Result on JMH: 100K rows in ~144ms with ~180MB peak, ~7% faster than the closest alternative (FastExcel).&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;OOXML correctness&lt;/strong&gt; — relationships, content types, namespace declarations, theme references, drawing rels. Get these wrong and Excel may refuse to open the file or report a recovery dialog. The complexity is bounded but the surface is large.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-cell throughput&lt;/strong&gt; — &lt;code&gt;&amp;lt;c r="A1" t="s"&amp;gt;&amp;lt;v&amp;gt;0&amp;lt;/v&amp;gt;&amp;lt;/c&amp;gt;&lt;/code&gt; repeated millions of times. This dominates write time at scale.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Yet most libraries solve both via a heavy object model (Apache POI's User Model — correct but slow) or hand-roll everything (fast but you own every OOXML edge case).&lt;/p&gt;

&lt;p&gt;There's a middle path: let POI handle correctness, let &lt;code&gt;StringBuilder&lt;/code&gt; handle the hot path, split them at the format boundary.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Existing Spectrum
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;All POI (User Model — &lt;code&gt;XSSFWorkbook&lt;/code&gt;):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ OOXML correctness handled&lt;/li&gt;
&lt;li&gt;❌ Object allocation per cell (&lt;code&gt;Cell&lt;/code&gt;, &lt;code&gt;RichTextString&lt;/code&gt;, &lt;code&gt;CellStyle&lt;/code&gt; reference chains)&lt;/li&gt;
&lt;li&gt;❌ Whole workbook lives in memory until &lt;code&gt;write()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;POI's &lt;code&gt;SXSSFWorkbook&lt;/code&gt; improves memory by flushing rows to disk, but per-cell API overhead remains&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hand-rolled XML:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Maximum throughput&lt;/li&gt;
&lt;li&gt;❌ You own every OOXML detail listed above&lt;/li&gt;
&lt;li&gt;❌ Easy to ship a file that opens in some readers but breaks Excel&lt;/li&gt;
&lt;li&gt;❌ POI version updates can shift the OOXML output you've replicated, forcing re-validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FastExcel (dhatim) takes this hand-rolled path with discipline — maintaining its own OOXML correctness logic independent of POI. It's a different tradeoff: faster code path, but a separate compatibility surface to maintain.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Split
&lt;/h2&gt;

&lt;p&gt;The insight: &lt;strong&gt;OOXML correctness is a one-time cost per file.&lt;/strong&gt; Relationships, metadata, theme — all bounded in size, written once.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-cell throughput scales with row count.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The split:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Let POI generate a complete XLSX scaffold with styles and an empty sheet. POI owns correctness.&lt;/li&gt;
&lt;li&gt;Split &lt;code&gt;sheet1.xml&lt;/code&gt; at the &lt;code&gt;&amp;lt;sheetData&amp;gt;&lt;/code&gt; boundary into head, data, tail.&lt;/li&gt;
&lt;li&gt;Stream &lt;code&gt;&amp;lt;row&amp;gt;&lt;/code&gt; entries via &lt;code&gt;StringBuilder&lt;/code&gt; at write time — the hottest path.&lt;/li&gt;
&lt;li&gt;Stream &lt;code&gt;sharedStrings.xml&lt;/code&gt; independently (one entry per unique string).&lt;/li&gt;
&lt;li&gt;Copy every other zip entry (rels, theme, drawing, content types) verbatim from the scaffold.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;POI owns OOXML correctness. &lt;code&gt;StringBuilder&lt;/code&gt; owns per-cell throughput. The scaffold is the handoff between them.&lt;/p&gt;
&lt;h2&gt;
  
  
  Implementation Sketch
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;setSchema()
  ├─ XSSFWorkbook (styles + empty sheet) → temp file (the scaffold)
  └─ split sheet1.xml at &amp;lt;sheetData&amp;gt;:
        head, tail   (POI-generated; copied verbatim)
        data         (&amp;lt;row&amp;gt; entries — streamed at write time)

close()
  ├─ for each zip entry:
  │     sheet1.xml        → head + streamed rows + tail
  │     sharedStrings.xml → store-driven streaming
  │     others            → copied as-is
  └─ delete temp file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The cell write path itself, from &lt;code&gt;SSMLSheetWriter&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Override&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;writeString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_cacheString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;_appendCellStart&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"s"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;_appendCellEnd&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IllegalStateException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;_appendCellStart("s")&lt;/code&gt; writes &lt;code&gt;&amp;lt;c r="A1" t="s" s="0"&amp;gt;&amp;lt;v&amp;gt;&lt;/code&gt;, &lt;code&gt;_appendCellEnd()&lt;/code&gt; closes with &lt;code&gt;&amp;lt;/v&amp;gt;&amp;lt;/c&amp;gt;&lt;/code&gt;. No &lt;code&gt;Cell&lt;/code&gt; object. No &lt;code&gt;RichTextString&lt;/code&gt; allocation. Just text appended to a &lt;code&gt;StringBuilder&lt;/code&gt; that flushes into the zip stream periodically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmarks (100K rows, mixed types, shared string table, JMH)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;XSSFWorkbook&lt;/code&gt; via Jackson layer (POI User Model)&lt;/td&gt;
&lt;td&gt;318 ms&lt;/td&gt;
&lt;td&gt;246 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;SXSSFWorkbook&lt;/code&gt; direct (POI's streaming write)&lt;/td&gt;
&lt;td&gt;268 ms&lt;/td&gt;
&lt;td&gt;207 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scaffold-based hybrid&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;144 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;182 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;~46% reduction in write time vs &lt;code&gt;SXSSFWorkbook&lt;/code&gt;, with correctness still inherited from POI. Memory drops because POI's per-cell wrapper objects (&lt;code&gt;Cell&lt;/code&gt;, &lt;code&gt;RichTextString&lt;/code&gt;, &lt;code&gt;CellStyle&lt;/code&gt; references) are no longer allocated — only the underlying values remain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Stays Correct
&lt;/h2&gt;

&lt;p&gt;ECMA-376 fixes the structure of &lt;code&gt;&amp;lt;worksheet&amp;gt;&lt;/code&gt;. &lt;code&gt;&amp;lt;sheetData&amp;gt;&lt;/code&gt; is required, exactly once, at a specific position in the child sequence — POI has no choice but to emit it that way. The split point is backed by the spec, not by POI convention.&lt;/p&gt;

&lt;p&gt;This shapes the safety story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The variable part (&lt;code&gt;&amp;lt;sheetData&amp;gt;&lt;/code&gt; contents) is what we replace.&lt;/li&gt;
&lt;li&gt;The structural part (everything else) is copied verbatim from POI.&lt;/li&gt;
&lt;li&gt;POI version bumps that change formatting (whitespace, namespace prefixes, optional metadata) get absorbed automatically — we copy whatever POI emits.&lt;/li&gt;
&lt;li&gt;ECMA-376 evolution is tracked by POI before us; the scaffold reflects the new shape.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A DOM equivalence test in CI parses both outputs (scaffold-based and POI-direct), verifies their &lt;code&gt;sheet1.xml&lt;/code&gt; and &lt;code&gt;styles.xml&lt;/code&gt; structures match, and runs on every POI version bump — catches implementation bugs in the hand-rolled &lt;code&gt;&amp;lt;sheetData&amp;gt;&lt;/code&gt; and any residual divergence before they ship.&lt;/p&gt;

&lt;p&gt;The hybrid inherits correctness from POI (which inherits it from ECMA-376) and owns only the per-cell hot path.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Applicability.&lt;/strong&gt; The scaffold overhead — generating an empty XLSX via POI — is a fixed cost per file. The split earns its keep at scale; for small files this overhead can outweigh per-cell savings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Style table is fixed at scaffold time&lt;/strong&gt; (implementation-specific). Styles must be declared up-front. Adding a new style after the first cell write would mean re-emitting the styles part — currently unsupported here, though the pattern itself doesn't preclude it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Pattern Generalizes
&lt;/h2&gt;

&lt;p&gt;The split works for any format where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Correctness has bounded complexity&lt;/strong&gt; — a fixed set of metadata pieces to get right&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The hot path is repetitive&lt;/strong&gt; — one record type repeated N times, dominating size and time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The hot path can be isolated structurally&lt;/strong&gt; — a clear boundary in the format where the repetitive section lives&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OOXML fits this pattern cleanly. Other formats with a "metadata + repeated records" shape can invite similar splits. SAX/StAX-only solutions either skip the correctness side or reimplement it. The hybrid acknowledges those are different problems and assigns each to the right tool.&lt;/p&gt;

&lt;p&gt;It's the same intuition as letting an ORM build the schema while you write hot-path queries by hand: use the heavy abstraction where correctness matters, drop to the metal where throughput matters, and respect the boundary between them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Read Path
&lt;/h2&gt;

&lt;p&gt;Reading uses the same insight without needing a split. StAX directly on OOXML XML, bypassing POI's User Model entirely for XLSX. The XML pull-parser tooling already aligns with Jackson's pull-token model — direct stream-to-token translation, no object materialization. Same principle: don't pay for an object model when you only need a stream of tokens.&lt;/p&gt;

&lt;p&gt;The library that motivated this work is &lt;a href="https://github.com/scndry/jackson-dataformat-spreadsheet" rel="noopener noreferrer"&gt;jackson-dataformat-spreadsheet&lt;/a&gt; — a Jackson dataformat module for Excel I/O, listed in &lt;a href="https://github.com/FasterXML/jackson" rel="noopener noreferrer"&gt;FasterXML/jackson&lt;/a&gt; community modules. Apache 2.0.&lt;/p&gt;

&lt;p&gt;Critique welcome — especially OOXML edge cases the split might mishandle. I'm specifically curious about scenarios where a hand-rolled &lt;code&gt;&amp;lt;sheetData&amp;gt;&lt;/code&gt; could surprise Excel even when the surrounding scaffold is correct.&lt;/p&gt;

</description>
      <category>java</category>
      <category>excel</category>
      <category>xlsx</category>
      <category>performance</category>
    </item>
    <item>
      <title>Reading Excel to Java POJOs — A Modern Alternative to Apache POI (2026)</title>
      <dc:creator>scndry</dc:creator>
      <pubDate>Thu, 16 Apr 2026 10:42:41 +0000</pubDate>
      <link>https://dev.to/scndry/stop-treating-spreadsheets-like-spreadsheets-treat-them-like-json-fmf</link>
      <guid>https://dev.to/scndry/stop-treating-spreadsheets-like-spreadsheets-treat-them-like-json-fmf</guid>
      <description>&lt;p&gt;Reading Excel into Java POJOs is normally a multi-step exercise with Apache POI: open the workbook, iterate rows, fetch cells by column index, cast each to its type, build the object yourself. &lt;code&gt;jackson-dataformat-spreadsheet&lt;/code&gt; is a modern alternative — Jackson solves this same problem for JSON, and the library applies the same pattern to XLSX/XLS.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Workbook&lt;/span&gt; &lt;span class="n"&gt;wb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;XSSFWorkbook&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;Sheet&lt;/span&gt; &lt;span class="n"&gt;sheet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSheetAt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Row&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sheet&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCell&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getStringCellValue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;qty&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCell&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getNumericCellValue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// ... 20 more fields&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cell by cell. Column index by column index. Cast by cast. For every single spreadsheet your application touches.&lt;/p&gt;

&lt;p&gt;What if you could do this instead?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;SpreadsheetMapper&lt;/span&gt; &lt;span class="n"&gt;mapper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SpreadsheetMapper&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mapper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readValues&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Same API as Jackson's &lt;code&gt;ObjectMapper&lt;/code&gt;. Because a spreadsheet row IS a JSON object.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing jackson-dataformat-spreadsheet
&lt;/h2&gt;

&lt;p&gt;A Jackson extension module that treats XLSX/XLS as just another data format — like JSON, CSV, or XML.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/scndry/jackson-dataformat-spreadsheet" rel="noopener noreferrer"&gt;jackson-dataformat-spreadsheet&lt;/a&gt;&lt;br&gt;
Listed as a &lt;a href="https://github.com/FasterXML/jackson#data-format-modules" rel="noopener noreferrer"&gt;community data format module&lt;/a&gt; in the official FasterXML jackson repository.&lt;/p&gt;
&lt;h3&gt;
  
  
  Reading
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@DataGrid&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Employee&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;// getters, setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;SpreadsheetMapper&lt;/span&gt; &lt;span class="n"&gt;mapper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SpreadsheetMapper&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Single row&lt;/span&gt;
&lt;span class="nc"&gt;Employee&lt;/span&gt; &lt;span class="n"&gt;first&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mapper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// All rows&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;all&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mapper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readValues&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Specific sheet&lt;/span&gt;
&lt;span class="nc"&gt;SheetInput&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;File&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SheetInput&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Payroll"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;payroll&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mapper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readValues&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Writing
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;...;&lt;/span&gt;
&lt;span class="n"&gt;mapper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;writeValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Employee&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;That produces a proper XLSX file with headers and typed cells.&lt;/p&gt;
&lt;h2&gt;
  
  
  Nested Objects — The Killer Feature
&lt;/h2&gt;

&lt;p&gt;Spreadsheets are flat. POJOs are not. Most Excel libraries force you to flatten everything manually. This library does it automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────┬──────┬─────────┬────────────────┬─────────────┬────────┐
│ ID  │ NAME │ ZIPCODE │ ADDRESS LINE 1 │ DESIGNATION │ SALARY │
├─────┼──────┼─────────┼────────────────┼─────────────┼────────┤
│ 1   │ John │ 12345   │ 123 Main St.   │ CEO         │ 300000 │
└─────┴──────┴─────────┴────────────────┴─────────────┴────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@DataGrid&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Employee&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;Address&lt;/span&gt; &lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;Employment&lt;/span&gt; &lt;span class="n"&gt;employment&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Address&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;zipcode&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;addressLine1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Employment&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;designation&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No configuration needed. The nested POJO structure defines the column layout. Read and write — both directions work.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It's Built
&lt;/h2&gt;

&lt;p&gt;This isn't a POI wrapper. It extends Jackson's streaming layer directly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;SheetParser extends ParserMinimalBase&lt;/code&gt; — pulls tokens from StAX&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SheetGenerator extends GeneratorBase&lt;/code&gt; — streaming cell writer&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SpreadsheetFactory extends JsonFactory&lt;/code&gt; — creates parsers/generators&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The default XLSX path bypasses POI's cell model entirely. The read path parses OOXML XML directly via StAX — no XMLBeans, no SAX callbacks, no intermediate DOM. The write path builds a POI skeleton for package metadata, then streams worksheet and shared strings via StringBuilder directly to ZipOutputStream.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Jackson (pull)       SheetParser (pull)      StAX (pull)
    │                      │                      │
    ├─ nextToken() ───────►├─ next() ────────────►├─ next()
    │◄─ VALUE_STRING ──────┤◄─ CELL_VALUE ────────┤◄─ START_ELEMENT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;Benchmarked against popular alternatives on realistic data (100K rows, mixed types, JMH):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Read:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────┬───────────┬─────────┐
│          Library           │ Read Time │ Memory  │
├────────────────────────────┼───────────┼─────────┤
│ jackson-spreadsheet        │ 196 ms    │ 360 MB  │
├────────────────────────────┼───────────┼─────────┤
│ FastExcel                  │ 252 ms    │ 407 MB  │
├────────────────────────────┼───────────┼─────────┤
│ Fesod (formerly EasyExcel) │ 267 ms    │ 384 MB  │
├────────────────────────────┼───────────┼─────────┤
│ Poiji                      │ 848 ms    │ 2743 MB │
├────────────────────────────┼───────────┼─────────┤
│ Apache POI                 │ 1095 ms   │ 2225 MB │
└────────────────────────────┴───────────┴─────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Write:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌───────────────────────────┬────────────┬─────────┐
│          Library          │ Write Time │ Memory  │
├───────────────────────────┼────────────┼─────────┤
│ jackson-spreadsheet       │ 144 ms     │ 182 MB  │
├───────────────────────────┼────────────┼─────────┤
│ FastExcel                 │ 154 ms     │ 149 MB  │
├───────────────────────────┼────────────┼─────────┤
│ Apache POI                │ 268 ms     │ 207 MB  │
├───────────────────────────┼────────────┼─────────┤
│ Fesod                     │ 314 ms     │ 458 MB  │
└───────────────────────────┴────────────┴─────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fastest read AND write throughput. ~5.6x faster read than Apache POI, ~7% faster write than FastExcel. You don't trade performance for convenience — you get both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Annotations
&lt;/h2&gt;

&lt;p&gt;Control the schema with &lt;code&gt;@DataGrid&lt;/code&gt; and &lt;code&gt;@DataColumn&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@DataGrid&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Product&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@DataColumn&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Product Name"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@DataColumn&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Price"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;style&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"currency"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@DataColumn&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;merge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OptBoolean&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TRUE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Available on Maven Central:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;io.github.scndry&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;jackson-dataformat-spreadsheet&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.6.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Requirements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Java 8+&lt;/li&gt;
&lt;li&gt;Jackson 2.14.0+&lt;/li&gt;
&lt;li&gt;Apache POI 4.1.1+ (Strict OOXML requires 5.1.0+)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Links
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/scndry/jackson-dataformat-spreadsheet" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/scndry/jackson-dataformat-spreadsheet/blob/main/GUIDE.md" rel="noopener noreferrer"&gt;Usage Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/scndry/jackson-dataformat-spreadsheet/blob/main/ARCHITECTURE.md" rel="noopener noreferrer"&gt;Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/scndry/jackson-dataformat-spreadsheet/blob/main/BENCHMARK.md" rel="noopener noreferrer"&gt;Benchmarks&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Feedback, issues, and stars welcome. If you've ever cursed at &lt;code&gt;row.getCell(17).getStringCellValue()&lt;/code&gt;, this is for you.&lt;/p&gt;

</description>
      <category>java</category>
      <category>excel</category>
      <category>jackson</category>
      <category>apachepoi</category>
    </item>
  </channel>
</rss>
