Handling fixed-length files using a Kotlin DSL

Leonardo Colman Lopes
Kotlin enthusiast, enjoys open source as a hobby and is really into software testing!

TL;DR - Fixed-length files can be troublesome to handle, and even more so when the lines hold multiple kinds of records. To solve this, we'll use Guiabolso's fixed-length file handler, a library designed just for this purpose.

Hello Kotliners!

In this article we'll take a quick look at how to handle fixed-length files using this library by Guiabolso (with some contributions of my own). There are many solutions for this kind of problem on the JVM, but all of them are focused on Java.

These libraries are old and not optimized for Kotlin, so they become cumbersome and verbose to use. We created a Kotlin DSL to handle these cases in a beautiful and concise way.

A refresher: what is a fixed-length file?

If you've ever worked with a fixed-length file, you know that most of the time it's a big pain in the ass.

A fixed-length (or fixed-width) file is a file whose data is split into fields. Each field has a specific length, and values shorter than that length are padded on the left or right with a filler character, such as 0003 for the int 3 in a field of length 4.
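To make the padding idea concrete, here is a minimal sketch that renders values into fixed-width fields using only the Kotlin standard library. The helper names padNumber and padText are hypothetical and are not part of the Guiabolso library.

```kotlin
// Hypothetical helpers (not from the library): render values as fixed-width fields.

// Left-pads a non-negative number with zeros, e.g. 3 with length 4 becomes "0003".
fun padNumber(value: Int, length: Int): String {
    require(value >= 0) { "Negative values are not handled in this sketch" }
    val text = value.toString()
    require(text.length <= length) { "$value does not fit in $length characters" }
    return text.padStart(length, '0')
}

// Right-pads text with spaces so it always fills the whole field.
fun padText(value: String, length: Int): String {
    require(value.length <= length) { "'$value' does not fit in $length characters" }
    return value.padEnd(length, ' ')
}
```

Calling padNumber(3, 4) gives the 0003 from the paragraph above, and padText("Kotlin", 10) gives Kotlin followed by four spaces.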

An example of a fixed length file is the following:

Kotlin    1201.63
Java      0219.52
Python    0129.62
Javascript0308.43

It represents data from PYPL, and the data is organized as:

Field          From index  To index  Padding
Language Name  0           10        RightPadding(' ')
Ranking        10          12        LeftPadding('0')
Share %        12          17        LeftPadding('0')
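The layout above maps directly onto substring calls, treating "From index" as inclusive and "To index" as exclusive. As a rough sketch of what parsing one line by hand looks like (the LanguageShare type and both function names are made up for illustration, and this is plain standard-library Kotlin, not the library's API):

```kotlin
// Hypothetical record type for one line of the sample file.
data class LanguageShare(val name: String, val ranking: Int, val share: Double)

// Removes the zero padding, keeping a single "0" when the field is all zeros.
fun stripZeros(field: String): String = field.trimStart('0').ifEmpty { "0" }

// Slices one line according to the layout: [0, 10), [10, 12) and [12, 17).
fun parseLanguageShare(line: String): LanguageShare {
    require(line.length >= 17) { "Line is shorter than the 17 expected characters: '$line'" }
    return LanguageShare(
        name = line.substring(0, 10).trimEnd(' '),
        ranking = stripZeros(line.substring(10, 12)).toInt(),
        share = stripZeros(line.substring(12, 17)).toDouble()
    )
}
```

Even for three fields there is already index bookkeeping, padding removal and type conversion to get right, which is exactly the kind of code that gets painful as the layout grows.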

This kind of file is broadly used by legacy systems (I'm looking at you, banking system), and integrating with them is troublesome and not fun.

Parsing fixed-length files

These files are hard to understand and boring to deal with. Parsing them usually involves a lot of string manipulation and manual buffer control (or loading the entire file into memory, if that's viable).

When dealing with Kotlin code, this verbosity is annoying. We want simplicity and conciseness.

For this, the company I work for, Guiabolso, developed a small library for fixed-length file handling in Kotlin.

GuiaBolso / fixed-length-file-handler

Handlers for Fixed Length files in a beautiful Kotlin DSL

This library provides a beautiful (I wrote it, so I can say it's beautiful, right?) Kotlin DSL to parse this kind of file. Going back to our previous example:

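// Record type for one line of the sample file
data class PYPLRecord(val langName: String, val ranking: Int, val share: Double)

// Each field(from, to, padding) call reads one slice of the current line
val pyplSequence = fixedLengthFileParser<PYPLRecord>(fileStream) {
    PYPLRecord(
        field(0, 10, Padding.PaddingRight(' ')),
        field(10, 12, Padding.PaddingLeft('0')),
        field(12, 17, Padding.PaddingLeft('0'))
    )
}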

This lets us map our file to a lazy Sequence, which processes the file as a stream instead of loading it all into memory. The library already supports many of the usual Java/Kotlin types, so you don't have to cast and convert them yourself.
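If lazy sequences are new to you, the sketch below shows the general idea with the standard library alone. It is not how the library is implemented; parseLines is a hypothetical helper that only illustrates why a Sequence avoids reading and parsing everything up front.

```kotlin
import java.io.InputStream

// Hypothetical helper: lazily maps every line of a stream to a parsed value.
// Nothing is parsed until the returned sequence is iterated, and each line is
// parsed only when the consumer asks for it.
// The caller is responsible for closing the stream after consuming the sequence.
fun <T> parseLines(input: InputStream, parseLine: (String) -> T): Sequence<T> =
    input.bufferedReader().lineSequence().map(parseLine)
```

Asking for only the first element of such a sequence parses only the first line, which is what makes this approach workable for very large files. Note that a sequence built this way can be iterated only once.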

"Advanced" fixed-length files

For some reason yet to be determined, some of these legacy systems use the same file for more than one record type.

In these cases, the file from our example above would carry more than the language data, such as records with a Developer Name and a Preferred Language:

1Kotlin    1201.63
1Java      0219.52
1Python    0129.62
1Javascript0308.43
2Leonardo Colman LopesKotlin
2Jane Doe             Javascript

The type of the record is marked at some position in the line, and your system must find a way to parse it anyway.

This leads to even more string-manipulation spaghetti and code that is harder to maintain.
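To give an idea of what that hand-written version looks like, here is a sketch that dispatches on the first character of each line. All type and function names are hypothetical, and the offsets assume the layout of the sample file above with everything shifted by one position for the record marker.

```kotlin
// Hypothetical record hierarchy for the two kinds of lines.
sealed class Record
data class LanguageRecord(val name: String, val ranking: Int, val share: Double) : Record()
data class DeveloperRecord(val name: String, val preferredLanguage: String) : Record()

// Removes the zero padding, keeping a single "0" when the field is all zeros.
fun stripZeros(field: String): String = field.trimStart('0').ifEmpty { "0" }

// Chooses the record layout from the marker in the first column.
fun parseRecord(line: String): Record = when (line.firstOrNull()) {
    '1' -> LanguageRecord(
        line.substring(1, 11).trimEnd(' '),
        stripZeros(line.substring(11, 13)).toInt(),
        stripZeros(line.substring(13, 18)).toDouble()
    )
    '2' -> {
        // Re-pad the line in case trailing spaces were stripped from the file.
        val padded = line.padEnd(32, ' ')
        DeveloperRecord(
            padded.substring(1, 22).trimEnd(' '),
            padded.substring(22, 32).trimEnd(' ')
        )
    }
    else -> throw IllegalArgumentException("Unknown record type in line: '$line'")
}
```

Every new record type adds another branch full of magic indexes, and nothing ties those indexes to the layout except careful counting.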

The library also provides a way to parse this kind of file:

// One data class per record type found in the file
data class PYPLRecord(val langName: String, val ranking: Int, val share: Double)
data class DevRecord(val devName: String, val preferredLang: String)

fixedLengthFileParser<Any>(fileInputStream) {
    // Lines starting with '1' hold language data
    withRecord({ line -> line[0] == '1' }) {
        PYPLRecord(
            field(1, 11, Padding.PaddingRight(' ')),
            field(11, 13, Padding.PaddingLeft('0')),
            field(13, 18, Padding.PaddingLeft('0'))
        )
    }

    // Lines starting with '2' hold developer data
    withRecord({ line -> line[0] == '2' }) {
        DevRecord(
            field(1, 22, Padding.PaddingRight(' ')),
            field(22, 32, Padding.PaddingRight(' '))
        )
    }
}
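Since the parser above is typed as Any, the results have to be sorted out by type afterwards. Assuming it yields a Sequence<Any> (as the type argument suggests), a sketch of one way to consume the mixed records with standard-library calls could look like this; developersPerLanguage is a hypothetical function made up for the example.

```kotlin
// Same record types as in the parser example.
data class PYPLRecord(val langName: String, val ranking: Int, val share: Double)
data class DevRecord(val devName: String, val preferredLang: String)

// Hypothetical consumer: counts developers per preferred language,
// skipping every record that is not a DevRecord.
fun developersPerLanguage(records: Sequence<Any>): Map<String, Int> =
    records.filterIsInstance<DevRecord>()
        .groupingBy { it.preferredLang }
        .eachCount()
```

A sealed class shared by both record types would give the compiler more to check than Any does, at the cost of a little more ceremony.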

We believe that parsing fixed-length files will be easier with this library, and we hope it helps anyone who needs this kind of feature. Take a look!

GuiaBolso / fixed-length-file-handler

Fixed Length File Handler

Introduction

When processing data from some systems (mainly legacy ones), it's common to come across fixed-length files: files whose lines are split into fields, each with a specific length.

These files are sometimes tricky to handle, as there is often a spaghetti of string manipulation, padding, character counting and... Well, many things to take care of.

This library comes to the rescue of programmers dealing with fixed length files. It enables you to simply define how your records are structured and it will handle these records for you in a nice Kotlin DSL for further processing.

Using with Gradle

This library is published to Bintray jcenter, so you'll need to configure that in your repositories:

repositories {
    mavenCentral()
    jcenter()
}

And then you can import it into your dependencies:

dependencies {
    implementation(
