DEV Community

Andrew (he/him)

Feedback on Small Java Package

I just finished a small package built to intelligently infer schemata ("schemas") of CSV files:

awwsmm / scheme

scheme

A minimal package for intelligently inferring schemata of CSV files.


  • Self-contained -- no external dependencies
  • Compatible -- runs on any Java version >= 8
  • Easy -- works immediately with no configuration required

Built to more intelligently infer schemata for creating Parquet files from CSV.


Download

Download the repository (and unzip if you downloaded the ZIP file):

[Screenshot: GitHub 'download' button and menu]

Navigate to the target directory:

in Windows Explorer

[Screenshot: locating the JAR file in Windows Explorer]

in Windows cmd prompt

C:\>cd C:\Users\myusername\Downloads\scheme-master\target
C:\Users\myusername\Downloads\scheme-master\target>dir
 Volume in drive C is Windows
 Volume Serial Number is 14EE-41C8
 Directory of C:\Users\myusername\Downloads\scheme-master\target
18 Sep 2019  17:30    <DIR>          .
18 Sep 2019  17:30    <DIR>          ..
18 Sep 2019  17:30               931 coverage.svg
18 Sep 2019  17:30            17,449 scheme-1.0.jar
               2 File(s)         18,380 bytes
               2 Dir(s)  2,749,439,602,688 bytes free

in a bash (or similar) shell on a UNIX-like OS

$ git clone https://github.com/awwsmm/scheme.git
Cloning into 'scheme'
remote: Enumerating objects:

I'm looking for feedback on it! Let me know if the layout is unclear or if the documentation could use some work, etc. I tested it on a few different versions of Java and I haven't had any problems.

Javadoc available here.

Anything you like about what I did? Anything you hate? Anything you'd change?

Let me know in the comments! And thanks for your help!

Discussion (7)

Pablo Fradua

I've only read a bit of CSV.java and didn't like the huge schema method.
Also didn't like where it's called (return schema(file, -1, -1, 35, false, false, false, true);). That line isn't exactly easy to read.
Why is every method static?

I appreciate the effort in documenting everything.
The readme page is great too.

(I don't know if you are looking for feedback on using the library or the library code)

Andrew (he/him) Author

Yeah I'm sort of fighting against OOP here. I don't want to have to create a CSV object and then run the algorithm, etc. etc. I just wanted the user to be able to say "okay, give me the schema for this file", with no other input required.

Maybe that's not the best way to go about it, though.
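
One possible middle ground for the two points in this exchange, as a rough sketch only: keep the static one-argument call, but have it delegate to an overload that takes a small options object, so the defaults have names instead of being bare positional values. Everything below is hypothetical. The class names (`InferenceOptions`, `CsvSketch`) and option names (`sampleSize`, `header`) are invented for illustration, since the thread does not say what each argument in the quoted call means, and the method body is a stand-in that does no real inference.

```java
import java.io.File;

/** Hypothetical options holder; the option names are invented for illustration. */
final class InferenceOptions {
    private int sampleSize = 35;    // assumed meaning of the "35" in the quoted call
    private boolean header = true;  // assumed meaning of one of the boolean flags

    private InferenceOptions() {}

    /** Returns a fresh object each time, so callers never share mutable state. */
    static InferenceOptions defaults() {
        return new InferenceOptions();
    }

    InferenceOptions withSampleSize(int n) {
        if (n <= 0) throw new IllegalArgumentException("sampleSize must be positive");
        this.sampleSize = n;
        return this;
    }

    InferenceOptions withHeader(boolean present) {
        this.header = present;
        return this;
    }

    int sampleSize() { return sampleSize; }
    boolean header() { return header; }
}

/** Hypothetical static facade; not the package's actual CSV class. */
final class CsvSketch {
    private CsvSketch() {}  // static-only utility class, never instantiated

    /** "Give me the schema for this file": no other input required. */
    static String schema(File file) {
        return schema(file, InferenceOptions.defaults());
    }

    /** Same call, with every non-default choice spelled out by name. */
    static String schema(File file, InferenceOptions options) {
        // Stand-in body: real schema inference is omitted. This only reports
        // which settings would have been used.
        return String.format("%s: sampleSize=%d, header=%b",
                file.getName(), options.sampleSize(), options.header());
    }
}
```

With this shape, `CsvSketch.schema(file)` still works with no setup, while a caller who needs something different can write `CsvSketch.schema(file, InferenceOptions.defaults().withHeader(false))` and the intent is readable at the call site.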

Simon Massey

Not what you were asking, but codereview.stackexchange.com can be awesome for details...

Andrew (he/him) Author

Great tip!

Fulton Browne

I don't have any CSV files right now, but I will check it out!

Andrew (he/him) Author

There are example ones in the package! They're at src/main/resources/

Fulton Browne

Thanks, will check it out!