DEV Community

Jordi Cabot

Posted on • Originally published at

Choosing Java as your language for a Machine Learning project - Are we crazy???

Most people are stunned when they realize that the Xatkit bot engine is written in Java.

True, the vast majority of AI / Machine Learning projects are written in Python. But this doesn't mean that you should go with Python when starting your own project. And don't worry, this is not a post about language wars. I'm not claiming that Java is better than Python (nor the other way round, for that matter). I'm just explaining our language choice and suggesting that you take many aspects into account when choosing the base language for your next project.

Let's see why Java is a good choice for Machine Learning projects, or at least as good a choice as many others:

  • Machine Learning is only a small part of your project. Most of your code will NOT be about ML tasks but about data input/output, user interface, interaction with external services, and so on, so the language needs to be good at all these things as well. This is especially true in the case of chatbots that, to begin with, need to interact with different user input platforms.
  • There are ML libraries available for every language. So there is always a way to train and execute your neural networks outside the Python world. For instance, in Xatkit, we reuse Stanford's CoreNLP models in some of our language processors. And, if needed, there is always the option to wrap the ML model code in a Python server (I like the simplicity of Flask for this) and consume it via API calls to this server.
  • Java is heavily used in the enterprise world. So while core ML fans may frown at our language choice, enterprise users may see Java as a benefit: they already know how to manage and deploy Java-based applications, but they may not have the same experience with Python or other languages.
  • We are Java "experts". We are much more productive coding in Java than in any other language. Of course, we could become proficient in Python if we put in the time, but time is precious and it made sense to stick to the language we were already using in other projects.
  • Xatkit is a model-based tool. By model, I refer here to software design models, not ML ones. And in the modeling ecosystem, Java is still the boss. In particular, Xatkit reuses some EMF libraries, mostly to do some reflection on the bot definition at runtime. For sure, there are other ways to accomplish the same goal, but you can see this as a legacy decision made before Xatkit embraced Fluent APIs for the bot definition.
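
To make the "Python server plus API calls" option from the second point a bit more concrete, here is a minimal sketch of what the Java side could look like. It is not Xatkit code: the class name `MlServiceClient`, the `/predict` endpoint, and the JSON shape are all made up for illustration. So that the example runs on its own, a tiny stub built with the JDK's own `HttpServer` stands in for the Flask server; in a real setup that part would live in Python. It assumes Java 11 or later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical client for an ML model exposed over HTTP (e.g. by a Flask app). */
public class MlServiceClient {

    private final HttpClient http = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_1_1)
            .build();
    private final URI endpoint;

    public MlServiceClient(URI endpoint) {
        this.endpoint = endpoint;
    }

    /** Sends the utterance as JSON and returns the raw JSON reply of the service. */
    public String predict(String utterance) {
        // Hand-rolled JSON with minimal escaping; a real project would use a JSON library.
        String escaped = utterance.replace("\\", "\\\\").replace("\"", "\\\"");
        String body = "{\"text\": \"" + escaped + "\"}";
        HttpRequest request = HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        try {
            HttpResponse<String> response =
                    http.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() != 200) {
                throw new IllegalStateException(
                        "ML service returned HTTP " + response.statusCode());
            }
            return response.body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while calling the ML service", e);
        }
    }

    /** Demo only: a local stub standing in for the Python server, with a canned answer. */
    public static HttpServer startStubServer() {
        try {
            // Port 0 asks the OS for any free port.
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/predict", exchange -> {
                exchange.getRequestBody().readAllBytes(); // read and ignore the request body
                byte[] reply = "{\"intent\": \"greeting\", \"confidence\": 0.9}"
                        .getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, reply.length);
                try (var out = exchange.getResponseBody()) {
                    out.write(reply);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer stub = startStubServer();
        try {
            URI uri = URI.create(
                    "http://127.0.0.1:" + stub.getAddress().getPort() + "/predict");
            System.out.println(new MlServiceClient(uri).predict("Hello there"));
        } finally {
            stub.stop(0);
        }
    }
}
```

The point of the sketch is that the bot code only ever sees a plain Java method call; whether the model behind `predict` runs in Python, in another JVM, or anywhere else becomes a deployment detail.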

As you can see, maybe Java should not be your first option when getting started in AI technologies if there is really no constraint at all on your language choice. Otherwise, the choice of a language is more of a social/team/organization decision that should take into account many other aspects (team knowledge, organization architecture, integration needs, and so on). We see developers arguing non-stop about why language A is better than language B, but for most projects, even those including some kind of intelligent component, any major language will work and that choice will NOT be the deciding factor in the project's success.

So, forgive me if we continue developing bots in Java :-)

Top comments (3)

Vincent A. Cicirello

If you really needed Python in your Java app, another option aside from Flask is to use the GraalVM, which is designed with polyglot programming in mind.

Anyway, nice post. I do research in AI, mostly evolutionary computation, metaheuristics, and related concepts. And nearly all of my research code is in Java. I use Python mainly for data analysis tasks such as to analyze data from experiments, etc, but the experiments themselves are in Java.

My main reasons for Java over Python for this come down to speed and better support for multithreading, as well as decades of code I've already written in Java. If I was starting from scratch, I don't think I'd pick Python but if I did go with Python I'd probably end up implementing the majority in C rather than directly in Python.

However, if I was working on an ML project, I imagine I'd likely end up using Python in one way or another due to the really good existing ML libraries. Although like your post explains, if ML was a small part, I might end up with a polyglot approach.

Jordi Cabot

Thanks a lot for your detailed comments! And good to see that you've not had any major issues with Java in AI research.

Jordi Cabot

Additional interesting comments and pointers on the integration of Machine Learning libraries in #Java