<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mani</title>
    <description>The latest articles on DEV Community by Mani (@neomatrix369).</description>
    <link>https://dev.to/neomatrix369</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F117256%2F447a8f6d-4a0c-4008-9e60-8fe6554e1a57.jpeg</url>
      <title>DEV Community: Mani</title>
      <link>https://dev.to/neomatrix369</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/neomatrix369"/>
    <language>en</language>
    <item>
      <title>AI Coding Tools (MCP-series)</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Tue, 07 Oct 2025 00:09:00 +0000</pubDate>
      <link>https://dev.to/neomatrix369/ai-coding-tools-mcp-series-171b</link>
      <guid>https://dev.to/neomatrix369/ai-coding-tools-mcp-series-171b</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;AI-assisted development, code assistants, and intelligent developer tools&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;About This Category&lt;/h2&gt;

&lt;p&gt;Learn how to leverage AI-powered coding tools to boost your productivity as a developer. This category covers MCP (Model Context Protocol) setup, Claude command usage, platform comparisons, and other AI coding assistant topics.&lt;/p&gt;

&lt;h2&gt;Posts&lt;/h2&gt;

&lt;h3&gt;MCP (Model Context Protocol) Series&lt;/h3&gt;

&lt;p&gt;This series covers everything you need to know about setting up and using MCP servers with Claude and other AI coding assistants.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftafazupezvoyf2gy50aa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftafazupezvoyf2gy50aa.png" alt=" " width="800" height="246"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/blogs/ai-coding-tools/complete-mcp-server-setup-guide/post.md" rel="noopener noreferrer"&gt;The Complete MCP Server Setup Guide: Claude Desktop, Claude Code, and Cursor&lt;/a&gt;&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Published: October 2025 | Difficulty: Intermediate&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A comprehensive guide to setting up Model Context Protocol (MCP) servers across different Claude environments. Learn step-by-step configuration for Claude Desktop, Claude Code, and Cursor, with troubleshooting tips and best practices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Topics:&lt;/strong&gt; MCP Setup, Claude Desktop, Claude Code, Cursor, Configuration&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Reading Time:&lt;/strong&gt; ~20 minutes&lt;/p&gt;
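&lt;p&gt;As a minimal sketch of the kind of configuration the guide walks through (the server name, package, and directory path below are illustrative, and the exact location of &lt;code&gt;claude_desktop_config.json&lt;/code&gt; varies by operating system), an MCP server entry for Claude Desktop typically takes this shape:&lt;/p&gt;

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"]
    }
  }
}
```

&lt;p&gt;Cursor uses the same &lt;code&gt;mcpServers&lt;/code&gt; JSON shape in its own &lt;code&gt;mcp.json&lt;/code&gt; file (project-scoped or global); see the guide for the full per-platform details and troubleshooting.&lt;/p&gt;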




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5eo6gm7dsc75x0follx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5eo6gm7dsc75x0follx.png" alt=" " width="800" height="247"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/blogs/ai-coding-tools/claude-command-reference-card/post.md" rel="noopener noreferrer"&gt;The Claude Command Reference Card&lt;/a&gt;&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Published: October 2025 | Difficulty: Beginner&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Quick reference guide for Claude commands, shortcuts, and best practices. Essential for anyone using Claude Desktop, Claude Code, or Cursor to maximize productivity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Topics:&lt;/strong&gt; Commands, Shortcuts, Prompt Patterns, Workflows&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Reading Time:&lt;/strong&gt; ~10 minutes&lt;/p&gt;
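&lt;p&gt;To give a flavour of the commands the card covers (the subcommands shown are examples and may differ between Claude Code versions, so treat them as illustrative rather than authoritative), here is a small sketch that guards for the CLI being absent:&lt;/p&gt;

```shell
# Illustrative MCP-related commands for the Claude Code CLI.
# Check the output of the CLI's own help for your installed version.
if command -v claude >/dev/null; then
  claude mcp list    # show the MCP servers currently configured
  # claude mcp add my-server npx -y @modelcontextprotocol/server-filesystem ~/projects
else
  echo "claude CLI not found: install Claude Code first"
fi
```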




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhwfeg7omqc640i2dapv9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhwfeg7omqc640i2dapv9.png" alt=" " width="800" height="243"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/blogs/ai-coding-tools/mcp-setup-comparison-tables/post.md" rel="noopener noreferrer"&gt;MCP Setup Comparison Tables&lt;/a&gt;&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Published: October 2025 | Difficulty: Intermediate&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Side-by-side comparison of MCP server setup across Claude Desktop, Claude Code, and Cursor. Features detailed comparison tables, decision matrices, and migration guides.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Topics:&lt;/strong&gt; Platform Comparison, Feature Matrix, Migration, Use Cases&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Reading Time:&lt;/strong&gt; ~15 minutes&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/blogs/ai-coding-tools/complete-mcp-server-setup-guide/resources-and-references.md" rel="noopener noreferrer"&gt;Resource References&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;Topics Covered in This Category&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP (Model Context Protocol)&lt;/strong&gt; - Setup, configuration, and best practices&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Environments&lt;/strong&gt; - Desktop, Code extension, and Cursor integration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Pair Programming&lt;/strong&gt; - Using AI as a coding partner&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Command Reference&lt;/strong&gt; - Quick lookup for commands and shortcuts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform Comparison&lt;/strong&gt; - Choosing the right tool for your workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;External Resources&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/Programming-in-Python.md" rel="noopener noreferrer"&gt;Programming Resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/examples" rel="noopener noreferrer"&gt;Examples&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/details/julia-python-and-r.md" rel="noopener noreferrer"&gt;Tools &amp;amp; Libraries&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;Disclaimer&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Important Notice:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Content Accuracy:&lt;/strong&gt; The information, resources, and links provided in this repository are curated from various sources and are subject to change. While we strive to maintain accuracy and keep content up-to-date, we cannot guarantee that all information is current, correct, or complete at all times.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Third-Party Content:&lt;/strong&gt; This repository contains references, links, and citations to external sources, articles, tools, libraries, and resources created by third parties. We do not own or claim ownership of this external content. All credit belongs to the original authors and creators.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No Warranty:&lt;/strong&gt; The content is provided "as is" without warranty of any kind, express or implied. We make no representations or warranties regarding the accuracy, reliability, or completeness of any information provided.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Verification Recommended:&lt;/strong&gt; Users are strongly encouraged to verify information, test tools and code, and refer to official documentation before using any resources in production environments or critical applications.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rapidly Evolving Field:&lt;/strong&gt; AI, ML, and DL are rapidly evolving fields. Tools, best practices, and technologies mentioned here may become outdated. Always check for the latest versions and updates from official sources.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No Professional Advice:&lt;/strong&gt; Nothing in this repository constitutes professional, legal, or technical advice. Users should consult with qualified professionals for specific guidance related to their use cases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Community Contributions:&lt;/strong&gt; This is a community-driven project. Content may be contributed by various individuals. If you find errors, outdated information, or have suggestions for improvements, please see our &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/CONTRIBUTING.md" rel="noopener noreferrer"&gt;Contributing Guidelines&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use at Your Own Risk:&lt;/strong&gt; By using this repository, you acknowledge and accept these disclaimers and agree to use the information and resources at your own discretion and risk.&lt;/p&gt;




</description>
      <category>ai</category>
      <category>productivity</category>
      <category>cursor</category>
      <category>claude</category>
    </item>
    <item>
      <title>Another Two Years In The Life Of AI, ML, DL And Java</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Fri, 25 Dec 2020 21:16:55 +0000</pubDate>
      <link>https://dev.to/neomatrix369/another-two-years-in-the-life-of-ai-ml-dl-and-java-1hkc</link>
      <guid>https://dev.to/neomatrix369/another-two-years-in-the-life-of-ai-ml-dl-and-java-1hkc</guid>
      <description>&lt;p&gt;&lt;em&gt;&lt;strong&gt;This is a reblog of the original post at &lt;a href="https://www.javaadvent.com/" rel="noopener noreferrer"&gt;Java Advent Calendar 2020&lt;/a&gt;, see &lt;a href="https://www.javaadvent.com/2020/12/another-two-years-in-the-life-of-ai-ml-dl-and-java.html" rel="noopener noreferrer"&gt;original post&lt;/a&gt;.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;A bit of a background&lt;/h1&gt;

&lt;p&gt;Many of you know my first post on this topic, &lt;a href="https://dev.to/neomatrix369/two-years-in-the-life-of-ai-ml-dl-and-java--nni"&gt;&lt;strong&gt;&lt;em&gt;Two years in the life of AI, ML, DL and Java&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;, from back in 2018/2019 when I had just started on a journey that still continues, and I have more to share in this update (I hope you will like it). At that time we had a good number of resources, and the divide between &lt;strong&gt;Java&lt;/strong&gt; and non-&lt;strong&gt;Java&lt;/strong&gt; resources was not yet wide enough to be worth discussing. Since then things have changed, so let’s see what the landscape looks like now.&lt;/p&gt;

&lt;p&gt;I’m hoping that I’m able to share richer aspects of my journey, views, opinions, observations and learnings this time around as compared to &lt;a href="https://dev.to/neomatrix369/two-years-in-the-life-of-ai-ml-dl-and-java--nni"&gt;my previous summary post&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;I do admit that this is only a summary post and I intend to write more specific and focussed posts like the ones I did in 2019 (see the following section) — I have earmarked a couple of topics and I will flesh them out as soon as I’m happy with my selections.&lt;/p&gt;

&lt;h1&gt;2019: year of blogs&lt;/h1&gt;

&lt;p&gt;Looking at my activities from 2019 into 2020, I can say I was more focussed on writing blogs and creating a lot of hands-on solutions, which I shared via my projects on GitHub and blogs on Medium. Here is a list of all of them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  August 2019: &lt;a href="https://medium.com/oracledevs/how-to-build-graal-enabled-jdk8-on-circleci-3e20ae07a5d3?sk=85ddf84d1c444a224f9ec12bdcb03034" rel="noopener noreferrer"&gt;&lt;strong&gt;How to build Graal-enabled JDK8 on CircleCI?&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  August 2019: &lt;a href="https://towardsdatascience.com/how-to-do-deep-learning-for-java-on-the-valohai-platform-eec8ba9f71d8?sk=6b81209d56de2b5bf45aa2b3367b0a1f" rel="noopener noreferrer"&gt;&lt;strong&gt;How to do Deep Learning for Java?&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  September 2019: &lt;a href="https://medium.com/oracledevs/running-your-jupyter-notebooks-on-the-cloud-ed970326649f?sk=d064432ccc14576f9156096b333d7647" rel="noopener noreferrer"&gt;&lt;strong&gt;Running your JuPyTeR notebooks on the cloud&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  October 2019: &lt;a href="https://medium.com/oracledevs/running-apache-zeppelin-on-oracle-cloud-infrastructure-b0aecc79597a?sk=d531a8cf4ca5b2d345e529805d676975" rel="noopener noreferrer"&gt;&lt;strong&gt;Running Apache Zeppelin on the cloud&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  November 2019: &lt;a href="https://towardsdatascience.com/applying-nlp-in-java-all-from-the-command-line-1225dd591e80?sk=4dabee3fbeaee3633c3065becca8e2a6" rel="noopener noreferrer"&gt;&lt;strong&gt;Applying NLP in Java, all from the command-line&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  November 2019: &lt;a href="https://towardsdatascience.com/exploring-nlp-concepts-using-apache-opennlp-4d59c3cac8?sk=a0495c3b93fa8a67b12747e6ac975db1" rel="noopener noreferrer"&gt;&lt;strong&gt;Exploring NLP concepts using Apache OpenNLP&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  December 2019: &lt;a href="https://towardsdatascience.com/exploring-nlp-concepts-using-apache-opennlp-inside-a-jupyter-notebook-e53489ba2bd8?sk=3f0925d458e73193a5fab445d2c14bdf" rel="noopener noreferrer"&gt;&lt;strong&gt;Exploring NLP concepts using Apache OpenNLP inside a Java-enabled Jupyter notebook&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;2020: year of presentations&lt;/h1&gt;

&lt;p&gt;In 2020, by contrast, I didn’t write much from a blogging point of view; this is officially my first proper post of the year (with the exception of one in August 2020, which you will find below), as I was quite busy preparing for talks, presentations, and online meetings and discussions. I kicked off the year with my talk at the &lt;a href="https://www.grakncosmos.com/" rel="noopener noreferrer"&gt;Grakn Cosmos 2020&lt;/a&gt; conference in London, UK. Let me list the talks here so you can take a look at them in your own time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  February 2020: &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/presentations/data/04-grakn-cosmos-2020" rel="noopener noreferrer"&gt;Naturally, getting productive, my journey with Grakn and Graql&lt;/a&gt; at &lt;strong&gt;Grakn Cosmos 2020&lt;/strong&gt; conference&lt;/li&gt;
&lt;li&gt;  July 2020: &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/presentations/awesome-ai-ml-dl/01-jonconf-2020" rel="noopener noreferrer"&gt;“nn” things every Java Developer should know about AI/ML/DL&lt;/a&gt; at the &lt;strong&gt;JOnConf 2020&lt;/strong&gt; conference&lt;/li&gt;
&lt;li&gt;  August 2020: &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/presentations/awesome-ai-ml-dl/02-abhishektalks-2020" rel="noopener noreferrer"&gt;From backend development to machine learning&lt;/a&gt; at the &lt;strong&gt;Abhishek Talks (YouTube channel)&lt;/strong&gt; meetup&lt;/li&gt;
&lt;li&gt;  &lt;em&gt;August 2020:&lt;/em&gt; &lt;a href="https://medium.com/@neomatrix369/an-interview-by-neural-magic-machine-learning-engineer-spotlight-e9be4ea61f8?source=friends_link&amp;amp;sk=42b6b4dc15ebef567fe8341ba7c541ba" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;An Interview by Neural Magic: Machine Learning Engineer Spotlight&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt; &lt;em&gt;(blog post)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;  October 2020: &lt;a href="https://github.com/neomatrix369/nlp_profiler/tree/master/presentations/01-nlp-zurich-2020" rel="noopener noreferrer"&gt;Profiling Text Data&lt;/a&gt; at the NLP Zurich meetup&lt;/li&gt;
&lt;li&gt;  October 2020: &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/presentations/awesome-ai-ml-dl/03-makeitweek-2020" rel="noopener noreferrer"&gt;Tribuo: an introduction to a Java ML Library&lt;/a&gt; at &lt;strong&gt;MakeITWeek 2020&lt;/strong&gt; conference&lt;/li&gt;
&lt;li&gt;  October 2020: &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/presentations/awesome-ai-ml-dl/04-gba-apac-tour-2020" rel="noopener noreferrer"&gt;“nn” things every Java Developer should know about AI/ML/DL&lt;/a&gt; at the &lt;strong&gt;Oracle Groundbreakers Tour – APAC 2020&lt;/strong&gt; conference&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can find slides (and videos of some) of the above presentations under the links provided.&lt;/p&gt;

&lt;h1&gt;Other achievements and activities this year&lt;/h1&gt;

&lt;p&gt;As for my journey, I have also spent a good part of this year learning through online AI/ML competitions; see &lt;a href="https://twitter.com/theNeomatrix369/status/1214601525856747520" rel="noopener noreferrer"&gt;this&lt;/a&gt; and &lt;a href="https://mobile.twitter.com/theNeomatrix369/status/1265411110473252866" rel="noopener noreferrer"&gt;this&lt;/a&gt; post to learn more. You can also find all of my &lt;a href="https://www.kaggle.com/neomatrix369/datasets" rel="noopener noreferrer"&gt;datasets&lt;/a&gt; and &lt;a href="https://www.kaggle.com/neomatrix369/notebooks" rel="noopener noreferrer"&gt;notebooks&lt;/a&gt; for further study. The launch of a &lt;strong&gt;Python&lt;/strong&gt;-based NLP library called &lt;a href="https://github.com/neomatrix369/nlp_profiler" rel="noopener noreferrer"&gt;NLP Profiler&lt;/a&gt; was another important feat for me this year.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://neomatrix369.wordpress.com/about" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1q6vv6jnyr6sjitn3bg.png" width="654" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://neomatrix369.wordpress.com/about" rel="noopener noreferrer"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;What’s happening in this space?&lt;/h1&gt;

&lt;p&gt;There is a lot of hype in this space, and as &lt;a href="https://twitter.com/suyashcjoshi" rel="noopener noreferrer"&gt;&lt;strong&gt;Suyash Joshi&lt;/strong&gt;&lt;/a&gt; rightly said during a conversation some time back: &lt;em&gt;“It’s easy to get sucked into hype and shiny new stuff, I fell a lot for that and then get overwhelmed”.&lt;/em&gt; I think many of us can relate to this. Many of the things we are seeing look nice and shiny but may NOT be fully ready for the real world yet; I covered similar arguments in a &lt;a href="https://medium.com/@neomatrix369/an-interview-by-neural-magic-machine-learning-engineer-spotlight-e9be4ea61f8?source=friends_link&amp;amp;sk=42b6b4dc15ebef567fe8341ba7c541ba" rel="noopener noreferrer"&gt;&lt;em&gt;post in August&lt;/em&gt;&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
A few entities are openly calling out the not-so-bright sides of AI, with claims like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;em&gt;the statistical methods used are from the last century and may not be suited to modern-day data (led by &lt;a href="https://www.amazon.co.uk/s?k=nassim+taleb&amp;amp;dc&amp;amp;ref=a9_sc_1" rel="noopener noreferrer"&gt;Nassim Taleb’s&lt;/a&gt; school of thought)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;em&gt;many modern-day open-source tools and industry methods used to arrive at solutions are broken or inadequate&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;em&gt;we are not able to explain our predictions and findings (the blackbox issue: explainability and interpretability)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;em&gt;our AI solutions lack causality, intent and reasoning&lt;/em&gt; (similar to the previous point)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These claims come from individuals, groups and companies who, having noticed the above drawbacks, are building their own solutions (mostly closed-source) to overcome them while also making the rest of the world aware of them.&lt;/p&gt;

&lt;h1&gt;Where is Java on the map?&lt;/h1&gt;

&lt;p&gt;Java has been contributing to the AI space for some time now, going all the way back to the Big Data buzz, which has since transformed into the new buzz around AI, Machine Learning and Deep Learning. If we look closely at the categories and sub-categories of resources, we find:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://spark.apache.org/" rel="noopener noreferrer"&gt;Apache Spark&lt;/a&gt; &lt;a href="https://www.nvidia.com/en-us/deep-learning-ai/solutions/data-science/apache-spark-3/" rel="noopener noreferrer"&gt;&lt;em&gt;[1]&lt;/em&gt;&lt;/a&gt; &lt;em&gt;|&lt;/em&gt;&lt;a href="https://blogs.oracle.com/graalvm/apache-spark%e2%80%94lightning-fast-on-graalvm-enterprise" rel="noopener noreferrer"&gt;&lt;em&gt;[2]&lt;/em&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="http://hadoop.apache.org/" rel="noopener noreferrer"&gt;Apache Hadoop&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://github.com/eclipse/deeplearning4j" rel="noopener noreferrer"&gt;Deeplearning4J&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;  and many other such tools, libraries, frameworks, and products&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and now more recently&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://github.com/NVIDIA/grcuda" rel="noopener noreferrer"&gt;grCuda&lt;/a&gt; and &lt;a href="https://github.com/graalvm/graalpython" rel="noopener noreferrer"&gt;graalPython&lt;/a&gt; (from the folks at &lt;a href="https://labs.oracle.com/pls/apex/f?p=94065:INTRO::::::" rel="noopener noreferrer"&gt;Oracle Labs&lt;/a&gt; behind all the &lt;a href="http://graalvm.org" rel="noopener noreferrer"&gt;&lt;strong&gt;GraalVM&lt;/strong&gt;&lt;/a&gt; &lt;em&gt;magic&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://www.deepnetts.com/" rel="noopener noreferrer"&gt;DeepNetts&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://tribuo.org/" rel="noopener noreferrer"&gt;Tribuo&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and more under these resources&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/details/java-jvm.md" rel="noopener noreferrer"&gt;Java: AI/ML/DL&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/natural-language-processing/java-jvm.md" rel="noopener noreferrer"&gt;NLP Java&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/details/java-jvm.md#tools--libraries-other-resources" rel="noopener noreferrer"&gt;Tools and libraries&lt;/a&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/details/java-jvm.md#tools--libraries-other-resources" rel="noopener noreferrer"&gt;(Java/JVM)&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  some intriguing &lt;strong&gt;GitHub&lt;/strong&gt; resources: &lt;a href="https://github.com/search?q=java+ai&amp;amp;type=repositories" rel="noopener noreferrer"&gt;AI&lt;/a&gt; or &lt;a href="https://github.com/search?q=java+artificial+intelligence&amp;amp;type=repositories" rel="noopener noreferrer"&gt;Artificial Intelligence&lt;/a&gt; | &lt;a href="https://github.com/search?q=java+ML&amp;amp;type=repositories" rel="noopener noreferrer"&gt;ML&lt;/a&gt; or &lt;a href="https://github.com/search?q=java+Machine+Learning&amp;amp;type=repositories" rel="noopener noreferrer"&gt;Machine Learning&lt;/a&gt; | &lt;a href="https://github.com/search?q=java+DL&amp;amp;type=repositories" rel="noopener noreferrer"&gt;DL&lt;/a&gt; or &lt;a href="https://github.com/search?q=java+Deep+Learning&amp;amp;type=repositories" rel="noopener noreferrer"&gt;Deep Learning&lt;/a&gt; | &lt;a href="https://github.com/search?q=java+RL&amp;amp;type=repositories" rel="noopener noreferrer"&gt;RL&lt;/a&gt; or &lt;a href="https://github.com/search?q=java+reinforcement+learning&amp;amp;type=repositories" rel="noopener noreferrer"&gt;Reinforcement Learning&lt;/a&gt; | &lt;a href="https://github.com/search?q=java+NLP&amp;amp;type=repositories" rel="noopener noreferrer"&gt;NLP&lt;/a&gt; or &lt;a href="https://github.com/search?q=java+Natural+Language+Processing&amp;amp;type=repositories" rel="noopener noreferrer"&gt;Natural Language Processing&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://www.nvidia.com/en-us/deep-learning-ai/solutions/data-science/apache-spark-3/" rel="noopener noreferrer"&gt;&lt;em&gt;[1] Nvidia and Apache Spark&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blogs.oracle.com/graalvm/apache-spark%e2%80%94lightning-fast-on-graalvm-enterprise" rel="noopener noreferrer"&gt;&lt;em&gt;[2] Oracle Labs and Apache Spark&lt;/em&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;You can already see various implementations using Java and other JVM languages; they simply go unnoticed unless we make an attempt to look for them.&lt;/p&gt;

&lt;p&gt;If you have been following the progress of &lt;a href="https://graalvm.org" rel="noopener noreferrer"&gt;GraalVM and Truffle&lt;/a&gt; (see &lt;a href="https://github.com/oracle/graal" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;) in this space, boosting server-side, desktop and cloud-based systems in terms of performance, resource footprint and language interop, then you will, just like me, see that &lt;strong&gt;Java&lt;/strong&gt;’s presence and position in this space is going from strength to strength.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpaper-attachments.dropbox.com%2Fs_0AEF83F51DD1F952F93345301E669383CE4B6DDE15013A5673D6B01EEE5B4D00_1606947855178_Screen%2BShot%2B2020-12-02%2Bat%2B22.23.42.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpaper-attachments.dropbox.com%2Fs_0AEF83F51DD1F952F93345301E669383CE4B6DDE15013A5673D6B01EEE5B4D00_1606947855178_Screen%2BShot%2B2020-12-02%2Bat%2B22.23.42.png" alt="grCuda" width="800" height="453"&gt;&lt;/a&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpaper-attachments.dropbox.com%2Fs_0AEF83F51DD1F952F93345301E669383CE4B6DDE15013A5673D6B01EEE5B4D00_1606947855205_Screen%2BShot%2B2020-12-02%2Bat%2B22.23.50.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpaper-attachments.dropbox.com%2Fs_0AEF83F51DD1F952F93345301E669383CE4B6DDE15013A5673D6B01EEE5B4D00_1606947855205_Screen%2BShot%2B2020-12-02%2Bat%2B22.23.50.png" alt="graalPython" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(&lt;strong&gt;grCuda&lt;/strong&gt; and &lt;strong&gt;graalPython&lt;/strong&gt; are two such examples: see the AI/ML/DL related resources on &lt;a href="https://github.com/neomatrix369/awesome-graal" rel="noopener noreferrer"&gt;Awesome Graal&lt;/a&gt; and also the appendix sections of the talk slides for more resources; see the section above for the talks)&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;Some highlights in pictures&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.31.10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.31.10.png" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.31.19.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.31.19.png" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.31.58.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.31.58.png" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi1.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.32.21.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi1.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.32.21.png" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi1.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.32.48.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi1.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.32.48.png" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.49.25.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.49.25.png" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi1.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.51.00.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi1.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.51.00.png" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi0.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.52.15.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi0.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.52.15.png" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.55.09.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi2.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.55.09.png" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi1.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.55.16-2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi1.wp.com%2Fwww.javaadvent.com%2Fcontent%2Fuploads%2F2020%2F12%2FScreen-Shot-2020-12-02-at-22.55.16-2.png" width="372" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Recommended Resources
&lt;/h1&gt;

&lt;p&gt;Of all the resources I have shared in this post so far, these are the topics I would focus on (in no particular order, and by no means an exhaustive list; it’s still a lot of topics to learn and know about):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/details/maths-stats-probability.md#bayesian" rel="noopener noreferrer"&gt;Bayesian&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/Python-Performance.md" rel="noopener noreferrer"&gt;Performance in Python&lt;/a&gt; | &lt;a href="https://MadeWithML.com" rel="noopener noreferrer"&gt;Made With ML&lt;/a&gt; | &lt;a href="https://virgili0.github.io/Virgilio/" rel="noopener noreferrer"&gt;Virgilio&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/details/visualisation.md#visualisation" rel="noopener noreferrer"&gt;&lt;strong&gt;Visualisation&lt;/strong&gt;&lt;/a&gt; | &lt;a href="https://tvm.apache.org/docs/index.html" rel="noopener noreferrer"&gt;&lt;strong&gt;TVM: Compiler stack for CPUs, GPUs, etc…&lt;/strong&gt;&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/data/README.md#hyperparameter-tuning" rel="noopener noreferrer"&gt;Hyperparameter tuning&lt;/a&gt; | &lt;a href="https://www.kaggle.com/neomatrix369/normalising-a-distribution" rel="noopener noreferrer"&gt;Normalising (and Standardising) a distribution&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/things-to-know.md" rel="noopener noreferrer"&gt;Things to Know&lt;/a&gt; | &lt;a href="https://github.com/NVIDIA/grcuda" rel="noopener noreferrer"&gt;&lt;strong&gt;grCuda&lt;/strong&gt;&lt;/a&gt; | &lt;a href="https://www.scailable.net/" rel="noopener noreferrer"&gt;&lt;strong&gt;Running compiled and optimised AI Models in the Cloud&lt;/strong&gt;&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/cloud-devops-infra/about-neural-magic.md" rel="noopener noreferrer"&gt;Neural Magic: Model Pruning and Sparsification&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/tribuo#native-image-build-optional" rel="noopener 
noreferrer"&gt;&lt;strong&gt;Train and Infer models via a native-image&lt;/strong&gt;&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/details/julia-python-and-r.md#genetic-algorithms" rel="noopener noreferrer"&gt;Genetic Algorithms by Eyal&lt;/a&gt; | &lt;a href="https://www.deepnetts.com/" rel="noopener noreferrer"&gt;DeepNetts: Deep Learning in Java by Zoran&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/natural-language-processing#natural-language-processing-nlp" rel="noopener noreferrer"&gt;&lt;strong&gt;NLP&lt;/strong&gt;&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/natural-language-processing/java-jvm.md#javajvm" rel="noopener noreferrer"&gt;NLP with Java&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/README-details.md#graphs" rel="noopener noreferrer"&gt;Graphs&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/ML-on-code-programming-source-code.md" rel="noopener noreferrer"&gt;ML on Code/Program/Source Code&lt;/a&gt; | &lt;a href="https://tribuo.org/" rel="noopener noreferrer"&gt;&lt;strong&gt;Tribuo: A java-base ML library&lt;/strong&gt;&lt;/a&gt;|&lt;a href="https://github.com/neomatrix369/chatbot-conversations" rel="noopener noreferrer"&gt;Chatbots talking to each other&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/details/julia-python-and-r.md#testing" rel="noopener noreferrer"&gt;&lt;strong&gt;Testing&lt;/strong&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;My apologies, as &lt;a href="https://www.lifewire.com/what-is-ymmv-2483722" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;YMMV&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;: I’m still working on these topics and gaining knowledge and experience, hence my urge to share them immediately with the rest of the community. I’ll admit a few of these may still be &lt;em&gt;Work in Progress&lt;/em&gt;, but I hope we can all learn them together. I’ll share more as I make progress with them over the days, weeks and months of 2021.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusions
&lt;/h1&gt;

&lt;p&gt;We are in a heterogeneous space: AI/ML/DL ideas and innovations take various shapes and forms on the way from creation to deployment, and depending on comfort zones and end-use cases they get implemented using a multitude of technologies. It would be neither correct nor fair to say that only one technology is being used.&lt;/p&gt;

&lt;p&gt;We are seeing R&amp;amp;D, rapid prototyping, MVPs, PoCs and the like being done in languages such as &lt;strong&gt;Python&lt;/strong&gt;, &lt;strong&gt;R&lt;/strong&gt;, &lt;strong&gt;Matlab&lt;/strong&gt; and &lt;strong&gt;Julia&lt;/strong&gt;, while for robust, reliable, scalable, production-ready solutions more serious contenders are chosen to encapsulate the original work. &lt;strong&gt;Java&lt;/strong&gt; and other &lt;strong&gt;JVM&lt;/strong&gt; languages, which can provide such stability, are often considered for both desktop and cloud implementations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Wearing my Oracle Groundbreaker Ambassador hat&lt;/em&gt;&lt;/strong&gt;: I recommend taking advantage of the &lt;em&gt;&lt;strong&gt;&lt;a href="https://www.oracle.com/cloud/free/" rel="noopener noreferrer"&gt;free cloud credits from Oracle&lt;/a&gt;&lt;/strong&gt;&lt;/em&gt; (also see &lt;em&gt;&lt;strong&gt;&lt;a href="https://dev.to/phocks/how-to-get-2x-oracle-cloud-servers-free-forever-4o22"&gt;this post&lt;/a&gt;&lt;/strong&gt;&lt;/em&gt;); &lt;em&gt;the VMs are super fast and easy to set up (see my posts above).&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  What next?
&lt;/h1&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1542169096-d626c9d082ab%3Fixid%3DMXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%253D%26ixlib%3Drb-1.2.1%26auto%3Dformat%26fit%3Dcrop%26w%3D1000%26q%3D80" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1542169096-d626c9d082ab%3Fixid%3DMXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%253D%26ixlib%3Drb-1.2.1%26auto%3Dformat%26fit%3Dcrop%26w%3D1000%26q%3D80" width="1000" height="667"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The journey continues, the road may be long, but it’s full of learnings — that certainly helps us grow. Many times we learn at the turning points while the milestones keep changing. Let’s continue learning and improving ourselves.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Please do give feedback if you are curious to explore deeper aspects of what you read above; as you may appreciate, the topic is wide and deep, and one post cannot do full justice to such rich topics (plus many of them are moving targets).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;This is a reblog of the original post at &lt;a href="https://www.javaadvent.com/" rel="noopener noreferrer"&gt;Java Advent Calendar 2020&lt;/a&gt;, see &lt;a href="https://www.javaadvent.com/2020/12/another-two-years-in-the-life-of-ai-ml-dl-and-java.html" rel="noopener noreferrer"&gt;original post&lt;/a&gt;.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>java</category>
    </item>
    <item>
      <title>An Interview by Neural Magic: Machine Learning Engineer Spotlight</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Sun, 01 Nov 2020 09:05:10 +0000</pubDate>
      <link>https://dev.to/neomatrix369/machine-learning-engineer-spotlight-mani-sarkar-371l</link>
      <guid>https://dev.to/neomatrix369/machine-learning-engineer-spotlight-mani-sarkar-371l</guid>
      <description>

&lt;h3&gt;
  
  
  &lt;strong&gt;&lt;em&gt;This is a reblog of the original post at &lt;a href="https://neuralmagic.com/blog/machine-learning-engineer-spotlight-mani-sarkar/" rel="noopener noreferrer"&gt;https://neuralmagic.com/blog/machine-learning-engineer-spotlight-mani-sarkar/&lt;/a&gt; by &lt;a href="https://neuralmagic.com/" rel="noopener noreferrer"&gt;Neural Magic&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;
&lt;/h3&gt;




&lt;p&gt;In our new blog series, we’re interviewing data scientists and machine learning engineers about their career paths, areas of interest and thoughts on the future of AI. We kick off this week with a 20-year veteran and jack-of-all-trades when it comes to machine learning and data science: &lt;a href="https://www.linkedin.com/in/mani-sarkar/?originalSubdomain=uk" rel="noopener noreferrer"&gt;Mani Sarkar&lt;/a&gt;. Mani is a strategic machine learning engineer based in London, UK, who believes in getting beyond the theoretical and applying AI to real-world problems.&lt;/p&gt;

&lt;p&gt;Below is our interview, lightly edited for clarity.&lt;/p&gt;

&lt;h4&gt;
  
  
  Tell us more about how you got into machine learning.
&lt;/h4&gt;

&lt;p&gt;I started my career as a software developer, writing desktop, web-based applications and command-line tools. The best thing about being a developer is learning new things. I’ve always been interested in data, numbers and math. In the last few years, I got more serious about it.&lt;/p&gt;

&lt;p&gt;After 20 years working as a permanent employee, I decided to take charge of my career as a freelancer. I help companies develop proof-of-concept or minimum viable products to secure funding or go to market quickly. I focus on improving performance or the speed of the software development process. My motto is, “Strengthening teams and helping them accelerate!”&lt;/p&gt;

&lt;p&gt;There are really no boundaries to what I’m asked to do, so learning data science became a major priority for me. One client inspired me to delve into this subject more. The company wrote bots that could read and write computer code. The bot provides recommendations on how to improve your code as a developer. As someone who’s interested in software quality, it pushed me to pursue practical machine learning and data science projects.&lt;/p&gt;

&lt;h4&gt;
  
  
  What are you most excited about in the work you are doing these days? 
&lt;/h4&gt;

&lt;p&gt;I take a top-down approach to machine learning, where I focus on the business problem, as opposed to many others who take a bottom-up approach. I can count on one hand the entities taking a top-down approach, and my philosophy of learning and implementing closely matches theirs. Creating programs, content, guides and everything else using this principle is very exciting. Machine learning is an ever-changing and hard-to-grasp field, so it is important to help business users contextualize technical projects.&lt;/p&gt;

&lt;p&gt;Autonomy and creativity are a great part of what I do. When I’m free to be creative, I’m able to get the best results for the end-user, the customer or even the community (when I’m working on open source projects).&lt;/p&gt;

&lt;h4&gt;
  
  
  What is the coolest machine learning problem you have worked toward solving?
&lt;/h4&gt;

&lt;p&gt;Natural language processing (NLP) is a widespread field with many new innovations and advancements. Despite that, at a very basic level, there are no comprehensive tools to analyze tabular text data. There are a lot of fragmented tools and utilities available, but many of them are not open-sourced or widely shared. So, we all end up building our own little solutions to analyze text datasets. Each one of us might do it differently and get a different response. &lt;/p&gt;

&lt;p&gt;While preparing for a talk last month, I wrote a simple utility called &lt;a href="https://www.kaggle.com/general/166954" rel="noopener noreferrer"&gt;NLP Profiler&lt;/a&gt; in under three hours, which is now going to be part of the &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/better-nlp" rel="noopener noreferrer"&gt;Better NLP&lt;/a&gt; library. When given a dataset and a column name with text data, &lt;a href="https://www.kaggle.com/general/166954" rel="noopener noreferrer"&gt;NLP Profiler&lt;/a&gt; will return either high-level insights about the text or low-level/granular statistical information about the same text. Think of it as using the &lt;code&gt;pandas.DataFrame.describe()&lt;/code&gt; function or running &lt;em&gt;Pandas Profiling&lt;/em&gt; on your data frame, but for datasets containing text columns rather than purely columnar datasets (tabular or spreadsheet-like data where each column may be a different data type such as string, numeric or date; this includes most kinds of data commonly stored in relational databases or in tab-separated or .csv files).&lt;/p&gt;

&lt;p&gt;I have used it on a few datasets and it has shown some interesting information. High-level information would include things like sentiment analysis, subjectivity/objectivity analysis, grammar or spelling quality check, etc. Low-level details could include the number of words in the sentence, the number of emojis in the text, etc. &lt;a href="https://www.kaggle.com/general/166954" rel="noopener noreferrer"&gt;NLP Profiler&lt;/a&gt; can do this analysis using a single line of code. Above all, it can be extended and shared openly with others. This opens a new world of machine learning on text data and can help any NLP engineer or practitioner.&lt;/p&gt;
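&lt;p&gt;To make the idea concrete, here is a minimal, standard-library-only Python sketch of the kind of low-level/granular statistics described above (word, sentence and emoji counts per text). It illustrates the concept only; the function and field names are my own, not NLP Profiler’s actual API:&lt;/p&gt;

```python
import re

# Illustrative helper only; NLP Profiler's real API differs.
def profile_texts(texts):
    """Return low-level, per-text statistics: word, sentence and emoji counts."""
    emoji_pattern = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
    rows = []
    for text in texts:
        rows.append({
            "text": text,
            "words": len(text.split()),
            "sentences": max(1, len(re.findall(r"[.!?]+", text))),
            "emojis": len(emoji_pattern.findall(text)),
        })
    return rows

stats = profile_texts(["Hello world! This is great \U0001F600", "One sentence"])
print(stats)
```

&lt;p&gt;A library such as pandas could then wrap these rows into a data frame for a &lt;code&gt;describe()&lt;/code&gt;-style summary across the whole column.&lt;/p&gt;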

&lt;h4&gt;
  
  
  How do you predict machine learning will evolve over the next decade?
&lt;/h4&gt;

&lt;p&gt;Automation will play a big role. Even though there will be big challenges – like privacy, ethics, bias, and more – there will be ways around them using a combination of automation and human intervention. Machine learning will evolve with a human in the loop: AI will assist humans in everything from collecting data, to training models, to analyzing models.&lt;/p&gt;

&lt;p&gt;I predict that there are three different pathways that may occur in parallel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fully/partially AI-driven systems:&lt;/strong&gt; Many of these error-prone systems have been attempted, and I imagine there will be even more failures to learn from in the future.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI assisting humans:&lt;/strong&gt; Rather than AI taking people’s jobs, AI will augment mundane tasks. This will create a cascade of new industries, just like the advent of PCs in the 80s and 90s created countless fulfilling careers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Humans doing tasks AI cannot do fully or partially:&lt;/strong&gt; AI still isn’t good at some tasks humans excel at, which will remain the case for some time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Since NLP is an area in which I’m particularly interested, I imagine in the future we may see a smart dialogue system with memory and the ability to detect context. We’ve already come a long way from the 1960s MIT NLP program, &lt;a href="https://en.wikipedia.org/wiki/ELIZA" rel="noopener noreferrer"&gt;ELIZA&lt;/a&gt;. I created a conversational chatbot demo and video based on the logic used to build ELIZA, which people can play with &lt;a href="https://github.com/neomatrix369/chatbot-conversations" rel="noopener noreferrer"&gt;on my GitHub&lt;/a&gt;.&lt;/p&gt;
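&lt;p&gt;As a flavour of that ELIZA-style logic (a toy Python illustration, not the code from the linked repository): match an input pattern, reflect first-person words into second-person ones, and slot the fragment into a templated reply.&lt;/p&gt;

```python
import re

# Toy ELIZA-style responder: pattern match + pronoun reflection + template.
REFLECTIONS = {"i": "you", "am": "are", "my": "your", "me": "you"}

def reflect(fragment):
    # Swap first-person words for second-person ones, word by word.
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())

def respond(sentence):
    match = re.match(r"i am (.*)", sentence, re.IGNORECASE)
    if match:
        return f"Why do you say you are {reflect(match.group(1))}?"
    return "Tell me more."

print(respond("I am worried about my code"))
# → Why do you say you are worried about your code?
```

&lt;p&gt;The real ELIZA used a larger table of patterns and reflections, but the core trick is exactly this: no understanding, just matching and rewriting.&lt;/p&gt;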

&lt;h4&gt;
  
  
  What is the most interesting application of machine learning you have seen out there?
&lt;/h4&gt;

&lt;p&gt;I was recently at an online meetup where a small startup called PolyAI&lt;a href="https://www.facebook.com/nathan.benaich/videos/10100448047785480" rel="noopener noreferrer"&gt; gave a demo&lt;/a&gt; of an interactive chatbot app that could take a phone call and make a reservation for any caller. It was used to demonstrate how a person could book a restaurant table via the app. It was amazing to see how the chatbot picked up many of the nuances in the caller’s language and style of speaking (even the accent) that previously only experienced professionals could have responded to. The accuracy with which the information was delivered to the user was impressive. Even though the caller was a bit unaccustomed to talking to a chatbot rather than a human, they could get their points across and were pleasantly surprised.&lt;/p&gt;

&lt;p&gt;Another demo I recently experienced was a technical open source application called &lt;a href="https://pycaret.org/" rel="noopener noreferrer"&gt;PyCaret&lt;/a&gt;, where the whole process of creating a machine learning model was automated or made available with only a few lines of Python code. The tool performed all kinds of analysis and generated metrics and logs that could be assessed to make further decisions. It even had a way to perform post-model-creation analysis using SHAP.&lt;/p&gt;
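&lt;p&gt;PyCaret’s own API is not reproduced here; the following standard-library-only sketch just illustrates the underlying idea (fit several candidate models, score each on held-out data, return the best). All model and helper names are my own inventions for illustration:&lt;/p&gt;

```python
# Stdlib-only sketch of automated model comparison; the names below are
# illustrative inventions, not PyCaret's API.
def majority_model(train_y):
    # Always predict the most common training label.
    label = max(set(train_y), key=train_y.count)
    return lambda x: label

def threshold_model(train_X, train_y):
    # One-feature rule: predict 1 when x is at least the mean of the positives.
    positives = [x for x, y in zip(train_X, train_y) if y == 1]
    cut = sum(positives) / len(positives)
    return lambda x: 1 if x >= cut else 0

def compare_models(train_X, train_y, test_X, test_y):
    candidates = {
        "majority": majority_model(train_y),
        "threshold": threshold_model(train_X, train_y),
    }
    # Accuracy of each candidate on the held-out data.
    scores = {
        name: sum(model(x) == y for x, y in zip(test_X, test_y)) / len(test_y)
        for name, model in candidates.items()
    }
    return max(scores, key=scores.get), scores

best, scores = compare_models([1, 2, 3, 8, 9], [0, 0, 0, 1, 1], [1, 9], [0, 1])
print(best, scores)
```

&lt;p&gt;Tools like PyCaret do this across dozens of real estimators and metrics, which is what makes the “few lines of code” experience possible.&lt;/p&gt;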

&lt;h4&gt;
  
  
  What do you see as the biggest challenges in machine learning and AI right now?
&lt;/h4&gt;

&lt;p&gt;Ethics and privacy are big challenges, which we are all aware of. However, we have not developed General AI yet (or if it’s available, we don’t see it in the mainstream), where these concerns would become more severe.&lt;/p&gt;

&lt;p&gt;Specialization in different domains is still a challenge. There’s still a lot of work to be done to master practical applications of AI. Smaller subsets of different domain expertise need to come together to become generally useful to the population.&lt;/p&gt;

&lt;p&gt;On the NLP front, machine understanding of various languages and dialects will continue to remain an issue, even though not for long. The accuracy of general NLP systems’ results still needs work before we can accept them in the mainstream for domestic or industrial use. That said, many industries have achieved success creating solutions for specific needs (under controlled environments).&lt;/p&gt;

&lt;p&gt;The other challenge will be the ability to weave these AI creations into society. With certain inventions and innovations, the challenge lies in the lack of knowledge around these systems. Specific areas of concern include algorithms for crime and law enforcement, or health and safety. Authenticity and the rise of “deep fakes” will continue to be a challenge for some time.&lt;/p&gt;

&lt;p&gt;Finally, energy consumption within data centers that specialize in AI/machine learning tasks is having negative consequences on the environment. That’s a global problem that needs to be solved.&lt;/p&gt;

&lt;h4&gt;
  
  
  If you could change one thing about the public perception of machine learning and AI, what would it be?
&lt;/h4&gt;

&lt;p&gt;The biggest perception to change is that AI is not here to take away our jobs or make lives difficult. When augmented with our skills and knowledge, it can assist us in our regular life and work processes. Contrary to what the entertainment industry might have you believe, AI can’t “think” like us independently. We haven’t reached that stage where we can say we have built an electronic or digital consciousness around us. We may think we are going to be able to build one in the distant future, but that’s a discussion for another time.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneuralmagic.com%2Fwp-content%2Fuploads%2F2020%2F08%2FManiPhoto471x628-2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneuralmagic.com%2Fwp-content%2Fuploads%2F2020%2F08%2FManiPhoto471x628-2.jpg" alt="" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;About Mani&lt;/h2&gt;

&lt;p&gt;Mani is a passionate developer, mainly in the Java/JVM space, currently strengthening teams and helping them accelerate, working with small teams and startups as a freelance software, data and ML engineer.&lt;/p&gt;



&lt;p&gt;A Java Champion, JCP Member, OpenJDK contributor, thought leader in the LJC and other developer communities and involved with @adoptopenjdk, @graalvm and other F/OSS projects. Writes code, not just on the Java/JVM platform but in other programming languages, hence likes to call himself a polyglot developer. He sees himself working in the areas of core Java, JVM, JDK, Hotspot, Graal, GraalVM, Truffle, VMs, and Performance Tuning.&lt;/p&gt;

&lt;p&gt;An advocate of a number of agile and software craftsmanship practices and a regular at many talks, conferences (Devoxx, VoxxedDays) and hands-on-workshops – speaks, participates, organises and helps out at many of them. Expresses his thoughts often via blog posts (on his own blog site, DZone, Medium and other third-party sites), and microblogs (tweets).&lt;/p&gt;

&lt;p&gt;Learn more and connect with Mani &lt;a href="https://sessionize.com/mani-sarkar" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;&lt;em&gt;This is a reblog of the original post at &lt;a href="https://neuralmagic.com/blog/machine-learning-engineer-spotlight-mani-sarkar/" rel="noopener noreferrer"&gt;https://neuralmagic.com/blog/machine-learning-engineer-spotlight-mani-sarkar/&lt;/a&gt; by &lt;a href="https://neuralmagic.com/" rel="noopener noreferrer"&gt;Neural Magic&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;
&lt;/h3&gt;




</description>
      <category>nlp</category>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>career</category>
    </item>
    <item>
      <title>Exploring NLP concepts using Apache OpenNLP inside a Java-enabled Jupyter notebook</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Tue, 03 Dec 2019 13:03:28 +0000</pubDate>
      <link>https://dev.to/neomatrix369/exploring-nlp-concepts-using-apache-opennlp-inside-a-java-enabled-jupyter-notebook-320g</link>
      <guid>https://dev.to/neomatrix369/exploring-nlp-concepts-using-apache-opennlp-inside-a-java-enabled-jupyter-notebook-320g</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;I have been exploring and playing around with the &lt;a href="https://opennlp.apache.org/"&gt;Apache OpenNLP&lt;/a&gt; library after a bit of convincing. For those who are not aware of it, it’s an &lt;a href="https://apache.org"&gt;Apache&lt;/a&gt; project, from the foundation that has supported F/OSS Java projects for the last two decades or so (see &lt;a href="https://en.wikipedia.org/wiki/The_Apache_Software_Foundation"&gt;Wikipedia&lt;/a&gt;). I found its command-line interface pretty simple to use, and it is a great tool for learning and trying to understand &lt;a href="https://en.wikipedia.org/wiki/Natural_language_processing"&gt;Natural Language Processing (NLP)&lt;/a&gt;. Independent of this post, you can find &lt;a href="https://blog.valohai.com/nlp_with_dl4j_in_java_all_from_the_command-line"&gt;another perspective on exploring NLP concepts&lt;/a&gt; using &lt;a href="https://opennlp.apache.org/"&gt;Apache OpenNLP&lt;/a&gt;, all of it directly from the realms of your command prompt.&lt;/p&gt;

&lt;p&gt;Almost everyone in this space is aware of and familiar with &lt;a href="https://jupyter.org/"&gt;Jupyter Notebooks&lt;/a&gt; (in case you are not, have a look at this &lt;a href="https://www.youtube.com/watch?v=HW29067qVWk"&gt;video&lt;/a&gt; or &lt;a href="http://www.geomarvel.com/2019/09/03/getting-started-with-jupyter-notebooks/"&gt;[1]&lt;/a&gt; or &lt;a href="https://www.dataquest.io/blog/jupyter-notebook-tutorial/"&gt;[2]&lt;/a&gt;). From here onwards, we will be doing the same things you have been doing in your own experiments, but from within the realms of the notebook.&lt;/p&gt;

&lt;h1&gt;
  
  
  Exploring NLP using Apache OpenNLP
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Command-line Interface
&lt;/h2&gt;

&lt;p&gt;I’ll refer you to &lt;a href="https://medium.com/@neomatrix369/exploring-nlp-concepts-using-apache-opennlp-4d59c3cac8?source=---------2------------------&amp;amp;gi=2aabe5762998"&gt;the post&lt;/a&gt; where we cover the command-line experience with &lt;a href="https://opennlp.apache.org/"&gt;Apache OpenNLP&lt;/a&gt;, and it’s a great way to familiarise yourself with this NLP library.&lt;/p&gt;

&lt;h2&gt;
  
  
  Jupyter Notebook: Getting started
&lt;/h2&gt;

&lt;p&gt;Do the following before proceeding any further:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        $ git clone git@github.com:neomatrix369/nlp-java-jvm-example.git
        or 
        $ git clone https://github.com/neomatrix369/nlp-java-jvm-example.git
        $ cd nlp-java-jvm-example
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;em&gt;And then&lt;/em&gt; &lt;em&gt;see&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#getting-started"&gt;&lt;em&gt;Getting started&lt;/em&gt;&lt;/a&gt; &lt;em&gt;section in the&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#exploring-nlp-concepts-from-inside-a-java-based-jupyter-notebook"&gt;&lt;em&gt;Exploring NLP concepts from inside a Java-based Jupyter notebook&lt;/em&gt;&lt;/a&gt; &lt;em&gt;part of the README before proceeding further.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Also, we have chosen &lt;a href="https://graalvm.org"&gt;GraalVM&lt;/a&gt; as the default JDK; you can see this from these lines in the console messages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    &amp;lt;---snipped--&amp;gt;
    JDK_TO_USE=GRAALVM
    openjdk version "11.0.5" 2019-10-15
    OpenJDK Runtime Environment (build 11.0.5+10-jvmci-19.3-b05-LTS)
    OpenJDK 64-Bit GraalVM CE 19.3.0 (build 11.0.5+10-jvmci-19.3-b05-LTS, mixed mode, sharing)
    &amp;lt;---snipped--&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;: a &lt;a href="https://hub.docker.com/r/neomatrix369/nlp-java"&gt;docker image&lt;/a&gt;&lt;/em&gt; has been provided so you can run a docker container containing all the tools you need. You can see that the &lt;code&gt;shared&lt;/code&gt; folder has been created; it is linked to the volume mounted into your container, mapping the folder from your local machine. So anything created or downloaded into the shared folder will be available even after you exit your container!&lt;/p&gt;

&lt;p&gt;Have a quick read of the main &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example#nlp-javajvm-"&gt;README&lt;/a&gt; file to get an idea of how to go about using the &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/docker-runner.sh"&gt;docker-runner.sh shell script&lt;/a&gt;, and take a quick glance at the &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example#usage"&gt;Usage section&lt;/a&gt; of the &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/tree/master/images/java/opennlp#scripts-provided"&gt;scripts&lt;/a&gt; as well.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running the Jupyter notebook container
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;See&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#running-the-jupyter-notebook-container"&gt;&lt;em&gt;Running the Jupyter notebook container&lt;/em&gt;&lt;/a&gt; &lt;em&gt;section in the&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#exploring-nlp-concepts-from-inside-a-java-based-jupyter-notebook"&gt;&lt;em&gt;Exploring NLP concepts from inside a Java-based Jupyter notebook&lt;/em&gt;&lt;/a&gt; &lt;em&gt;part of the README before proceeding further.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;All you need to do is run this command after cloning the repo mentioned in the links above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ ./docker-runner.sh --notebookMode --runContainer
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Once you have the above running, it will automatically load the Jupyter notebook interface for you in a browser window. You will have a couple of Java notebooks to choose from (placed in the &lt;code&gt;shared/notebooks&lt;/code&gt; folder on your local machine):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fcS8DFc0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1176/0%2A-7ioNONeUiWcWoyT.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fcS8DFc0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1176/0%2A-7ioNONeUiWcWoyT.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing Apache OpenNLP in the container
&lt;/h2&gt;

&lt;p&gt;When inside the container in the notebook mode, you have two approaches to install &lt;a href="https://opennlp.apache.org/"&gt;Apache OpenNLP&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;From the command-line interface (optional)&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;See&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#from-the-command-line-interface"&gt;&lt;em&gt;From the command-line interface&lt;/em&gt;&lt;/a&gt; &lt;em&gt;sub-section under the&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#installing-apache-opennlp-in-the-container"&gt;&lt;em&gt;Installing Apache OpenNLP in the container&lt;/em&gt;&lt;/a&gt; &lt;em&gt;section in the&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#exploring-nlp-concepts-from-inside-a-java-based-jupyter-notebook"&gt;&lt;em&gt;Exploring NLP concepts from inside a Java-based Jupyter notebook&lt;/em&gt;&lt;/a&gt; &lt;em&gt;part of the README before proceeding further.&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;From inside the Jupyter notebook (recommended)&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;See&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#from-inside-the-jupyter-notebook"&gt;&lt;em&gt;From inside the Jupyter notebook&lt;/em&gt;&lt;/a&gt; &lt;em&gt;sub-section under the&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#installing-apache-opennlp-in-the-container"&gt;&lt;em&gt;Installing Apache OpenNLP in the container&lt;/em&gt;&lt;/a&gt; &lt;em&gt;section in the&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#exploring-nlp-concepts-from-inside-a-java-based-jupyter-notebook"&gt;&lt;em&gt;Exploring NLP concepts from inside a Java-based Jupyter notebook&lt;/em&gt;&lt;/a&gt; &lt;em&gt;part of the README before proceeding further.&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Viewing and accessing the shared folder
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;See&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#viewing-and-accessing-the-shared-folder"&gt;&lt;em&gt;Viewing and accessing the shared folder&lt;/em&gt;&lt;/a&gt; &lt;em&gt;section in the&lt;/em&gt; &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/images/java/opennlp/README-exploring-nlp-java-jupyter-notebook.md#exploring-nlp-concepts-from-inside-a-java-based-jupyter-notebook"&gt;&lt;em&gt;Exploring NLP concepts from inside a Java-based Jupyter notebook&lt;/em&gt;&lt;/a&gt; &lt;em&gt;part of the README before proceeding further.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This will also be covered briefly via the Jupyter notebooks in the following section. You can list directory contents via the &lt;code&gt;%system&lt;/code&gt; &lt;a href="https://github.com/SpencerPark/IJava/blob/master/docs/magics.md"&gt;Java cell magic&lt;/a&gt;, and see a similar files/folders layout from the command prompt.&lt;/p&gt;
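&lt;p&gt;For instance, a single notebook cell using the &lt;code&gt;%system&lt;/code&gt; magic could list the shared notebooks folder (the path below matches the container's working directory shown later in the shutdown log, but may differ on your setup):&lt;/p&gt;

```
%system ls -lash /home/jovyan/work/shared/notebooks
```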

&lt;h2&gt;
  
  
  Performing NLP actions in a Jupyter notebook
&lt;/h2&gt;

&lt;p&gt;As soon as the notebook server launches, you will see this launcher window with a list of notebooks and other supporting files:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---lWLh-jO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1176/0%2ApYMZlFJkkCpW8diZ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---lWLh-jO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1176/0%2ApYMZlFJkkCpW8diZ.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each of the notebooks above has a purpose. &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/shared/notebooks/MyFirstJupyterNLPJavaNotebook.ipynb"&gt;MyFirstJupyterNLPJavaNotebook.ipynb&lt;/a&gt; shows how to write Java in an IPython notebook and perform NLP actions using Java code snippets that invoke the &lt;a href="https://opennlp.apache.org/"&gt;Apache OpenNLP&lt;/a&gt; library functionality (see the &lt;a href="https://opennlp.apache.org/docs/1.9.1/apidocs/opennlp-tools/index.html"&gt;Javadoc for more details on the classes, methods and Java API usage&lt;/a&gt;). &lt;/p&gt;

&lt;p&gt;The other notebook, &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/shared/notebooks/MyNextJupyterNLPJavaNotebook.ipynb"&gt;MyNextJupyterNLPJavaNotebook.ipynb&lt;/a&gt;, runs the same Java code snippets on a remote cloud instance (with the help of the &lt;a href="https://docs.valohai.com/valohai-cli/index.html"&gt;Valohai CLI&lt;/a&gt; client) and returns the results in the cells, each with a single command. It’s fast and free to &lt;a href="https://app.valohai.com/accounts/signup/"&gt;create an account&lt;/a&gt; and use within the &lt;a href="https://valohai.com/pricing/"&gt;free-tier plan&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We can examine the following Java API bindings to the &lt;a href="https://opennlp.apache.org/"&gt;Apache OpenNLP&lt;/a&gt; library from inside both Java-based notebooks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html#tools.langdetect.classifying.api"&gt;Language Detector API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html#tools.sentdetect.detection.api"&gt;Sentence Detection API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html#tools.tokenizer.api"&gt;Tokenizer API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html#tools.namefind.recognition.api"&gt;Name Finder API&lt;/a&gt; (including other examples)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html#tools.postagger.tagging.api"&gt;Parts of speech (POS) Tagger API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html#tools.parser.chunking.api"&gt;Chunking API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html#tools.parser.parsing.api"&gt;Parsing API&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
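&lt;p&gt;As a taste of what the notebooks walk through, here is a minimal sketch of the Sentence Detection API, assuming the &lt;code&gt;en-sent.bin&lt;/code&gt; model has already been downloaded into the &lt;code&gt;shared&lt;/code&gt; folder (the model path and sample text here are illustrative, not taken from the notebooks):&lt;/p&gt;

```java
import java.io.FileInputStream;
import java.io.InputStream;
import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;

// Load the pre-trained sentence-detection model from the shared folder
try (InputStream modelIn = new FileInputStream("shared/en-sent.bin")) {
    SentenceModel model = new SentenceModel(modelIn);
    SentenceDetectorME detector = new SentenceDetectorME(model);

    // Split a small piece of text into individual sentences
    String[] sentences = detector.sentDetect(
        "Apache OpenNLP is a machine learning based toolkit. It supports many NLP tasks.");
    for (String sentence : sentences) {
        System.out.println(sentence);
    }
}
```

&lt;p&gt;In an IJava notebook cell this snippet can be run as-is; in a plain Java program it would need wrapping in a class and a &lt;code&gt;main&lt;/code&gt; method, with the OpenNLP Tools jar on the classpath.&lt;/p&gt;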

&lt;p&gt;&lt;strong&gt;Exploring the above Apache OpenNLP Java APIs via the notebook directly&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We can do this from inside a notebook running the &lt;a href="https://github.com/SpencerPark/IJava"&gt;IJava Jupyter interpreter&lt;/a&gt;, which allows writing Java in a typical notebook. We will explore the above-named Java APIs using small snippets of Java code and see the results appear in the notebook:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Dq62-toy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://miro.medium.com/max/704/1%2AdElv_WFeEz94631_sgZbOg.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Dq62-toy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://miro.medium.com/max/704/1%2AdElv_WFeEz94631_sgZbOg.gif" alt="MyFirstJupyterNLPJavaNotebook"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So go back to your browser and look for the &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/shared/notebooks/MyFirstJupyterNLPJavaNotebook.ipynb"&gt;MyFirstJupyterNLPJavaNotebook.ipynb&lt;/a&gt; notebook and have a play with it, reading and executing each cell and observing the responses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exploring the above Apache OpenNLP Java APIs via the notebook with the help of remote cloud services&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We can do this from inside a notebook running the &lt;a href="https://github.com/SpencerPark/IJava"&gt;IJava Jupyter interpreter&lt;/a&gt;, which allows writing Java in a typical notebook. But in this notebook, instead of running the Java code snippets directly in the cells like the previous notebook, we use the &lt;code&gt;%system&lt;/code&gt; &lt;a href="https://github.com/SpencerPark/IJava/blob/master/docs/magics.md"&gt;Java cell magic&lt;/a&gt; together with the &lt;a href="https://docs.valohai.com/valohai-cli/index.html"&gt;Valohai CLI&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;This way, downloading the models and processing the text with them does not happen on your local machine but on a more sophisticated remote server in the cloud, and you can control the whole process from inside the notebook cells. This is most relevant when the models and datasets are large and your local machine does not have the resources to support long-running NLP processes. NLP training and evaluation can take a long time to finish, so high-spec resources are a must.&lt;/p&gt;
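&lt;p&gt;A cell in that notebook then drives the remote execution through the CLI via the magic, along these lines (the step name here is purely hypothetical; the actual steps are defined in the notebook and its accompanying Valohai configuration):&lt;/p&gt;

```
%system vh --version
%system vh exec run train-model --adhoc
```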

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4Dqv3pcq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://miro.medium.com/max/704/1%2A2nfxMdIlDICpGbvsVEJ6xg.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4Dqv3pcq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://miro.medium.com/max/704/1%2A2nfxMdIlDICpGbvsVEJ6xg.gif" alt="MyNextJupyterNLPJavaNotebook"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then go back to your browser, look for the &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/shared/notebooks/MyNextJupyterNLPJavaNotebook.ipynb"&gt;MyNextJupyterNLPJavaNotebook.ipynb&lt;/a&gt; notebook and have a play with it, reading and executing each cell. All the necessary details are in there, including links to the docs and supporting pages.&lt;/p&gt;

&lt;p&gt;To get a deeper understanding of how these two notebooks were put together and how they work operationally, please have a look at all &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/tree/master/shared/notebooks"&gt;the source files&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing the Jupyter notebook
&lt;/h2&gt;

&lt;p&gt;Make sure you have saved your notebook before you do this. Switch to the console window from where you ran the &lt;code&gt;docker-runner&lt;/code&gt; shell script. Pressing &lt;strong&gt;Ctrl-C&lt;/strong&gt; in the console running the Docker container gives you this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    &amp;lt;---snipped---&amp;gt;
    [I 21:13:16.253 NotebookApp] Saving file at /MyFirstJupyterJavaNotebook.ipynb
    ^C[I 21:13:22.953 NotebookApp] interrupted
    Serving notebooks from local directory: /home/jovyan/work
    1 active kernel
    The Jupyter Notebook is running at:
    http://1e2b8182be38:8888/
    Shutdown this notebook server (y/[n])? y
    [C 21:14:05.035 NotebookApp] Shutdown confirmed
    [I 21:14:05.036 NotebookApp] Shutting down 1 kernel
    Nov 15, 2019 9:14:05 PM io.github.spencerpark.jupyter.channels.Loop shutdown
    INFO: Loop shutdown.
    &amp;lt;--snipped--&amp;gt;
    [I 21:14:05.441 NotebookApp] Kernel shutdown: 448d46f0-1bde-461b-be60-e248c6342f69
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This shuts down the container and returns you to your local machine’s command prompt. Your notebooks stay preserved in the &lt;code&gt;shared/notebooks&lt;/code&gt; folder on your local machine, provided you saved them as you went along.&lt;/p&gt;

&lt;h1&gt;
  
  
  Other concepts, libraries and tools
&lt;/h1&gt;

&lt;p&gt;There are other &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example#libraries--frameworks-provided"&gt;Java/JVM based NLP libraries&lt;/a&gt; mentioned in the &lt;strong&gt;Resources&lt;/strong&gt; section below; for brevity we won’t cover them here. The links provided lead to further information for your own exploration.&lt;/p&gt;

&lt;p&gt;Within the &lt;a href="https://opennlp.apache.org/"&gt;Apache OpenNLP&lt;/a&gt; tool itself we have only covered the command-line access, not the Java bindings. Nor have we gone through all of the NLP concepts or features of the tool; again for brevity, we have covered only a handful of them. But the &lt;a href="https://opennlp.apache.org/docs/"&gt;documentation&lt;/a&gt; and &lt;a href="https://github.com/apache/opennlp#useful-links"&gt;resources&lt;/a&gt; on the &lt;a href="https://github.com/apache/opennlp"&gt;GitHub repo&lt;/a&gt; should help with further exploration.&lt;/p&gt;

&lt;p&gt;You can also find out how to build the &lt;a href="https://hub.docker.com/r/neomatrix369/nlp-java"&gt;docker image&lt;/a&gt; for yourself, by examining the &lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example/blob/master/docker-runner.sh"&gt;docker-runner script&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Limitations
&lt;/h1&gt;

&lt;p&gt;Although we are in a non-Python notebook, the &lt;a href="https://github.com/SpencerPark/IJava/blob/master/docs/magics.md"&gt;Java cell magic&lt;/a&gt; makes a real difference: we could still run shell commands and execute Java code in our cells and do some decent NLP work in the Jupyter notebooks.&lt;/p&gt;

&lt;p&gt;If you had a Python-based notebook, then &lt;a href="https://www.valohai.com"&gt;Valohai&lt;/a&gt;’s extension &lt;a href="https://docs.valohai.com/jupyter/index.html"&gt;Jupyhai&lt;/a&gt;, made specially for such purposes, would suffice. Have a look at the &lt;strong&gt;Jupyhai&lt;/strong&gt; sub-section in the &lt;strong&gt;Resources&lt;/strong&gt; section at the end of this post. In fact, we have been running all our actions in a Jupyhai notebook all along, although I have been calling it a Jupyter notebook (have a look at the icon on the toolbar in the middle of the panel in the browser):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lm2MZaZ3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1181/0%2A4Z6rClsE92z83a_g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lm2MZaZ3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1181/0%2A4Z6rClsE92z83a_g.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--i_UaYAEk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/413/0%2AsSWhveyrSxI8jL48.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--i_UaYAEk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/413/0%2AsSWhveyrSxI8jL48.jpg" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This has been a very different experience from most other ways of exploring and learning, and you can see why the whole industry, particularly the areas covering academia, research, Data Science and Machine Learning, has taken to this approach like a storm. We still have limitations, but with time even these will be overcome, making the experience a smooth one.&lt;/p&gt;

&lt;p&gt;Seeing your results on the same page where your code sits is very reassuring and gives us a short, quick feedback loop. In particular, being able to see visualisations, change them dynamically and get instant results can cut through the cruft for busy and eager students, scientists, mathematicians, engineers and analysts in every field, not just Computing, Data Science or Machine Learning. &lt;/p&gt;

&lt;h1&gt;
  
  
  Resources
&lt;/h1&gt;

&lt;h3&gt;
  
  
  IJava (Jupyter interpreter)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/SpencerPark/IJava"&gt;Github&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/SpencerPark/IJava/tree/master/docs"&gt;Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;%system&lt;/code&gt; &lt;a href="https://github.com/SpencerPark/IJava/pull/78"&gt;Java cell magic implementation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://hub.docker.com/r/neomatrix369/nlp-java"&gt;Docker image with IJava + Jupyhai + other dependencies&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Jupyhai
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://get.valohai.com/jupyter-notebook-version-control"&gt;Version Control for Jupyter Notebooks&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://blog.valohai.com/topic/jupyter-notebook"&gt;Blogs on Jupyter Notebooks&lt;/a&gt; 

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://blog.valohai.com/valohai-jupyter-notebook-extension"&gt;Valohai's Jupyter Notebook Extension&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.valohai.com/asynchronous-workflows-in-data-science"&gt;Asynchronous Workflows in Data Science&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.valohai.com/automatic-version-control-meets-jupyter-notebooks"&gt;Automatic Version Control Meets Jupyter Notebooks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.valohai.com/leveling-up-your-ml-code-from-notebook"&gt;Level Up Your Machine Learning Code from Notebook to Production&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.valohai.com/run-jupyter-notebook-any-cloud-provider"&gt;Run Jupyter Notebook On Any Cloud Provider&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.valohai.com/jupyter-extension-updates"&gt;Updates for Valohai Powered Notebooks&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=IOv-yps387Q&amp;amp;list=PLMskd1Tlj2wZjKK15qNMjYeJZtapUK01h&amp;amp;index=5"&gt;Version control for Jupyter Notebooks (video)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.valohai.com/jupyter/index.html"&gt;Docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Apache OpenNLP
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/nlp-java-jvm-example"&gt;nlp-java-jvm-example&lt;/a&gt; GitHub project &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://opennlp.apache.org/"&gt;Apache OpenNLP&lt;/a&gt; | &lt;a href="https://github.com/apache/opennlp"&gt;GitHub&lt;/a&gt; | &lt;a href="https://opennlp.apache.org/mailing-lists.html"&gt;Mailing list&lt;/a&gt; | &lt;a href="https://twitter.com/@apacheopennlp"&gt;@apacheopennlp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Docs

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/"&gt;Documentation resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html"&gt;Manual&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/apidocs/opennlp-tools/index.html"&gt;Apache OpenNLP Tools Javadoc&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Download

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/download.html"&gt;Apache OpenNLP Jar/binary&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Model Zoo

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/models.html"&gt;Models page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.mirrorservice.org/sites/ftp.apache.org/opennlp/models/langdetect/1.8.3/langdetect-183.bin"&gt;Language Detect model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://opennlp.sourceforge.net/models-1.5/"&gt;Older models to support the examples in the docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Legends to support the examples in the docs

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.apache.org/dist/opennlp/models/langdetect/1.8.3/README.txt"&gt;List of languages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html"&gt;Penn Treebank tag set&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Find more in the &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/examples/nlp-java-jvm/images/java/opennlp/README.md#resources"&gt;Resources&lt;/a&gt; section in the &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/examples/nlp-java-jvm/images/java/opennlp/README.md#apache-opennlp--"&gt;README&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Other related posts
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://medium.com/@neomatrix369/how-to-do-deep-learning-for-java-on-the-valohai-platform-eec8ba9f71d8?source=---------7------------------"&gt;How to do Deep Learning for Java?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/@neomatrix369/applying-nlp-in-java-all-from-the-command-line-1225dd591e80?source=---------4------------------"&gt;NLP with DL4J in Java, all from the command-line&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/@neomatrix369/exploring-nlp-concepts-using-apache-opennlp-4d59c3cac8"&gt;Exploring NLP concepts using Apache OpenNLP&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer mainly in the Java/JVM space, currently strengthening teams and helping them accelerate when working with small teams and startups, as a freelance software engineer/data/ml engineer, &lt;a href="https://neomatrix369.wordpress.com/about"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>java</category>
      <category>graalvm</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Exploring NLP concepts using Apache OpenNLP</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Thu, 21 Nov 2019 00:58:46 +0000</pubDate>
      <link>https://dev.to/neomatrix369/exploring-nlp-concepts-using-apache-opennlp-5g3o</link>
      <guid>https://dev.to/neomatrix369/exploring-nlp-concepts-using-apache-opennlp-5g3o</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;After looking at a lot of &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/natural-language-processing/java-jvm.md#javajvm" rel="noopener noreferrer"&gt;Java/JVM based NLP libraries&lt;/a&gt; listed on &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/" rel="noopener noreferrer"&gt;Awesome AI/ML/DL&lt;/a&gt;, I decided to pick the &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; library. One reason is that another developer (who had looked at it previously) recommended it. Besides, it’s an &lt;a href="https://apache.org" rel="noopener noreferrer"&gt;Apache&lt;/a&gt; project, and they have been great supporters of F/OSS Java projects for the last two decades or so (see &lt;a href="https://en.wikipedia.org/wiki/The_Apache_Software_Foundation" rel="noopener noreferrer"&gt;Wikipedia&lt;/a&gt;). It also goes without saying that &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; is backed by the &lt;a href="https://www.apache.org/licenses/LICENSE-2.0.html" rel="noopener noreferrer"&gt;Apache 2.0 license&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In addition, this tweet from an NLP researcher added some more confidence to the matter:&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-1101890668283215872-165" src="https://platform.twitter.com/embed/Tweet.html?id=1101890668283215872"&gt;
&lt;/iframe&gt;

&lt;/p&gt;

&lt;p&gt;I’d like to say my personal experience with &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; has been similar so far, and I echo the praise for its simplicity and user-friendly API and design. You will see this for yourself as we explore it further.&lt;/p&gt;

&lt;h1&gt;
  
  
  Exploring NLP using Apache OpenNLP
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Java bindings
&lt;/h2&gt;

&lt;p&gt;We won’t be covering the Java API to the &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; tool in this post, but you can find a number of examples in &lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html" rel="noopener noreferrer"&gt;their docs&lt;/a&gt;. A bit later you will also need some of the resources listed in the &lt;strong&gt;Resources&lt;/strong&gt; section at the bottom of this post in order to progress further.&lt;/p&gt;

&lt;h2&gt;
  
  
  Command-line Interface
&lt;/h2&gt;

&lt;p&gt;I was drawn to the simplicity of the CLI, which just worked out-of-the-box: for instances where a model was needed, providing one was enough, and the tool would work without additional configuration.&lt;/p&gt;

&lt;p&gt;To make it easier to use and also not have to remember all the CLI parameters it supports I have put together some &lt;a href="https://github.com/valohai/nlp-java-jvm-example#scripts-provided" rel="noopener noreferrer"&gt;shell scripts&lt;/a&gt;. Have a look at the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; to get more insight into what they are and how to use them.&lt;/p&gt;
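&lt;p&gt;To give a flavour of the CLI itself, detecting sentences from the command line looks roughly like this (the model URL comes from the older models page linked in the Resources section; the OpenNLP install path is illustrative):&lt;/p&gt;

```shell
# fetch a pre-trained sentence-detection model (older models hosted on SourceForge)
curl -O http://opennlp.sourceforge.net/models-1.5/en-sent.bin

# pipe some text through the SentenceDetector tool
echo "Hello world. This is a test." | ./apache-opennlp-1.9.1/bin/opennlp SentenceDetector en-sent.bin
```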

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;You will need the following from this point forward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Git client 2.x or higher (an account on GitHub to fork the repo)&lt;/li&gt;
&lt;li&gt;Java 8 or higher (suggest install &lt;a href="https://www.graalvm.org" rel="noopener noreferrer"&gt;GraalVM CE&lt;/a&gt; 19.x or higher)&lt;/li&gt;
&lt;li&gt;Docker CE 19.x or higher and check it is running before going further&lt;/li&gt;
&lt;li&gt;Ability to run &lt;a href="https://github.com/valohai/nlp-java-jvm-example#scripts-provided" rel="noopener noreferrer"&gt;shell scripts&lt;/a&gt; from the CLI&lt;/li&gt;
&lt;li&gt;Understand reading/writing shell scripts (optional)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Note: At the time of writing, version 1.9.1 of&lt;/em&gt; &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;&lt;em&gt;Apache OpenNLP&lt;/em&gt;&lt;/a&gt; &lt;em&gt;was available.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We have put &lt;a href="https://github.com/valohai/nlp-java-jvm-example#scripts-provided" rel="noopener noreferrer"&gt;together scripts&lt;/a&gt; to make these steps easy for everyone:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ git clone git@github.com:valohai/nlp-java-jvm-example.git
    or 
    $ git clone https://github.com/valohai/nlp-java-jvm-example.git
    $ cd nlp-java-jvm-example
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will lead us to the folder with the following files in it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    LICENSE.txt      
    README.md        
    docker-runner.sh     &amp;lt;=== only this one concerns us at startup
    images
    shared               &amp;lt;=== created just when you run the container
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: a&lt;/em&gt; &lt;a href="https://hub.docker.com/r/neomatrix369/nlp-java" rel="noopener noreferrer"&gt;&lt;em&gt;docker image&lt;/em&gt;&lt;/a&gt; &lt;em&gt;has been provided so you can run a docker container containing all the tools you need to go further. You can see the&lt;/em&gt; &lt;code&gt;shared&lt;/code&gt; &lt;em&gt;folder has been created; it is a volume mounted into your container, but it’s actually a directory created on your local machine and mapped to this volume. So anything created or downloaded there will be available even after you exit your container!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Have a quick read of the main &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; file to get an idea of how to use the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/docker-runner.sh" rel="noopener noreferrer"&gt;docker-runner.sh shell script&lt;/a&gt;, and take a quick glance at the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#usage" rel="noopener noreferrer"&gt;Usage section&lt;/a&gt; as well. Thereafter, also take a look at the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README.md" rel="noopener noreferrer"&gt;Apache OpenNLP README&lt;/a&gt; file to see the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/tree/master/images/java/opennlp#exploring-nlp-concepts" rel="noopener noreferrer"&gt;usages&lt;/a&gt; of the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/tree/master/images/java/opennlp#scripts-provided" rel="noopener noreferrer"&gt;scripts&lt;/a&gt; provided therein.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run the NLP Java/JVM docker container
&lt;/h2&gt;

&lt;p&gt;At your local machine command prompt while at the root of the project, do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ ./docker-runner.sh --runContainer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is a chance you get this first, before you get the prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    Unable to find image 'neomatrix369/nlp-java:0.1' locally
    0.1: Pulling from neomatrix369/nlp-java
    f476d66f5408: ...
    .
    .
    .
    Digest: sha256:53b89b166d42ddfba808575731f0a7a02f06d7c47ee2bd3622e980540233dcff
    Status: Downloaded newer image for neomatrix369/nlp-java:0.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then you will be presented with prompt inside the container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    Running container neomatrix369/nlp-java:0.1

    ++ pwd
    + time docker run --rm --interactive --tty --workdir /home/nlp-java --env JDK_TO_USE= --env JAVA_OPTS=&amp;lt;--snipped&amp;gt;
    nlp-java@cf9d493f0722:~$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The container is packed with all the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/tree/master/images/java/opennlp" rel="noopener noreferrer"&gt;Apache OpenNLP scripts/tools&lt;/a&gt; you need to get started with exploring various NLP solutions. &lt;/p&gt;

&lt;h2&gt;
  
  
  Installing Apache OpenNLP inside the container
&lt;/h2&gt;

&lt;p&gt;Here is how we go further from here when you are inside the container, at the container command-prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ cd opennlp


    nlp-java@cf9d493f0722:~$ ./opennlp.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will see the &lt;code&gt;apache-opennlp-1.9.1-bin.tar.gz&lt;/code&gt; artifact being downloaded and expanded into the &lt;code&gt;shared&lt;/code&gt; folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 10.6M  100 10.6M    0     0  4225k      0  0:00:02  0:00:02 --:--:-- 4225k
    apache-opennlp-1.9.1/
    apache-opennlp-1.9.1/NOTICE
    apache-opennlp-1.9.1/LICENSE
    apache-opennlp-1.9.1/README.html
    .
    .
    .
    apache-opennlp-1.9.1/lib/jackson-jaxrs-json-provider-2.8.4.jar
    apache-opennlp-1.9.1/lib/jackson-module-jaxb-annotations-2.8.4.jar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Viewing and accessing the shared folder
&lt;/h2&gt;

&lt;p&gt;As soon as you run the container, a shared folder is created. It may be empty in the beginning, but as we go along we will find it filling up with different files and folders.&lt;/p&gt;

&lt;p&gt;It’s also where you will find the downloaded models and the &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; binary exploded into its own directory (by the name &lt;code&gt;apache-opennlp-1.9.1&lt;/code&gt;). &lt;/p&gt;

&lt;p&gt;You can access and see the contents of it from the command-prompt (outside the container) as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    ### Open a new command prompt
    $ cd nlp-java-jvm-example
    $ cd images/java/opennlp
    $ ls ..
    Dockerfile       corenlp.sh       opennlp          reverb.sh        word2vec.sh
    cogcomp-nlp.sh   mallet.sh        openregex.sh     shared
    common.sh        nlp4j.sh         rdrposttagger.sh version.txt

    $ ls ../shared
    apache-opennlp-1.9.1   en-ner-date.bin        en-sent.bin
    en-chunker.bin         en-parser-chunking.bin langdetect-183.bin

    ### In your case the contents of the shared folder may vary but the way to get to the folder is above.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From inside the container this is what you see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ls 
    cogcomp-nlp.sh   corenlp.sh  nlp4j.sh  openregex.sh        reverb.sh  word2vec.sh
    common.sh        mallet.sh   opennlp   rdrposttagger.sh        shared

    nlp-java@cf9d493f0722:~$ ls shared
    MyFirstJavaNotebook.ipynb      en-ner-date.bin           en-pos-maxent.bin          
    langdetect-183.bin
    apache-opennlp-1.9.1           en-ner-time.bin           en-pos-perceptron.bin  
    notebooks
    en-chunker.bin                 en-parser-chunking.bin    en-token.bin

    ### In your case the contents of the shared folder may vary but the way to get to the folder is above.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Performing NLP actions inside the container
&lt;/h2&gt;

&lt;p&gt;The good thing is that, without ever leaving your current folder, you can perform these NLP actions (check out the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/tree/master/images/java/opennlp#exploring-nlp-concepts" rel="noopener noreferrer"&gt;Exploring NLP Concepts&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/tree/master/images/java/opennlp#apache-opennlp--" rel="noopener noreferrer"&gt;README&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;Usage help for any of the scripts: at any point in time, you can query a script by calling it this way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./[script-name.sh] --help
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./detectLanguage.sh --help
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;gives us this usage text as output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;           Detecting language in a single-line text or article

           Usage: ./detectLanguage.sh --text [text]
                     --file [path/to/filename]
                     --help

           --text      plain text surrounded by quotes
           --file      name of the file containing text to pass as command arg
           --help      shows the script usage help text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Detecting language in a single-line text or article (see &lt;a href="https://www.apache.org/dist/opennlp/models/langdetect/1.8.3/README.txt" rel="noopener noreferrer"&gt;legend of language abbreviations&lt;/a&gt; used)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./detectLanguage.sh --text "This is an english sentence"

    eng This is an english sentence
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README-detecting-language.md" rel="noopener noreferrer"&gt;Detecting languages&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; for more examples and detailed output.&lt;/p&gt;
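&lt;p&gt;The first whitespace-separated token of each output line (&lt;code&gt;eng&lt;/code&gt; above) is the detected language code, followed by the echoed input. As a minimal sketch in Java (plain string handling, not the OpenNLP API; the class name is hypothetical), picking the code out of such a line could look like this:&lt;/p&gt;

```java
// Hypothetical helper: assumes each output line starts with the
// detected language code, followed by the echoed input text.
public class LangDetectOutputReader {
    public static String languageOf(String outputLine) {
        return outputLine.split(" ", 2)[0];
    }

    public static void main(String[] args) {
        System.out.println(languageOf("eng This is an english sentence")); // eng
    }
}
```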

&lt;ul&gt;
&lt;li&gt;Detecting sentences in a single line text or article.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./detectSentence.sh --text "This is an english sentence. And this is another sentence."


    This is an english sentence.
    And this is another sentence.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README-detecting-sentences.md" rel="noopener noreferrer"&gt;Detecting sentences&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; for more examples and detailed output.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Finding person name, organisation name, date, time, money, location, 
percentage information in a single line text or article.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./nameFinder.sh --method person  --text "My name is John"


    My name is &amp;lt;START:person&amp;gt; John &amp;lt;END&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README-finding-names.md" rel="noopener noreferrer"&gt;Finding names&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; for more examples and detailed output. There are a number of types of name finder examples in this section.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tokenize a line of text or an article into its smaller components (i.e. words, punctuation, numbers).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./tokenizer.sh --method simple --text "this-is-worth,tokenising.and,this,is,another,one"


    this - is - worth , tokenising . and , this , is , another , one
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README-tokenise.md" rel="noopener noreferrer"&gt;Tokenise&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; for more examples and detailed output. &lt;/p&gt;
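&lt;p&gt;To make the behaviour above concrete: the &lt;code&gt;simple&lt;/code&gt; tokenizer splits wherever the character class changes, so punctuation becomes its own token. A rough Java illustration of that rule (an approximation for intuition only, not the OpenNLP implementation):&lt;/p&gt;

```java
// Rough approximation (NOT the OpenNLP implementation): pad every
// non-alphanumeric character with spaces, then collapse the whitespace.
public class SimpleTokenizerSketch {
    public static String tokenize(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i != text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                out.append(c);
            } else {
                out.append(' ').append(c).append(' ');
            }
        }
        return out.toString().trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // Reproduces the output shown above
        System.out.println(tokenize("this-is-worth,tokenising.and,this,is,another,one"));
    }
}
```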

&lt;ul&gt;
&lt;li&gt;Parse a line of text or an article and identify groups of words or phrases that go together (see &lt;a href="https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html" rel="noopener noreferrer"&gt;Penn Treebank tag set&lt;/a&gt; for the legend of token types), also see &lt;a href="https://nlp.stanford.edu/software/lex-parser.shtml" rel="noopener noreferrer"&gt;https://nlp.stanford.edu/software/lex-parser.shtml&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./parser.sh --text "The quick brown fox jumps over the lazy dog ."


    (TOP (NP (NP (DT The) (JJ quick) (JJ brown) (NN fox) (NNS jumps)) (PP (IN over) (NP (DT the) (JJ lazy) (NN dog))) (. .)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README-parser.md" rel="noopener noreferrer"&gt;Parser&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; for more examples and detailed output.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tag parts of speech of each token in a line of text or an article (see &lt;a href="https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html" rel="noopener noreferrer"&gt;Penn Treebank tag set&lt;/a&gt; for the legend of token types), also see &lt;a href="https://nlp.stanford.edu/software/tagger.shtml" rel="noopener noreferrer"&gt;https://nlp.stanford.edu/software/tagger.shtml&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./posTagger.sh --method maxent --text "This is a simple text to tag"


    This_DT is_VBZ a_DT simple_JJ text_NN to_TO tag_NN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README-tag-parts-of-speech.md" rel="noopener noreferrer"&gt;Tag Parts of Speech&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; for more examples and detailed output.&lt;/p&gt;
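&lt;p&gt;Each token in the tagger’s output carries its Penn Treebank tag after an underscore. A minimal Java sketch of splitting those &lt;code&gt;token_TAG&lt;/code&gt; pairs back apart (plain string handling, not the OpenNLP API; the class and method names are hypothetical):&lt;/p&gt;

```java
// Hypothetical helper: splits the tagger's "token_TAG" pairs apart,
// using the last underscore so tokens containing '_' still work.
public class PosOutputReader {
    public static String tagOf(String pair) {
        return pair.substring(pair.lastIndexOf('_') + 1);
    }

    public static String tokenOf(String pair) {
        return pair.substring(0, pair.lastIndexOf('_'));
    }

    public static void main(String[] args) {
        String tagged = "This_DT is_VBZ a_DT simple_JJ text_NN to_TO tag_NN";
        StringBuilder tags = new StringBuilder();
        for (String pair : tagged.split(" ")) {
            if (tags.length() != 0) {
                tags.append(' ');
            }
            tags.append(tagOf(pair));
        }
        System.out.println(tags); // DT VBZ DT JJ NN TO NN
    }
}
```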

&lt;ul&gt;
&lt;li&gt;Text chunking: dividing a text or an article into syntactically correlated groups of words, like noun groups and verb groups. &lt;em&gt;Apply chunking on text already tagged by the PoS tagger&lt;/em&gt; (see the &lt;a href="https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html" rel="noopener noreferrer"&gt;Penn Treebank tag set&lt;/a&gt; for the legend of token types, also see &lt;a href="https://nlpforhackers.io/text-chunking/" rel="noopener noreferrer"&gt;https://nlpforhackers.io/text-chunking/&lt;/a&gt;).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@cf9d493f0722:~$ ./chunker.sh --text "This_DT is_VBZ a_DT simple_JJ text_NN to_TO tag_NN"


    [NP This_DT ] [VP is_VBZ ] [NP a_DT simple_JJ text_NN ] [PP to_TO ] [NP tag_NN]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README-chunking.md" rel="noopener noreferrer"&gt;Chunking&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example#nlp-javajvm-" rel="noopener noreferrer"&gt;README&lt;/a&gt; for more examples and detailed output.&lt;/p&gt;
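&lt;p&gt;Each bracketed group in the chunker’s output pairs a phrase type (&lt;code&gt;NP&lt;/code&gt;, &lt;code&gt;VP&lt;/code&gt;, &lt;code&gt;PP&lt;/code&gt;) with the tagged tokens it spans. A small Java sketch of reading those groups back out (plain regex, not the OpenNLP API; the class name is hypothetical):&lt;/p&gt;

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads the chunker's bracketed output,
// e.g. "[NP This_DT ] [VP is_VBZ ]", one chunk at a time.
public class ChunkOutputReader {
    private static final Pattern CHUNK = Pattern.compile("\\[(\\w+) ([^\\]]+?)\\s*\\]");

    // Returns "TYPE: words" pairs joined by "; "
    public static String describe(String output) {
        StringBuilder sb = new StringBuilder();
        Matcher m = CHUNK.matcher(output);
        while (m.find()) {
            if (sb.length() != 0) {
                sb.append("; ");
            }
            sb.append(m.group(1)).append(": ").append(m.group(2));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("[NP This_DT ] [VP is_VBZ ] [NP tag_NN]"));
        // NP: This_DT; VP: is_VBZ; NP: tag_NN
    }
}
```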

&lt;h2&gt;
  
  
  Exiting from the NLP Java/JVM docker container
&lt;/h2&gt;

&lt;p&gt;It is as simple as this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    nlp-java@f8562baf983d:~/opennlp$ exit
    exit
           67.41 real         0.06 user         0.05 sys
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And you are back to your local machine prompt.&lt;/p&gt;

&lt;h1&gt;
  
  
  Benchmarking
&lt;/h1&gt;

&lt;p&gt;One of the salient features of this tool is that it records and reports metrics of its actions at different execution points: the time taken at micro and macro levels. Here’s a sample output to illustrate this feature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    Loading Token Name Finder model ... done (1.200s)
    My name is &amp;lt;START:person&amp;gt; John &amp;lt;END&amp;gt;


    Average: 24.4 sent/s
    Total: 1 sent
    Runtime: 0.041s
    Execution time: 1.845 seconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the above, I have come across five metrics that are useful to me as a scientist, an analyst, or even an engineer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    Took 1.200s to load the model into memory

    (Average) Processed at an average rate of 24.4 sentences per second
    (Total) Processed 1 sentence
    (Runtime) It took 0.040983606557377 (0.041 seconds) to process this 1 sentence
    (Execution time) The whole process ran for 1.845 seconds (startup, processing sentence(s) and shutdown)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
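&lt;p&gt;As a quick sanity check, the throughput figure follows directly from the other two numbers: 1 sentence divided by a 0.041-second runtime gives roughly 24.4 sentences per second. A back-of-the-envelope sketch (my own illustration, not OpenNLP code):&lt;/p&gt;

```java
// Back-of-the-envelope: throughput is sentence count over processing runtime.
public class ThroughputMetrics {
    public static double sentencesPerSecond(int sentences, double runtimeSeconds) {
        return sentences / runtimeSeconds;
    }

    public static void main(String[] args) {
        // Figures from the sample output above: 1 sentence in 0.041 s
        System.out.printf("Average: %.1f sent/s%n", sentencesPerSecond(1, 0.041));
    }
}
```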



&lt;p&gt;Information like this is invaluable when it comes to making performance comparisons like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;between two or more models (load-time and run-time performance)&lt;/li&gt;
&lt;li&gt;between two or more environments or configurations&lt;/li&gt;
&lt;li&gt;between applications performing the same NLP action, put together using different tech stacks

&lt;ul&gt;
&lt;li&gt;also includes different languages&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;finding correlations between different corpora of text data processed (quantitative and qualitative comparisons)&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Empirical example
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://bit.ly/better-nlp-launch" rel="noopener noreferrer"&gt;&lt;em&gt;BetterNLP&lt;/em&gt;&lt;/a&gt; &lt;em&gt;library written in python is doing something similar, see&lt;/em&gt; &lt;a href="https://kaggle.com" rel="noopener noreferrer"&gt;&lt;em&gt;Kaggle&lt;/em&gt;&lt;/a&gt; &lt;em&gt;kernels:&lt;/em&gt; &lt;a href="https://www.kaggle.com/neomatrix369/better-nlp-class-notebook" rel="noopener noreferrer"&gt;&lt;em&gt;Better NLP Notebook&lt;/em&gt;&lt;/a&gt; &lt;em&gt;and&lt;/em&gt; &lt;a href="https://www.kaggle.com/neomatrix369/better-nlp-summarisers-notebook" rel="noopener noreferrer"&gt;&lt;em&gt;Better NLP Summarisers Notebook&lt;/em&gt;&lt;/a&gt; &lt;em&gt;(search for&lt;/em&gt; &lt;em&gt;time_in_secs&lt;/em&gt; &lt;em&gt;inside both the notebooks to see the metrics reported).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fks0v2tkihh3l994ppfuy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fks0v2tkihh3l994ppfuy.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fppch7gn9v6f8hu0oghb0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fppch7gn9v6f8hu0oghb0.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Personally, I find this quite inspiring; it also validates that this is a useful feature to offer to the end-user.&lt;/p&gt;
&lt;h1&gt;
  
  
  Other concepts, libraries and tools
&lt;/h1&gt;

&lt;p&gt;There are other &lt;a href="https://github.com/valohai/nlp-java-jvm-example#libraries--frameworks-provided" rel="noopener noreferrer"&gt;Java/JVM based NLP libraries&lt;/a&gt; mentioned in the &lt;strong&gt;Resources&lt;/strong&gt; section below; for brevity we won’t cover them here. The links provided lead to further information for your own pursuit.&lt;/p&gt;

&lt;p&gt;Within the &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; tool itself, we have only covered the command-line access and not the Java bindings. In addition, we haven’t gone through all of the tool’s NLP concepts or features; again, for brevity, we have covered only a handful of them. But the &lt;a href="https://opennlp.apache.org/docs/" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; and &lt;a href="https://github.com/apache/opennlp#useful-links" rel="noopener noreferrer"&gt;resources&lt;/a&gt; on the &lt;a href="https://github.com/apache/opennlp" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; should help in further exploration.&lt;/p&gt;

&lt;p&gt;You can also find out how to build the &lt;a href="https://hub.docker.com/r/neomatrix369/nlp-java" rel="noopener noreferrer"&gt;docker image&lt;/a&gt; for yourself, by examining the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/docker-runner.sh" rel="noopener noreferrer"&gt;docker-runner script&lt;/a&gt;.&lt;/p&gt;
&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fbsrca9odbf87282z68to.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fbsrca9odbf87282z68to.jpeg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After going through the above, we can sum up the &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; tool by weighing its pros and cons:&lt;/p&gt;

&lt;p&gt;Pros&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Its API is easy to use and understand&lt;/li&gt;
&lt;li&gt;Shallow learning curve and detailed documentation with lots of examples&lt;/li&gt;
&lt;li&gt;Covers a lot of NLP functionality; there’s more to explore in the &lt;a href="https://opennlp.apache.org/docs/" rel="noopener noreferrer"&gt;docs&lt;/a&gt; than we covered above&lt;/li&gt;
&lt;li&gt;Easy &lt;a href="https://github.com/valohai/nlp-java-jvm-example#scripts-provided" rel="noopener noreferrer"&gt;shell scripts&lt;/a&gt; and &lt;a href="https://github.com/valohai/nlp-java-jvm-example/tree/master/images/java/opennlp" rel="noopener noreferrer"&gt;Apache OpenNLP scripts&lt;/a&gt; have been provided to play with the tool&lt;/li&gt;
&lt;li&gt;Lots of resources available below to learn more about NLP (See the &lt;strong&gt;Resources&lt;/strong&gt; section below)&lt;/li&gt;
&lt;li&gt;Resources provided to quickly get started and explore the &lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; tool&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cons&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Looking at the &lt;a href="https://github.com/apache/opennlp" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; repo, it seems development is slow or has stagnated (there is a wide gap between the last two commits, i.e. May 2019 and Oct 15, 2019)&lt;/li&gt;
&lt;li&gt;A few models are missing when going through the examples in the &lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html" rel="noopener noreferrer"&gt;documentation (manual)&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;The current models provided may need further training as per your use case(s), see this tweet:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-1193626559439134721-28" src="https://platform.twitter.com/embed/Tweet.html?id=1193626559439134721"&gt;
&lt;/iframe&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Resources
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Apache OpenNLP
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/valohai/nlp-java-jvm-example" rel="noopener noreferrer"&gt;nlp-java-jvm-example&lt;/a&gt; GitHub project &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://opennlp.apache.org/" rel="noopener noreferrer"&gt;Apache OpenNLP&lt;/a&gt; | &lt;a href="https://github.com/apache/opennlp" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; | &lt;a href="https://opennlp.apache.org/mailing-lists.html" rel="noopener noreferrer"&gt;Mailing list&lt;/a&gt; | &lt;a href="https://twitter.com/@apacheopennlp" rel="noopener noreferrer"&gt;@apacheopennlp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Docs

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/" rel="noopener noreferrer"&gt;Documentation resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/manual/opennlp.html" rel="noopener noreferrer"&gt;Manual&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/docs/1.9.1/apidocs/opennlp-tools/index.html" rel="noopener noreferrer"&gt;Apache OpenNLP Tools Javadoc&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Download

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/download.html" rel="noopener noreferrer"&gt;Apache OpenNLP Jar/binary&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Model Zoo

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://opennlp.apache.org/models.html" rel="noopener noreferrer"&gt;Models page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.mirrorservice.org/sites/ftp.apache.org/opennlp/models/langdetect/1.8.3/langdetect-183.bin" rel="noopener noreferrer"&gt;Language Detect model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://opennlp.sourceforge.net/models-1.5/" rel="noopener noreferrer"&gt;Older models to support the examples in the docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;Legends to support the examples in the docs

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.apache.org/dist/opennlp/models/langdetect/1.8.3/README.txt" rel="noopener noreferrer"&gt;List of languages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html" rel="noopener noreferrer"&gt;Penn Treebank tag set&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Find more in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README.md#resources" rel="noopener noreferrer"&gt;Resources&lt;/a&gt; section in the &lt;a href="https://github.com/valohai/nlp-java-jvm-example/blob/master/images/java/opennlp/README.md#apache-opennlp--" rel="noopener noreferrer"&gt;README&lt;/a&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Other related posts
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blog.valohai.com/nlp_with_dl4j_in_java_all_from_the_command-line?from=3oxenia9mtr6" rel="noopener noreferrer"&gt;How to do Deep Learning for Java on the Valohai Platform?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.valohai.com/nlp_with_dl4j_in_java_all_from_the_command-line?from=3oxenia9mtr6" rel="noopener noreferrer"&gt;NLP with DL4J in Java, all from the command-line&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer, mainly in the Java/JVM space, currently strengthening teams and helping them accelerate as a freelance software/data/ML engineer when working with small teams and startups, &lt;a href="https://neomatrix369.wordpress.com/about" rel="noopener noreferrer"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369" rel="noopener noreferrer"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369" rel="noopener noreferrer"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Originally published at &lt;a href="https://blog.valohai.com/exploring-nlp-concepts-using-apache-opennlp-1?from=3oxenia9mtr6" rel="noopener noreferrer"&gt;https://blog.valohai.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>java</category>
      <category>graalvm</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>NLP with DL4J in Java, all from the command-line </title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Fri, 01 Nov 2019 15:07:35 +0000</pubDate>
      <link>https://dev.to/neomatrix369/nlp-with-dl4j-in-java-all-from-the-command-line-495b</link>
      <guid>https://dev.to/neomatrix369/nlp-with-dl4j-in-java-all-from-the-command-line-495b</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;We are all aware of Machine Learning tools and cloud services that work via the browser and give us an interface to perform our day-to-day data analysis, model training and evaluation, and other tasks with varying degrees of efficiency.&lt;/p&gt;

&lt;p&gt;But what would you do if you wanted to do these tasks on or from your local machine, or on infrastructure available within your organisation? And what if these resources do not meet the pre-requisites for decent end-to-end Data Science or Machine Learning tasks? That’s when access to a cloud-provider-agnostic deep learning management environment like &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt; can help. To add to this, we will be using the &lt;a href="https://valohai.com/pricing/"&gt;free-tier&lt;/a&gt;, which is accessible to one and all.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://app.valohai.com/accounts/signup/"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YAyFstO2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571927498550_Screen%2BShot%2B2019-10-22%2Bat%2B23.40.35.png" alt="(Create an account: https://app.valohai.com/accounts/signup/)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We will build a Java app, then train and evaluate an NLP model using it, all from the command-line interface with minimal interaction with the web interface; in essence, an end-to-end process all the way through training, saving and evaluating the NLP model. And we won’t need to worry much about setting up, configuring or managing any environments.&lt;/p&gt;

&lt;h1&gt;
  
  
  Purpose or Goals
&lt;/h1&gt;

&lt;p&gt;We will learn to do a number of things in this post, covering various levels of abstraction (in no particular order):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how to build and run an NLP model on the local machine?

&lt;ul&gt;
&lt;li&gt;there are far fewer examples out there on NLP; most cover classification, regression or computer vision (images, videos, etc.)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;how to build and run an NLP model on the cloud?&lt;/li&gt;
&lt;li&gt;how to build NLP Java apps that run on the CPU or GPU?

&lt;ul&gt;
&lt;li&gt;most examples out there are non-Java based; far fewer are Java-based&lt;/li&gt;
&lt;li&gt;most examples are CPU-based; far fewer target GPUs&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;how to perform the above depending on the presence or absence of resources (i.e. a GPU)?&lt;/li&gt;
&lt;li&gt;how to build a CUDA docker container for Java?&lt;/li&gt;
&lt;li&gt;how to do all the above all from the command-line?

&lt;ul&gt;
&lt;li&gt;via individual commands&lt;/li&gt;
&lt;li&gt;via shell scripts&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  What do we need and how?
&lt;/h1&gt;

&lt;p&gt;Here’s what we need to be able to get started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a Java app that builds and runs on any operating system

&lt;ul&gt;
&lt;li&gt;takes advantage of the GPU if available, otherwise uses the available CPU&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;CLI tools that allow connecting to remote cloud services&lt;/li&gt;
&lt;li&gt;shell scripts and code configuration to manage all of the above&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;em&gt;how&lt;/em&gt; part of this task is not hard once we have our goals and requirements clear; we will expand on it in the following sections.&lt;/p&gt;

&lt;h1&gt;
  
  
  NLP for Java, DL4J and Valohai
&lt;/h1&gt;

&lt;h2&gt;
  
  
  NLP for Java: DL4J
&lt;/h2&gt;

&lt;p&gt;We have all of the code and instructions needed to get started with this post &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example"&gt;captured for you on GitHub&lt;/a&gt;. Below are the steps to go through to get acquainted with the project:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick startup&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To get started quickly, we need to do just the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;open an account on &lt;a href="https://valohai.com"&gt;https://valohai.com&lt;/a&gt;, see &lt;a href="https://app.valohai.com/accounts/signup/"&gt;https://app.valohai.com/accounts/signup/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.valohai.com/tutorials/quick-start-cli.html?highlight=cli"&gt;install Valohai CLI&lt;/a&gt; on your local machine&lt;/li&gt;
&lt;li&gt;clone the repo &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/"&gt;https://github.com/valohai/dl4j-nlp-cuda-example/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ git clone https://github.com/valohai/dl4j-nlp-cuda-example/
    $ cd dl4j-nlp-cuda-example
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;create a &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt; project using the &lt;a href="https://docs.valohai.com/tutorials/quick-start-cli.html?highlight=cli"&gt;Valohai CLI&lt;/a&gt; tool, and give it a name
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh project create
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;link your &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt; project with the github repo &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/"&gt;https://github.com/valohai/dl4j-nlp-cuda-example/&lt;/a&gt; on the Repository tab of the Settings page (&lt;a href="https://app.valohai.com/p/%5Byour-user-id%5D/dl4j-nlp-cuda-example/settings/repository/"&gt;https://app.valohai.com/p/[your-user-id]/dl4j-nlp-cuda-example/settings/repository/&lt;/a&gt;)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh project open
    ### Go to the Settings page &amp;gt; Repository tab and update the git repo address with https://github.com/valohai/dl4j-nlp-cuda-example/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;update &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt; project with the latest commits from the git repo
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh project fetch
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now you’re ready to start performing Machine Learning tasks from the command-line.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;See&lt;/em&gt; &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/blob/master/README.md#advanced-installation-and-setup"&gt;&lt;em&gt;Advanced installation and setup section in the README&lt;/em&gt;&lt;/a&gt; &lt;em&gt;to find out what we need to install and configure on your system to run the app and experiments on your local machine or inside a Docker container — this is not necessary for the purpose of this post at the moment but you can try it out at a later time.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About&lt;/strong&gt; &lt;a href="https://docs.valohai.com/valohai-yaml/index.html"&gt;&lt;strong&gt;valohai.yaml&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You will have noticed there is a &lt;a href="https://docs.valohai.com/valohai-yaml/index.html"&gt;&lt;strong&gt;valohai.yaml&lt;/strong&gt;&lt;/a&gt; in the git repo; our &lt;a href="https://github.com/neomatrix369/dl4j-nlp-cuda-example/blob/master/valohai.yaml"&gt;valohai.yaml&lt;/a&gt; file defines a number of steps, listed here by the names we will use when running them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;build-cpu-gpu-uberjar:&lt;/strong&gt; build our uberjar (both CPU and GPU versions) on &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;train-cpu-linux:&lt;/strong&gt; run the NLP training using the CPU-version of uberjar on &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;train-gpu-linux:&lt;/strong&gt; run the NLP training using the GPU-version of uberjar on &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;evaluate-model-linux:&lt;/strong&gt; evaluate the trained NLP model from one of the above &lt;strong&gt;train-&lt;/strong&gt;* execution steps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;know-your-gpus:&lt;/strong&gt; run on any instance to gather GPU/Nvidia-related details about that instance; the same script also runs as part of the build and run steps above&lt;/li&gt;
&lt;/ul&gt;
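&lt;p&gt;As an illustration, a step in &lt;a href="https://docs.valohai.com/valohai-yaml/index.html"&gt;valohai.yaml&lt;/a&gt; takes roughly the following shape — this is a simplified sketch; the image name and command below are placeholders, and the real values are those in &lt;a href="https://github.com/neomatrix369/dl4j-nlp-cuda-example/blob/master/valohai.yaml"&gt;the repo’s valohai.yaml&lt;/a&gt;:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    - step:
        name: build-cpu-gpu-uberjar
        image: &amp;lt;docker-image&amp;gt;      # placeholder, e.g. a CUDA-enabled JDK image
        command:
          - ./buildUberJar.sh         # placeholder; see the repo for the actual commands
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;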

&lt;p&gt;&lt;strong&gt;Building the Java app from command line&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Assuming you are all set up, we will start by building the Java app on the &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt; platform from the command prompt, which is as simple as running the following command:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh exec run build-cpu-gpu-uberjar [--adhoc]

    ### Run `vh exec run --help` to find out more about this command
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You will be returned the execution &lt;code&gt;counter&lt;/code&gt;, which is nothing but a number:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    &amp;lt;--snipped--&amp;gt;
    😼  Success! Execution #1 created. See https://app.valohai.com/p/valohai/dl4j-nlp-cuda-example/execution/016dfef8-3a72-22d4-3d9b-7f992e6ac94d/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;use&lt;/em&gt; &lt;code&gt;--adhoc&lt;/code&gt; &lt;em&gt;only if you have not set up your&lt;/em&gt; &lt;a href="https://valohai.com"&gt;&lt;em&gt;Valohai&lt;/em&gt;&lt;/a&gt; &lt;em&gt;project with a git repo, or if you have unsaved commits and want to experiment before settling on the configuration.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;You can watch your execution by:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh watch 1

    ### the parameter 1 is the counter returned by the `vh exec run build-cpu-gpu-uberjar` operation above; it is the index referring to that execution run
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You will see either that we are waiting for an instance to be allocated, or console messages scrolling past once the execution has kicked off. The same can be seen via the web interface. &lt;em&gt;Note: instance availability depends on demand and on how much quota you have left; instances that have been used recently are more likely to be available again.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--MD1e2ZTB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571941040757_Screen%2BShot%2B2019-10-24%2Bat%2B19.16.50.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MD1e2ZTB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571941040757_Screen%2BShot%2B2019-10-24%2Bat%2B19.16.50.png" alt="Execution logs are seen in the console (exactly the same as in the web browser)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once the step has completed, it results in a few artifacts, called &lt;code&gt;outputs&lt;/code&gt; in &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt; terminology, which we can list with:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh outputs 1

    ### Run `vh outputs --help` to find out more about this command
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--N6lu-nkU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571941385194_Screen%2BShot%2B2019-10-24%2Bat%2B19.22.38.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--N6lu-nkU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571941385194_Screen%2BShot%2B2019-10-24%2Bat%2B19.22.38.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We will need the URLs that look like &lt;code&gt;datum://[….some sha like notation…]&lt;/code&gt; for our next steps. You can see there is a log file that captured the GPU-related information about the instance; you can download this file with:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh outputs --download . --filter *.logs 1

    ### Run `vh outputs --help` to find out more about this command
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Running the NLP training process for CPU/GPU from the command-line&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We will use the built artifacts, namely the uberjars for the CPU and GPU backends, to run our training process:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    ### Running the CPU uberjar
    $ vh exec run train-cpu-linux --cpu-linux-uberjar=datum://016dff00-43b7-b599-0e85-23a16749146e [--adhoc]

    ### Running the GPU uberjar
    $ vh exec run train-gpu-linux --gpu-linux-uberjar=datum://016dff00-2095-4df7-5d9e-02cb7cd009bb [--adhoc]

    ### Note: these datum:// links will vary in your case
    ### Run `vh exec run train-cpu-linux --help` to get more details on its usage
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;take a look at the&lt;/em&gt; &lt;a href="https://docs.valohai.com/valohai-cli/using-inputs.html"&gt;&lt;em&gt;Inputs with Valohai CLI&lt;/em&gt;&lt;/a&gt; &lt;em&gt;docs to see how to write commands like the above.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We can watch the process if we like, but training can be lengthy, so we are free to switch to another task in the meantime.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VclamsBF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571960121373_Screen%2BShot%2B2019-10-25%2Bat%2B00.35.02.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VclamsBF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571960121373_Screen%2BShot%2B2019-10-25%2Bat%2B00.35.02.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--aCsUAFLg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571957675623_Screen%2BShot%2B2019-10-24%2Bat%2B23.54.04.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--aCsUAFLg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571957675623_Screen%2BShot%2B2019-10-24%2Bat%2B23.54.04.png" alt="Above graph is a result of the ValohaiMetadataCreator class"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above execution runs finish by saving the model into the &lt;code&gt;${VH_OUTPUTS}&lt;/code&gt; folder so that it can be archived by &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt;. The model names are given a suffix to keep track of how they were produced.&lt;/p&gt;
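&lt;p&gt;The save-with-suffix idea can be sketched as a shell fragment like the one below — a minimal, illustrative sketch only: the model file name and the &lt;code&gt;BACKEND&lt;/code&gt; variable are hypothetical, and &lt;code&gt;${VH_OUTPUTS}&lt;/code&gt; is assumed to point at the outputs folder mentioned above (see the repo scripts for the actual logic):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    ### Illustrative only: append a suffix recording how the model was produced
    BACKEND=cpu                            # or gpu, depending on the uberjar used
    cp CnnSentenceClassificationModel.pb \
       "${VH_OUTPUTS}/CnnSentenceClassificationModel-${BACKEND}.pb"
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Anything copied into that folder is picked up by Valohai as an execution output, which is why the suffixed names later show up under &lt;code&gt;vh outputs&lt;/code&gt;.&lt;/p&gt;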

&lt;p&gt;At any point during our building, training or evaluation steps, we can stop an ongoing execution (queued or running) by just doing this:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    ➜  dl4j-nlp-cuda-example git:(master) ✗ vh stop 3
    (Resolved stop to execution stop.)
    ⌛   Stopping #3...
    =&amp;gt;   {"message":"Stop signal sent"}
    😁  Success! Done.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Downloading the saved model after successful training&lt;/strong&gt;&lt;br&gt;
We can query the &lt;code&gt;outputs&lt;/code&gt; of an execution by its counter number and download them using:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh outputs 2
    $ vh outputs --download . --filter Cnn*.pb 2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_bGFeeTz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571955501671_Screen%2BShot%2B2019-10-24%2Bat%2B23.17.31.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_bGFeeTz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571955501671_Screen%2BShot%2B2019-10-24%2Bat%2B23.17.31.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;See how you&lt;/em&gt; &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/blob/master/docs/running-local-machine.md#run-the-app-on-your-local-machine"&gt;&lt;em&gt;can evaluate the downloaded model on your local machine&lt;/em&gt;&lt;/a&gt;&lt;em&gt;, both the models created by the&lt;/em&gt; &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/blob/master/docs/running-local-machine.md#evaluating"&gt;&lt;em&gt;CPU&lt;/em&gt;&lt;/a&gt; &lt;em&gt;and&lt;/em&gt; &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/blob/master/docs/running-local-machine.md#evaluating-1"&gt;&lt;em&gt;GPU&lt;/em&gt;&lt;/a&gt; &lt;em&gt;based processes (their respective uberjars). Just pass the name of the downloaded model as a parameter to the&lt;/em&gt; &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/blob/master/runUberJar.sh"&gt;&lt;em&gt;runner shell script&lt;/em&gt;&lt;/a&gt; &lt;em&gt;provided.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evaluating the saved NLP model from a previous training execution&lt;/strong&gt; &lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    ### Running the CPU uberjar and evaluating the CPU-version of the model
    $ vh exec run evaluate-model-linux --uber-jar=datum://016dff00-43b7-b599-0e85-23a16749146e --model=datum://016dff2a-a0d4-3e63-d8da-6a61a96a7ba6 [--adhoc]

    ### Running the GPU uberjar and evaluating the GPU-version of the model
    $ vh exec run evaluate-model-linux --uber-jar=datum://016dff00-2095-4df7-5d9e-02cb7cd009bb --model=datum://016dff2a-a0d4-3e63-d8da-6a61a96a7ba6 [--adhoc]

    ### Note: these datum:// links will vary in your case
    ### Run `vh exec run evaluate-model-linux --help` to get more details on its usage
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;At the end of the evaluation we get the model evaluation metrics and a confusion matrix, produced by running a test set through the model:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--14LMzHT3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571961061273_Screen%2BShot%2B2019-10-25%2Bat%2B00.49.45.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--14LMzHT3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571961061273_Screen%2BShot%2B2019-10-25%2Bat%2B00.49.45.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;the&lt;/em&gt; &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/tree/master/src/main"&gt;&lt;em&gt;source code&lt;/em&gt;&lt;/a&gt; &lt;em&gt;contains ML and NLP-related explanations at various stages in the form of inline comments.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Capturing the environment information about Nvidia’s GPU and CUDA drivers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This step is unrelated to the process of building and running a Java app on the cloud and controlling it remotely via the client tool, but it is useful to know what kind of system we ran our training on, especially for the GPU aspect of the training:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ vh exec run know-your-gpus [--adhoc]

    ### Run `vh exec run --help` to get more details on its usage
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Keeping track of your experiments&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;While writing this post, I ran a number of experiments, and to keep track of the successful versus failed ones efficiently, I used &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt;’s version-control facilities, baked into its design, by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;filtering for executions&lt;/li&gt;
&lt;li&gt;searching for a specific execution by “token”&lt;/li&gt;
&lt;li&gt;re-running successful and failed executions&lt;/li&gt;
&lt;li&gt;confirming that executions succeeded, or failed for the right reasons&lt;/li&gt;
&lt;li&gt;also checking out &lt;a href="https://blog.valohai.com/blog-building-a-data-catalog-for-machine-learning"&gt;data catalogs&lt;/a&gt; and &lt;a href="https://blog.valohai.com/automatic-data-provenance-for-your-ml-pipeline"&gt;data provenance&lt;/a&gt; on the &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt; platform; below is an example from my project (look for the &lt;strong&gt;Trace&lt;/strong&gt; button):
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oVeFWUOC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571950223347_Screen%2BShot%2B2019-10-24%2Bat%2B21.49.09.png" alt=""&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JOowBMIX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571950121688_Screen%2BShot%2B2019-10-24%2Bat%2B21.48.12.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JOowBMIX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_7A0C4B33BFFD6A1DF72DAEBE226854A031D5BC8749179526C46059EBF72005BC_1571950121688_Screen%2BShot%2B2019-10-24%2Bat%2B21.48.12.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Comparing the CPU and GPU based processes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We could have compared the CPU- and GPU-based processes in terms of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;app-building performance&lt;/li&gt;
&lt;li&gt;model training speed &lt;/li&gt;
&lt;li&gt;model evaluation accuracy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We won’t cover these topics in this post, although you have access to the metrics you would need, should you wish to investigate further.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Necessary configuration file(s) and shell scripts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;All the necessary scripts can be found in the &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example"&gt;GitHub repo&lt;/a&gt;, under:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the root folder of the project&lt;/li&gt;
&lt;li&gt;docker folder&lt;/li&gt;
&lt;li&gt;resources-archive folder&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Please also have a look at the &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/blob/master/README.md"&gt;README.md&lt;/a&gt; file for further details on their usage and for additional information not mentioned in this post.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Valohai - Orchestration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As you may have noticed, all the above tasks were actually orchestrated via a few tools at different levels of abstraction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Docker to manage infrastructure- and platform-level configuration and version-control management&lt;/li&gt;
&lt;li&gt;Java to run our app on any platform of choice&lt;/li&gt;
&lt;li&gt;shell scripts to run both build and execution commands in a platform-agnostic manner, and to make exceptions for absent resources, i.e. no GPU access on MacOSX&lt;/li&gt;
&lt;li&gt;a client tool, i.e. the &lt;a href="https://docs.valohai.com/tutorials/quick-start-cli.html?highlight=cli"&gt;Valohai CLI&lt;/a&gt;, to connect with the remote cloud service, view and control executions, and download the end-results&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You are essentially orchestrating your tasks from a single point, making use of the available tools and technologies to perform various data and machine learning tasks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--z17IE4MA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/660/0%2AR45wYWZ6SPo56jNh" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--z17IE4MA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/660/0%2AR45wYWZ6SPo56jNh" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;We have seen that NLP is a resource-intensive task, and having the right methods and tools at hand certainly helps. Once again the &lt;a href="https://deeplearning4j.org"&gt;DeepLearning4J&lt;/a&gt; library from &lt;a href="https://Skymind.ai"&gt;Skymind&lt;/a&gt; and the &lt;a href="https://valohai.com"&gt;Valohai&lt;/a&gt; platform have served us well. Thanks to the creators of both platforms. In addition, this post highlights the benefits below (and more).&lt;/p&gt;

&lt;h2&gt;
  
  
  Benefits
&lt;/h2&gt;

&lt;p&gt;We gain a number of things by doing things the way we did above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;not having to worry about hardware and/or software configuration and version-control management — &lt;a href="https://hub.docker.com/r/neomatrix369/dl4j-nlp-cuda"&gt;docker containers&lt;/a&gt; FTW&lt;/li&gt;
&lt;li&gt;being able to run manual one-off building, training and evaluation tasks — the &lt;a href="https://docs.valohai.com/tutorials/quick-start-cli.html?highlight=cli"&gt;Valohai CLI&lt;/a&gt; tool FTW&lt;/li&gt;
&lt;li&gt;automating regularly used tasks so your team can run them on remote cloud infrastructure — &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/blob/master/valohai.yaml"&gt;infrastructure-as-code&lt;/a&gt; FTW&lt;/li&gt;
&lt;li&gt;overcoming the limitations of an old or slow machine, or a Mac with no access to the onboard GPU — &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example/tree/master/docker"&gt;CUDA-enabled docker image scripts&lt;/a&gt; FTW&lt;/li&gt;
&lt;li&gt;overcoming situations where not enough resources are available on the local or server infrastructure, while still being able to run experiments requiring high-throughput, performant environments — a cloud-provider-agnostic platform, i.e. &lt;a href="https://docs.valohai.com/valohai-cli/using-environments.html?highlight=environment"&gt;Valohai environments&lt;/a&gt;, FTW&lt;/li&gt;
&lt;li&gt;running tasks without having to wait for them to finish, and running multiple tasks — concurrently and in parallel on remote resources in a cost-effective manner — a cloud-provider-agnostic platform, i.e. the &lt;a href="https://docs.valohai.com/tutorials/quick-start-cli.html?highlight=cli"&gt;Valohai CLI&lt;/a&gt; tool, FTW&lt;/li&gt;
&lt;li&gt;remotely viewing and controlling both configuration and executions, and even downloading the end-results after a successful execution — a cloud-provider-agnostic platform, i.e. the &lt;a href="https://docs.valohai.com/tutorials/quick-start-cli.html?highlight=cli"&gt;Valohai CLI&lt;/a&gt; tool, FTW&lt;/li&gt;
&lt;li&gt;and many others you will spot yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Suggestions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;use the provided CUDA-enabled docker container:&lt;/strong&gt; I highly recommend not installing Nvidia drivers, CUDA or cuDNN on your local machine (Linux or Windows-based) — shelve that for later experimentation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;use the provided shell scripts and configuration files:&lt;/strong&gt; try not to run CLI commands manually; instead, use shell scripts to automate repeated tasks — the provided examples are a good starting point to take further&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;learn as much as you can:&lt;/strong&gt; about GPUs, CUDA and cuDNN from the resources provided, and look for more (see the &lt;strong&gt;Resources&lt;/strong&gt; section at the bottom of the post)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;use version control and infrastructure-as-code systems:&lt;/strong&gt; git and the &lt;a href="https://docs.valohai.com/valohai-yaml/index.html"&gt;&lt;strong&gt;valohai.yaml&lt;/strong&gt;&lt;/a&gt; are great examples of this&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a software engineer, ML engineer and data scientist, I felt very productive, and my time and resources were used effectively while doing all of the above. Above all, I can share this work with others, and everyone can reuse it directly: just &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example"&gt;clone the repo&lt;/a&gt; and &lt;em&gt;off you go&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we didn’t cover and is potentially a great topic to talk about, is the&lt;/strong&gt; &lt;a href="https://docs.valohai.com/core-concepts/pipelines.html?highlight=pipeline"&gt;&lt;strong&gt;Valohai Pipelines&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;in some future post!&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Resources
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example"&gt;dl4j-nlp-cuda-example&lt;/a&gt; project on GitHub&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://hub.docker.com/r/neomatrix369/dl4j-nlp-cuda"&gt;CUDA enable docker container&lt;/a&gt; on &lt;a href="https://hub.docker.com"&gt;Docker Hub&lt;/a&gt; (use the latest tag: &lt;a href="https://hub.docker.com/layers/neomatrix369/dl4j-nlp-cuda/v0.5/images/sha256-fcfcc2dcdf00839d918a0c475c39733d777181abb1a3c34d8dea68339369b137"&gt;v0.5&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU, Nvidia, CUDA and cuDNN&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;See the &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example#resources"&gt;Resources section&lt;/a&gt; on the &lt;a href="https://github.com/valohai/dl4j-nlp-cuda-example"&gt;github repo&lt;/a&gt;, under &lt;strong&gt;GPU, Nvidia, CUDA and cuDNN&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/"&gt;Awesome AI/ML/DL resources&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/README-details.md#java"&gt;Java AI/ML/DL resources&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/README-details.md#deep-learning"&gt;Deep Learning and DL4J Resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Awesome AI/ML/DL:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/natural-language-processing#natural-language-processing-nlp"&gt;NLP resources&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DL4J NLP resources&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-overview"&gt;Language processing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-config-gpu-cpu"&gt;ND4J backends for GPUs and CPUs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-vocabulary-cache"&gt;How the Vocab Cache Works&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-word2vec"&gt;Word2Vec, Doc2vec &amp;amp; GloVe: Neural Word Embeddings for Natural Language Processing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-doc2vec"&gt;Doc2Vec, or Paragraph Vectors, in Deeplearning4j&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-sentence-iterator"&gt;Sentence iterator&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-tokenization"&gt;What is Tokenization?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Examples

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/eclipse/deeplearning4j-examples/tree/master/dl4j-examples"&gt;https://github.com/eclipse/deeplearning4j-examples/tree/master/dl4j-examples&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/eclipse/deeplearning4j/tree/master/deeplearning4j/deeplearning4j-nlp-parent"&gt;https://github.com/eclipse/deeplearning4j/tree/master/deeplearning4j/deeplearning4j-nlp-parent&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Valohai resources&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.valohai.com/"&gt;valohai&lt;/a&gt; | &lt;a href="https://docs.valohai.com/"&gt;docs&lt;/a&gt; | &lt;a href="https://blogs.valohai.com/"&gt;blogs&lt;/a&gt; | &lt;a href="https://github.com/valohai"&gt;GitHub&lt;/a&gt; | &lt;a href="https://www.youtube.com/channel/UCiR8Fpv6jRNphaZ99PnIuFg/videos"&gt;Videos&lt;/a&gt; | &lt;a href="https://valohai.com/showcase/"&gt;Showcase&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/data/about-Valohai.md#valohai"&gt;About valohai&lt;/a&gt; | &lt;a href="http://community-slack.valohai.com/"&gt;Slack&lt;/a&gt; | &lt;a href="https://twitter.com/@valohaiai"&gt;@valohaiai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.valohai.com/search.html?q=%3Cany+topic%3E"&gt;Search for any topic in the Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Blog posts on how to use the Valohai CLI tool: &lt;a href="https://blog.valohai.com/from-zero-to-hero-with-valohai-cli"&gt;[1]&lt;/a&gt; | &lt;a href="https://blog.valohai.com/from-zero-to-hero-with-valohai-part-2"&gt;[2]&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.valohai.com/guides/build-docker-image.html"&gt;Custom Docker Images&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/neomatrix369/awesome-graal"&gt;Awesome Graal&lt;/a&gt; |  &lt;a href="http://graalvm.org"&gt;graalvm.org&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Other related posts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blog.valohai.com/how-to-do-deep-learning-for-java-on-the-valohai-platform"&gt;How to do Deep Learning for Java on the Valohai Platform?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Blog posts on how to use the Valohai CLI tool: &lt;a href="https://blog.valohai.com/from-zero-to-hero-with-valohai-cli"&gt;[1]&lt;/a&gt; | &lt;a href="https://blog.valohai.com/from-zero-to-hero-with-valohai-part-2"&gt;[2]&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer mainly in the Java/JVM space, currently strengthening teams and helping them accelerate when working with small teams and startups, as a freelance software engineer/data/ml engineer, &lt;a href="https://neomatrix369.wordpress.com/about"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Originally published at &lt;a href="https://blog.valohai.com/nlp_with_dl4j_in_java_all_from_the_command-line?from=3oxenia9mtr6"&gt;https://blog.valohai.com&lt;/a&gt;.
&lt;/h3&gt;

</description>
      <category>nlp</category>
      <category>java</category>
      <category>graalvm</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Running Apache Zeppelin on the cloud</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Thu, 17 Oct 2019 19:45:43 +0000</pubDate>
      <link>https://dev.to/neomatrix369/running-apache-zeppelin-on-the-cloud-30ke</link>
      <guid>https://dev.to/neomatrix369/running-apache-zeppelin-on-the-cloud-30ke</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iCoKj6m5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1571341116251_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iCoKj6m5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1571341116251_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This post is a composition of the practical parts of two earlier posts, one written late last year and the other a couple of months ago: &lt;a href="https://medium.com/@neomatrix369/apache-zeppelin-stairway-to-notes-haven-28ec413a185a?source=---------6------------------"&gt;Apache Zeppelin: stairway to notes* haven!&lt;/a&gt; (late Dec 2018) and &lt;a href="https://medium.com/@neomatrix369/running-your-jupyter-notebooks-on-the-cloud-ed970326649f?source=---------4------------------"&gt;Running your JuPyTer notebooks on Oracle Cloud Infrastructure&lt;/a&gt; (early September 2019). This time, though, we are going to run &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; on the &lt;a href="http://cloud.oracle.com/"&gt;Oracle Cloud Infrastructure&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We will follow a structure similar to the previous posts for ease of reading and understanding.&lt;/p&gt;

&lt;p&gt;Also, for brevity, we will use the term &lt;a href="http://cloud.oracle.com/"&gt;OCI&lt;/a&gt; when referring to &lt;a href="http://cloud.oracle.com/"&gt;Oracle Cloud Infrastructure&lt;/a&gt; throughout the rest of the post. In some cases, I have hyperlinked to repeated steps (with a bit of narration) to redirect the reader, and in other cases I have spelled those steps out in this post, adapted to the current theme, i.e. &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; on &lt;a href="http://cloud.oracle.com/"&gt;OCI&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Please do not literally use information like DNS or IP addresses or any other details directly from the screenshots or text areas in this post and from the linked ones. These details may differ in your case so please try to follow the ideas and principles behind the process. You should use the details that show up on your console or browser interface when you are setting up at your end, as instructed in the post.&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;OCI: get started quickly&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;To get started we need an account on &lt;a href="http://cloud.oracle.com/"&gt;OCI&lt;/a&gt;, which is super simple to set up. I suggest reading the below sections from the post &lt;a href="https://medium.com/@neomatrix369/running-your-jupyter-notebooks-on-the-cloud-ed970326649f?source=---------4------------------"&gt;Running your JuPyTer notebooks on Oracle Cloud Infrastructure&lt;/a&gt; (screenshots are provided to help navigate through the steps):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Signing up&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setup&lt;/strong&gt;
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qXu-lZ6l--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570975946919_image.png" alt=""&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--DJRy2SkW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570975927642_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--DJRy2SkW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570975927642_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4n7R1FkL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570975969054_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4n7R1FkL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570975969054_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GxVnhd5D--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570976008247_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GxVnhd5D--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570976008247_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---FlbAV0---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570976025583_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---FlbAV0---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://paper-attachments.dropbox.com/s_F624BCB7941668D432BA65A86E62FE616B454A753DB2E37C1831424457D6A796_1570976025583_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Actions to get on the cloud&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And we stop the moment we reach the end of the &lt;strong&gt;Actions to get on the cloud&lt;/strong&gt; section. Please ensure you install everything along the way, so that you have to hand the tools you need for the rest of the post. Skip anything that appears &lt;em&gt;Jupyter notebooks&lt;/em&gt; related, as we will be setting up &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; instead.&lt;/p&gt;

&lt;p&gt;When we have finished the above we are at a good point, as we will have a VM instance accessible from both the browser and the CLI. We can then carry out the further steps to install &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; and kick off. Make a note of the &lt;strong&gt;Public IP Address&lt;/strong&gt; of the VM instance created above before proceeding; in my case it is &lt;strong&gt;132.145.60.249&lt;/strong&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Zeppelin: get started quickly&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;If you already know Zeppelin and feel at home with it, and are confident after skimming through the post &lt;a href="https://medium.com/@neomatrix369/apache-zeppelin-stairway-to-notes-haven-28ec413a185a?source=---------6------------------"&gt;Apache Zeppelin: stairway to notes* haven!&lt;/a&gt;, you can go directly to the next section of this post, i.e. &lt;strong&gt;Running Apache Zeppelin&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But if you haven’t used &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; in the past, I suggest slowly going through the post &lt;a href="https://medium.com/@neomatrix369/apache-zeppelin-stairway-to-notes-haven-28ec413a185a?source=---------6------------------"&gt;Apache Zeppelin: stairway to notes* haven!&lt;/a&gt; to gain familiarity, and getting it to work on your local machine. We will be doing further steps to make it work on the cloud, i.e. &lt;a href="http://cloud.oracle.com/"&gt;OCI&lt;/a&gt;. Just for your information, when that post was written we used Apache Zeppelin 0.8.0 and Spark 2.4.3 and ran them on top of GraalVM 1.0.0-rc10, as bundled in the docker image &lt;a href="https://hub.docker.com/layers/neomatrix369/zeppelin/0.1/images/sha256-0fb412fe5ba7cf69e4cd681786b92cd87f3c73019850238609dce145b0786d51"&gt;neomatrix369/zeppelin:0.1&lt;/a&gt;; since then things have moved on. For this post, we have decided to use more recent versions, i.e. Apache Zeppelin 0.8.1, Spark 2.4.4 and GraalVM 19.2.0.1, which you can access via the docker image &lt;a href="https://hub.docker.com/layers/neomatrix369/zeppelin/0.2/images/sha256-d9b7f16c514ddcfe01d758f84251b5b82b1e06ffc4bd58aeed4582545ea5a00f"&gt;neomatrix369/zeppelin:0.2&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;I have steered away from Apache Zeppelin 0.8.0 and 0.8.2 for this post as they introduce changes that cause regressions in our workflow; for all intents and purposes of this post we can use Apache Zeppelin 0.8.1. Version 0.8.0 produces&lt;/em&gt; &lt;a href="https://github.com/apache/zeppelin/pull/3206"&gt;&lt;em&gt;this error&lt;/em&gt;&lt;/a&gt; &lt;em&gt;(resolved in version 0.8.1) when we try to run a paragraph with Scala code.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Also, as you may have noticed, the Zeppelin world calls &lt;em&gt;notebooks&lt;/em&gt; &lt;em&gt;notes&lt;/em&gt;, &lt;em&gt;cells&lt;/em&gt; &lt;em&gt;paragraphs&lt;/em&gt;, and so on.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Running Apache Zeppelin&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;We will now run &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; directly on the cloud, since we have already had the experience of running it on the local machine. For some of you this might be a no-brainer, as the steps are few and it’s pretty simple to go about, since we have already laid the ground for it. Just to be clear, the same instructions apply to both bare-metal and VM instances.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Logging into the VM instance&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;You can then &lt;strong&gt;&lt;em&gt;ssh&lt;/em&gt;&lt;/strong&gt; into the box (see &lt;a href="https://docs.cloud.oracle.com/iaas/Content/Compute/Tasks/accessinginstance.htm?Highlight=ssh"&gt;docs on connecting via ssh&lt;/a&gt;) and proceed with rest of the actions below:&lt;/p&gt;

&lt;p&gt;### Oracle Linux and CentOS images, user name: opc&lt;br&gt;
    ### the Ubuntu image, user name: ubuntu&lt;br&gt;
    $ ssh -i ~/.ssh/id_rsa ubuntu@132.145.60.249&lt;br&gt;
    or&lt;br&gt;
    $ ssh ubuntu@132.145.60.249&lt;/p&gt;

&lt;p&gt;and we get the next prompt, to which we answer ‘yes’:&lt;/p&gt;

&lt;p&gt;The authenticity of host '132.145.60.249 (132.145.60.249)' can't be established. &lt;br&gt;
    ECDSA key fingerprint is SHA256:USafjsySmPItXTdBOsQyiYbEdiFSa7Cs1so+9EnKC4M. &lt;br&gt;
    Are you sure you want to continue connecting (yes/no)? yes&lt;/p&gt;

&lt;p&gt;which is followed by this console — a sign that you are now logged into the VM:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9lRB0tIb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/0%2A1EemxavLsIFzN8xH.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9lRB0tIb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/0%2A1EemxavLsIFzN8xH.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;
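&lt;p&gt;If you log in often, an entry in your local &lt;em&gt;~/.ssh/config&lt;/em&gt; saves retyping the user and key path each time. A minimal sketch; the host alias is an assumption, and the IP is the one used in this post:&lt;/p&gt;

```text
Host oci-zeppelin
    HostName 132.145.60.249
    User ubuntu
    IdentityFile ~/.ssh/id_rsa
```

&lt;p&gt;With this in place, &lt;em&gt;ssh oci-zeppelin&lt;/em&gt; is all that is needed.&lt;/p&gt;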

&lt;h1&gt;
  
  
  &lt;strong&gt;Cloning the git repo&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Now that we are logged in, and given that all the scripts we need are at &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/apache-zeppelin"&gt;https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/apache-zeppelin&lt;/a&gt;, we can clone the repository and run them.&lt;/p&gt;

&lt;p&gt;If you haven’t done this yet, please run the below commands:&lt;/p&gt;

&lt;p&gt;$ git clone &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/"&gt;https://github.com/neomatrix369/awesome-ai-ml-dl/&lt;/a&gt;&lt;br&gt;
    $ cd awesome-ai-ml-dl/examples/apache-zeppelin&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Installing Docker&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;The Docker docs for installing Docker on Ubuntu can be found on &lt;a href="https://docs.docker.com/install/linux/docker-ce/ubuntu/"&gt;the Docker site&lt;/a&gt;. A bash script has also been provided to speed up the process, although the target OS here is Ubuntu 16.04 or higher:&lt;/p&gt;

&lt;p&gt;$ ./installDocker.sh&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;in case you choose another OS image during VM creation, you will have to install Docker manually with the docs from Docker or modify the above script to make it work for the target OS.&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Building Apache Zeppelin Docker image (optional)&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Achtung! Really sorry, this process can take a while, so please go away and make yourself and others a coffee, read&lt;/strong&gt; &lt;a href="https://www.xkcd.com/"&gt;&lt;strong&gt;xkcd&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;, watch a comedy, and then come back in 15–20 minutes or so (depending on your network bandwidth)!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hence we can choose either to continue with the build or to skip to the next step and use an already-published version of the docker image.&lt;/p&gt;

&lt;p&gt;We can start by running the build script to build our &lt;a href="https://hub.docker.com/layers/neomatrix369/zeppelin/0.2/images/sha256-d9b7f16c514ddcfe01d758f84251b5b82b1e06ffc4bd58aeed4582545ea5a00f"&gt;latest Zeppelin Docker&lt;/a&gt; container:&lt;/p&gt;

&lt;p&gt;$ DOCKER_USER_NAME= IMAGE_VERSION=0.2 ./buildZeppelinDockerImage.sh&lt;/p&gt;

&lt;p&gt;and we see these messages flying by:&lt;/p&gt;

&lt;p&gt;Sending build context to Docker daemon  34.82kB&lt;br&gt;
    Step 1/21 : ARG ZEPPELIN_VERSION&lt;br&gt;
    Step 2/21 : FROM apache/zeppelin:${ZEPPELIN_VERSION}&lt;br&gt;
    ---&amp;gt; 353d7641c769&lt;br&gt;
    Step 3/21 : ARG SPARK_VERSION&lt;br&gt;
    ---&amp;gt; Using cache&lt;br&gt;
    ---&amp;gt; 2ca1b6703dd7&lt;br&gt;
    Step 4/21 : ENV SPARK_VERSION=${SPARK_VERSION:-2.4.3}&lt;br&gt;
    ---&amp;gt; Using cache&lt;br&gt;
    ---&amp;gt; f507d31d0aca&lt;br&gt;
    Step 5/21 : RUN echo "$LOG_TAG Download Spark binary" &amp;amp;&amp;amp;     wget -O /tmp/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz &lt;a href="http://archive.apache.org/dist/spark/spark-%24%7BSPARK_VERSION%7D/spark-%24%7BSPARK_VERSION%7D-bin-hadoop2.7.tgz"&gt;http://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz&lt;/a&gt;&lt;br&gt;
    ---&amp;gt; Running in c94542e7eb00&lt;br&gt;
    [ZEPPELIN_0.8.1]: Download Spark binary&lt;br&gt;
    --2019-10-13 19:55:16--  &lt;a href="http://archive.apache.org/dist/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz"&gt;http://archive.apache.org/dist/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz&lt;/a&gt;&lt;br&gt;
    Saving to: ‘/tmp/spark-2.4.4-bin-hadoop2.7.tgz’&lt;br&gt;
    [--snipped--]&lt;br&gt;
    213350K .......... .......... .......... .......... .......... 94% 51.4K 3m0s&lt;br&gt;
    213400K .......... .......... .......... .......... .......... 94% 88.1K 2m59s&lt;br&gt;
    213450K .......... .......... .......... .......... .......... 95% 58.7K 2m59s&lt;br&gt;
    213500K .......... .......... .......... .......... .......... 95% 45.5K 2m58s&lt;br&gt;
    213550K .......... .......... .......... .......... .......... 95% 4.40M 2m57s&lt;br&gt;
    213600K .......... .......... .......... .......... .......... 95% 83.8K 2m56s&lt;br&gt;
    213650K .......... .......... .......... .......... .......... 95% 91.9K 2m55s&lt;br&gt;
    213700K .......... .......... .......... .......... .......... 95% 67.2K 2m55s&lt;br&gt;
    213750K .......... .......... .......... .......... .......... 95%  166K 2m54s&lt;br&gt;
    213800K .......... .......... .......... .......... .......... 95% 79.8K 2m53s&lt;br&gt;
    [--snipped--]&lt;br&gt;
    Step 21/21 : CMD ["bin/zeppelin.sh"]&lt;br&gt;
    ---&amp;gt; Running in 843684f60302&lt;br&gt;
    Removing intermediate container 843684f60302&lt;br&gt;
    ---&amp;gt; 5833f13ff7c7&lt;br&gt;
    Successfully built 5833f13ff7c7&lt;br&gt;
    Successfully tagged neomatrix369/zeppelin:0.2&lt;/p&gt;
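&lt;p&gt;Step 5 of the build log above shows how the Dockerfile derives the Spark download URL from the SPARK_VERSION build argument. A small sketch of that substitution:&lt;/p&gt;

```shell
# Reproduce the URL substitution performed in Step 5 of the Dockerfile.
SPARK_VERSION=2.4.4
SPARK_URL="http://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz"
echo "${SPARK_URL}"
```

&lt;p&gt;This is why the log shows spark-2.4.4-bin-hadoop2.7.tgz being fetched when the build arg is set to 2.4.4.&lt;/p&gt;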

&lt;p&gt;As you may have noticed, we have a few changes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;amendments made to the Zeppelin-Dockerfile&lt;/li&gt;
&lt;li&gt;the build and run scripts also look different (buildZeppelinDockerImage.sh and runZeppelinDockerContainer.sh)&lt;/li&gt;
&lt;li&gt;and we are now using version 0.2 (see the CLI usage examples in this post)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hope all of this starts to make sense (I gave hints earlier when I said things have moved on…).&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Pushing the Docker Image to Docker Hub (optional)&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Once we have successfully built the docker image containing Apache Zeppelin in the above step, we can easily upload the image from our local repository to the remote one via:&lt;/p&gt;

&lt;p&gt;$ DOCKER_USER_NAME= IMAGE_VERSION=0.2 ./push-apache-zeppelin-docker-image-to-hub.sh&lt;/p&gt;

&lt;p&gt;Take note that it expects a couple of things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an account on Docker Hub (i.e. &lt;a href="https://hub.docker.com/u/neomatrix369"&gt;neomatrix369&lt;/a&gt;) — &lt;em&gt;of course, your own account&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;you are logged into your Docker Hub account locally&lt;/li&gt;
&lt;li&gt;you have set the DOCKER_USER_NAME environment variable to your Docker Hub username&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Otherwise you will get error messages; hopefully they will guide you through until the upload succeeds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;in our case, we have access to the docker image on Docker hub, see&lt;/em&gt; &lt;a href="https://hub.docker.com/r/neomatrix369/zeppelin"&gt;&lt;em&gt;neomatrix369/zeppelin on Docker Hub&lt;/em&gt;&lt;/a&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Running Apache Zeppelin from the Docker Image&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;We will download the already created images hosted on Docker Hub:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Version 0.1 (Apache Zeppelin 0.8.0, Spark 2.4.3, GraalVM 1.0.0-rc10) — older image&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;$ docker pull neomatrix369/zeppelin:0.1&lt;br&gt;
    $ ./runZeppelinDockerContainer.sh&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;or&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Version 0.2 (Apache Zeppelin 0.8.1, Spark 2.4.4, GraalVM 19.2.0.1) — new image&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;$ docker pull neomatrix369/zeppelin:0.2&lt;br&gt;
    $ IMAGE_VERSION=0.2 ./runZeppelinDockerContainer.sh&lt;/p&gt;

&lt;p&gt;the above commands should result in the output:&lt;/p&gt;

&lt;p&gt;ubuntu@instance-20191014-0101:~/awesome-ai-ml-dl/examples/apache-zeppelin$ IMAGE_VERSION=0.2 ./runZeppelinDockerContainer.sh&lt;br&gt;
    Please wait till the log messages stop moving, it will be a sign that the service is ready! (about a minute or so)&lt;br&gt;
    Once the service is ready, go to &lt;a href="http://localhost:8080"&gt;http://localhost:8080&lt;/a&gt; to open the Apache Zeppelin homepage&lt;br&gt;
    Pid dir doesn't exist, create /zeppelin/run&lt;br&gt;
    OpenJDK GraalVM CE 19.0.0 warning: ignoring option MaxPermSize=512m; support was removed in 8.0&lt;br&gt;
    SLF4J: Class path contains multiple SLF4J bindings.&lt;br&gt;
    SLF4J: Found binding in [jar:file:/zeppelin/lib/interpreter/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]&lt;br&gt;
    SLF4J: Found binding in [jar:file:/zeppelin/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]&lt;br&gt;
    SLF4J: See &lt;a href="http://www.slf4j.org/codes.html#multiple_bindings"&gt;http://www.slf4j.org/codes.html#multiple_bindings&lt;/a&gt; for an explanation.&lt;br&gt;
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]&lt;br&gt;
    [---snipped---]&lt;br&gt;
    WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.CredentialRestApi.getCredentials(java.lang.String) throws java.io.IOException,java.lang.IllegalArgumentException, should not consume any entity.&lt;br&gt;
    WARNING: The (sub)resource method createNote in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.&lt;br&gt;
    WARNING: The (sub)resource method getNoteList in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.&lt;/p&gt;
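&lt;p&gt;For reference, here is a rough sketch of the kind of command the run script issues. The exact flags live in runZeppelinDockerContainer.sh in the repo; the port mapping and notebook volume mount below are assumptions based on this post:&lt;/p&gt;

```shell
# Compose the docker run command the script is assumed to issue:
# publish Zeppelin's port 8080 and mount the local notebook folder.
DOCKER_USER_NAME=${DOCKER_USER_NAME:-neomatrix369}
IMAGE_VERSION=${IMAGE_VERSION:-0.2}
RUN_CMD="docker run -it -p 8080:8080 -v ${PWD}/notebook:/zeppelin/notebook ${DOCKER_USER_NAME}/zeppelin:${IMAGE_VERSION}"
echo "${RUN_CMD}"
```

&lt;p&gt;The -p 8080:8080 mapping is what makes the service reachable on port 8080, which becomes important in the next section.&lt;/p&gt;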

&lt;h2&gt;
  
  
  &lt;strong&gt;Opening the Apache Zeppelin notes in your browser&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Go to the browser and try to open this:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://132.145.60.249:8080"&gt;http://132.145.60.249:8080&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But this won’t work because we haven’t opened up the port 8080 from within our cloud network (via &lt;strong&gt;Ingress Rules,&lt;/strong&gt; read more about it &lt;a href="https://docs.cloud.oracle.com/iaas/Content/Network/Concepts/securityrules.htm?Highlight=egress"&gt;here&lt;/a&gt;) to the outside world (public):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xD0zijnm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1408/0%2An8zo_AES9IwHJGOh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xD0zijnm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1408/0%2An8zo_AES9IwHJGOh.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We need to add the above entry to the &lt;strong&gt;Ingress Rules&lt;/strong&gt; section. You can get to the &lt;strong&gt;Ingress Rules&lt;/strong&gt; page via the navigation menu: &lt;strong&gt;Networking&lt;/strong&gt; &amp;gt; &lt;strong&gt;Virtual Cloud Networks&lt;/strong&gt; &amp;gt; &lt;strong&gt;Virtual Cloud Network Details&lt;/strong&gt; &lt;em&gt;(by clicking on a VCN entry)&lt;/em&gt; &amp;gt; &lt;strong&gt;Security Lists&lt;/strong&gt;, which brings you to the page with the &lt;strong&gt;Default Security Lists&lt;/strong&gt;. On clicking the Security List that corresponds to your &lt;strong&gt;Virtual Cloud Network (VCN)&lt;/strong&gt; you will land on the above &lt;strong&gt;Ingress Rules&lt;/strong&gt; page.&lt;/p&gt;

&lt;p&gt;In case you are still not able to find it, search for the term &lt;strong&gt;&lt;em&gt;security&lt;/em&gt;&lt;/strong&gt; using the search facility on any page in the Cloud Console (see the magnifying glass 🔍 at the top of the page). This will show you all the &lt;strong&gt;Default Security Lists&lt;/strong&gt;, and clicking on one will bring you to the &lt;strong&gt;Ingress Rules&lt;/strong&gt; page above (you might have just one Security List entry). &lt;strong&gt;&lt;em&gt;Note: Ingress means traffic coming into the network/VM instance.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
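&lt;p&gt;The rule we add boils down to the following fields (a sketch; the wide-open source CIDR matches this walkthrough, but you should narrow it for anything beyond a short-lived experiment):&lt;/p&gt;

```text
Source CIDR            : 0.0.0.0/0   # any source; restrict to your own IP where possible
IP Protocol            : TCP
Source Port Range      : All
Destination Port Range : 8080        # the port Apache Zeppelin is published on
```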

&lt;p&gt;Why port 8080? Because we set it up that way in the docker scripts; have a look at the &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/apache-zeppelin#apache-zeppelin--"&gt;sources&lt;/a&gt; to find out why and how.&lt;/p&gt;

&lt;p&gt;Having done all of the above: voila! We see the &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; startup page in the browser:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--y6-rmco---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1408/0%2AKgMaDrIt107dUpVb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--y6-rmco---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1408/0%2AKgMaDrIt107dUpVb.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Using Apache Zeppelin notes&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Take a look at the section &lt;strong&gt;Importing a note&lt;/strong&gt; in &lt;a href="https://medium.com/@neomatrix369/apache-zeppelin-stairway-to-notes-haven-28ec413a185a?source=---------6------------------"&gt;Apache Zeppelin: stairway to notes* haven!&lt;/a&gt;; from that section onwards you can see how to import existing notes and execute them. Once we import a &lt;em&gt;note&lt;/em&gt;, open it, and run it, it looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--n_nkuj_j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1407/0%2AU8xjozZ3PYdvcTlt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--n_nkuj_j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1407/0%2AU8xjozZ3PYdvcTlt.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Importing this note also creates a &lt;em&gt;“notes.json”&lt;/em&gt; file in the &lt;em&gt;~/awesome-ai-ml-dl/examples/apache-zeppelin/notebook&lt;/em&gt; folder of the VM instance.&lt;/p&gt;

&lt;p&gt;Further examples can be found on &lt;a href="https://github.com/dylanmei/docker-zeppelin"&gt;https://github.com/dylanmei/docker-zeppelin&lt;/a&gt;, although these would need additional installation and configuration to the &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; build.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Create a custom image for reuse&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;As we have been able to successfully run &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; from inside a VM instance, we can save this image for future reuse or to share with others. Before doing that, I would delete the &lt;strong&gt;&lt;em&gt;logs&lt;/em&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;em&gt;notebook&lt;/em&gt;&lt;/strong&gt; folders from the &lt;em&gt;~/awesome-ai-ml-dl/examples/apache-zeppelin&lt;/em&gt; folder of the VM instance.&lt;/p&gt;

&lt;p&gt;Creating an image of the VM instance can be done via the navigation menu: &lt;strong&gt;Compute&lt;/strong&gt; &amp;gt; &lt;strong&gt;Instances&lt;/strong&gt; &amp;gt; &lt;strong&gt;Instance Details&lt;/strong&gt;, and then &lt;strong&gt;Create Custom Image&lt;/strong&gt; from the &lt;strong&gt;Actions&lt;/strong&gt; drop-down menu:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JPDpXjuT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1408/0%2AHUsKbtETaEfCds5P.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JPDpXjuT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1408/0%2AHUsKbtETaEfCds5P.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7q9eA3wL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/701/0%2AIjnhYApjwAmPsuvc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7q9eA3wL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/701/0%2AIjnhYApjwAmPsuvc.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;whilst a custom image is being created, your original VM instance is shut down. This can take under a couple of minutes to complete, depending on the size of the original VM instance.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When successfully created, it becomes available in the list of Custom Images to choose from the next time we go to create a new VM instance:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7y5o-oc3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1408/0%2AjTbBXaxX7we3N2RN.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7y5o-oc3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1408/0%2AjTbBXaxX7we3N2RN.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Power-user&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;If all of this was a piece of cake for you, or you have survived without much hassle, then try out all the deep-dive stuff mentioned in the &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/JuPyteR#jupyter"&gt;README page here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To be able to code in other JVM languages in the &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; environment, all you need are additional extensions; it’s only a matter of installing and configuring them. You can learn all about them &lt;a href="https://zeppelin.apache.org/docs/0.5.6-incubating/manual/interpreters.html"&gt;here&lt;/a&gt;, where you will see that you can also code in Python on &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt;. Find out how you can write your &lt;a href="https://zeppelin.apache.org/docs/0.5.6-incubating/development/writingzeppelininterpreter.html"&gt;own interpreters for Apache Zeppelin&lt;/a&gt;. Both the notebook and the interpreters can be accessed via the &lt;a href="https://zeppelin.apache.org/docs/0.5.6-incubating/rest-api/rest-notebook.html"&gt;Notebook API&lt;/a&gt; and the &lt;a href="https://zeppelin.apache.org/docs/0.5.6-incubating/rest-api/rest-interpreter.html"&gt;Interpreter API&lt;/a&gt; respectively.&lt;/p&gt;
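&lt;p&gt;As a taster, the Notebook API endpoint for the instance used in this post can be composed like this (substitute your own instance’s address; the curl call needs a reachable, running instance):&lt;/p&gt;

```shell
# Compose the Zeppelin Notebook REST API endpoint for this post's instance.
ZEPPELIN_HOST=132.145.60.249
ZEPPELIN_PORT=8080
NOTEBOOK_API="http://${ZEPPELIN_HOST}:${ZEPPELIN_PORT}/api/notebook"
echo "${NOTEBOOK_API}"
# List all notes (returns JSON) once the service is up:
# curl -s "${NOTEBOOK_API}"
```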

&lt;h1&gt;
  
  
  &lt;strong&gt;Signing off&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;When you are done, stop the running service by pressing Ctrl+C in the console where it was started:&lt;/p&gt;

&lt;p&gt;[--snipped--]&lt;br&gt;
    Oct 14, 2019 1:02:40 AM org.glassfish.jersey.internal.Errors logErrors&lt;br&gt;
    WARNING: The following warnings have been detected: WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.InterpreterRestApi.listInterpreter(java.lang.String), should not consume any entity.&lt;br&gt;
    WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.CredentialRestApi.getCredentials(java.lang.String) throws java.io.IOException,java.lang.IllegalArgumentException, should not consume any entity.&lt;br&gt;
    WARNING: The (sub)resource method createNote in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.&lt;br&gt;
    WARNING: The (sub)resource method getNoteList in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.&lt;br&gt;
    ^C&lt;/p&gt;

&lt;p&gt;In case you have created a note, it gets saved in the sub-directory called (apache zeppelin directory); you can retrieve it using &lt;strong&gt;&lt;em&gt;scp&lt;/em&gt;&lt;/strong&gt; from your local machine (see &lt;a href="https://linuxize.com/post/how-to-use-scp-command-to-securely-transfer-files/"&gt;here&lt;/a&gt; on how to do that).&lt;/p&gt;
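&lt;p&gt;For example, the notebook folder of this post’s VM can be pulled down in one go; the local destination folder below is an assumption:&lt;/p&gt;

```shell
# Compose the scp command to copy the notebook folder off the VM instance
# (run the echoed command from your local machine, not from the VM).
SCP_CMD="scp -r -i ~/.ssh/id_rsa ubuntu@132.145.60.249:~/awesome-ai-ml-dl/examples/apache-zeppelin/notebook ./zeppelin-notes"
echo "${SCP_CMD}"
```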

&lt;p&gt;Make sure you have &lt;strong&gt;&lt;em&gt;signed out of both&lt;/em&gt;&lt;/strong&gt; the &lt;a href="http://oracle.com/"&gt;oracle.com&lt;/a&gt; and &lt;a href="http://cloud.oracle.com/"&gt;cloud.oracle.com&lt;/a&gt; login sessions; it’s easy to forget one or the other. But before doing that, please also have a look at the &lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Tasks/terminating_resources.htm"&gt;Cleaning up of resources&lt;/a&gt; page in the docs — you don’t want your instance running forever while you are not looking at it!&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XBvWcNaT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1100/0%2A0MJQb42w8rQfEZQ0" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XBvWcNaT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1100/0%2A0MJQb42w8rQfEZQ0" alt=""&gt;&lt;/a&gt; After doing this, it looks like a no brainer to run a notebook service i.e. &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; on a cloud provider like &lt;a href="http://cloud.oracle.com/"&gt;OCI&lt;/a&gt; (&lt;a href="http://cloud.oracle.com/"&gt;Oracle Cloud Infrastructure&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;In effect, if we summarise the conclusions of the two posts, &lt;a href="https://medium.com/@neomatrix369/apache-zeppelin-stairway-to-notes-haven-28ec413a185a?source=---------6------------------"&gt;Apache Zeppelin: stairway to notes* haven!&lt;/a&gt; and &lt;a href="https://medium.com/@neomatrix369/running-your-jupyter-notebooks-on-the-cloud-ed970326649f?source=---------4------------------"&gt;Running your JuPyTer notebooks on Oracle Cloud Infrastructure&lt;/a&gt; we will more or less say:&lt;/p&gt;

&lt;p&gt;Apache Zeppelin gives us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;similar flexibility as Jupyter notebooks, and allows extending functionality via configurations and extensions&lt;/li&gt;
&lt;li&gt;execution progress per paragraph (per cell) is always displayed (in real-time) unlike Jupyter notebooks&lt;/li&gt;
&lt;li&gt;lazy execution to help efficiency&lt;/li&gt;
&lt;li&gt;round-trip navigability between table data and visualisation in the cell (paragraph)&lt;/li&gt;
&lt;li&gt;execution may appear a bit slower than Jupyter notebooks at times&lt;/li&gt;
&lt;li&gt;but there are solutions to speed this up (for future posts to cover)&lt;/li&gt;
&lt;li&gt;all-in-all a great place for Java/JVM developers to feel at home and do ML experiments on the JVM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="http://cloud.oracle.com/"&gt;OCI&lt;/a&gt; gives us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an easy-to-use cloud environment&lt;/li&gt;
&lt;li&gt;a quick way to set up our environment and bring the apps and solutions we build to market faster&lt;/li&gt;
&lt;li&gt;enables us to run &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; (natively or via Docker image)&lt;/li&gt;
&lt;li&gt;instances that can be shared publicly or privately depending on your network security settings&lt;/li&gt;
&lt;li&gt;provides ways to secure your infrastructure on the cloud (we didn’t cover this in much depth here); please check out &lt;a href="https://docs.cloud.oracle.com/iaas/Content/Security/Concepts/security.htm"&gt;Security on the OCI docs page&lt;/a&gt; to learn more.&lt;/li&gt;
&lt;/ul&gt;
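&lt;p&gt;As a rough sketch of the Docker route mentioned above (the image name and tag below are assumptions; check Docker Hub for current versions), running Zeppelin on a VM could look like this:&lt;/p&gt;

```shell
# Sketch: run Apache Zeppelin via a Docker image on a cloud VM.
# The image tag is an assumption -- check Docker Hub for current tags.
ZEPPELIN_IMAGE="apache/zeppelin:0.8.0"
# Built as a string so it can be inspected first; run it with: eval "${RUN_CMD}"
RUN_CMD="docker run -d -p 8080:8080 --name zeppelin ${ZEPPELIN_IMAGE}"
echo "${RUN_CMD}"
```

&lt;p&gt;Once up, the Zeppelin UI would be served on port 8080, which needs an ingress rule on OCI before it is reachable publicly.&lt;/p&gt;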

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Please keep an eye on this space, and share your comments, feedback or any contributions which will help us all learn and grow, to &lt;a href="http://twitter.com/@theNeomatrix369"&gt;@theNeomatrix369&lt;/a&gt;; you can find more about me via the &lt;a href="https://neomatrix369.wordpress.com/about"&gt;About me&lt;/a&gt; page.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Resources&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Apache Zeppelin&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zeppelin.apache.org/docs/0.6.0/development/writingzeppelininterpreter.html"&gt;an interpreter for that thing has been provided&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zeppelin.apache.org/download.html"&gt;Download page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Apache Zeppelin: &lt;a href="https://zeppelin.apache.org/docs/0.8.0/quickstart/install.html"&gt;Quick Start page&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zeppelin.apache.org/docs/0.8.0/quickstart/explore_ui.html"&gt;Exploring Zeppelin UI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://issues.apache.org/jira/browse/ZEPPELIN-3586"&gt;https://issues.apache.org/jira/browse/ZEPPELIN-3586&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gist.github.com/conker84/4ffc9a2f0125c808b4dfcf3b7d70b043#file-zeppelin-dockerfile"&gt;https://gist.github.com/conker84/4ffc9a2f0125c808b4dfcf3b7d70b043#file-zeppelin-dockerfile&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/apache-zeppelin"&gt;https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/apache-zeppelin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zeppelin.apache.org/docs/0.5.6-incubating/development/writingzeppelininterpreter.html"&gt;Write your own interpreters for Apache Zeppelin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zeppelin.apache.org/docs/0.5.6-incubating/rest-api/rest-notebook.html"&gt;Notebook API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zeppelin.apache.org/docs/0.5.6-incubating/rest-api/rest-interpreter.html"&gt;Interpreter API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;Examples to try&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/dylanmei/docker-zeppelin"&gt;https://github.com/dylanmei/docker-zeppelin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mmatloka/machine-learning-by-example-workshop"&gt;https://github.com/mmatloka/machine-learning-by-example-workshop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://raw.githubusercontent.com/mmatloka/machine-learning-by-example-workshop/master/Workshop.json"&gt;https://raw.githubusercontent.com/mmatloka/machine-learning-by-example-workshop/master/Workshop.json&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Docker&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://hub.docker.com/signup"&gt;Docker Hub signup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.docker.com/install/linux/docker-ce/ubuntu/"&gt;Install Docker on Ubuntu 16.04 or higher&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/examples/apache-zeppelin/buildZeppelinDockerImage.sh"&gt;Bash Script to build Docker container&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.javaadvent.com/2018/12/apache-zeppelin-stairway-to-notes-haven.html"&gt;Apache Zeppelin: stairway to notes* haven! (Java Advent)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;OCI/Cloud&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Concepts/baremetalintro.htm"&gt;Getting Started&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Reference/overviewworkflow.htm"&gt;Tutorial to setup a VM instance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/API/SDKDocs/cliinstall.htm"&gt;Install CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/API/Concepts/cliconcepts.htm?Highlight=CLI"&gt;CLI docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Tasks/terminating_resources.htm"&gt;Cleaning up of resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/Compute/Tasks/accessinginstance.htm?Highlight=ssh"&gt;Docs on connecting via ssh to OCI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;New Sign-in: &lt;a href="https://cloud.oracle.com/en_US/sign-in"&gt;https://cloud.oracle.com/en_US/sign-in&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Traditional Sign-in: &lt;a href="https://myaccount.cloud.oracle.com/"&gt;https://myaccount.cloud.oracle.com/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Tasks/contactingsupport.htm"&gt;Contact Support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/devtoolshome.htm?tocpath=Developer%20Tools%20%7C_____0"&gt;Developer Tools&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/home.htm"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloudcustomerconnect.oracle.com/resources/9c8fa8f96f/summary"&gt;Oracle Cloud Community Forum&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.oracle.com/en_US/cloud-compliance"&gt;Oracle Cloud Compliance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blogs.oracle.com/cloud-infrastructure/"&gt;Oracle Cloud Infrastructure Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/Network/Concepts/securityrules.htm?Highlight=egress"&gt;Security Rules / Ingress Rules&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/Security/Concepts/security.htm"&gt;OCI Security docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer mainly in the Java/JVM space, currently strengthening teams and helping them accelerate when working with small teams and startups, as a freelance software engineer/data/ml engineer, &lt;a href="https://neomatrix369.wordpress.com/about"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

</description>
      <category>apachezeppelin</category>
      <category>java</category>
      <category>graalvm</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Running your JuPyTeR notebooks on the cloud</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Thu, 12 Sep 2019 20:55:57 +0000</pubDate>
      <link>https://dev.to/neomatrix369/running-your-jupyter-notebooks-on-the-cloud-36pf</link>
      <guid>https://dev.to/neomatrix369/running-your-jupyter-notebooks-on-the-cloud-36pf</guid>
      <description>&lt;h1&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;On the back of my previous share on how to &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/JuPyteR#jupyter" rel="noopener noreferrer"&gt;build and run a docker container with Jupyter&lt;/a&gt;, I’ll be taking this further on how we can make this run on a cloud platform.&lt;/p&gt;

&lt;p&gt;We’ll try to do this on &lt;a href="http://cloud.oracle.com/" rel="noopener noreferrer"&gt;Oracle Cloud Infrastructure (OCI)&lt;/a&gt;. In theory, we should be able to do everything in this blog on any VM or Bare Metal instance. If you are new to Oracle Cloud, I would suggest getting familiar with the &lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Concepts/baremetalintro.htm" rel="noopener noreferrer"&gt;Getting Started&lt;/a&gt; section of the docs. You will also find several informative links at the bottom of this post, in the &lt;strong&gt;Resources&lt;/strong&gt; section.&lt;/p&gt;

&lt;p&gt;I found the &lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Reference/overviewworkflow.htm" rel="noopener noreferrer"&gt;tutorial to setup a VM instance&lt;/a&gt; simple and useful — I recommend having a glance and following the steps. Take note of the pre-requisites before actually getting down to creating a VM instance and &lt;strong&gt;&lt;em&gt;ssh&lt;/em&gt;&lt;/strong&gt;-ing into it — it will involve creating Compartments, Subnets, and Security Lists, among other things, before you can create a VM.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Signing up&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;You will have noticed that you need an account to access the Cloud Dashboard and proceed.&lt;/p&gt;

&lt;p&gt;You can sign up by going to &lt;a href="http://oracle.com/" rel="noopener noreferrer"&gt;oracle.com&lt;/a&gt; or &lt;a href="https://myservices.us.oraclecloud.com/mycloud/signup?sourceType=_ref_coc-asset-opcSignIn&amp;amp;language=en" rel="noopener noreferrer"&gt;cloud.oracle.com&lt;/a&gt; — I recommend signing up via these portals. You might even be eligible for &lt;strong&gt;FREE&lt;/strong&gt; credits once you do (enough to spend your weekend running your favourite instances).&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Setup&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Dashboard — sign-in&lt;/strong&gt;&lt;br&gt;
Once you are signed up, sign in via &lt;a href="https://cloud.oracle.com/sign-in" rel="noopener noreferrer"&gt;cloud.oracle.com/sign-in&lt;/a&gt;, which will take you to a page like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F0%2AbmsehoRRIZ3BXe5F.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F0%2AbmsehoRRIZ3BXe5F.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Follow the instructions as described in the &lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Reference/overviewworkflow.htm" rel="noopener noreferrer"&gt;tutorial to setup a VM instance&lt;/a&gt; and give your VM and other resources names you can identify easily (use your initials as a prefix). This will kick off the request to create the VM (if all your entries are valid), and in less than 15 seconds you should have a VM ready to be used.&lt;/p&gt;

&lt;p&gt;Once the VM instance is created, make a note of its &lt;strong&gt;Public IP Address&lt;/strong&gt;. All running VMs can be found by going to &lt;strong&gt;Compute &amp;gt; Instances&lt;/strong&gt; in the navigation menu on the left:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1081%2F0%2Ag0NsbLE6B-oy_Pva.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1081%2F0%2Ag0NsbLE6B-oy_Pva.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Select the running VM by clicking it:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F1%2AfSnyOupmBeemvNPVWMMFgw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F1%2AfSnyOupmBeemvNPVWMMFgw.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;which will take you to the VM details page, where you can spot the &lt;strong&gt;Public IP Address&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F1%2AoBRoXojahvyHq11qL3XIxQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F1%2AoBRoXojahvyHq11qL3XIxQ.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note: the Public IP Address will be different for every VM created, the above is temporary.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OCI can be accessed with a command-line tool called &lt;strong&gt;oci-cli&lt;/strong&gt;, which can be &lt;a href="https://docs.cloud.oracle.com/iaas/Content/API/SDKDocs/cliinstall.htm" rel="noopener noreferrer"&gt;installed by following the instructions&lt;/a&gt; mentioned in the &lt;a href="https://docs.cloud.oracle.com/iaas/Content/API/Concepts/cliconcepts.htm?Highlight=CLI" rel="noopener noreferrer"&gt;CLI docs&lt;/a&gt;. Once installed, the command is called &lt;strong&gt;oci&lt;/strong&gt; and you can invoke it as below:&lt;/p&gt;
&lt;h1&gt;
  
  
  &lt;strong&gt;Actions to get on the cloud&lt;/strong&gt;
&lt;/h1&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ oci --help
    Usage: oci [OPTIONS] COMMAND [ARGS]...
    Oracle Cloud Infrastructure command line interface, with support for
    Audit, Block Volume, Compute, Database, IAM, Load Balancing, 
Networking, DNS, File Storage, Email Delivery and Object Storage 
Services.
    Most commands must specify a service, followed by a resource type 
and then an action. For example, to list users (where $T contains the 
OCID of the current tenant):

    oci iam user list --compartment-id $T

    Output is in JSON format.

    For information on configuration, see
    https://docs.cloud.oracle.com/Content/API/Concepts/sdkconfig.htm.

    Options:
    &amp;lt;-- snipped --&amp;gt;

    Commands:
    &amp;lt;-- snipped --&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;As such we won’t need the dashboard for the most part from here onwards. We will also NOT be covering the use of the CLI tool in this post.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Logging into the VM instance&lt;/strong&gt;&lt;br&gt;
You can then &lt;strong&gt;&lt;em&gt;ssh&lt;/em&gt;&lt;/strong&gt; into the box (see &lt;a href="https://docs.cloud.oracle.com/iaas/Content/Compute/Tasks/accessinginstance.htm?Highlight=ssh" rel="noopener noreferrer"&gt;docs on connecting via ssh&lt;/a&gt;) and proceed with the rest of the actions below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="c"&gt;### Oracle Linux and CentOS images, user name: opc&lt;/span&gt;
    &lt;span class="c"&gt;### the Ubuntu image, user name: ubuntu&lt;/span&gt;
    &lt;span class="nv"&gt;$ &lt;/span&gt;ssh &lt;span class="nt"&gt;-i&lt;/span&gt; ~/.ssh/id_rsa ubuntu@132.145.78.136
    or
    &lt;span class="nv"&gt;$ &lt;/span&gt;ssh ubuntu@132.145.78.136
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Installing git&lt;/strong&gt;&lt;br&gt;
For this blog post, we selected the &lt;strong&gt;Canonical Ubuntu Linux&lt;/strong&gt; (&lt;em&gt;Canonical-Ubuntu-16.04–2019.08.14–0&lt;/em&gt;) as our OS image, which comes with &lt;strong&gt;&lt;em&gt;apt-get&lt;/em&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;em&gt;git&lt;/em&gt;&lt;/strong&gt; installed so we don’t need to do anything there.&lt;/p&gt;
&lt;h1&gt;
  
  
  &lt;strong&gt;Running Jupyter Notebooks&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Cloning our repo&lt;/strong&gt;&lt;br&gt;
We can clone our repo and perform the rest of the steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;git clone https://github.com/neomatrix369/awesome-ai-ml-dl
    &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;awesome-ai-ml-dl/examples/JuPyteR
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Installing Docker&lt;/strong&gt;&lt;br&gt;
The Docker docs for installing Docker on Ubuntu can be found on &lt;a href="https://docs.docker.com/install/linux/docker-ce/ubuntu/" rel="noopener noreferrer"&gt;the Docker site&lt;/a&gt;. A bash-script has also been provided to quicken the process, although the target OS here is Ubuntu 16.04 or higher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;build-docker-image
    &lt;span class="nv"&gt;$ &lt;/span&gt;./installDocker.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;in case you choose another OS image during VM creation, you will have to install Docker manually with the docs from Docker or modify the above script to make it work for the target OS.&lt;/em&gt;&lt;/p&gt;
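&lt;p&gt;As a hedged sketch for that case (assuming a Debian/Ubuntu-like OS image), Docker’s convenience install script is one option:&lt;/p&gt;

```shell
# Sketch: install Docker manually via Docker's convenience script
# (assumes a Debian/Ubuntu-like image; review the script before running it).
INSTALL_CMD="curl -fsSL https://get.docker.com -o get-docker.sh"
RUN_SCRIPT_CMD="sudo sh get-docker.sh"
# Built as strings for inspection; run them in order on the VM.
echo "${INSTALL_CMD}"
echo "${RUN_SCRIPT_CMD}"
```

&lt;p&gt;For other distributions, follow the distribution-specific instructions in the Docker docs instead.&lt;/p&gt;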

&lt;p&gt;&lt;strong&gt;Building the Jupyter Docker Image&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;build-docker-image
    &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./buildDockerImage.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;In this specific environment, we need to pass the&lt;/em&gt; &lt;strong&gt;&lt;em&gt;sudo&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;keyword before every docker command. You may not have to do that in your local environment or elsewhere.&lt;/em&gt;&lt;/p&gt;
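&lt;p&gt;One way around this (a sketch; the user name is the Ubuntu image default used in this post) is to add your user to the &lt;em&gt;docker&lt;/em&gt; group, which takes effect after logging out and back in:&lt;/p&gt;

```shell
# Sketch: allow running docker without sudo by joining the docker group.
# "ubuntu" is the default user on the Ubuntu image used in this post.
GROUP_CMD="sudo usermod -aG docker ubuntu"
# Built as a string for inspection; run it with: eval "${GROUP_CMD}"
echo "${GROUP_CMD}"
```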

&lt;p&gt;&lt;strong&gt;Running the Jupyter notebook as a Docker container&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;back into the project root folder]
    &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./runDockerContainer.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will show you a console like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    &amp;lt;&lt;span class="nt"&gt;---&lt;/span&gt; snipped &lt;span class="nt"&gt;---&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
    OpenJDK Runtime Environment &lt;span class="o"&gt;(&lt;/span&gt;build 9.0.4+11&lt;span class="o"&gt;)&lt;/span&gt;
    OpenJDK 64-Bit Server VM &lt;span class="o"&gt;(&lt;/span&gt;build 9.0.4+11, mixed mode&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/home/jupyter/.local/bin:/opt/java/openjdk/bin:/usr/local/sbin:/usr
/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    ~~~ JDK9, Linux only: We are enabling JVMCI flags &lt;span class="o"&gt;(&lt;/span&gt;enabling Graal as 
Tier-2 compiler&lt;span class="o"&gt;)&lt;/span&gt; ~~~
    ~~~ Graal setting: please check docs &lt;span class="k"&gt;for &lt;/span&gt;higher versions of Java and 
&lt;span class="k"&gt;for &lt;/span&gt;other platforms ~~~
    &lt;span class="nv"&gt;JAVA_OPTS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-XX&lt;/span&gt;:+UnlockExperimentalVMOptions &lt;span class="nt"&gt;-XX&lt;/span&gt;:+EnableJVMCI 
&lt;span class="nt"&gt;-XX&lt;/span&gt;:+UseJVMCICompiler
    &lt;span class="nv"&gt;JAVA_TOOL_OPTIONS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-XX&lt;/span&gt;:+UnlockExperimentalVMOptions 
&lt;span class="nt"&gt;-XX&lt;/span&gt;:+UseCGroupMemoryLimitForHeap &lt;span class="nt"&gt;-XX&lt;/span&gt;:+UnlockExperimentalVMOptions 
&lt;span class="nt"&gt;-XX&lt;/span&gt;:+EnableJVMCI &lt;span class="nt"&gt;-XX&lt;/span&gt;:+UseJVMCICompiler
    Available kernels:
      python2    /home/jupyter/.local/share/jupyter/kernels/python2
      java       /usr/share/jupyter/kernels/java
    &lt;span class="o"&gt;[&lt;/span&gt;I 13:39:35.993 NotebookApp] Writing notebook server cookie secret 
to /home/jupyter/.local/share/jupyter/runtime/notebook_cookie_secret
    &lt;span class="o"&gt;[&lt;/span&gt;I 13:39:36.293 NotebookApp] Serving notebooks from &lt;span class="nb"&gt;local &lt;/span&gt;directory: 
/home/jupyter
    &lt;span class="o"&gt;[&lt;/span&gt;I 13:39:36.294 NotebookApp] The Jupyter Notebook is running at:
    &lt;span class="o"&gt;[&lt;/span&gt;I 13:39:36.295 NotebookApp] http://&lt;span class="o"&gt;(&lt;/span&gt;81dde8675279 or 
127.0.0.1&lt;span class="o"&gt;)&lt;/span&gt;:8888/?token&lt;span class="o"&gt;=&lt;/span&gt;bb0c81ef7e9f3932355b953163702aa2d9f75e18005e6e48
    &lt;span class="o"&gt;[&lt;/span&gt;I 13:39:36.297 NotebookApp] Use Control-C to stop this server and 
shut down all kernels &lt;span class="o"&gt;(&lt;/span&gt;twice to skip confirmation&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    &lt;span class="o"&gt;[&lt;/span&gt;C 13:39:36.310 NotebookApp]
    To access the notebook, open this file &lt;span class="k"&gt;in &lt;/span&gt;a browser:
            file:///home/jupyter/.local/share/jupyter/runtime/nbserver-
28-open.html
        Or copy and &lt;span class="nb"&gt;paste &lt;/span&gt;one of these URLs:
            http://&lt;span class="o"&gt;(&lt;/span&gt;81dde8675279 or 127.0.0.1&lt;span class="o"&gt;)&lt;/span&gt;:8888/?
&lt;span class="nv"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;bb0c81ef7e9f3932355b953163702aa2d9f75e18005e6e48
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make a note of the URL, and replace the &lt;em&gt;127.0.0.1&lt;/em&gt; with your &lt;strong&gt;Public IP Address&lt;/strong&gt;, i.e. &lt;em&gt;132.145.78.136&lt;/em&gt;.&lt;br&gt;
You can also see from the above logs that we are using Java 9 (built on the &lt;a href="http://adoptopenjdk.net/" rel="noopener noreferrer"&gt;AdoptOpenJDK&lt;/a&gt; farm) and enabling the &lt;a href="https://github.com/oracle/graal/blob/master/compiler/README.md" rel="noopener noreferrer"&gt;GraalVM compiler&lt;/a&gt; as HotSpot’s &lt;a href="http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html" rel="noopener noreferrer"&gt;C2 compiler&lt;/a&gt; (see &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/examples/JuPyteR/README.md#switches-to-enable-the-graalvm-compiler-in-java-9" rel="noopener noreferrer"&gt;Switches to enable the GraalVM compiler in Java 9&lt;/a&gt;). This is also because the &lt;a href="https://github.com/SpencerPark/IJava" rel="noopener noreferrer"&gt;Java extension for Jupyter&lt;/a&gt; requires Java 9 or higher to work.&lt;/p&gt;
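&lt;p&gt;The URL substitution described above can be done in the shell (the IP address and token are this post’s example values):&lt;/p&gt;

```shell
# Swap the container-local address in the token URL for the VM's public IP.
LOCAL_URL="http://127.0.0.1:8888/?token=bb0c81ef7e9f3932355b953163702aa2d9f75e18005e6e48"
PUBLIC_IP="132.145.78.136"
PUBLIC_URL=$(echo "${LOCAL_URL}" | sed "s/127\.0\.0\.1/${PUBLIC_IP}/")
echo "${PUBLIC_URL}"
```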

&lt;p&gt;&lt;strong&gt;Opening the Jupyter notebook in your browser&lt;/strong&gt;&lt;br&gt;
Go to the browser and try to open this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    http://132.145.78.136:8888/?token&lt;span class="o"&gt;=&lt;/span&gt;bb0c81ef7e9f3932355b953163702aa2d9f75e18005e6e48
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Aargh! It does NOT work!&lt;br&gt;
That is because we haven’t opened up port 8888 from within our cloud network (via &lt;strong&gt;Ingress Rules&lt;/strong&gt;; read more about them &lt;a href="https://docs.cloud.oracle.com/iaas/Content/Network/Concepts/securityrules.htm?Highlight=egress" rel="noopener noreferrer"&gt;here&lt;/a&gt;) to the outside world (public):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1388%2F0%2A40xrtw73GvkKkIgK.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1388%2F0%2A40xrtw73GvkKkIgK.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We would need to add the above entry to the &lt;strong&gt;Ingress Rules&lt;/strong&gt; section. You can get to the &lt;strong&gt;Ingress Rules&lt;/strong&gt; page via the navigation menu: &lt;strong&gt;Networking&lt;/strong&gt; &amp;gt; &lt;strong&gt;Virtual Cloud Networks&lt;/strong&gt; &amp;gt; &lt;strong&gt;Virtual Cloud Network Details&lt;/strong&gt; &amp;gt; &lt;strong&gt;Security Lists&lt;/strong&gt;, which brings you to the page with the &lt;strong&gt;Default Security Lists&lt;/strong&gt;. On clicking the Security List that corresponds to your &lt;strong&gt;Virtual Cloud Network (VCN)&lt;/strong&gt;, you will land on the above &lt;strong&gt;Ingress Rules&lt;/strong&gt; page.&lt;/p&gt;

&lt;p&gt;In case you are still not able to find it, search for the term &lt;strong&gt;&lt;em&gt;security&lt;/em&gt;&lt;/strong&gt; using the search facility on any page in the Cloud Console (see the magnifying glass 🔍 at the top of the page). This will show you all the &lt;strong&gt;Default Security Lists&lt;/strong&gt;, and clicking on one will bring you to the &lt;strong&gt;Ingress Rules&lt;/strong&gt; page above (you might have just one Security List entry). &lt;strong&gt;&lt;em&gt;Note: Ingress means traffic coming into the network/VM instance.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Why port 8888? Because that is what we configured in the Docker scripts; have a look at the &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/JuPyteR#jupyter" rel="noopener noreferrer"&gt;sources&lt;/a&gt; to find out why and how.&lt;/p&gt;
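&lt;p&gt;A quick way to confirm the ingress rule worked is to probe the port from your local machine (a sketch; &lt;em&gt;nc&lt;/em&gt; may need installing, and the IP and port are this post’s example values):&lt;/p&gt;

```shell
# Sketch: probe the notebook port from outside the cloud network.
# Built as a string for inspection; run it with: eval "${CHECK_CMD}"
CHECK_CMD="nc -vz 132.145.78.136 8888"
echo "${CHECK_CMD}"
```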

&lt;p&gt;Having done all of the above: voila! We see the Jupyter startup page in the browser:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F33%2F0%2AiMaX8fhr3Kd_Jdrp.png%3Fq%3D20" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F33%2F0%2AiMaX8fhr3Kd_Jdrp.png%3Fq%3D20"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F0%2AiMaX8fhr3Kd_Jdrp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F0%2AiMaX8fhr3Kd_Jdrp.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And you can see a Java-based notebook is available to play with! Try out the below by creating a new Java notebook in the browser:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1274%2F0%2ArquvjgoX8Ra1QT0K.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1274%2F0%2ArquvjgoX8Ra1QT0K.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/ligee/kotlin-jupyter/raw/master/samples/ScreenShotInJupyter.png" rel="noopener noreferrer"&gt;https://github.com/ligee/kotlin-jupyter/raw/master/samples/ScreenShotInJupyter.png&lt;/a&gt;&lt;br&gt;
You are free to create Python notebooks as well, not just Java ones — this is the beauty of Jupyter notebooks.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Installing Jupyter on a bare-metal or VM environment&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;For brevity, we didn’t cover this aspect, but if you look at the scripts associated with building and running the Jupyter instance, you will see that the Docker build scripts rely on individual scripts that can be executed on their own, in this order:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ cd build-docker-image
$ [install Java 9 SDK and set the PATH and JAVA_HOME]
$ ./install-jupyter-notebooks.sh
$ ./install-java-kernel.sh
$ ./runLocal.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you want to see how this would work, run the above scripts in the local environment of the instance, the rest of the instructions should work as expected.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Create custom image for reuse&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;As we have been able to successfully run Jupyter Notebook from inside a VM instance, we can save this image for future reuse or share it with others. Creating an image of the VM instance can be done via &lt;strong&gt;Compute &amp;gt; Instances &amp;gt; Instance Details&lt;/strong&gt; in the navigation menu, then &lt;strong&gt;Create Custom Image&lt;/strong&gt; from the &lt;strong&gt;Actions&lt;/strong&gt; drop-down menu:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F1%2Am0c2CNpEQ4DBVE8mP6If4Q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F1%2Am0c2CNpEQ4DBVE8mP6If4Q.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F713%2F1%2AzhaCtQeh4_h4IICLqRKRyw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F713%2F1%2AzhaCtQeh4_h4IICLqRKRyw.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;whilst a custom image is being created, your original VM instance is shut down. This can take a couple of minutes to complete, depending on the size of the original VM instance.&lt;/em&gt;&lt;br&gt;
When successfully created, it becomes available among the list of Custom Images to choose from the next time we go to create a new VM instance:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F1%2Awaim__0FsC1ApNoRnqwj8w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F1%2Awaim__0FsC1ApNoRnqwj8w.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Power-user&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;If all of this was a piece of cake for you, or you survived without much hassle, then try out the deep-dive material mentioned on the &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/JuPyteR#jupyter" rel="noopener noreferrer"&gt;README page here&lt;/a&gt;.&lt;br&gt;
To code in other languages in the Jupyter environment, all you need is a Jupyter extension; it is just a matter of installing and configuring it. You can learn all about this &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/JuPyteR#get-started-manual-steps-via-cli" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
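&lt;p&gt;As a rough sketch of what "installing and configuring" a new language kernel involves (the kernel path below is hypothetical, and the commands are echoed for review rather than executed):&lt;/p&gt;

```shell
# Hedged sketch: registering an additional language kernel with Jupyter.
# Assumes Jupyter is installed and the kernel ships a kernelspec directory;
# ./my-kernel/kernelspec is a hypothetical placeholder path.
kernel_setup_steps() {
  echo "jupyter kernelspec install ./my-kernel/kernelspec --user"
  echo "jupyter kernelspec list   # verify the new kernel is registered"
}
kernel_setup_steps
```

Once the kernelspec is registered, the new language shows up in the notebook's kernel picker.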

&lt;h1&gt;
  
  
  &lt;strong&gt;Signing off&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F0%2AgRwXOOQ8q2TtHmnO.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1408%2F0%2AgRwXOOQ8q2TtHmnO.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Any notebook you create gets saved in the sub-directory called &lt;em&gt;jupyter/notebooks&lt;/em&gt;; you can retrieve it from your local machine using &lt;strong&gt;&lt;em&gt;scp&lt;/em&gt;&lt;/strong&gt; (see &lt;a href="https://linuxize.com/post/how-to-use-scp-command-to-securely-transfer-files/" rel="noopener noreferrer"&gt;here&lt;/a&gt; for how to do that).&lt;br&gt;
Make sure you have &lt;strong&gt;&lt;em&gt;signed out of both&lt;/em&gt;&lt;/strong&gt; the &lt;a href="http://oracle.com/" rel="noopener noreferrer"&gt;oracle.com&lt;/a&gt; and &lt;a href="http://cloud.oracle.com/" rel="noopener noreferrer"&gt;cloud.oracle.com&lt;/a&gt; login sessions; it’s easy to forget one or the other. Before doing that, please also have a look at the &lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Tasks/terminating_resources.htm" rel="noopener noreferrer"&gt;Cleaning up of resources&lt;/a&gt; page in the docs: you don’t want your instance running forever while you are not looking at it!&lt;/p&gt;
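&lt;p&gt;A minimal sketch of such an &lt;em&gt;scp&lt;/em&gt; retrieval might look like the following. The IP address, key path and remote user are all placeholders to adapt to your own instance, and the final command is echoed for review rather than executed:&lt;/p&gt;

```shell
# Hedged sketch: copy the jupyter/notebooks directory from the VM to this machine.
# All values below are placeholders; the default login user varies by image
# (e.g. opc on Oracle Linux, ubuntu on Ubuntu images).
INSTANCE_IP="203.0.113.10"                 # your instance's public IP
SSH_KEY="$HOME/.ssh/oci_instance_key"      # private key chosen at instance creation
REMOTE_DIR="jupyter/notebooks"
LOCAL_DIR="./notebooks-backup"

mkdir -p "$LOCAL_DIR"
CMD="scp -i $SSH_KEY -r opc@${INSTANCE_IP}:${REMOTE_DIR} $LOCAL_DIR"

# Echoed for review; run it for real with: eval "$CMD"
echo "$CMD"
```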

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpaper-attachments.dropbox.com%2Fs_E4A05684460F7D532506C16E09E20D37F6AFF25A9B5381BF1A442679901C1AD7_1568321451272_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpaper-attachments.dropbox.com%2Fs_E4A05684460F7D532506C16E09E20D37F6AFF25A9B5381BF1A442679901C1AD7_1568321451272_image.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;A good set of scripts (including Docker) and an easy-to-use cloud environment can help in many ways. In this case, they enabled us to run a Jupyter Notebook instance that can be shared publicly or privately, depending on your network security settings.&lt;/p&gt;

&lt;p&gt;The Jupyter environment is flexible and allows extending functionality via configurations and extensions.&lt;/p&gt;

&lt;p&gt;We didn’t cover topics like cloud security and partitioning of user instances, which are out of scope for this post. If they are important to you, please look into them further and ensure your setup meets the necessary level of security for your application or use-case. Check out the docs on &lt;a href="https://docs.cloud.oracle.com/iaas/Content/Security/Concepts/security.htm" rel="noopener noreferrer"&gt;Security on the OCI docs page&lt;/a&gt; to learn more.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Resources&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;General&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/JuPyteR#jupyter" rel="noopener noreferrer"&gt;Build and run a docker container with Jupyter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-graal" rel="noopener noreferrer"&gt;Awesome Graal&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl" rel="noopener noreferrer"&gt;Awesome AI-ML-DL&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Docker&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://hub.docker.com/signup" rel="noopener noreferrer"&gt;Docker Hub signup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.docker.com/install/linux/docker-ce/ubuntu/" rel="noopener noreferrer"&gt;Install Docker on Ubuntu 16.04 or higher&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/examples/JuPyteR/buildJuPyteRDockerImage.sh" rel="noopener noreferrer"&gt;Bash Script to build Docker container&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;OCI/Cloud&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Concepts/baremetalintro.htm" rel="noopener noreferrer"&gt;Getting Started&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Reference/overviewworkflow.htm" rel="noopener noreferrer"&gt;Tutorial to setup a VM instance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/API/SDKDocs/cliinstall.htm" rel="noopener noreferrer"&gt;Install CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/API/Concepts/cliconcepts.htm?Highlight=CLI" rel="noopener noreferrer"&gt;CLI docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Tasks/terminating_resources.htm" rel="noopener noreferrer"&gt;Cleaning up of resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/Compute/Tasks/accessinginstance.htm?Highlight=ssh" rel="noopener noreferrer"&gt;Docs on connecting via ssh to OCI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;New Sign-in: &lt;a href="https://cloud.oracle.com/en_US/sign-in" rel="noopener noreferrer"&gt;https://cloud.oracle.com/en_US/sign-in&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Traditional Sign-in: &lt;a href="https://myaccount.cloud.oracle.com/" rel="noopener noreferrer"&gt;https://myaccount.cloud.oracle.com/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/GSG/Tasks/contactingsupport.htm" rel="noopener noreferrer"&gt;Contact Support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/devtoolshome.htm?tocpath=Developer%20Tools%20%7C_____0" rel="noopener noreferrer"&gt;Developer Tools&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/home.htm" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloudcustomerconnect.oracle.com/resources/9c8fa8f96f/summary" rel="noopener noreferrer"&gt;Oracle Cloud Community Forum&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.oracle.com/en_US/cloud-compliance" rel="noopener noreferrer"&gt;Oracle Cloud Compliance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blogs.oracle.com/cloud-infrastructure/" rel="noopener noreferrer"&gt;Oracle Cloud Infrastructure Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/Network/Concepts/securityrules.htm?Highlight=egress" rel="noopener noreferrer"&gt;Security Rules / Ingress Rules&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.oracle.com/iaas/Content/Security/Concepts/security.htm" rel="noopener noreferrer"&gt;OCI Security docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer, mainly in the Java/JVM space, currently strengthening teams and helping them accelerate when working with small teams and startups, as a freelance software, data and ML engineer, &lt;a href="https://neomatrix369.wordpress.com/about" rel="noopener noreferrer"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369" rel="noopener noreferrer"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369" rel="noopener noreferrer"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

</description>
      <category>jupyternotebooks</category>
      <category>java</category>
      <category>graalvm</category>
      <category>cloud</category>
    </item>
    <item>
      <title>How to do Deep Learning for Java? </title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Sun, 08 Sep 2019 17:37:37 +0000</pubDate>
      <link>https://dev.to/neomatrix369/how-to-do-deep-learning-for-java-3f04</link>
      <guid>https://dev.to/neomatrix369/how-to-do-deep-learning-for-java-3f04</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;Some time ago I came across this life-cycle management tool (or cloud service) called &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; and I was quite impressed by its user interface and the simplicity of its design and layout. I had a good chat about the service at that time with one of the members of &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; and was given a demo. Prior to that I had written a simple pipeline using &lt;a href="https://www.gnu.org/software/parallel/"&gt;GNU Parallel&lt;/a&gt;, JavaScript, Python and Bash, and another one purely using &lt;a href="https://www.gnu.org/software/parallel/"&gt;GNU Parallel&lt;/a&gt; and Bash. I also thought about replacing the moving parts with ready-to-use task/workflow management tools like Jenkins X, Jenkins Pipeline, Concourse or Airflow, but for various reasons I did not proceed with the idea.&lt;/p&gt;

&lt;p&gt;Coming back to our original conversation: I noticed that a lot of the examples and docs on &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; were based on Python and R and their respective frameworks and libraries. There was a lack of Java/JVM-based examples and docs, so I took this opportunity to do something about that.&lt;/p&gt;

&lt;p&gt;I was encouraged by &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; to implement something using the famous Java library called &lt;a href="https://deeplearning4j.org/"&gt;&lt;strong&gt;DL4J - Deep Learning for Java&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;My initial experience with &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; already gave me a good impression once I got an understanding of its design, layout and workflow: it is developer-friendly, and the makers have taken into consideration various facets of both developer and infrastructure workflows. In our world, the latter is mostly run by DevOps or SysOps teams, and we know the nuances and pain-points attached to that. You can find out more about its features from the &lt;a href="https://valohai.com/features/"&gt;Features section&lt;/a&gt; of the site.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Achtung!&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;Just to let you know that from here onwards the post will be a bit more technical and may contain code snippets and mention of deep learning/machine learning and infrastructure-related terminologies.&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  What do we need and how?
&lt;/h1&gt;

&lt;p&gt;For any machine learning or deep learning project or initiative these days, the two important components (from a high-level perspective) are the code that will create and serve the model, and the infrastructure where this whole life-cycle will be executed.&lt;/p&gt;

&lt;p&gt;Of course, there are going to be steps and components needed before, during and after the above mentioned but to keep things simple let’s say we need code and infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code&lt;/strong&gt;&lt;br&gt;
For code, I have chosen a modified example using DL4J: an &lt;a href="https://en.wikipedia.org/wiki/MNIST_database"&gt;MNist project&lt;/a&gt; with a training set of 60,000 images and a test set of 10,000 images of hand-written digits. This dataset is available via the DL4J library (just like &lt;a href="https://keras.io/"&gt;Keras&lt;/a&gt; provides a stock of them). Look for the &lt;a href="https://github.com/eclipse/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-data/deeplearning4j-datasets/src/main/java/org/deeplearning4j/datasets/iterator/impl/MnistDataSetIterator.java"&gt;&lt;strong&gt;MnistDataSetIterator&lt;/strong&gt;&lt;/a&gt; under &lt;a href="https://github.com/eclipse/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/dataset/api/iterator/DataSetIterator.java"&gt;&lt;strong&gt;DatasetIterators&lt;/strong&gt;&lt;/a&gt; in the &lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-cheat-sheet"&gt;DL4J Cheatsheet&lt;/a&gt; for further details on this particular dataset.&lt;/p&gt;

&lt;p&gt;Have a look at the source code we will be using before getting started, the main Java class is called &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/src/main/java/org/deeplearning4j/feedforward/mnist/MLPMnistSingleLayerRunner.java"&gt;&lt;strong&gt;org.deeplearning4j.feedforward.mnist.MLPMnistSingleLayerRunner&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure&lt;/strong&gt;&lt;br&gt;
As it is obvious by now, we have decided to try out the Java example using &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; as our infrastructure to run our experiments (training and evaluation of the model). &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; recognizes git repositories and directly hooks into them and allows Execution of our code, irrespective of platform or language - we will see how this works. This also means if you are a strong supporter of &lt;em&gt;GitOps&lt;/em&gt; or &lt;em&gt;Infrastructure-As-Code&lt;/em&gt; you will appreciate the workflow. &lt;/p&gt;

&lt;p&gt;For this we just need an account on &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt;; we can use a &lt;a href="https://valohai.com/pricing/"&gt;Free-tier account&lt;/a&gt; and get access to several instances of various configurations when we &lt;a href="https://app.valohai.com/accounts/signup/"&gt;sign up&lt;/a&gt;. See the Free-tier under &lt;a href="https://valohai.com/pricing/"&gt;Plans and Pricing&lt;/a&gt; and the comparison chart for more details. For what we would like to do, the Free-tier is more than enough for now.&lt;/p&gt;
&lt;h1&gt;
  
  
  Deep Learning for Java and Valohai
&lt;/h1&gt;

&lt;p&gt;As we agreed, we're going to use these two technologies to achieve our goal of training a single layer model and evaluating it, as well as seeing what the end-to-end experience is like on &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We will bundle the necessary build and run-time dependencies into the Docker image and use it to build our Java app, train a model and evaluate it on the &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; platform via a simple &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/valohai.yaml"&gt;valohai.yaml&lt;/a&gt; file which is placed in the root folder of the project repository.&lt;/p&gt;
&lt;h2&gt;
  
  
  Deep Learning for Java: DL4J
&lt;/h2&gt;

&lt;p&gt;The easy part is that we won’t need to do much here: just build the jar and download the dataset into the Docker container. We have a pre-built Docker image that contains all the dependencies needed to build a Java app. We have pushed this image to &lt;a href="https://hub.docker.com/"&gt;Docker Hub&lt;/a&gt;; you can find it by searching for &lt;a href="https://hub.docker.com/r/neomatrix369/dl4j-mnist-single-layer"&gt;dl4j-mnist-single-layer&lt;/a&gt; (we will be using a specific tag as defined in the YAML file). We have chosen &lt;a href="https://github.com/oracle/graal"&gt;GraalVM 19.1.1&lt;/a&gt; as our Java build and runtime for this project, and so it is embedded into the Docker image (see the &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/Dockerfile"&gt;Dockerfile&lt;/a&gt; for the definition of the Docker image). To learn more about GraalVM, check out the resources at the official site, &lt;a href="http://graalvm.org"&gt;graalvm.org&lt;/a&gt;, and &lt;a href="https://github.com/neomatrix369/awesome-graal"&gt;Awesome Graal&lt;/a&gt;.&lt;/p&gt;
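&lt;p&gt;If you want to inspect the pre-built image locally before running anything on Valohai, something along these lines should work (assuming Docker is installed locally; the commands are echoed here for review rather than executed):&lt;/p&gt;

```shell
# The image name and v0.5 tag are the ones pinned in the project's valohai.yaml.
IMAGE="neomatrix369/dl4j-mnist-single-layer:v0.5"

pull_and_check() {
  echo "docker pull $IMAGE"
  # GraalVM is the embedded Java runtime, so java should be on the image's PATH.
  echo "docker run --rm $IMAGE java -version"
}
pull_and_check
```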

&lt;p&gt;&lt;strong&gt;Orchestration&lt;/strong&gt;&lt;br&gt;
When the uber jar is invoked from the command-line, we land into the &lt;code&gt;MLPMnistSingleLayerRunner&lt;/code&gt; class which directs us to the intended action depending on the parameters passed in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;MLPMnistSingleLayerRunner&lt;/span&gt; &lt;span class="n"&gt;mlpMnistRunner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;MLPMnistSingleLayerRunner&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="nc"&gt;JCommander&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newBuilder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addObject&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mlpMnistRunner&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;mlpMnistRunner&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;execute&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The parameters passed into the uber jar are received by this class and handled by the &lt;code&gt;execute()&lt;/code&gt; method.&lt;/p&gt;

&lt;p&gt;We can create a model by passing the &lt;code&gt;--action train&lt;/code&gt; parameter to the Java app (uber jar), and evaluate the created model by passing the &lt;code&gt;--action evaluate&lt;/code&gt; parameter.&lt;/p&gt;
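&lt;p&gt;Conceptually, the wrapper script's job is just to turn these flags into the right &lt;code&gt;java&lt;/code&gt; invocation. The sketch below is &lt;em&gt;not&lt;/em&gt; the actual &lt;em&gt;runMLPMnist.sh&lt;/em&gt; from the repository, only an illustration of that dispatch idea (it echoes the final command instead of executing it):&lt;/p&gt;

```shell
# Hedged sketch of a flag-forwarding wrapper (not the repo's actual script).
run_mlp() {
  action="train"
  dir_flag=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --action) action="$2"; shift 2 ;;
      --output-dir|--input-dir) dir_flag="$1 $2"; shift 2 ;;
      *) shift ;;
    esac
  done
  # Echo the final invocation instead of executing it.
  echo "java -Djava.library.path= -jar target/MLPMnist-1.0.0-bin.jar --action $action $dir_flag"
}

run_mlp --action evaluate --input-dir /valohai/inputs/model
```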

&lt;p&gt;The main parts of the Java app that do this work can be found in the two Java classes mentioned in the sections below.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Train a model&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Can be invoked from the command-line via:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    ./runMLPMnist.sh &lt;span class="nt"&gt;--action&lt;/span&gt; train &lt;span class="nt"&gt;--output-dir&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VH_OUTPUTS_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

    or

    java &lt;span class="nt"&gt;-Djava&lt;/span&gt;.library.path&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;             &lt;span class="se"&gt;\&lt;/span&gt;
         &lt;span class="nt"&gt;-jar&lt;/span&gt; target/MLPMnist-1.0.0-bin.jar &lt;span class="se"&gt;\&lt;/span&gt;
         &lt;span class="nt"&gt;--action&lt;/span&gt; train &lt;span class="nt"&gt;--output-dir&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VH_OUTPUTS_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This creates the model (when successful, at the end of the Execution) by the name &lt;code&gt;mlpmnist-single-layer.pb&lt;/code&gt; in the folder specified by the &lt;code&gt;--output-dir&lt;/code&gt; passed in at the beginning of the Execution. From the perspective of &lt;a href="http://www.valohai.com"&gt;Valohai&lt;/a&gt;, it should be placed into the ${VH_OUTPUTS_DIR} which is what we do (see &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/valohai.yaml"&gt;valohai.yaml&lt;/a&gt; file).&lt;/p&gt;

&lt;p&gt;For source code, see the class &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/src/main/java/org/deeplearning4j/feedforward/mnist/MLPMnistSingleLayerTrain.java"&gt;MLPMNistSingleLayerTrain.java&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evaluate a model&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Can be invoked from the command-line via:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    ./runMLPMnist.sh &lt;span class="nt"&gt;--action&lt;/span&gt; evaluate &lt;span class="nt"&gt;--input-dir&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VH_INPUTS_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/model

    or

    java &lt;span class="nt"&gt;-Djava&lt;/span&gt;.library.path&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;             &lt;span class="se"&gt;\&lt;/span&gt;
         &lt;span class="nt"&gt;-jar&lt;/span&gt; target/MLPMnist-1.0.0-bin.jar &lt;span class="se"&gt;\&lt;/span&gt;
         &lt;span class="nt"&gt;--action&lt;/span&gt; evaluate &lt;span class="nt"&gt;--input-dir&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VH_INPUTS_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/model
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This expects a model (created by the training step) by the name &lt;code&gt;mlpmnist-single-layer.pb&lt;/code&gt; to be present in the folder specified by the &lt;code&gt;--input-dir&lt;/code&gt; passed in when the app has been called.&lt;/p&gt;

&lt;p&gt;For source code, see class &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/src/main/java/org/deeplearning4j/feedforward/mnist/MLPMnistSingleLayerEvaluate.java"&gt;MLPMNistSingleLayerEvaluate.java&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I hope this short illustration makes it clear how the Java app that trains and evaluates the model works in general.&lt;/p&gt;

&lt;p&gt;That’s all that’s needed from us, but feel free to play with the rest of the &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example"&gt;source&lt;/a&gt; (along with the &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/README.md"&gt;README.md&lt;/a&gt; and bash scripts) to satisfy your curiosity and understanding of how this is done! Further resources on how to get going with DL4J are provided in the &lt;strong&gt;Resources&lt;/strong&gt; section at the end of the post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Valohai
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.valohai.com"&gt;Valohai&lt;/a&gt; as a platform allows us to loosely couple our runtime environment, our code, and our dataset, as you can see from the structure of the YAML file below. That way the different components can evolve independently without impeding or being dependent on one another. Hence our Docker container only has the build and runtime time components packed into it. At Execution time we build the uber jar in the Docker container, upload it to some internal or external storage, and then via another Execution step download the uber jar and dataset from storage (or another location) to run the training. This way the two execution steps are decoupled; we can e.g. build the jar once and run hundreds of training steps on the same jar. As the build and runtime environments should not change that often we can cache them and the code, dataset and model sources can be made dynamically available during Execution time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.valohai.com/valohai-yaml/index.html"&gt;&lt;strong&gt;valohai.yaml&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
The heart of integrating our Java project with the Valohai infrastructure is defining the steps of &lt;a href="https://docs.valohai.com/core-concepts/executions.html"&gt;Execution&lt;/a&gt; in the &lt;a href="https://docs.valohai.com/valohai-yaml/index.html"&gt;&lt;strong&gt;valohai.yaml&lt;/strong&gt;&lt;/a&gt; file placed in the root of your project folder. Our &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/valohai.yaml"&gt;valohai.yaml&lt;/a&gt; looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;---&lt;/span&gt;

    &lt;span class="s"&gt;- step&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build-dl4j-mnist-single-layer-java-app&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;neomatrix369/dl4j-mnist-single-layer:v0.5&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cd ${VH_REPOSITORY_DIR}&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./buildUberJar.sh&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "~~~ Copying the build jar file into ${VH_OUTPUTS_DIR}"&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cp target/MLPMnist-1.0.0-bin.jar ${VH_OUTPUTS_DIR}/MLPMnist-1.0.0.jar&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ls -lash ${VH_OUTPUTS_DIR}&lt;/span&gt;
        &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-eu-west-1-g2-2xlarge&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run-dl4j-mnist-single-layer-train-model&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;neomatrix369/dl4j-mnist-single-layer:v0.5&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "~~~ Unpack the MNist dataset into ${HOME} folder"&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;tar xvzf ${VH_INPUTS_DIR}/dataset/mlp-mnist-dataset.tgz -C ${HOME}&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cd ${VH_REPOSITORY_DIR}&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "~~~ Copying the build jar file from ${VH_INPUTS_DIR} to current location"&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cp ${VH_INPUTS_DIR}/dl4j-java-app/MLPMnist-1.0.0.jar .&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "~~~ Run the DL4J app to train model based on the the MNist dataset"&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./runMLPMnist.sh {parameters}&lt;/span&gt;
        &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dl4j-java-app&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DL4J Java app file (jar) generated in the previous step 'Build-dl4j-mnist-single-layer-java-app'&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dataset&lt;/span&gt;
            &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://github.com/neomatrix369/awesome-ai-ml-dl/releases/download/mnist-dataset-v0.1/mlp-mnist-dataset.tgz&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MNist dataset needed to train the model&lt;/span&gt;
        &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--action&lt;/span&gt;
            &lt;span class="na"&gt;pass-as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--action&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;{v}'&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;train&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Action to perform i.e. train or evaluate&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--output-dir&lt;/span&gt;
            &lt;span class="na"&gt;pass-as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--output-dir&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;{v}'&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/valohai/outputs/&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Output directory where the model will be created, best to pick the Valohai output directory&lt;/span&gt;
        &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-eu-west-1-g2-2xlarge&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run-dl4j-mnist-single-layer-evaluate-model&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;neomatrix369/dl4j-mnist-single-layer:v0.5&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cd ${VH_REPOSITORY_DIR}&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "~~~ Copying the build jar file from ${VH_INPUTS_DIR} to current location"&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cp ${VH_INPUTS_DIR}/dl4j-java-app/MLPMnist-1.0.0.jar .&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "~~~ Run the DL4J app to evaluate the trained MNist model"&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./runMLPMnist.sh {parameters}&lt;/span&gt;
        &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dl4j-java-app&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DL4J Java app file (jar) generated in the previous step 'Build-dl4j-mnist-single-layer-java-app'&lt;/span&gt;    
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;model&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Model file generated in the previous step 'Run-dl4j-mnist-single-layer-train-model'&lt;/span&gt;
        &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--action&lt;/span&gt;
            &lt;span class="na"&gt;pass-as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--action&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;{v}'&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;evaluate&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Action to perform i.e. train or evaluate&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--input-dir&lt;/span&gt;
            &lt;span class="na"&gt;pass-as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--input-dir&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;{v}'&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/valohai/inputs/model&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Input directory where the model created by the previous step can be found created&lt;/span&gt;
        &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-eu-west-1-g2-2xlarge&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Explanation of the step&lt;/em&gt; &lt;strong&gt;&lt;em&gt;Build-dl4j-mnist-single-layer-java-app&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
From the YAML file, we can see that we define this step by first specifying the Docker image and then running the build script to build the uber jar. Our Docker image has the build environment dependencies set up (i.e. GraalVM JDK, Maven, etc.) to build a Java app. We do not specify any inputs or parameters, as this is the build step. Once the build succeeds, we copy the uber jar called &lt;code&gt;MLPMnist-1.0.0-bin.jar&lt;/code&gt; (its original name) to the &lt;code&gt;/valohai/outputs&lt;/code&gt; folder (represented by &lt;code&gt;${VH_OUTPUTS_DIR}&lt;/code&gt;). Everything within this folder automatically gets persisted to your project’s storage, e.g. an AWS S3 bucket. Finally, we define our job to run in the AWS environment. &lt;/p&gt;
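The build script itself lives in the example repo and is not reproduced here, but the tail end of this step can be sketched roughly as below; the stand-in jar and the fallback output path are assumptions so the sketch runs outside Valohai.

```shell
# Rough sketch of the end of the build step (not the actual build script):
# put the renamed uber jar into Valohai's outputs folder so it is persisted
# to the project's storage (e.g. an AWS S3 bucket).
VH_OUTPUTS_DIR="${VH_OUTPUTS_DIR:-/tmp/valohai-outputs}"   # set by Valohai at runtime
mkdir -p "${VH_OUTPUTS_DIR}" target
touch target/MLPMnist-1.0.0-bin.jar                        # stand-in for the real Maven artifact
cp target/MLPMnist-1.0.0-bin.jar "${VH_OUTPUTS_DIR}/MLPMnist-1.0.0.jar"
```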

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;The Valohai free tier does not have network access from inside the Docker container (this is disabled by default); please contact support to enable this option (I had to do the same), or else we cannot download our Maven and other dependencies during build time.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Explanation of the step&lt;/em&gt; &lt;strong&gt;&lt;em&gt;Run-dl4j-mnist-single-layer-train-model&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
The semantics of the definition are similar to the previous step, except that we specify two inputs: one for the uber jar (&lt;code&gt;MLPMnist-1.0.0.jar&lt;/code&gt;) and the other for the dataset (to be unpacked into the &lt;code&gt;${HOME}/.deeplearning4j&lt;/code&gt; folder). We will be passing the two parameters &lt;code&gt;--action train&lt;/code&gt; and &lt;code&gt;--output-dir /valohai/outputs&lt;/code&gt;. The model created by this step is collected into the &lt;code&gt;/valohai/outputs/model&lt;/code&gt; folder (represented by &lt;code&gt;${VH_OUTPUTS_DIR}/model&lt;/code&gt;). &lt;/p&gt;
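To make the {parameters} placeholder concrete: each parameter's pass-as template ('--action {v}', '--output-dir {v}') is filled with its value, so with the defaults above the training command effectively expands as sketched below. The function here is a stand-in that only echoes; the real runMLPMnist.sh is in the example repo.

```shell
# Illustration only: how {parameters} expands for the training step, given the
# pass-as templates and defaults defined in valohai.yaml.
runMLPMnist() { echo "runMLPMnist.sh $*"; }   # stand-in for the real script
runMLPMnist --action train --output-dir /valohai/outputs/
# prints: runMLPMnist.sh --action train --output-dir /valohai/outputs/
```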

&lt;p&gt;&lt;em&gt;Note: In the Input fields in the Execution tab of the Valohai Web UI, we can select the outputs from previous Executions by using the Execution number, i.e.&lt;/em&gt; &lt;code&gt;#1&lt;/code&gt; &lt;em&gt;or&lt;/em&gt; &lt;code&gt;#2&lt;/code&gt;&lt;em&gt;, in addition to using datum:// or http:// URLs. Typing the first few letters of a file name also helps search through the whole list.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Explanation of the step&lt;/em&gt; &lt;strong&gt;&lt;em&gt;Run-dl4j-mnist-single-layer-evaluate-model&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Again, this step is similar to the previous one, except that we will be passing in the two parameters &lt;code&gt;--action evaluate&lt;/code&gt; and &lt;code&gt;--input-dir /valohai/inputs/model&lt;/code&gt;. We have also specified two &lt;code&gt;inputs:&lt;/code&gt; sections in the YAML file, called &lt;code&gt;dl4j-java-app&lt;/code&gt; and &lt;code&gt;model&lt;/code&gt;, with no &lt;code&gt;default&lt;/code&gt; set for either of them. This allows us to select, via the web interface, the uber jar and the model we wish to evaluate (created by the &lt;em&gt;step&lt;/em&gt; &lt;strong&gt;&lt;em&gt;Run-dl4j-mnist-single-layer-train-model&lt;/em&gt;&lt;/strong&gt;). &lt;/p&gt;

&lt;p&gt;I hope this explains the steps in the above definition file, but if you require further help, please do not hesitate to look at the &lt;a href="https://docs.valohai.com/index.html"&gt;docs&lt;/a&gt; and &lt;a href="https://docs.valohai.com/tutorials/index.html"&gt;tutorials&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://app.valohai.com/accounts/login/"&gt;&lt;strong&gt;valohai web interface&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once we have an account, we can &lt;a href="https://app.valohai.com/accounts/login/"&gt;sign in&lt;/a&gt;, create a project named &lt;code&gt;mlpmnist-single-layer&lt;/code&gt;, link the git repo &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/"&gt;https://github.com/valohai/mlpmnist-dl4j-example/&lt;/a&gt; to it, and save the project. Have a quick look at the tutorials to see &lt;a href="https://docs.valohai.com/tutorials/index.html"&gt;how to create a project using the web interface&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Now you can execute a step and see how it pans out!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building the DL4J Java app step&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Go to the &lt;strong&gt;Executions&lt;/strong&gt; tab in the web interface and either copy an existing execution or create a new one using the &lt;strong&gt;[Create execution]&lt;/strong&gt; button; all the necessary default options will be populated. Select the step &lt;strong&gt;&lt;em&gt;Build-dl4j-mnist-single-layer-java-app&lt;/em&gt;&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;For &lt;em&gt;Environment&lt;/em&gt; I would select &lt;em&gt;AWS eu-west-1 g2.2xlarge&lt;/em&gt; and click on the &lt;strong&gt;&lt;em&gt;[Create execution]&lt;/em&gt;&lt;/strong&gt; button at the bottom of the page to see the Execution kick off.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RqqLquMW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1401/0%2A7t0fjo-_p-Doo09u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RqqLquMW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1401/0%2A7t0fjo-_p-Doo09u.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Training the model step&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Go to the &lt;strong&gt;Executions&lt;/strong&gt; tab in the web interface, proceed as in the previous step, and select the step &lt;strong&gt;&lt;em&gt;Run-dl4j-mnist-single-layer-train-model&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;.&lt;/em&gt; You will have to select the Java app (just type &lt;em&gt;jar&lt;/em&gt; in the field) built in the previous step; the dataset has already been pre-populated via the &lt;strong&gt;&lt;em&gt;valohai.yaml&lt;/em&gt;&lt;/strong&gt; file:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--I5MFSgkK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1342/0%2ACyy14Lb8JSfJZrdz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--I5MFSgkK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1342/0%2ACyy14Lb8JSfJZrdz.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click on &lt;strong&gt;&lt;em&gt;[Create execution]&lt;/em&gt;&lt;/strong&gt; to kick off this step.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---WYWWVJJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1399/0%2AA8BZUz7_k5MG371Y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---WYWWVJJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1399/0%2AA8BZUz7_k5MG371Y.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You will see the model summary fly by in the log console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    [&amp;lt;--- snipped ---&amp;gt;]
    11:17:05 =========================================================================
    11:17:05 LayerName (LayerType) nIn,nOut TotalParams ParamsShape
    11:17:05 =========================================================================
    11:17:05 layer0 (DenseLayer) 784,1000 785000 W:{784,1000}, b:{1,1000}
    11:17:05 layer1 (OutputLayer) 1000,10 10010 W:{1000,10}, b:{1,10}
    11:17:05 -------------------------------------------------------------------------
    11:17:05  Total Parameters: 795010
    11:17:05  Trainable Parameters: 795010
    11:17:05  Frozen Parameters: 0
    11:17:05 =========================================================================
    [&amp;lt;--- snipped ---&amp;gt;]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The models created can be found under the &lt;strong&gt;Outputs&lt;/strong&gt; sub-tab in the &lt;strong&gt;Executions&lt;/strong&gt; main tab, during and at the end of the Execution:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BOymwmlg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1398/0%2AMgtDO7qLplEnTAVS.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BOymwmlg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1398/0%2AMgtDO7qLplEnTAVS.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You might have noticed several artifacts in the &lt;strong&gt;Outputs&lt;/strong&gt; sub-tab. That’s because we save a checkpoint at the end of each epoch! Look out for these in the Execution logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    [&amp;lt;--- snipped ---&amp;gt;]
    11:17:14 o.d.o.l.CheckpointListener - Model checkpoint saved: epoch 0, iteration 469, path: /valohai/outputs/checkpoint_0_MultiLayerNetwork.zip
    [&amp;lt;--- snipped ---&amp;gt;]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The checkpoint zip contains the state of the model training at that point, saved in these three files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    configuration.json
    coefficients.bin
    updaterState.bin
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Training the model &amp;gt; Metadata&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You might have noticed these notations fly by in the Execution logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    [&amp;lt;--- snipped ---&amp;gt;]
    11:17:05 {"epoch": 0, "iteration": 0, "score (loss function)": 2.410047}
    11:17:07 {"epoch": 0, "iteration": 100, "score (loss function)": 0.613774}
    11:17:09 {"epoch": 0, "iteration": 200, "score (loss function)": 0.528494}
    11:17:11 {"epoch": 0, "iteration": 300, "score (loss function)": 0.400291}
    11:17:13 {"epoch": 0, "iteration": 400, "score (loss function)": 0.357800}
    11:17:14 o.d.o.l.CheckpointListener - Model checkpoint saved: epoch 0, iteration 469, path: /valohai/outputs/checkpoint_0_MultiLayerNetwork.zip
    [&amp;lt;--- snipped ---&amp;gt;]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;These notations trigger &lt;a href="https://www.valohai.com"&gt;Valohai&lt;/a&gt; to pick up these values (in JSON format) to be used to plot Execution metrics, which can be seen during and after the Execution under the &lt;strong&gt;Metadata&lt;/strong&gt; sub-tab in the &lt;strong&gt;Executions&lt;/strong&gt; main tab:&lt;/p&gt;
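Valohai simply watches stdout for JSON objects, which is exactly what the ValohaiMetadataCreator listener (shown further down) does from Java. As an illustration of just this stdout convention, the same notation can be produced from plain shell:

```shell
# Print one metadata point per line as a JSON object; Valohai picks these up
# from stdout and plots them under the Metadata sub-tab.
epoch=0; iteration=100; score=0.613774
printf '{"epoch": %s, "iteration": %s, "score (loss function)": %s}\n' \
    "$epoch" "$iteration" "$score"
# prints: {"epoch": 0, "iteration": 100, "score (loss function)": 0.613774}
```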

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--N-9Kry8G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1396/0%2ASwM_Lc3efuH7h_9P.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--N-9Kry8G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1396/0%2ASwM_Lc3efuH7h_9P.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We were able to do this by hooking a listener class (called &lt;a href="https://github.com/valohai/mlpmnist-dl4j-example/blob/master/src/main/java/org/deeplearning4j/feedforward/mnist/ValohaiMetadataCreator.java"&gt;ValohaiMetadataCreator&lt;/a&gt;) into the model, so that control passes to this listener at the end of each iteration during training. In this class, we print the &lt;strong&gt;&lt;em&gt;epoch count&lt;/em&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;em&gt;iteration count&lt;/em&gt;&lt;/strong&gt;, and &lt;strong&gt;&lt;em&gt;the score (the loss function value)&lt;/em&gt;&lt;/strong&gt;; here is a code snippet from the class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight java"&gt;&lt;code&gt;        &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;iterationDone&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Model&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;printIterations&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;printIterations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;printIterations&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;score&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"{\"epoch\": %d, \"iteration\": %d, \"score (loss function)\": %f}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Evaluating the model step&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once the model has been successfully created via the previous step, we are ready to evaluate it. We create a new Execution just like we did previously but this time select the &lt;strong&gt;&lt;em&gt;Run-dl4j-mnist-single-layer-evaluate-model&lt;/em&gt;&lt;/strong&gt; step. We will need to select the Java app (&lt;strong&gt;&lt;em&gt;MLPMnist-1.0.0.jar&lt;/em&gt;&lt;/strong&gt;) again and the created model (&lt;strong&gt;&lt;em&gt;mlpmnist-single-layer.pb&lt;/em&gt;&lt;/strong&gt;) before kicking off the Execution (as shown below):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3BeJBPuk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1397/0%2AX9kILtQUEgZtY-91.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3BeJBPuk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1397/0%2AX9kILtQUEgZtY-91.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After selecting the desired model as input, click on the &lt;strong&gt;&lt;em&gt;[Create execution]&lt;/em&gt;&lt;/strong&gt; button. It is a quicker Execution step than the previous one and we will see the following output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YpXu4hRq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1340/0%2Aazaua3UlAyVkPSPX.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YpXu4hRq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1340/0%2Aazaua3UlAyVkPSPX.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;&lt;em&gt;Evaluation Metrics&lt;/em&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;em&gt;Confusion Matrix&lt;/em&gt;&lt;/strong&gt; post model analysis will be displayed in the console logs. &lt;/p&gt;

&lt;p&gt;We can see that our training activity has resulted in a model that is nearly &lt;strong&gt;97%&lt;/strong&gt; accurate on the test dataset. The confusion matrix helps point out the instances where a digit has been incorrectly predicted as another digit. This may be useful feedback for the creator of the model and the maintainer of the dataset to investigate further.&lt;/p&gt;

&lt;p&gt;The question remains (and is outside the scope of this post) — how good is the model when faced with &lt;strong&gt;&lt;em&gt;real-world data&lt;/em&gt;&lt;/strong&gt;?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.valohai.com/valohai-cli/installation.html?highlight=cli"&gt;&lt;strong&gt;valohai CLI&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
It’s easy to install and get started with the CLI tool, see &lt;a href="https://docs.valohai.com/valohai-cli/installation.html?highlight=cli"&gt;Command-line Usage&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you haven’t yet cloned the git repository then here’s what to do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;git clone https://github.com/valohai/mlpmnist-dl4j-example/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We then need to link the Valohai project created via the web interface in the above section to the project stored on our local machine (the one we just cloned). Run the commands below to do that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;mlpmnist-dl4j-example
    &lt;span class="nv"&gt;$ &lt;/span&gt;vh project &lt;span class="nt"&gt;--help&lt;/span&gt;   &lt;span class="c"&gt;### to see all the project-specific options we have for Valohai&lt;/span&gt;
    &lt;span class="nv"&gt;$ &lt;/span&gt;vh project &lt;span class="nb"&gt;link&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;You will be shown something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    [  1] mlpmnist-single-layer
    ...
    Which project would you like to link with /path/to/mlpmnist-dl4j-example?
    Enter [n] to create a new project.:
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Select 1 (or whichever entry applies to you) and you should see this message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    😁  Success! Linked /path/to/mlpmnist-dl4j-example to mlpmnist-single-layer.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The quickest way to see all the options the CLI tool offers is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;vh &lt;span class="nt"&gt;--help&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;One more thing: before going any further, ensure that your &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; project is in sync with the latest commits in the git repository, by doing this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;vh project fetch
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--o-ej-MzR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1407/0%2A9b0OmE6SelZXsvZ6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--o-ej-MzR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1407/0%2A9b0OmE6SelZXsvZ6.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(on the top right side in your web interface, shown with the two-arrows-pointing-to-each-other icon).&lt;/p&gt;

&lt;p&gt;Now we can execute the steps from the CLI with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;vh &lt;span class="nb"&gt;exec &lt;/span&gt;run Build-dl4j-mnist-single-layer-java-app
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Once the Execution is on, we can inspect and monitor it via:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;vh &lt;span class="nb"&gt;exec &lt;/span&gt;info
    &lt;span class="nv"&gt;$ &lt;/span&gt;vh &lt;span class="nb"&gt;exec &lt;/span&gt;logs
    &lt;span class="nv"&gt;$ &lt;/span&gt;vh &lt;span class="nb"&gt;exec &lt;/span&gt;watch
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We can also see the above updates via the web interface at the same time.&lt;/p&gt;

&lt;p&gt;Further resources on how to use &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt; have been provided in the &lt;strong&gt;Resources&lt;/strong&gt; section at the end of the post; there are also a couple of blog posts on how to use the CLI tool, see &lt;a href="https://blog.valohai.com/from-zero-to-hero-with-valohai-cli"&gt;[1]&lt;/a&gt; | &lt;a href="https://blog.valohai.com/from-zero-to-hero-with-valohai-part-2"&gt;[2]&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;As you have seen, both &lt;a href="https://deeplearning4j.org"&gt;DL4J&lt;/a&gt; and &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt;, individually or combined, are fairly easy to get started with. Further, we can develop the different components that make up our experiments (build/runtime environment, code, and dataset) and integrate them into an Execution in a loosely coupled manner.&lt;/p&gt;

&lt;p&gt;The template examples used in this post are a good way to get started with building more complex projects. And you can use either the web interface or the CLI to get your job done with &lt;a href="https://valohai.com/"&gt;Valohai&lt;/a&gt;! With the CLI you can also integrate it with your own setup and scripts (or even with cron or CI/CD jobs).&lt;/p&gt;
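As a hypothetical sketch of such an integration: the vh function below is a stand-in that merely echoes the commands so the snippet runs anywhere; remove it to invoke the real valohai-cli, e.g. from a cron job or CI pipeline. The step names come from the valohai.yaml shown earlier.

```shell
# Hypothetical automation wrapper around the Valohai CLI.
vh() { echo "vh $*"; }   # stand-in: remove this line to use the real valohai-cli

vh project fetch                                      # sync with the latest git commits
vh exec run Build-dl4j-mnist-single-layer-java-app    # rebuild the uber jar
vh exec run Run-dl4j-mnist-single-layer-train-model   # then retrain the model
```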

&lt;p&gt;Also, it’s clear that if I’m working on an AI/ML/DL related project I &lt;em&gt;don’t need to concern myself with creating and maintaining&lt;/em&gt; an end-to-end pipeline (which many others and I have had to do in the past) — thanks to the good work by the folks at &lt;a href="https://valohai.com/"&gt;&lt;strong&gt;Valohai&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Thanks to both &lt;a href="http://skymind.com"&gt;&lt;strong&gt;Skymind&lt;/strong&gt;&lt;/a&gt; (the startup behind &lt;a href="https://deeplearning4j.org"&gt;DL4J&lt;/a&gt;) for creating, maintaining, and keeping it free, and to &lt;a href="https://valohai.com/"&gt;&lt;strong&gt;Valohai&lt;/strong&gt;&lt;/a&gt; for making their tool and cloud service available for both free and commercial use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Please do let me know if this is helpful by dropping a line in the comments below or by tweeting at&lt;/strong&gt; &lt;a href="http://twitter.com/@theNeomatrix369"&gt;&lt;strong&gt;@theNeomatrix369&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;, and I would also welcome feedback, see how you can&lt;/strong&gt; &lt;a href="https://neomatrix369.wordpress.com/about/"&gt;&lt;strong&gt;reach me&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;, above all please check out the links mentioned above.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Resources
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/valohai/mlpmnist-dl4j-example"&gt;mlpmnist-dl4j-examples project on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/"&gt;Awesome AI/ML/DL resources&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/README-details.md#java"&gt;Java AI/ML/DL resources&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/README-details.md#deep-learning"&gt;Deep Learning and DL4J Resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Additional DL4J resources&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Loss functions 

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/api/latest/org/nd4j/linalg/lossfunctions/ILossFunction.html"&gt;Loss function Interface by DL4J&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://heartbeat.fritz.ai/5-regression-loss-functions-all-machine-learners-should-know-4fb140e9d4b0"&gt;5 Regression Loss Functions All Machine Learners Should Know&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Evaluation

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://deeplearning4j.org/docs/latest/deeplearning4j-nn-evaluation"&gt;https://deeplearning4j.org/docs/latest/deeplearning4j-nn-evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Valohai resources&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.valohai.com/"&gt;valohai&lt;/a&gt; | &lt;a href="https://docs.valohai.com/"&gt;docs&lt;/a&gt; | &lt;a href="https://blogs.valohai.com/"&gt;blogs&lt;/a&gt; | &lt;a href="https://github.com/valohai"&gt;GitHub&lt;/a&gt; | &lt;a href="https://www.youtube.com/channel/UCiR8Fpv6jRNphaZ99PnIuFg"&gt;Videos&lt;/a&gt; | &lt;a href="https://valohai.com/showcase/"&gt;Showcase&lt;/a&gt; | &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/data/about-Valohai.md#valohai"&gt;About valohai&lt;/a&gt; | &lt;a href="http://community-slack.valohai.com/"&gt;Slack&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Blog posts on how to use the CLI tool: &lt;a href="https://blog.valohai.com/from-zero-to-hero-with-valohai-cli"&gt;[1]&lt;/a&gt; | &lt;a href="https://blog.valohai.com/from-zero-to-hero-with-valohai-part-2"&gt;[2]&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/awesome-graal"&gt;Awesome Graal&lt;/a&gt; |  &lt;a href="http://graalvm.org"&gt;graalvm.org&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer, mainly in the Java/JVM space, currently strengthening teams and helping them accelerate, working with small teams and startups as a freelance software, data, and ML engineer, &lt;a href="https://neomatrix369.wordpress.com/about"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>java</category>
      <category>graalvm</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>How to build Graal-enabled JDK8 on CircleCI?</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Tue, 13 Aug 2019 18:27:25 +0000</pubDate>
      <link>https://dev.to/neomatrix369/how-to-build-graal-enabled-jdk8-on-circleci-14ed</link>
      <guid>https://dev.to/neomatrix369/how-to-build-graal-enabled-jdk8-on-circleci-14ed</guid>
      <description>&lt;p&gt;The &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; is a replacement to HotSpot’s server-side &lt;a href="https://dzone.com/articles/just-time-compiler-jit-hotspot" rel="noopener noreferrer"&gt;JIT compiler&lt;/a&gt; widely known as the &lt;a href="http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html" rel="noopener noreferrer"&gt;C2 compiler&lt;/a&gt;. It is written in Java with the goal of better performance (among other goals) as compared to the C2 compiler. New changes starting with Java 9 mean that we can now plug in our own hand-written C2 compiler into the JVM, thanks to &lt;a href="https://openjdk.java.net/jeps/243" rel="noopener noreferrer"&gt;JVMCI&lt;/a&gt;. The researchers and engineers at &lt;a href="https://labs.oracle.com/pls/apex/f?p=LABS:10::::::" rel="noopener noreferrer"&gt;Oracle Labs&lt;/a&gt; have created a variant of JDK8 with JVMCI enabled which can be used to build the [&lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt;](&lt;a href="http://wikipedia.com/graal-compiler" rel="noopener noreferrer"&gt;http://wikipedia.com/graal-compiler&lt;/a&gt;). The &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; is open source and is &lt;a href="http://github.com/oracle/graal" rel="noopener noreferrer"&gt;available on GitHub&lt;/a&gt; (along with the &lt;a href="https://github.com/graalvm/graal-jvmci-8" rel="noopener noreferrer"&gt;HotSpot JVMCI sources&lt;/a&gt; needed to build the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt;). This gives us the ability to fork/clone it and build our own version of the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt;. 
&lt;/p&gt;

&lt;p&gt;In this post, we are going to build the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; with JDK8 on CircleCI. The resulting artifacts are going to be: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JDK8 embedded with the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt;, and&lt;/li&gt;
&lt;li&gt;a zip archive containing Graal &amp;amp; Truffle modules/components.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note: we are not covering how to build the whole of the GraalVM suite in this post; that may be covered in a future post. These scripts can, however, be used to do that, and there exists a&lt;/em&gt;&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369/awesome-graal/tree/build-graalvm-suite" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;branch which contains the rest of the steps&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;&lt;em&gt;.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why use a CI tool to build the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt;?
&lt;/h2&gt;

&lt;p&gt;Continuous integration (CI) and continuous deployment (CD) tools have many benefits. One of the greatest is the ability to check the health of the code-base. Seeing why your builds are failing provides you with an opportunity to make a fix faster. For this project, it is important that we are able to verify and validate the scripts required to build the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; for Linux and macOS, both locally and in a Docker container. A CI/CD tool lets us add automated tests to ensure that we get the desired outcome from our scripts when every PR is merged. In addition to ensuring that our new code does not introduce a breaking change, another great feature of CI/CD tools is that we can automate the creation of binaries and the automatic deployment of those binaries, making them available for open source distribution.&lt;/p&gt;

&lt;h1&gt;
  
  
  Let’s get started
&lt;/h1&gt;

&lt;p&gt;During the process of researching &lt;a href="http://circleci.com" rel="noopener noreferrer"&gt;CircleCI&lt;/a&gt; as a CI/CD solution to build the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt;, I learned that we could run builds via two different approaches, namely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A CircleCI build with a standard Docker container (longer build time, longer config script)&lt;/li&gt;
&lt;li&gt;A CircleCI build with a pre-built, optimised Docker container (shorter build time, shorter config script)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We will now go through the two approaches mentioned above and see the pros and cons of both of them.&lt;/p&gt;

&lt;h1&gt;
  
  
  Approach 1: using a standard Docker container
&lt;/h1&gt;

&lt;p&gt;For this approach, CircleCI requires a Docker image that is available on &lt;a href="https://hub.docker.com/" rel="noopener noreferrer"&gt;Docker Hub&lt;/a&gt; or in another public/private registry it has access to. We have to install the necessary dependencies into this environment for the build to succeed. We expect the build to run longer the first time; depending on the level of caching, subsequent runs will speed up. &lt;/p&gt;

&lt;p&gt;To understand how this is done, we will go through the CircleCI configuration file (stored in &lt;code&gt;.circleci/config.yml&lt;/code&gt;) section by section. See &lt;a href="https://github.com/neomatrix369/awesome-graal/blob/build-on-circleci/.circleci/config.yml" rel="noopener noreferrer"&gt;config.yml in .circleci&lt;/a&gt; for the full listing, and commit &lt;a href="https://github.com/neomatrix369/awesome-graal/commit/df28ee78be289a364dcdf3bdd626bd4a40460a28" rel="noopener noreferrer"&gt;df28ee7&lt;/a&gt; for the source changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Explaining sections of the config file
&lt;/h2&gt;

&lt;p&gt;The below lines in the configuration file will ensure that our installed applications are cached (referring to the two specific directories) so that we don’t have to reinstall the dependencies each time a build occurs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    dependencies:
      cache_directories:
        - "vendor/apt"
        - "vendor/apt/archives"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will be referring to the Docker image by its full name (as available on &lt;a href="http://hub.docker.com" rel="noopener noreferrer"&gt;http://hub.docker.com&lt;/a&gt; under the account name used - &lt;em&gt;adoptopenjdk&lt;/em&gt;). In this case, it is a standard Docker image containing JDK8, made available by the good folks behind the &lt;a href="http://ci.adoptopenjdk.net" rel="noopener noreferrer"&gt;Adopt OpenJDK build farm&lt;/a&gt;. In theory, we can use any image as long as it supports the build process. It will act as the base layer on which we will install the necessary dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        docker:
          - image: adoptopenjdk/openjdk8:jdk8u152-b16 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, in the pre-&lt;code&gt;Install Os dependencies&lt;/code&gt; step, we restore the cache if it already exists. This may look a bit odd, but the below implementation, with its unique key labels, is &lt;a href="https://circleci.com/docs/2.0/configuration-reference/#restore_cache" rel="noopener noreferrer"&gt;recommended by the docs&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - restore_cache:
              keys:
                - os-deps-{{ arch }}-{{ .Branch }}-{{ .Environment.CIRCLE_SHA1 }}
                - os-deps-{{ arch }}-{{ .Branch }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in the &lt;code&gt;Install Os dependencies&lt;/code&gt; step, we run the respective shell script to install the dependencies needed. We have set this step to time out if the operation takes longer than 2 minutes to complete (see &lt;a href="https://discuss.circleci.com/t/how-to-increase-timeout-for-command-in-circle-2-0/20254" rel="noopener noreferrer"&gt;docs for timeout&lt;/a&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Install Os dependencies
              command: ./build/x86_64/linux_macos/osDependencies.sh
              timeout: 2m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in the post-&lt;code&gt;Install Os dependencies&lt;/code&gt; step, we save the results of the previous step - the layer from the above run step (the key name is formatted to ensure uniqueness, and the specific paths to save are included):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - save_cache:
              key: os-deps-{{ arch }}-{{ .Branch }}-{{ .Environment.CIRCLE_SHA1 }}
              paths:
                - vendor/apt
                - vendor/apt/archives
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in the pre-&lt;code&gt;Build and install make via script&lt;/code&gt; step, we restore the cache if one already exists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - restore_cache:
              keys:
                - make-382-{{ arch }}-{{ .Branch }}-{{ .Environment.CIRCLE_SHA1 }}
                - make-382-{{ arch }}-{{ .Branch }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in the &lt;code&gt;Build and install make via script&lt;/code&gt; step, we run the shell script to install a specific version of &lt;code&gt;make&lt;/code&gt;; it is set to time out if the step takes longer than 1 minute to finish:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Build and install make via script
              command: ./build/x86_64/linux_macos/installMake.sh
              timeout: 1m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in the post-&lt;code&gt;Build and install make via script&lt;/code&gt; step, we save the results of the above action to the cache:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - save_cache:
              key: make-382-{{ arch }}-{{ .Branch }}-{{ .Environment.CIRCLE_SHA1 }}
              paths:
                - /make-3.82/
                - /usr/bin/make
                - /usr/local/bin/make
                - /usr/share/man/man1/make.1.gz
                - /lib/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we define environment variables to update &lt;code&gt;JAVA_HOME&lt;/code&gt; and &lt;code&gt;PATH&lt;/code&gt; at runtime. The environment variables are &lt;code&gt;source&lt;/code&gt;d here so that they remain available to all subsequent steps until the end of the build process (please keep this in mind):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Define Environment Variables and update JAVA_HOME and PATH at Runtime
              command: |
                echo '....'     &amp;lt;== a number of echo-es displaying env variable values
                source ${BASH_ENV}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
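
&lt;p&gt;To see why this &lt;code&gt;source ${BASH_ENV}&lt;/code&gt; pattern is needed, the sketch below simulates the behaviour outside CircleCI (a minimal, hypothetical stand-in: &lt;code&gt;step1&lt;/code&gt; and &lt;code&gt;step2&lt;/code&gt; mimic two separate &lt;code&gt;run&lt;/code&gt; steps, each executed in a fresh shell, and the &lt;code&gt;JAVA_HOME&lt;/code&gt; path is illustrative):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Each CircleCI `run` step executes in a fresh shell, so a plain `export`
# does not survive into the next step. Appending `export` lines to the
# file named by ${BASH_ENV} and sourcing it in later steps carries the
# values forward. Here ${BASH_ENV} is a temp file standing in for the
# one CircleCI provides.
BASH_ENV="$(mktemp)"
export BASH_ENV

step1() {
  # a step that records JAVA_HOME for later steps (path is illustrative)
  bash -c 'echo "export JAVA_HOME=/opt/jdk8-jvmci" | tee -a ${BASH_ENV}'
}

step2() {
  # a later step picks the value up by sourcing ${BASH_ENV} first
  bash -c 'source ${BASH_ENV}; echo "JAVA_HOME=${JAVA_HOME}"'
}

step1
step2   # prints: JAVA_HOME=/opt/jdk8-jvmci
```

&lt;p&gt;The same mechanism appears again later, when the build step appends a &lt;code&gt;JAVA_HOME&lt;/code&gt; override to &lt;code&gt;${BASH_ENV}&lt;/code&gt;.&lt;/p&gt;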



&lt;p&gt;Then, in the step to &lt;code&gt;Display Hardware, Software, Runtime environment and dependency versions&lt;/code&gt;, as best practice we display environment-specific information and record it into the logs for posterity (also useful during debugging when things go wrong):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Display HW, SW, Runtime env. info and versions of dependencies
              command: ./build/x86_64/linux_macos/lib/displayDependencyVersion.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we run the step to &lt;code&gt;setup MX&lt;/code&gt; - this is an important step from the point of view of the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; (&lt;a href="https://github.com/graalvm/mx" rel="noopener noreferrer"&gt;mx&lt;/a&gt; is a specialised build system created to facilitate compiling and building Graal/GraalVM and its components):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Setup MX
              command: ./build/x86_64/linux_macos/lib/setupMX.sh ${BASEDIR}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we run the important step to &lt;code&gt;Build JDK JVMCI&lt;/code&gt; (we build the JDK with JVMCI enabled here), which times out if the process runs for longer than 15 minutes without any output, or takes longer than 20 minutes in total to finish:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Build JDK JVMCI
              command: ./build/x86_64/linux_macos/lib/build_JDK_JVMCI.sh ${BASEDIR} ${MX}
              timeout: 20m
              no_output_timeout: 15m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we run the step &lt;code&gt;Run JDK JVMCI Tests&lt;/code&gt;, which runs tests as part of the sanity check after building the JDK JVMCI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Run JDK JVMCI Tests
              command: ./build/x86_64/linux_macos/lib/run_JDK_JVMCI_Tests.sh ${BASEDIR} ${MX}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we run the step &lt;code&gt;Setting up environment and Build GraalVM Compiler&lt;/code&gt; to set up the build environment with the necessary environment variables, which will be used by the steps that follow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Setting up environment and Build GraalVM Compiler
              command: |
                echo "&amp;gt;&amp;gt;&amp;gt;&amp;gt; Currently JAVA_HOME=${JAVA_HOME}"
                JDK8_JVMCI_HOME="$(cd ${BASEDIR}/graal-jvmci-8/ &amp;amp;&amp;amp; ${MX} --java-home ${JAVA_HOME} jdkhome)"
                echo "export JVMCI_VERSION_CHECK='ignore'" &amp;gt;&amp;gt; ${BASH_ENV}
                echo "export JAVA_HOME=${JDK8_JVMCI_HOME}" &amp;gt;&amp;gt; ${BASH_ENV}
                source ${BASH_ENV}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we run the step &lt;code&gt;Build the GraalVM Compiler and embed it into the JDK (JDK8 with JVMCI enabled)&lt;/code&gt;, which times out if the process runs for longer than 7 minutes without any output, or takes longer than 10 minutes in total to finish:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Build the GraalVM Compiler and embed it into the JDK (JDK8 with JVMCI enabled)
              command: |
                echo "&amp;gt;&amp;gt;&amp;gt;&amp;gt; Using JDK8_JVMCI_HOME as JAVA_HOME (${JAVA_HOME})"
                ./build/x86_64/linux_macos/lib/buildGraalCompiler.sh ${BASEDIR} ${MX} ${BUILD_ARTIFACTS_DIR}
              timeout: 10m
              no_output_timeout: 7m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we run the simple sanity checks to verify the validity of the artifacts created once a build has been completed, just before archiving the artifacts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Sanity check artifacts
              command: |
                ./build/x86_64/linux_macos/lib/sanityCheckArtifacts.sh ${BASEDIR} ${JDK_GRAAL_FOLDER_NAME}
              timeout: 3m
              no_output_timeout: 2m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we run the step &lt;code&gt;Archiving artifacts&lt;/code&gt; (compressing and copying the final artifacts into a separate folder), which times out if the process runs for longer than 2 minutes without any output, or takes longer than 3 minutes in total to finish:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Archiving artifacts
              command: |
                ./build/x86_64/linux_macos/lib/archivingArtifacts.sh ${BASEDIR} ${MX} ${JDK_GRAAL_FOLDER_NAME} ${BUILD_ARTIFACTS_DIR}
              timeout: 3m
              no_output_timeout: 2m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For posterity and debugging purposes, we capture the generated logs from the various folders and archive them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - run:
              name: Collecting and archiving logs (debug and error logs)
              command: |
                ./build/x86_64/linux_macos/lib/archivingLogs.sh ${BASEDIR}
              timeout: 3m
              no_output_timeout: 2m
              when: always
          - store_artifacts:
              name: Uploading logs
              path: logs/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
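
&lt;p&gt;&lt;code&gt;archivingLogs.sh&lt;/code&gt; is specific to this repository, but the core of such a log-collection step usually boils down to gathering files into the &lt;code&gt;logs/&lt;/code&gt; folder that the subsequent &lt;code&gt;store_artifacts&lt;/code&gt; directive uploads. A minimal, hypothetical sketch (file names and paths are illustrative):&lt;/p&gt;

```shell
#!/usr/bin/env bash
set -e
# Collect *.log files from the workspace into logs/ so that the
# store_artifacts step can upload them (paths are illustrative).
WORKDIR="$(mktemp -d)"
mkdir -p "${WORKDIR}/logs"
echo "sample build output" | tee "${WORKDIR}/build.log"
find "${WORKDIR}" -maxdepth 1 -name '*.log' -exec cp {} "${WORKDIR}/logs/" \;
ls "${WORKDIR}/logs"   # prints: build.log
```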



&lt;p&gt;Finally, we store the generated artifacts at a specified location - the below lines will make the location available on the CircleCI interface (we can download the artifacts from here):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          - store_artifacts:
              name: Uploading artifacts in jdk8-with-graal-local
              path: jdk8-with-graal-local/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Approach 2: using a pre-built optimised Docker container
&lt;/h1&gt;

&lt;p&gt;For approach 2, we will be using a pre-built Docker container that has been created and built locally with all the necessary dependencies, then saved and pushed to a remote registry, e.g. &lt;a href="https://hub.docker.com/" rel="noopener noreferrer"&gt;Docker Hub&lt;/a&gt;. We then reference this Docker image in the CircleCI environment via the configuration file. This saves us the time and effort of running all the commands that install the necessary dependencies and create the required environment (see the detailed steps in Approach 1). &lt;/p&gt;

&lt;p&gt;We expect this build to run for a shorter time than the previous one; this speedup is a result of the pre-built Docker image (see &lt;strong&gt;Steps to build the pre-built docker image&lt;/strong&gt; for how this is done). An additional speed benefit comes from the fact that CircleCI caches the Docker image layers, which in turn results in a quicker startup of the build environment.&lt;/p&gt;

&lt;p&gt;We will be going through the CircleCI configuration file (stored in &lt;code&gt;.circleci/config.yml&lt;/code&gt;) section by section for this approach. See &lt;a href="https://github.com/neomatrix369/awesome-graal/blob/build-on-circleci-using-pre-built-docker-container/.circleci/config.yml" rel="noopener noreferrer"&gt;config.yml in .circleci&lt;/a&gt; for the full listing, and commit &lt;a href="https://github.com/neomatrix369/awesome-graal/commit/e5916f1ffbc6ca0b8f5cd2ed71a6883e5a2de031" rel="noopener noreferrer"&gt;e5916f1&lt;/a&gt; for the source changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Explaining sections of the config file
&lt;/h2&gt;

&lt;p&gt;Here again, we will be referring to the Docker image by its full name. It is a pre-built Docker image, &lt;a href="https://hub.docker.com/r/neomatrix369/graalvm-suite-jdk8" rel="noopener noreferrer"&gt;neomatrix369/graalvm-suite-jdk8&lt;/a&gt;, made available by &lt;a href="https://hub.docker.com/u/neomatrix369" rel="noopener noreferrer"&gt;neomatrix369&lt;/a&gt;. It was built and uploaded to &lt;a href="https://hub.docker.com/" rel="noopener noreferrer"&gt;Docker Hub&lt;/a&gt; before the CircleCI build was started, and contains the necessary dependencies for the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; to be built:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        docker:
          - image: neomatrix369/graal-jdk8:${IMAGE_VERSION:-python-2.7}
        steps:
          - checkout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All the sections below perform the exact same tasks (and for the same purpose) as in Approach 1; see &lt;strong&gt;Explaining sections of the config file&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The exception is that we have removed the below sections, as they are no longer required for &lt;strong&gt;Approach 2&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    - restore_cache:
              keys:
                - os-deps-{{ arch }}-{{ .Branch }}-{{ .Environment.CIRCLE_SHA1 }}
                - os-deps-{{ arch }}-{{ .Branch }}
          - run:
              name: Install Os dependencies
              command: ./build/x86_64/linux_macos/osDependencies.sh
              timeout: 2m
          - save_cache:
              key: os-deps-{{ arch }}-{{ .Branch }}-{{ .Environment.CIRCLE_SHA1 }}
              paths:
                - vendor/apt
                - vendor/apt/archives
          - restore_cache:
              keys:
                - make-382-{{ arch }}-{{ .Branch }}-{{ .Environment.CIRCLE_SHA1 }}
                - make-382-{{ arch }}-{{ .Branch }}
          - run:
              name: Build and install make via script
              command: ./build/x86_64/linux_macos/installMake.sh
              timeout: 1m
          - save_cache:
              key: make-382-{{ arch }}-{{ .Branch }}-{{ .Environment.CIRCLE_SHA1 }}
              paths:
                - /make-3.82/
                - /usr/bin/make
                - /usr/local/bin/make
                - /usr/share/man/man1/make.1.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the following section, I will go through the steps that show how to build the pre-built Docker image. This involves running the bash scripts &lt;code&gt;./build/x86_64/linux_macos/osDependencies.sh&lt;/code&gt; and &lt;code&gt;./build/x86_64/linux_macos/installMake.sh&lt;/code&gt; to install the necessary dependencies as part of building the Docker image, and finally pushing the image to &lt;a href="https://hub.docker.com/" rel="noopener noreferrer"&gt;Docker Hub&lt;/a&gt; (it can be pushed to any other remote registry of your choice).&lt;/p&gt;

&lt;h2&gt;
  
  
  Steps to build the pre-built docker image
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Run &lt;code&gt;build-docker-image.sh&lt;/code&gt; (see bash script source), which depends on the presence of a &lt;code&gt;Dockerfile&lt;/code&gt; (see docker script source). The &lt;code&gt;Dockerfile&lt;/code&gt; does all the necessary tasks of installing the dependencies inside the container, i.e. it runs the bash scripts &lt;code&gt;./build/x86_64/linux_macos/osDependencies.sh&lt;/code&gt; and &lt;code&gt;./build/x86_64/linux_macos/installMake.sh&lt;/code&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ ./build-docker-image.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
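
&lt;p&gt;&lt;code&gt;build-docker-image.sh&lt;/code&gt; essentially wraps a &lt;code&gt;docker build&lt;/code&gt; invocation; a hypothetical sketch of what it boils down to (the image name and tag here are assumptions, not the script’s actual defaults):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Compose the `docker build` command the script would run from the
# repository root; the Dockerfile installs the OS dependencies and the
# required make version into the image during the build.
IMAGE_NAME="${IMAGE_NAME:-graal-jdk8}"
IMAGE_VERSION="${IMAGE_VERSION:-python-2.7}"
BUILD_CMD="docker build -t ${IMAGE_NAME}:${IMAGE_VERSION} ."
echo "${BUILD_CMD}"
```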



&lt;ul&gt;
&lt;li&gt;Once the image has been built successfully, run &lt;code&gt;push-graal-docker-image-to-hub.sh&lt;/code&gt; after setting the &lt;code&gt;USER_NAME&lt;/code&gt; and &lt;code&gt;IMAGE_NAME&lt;/code&gt; variables (see source code); otherwise it will use the default values set in the bash script:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    $ USER_NAME="[your docker hub username]" IMAGE_NAME="[any image name]" \
        ./push-graal-docker-image-to-hub.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
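
&lt;p&gt;The &lt;code&gt;USER_NAME&lt;/code&gt;/&lt;code&gt;IMAGE_NAME&lt;/code&gt; defaulting relies on Bash’s &lt;code&gt;${VAR:-default}&lt;/code&gt; parameter expansion - the same mechanism the config file uses in &lt;code&gt;${IMAGE_VERSION:-python-2.7}&lt;/code&gt;. A minimal sketch (the default values here are illustrative, not the script’s actual ones):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Fall back to defaults when USER_NAME / IMAGE_NAME are not set in the
# environment, mirroring the ${VAR:-default} pattern the push script uses.
USER_NAME="${USER_NAME:-my-docker-hub-user}"
IMAGE_NAME="${IMAGE_NAME:-graal-jdk8}"
TARGET_IMAGE="${USER_NAME}/${IMAGE_NAME}"
echo "would run: docker push ${TARGET_IMAGE}"
```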



&lt;h2&gt;
  
  
  CircleCI config file statistics: Approach 1 versus Approach 2
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Areas of interest&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Approach 1&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Approach 2&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Config file (full source list)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/neomatrix369/awesome-graal/blob/build-on-circleci/.circleci/config.yml" rel="noopener noreferrer"&gt;build-on-circleci&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/neomatrix369/awesome-graal/blob/build-on-circleci-using-pre-built-docker-container/.circleci/config.yml" rel="noopener noreferrer"&gt;build-using-prebuilt-docker-image&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Commit point (sha)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/neomatrix369/awesome-graal/commit/df28ee78be289a364dcdf3bdd626bd4a40460a28" rel="noopener noreferrer"&gt;df28ee7&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/neomatrix369/awesome-graal/commit/e5916f1ffbc6ca0b8f5cd2ed71a6883e5a2de031" rel="noopener noreferrer"&gt;e5916f1&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lines of code (loc)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;110 lines&lt;/td&gt;
&lt;td&gt;85 lines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Source lines (sloc)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;110 sloc&lt;/td&gt;
&lt;td&gt;85 sloc&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Steps (&lt;code&gt;steps:&lt;/code&gt; section)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Performance&lt;/strong&gt; &lt;strong&gt;(see Performance section)&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Some speedup due to caching, but &lt;strong&gt;slower than Approach 2&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Speed-up due to the pre-built Docker image, and also due to caching at different steps. &lt;strong&gt;Faster than Approach 1&lt;/strong&gt;&lt;br&gt;&lt;br&gt;&lt;em&gt;Ensure DLC layering is enabled (it's a paid feature)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h1&gt;
  
  
  What not to do?
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Approach 1 issues&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;I came across things that wouldn’t work initially, but were later fixed with changes to the configuration file or the scripts:

&lt;ul&gt;
&lt;li&gt;please make sure the &lt;code&gt;.circleci/config.yml&lt;/code&gt; is always in the root directory of the folder&lt;/li&gt;
&lt;li&gt;when using the &lt;code&gt;store_artifacts&lt;/code&gt; directive in the &lt;code&gt;.circleci/config.yml&lt;/code&gt; file, set the value to a fixed folder name, i.e. &lt;code&gt;jdk8-with-graal-local/&lt;/code&gt; - in our case, setting the &lt;code&gt;path&lt;/code&gt; to &lt;code&gt;${BASEDIR}/project/jdk8-with-graal&lt;/code&gt; didn’t create the resulting artifact once the build was finished, hence the fixed path name suggestion.&lt;/li&gt;
&lt;li&gt;environment variables: when working with environment variables, keep in mind that each command runs in its own shell, hence values set to environment variables inside one shell execution environment aren’t visible outside it; follow the method used in the context of this post. Set the environment variables such that all the commands can see the required values, to avoid misbehaviours or unexpected results at the end of each step.&lt;/li&gt;
&lt;li&gt;caching: read about the caching functionality before using it; for more details on &lt;strong&gt;CircleCI&lt;/strong&gt; caching, refer to the &lt;a href="https://circleci.com/docs/2.0/caching/" rel="noopener noreferrer"&gt;caching docs&lt;/a&gt; and see how it has been implemented in the context of this post. This will help avoid confusion and make better use of the functionality provided by &lt;strong&gt;CircleCI&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Approach 2 issues&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Caching: check the &lt;a href="https://circleci.com/docs/2.0/docker-layer-caching/" rel="noopener noreferrer"&gt;Docker Layer Caching&lt;/a&gt; (DLC) docs before trying to use the option, as it is a paid feature. Knowing this clears up the doubt about why CircleCI keeps downloading all the layers during each build, and also explains why, on the free tier, the build is still not as fast as we would like it to be.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;General note:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Light-weight instances: to avoid the pitfall of thinking we can run heavy-duty builds, check the documentation on the technical specifications of the instances. If we run the standard Linux commands to probe the technical specifications of the instance, we may be misled into thinking that they are high-specification machines. See the step that lists the hardware and software details of the instance (see &lt;em&gt;Display HW, SW, Runtime env. info and versions of dependencies&lt;/em&gt;). The instances are actually virtual machines or container-like environments with resources like 2 CPUs/4096 MB RAM. This means we can’t run long-running or heavy-duty builds like building the whole GraalVM suite. Maybe there is another way to handle these kinds of builds, or such builds need to be decomposed into smaller parts.&lt;/li&gt;
&lt;li&gt;Global environment variables: as each &lt;code&gt;run&lt;/code&gt; line in &lt;code&gt;config.yml&lt;/code&gt; runs in its own shell context, environment variables set in one context are not accessible from the others. To overcome this, we have adopted two methods:

&lt;ul&gt;
&lt;li&gt;pass the variables as parameters to the bash/shell scripts being called, to ensure the scripts can access their values&lt;/li&gt;
&lt;li&gt;use the &lt;code&gt;source&lt;/code&gt; command as a run step to make environment variables accessible globally&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h1&gt;
  
  
  End result and summary
&lt;/h1&gt;

&lt;p&gt;We see the below screen after a build has finished successfully (the last step, i.e. &lt;em&gt;Uploading artifacts&lt;/em&gt;, lists where the artifacts have been copied):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_0E00A6756F5EBCD8DBF748743849C06E4748E4E6CE0A05E774A3437A1D05267C_1546386222555_Screen%2BShot%2B2019-01-01%2Bat%2B23.43.04.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_0E00A6756F5EBCD8DBF748743849C06E4748E4E6CE0A05E774A3437A1D05267C_1546386222555_Screen%2BShot%2B2019-01-01%2Bat%2B23.43.04.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The artifacts are now placed in the right folder for download. We are mainly concerned about the &lt;code&gt;jdk8-with-graal.tar.gz&lt;/code&gt; artifact.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;Before writing this post, I ran multiple passes of both approaches and jotted down the time taken to finish the builds, which can be seen below: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Approach 1:&lt;/strong&gt; standard CircleCI build (caching enabled)

&lt;ul&gt;
&lt;li&gt;13 mins 28 secs&lt;/li&gt;
&lt;li&gt;13 mins 59 secs&lt;/li&gt;
&lt;li&gt;14 mins 52 secs&lt;/li&gt;
&lt;li&gt;10 mins 38 secs&lt;/li&gt;
&lt;li&gt;10 mins 26 secs&lt;/li&gt;
&lt;li&gt;10 mins 23 secs&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Approach 2:&lt;/strong&gt; using pre-built docker image (caching enabled, &lt;a href="https://circleci.com/docs/2.0/docker-layer-caching/" rel="noopener noreferrer"&gt;DLC&lt;/a&gt; feature unavailable)

&lt;ul&gt;
&lt;li&gt;13 mins 15 secs&lt;/li&gt;
&lt;li&gt;15 mins 16 secs&lt;/li&gt;
&lt;li&gt;15 mins 29 secs&lt;/li&gt;
&lt;li&gt;15 mins 58 secs&lt;/li&gt;
&lt;li&gt;10 mins 20 secs&lt;/li&gt;
&lt;li&gt;9 mins 49 secs&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;&lt;em&gt;Approach 2&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;should show better performance when using a paid tier, as&lt;/em&gt; &lt;a href="https://circleci.com/docs/2.0/docker-layer-caching/" rel="noopener noreferrer"&gt;&lt;em&gt;Docker Layer Caching&lt;/em&gt;&lt;/a&gt; &lt;em&gt;is available as part of that plan.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Sanity check
&lt;/h2&gt;

&lt;p&gt;To be sure that both of the above approaches have actually built a valid JDK embedded with the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt;, we perform the following steps with the created artifact:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Firstly, download the &lt;code&gt;jdk8-with-graal.tar.gz&lt;/code&gt; artifact from under the &lt;strong&gt;Artifacts tab&lt;/strong&gt; on the CircleCI dashboard (needs sign-in):&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_0E00A6756F5EBCD8DBF748743849C06E4748E4E6CE0A05E774A3437A1D05267C_1546380812679_Screen%2BShot%2B2019-01-01%2Bat%2B22.13.06.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_0E00A6756F5EBCD8DBF748743849C06E4748E4E6CE0A05E774A3437A1D05267C_1546380812679_Screen%2BShot%2B2019-01-01%2Bat%2B22.13.06.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then, extract the &lt;code&gt;.tar.gz&lt;/code&gt; archive:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    tar xvf jdk8-with-graal.tar.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Thereafter, run the below command to check the JDK binary is valid:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    cd jdk8-with-graal
    ./bin/java -version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;And finally, check if we get the below output:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    openjdk version "1.8.0-internal"
    OpenJDK Runtime Environment (build 1.8.0-internal-jenkins_2017_07_27_20_16-b00)
    OpenJDK 64-Bit Graal:compiler_ab426fd70e30026d6988d512d5afcd3cc29cd565:compiler_ab426fd70e30026d6988d512d5afcd3cc29cd565 (build 25.71-b01-internal-jvmci-0.46, mixed mode)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Similarly, to confirm if the JRE is valid and has the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; built-in, we do this:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    ./bin/jre/java -version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;And check if we get a similar output as above:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    openjdk version "1.8.0-internal"
    OpenJDK Runtime Environment (build 1.8.0-internal-jenkins_2017_07_27_20_16-b00)
    OpenJDK 64-Bit Graal:compiler_ab426fd70e30026d6988d512d5afcd3cc29cd565:compiler_ab426fd70e30026d6988d512d5afcd3cc29cd565 (build 25.71-b01-internal-jvmci-0.46, mixed mode)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
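&lt;p&gt;The two version checks above can also be scripted. Here is a sketch of my own (the helper name and &lt;code&gt;/tmp&lt;/code&gt; path are mine); it relies only on the fact, shown above, that the Graal-enabled build mentions "Graal" in the version banner, which &lt;code&gt;java&lt;/code&gt; prints on stderr:&lt;/p&gt;

```shell
# Verify that a java binary reports an embedded Graal compiler:
# the Graal-enabled build mentions "Graal" in its version banner.
check_graal_jdk() {
  java_bin="$1"
  # java -version prints to stderr; capture it via a temp file.
  "$java_bin" -version 2>/tmp/java_version.txt
  if grep -q "Graal" /tmp/java_version.txt; then
    echo "OK: Graal compiler detected in $java_bin"
  else
    echo "FAIL: no Graal compiler in $java_bin"
    return 1
  fi
}

# Usage, after unpacking the artifact:
# check_graal_jdk ./jdk8-with-graal/bin/java
# check_graal_jdk ./jdk8-with-graal/bin/jre/java
```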



&lt;p&gt;With this, we have successfully built JDK8 with the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; embedded in it and also bundled the Graal and Truffle components in an archive file, both of which are available for download via the CircleCI interface.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;you will notice that we also perform sanity checks on the built binaries just before packing them into compressed archives, as part of the build steps (see the bottom section of the CircleCI configuration files).&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Nice badges!
&lt;/h1&gt;

&lt;p&gt;We all like to show off, and we also like to know the current status of our build jobs. A green build-status icon is a nice indication of success, which looks like the below on a markdown README page:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_0E00A6756F5EBCD8DBF748743849C06E4748E4E6CE0A05E774A3437A1D05267C_1547758905612_Screen%2BShot%2B2019-01-17%2Bat%2B21.00.44.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_0E00A6756F5EBCD8DBF748743849C06E4748E4E6CE0A05E774A3437A1D05267C_1547758905612_Screen%2BShot%2B2019-01-17%2Bat%2B21.00.44.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can very easily embed these status badges to display the build status of our project (branch-specific, i.e. master or another branch you have created) built on CircleCI (see the &lt;a href="https://circleci.com/docs/2.0/status-badges/" rel="noopener noreferrer"&gt;docs&lt;/a&gt; on how to do that).&lt;/p&gt;
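&lt;p&gt;For reference, a CircleCI status badge embedded in a README looks roughly like this (here &lt;code&gt;USERNAME&lt;/code&gt;, &lt;code&gt;PROJECT&lt;/code&gt; and the branch name are placeholders; see the linked docs for the exact URL for your project):&lt;/p&gt;

```markdown
[![CircleCI](https://circleci.com/gh/USERNAME/PROJECT/tree/master.svg?style=svg)](https://circleci.com/gh/USERNAME/PROJECT/tree/master)
```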

&lt;h1&gt;
  
  
  Conclusions
&lt;/h1&gt;

&lt;p&gt;We explored two approaches to building the &lt;a href="https://github.com/oracle/graal/tree/master/compiler" rel="noopener noreferrer"&gt;GraalVM Compiler&lt;/a&gt; in the CircleCI environment. They were good experiments for comparing the performance of the two approaches and for seeing how easily each can be set up. We also saw a number of things &lt;em&gt;to avoid&lt;/em&gt; or &lt;em&gt;not to do&lt;/em&gt;, and how useful some of the CircleCI features are. The documentation and forums serve you well when trying to make a build work or when you get stuck on something.&lt;/p&gt;

&lt;p&gt;Once we know the CircleCI environment, it’s pretty easy to use, and it gives us the exact same response (consistent behaviour) every time we run it. Its ephemeral nature means we are guaranteed a clean environment before each run and a clean-up after it finishes. We can also set time checks on every step of the build, and abort the build if a step takes longer than the threshold time period.&lt;/p&gt;

&lt;p&gt;The ability to use pre-built docker images coupled with &lt;a href="https://circleci.com/docs/2.0/docker-layer-caching/" rel="noopener noreferrer"&gt;Docker Layer Caching&lt;/a&gt; on CircleCI can be a major performance boost (saves us build time needed to reinstall any necessary dependencies at every build). Additional performance speedups are available on CircleCI, with caching of the build steps - this again saves build time by not having to re-run the same steps if they haven’t changed.&lt;/p&gt;
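&lt;p&gt;For reference, enabling Docker Layer Caching in a CircleCI 2.0 config is a one-line addition (a sketch; the image name is illustrative, and the flag only takes effect on plans where the DLC feature is available):&lt;/p&gt;

```yaml
version: 2
jobs:
  build:
    docker:
      - image: circleci/openjdk:8-jdk
    steps:
      - checkout
      # Reuse Docker image layers from previous jobs (paid plans only)
      - setup_remote_docker:
          docker_layer_caching: true
```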

&lt;p&gt;There are a lot of useful features available on CircleCI, with plenty of documentation; everyone on the community forum is helpful, and questions are answered pretty much instantly.&lt;/p&gt;

&lt;p&gt;Next, let’s build the same and more on another build environment/build farm - hint, hint, are you thinking the same as me? The &lt;a href="http://ci.adoptopenjdk.net" rel="noopener noreferrer"&gt;AdoptOpenJDK build farm&lt;/a&gt;? We can give it a try!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thanks and credits to &lt;a href="https://twitter.com/whyd0my3y3shurt?lang=en" rel="noopener noreferrer"&gt;Ron Powell&lt;/a&gt; from &lt;a href="http://circleci.com/" rel="noopener noreferrer"&gt;CircleCI&lt;/a&gt; and &lt;a href="https://twitter.com/shelajev" rel="noopener noreferrer"&gt;Oleg Šelajev&lt;/a&gt; from &lt;a href="https://labs.oracle.com/" rel="noopener noreferrer"&gt;Oracle Labs&lt;/a&gt; for proof-reading and giving constructive feedback.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Please do let me know if this is helpful by dropping a line in the comments below or by tweeting at &lt;a href="http://twitter.com/@theNeomatrix369" rel="noopener noreferrer"&gt;@theNeomatrix369&lt;/a&gt;. I would also welcome feedback; see how you can &lt;a href="https://neomatrix369.wordpress.com/about/" rel="noopener noreferrer"&gt;reach me&lt;/a&gt;. Above all, please check out the links mentioned above.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Useful resources
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Links to useful CircleCI docs

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://circleci.com/docs/2.0/getting-started/" rel="noopener noreferrer"&gt;About Getting started&lt;/a&gt; | &lt;a href="https://www.youtube.com/watch?v=otBELDgOo3o" rel="noopener noreferrer"&gt;Videos&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://circleci.com/docker/" rel="noopener noreferrer"&gt;About Docker&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://circleci.com/docs/2.0/docker-layer-caching/" rel="noopener noreferrer"&gt;Docker Layer Caching&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;a href="https://circleci.com/docs/2.0/caching/" rel="noopener noreferrer"&gt;About Caching&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href="https://circleci.com/docs/2.0/ssh-access-jobs/" rel="noopener noreferrer"&gt;About Debugging via SSH&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://devhints.io/circle" rel="noopener noreferrer"&gt;CircleCI cheatsheet&lt;/a&gt; &lt;/li&gt;

&lt;li&gt;

&lt;a href="https://discuss.circleci.com/" rel="noopener noreferrer"&gt;CircleCI Community (Discussions)&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://discuss.circleci.com/c/community" rel="noopener noreferrer"&gt;Latest community topics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;CircleCI configuration and supporting files

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Approach 1:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369/awesome-graal/tree/build-on-circleci" rel="noopener noreferrer"&gt;https://github.com/neomatrix369/awesome-graal/tree/build-on-circleci&lt;/a&gt; (config file and other supporting files i.e. scripts, directory layout, etc…)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Approach 2:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369/awesome-graal/tree/build-on-circleci-using-pre-built-docker-container" rel="noopener noreferrer"&gt;https://github.com/neomatrix369/awesome-graal/tree/build-on-circleci-using-pre-built-docker-container&lt;/a&gt; (config file and other supporting files i.e. scripts, directory layout, etc…)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;a href="https://github.com/neomatrix369/awesome-graal/tree/master/build/x86_64/linux_macos" rel="noopener noreferrer"&gt;Scripts to build Graal on Linux, macOS and inside the Docker container&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href="https://medium.com/p/d33f0d96b78?source=user_profile---------8------------------" rel="noopener noreferrer"&gt;Truffle served in a Holy Graal: Graal and Truffle for polyglot language interpretation on the JVM&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href="https://medium.com/p/6cc161e01442?source=user_profile---------7------------------" rel="noopener noreferrer"&gt;Learning to use Wholly GraalVM!&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href="https://medium.com/p/1d2f8dacc3a7?source=user_profile---------6------------------" rel="noopener noreferrer"&gt;Building Wholly Graal with Truffle!&lt;/a&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer, mainly in the Java/JVM space, currently strengthening teams and helping them accelerate, working with small teams and startups as a freelance software, data and ML engineer, &lt;a href="https://neomatrix369.wordpress.com/about" rel="noopener noreferrer"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369" rel="noopener noreferrer"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369" rel="noopener noreferrer"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

</description>
      <category>graal</category>
      <category>graalvm</category>
      <category>java</category>
      <category>circleci</category>
    </item>
    <item>
      <title>Apache Zeppelin: stairway to notes* haven! </title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Thu, 03 Jan 2019 01:41:36 +0000</pubDate>
      <link>https://dev.to/neomatrix369/apache-zeppelin-stairway-to-notes-haven--46el</link>
      <guid>https://dev.to/neomatrix369/apache-zeppelin-stairway-to-notes-haven--46el</guid>
      <description>&lt;p&gt;*notes is for notebooks in Zeppelin lingo&lt;/p&gt;

&lt;p&gt;This post is a re-blog of my @JavaAdventCalendar post from &lt;a href="https://www.javaadvent.com/2018/12/apache-zeppelin-stairway-to-notes-haven.html"&gt;https://www.javaadvent.com/2018/12/apache-zeppelin-stairway-to-notes-haven.html&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;Continuing from the previous post, &lt;a href="https://www.javaadvent.com/2018/12/two-years-in-the-life-of-ai-ml-dl-and-java.html"&gt;Two years in the life of AI, ML, DL and Java&lt;/a&gt;, where I expressed my motivation: in one of our discussions it came up that you can write in languages like Python, R and Julia in JuPyteR notebooks. Most were not aware that you can also write Java and Scala, in addition to Python, SQL, etc., with the help of &lt;a href="http://zeppelin.apache.org/"&gt;Apache Zeppelin&lt;/a&gt; notebooks. &lt;em&gt;And so I committed to sharing something along those lines to broaden everyone’s awareness. Although it’s been some time since then, I have managed to put my thoughts together into this post, showing how we can do similar operations using&lt;/em&gt; &lt;a href="https://zeppelin.apache.org/"&gt;&lt;em&gt;Apache Zeppelin&lt;/em&gt;&lt;/a&gt;&lt;em&gt;, which supports both Java and Scala.&lt;/em&gt; The project itself is written in Java, and its open architecture means Zeppelin can support anything, as long as &lt;a href="https://zeppelin.apache.org/docs/0.6.0/development/writingzeppelininterpreter.html"&gt;an interpreter for it has been provided&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  First things, first…
&lt;/h1&gt;

&lt;p&gt;In case I have lost some of you, here’s what I meant by JuPyteR notebooks and writing notebooks in different languages: see &lt;a href="https://www.youtube.com/watch?v=Rc4JQWowG5I"&gt;https://www.youtube.com/watch?v=Rc4JQWowG5I&lt;/a&gt; and also have a look at the &lt;a href="https://github.com/jupyter/jupyter/wiki/Jupyter-kernels"&gt;list of kernels&lt;/a&gt; supported by JuPyteR notebooks. In this post, though, we are covering Apache Zeppelin: how to get it working and how to use a couple of notes in the Zeppelin environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fun part…
&lt;/h2&gt;

&lt;p&gt;So let’s have a look at how we do it, by first downloading and installing Apache Zeppelin.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Download &amp;amp; Installation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Download&lt;/strong&gt;&lt;br&gt;
Go to the &lt;a href="https://zeppelin.apache.org/download.html"&gt;Download page&lt;/a&gt;; a number of options are available, of which two are recommended:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download entire binary containing the interpreters&lt;/li&gt;
&lt;li&gt;Download a net installer which then downloads the interpreters (you can choose the ones you need or use &lt;code&gt;--all&lt;/code&gt; flag for all the &lt;a href="https://zeppelin.apache.org/docs/0.8.0/usage/interpreter/installation.html"&gt;interpreters&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In our case, I downloaded the &lt;strong&gt;net-install interpreter package&lt;/strong&gt; from the &lt;a href="https://zeppelin.apache.org/docs/0.8.0/quickstart/install.html#downloading-binary-package"&gt;download binary package section&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation&lt;/strong&gt;&lt;br&gt;
 I unpacked the &lt;code&gt;.tgz&lt;/code&gt; archive and placed it in the &lt;code&gt;/opt/&lt;/code&gt; folder and ran:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ cd /opt/zeppelin-0.8.0-bin-netinst
$ ./bin/install-interpreter.sh --all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For another type of archive or installation option, see the instructions on the &lt;a href="https://zeppelin.apache.org/docs/0.8.0/quickstart/install.html"&gt;Quick Start page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Running&lt;/strong&gt;&lt;br&gt;
Depending on the type of binary downloaded, follow the instructions on the &lt;a href="https://zeppelin.apache.org/docs/0.8.0/quickstart/install.html#starting-apache-zeppelin"&gt;Quick Start page&lt;/a&gt;.&lt;br&gt;
Although in our case, I had to just run:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ cd /opt/zeppelin-0.8.0-bin-netinst
$ ./bin/zeppelin.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Optional setting&lt;/em&gt;&lt;br&gt;
As I was curious what it would be like running Zeppelin under a JDK other than the usual Oracle JDK or OpenJDK, I decided to try the GraalVM JRE, and so I switched &lt;code&gt;JAVA_HOME&lt;/code&gt; to point to &lt;code&gt;/path/to/GraalVM/jre&lt;/code&gt; on my machine. The GraalVM JDK comes bundled with a JRE, which can be used independently just like any Java vendor’s JRE.&lt;/p&gt;
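&lt;p&gt;For reference, the switch amounts to something like this (a sketch; the GraalVM install path is hypothetical):&lt;/p&gt;

```shell
# Point the shell (and hence Zeppelin's launcher) at the GraalVM JRE.
# /opt/graalvm-1.0.0-rc7/jre is a hypothetical install location.
export JAVA_HOME=/opt/graalvm-1.0.0-rc7/jre
export PATH="$JAVA_HOME/bin:$PATH"

# Zeppelin picks JAVA_HOME up automatically when started:
# cd /opt/zeppelin-0.8.0-bin-netinst
# ./bin/zeppelin.sh
```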

&lt;p&gt;When Zeppelin is run, these messages are shown (you can see the &lt;code&gt;JAVA_HOME&lt;/code&gt; settings have been picked up):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pid dir doesn't exist, create /opt/zeppelin-0.8.0-bin-netinst/run
GraalVM 1.0.0-rc7 warning: ignoring option MaxPermSize=512m; support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/zeppelin-0.8.0-bin-netinst/lib/interpreter/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/zeppelin-0.8.0-bin-netinst/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Dec 25, 2018 1:34:23 AM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.NotebookRepoRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.NotebookRepoRestApi will be ignored.
Dec 25, 2018 1:34:23 AM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
Dec 25, 2018 1:34:23 AM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
[---- snipped ----]
WARNING: The (sub)resource method getNoteList in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Running (continued)&lt;/strong&gt;&lt;br&gt;
Once all the above steps are completed and Zeppelin has successfully started, do the below:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open &lt;a href="http://localhost:8080/#/"&gt;http://localhost:8080/#/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;And then look at the docs under &lt;a href="https://zeppelin.apache.org/docs/0.8.0/quickstart/explore_ui.html"&gt;Exploring Zeppelin UI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;And then the tutorial notebook at &lt;a href="http://localhost:8080/#/notebook/2A94M5J1Z"&gt;http://localhost:8080/#/notebook/2A94M5J1Z&lt;/a&gt;
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eSlGv0pD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093508680_image.png" alt=""&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Small experiment
&lt;/h1&gt;

&lt;p&gt;Just to look at some numbers, I decided to use the &lt;strong&gt;Zeppelin Tutorial/Basic Features (Spark) notebook&lt;/strong&gt; to check the difference in performance when run using GraalVM JDK/JRE and another JDK/JRE and here are the results:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;GraalVM JDK&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;./bin/zeppelin.sh&lt;/code&gt; 48.26s user 25.63s system 28% cpu 4:20.15 total (started and stopped the script manually)&lt;/li&gt;
&lt;li&gt;First paragraph&lt;/li&gt;
&lt;li&gt;Took 47 sec. Last updated by anonymous at December 25 2018, 2:18:36 AM.&lt;/li&gt;
&lt;li&gt;Each paragraph thereafter (columns from left to right):&lt;/li&gt;
&lt;li&gt;Took 44 sec. Last updated by anonymous at December 25 2018, 2:18:44 AM. (outdated)&lt;/li&gt;
&lt;li&gt;Took 10 sec. Last updated by anonymous at December 25 2018, 2:18:47 AM. (outdated)&lt;/li&gt;
&lt;li&gt;Took 6 sec. Last updated by anonymous at December 25 2018, 2:18:50 AM. (outdated)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Oracle JDK8&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;./bin/zeppelin.sh&lt;/code&gt; 37.64s user 25.73s system 29% cpu 3:38.49 total (started and stopped the script manually)&lt;/li&gt;
&lt;li&gt;First paragraph&lt;/li&gt;
&lt;li&gt;Took 54 sec. Last updated by anonymous at December 25 2018, 2:12:16 AM.&lt;/li&gt;
&lt;li&gt;Each paragraph thereafter (columns from left to right):&lt;/li&gt;
&lt;li&gt;Took 43 sec. Last updated by anonymous at December 25 2018, 2:12:24 AM. (outdated)&lt;/li&gt;
&lt;li&gt;Took 13 sec. Last updated by anonymous at December 25 2018, 2:12:29 AM. (outdated)&lt;/li&gt;
&lt;li&gt;Took 6 sec. Last updated by anonymous at December 25 2018, 2:12:31 AM. (outdated)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My observation is that the performance differences were marginal, although for different kinds of operations the results would vary between the two, so more observations are needed. I decided to stay on the GraalVM JRE, unless otherwise indicated, to see more such variations as we go along.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;paragraphs are code blocks in Zeppelin lingo, and a note is what a notebook is called in the Zeppelin world. Hence a note has one or more paragraphs.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There are many other tutorial (sample) notes to play with; see the home page under Zeppelin Tutorial (screenshot below):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2xqGh0C8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093536976_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2xqGh0C8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093536976_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Importing a note
&lt;/h1&gt;

&lt;p&gt;From the home page (&lt;a href="http://localhost:8080/#/"&gt;http://localhost:8080/#/&lt;/a&gt;, see below), we can select the hyperlinked text &lt;a href="https://www.javaadvent.com/2018/12/apache-zeppelin-stairway-to-notes-haven.html"&gt;Import Note&lt;/a&gt;, which allows us to import a note (Notebook in Zeppelin lingo) from disk or from a URL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jvNK6S0D--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093553202_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jvNK6S0D--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093553202_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our case, I added the note from &lt;a href="https://github.com/mmatloka/machine-learning-by-example-workshop"&gt;https://github.com/mmatloka/machine-learning-by-example-workshop&lt;/a&gt; (ensure the link to the raw contents of the &lt;code&gt;json&lt;/code&gt; file is used, i.e. &lt;a href="https://raw.githubusercontent.com/mmatloka/machine-learning-by-example-workshop/master/Workshop.json"&gt;https://raw.githubusercontent.com/mmatloka/machine-learning-by-example-workshop/master/Workshop.json&lt;/a&gt;) into Zeppelin and tried running it, but got various errors on the first couple of paragraphs.&lt;/p&gt;

&lt;p&gt;Looking for answers as to why I was getting those errors, I came across a forum thread where similar error messages were reported, and took up a suggestion from someone there. It was a workaround for issue &lt;a href="https://issues.apache.org/jira/browse/ZEPPELIN-3586"&gt;https://issues.apache.org/jira/browse/ZEPPELIN-3586&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  We failed the previous time, so let’s try again…
&lt;/h1&gt;

&lt;p&gt;One of the solutions was to make &lt;code&gt;SPARK_HOME&lt;/code&gt; point to a separate instance of Spark rather than relying on the embedded &lt;code&gt;spark interpreter&lt;/code&gt; inside the Apache Zeppelin installation. As a workaround, a link to a Dockerfile gist was provided at &lt;a href="https://gist.github.com/conker84/4ffc9a2f0125c808b4dfcf3b7d70b043#file-zeppelin-dockerfile"&gt;https://gist.github.com/conker84/4ffc9a2f0125c808b4dfcf3b7d70b043#file-zeppelin-dockerfile&lt;/a&gt;. I extended that Dockerfile to incorporate the GraalVM JRE and added the necessary configuration for it to be visible to Zeppelin and Spark:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Zeppelin-Dockerfile&lt;/em&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM apache/zeppelin:0.8.0
# Workaround to "fix" https://issues.apache.org/jira/browse/ZEPPELIN-3586
RUN echo "$LOG_TAG Download Spark binary" &amp;amp;&amp;amp; \
    wget -O /tmp/spark-2.3.1-bin-hadoop2.7.tgz http://apache.panu.it/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz &amp;amp;&amp;amp; \
    tar -zxvf /tmp/spark-2.3.1-bin-hadoop2.7.tgz &amp;amp;&amp;amp; \
    rm -rf /tmp/spark-2.3.1-bin-hadoop2.7.tgz &amp;amp;&amp;amp; \
    mv spark-2.3.1-bin-hadoop2.7 /spark-2.3.1-bin-hadoop2.7
ENV SPARK_HOME=/spark-2.3.1-bin-hadoop2.7
### My modified steps from here on:
RUN rm -fr /usr/lib/jvm/java-1.8.0-openjdk-amd64 /usr/lib/jvm/java-8-openjdk-amd64
RUN wget https://github.com/oracle/graal/releases/download/vm-1.0.0-rc10/graalvm-ce-1.0.0-rc10-linux-amd64.tar.gz
RUN tar xvzf graalvm-ce-1.0.0-rc10-linux-amd64.tar.gz
RUN mv graalvm-ce-1.0.0-rc10/jre /usr/lib/jvm/graalvm-ce-1.0.0-rc10
ENV JAVA_HOME=/usr/lib/jvm/graalvm-ce-1.0.0-rc10
ENV PATH=$JAVA_HOME/bin:$PATH
RUN java -version
RUN rm graalvm-ce-1.0.0-rc10-linux-amd64.tar.gz
RUN rm -fr graalvm-ce-1.0.0-rc10
CMD ["bin/zeppelin.sh"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;And created two small bash scripts to help build the docker image and run the container from the image.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Build docker image&lt;/em&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker build -t zeppelin -f Zeppelin-Dockerfile .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Run docker container&lt;/em&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --rm \
    -it \
    -p 8080:8080 zeppelin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; the docker image is called &lt;code&gt;zeppelin:latest&lt;/code&gt;, and is about 4.45GB in size.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;The above scripts can be found at&lt;/em&gt;&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/apache-zeppelin"&gt;&lt;strong&gt;&lt;em&gt;https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/apache-zeppelin&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;&lt;em&gt;, please feel free to improve them and create pull requests back into the repo.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;In case you don’t wish to do the above, you could try using&lt;/em&gt; &lt;a href="https://github.com/dylanmei/docker-zeppelin"&gt;&lt;em&gt;https://github.com/dylanmei/docker-zeppelin&lt;/em&gt;&lt;/a&gt;&lt;em&gt;. I’m told Apache Zeppelin works out of the box using this container as well.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I wasn’t too keen on the above, as the whole process took more than 45 minutes, 35 of which went into downloading Spark. Downloading the GraalVM JDK was a breeze, less than 5 minutes on my high-speed DSL connection.&lt;/p&gt;

&lt;p&gt;I applied the same steps above to load &lt;a href="https://github.com/mmatloka"&gt;&lt;strong&gt;Michal Matloka&lt;/strong&gt;&lt;/a&gt;’s Workshop notebook (&lt;code&gt;workshop.json&lt;/code&gt;), ran the paragraphs in it, and it worked like a charm, without any errors of course. Thanks, &lt;a href="https://github.com/mmatloka"&gt;&lt;strong&gt;Michal Matloka&lt;/strong&gt;&lt;/a&gt;, for providing such an example to play with and learn multiple things from in one go.&lt;/p&gt;

&lt;p&gt;From loading the dataset from a &lt;code&gt;.csv&lt;/code&gt; file:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_QSSVh8u--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093660044_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_QSSVh8u--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093660044_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;to produce the final outcome, via the parameter &lt;code&gt;avgMetrics&lt;/code&gt; – average cross-validation metrics for each &lt;code&gt;paramMap&lt;/code&gt; in &lt;code&gt;CrossValidator.estimatorParamMaps&lt;/code&gt;, in the respective order.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bJE-VJnF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093674042_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bJE-VJnF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://d2mxuefqeaa7sj.cloudfront.net/s_896E57104EEE5E4786EAD1087489E79DE447D50D612C2CD2A4F73186F4191CBF_1546093674042_image.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A score of &lt;em&gt;53.18%&lt;/em&gt; might still need a bit of tweaking and fine-tuning to go higher, but that is a different discussion, tangential to our current topic of Zeppelin notes.&lt;/p&gt;

&lt;h1&gt;
  
  
  Caveat
&lt;/h1&gt;

&lt;p&gt;Somehow Zeppelin does not like code laid out with indentation like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;val indexToString = new IndexToString()
    .setInputCol("prediction").setOutputCol("predictionLabel")
    .setLabels(stringIndexer.labels)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;so when I removed the indentation to join the chain of function calls together:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;val indexToString = new IndexToString().setInputCol("prediction").setOutputCol("predictionLabel").setLabels(stringIndexer.labels)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I was able to run the paragraphs fine. I had to apply this change to every paragraph to prevent errors from Zeppelin; otherwise you get messages of this nature across all of them:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;:1: error: illegal start of definition
    .setInputCol("prediction").setOutputCol("predictionLabel")
    ^
&lt;/code&gt;&lt;/pre&gt;
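&lt;p&gt;An alternative that the standard Scala REPL accepts, which I have not verified in every Zeppelin version so treat it as a sketch, is to wrap the whole expression in parentheses so the leading-dot lines are parsed as one expression:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// wrapping the chain in parentheses keeps the interpreter from
// evaluating each line on its own
val indexToString = (new IndexToString()
  .setInputCol("prediction")
  .setOutputCol("predictionLabel")
  .setLabels(stringIndexer.labels))
&lt;/code&gt;&lt;/pre&gt;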

&lt;h1&gt;
  
  
  Summary
&lt;/h1&gt;

&lt;p&gt;Things I like about Zeppelin are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a clean and intuitive interface (must be Angular at work)&lt;/li&gt;
&lt;li&gt;you can write custom interpreters and expand the list of supported languages&lt;/li&gt;
&lt;li&gt;you can write your own visualisers&lt;/li&gt;
&lt;li&gt;the execution progress of every paragraph is displayed in real time&lt;/li&gt;
&lt;li&gt;the execution time of every paragraph is computed and displayed in real time&lt;/li&gt;
&lt;li&gt;wherever applicable, a table of data can be rendered as a number of visuals and switched back to a table of data, all done lazily (only executed when selected, and the results stay static)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Execution can, however, appear slower than in JuPyteR notebooks. A number of the bells and whistles available in IPython notebooks are absent, which, Zeppelin being an open-source project, also means there is plenty of room for improvement via contributions: pick your favourite missing feature for a pull request.&lt;/p&gt;

&lt;p&gt;All in all, Zeppelin is a great place for Java/JVM developers to feel at home and do their prototyping, ML training and experimentation work, suited to developers familiar not just with Python and R but also with Java and Scala.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Please keep an eye on this space, and share your comments, feedback or any contributions, which will help us all learn and grow, with&lt;/em&gt; &lt;a href="http://twitter.com/@theNeomatrix369"&gt;&lt;em&gt;@theNeomatrix369&lt;/em&gt;&lt;/a&gt;; &lt;em&gt;you can find more about me via the&lt;/em&gt; &lt;a href="http://neomatrix369.wordpress.com/aboutme"&gt;&lt;em&gt;About me page&lt;/em&gt;&lt;/a&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer, mainly in the Java/JVM space, currently strengthening small teams and startups and helping them accelerate, working as a freelance software, data and ML engineer, &lt;a href="https://neomatrix369.wordpress.com/about"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

</description>
      <category>apachezeppelin</category>
      <category>jupyter</category>
      <category>notebooks</category>
      <category>graalvm</category>
    </item>
    <item>
      <title>Two years in the life of AI, ML, DL and Java</title>
      <dc:creator>Mani</dc:creator>
      <pubDate>Thu, 03 Jan 2019 01:34:47 +0000</pubDate>
      <link>https://dev.to/neomatrix369/two-years-in-the-life-of-ai-ml-dl-and-java--nni</link>
      <guid>https://dev.to/neomatrix369/two-years-in-the-life-of-ai-ml-dl-and-java--nni</guid>
      <description>&lt;h2&gt;
  
  
  Citation
&lt;/h2&gt;

&lt;p&gt;All the images in the post are owned by the respective owners/creators/authors. &lt;em&gt;This post is a re-blog of my&lt;/em&gt; &lt;a href="https://www.javaadvent.com/" rel="noopener noreferrer"&gt;&lt;em&gt;@JavaAdventCalendar&lt;/em&gt;&lt;/a&gt; &lt;em&gt;post from&lt;/em&gt; &lt;a href="https://www.javaadvent.com/2018/12/two-years-in-the-life-of-ai-ml-dl-and-java.html" rel="noopener noreferrer"&gt;&lt;em&gt;https://www.javaadvent.com/2018/12/two-years-in-the-life-of-ai-ml-dl-and-java.html&lt;/em&gt;&lt;/a&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;AI, ML and DL are acronyms for &lt;a href="https://en.wikipedia.org/wiki/Artificial_intelligence" rel="noopener noreferrer"&gt;Artificial Intelligence&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Machine_learning" rel="noopener noreferrer"&gt;Machine Learning&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Deep_learning" rel="noopener noreferrer"&gt;Deep Learning.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_99A6B7BF29C0506FDFCFF787852D9B1FB1990DE18F39F5A1E21435041E3E8B6E_1545440249605_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_99A6B7BF29C0506FDFCFF787852D9B1FB1990DE18F39F5A1E21435041E3E8B6E_1545440249605_image.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now back to what I was going to write about. If you ask me, I'll readily admit that I have not even scratched the surface of these topics. What I share here is a glimpse of what's out there; each one of you may have discovered many more aspects of these topics in your daily professional and personal pursuits.&lt;/p&gt;

&lt;p&gt;One of my motivations for putting this post and the links below together comes from the discussion we had during the &lt;a href="http://unconf.londonjavacommunity.co.uk/" rel="noopener noreferrer"&gt;LJC Unconference in November 2018&lt;/a&gt;, where &lt;a href="https://twitter.com/jeremiecharlet?lang=en" rel="noopener noreferrer"&gt;Jeremie&lt;/a&gt;, &lt;a href="https://www.meetup.com/Meet-a-Mentor-Community-London/members/3863286/" rel="noopener noreferrer"&gt;Michael Bateman&lt;/a&gt; and I, along with a number of LJC JUG members, gathered at a session on a similar topic. The questions raised were along the lines of: &lt;em&gt;where does Java stand in the world of AI-ML-DL? How do I do any of these things in Java? Which libraries and frameworks should I use?&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  AI-ML-DL and Java and their outreach
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4giudtcwsruglkhdmyar.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4giudtcwsruglkhdmyar.png" width="199" height="249"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another confession: I didn't spend too much time gathering and categorising these topics; thanks to Twitter and the Internet for helping me find and use them. I hope the content put together here amounts to more than an answer to the above questions. In case you feel further improvements can be made to the content, categorisation or layout, please feel free to contribute: you can start by visiting the &lt;a href="http://github.com/neomatrix369/awesome-ai-ml-dl" rel="noopener noreferrer"&gt;git repo and creating a pull request&lt;/a&gt;. Please watch, fork and star the repo to get updates on the changes to come. Here are a number of resources shared over the last two years (circa), categorised as best I could:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Business / General / Semi-technical&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.forbes.com/sites/oracle/2018/12/05/how-to-extract-business-value-from-data-science-its-all-about-the-teamwork/#6ad2f23a651c" rel="noopener noreferrer"&gt;Extract business value from DS&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1070610044926930944" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=Ytja2JuVMlw&amp;amp;feature=youtu.be" rel="noopener noreferrer"&gt;Why Java and the JVM Will Dominate the Future of Machine Learning, AI, and Big Data&lt;/a&gt; (&lt;a href="https://twitter.com/DeepNetts/status/1060684337962762240" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://partners.wsj.com/oracle/machine-learning-made-more-accessible-during-businesses-learning-curve/" rel="noopener noreferrer"&gt;Machine Learning Made More Accessible During Businesses’ Learning Curve&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1075314544162074624" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)
&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#business--general--semi-technical" rel="noopener noreferrer"&gt;&lt;/a&gt;  - &lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#business--general--semi-technical" rel="noopener noreferrer"&gt;(more links)&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Classifier / decision trees&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="http://ramok.tech/2017/09/26/email-spam-classifier-java-application-with-spark/" rel="noopener noreferrer"&gt;Email Spam Detector java Application with ApacheSpark&lt;/a&gt; (&lt;a href="https://twitter.com/Klevis_Ramo/status/913067204094103552" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dzone.com/guides/artificial-intelligence-automating-decision-making" rel="noopener noreferrer"&gt;Guide to Artificial Intelligence: Automating Decision-Making&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1029592519967830016" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Correlated Cross Occurrence&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://developer.ibm.com/dwblog/2017/mahout-spark-correlated-cross-occurences/" rel="noopener noreferrer"&gt;Multi-domain predictive AI or how to make one thing predict another&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/882222473886011393" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Deep learning&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.jfokus.se/jfokus17/preso/Deep-Learning-on-Java-tutorial.pdf" rel="noopener noreferrer"&gt;Deep learning with java&lt;/a&gt; (&lt;a href="https://twitter.com/juanantoniobm/status/832000819918733312" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=MXH_nn1dmsE" rel="noopener noreferrer"&gt;Free AI Training - Java-based deep-learning tools to analyze and train data, then send the resulting changes back to the server&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1071422649501409280" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Genetic Algorithms&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/jenetics.io" rel="noopener noreferrer"&gt;Jenetics is an advanced Genetic Algorithm, respectively an Evolutionary Algorithm, library written in java&lt;/a&gt; (&lt;a href="https://twitter.com/juanantoniobm/status/863871263118381056" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Java projects / technologies&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=cfxBrYud9KM&amp;amp;feature=youtu.be&amp;amp;t=231" rel="noopener noreferrer"&gt;Project Panama and fast MachineLearning computation&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1065266082557026304" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=6Q2TP-QO4SU" rel="noopener noreferrer"&gt;GraalVM + Machine Learning&lt;/a&gt; (&lt;a href="https://twitter.com/DevoxxUA/status/1074680378357616640" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://blogs.oracle.com/startup/deploying-bespoke-ai-using-fn-project-kadlytics-by-miminal" rel="noopener noreferrer"&gt;Deploying Bespoke AI using fnproj - KADlytics by Miminal&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1034474482751221761" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Natural Language Processing (aka NLP)&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.ibm.com/developerworks/library/cc-cognitive-natural-language-processing/index.html?social_post=963789367&amp;amp;fst=Discover" rel="noopener noreferrer"&gt;An introduction to natural language processing and a demo using opensource libraries&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/883174486459248646" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.meetup.com/AI-for-Enterprise-Virtual-User-Group/events/255622367/" rel="noopener noreferrer"&gt;Implementing NLP Attention Mechanisms with DeepLearning4J&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1058405126988161024" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://stanfordnlp.github.io/CoreNLP/" rel="noopener noreferrer"&gt;How Stanford CoreNLP, a popular Java natural language tool can help you perform Natural Language Processing tasks&lt;/a&gt;(&lt;a href="https://twitter.com/java/status/945689918289924096" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=XrZ_Y4koV5A&amp;amp;feature=youtu.be" rel="noopener noreferrer"&gt;FREE AI talk on Natural Language Processing NLP using Java with deeplearning4j&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1062035545394532352" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Neural Networks&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://towardsdatascience.com/neural-network-architectures-156e5bad51ba" rel="noopener noreferrer"&gt;Introduction to Neural Network Architectures&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/953492326877356032" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.csail.mit.edu/news/explained-neural-networks" rel="noopener noreferrer"&gt;Neural Networks explained by MIT&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/929216367361798144" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://medium.com/coinmonks/implementing-an-artificial-neural-network-in-pure-java-no-external-dependencies-975749a38114" rel="noopener noreferrer"&gt;Implementing an Artificial Neural Network in Pure Java (No external dependencies)&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1031031249794609152" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#business--general--semi-technical" rel="noopener noreferrer"&gt;(&lt;/a&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#neural-networks" rel="noopener noreferrer"&gt;more links&lt;/a&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#business--general--semi-technical" rel="noopener noreferrer"&gt;)&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Recommendation systems / Collaborative Filtering (CF)&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.baeldung.com/java-collaborative-filtering-recommendations" rel="noopener noreferrer"&gt;Tutorial on Collaborative Filtering (CF) in Java – a machine learning technique used by recommendation systems&lt;/a&gt;(&lt;a href="https://twitter.com/java/status/985150431549632513" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Tools &amp;amp; Libraries, Cheatsheets, Resources&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://skymind.ai/wiki/automl-automated-machine-learning-ai" rel="noopener noreferrer"&gt;Best AI tools and libraries&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1069459966740836352" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463" rel="noopener noreferrer"&gt;Cheat Sheets for AI, Neural Networks, MachineLearning, Deep Learning &amp;amp; Big Data&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/1040928213466198016" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.baeldung.com/java-ai" rel="noopener noreferrer"&gt;Overview of AI Libraries in Java&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/931070584896741377" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#business--general--semi-technical" rel="noopener noreferrer"&gt;(&lt;/a&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#tools--libraries-cheatsheets-resources" rel="noopener noreferrer"&gt;more links&lt;/a&gt;&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#business--general--semi-technical" rel="noopener noreferrer"&gt;)&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;How-to / Deploy / DevOps / Serverless&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.meetup.com/AI-for-Enterprise-Virtual-User-Group/events/254240417/" rel="noopener noreferrer"&gt;Learn how to deploy and manage machine learning models&lt;/a&gt; (&lt;a href="https://t.co/4lcwns0lgo" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.infoq.com/presentations/ai-data-extraction" rel="noopener noreferrer"&gt;How to prepare unstructured data for BI and data analytics AI and MachineLearning&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/869572617023488001" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Machine Learning Model Deployment Made Simple: [&lt;a href="https://oracle.github.io/graphpipe/#/" rel="noopener noreferrer"&gt;1&lt;/a&gt;] [&lt;a href="https://www.forbes.com/sites/oracle/2018/08/15/open-source-graphpipe-project-hints-at-next-wave-of-ai-expansion/#62a0c8c244ae" rel="noopener noreferrer"&gt;2&lt;/a&gt;] (&lt;a href="https://twitter.com/java/status/1038062329794052098" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/neomatrix369/awesome-ai-ml-dl#how-to--deploy--devops--serverless" rel="noopener noreferrer"&gt;(more links)&lt;/a&gt; &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Misc&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://blogs.oracle.com/bigdata/interactive-data-lake-queries-at-scale" rel="noopener noreferrer"&gt;Introduction to interactive Data Lake Queries&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/989047609259151360" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://towardsdatascience.com/a-simple-introduction-to-data-structures-part-one-linked-lists-efbb13e9ad33" rel="noopener noreferrer"&gt;A Simple Introduction To Data Structures&lt;/a&gt; (&lt;a href="https://twitter.com/java/status/883093461842382849" rel="noopener noreferrer"&gt;Tweet&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Due to the large number of links gathered, not all of them could be shown here, so I have created &lt;a href="http://github.com/neomatrix369/awesome-ai-ml-dl" rel="noopener noreferrer"&gt;a git repo to host them on GitHub&lt;/a&gt;, where you will find the rest of the links. Once again, pull requests are very welcome.&lt;/p&gt;

&lt;p&gt;From my several weeks to a few months of intense experience, I suggest that if you want to get your hands dirty with &lt;a href="https://en.wikipedia.org/wiki/Artificial_intelligence" rel="noopener noreferrer"&gt;Artificial Intelligence&lt;/a&gt; and its offspring [&lt;a href="https://en.wikipedia.org/wiki/Machine_learning" rel="noopener noreferrer"&gt;2&lt;/a&gt;][&lt;a href="https://en.wikipedia.org/wiki/Deep_learning" rel="noopener noreferrer"&gt;3&lt;/a&gt;], don't shy away from a tool just because it is not Java/JVM based. It's best to start high-level with whatever you have, and once you have understood the subject well enough, try to apply it in the languages you are at home with, be that Java or any other JVM language you may know. I'm not claiming I know them; I'm merely sharing my mileage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_99A6B7BF29C0506FDFCFF787852D9B1FB1990DE18F39F5A1E21435041E3E8B6E_1545440901823_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_99A6B7BF29C0506FDFCFF787852D9B1FB1990DE18F39F5A1E21435041E3E8B6E_1545440901823_image.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One of the things that came up during our discussions was that AI, ML and DL have strong contributions from academia, who use the tools and languages best known to them and sometimes most appropriate for the task at hand.&lt;/p&gt;

&lt;p&gt;Follow the community and the tools that drive the innovation and inspiration, to become better at the subject of choice. In this case, it applies to &lt;a href="https://en.wikipedia.org/wiki/Artificial_intelligence" rel="noopener noreferrer"&gt;Artificial Intelligence&lt;/a&gt; and its variants [&lt;a href="https://en.wikipedia.org/wiki/Machine_learning" rel="noopener noreferrer"&gt;2&lt;/a&gt;][&lt;a href="https://en.wikipedia.org/wiki/Deep_learning" rel="noopener noreferrer"&gt;3&lt;/a&gt;].&lt;/p&gt;

&lt;h1&gt;
  
  
  Quick shoutouts
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftywgruy3wmech8kq3u6p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftywgruy3wmech8kq3u6p.png" width="128" height="231"&gt;&lt;/a&gt; Firstly, to &lt;a href="http://twitter.com/java" rel="noopener noreferrer"&gt;@java&lt;/a&gt; for sharing many AI, ML, DL related resources with the wider community. And also to organisations like &lt;a href="https://twitter.com/skymindio" rel="noopener noreferrer"&gt;@skymindio&lt;/a&gt; (&lt;a href="https://skymind.ai/" rel="noopener noreferrer"&gt;https://skymind.ai/&lt;/a&gt;) who are doing an awesome job in bridging the gap between the Java/JVM and AI/ML/DL worlds.&lt;/p&gt;

&lt;p&gt;Also, would like to thank the good folks (&lt;a href="https://twitter.com/hlrecworks" rel="noopener noreferrer"&gt;Helen&lt;/a&gt; and team) behind the ML Study group in London — supported by &lt;a href="http://twitter.com/RWmeetamentor" rel="noopener noreferrer"&gt;@RWmeetamentor&lt;/a&gt;, who have been working hard to bring everyone together to learn ML and related topics. They may have even very indirectly influenced me to write this post. &lt;em&gt;wink, wink&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Summary
&lt;/h1&gt;

&lt;p&gt;So, to sum up our discussion at the &lt;a href="http://unconf.londonjavacommunity.co.uk/" rel="noopener noreferrer"&gt;LJC Unconference 2018&lt;/a&gt;: we noted that languages like &lt;strong&gt;&lt;em&gt;Python, R, Julia, Matlab&lt;/em&gt;&lt;/strong&gt; and the like currently contribute more to AI, ML and DL than any other programming language.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_99A6B7BF29C0506FDFCFF787852D9B1FB1990DE18F39F5A1E21435041E3E8B6E_1545442687987_AI-ML-DL.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fd2mxuefqeaa7sj.cloudfront.net%2Fs_99A6B7BF29C0506FDFCFF787852D9B1FB1990DE18F39F5A1E21435041E3E8B6E_1545442687987_AI-ML-DL.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.researchgate.net/profile/Han_Ping_Fung/post/What_is_the_difference_between_artificial_intelligence_AI_and_machine_learning_ML/attachment/5b0a62d34cde260d15e13377/AS%3A630805651009536%401527407195569/image/DL+vs+ML+vs+AI.jpg" rel="noopener noreferrer"&gt;&lt;/a&gt;&lt;br&gt;
I know it is not going to make me popular by saying this but my humble request to all developers would be that not to think or expect everything possible from a single programming language. Any language and in the context of this post, Java and other JVM languages are meant and written for a purpose and no doubt we can replicate efforts made in other languages in &lt;a href="https://en.wikipedia.org/wiki/List_of_JVM_languages" rel="noopener noreferrer"&gt;Java/JVM languages&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;But, at the end of the day, they should all be treated as tools and used where appropriate.&lt;br&gt;
I hope what little is shared in this post will still help inspire the Java/JVM community, especially those who have strong interests in topics like &lt;a href="https://en.wikipedia.org/wiki/Artificial_intelligence" rel="noopener noreferrer"&gt;Artificial Intelligence&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Machine_learning" rel="noopener noreferrer"&gt;Machine Learning&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Deep_learning" rel="noopener noreferrer"&gt;Deep Learning&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Please keep an eye on this space, more good stuff is coming. Share your comments, feedback or any contributions, which will help us all learn and grow, with&lt;/em&gt; &lt;a href="http://twitter.com/@theNeomatrix369" rel="noopener noreferrer"&gt;&lt;em&gt;@theNeomatrix369&lt;/em&gt;&lt;/a&gt;; &lt;em&gt;you can find more about me via the&lt;/em&gt; &lt;a href="http://neomatrix369.wordpress.com/aboutme" rel="noopener noreferrer"&gt;&lt;em&gt;About me page&lt;/em&gt;&lt;/a&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  About me
&lt;/h2&gt;

&lt;p&gt;Mani Sarkar is a passionate developer, mainly in the Java/JVM space, currently strengthening small teams and startups and helping them accelerate, working as a freelance software, data and ML engineer, &lt;a href="https://neomatrix369.wordpress.com/about" rel="noopener noreferrer"&gt;more….&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Twitter:&lt;/strong&gt; &lt;a href="https://twitter.com/@theNeomatrix369" rel="noopener noreferrer"&gt;@theNeomatrix369&lt;/a&gt; | &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/neomatrix369" rel="noopener noreferrer"&gt;@neomatrix369&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>dl</category>
      <category>java</category>
    </item>
  </channel>
</rss>
