Leonard Püttmann for Kern AI

Introducing bricks, an open-source content-library for NLP

This week we launched bricks, an open-source library that provides enrichments for your natural language processing projects. Our main goal with bricks is to shorten the time you need to get from idea to implementation. Bricks also integrates seamlessly into our main tool, the Kern AI refinery.

Let's take a closer look at the structure of bricks and how to use it. You'll find bricks here:

https://bricks.kern.ai/home

Structure of a brick module

In each module of bricks, you will find the source code for the function. You can use a bricks module in refinery either by copying the source code directly or via the bricks integration that will be available in the next release of refinery, 1.7. Of course, the code can also be used outside of refinery.

On the right-hand side, you can try out the module directly via a live endpoint that we've deployed. You can use the example input that is already provided, or type in something of your own.

Types of bricks modules

Currently, there are three main types of modules in bricks:

Classifiers:

As the name suggests, these modules classify a text, meaning they assign it a label. Need to find out the language of your text or get its complexity? You'll find what you need in the classifiers!

Extractors:

The extractors are really useful if you would like to pull certain information or entities from your text. Most bricks modules currently fall into this category: you'll find modules to extract metrics, times, names, addresses and many more useful things! We've built all of these modules so that they can be used right away for labeling functions in refinery.

Generators:

This type of module generates some new form of output, such as a translation or a cleaned or corrected version of a text. In the generators, you will also find two premium functions that require an API key from an external provider, in this case for language translation. However, it's very important to us to always provide similar modules that don't need an API key.
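
To make the three types more concrete, here is a minimal sketch of what each kind of function returns. It is not the actual bricks source code: the function names, the tiny stopword lists and the time pattern are all hypothetical stand-ins chosen for illustration, and the real modules may be structured differently.

```python
import re

# Hypothetical stand-ins for illustration, not the actual bricks source code.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "de": {"der", "die", "und", "ist", "das", "nicht"},
}


def classify_language(text: str) -> str:
    """Classifier: return a single label for the whole text."""
    tokens = set(re.findall(r"[a-zäöüß]+", text.lower()))
    # Count how many known stopwords of each language appear in the text.
    scores = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


# Matches 24-hour clock times such as 9:30 or 14:45 (assumed format).
TIME_PATTERN = re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b")


def extract_times(text: str) -> list[tuple[str, int, int]]:
    """Extractor: return (label, start, end) character spans found in the text."""
    return [("time", m.start(), m.end()) for m in TIME_PATTERN.finditer(text)]


def clean_whitespace(text: str) -> str:
    """Generator: return a new, cleaned version of the text."""
    return re.sub(r"\s+", " ", text).strip()
```

In short, a classifier returns one label per text, an extractor returns the positions of what it found, and a generator returns new text.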

Using a bricks module in refinery

Let's say that we have a dataset of news articles and we want to categorize them by their complexity. We go to the sentence complexity module in bricks and copy all of the source code.

We then go back to our project in refinery and create a new attribute calculation, which we can do on the settings page.

We then paste in the code and enter the name of our attribute, in our case the headlines.

As a result, we get the sentence complexity of each headline in our dataset.
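
Conceptually, an attribute calculation runs one function over every record and stores the result as a new attribute. The sketch below illustrates that idea outside of refinery. Everything in it is an assumption made for illustration: records are plain dictionaries, the attribute is called "headline", the function name ac is hypothetical, and the word-length heuristic is a toy stand-in rather than the actual sentence complexity module from bricks.

```python
def sentence_complexity(text: str) -> str:
    """Toy heuristic (not the bricks module): longer words count as more complex."""
    words = text.split()
    if not words:
        return "unknown"
    avg_word_length = sum(len(w.strip(".,!?;:")) for w in words) / len(words)
    if avg_word_length < 4.5:
        return "easy"
    if avg_word_length < 6.5:
        return "medium"
    return "difficult"


# Hypothetical attribute name; replace it with the attribute in your own project.
ATTRIBUTE_NAME = "headline"


def ac(record: dict) -> str:
    """Compute the new attribute value for a single record."""
    return sentence_complexity(record[ATTRIBUTE_NAME])


# A tiny made-up dataset standing in for the news headlines.
records = [
    {"headline": "Dog wins big race"},
    {"headline": "Markets rally after report"},
    {"headline": "Parliament ratifies comprehensive environmental legislation"},
]

for record in records:
    record["complexity"] = ac(record)
    print(record["headline"], "->", record["complexity"])
```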

All of this takes less than a minute to implement.

Contributing to bricks

Like all projects at Kern, bricks is open-source, meaning that you get access to the source code. You can also contribute to bricks if you've built something that you would like to share and that you think would be useful to others. Should you have a great idea or implementation, feel free to open an issue on our GitHub page. You can check the bricks GitHub page here. There you'll also find a detailed explanation of how to contribute to bricks.

We've also made a tutorial on YouTube in which our DevRel guy Div shows you all the necessary steps to contribute.

You may also join our Discord community, where you can ask questions and discuss things with the wonderful Kern community. Join us here: https://discord.gg/WAnAgQEv