DEV Community: Petros Demetrakopoulos

EthairBalloons is now available for Python!

Petros Demetrakopoulos — Wed, 06 Jul 2022 13:12:55 +0000

Yes!
It is finally here.
The much-loved ORM library for the Ethereum Blockchain is now available for Python.

What is EthAir Balloons?

EthAir Balloons is a strictly typed ORM library for the Ethereum blockchain. It allows you to use the Ethereum blockchain as persistent storage in an organized and model-oriented way without writing complex custom smart contracts.

Until now, EthairBalloons was only available for JavaScript, and you could install it via npm (npm install --save ethairballoons).

However, from now on there is an official Python version too.
You can check it out in this GitHub repository and install it via pip install ethairballoons.

petrosDemetrakopoulos / ethairballoons.py

Note: As transaction fees may be huge, it is strongly advised to deploy EthAir Balloons models only in private Ethereum blockchains or locally using ganache.

Installation

pip install ethairballoons

Setup

from ethairballoons import ethairBalloons
# first parameter is the IP of the Ethereum network where we want to store data
# second parameter is the path where the smart contract will be saved
provider = ethairBalloons('127.0.0.1', '../')

mySchema = provider.createSchema(modelDefinition={
    'name': "Car",
    'contractName': "carsContract",
    'properties': [{
            'name': "model",
            'type': "bytes32",
            'primaryKey': True
    },
        {
            'name': "engine",
            'type': "bytes32",
    },
        {
            'name': "cylinders",

…

Quick Tip: Reversing a string in Python

Petros Demetrakopoulos — Wed, 06 Jan 2021 23:04:26 +0000

You can easily reverse a string in Python like this:

print("Hello"[::-1])

This syntax reverses the string by slicing it. In Python, strings are sequences of characters, so they support the same slicing syntax as lists: [start:stop:step]. With a positive step, an empty start means 0 and an empty stop means the length of the string. With a negative step the defaults flip: an empty start means the last character and an empty stop means just before the first one. So [::-1] walks the string from the last character to the first, one character at a time, and returns the result as a new string.
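A short sketch of how the three slice parameters behave may help here. The variable names are only illustrative.

```python
text = "Hello"

# Slice syntax is [start:stop:step]. With a negative step the defaults flip:
# start defaults to the last index and stop to "just before the first one".
reversed_text = text[::-1]
every_second = text[::2]      # characters at indices 0, 2, 4
tail_reversed = text[:1:-1]   # from the end down to (not including) index 1

print(reversed_text)   # olleH
print(every_second)    # Hlo
print(tail_reversed)   # oll
```

Because strings are immutable, each slice returns a new string and leaves text unchanged.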

Face mask detection with Tensorflow CNNs

Petros Demetrakopoulos — Mon, 13 Jul 2020 14:49:18 +0000

COVID-19 has been an inspiration for many software and data engineers during the last months.
This project demonstrates how a Convolutional Neural Network (CNN) can detect if a person in a picture is wearing a face mask or not.
As you can easily understand, this method may be very helpful for the prevention and control of COVID-19, as it could be used in public places like airports, shopping malls, etc.

Defining the problem

Detecting if an image contains a person wearing a mask or not is a simple classification problem.
We have to classify the images into 2 discrete classes: the ones that contain a face mask and the ones that do not.

The dataset

Fortunately, I found a dataset containing faces with and without masks online. It is available at this GitHub link.
It contains 1,376 images. 690 images show people with face masks and 686 images show people without face masks.

Image classification and CNNs

A bit of theoretical background first.
Convolutional Neural Networks (CNN) are neural networks most commonly used to analyze images.
A CNN receives an image as an input in the form of a 3D matrix. The first two dimensions correspond to the width and height of the image in pixels, while the third one corresponds to the RGB values of each pixel.

CNNs consist of the following sequential modules (each one may contain more than one layer)

Convolution
ReLu activation function
Pooling
Fully connected layers
Output layer

Convolution

The convolution operation is an element-wise multiplication of two matrices followed by a sum of the products.
Convolutional layers take the three-dimensional input matrix we mentioned before and pass a filter (also known as a convolutional kernel) over the image, applying it to a small window of pixels at a time (e.g. 3x3 pixels) and moving this window until the entire image has been scanned. The convolution operation calculates the dot product of the pixel values in the current filter window and the weights defined in the filter. The output of this operation is the convolved image (also called a feature map).

The following animation (found in the Google developers portal) shows how the window slides over an image:

The core idea of image classification CNNs is that, as the model trains, it learns the values of the filter matrices that enable it to extract important features (shapes, textures, colored areas, etc.) from the image. Each convolutional layer applies a new set of filters to the output of the previous layer, extracting further features. So, the more filters we stack, the more features the CNN can extract from an image.
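The sliding-window dot product described in this section can be sketched in plain Python. This is only an illustration under simple assumptions (one channel, no padding, stride 1), and the image and filter values are made up. Like deep learning frameworks, it does not flip the kernel. In a real CNN the filter weights are learned during training rather than chosen by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the current window element-wise with the kernel and sum the products.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h)
                for dj in range(k_w)
            )
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 single-channel image with a vertical edge in the middle.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 3x3 vertical-edge filter.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[3, 3], [3, 3]]
```

The filter responds strongly wherever the window covers the dark-to-bright transition, which is the kind of feature extraction the text describes.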

ReLu activation function

After each convolution operation, the CNN applies a Rectified Linear Unit (ReLu) function to the convolved image.
As you may remember from the Machine Learning 101 course in university, ReLu is very commonly used in machine learning applications because it introduces nonlinearity into the model: it returns 0 for negative inputs and the input itself otherwise. Without a nonlinearity, stacked layers would collapse into a single linear transformation, so this is what lets the model learn complex patterns.
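As a minimal illustration, ReLu can be written in one line. The input values below are made up.

```python
def relu(x):
    # Negative values become 0, everything else passes through unchanged.
    return max(0.0, x)


feature_row = [-2.0, -0.5, 0.0, 1.5, 3.0]
activated = [relu(v) for v in feature_row]
print(activated)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```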

Pooling

Pooling is the process where the CNN downsamples the convolved image by reducing the width and height of the feature map.
It does so to reduce processing time and the computing power needed.
During this process, it preserves the most important feature information. There are several methods that can be used for pooling. The most common ones are max pooling and average pooling.
In our application, we will use max pooling as it is usually the most effective.
Max pooling is very similar to the convolution process. A window slides over the feature map and extracts tiles of a specified size. For each tile, max pooling picks the maximum value and adds it to a new feature map.
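The max pooling step just described can be sketched in plain Python as follows. The 2x2 window, the stride of 2 and the sample values are assumptions chosen for illustration.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size tile of the feature map."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            tile = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(tile))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 4], [8, 9]]
```

The 4x4 map shrinks to 2x2 while the strongest response of each tile is kept.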

The following animation (found in the Google developers portal) shows how the max pooling operation is performed.

Fully connected layers

After pooling, there are always one or more fully connected layers. These layers perform the classification based on the features extracted from the image by the previously mentioned convolution processes. The last fully connected layer is the output layer, which applies a softmax function to the output of the previous fully connected layer and returns a probability for each class.
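To make the softmax step concrete, here is a small sketch that turns two raw scores into class probabilities. The scores are made up. Subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(v - highest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs for the two classes.
probs = softmax([2.0, 0.5])
predicted = probs.index(max(probs))
print(probs, predicted)
```

The class with the largest raw score also gets the largest probability, so predicted is the index of that class.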

The general form of an image classification CNN is the one shown below:

Face mask detection

I used Tensorflow and Keras to create the CNN that classifies the images as with or without mask.

First, we need to randomly split the dataset into separate train / test sets.
We do so with the following function:

import os
import random
from shutil import copyfile

def train_test_split(source, trainPath, testPath, split_size):
    # keep only the non-empty files of the source directory
    dataset = []
    for crnImage in os.listdir(source):
        data = source + '/' + crnImage
        if(os.path.getsize(data) > 0):
            dataset.append(crnImage)
    train_len = int(len(dataset) * split_size)
    # shuffle before slicing, so that the split is actually random
    shuffled = random.sample(dataset, len(dataset))
    train = shuffled[0:train_len]
    test = shuffled[train_len:len(shuffled)]
    print("train images:", len(train))
    print("test images:", len(test))

    # copying train and test images in separate directories
    for trainDataPoint in train:
        crnTrainDataPath = source + '/' + trainDataPoint
        newTrainDataPath = trainPath + '/' + trainDataPoint
        copyfile(crnTrainDataPath, newTrainDataPath)

    for testDataPoint in test:
        crnTestDataPath = source + '/' + testDataPoint
        newTestDataPath = testPath + '/' + testDataPoint
        copyfile(crnTestDataPath, newTestDataPath)

We then call it twice (once for the images that contain a mask and once for the images that do not) with a train / test split of 80% (80% used for training and 20% for testing).

train_test_split('data/with_mask', 'data/train/training_with_mask', 'data/test/test_with_mask',0.8)
train_test_split('data/without_mask','data/train/training_without_mask', 'data/test/test_without_mask',0.8)

The model

The definition of the model is presented below:

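import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense

# IMG_WIDTH and IMG_HEIGHT are constants defined elsewhere in the project
model = tf.keras.models.Sequential([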
    Conv2D(32, 3, activation='relu', input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)),
    MaxPooling2D(2,2),
    Conv2D(64, 3, activation='relu'),
    MaxPooling2D(2,2),
    Conv2D(128, 3, padding='same', activation='relu'),
    MaxPooling2D(2,2),
    Flatten(),
    Dropout(0.3), 
    Dense(256, activation='relu'),
    Dense(2, activation='softmax') # dense layer has a shape of 2 as we have only 2 classes 
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

The model consists of 10 layers in total.
The first 6 layers form 3 sequential Convolution - ReLu - Pooling groups.
Then, a flatten layer is applied to reshape the output of the CNN to a single dimension.
After the flatten layer, a dropout layer is applied. During training, this layer randomly sets 30% (rate = 0.3) of its input units to zero in order to reduce overfitting.
In the end, a fully connected (dense) layer is applied that classifies the images based on the features extracted in the previous layers of the CNN, and the final layer outputs the probability of each class label.

We also use binary_crossentropy as the loss function because our data contain only 2 classes.

Training the model

We train the model with the following function.
First, we open 2 data streams ("flows") from the 2 directories of train and test (validation) images.
We also save checkpoints during training in separate directories for each checkpoint.
Finally, we call the fit_generator function of the model and training begins.
During the process, we keep track of training and validation accuracy and loss (we will use the values later to plot learning curves).

def trainModel():
  training_dir = "data/train"
  train_datagen = ImageDataGenerator(rescale=1.0/255,
                                     rotation_range=40,
                                     width_shift_range=0.2,
                                     height_shift_range=0.2,
                                     shear_range=0.2,
                                     zoom_range=0.2,
                                     horizontal_flip=True,
                                     fill_mode='nearest')

  train_generator = train_datagen.flow_from_directory(training_dir, 
                                                      batch_size=10, 
                                                      target_size=(IMG_WIDTH, IMG_HEIGHT))
  validation_dir = "data/test"
  validation_datagen = ImageDataGenerator(rescale=1.0/255)
  validation_generator = validation_datagen.flow_from_directory(validation_dir, 
                                                           batch_size=10, 
                                                           target_size=(IMG_WIDTH, IMG_HEIGHT))
  checkpoint = ModelCheckpoint('model-{epoch:03d}.model',monitor='val_loss',verbose=0,save_best_only=True,mode='auto')

  history = model.fit_generator(train_generator,
                                epochs=epochs,
                                validation_data=validation_generator,
                                callbacks=[checkpoint])
  global acc
  acc = history.history['accuracy']
  global val_acc
  val_acc = history.history['val_accuracy']
  global loss
  loss = history.history['loss']
  global val_loss
  val_loss = history.history['val_loss']

Then, we label the outputs of the CNN and apply colors to the results (red for without mask, green for with mask) as follows:

labels_dict = {0:'without mask',1:'with mask'}
color_dict = {0:(0,0,255),1:(0,255,0)}

Implementing face detection

Inspired by this face and mask detection article, I used the OpenCV framework to implement live face detection using the default webcam of the computer. I used the very common Haar feature-based cascade classifiers for detecting the features of the face. This cascade classifier ships with OpenCV and has been trained on thousands of images to detect frontal faces.

Face and mask detection is performed in the following code (courtesy of Gurucharan M K).

# Initializing webcam to live preview face mask detection ROIs on faces
# Seen this in this repo https://github.com/mk-gurucharan/Face-Mask-Detection
import cv2
import numpy as np

labels_dict = {0:'without mask',1:'with mask'}
color_dict = {0:(0,0,255),1:(0,255,0)}

size = 4
webcam = cv2.VideoCapture(0)
classifier = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

while True:
    (rval, im) = webcam.read()
    im = cv2.flip(im,1,1) #Flip to act as a mirror

    # Resize the image to speed up detection
    mini = cv2.resize(im, (im.shape[1] // size, im.shape[0] // size))
    faces = classifier.detectMultiScale(mini)

    # Draw rectangles around each face
    for (x, y, w, h) in faces:
        # faces were detected on the downscaled frame, so scale the box back up
        face_img = im[y * size:(y + h) * size, x * size:(x + w) * size]
        resized=cv2.resize(face_img,(IMG_WIDTH, IMG_HEIGHT))
        normalized = resized / 255.0
        reshaped = np.reshape(normalized,(1,IMG_WIDTH,IMG_HEIGHT,3))
        reshaped = np.vstack([reshaped])
        result = model.predict(reshaped)

        label = np.argmax(result,axis=1)[0]

        cv2.rectangle(im,(x* size,y* size),((x+w)* size,(y+h)* size),color_dict[label],2)
        cv2.rectangle(im,(x * size,(y* size)-40),((x+w)* size,y* size),color_dict[label],-1)
        cv2.putText(im, labels_dict[label], (x* size + 10, (y* size)-10),cv2.FONT_HERSHEY_DUPLEX,0.8,(255,255,255),2)

    cv2.imshow('Mask Detection', im)
    key = cv2.waitKey(10)
    if key == 27:
        break
webcam.release()
cv2.destroyAllWindows()

Benchmarks

The learning curves (training and validation accuracy and loss) of the model are the following for 30 epochs of training:

As you can see, we achieved an accuracy of more than 98% on the validation dataset, which is nice!

On my 2.5 GHz i7, 16 GB RAM MacBook Pro, the model took almost 17 minutes to train.

Full project

The full project is available in this GitHub repository:

petrosDemetrakopoulos / face-mask-detector

11 weird questions I was asked in interviews

Petros Demetrakopoulos — Wed, 24 Jun 2020 16:05:59 +0000

How much is 17x17?
The answer is 289 and of course, I answered it wrong. However, the recruiter who asked this question mentioned that he did it to “check how I think”, so I believe he would have liked to see me reasoning that 17x17 is 17^2, which is (10 + 7)^2, which expands to 10^2 + 2x10x7 + 7^2, which equals 100 + 140 + 49 = 289.
Pretty weird though.
Do you know how Shazam works?
In layman’s terms, Shazam captures a sound sample of a song and then analyses it into a table of frequencies that the song contains and how frequently these frequencies are presented in a specific song. This distribution of frequencies is unique for each song and thus a song can be uniquely identified by it. This is achieved with algorithms and methods like Discrete Fourier Transform (DFT) derived from the fields of signal processing, mathematics, and physics. You can find more about how Shazam works on this awesome blog post.
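To illustrate only the frequency-analysis building block mentioned in this answer (not Shazam's actual fingerprinting algorithm), here is a naive Discrete Fourier Transform in plain Python. The synthetic signal, its length and its two frequencies are assumptions made up for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns the magnitude of each frequency bin."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):  # for a real signal the upper half mirrors the lower one
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(coeff))
    return magnitudes


# Synthetic "sound": a strong tone in bin 5 plus a weaker tone in bin 12.
n = 64
signal = [
    math.sin(2 * math.pi * 5 * t / n) + 0.5 * math.sin(2 * math.pi * 12 * t / n)
    for t in range(n)
]

magnitudes = dft_magnitudes(signal)
strongest = sorted(range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True)[:2]
print(sorted(strongest))  # [5, 12]
```

The two dominant bins recover exactly the frequencies that were mixed into the signal, which is the kind of information a frequency table of a song is built from.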
What was the most self-explanatory and streamlined solution you provided in a complex programming problem you faced?
As you can guess, there is no right or wrong answer to this question and it is personal and different for everyone. I answered with a solution I provided to a queue waiting problem. My team and I solved the problem in the database level of the app with SQL stored procedures and triggers.
What is the most important thing that your parents offered you?
This was a tricky one. I answered that one of the most important things that my parents offered me was
- Tuition in 2 very good private schools. I explained that the overall experience that these schools offered me was much more than just better teachers and better education. These 2 schools offered me many opportunities in culture, sports, science clubs, etc.
- My parents also offered me many stimuli and incentives to fill my spare time. Walks in art exhibitions, museums, performance of science experiments at home during the weekends, and of course coding were some of them.
What is the difference between a process and a thread?
I had to remember some basic principles from the Operating Systems course in university to answer this correctly. The difference is crucial and simple: Both threads and processes are “independent sequences of execution”. However, threads that run on the same process, run in a shared memory space while processes run in separate memory spaces. This StackOverflow answer makes it pretty clear.
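The shared memory space of threads can be seen in a short Python sketch: four threads update the same dictionary, and a lock keeps the updates from interleaving. The names and counts are illustrative. Separate processes would each work on their own copy of counter unless they used explicit inter-process communication.

```python
import threading

counter = {"value": 0}   # lives in the memory shared by all threads of this process
lock = threading.Lock()


def worker():
    for _ in range(1000):
        with lock:  # serialize access to the shared dictionary
            counter["value"] += 1


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])  # 4000
```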
Would it be possible to run a server on a PC that monitors all the transactions of the Ethereum blockchain in real-time?
The answer is no. According to this StackExchange answer, “Syncing the Ethereum blockchain with Geth in --fast mode has two phases running in parallel: block sync and state trie download. Both phases need to be done in order to have a full node and switch to full mode where every transaction is executed and verified. The block sync downloads all the block information (header, transactions). This phase uses a lot of CPU and space to store all the data. However, in fast mode no transactions are executed, so we do not have any account state available (ie. balances, nonces, smart contract code and data). Geth needs to download and cross-check with the latest block the state trie. This phase is called state trie download and usually takes longer than the block sync. When you are between 64 and 128 blocks behind, it usually means you finished the block sync phase and during the state trie download phase, the block number count will always oscillate between 64 and 128 blocks behind the latest block mined on Ethereum. This is normal until the state trie download phase ends and your node is fully synced. However, using an HDD, you might not be able to keep up and have a high enough disk write rate to catch the head (latest state entry).”
My personal experience is that even the SSD in my MacBook Pro failed to sync because of its disk write rate and disk space, so in my experience it is not feasible to sync the entire Ethereum blockchain with a typical personal computer.
Why do you prefer React over Angular?
I know that React is a library and Angular is a framework and thus comparing them may seem like comparing apples to oranges. However, I answered that I prefer React because it is more lightweight in terms of payload (in front-end development, the payload of a library or framework is something that a developer should really care about because the app may be used over mobile or really slow internet connections) while it offers the functionality I need. I also explained that the so-called 2-way binding functionality of Angular is not really missing from React, as React accomplishes the same result using state.
Did your team write tests in your previous job?
The answer there was pretty simple and straightforward. Yes, in my previous job the team wrote tests, we had an organized CI/CD track that ran all the tests before the deployment. I also mentioned we used multiple git branches for the different development and deployment stages (development, staging, production, etc.)
What is the difference between git rebase and git merge?
Git rebase and merge are both used to integrate changes from one branch into another. Their difference lies in how this result is achieved. Merge adds a new commit (a merge commit) that ties the two branches together, preserving the existing history and its chronological order. Rebase replays the commits of a branch on top of the target branch (e.g. master), creating new commits and a linear history. We could say that merge preserves the commit history while rebase rewrites it. You can learn more about git rebase vs git merge in this awesome Atlassian tutorial.
What is the difference between UNION and UNION ALL in SQL?
The UNION SQL command is used to combine the results of 2 or more SELECT statements. The difference between UNION and UNION ALL is that UNION eliminates duplicate records while UNION ALL preserves them. As you can imagine, using UNION instead of UNION ALL has a performance cost, since the DB server must do additional work to remove duplicate rows.
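The difference is easy to see with Python's built-in sqlite3 module and an in-memory database. The table names and rows below are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers_2019 (email TEXT);
    CREATE TABLE customers_2020 (email TEXT);
    INSERT INTO customers_2019 VALUES ('ann@example.com'), ('bob@example.com');
    INSERT INTO customers_2020 VALUES ('bob@example.com'), ('cat@example.com');
""")

# UNION removes the duplicate 'bob@example.com' row.
union_rows = conn.execute(
    "SELECT email FROM customers_2019 UNION "
    "SELECT email FROM customers_2020 ORDER BY email"
).fetchall()

# UNION ALL keeps every row from both SELECT statements.
union_all_rows = conn.execute(
    "SELECT email FROM customers_2019 UNION ALL "
    "SELECT email FROM customers_2020 ORDER BY email"
).fetchall()

print(len(union_rows))      # 3
print(len(union_all_rows))  # 4
conn.close()
```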
From which club did you get your sailing license?
A funny one, probably the recruiter used it as an ice-breaker. He was a licensed open sea sailing captain too, so he asked in which club I did the courses and if I had ever taken part in sailing races.

Conclusion

Apart from the classic interview questions, these were some that I found interesting and/or weird. If you have any experience with interesting or weird interview questions, feel free to discuss them in the comment section.

Unit testing Node.JS APIs

Petros Demetrakopoulos — Sat, 18 Apr 2020 11:56:36 +0000

As a professional software developer dedicated to Node.js RESTful APIs, I have come to the following conclusion:

Developers are not paid to write code.
They are paid to deliver tech
solutions.

And these solutions should be...

Concrete and robust
Have high availability no matter the load
Reliable
Secure
Cost-effective
Maintainable

Developers should also be able to provide evidence that their solutions match the criteria mentioned above. They should also be able to quickly and easily detect and fix any bug or issue that may occur.

And that's where unit testing comes in.

Definition

Unit Testing is a level of software testing where individual units/ components of a software are tested. The purpose is to validate that each unit of the software performs as designed.
Source: softwaretestingfundamentals.com

But which are the units in an API?

The units in an API consist of:

API requests
- HTTP method (e.g. GET, POST, PUT etc.)
- API endpoint (e.g. /v1/posts)
- Request parameters
- Request headers
- Request body
Models
- Properties / fields
- Model methods

Learning by example: An example API

For the purposes of this article, we will use an example API for a classic book Library (yes, the original one where you can borrow books, study, etc.)

The API will be composed of the following elements:

Entities / Models
- Books
- Users
Endpoints
- GET /users
- POST /user
- GET /books
- POST /book

The endpoints have the form shown in the following screenshots:
We use faker.js to generate the dummy data that the API will use.

GET /users endpoint

POST /user endpoint

GET /books endpoint

POST /book endpoint

So far so good. I think it is crystal clear what each endpoint does and the form of the data that it responds with.

An example response for the GET /users endpoint looks like this:

But what do we really want to test?

By writing unit tests for an API, we try to answer questions like these:

Does GET /users always respond with an array of user objects?
Does POST /book always respond with the book object submitted?
Does POST /user respond with the right error code when one or more required fields are missing?
Does POST /user respond with the right error code when the email does not have the correct format?

Of course, there are many more questions that we may want to answer to be sure that our API works as expected but for our example, those are some important ones.

Let’s grab a cup of coffee (or tea ?)

The 2 main libraries we use to write unit tests for Node.JS applications are Mocha which is the main unit testing framework and Chai which is the assertion library. Chai provides the functions that make the checks we want to perform a lot easier.

e.g.

response.should.be.a('string');
response.should.not.have.property('phone');

The Chai library has 3 main interfaces that do the same thing with different syntax:

should
assert
expect

e.g. the following 3 lines perform exactly the same test.

email.should.be.a('string')
expect(email).to.be.a('string')
assert.typeOf(email, 'string')

A look in the `package.json` file of our project

In order to run tests with the default npm test command we should add the following value in the scripts key of our package.json file.

"scripts": {
 "test": "nyc mocha --timeout 10000"
}

We set the timeout for each test case (a test case performs an API call) to 10K ms (or 10s).

The anatomy of a test

As you can see, a test is composed of:

The dependencies (common for many test cases)
A name and a description
The API call
The actual tests (assertions)
The callback that notifies mocha library that the test has completed.

Coverage reports and nyc

nyc is a tool that reports how much of the total code is covered by the tests we have written. It also reports all the uncovered lines, so we know where to look and what tests to write.

A coverage report after the completion of the tests looks like this:

Some good practices regarding unit tests

It is a good practice to save the different payloads we use to test POST endpoints in separate .txt or .json files.
We should also create different test declarations for the different things / functions we want to check.
We should also try to write tests in order to form different “scenarios”.
e.g. The DB is initially empty, so we POST a new user, then the created user POSTs a new book, then we DELETE the book and then the user etc.
We should also write tests to check error codes and errors. Bugs and issues may be hidden in the validation logic.
We should also write tests checking the access level if our API has different user types with different access levels.
Finally, we should try to reach the highest coverage we can. However, we should always keep in mind that reaching 100% is rarely practical.

That's all folks!

I hope you enjoyed it and that it will help you to write unit tests for your Node.JS API in the future.

Security in Node.JS and Express: The bare minimum - Part 3.

Petros Demetrakopoulos — Wed, 15 Apr 2020 14:27:56 +0000

In the previous part, we covered

XSS Attacks
SQL injections
RegEx Denial of Service

Security in Node.JS and Express: The bare minimum - Part 2.

In this part, we will cover

Cross-Site Request Forgery Attacks (CSRF)
Rate Limiting
Data Sanitization

Cross-Site Request Forgery

Cross-Site Request Forgery according to OWASP:

Cross-Site Request Forgery (CSRF) is an attack that forces an end user to execute unwanted actions on a web application in which they’re currently authenticated. CSRF attacks specifically target state-changing requests, not theft of data, since the attacker has no way to see the response to the forged request.

In order to prevent this kind of attack, we should implement a synchronizer CSRF token policy.

A CSRF token is a simple string that the server sets when the user requests a page that contains a form, and it expects the same CSRF token back when the POST request is made. If the CSRF tokens do not match or if the CSRF token is not in the form data, the POST request is not allowed. The CSRF token is unique for each user session and, most of the time, it expires after a given time span.
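The token check itself is independent of any framework. The following Python sketch illustrates the idea under simple assumptions: an in-memory dictionary stands in for the session store, and the function names are hypothetical. It does not show how any particular package implements it.

```python
import hmac
import secrets

# Hypothetical in-memory session store: session id -> CSRF token.
sessions = {}


def issue_csrf_token(session_id):
    """Generate a random token when the page containing the form is requested."""
    token = secrets.token_urlsafe(32)
    sessions[session_id] = token
    return token


def is_valid_csrf(session_id, submitted_token):
    """Allow the POST only if the submitted token matches the one in the session."""
    expected = sessions.get(session_id)
    if expected is None or not submitted_token:
        return False
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected, submitted_token)


token = issue_csrf_token("session-abc")
print(is_valid_csrf("session-abc", token))     # True
print(is_valid_csrf("session-abc", "forged"))  # False
print(is_valid_csrf("session-abc", None))      # False
```

A forged request from another site cannot read the token stored in the victim's session, so it fails the comparison.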

In Express applications we can implement a CSRF policy with the help of csurf npm package.
The package can be used in one line and it handles everything related to the CSRF tokens for all the users.

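So in the back-end the setup looks like this:

var express = require('express');
var csrf = require('csurf');
var app = express();
// csurf needs session or cookie-parser middleware to be registered before it
app.use(csrf());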

app.use(function(req, res, next) {
  res.locals._csrf = req.csrfToken();
  next();
});

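And in the front-end, each form looks like this:

<html>
  <form method="post" action="changeEmail">
    <!-- the token comes from res.locals._csrf; the {{ }} placeholder assumes a Handlebars-style template engine -->
    <input type="hidden" name="_csrf" value="{{_csrf}}">
    <input type="email" name="newEmail">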
  </form>
</html>

Rate Limiting

One other crucial aspect of the security of your Express application is rate limiting. As you may already know, rate limiting is the policy that controls the rate of requests that your server can receive from a specific user and / or IP address. This helps us mitigate DoS attacks.
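As a rough sketch of what such a policy does, here is a minimal fixed-window limiter in Python. The class name, the limits and the sample IP address are assumptions for illustration, and real packages may use different strategies internally.

```python
class FixedWindowLimiter:
    """Allow at most max_requests per client within each time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.windows = {}  # client key -> (window start, request count)

    def allow(self, client, now):
        start, count = self.windows.get(client, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window has expired
        if count >= self.max_requests:
            self.windows[client] = (start, count)
            return False  # over the limit: the request should be rejected
        self.windows[client] = (start, count + 1)
        return True


limiter = FixedWindowLimiter(max_requests=3, window_seconds=60)
# Timestamps are passed in explicitly (in seconds) to keep the example deterministic.
results = [limiter.allow("203.0.113.7", now=t) for t in (0, 1, 2, 3, 61)]
print(results)  # [True, True, True, False, True]
```

The fourth request inside the same minute is rejected, and the client is allowed again once a new window starts.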

express-rate-limit npm package enables us to apply policies like the ones mentioned above in a really easy way.

For example:

var rateLimit = require("express-rate-limit");

app.set('trust proxy', 1); // add this line only if your server is behind a proxy

var limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per windowMs
  delayMs: 0 // disable delaying - user has full speed until the max limit is reached
});

app.use(limiter); // applies rate limiting policy to every endpoint of the server
// we could also apply policies for specific routes or even different policies for each route

express-rate-limit allows us to apply rate limiting policies to all the endpoints of our Express server or even different policies for each route.

e.g. this example applies a rate limiting policy only to the endpoints starting with /api.

var rateLimit = require("express-rate-limit");
var apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100
});

// only apply to requests that begin with /api/
app.use("/api/", apiLimiter);

Important note: Static resources such as images, CSS stylesheets and front-end JavaScript files count as requests as well if we serve them through our Express server (which is a bad practice anyway, we should prefer CDNs for static resources).

Data sanitisation and validation

It is an important process that must take place in every endpoint where the user interacts with the server by submitting data. It protects the server from most of the flaws mentioned in this series of articles. When we are validating data, we are interested in checks like "Is this a correct e-mail address?", "Is it an integer?", "Is it a valid telephone number?" etc.

A very useful npm package that helps us perform this kind of check on user input is express-validator.

express-validator allows us to define "check schemas" for each endpoint in pure JSON. It also allows us to set the error messages sent back to the user if a validation for a field fails.

An example is given below:

var checkSchema = require('express-validator').checkSchema;

app.put('/user/:id/password', checkSchema({
   id: {
     // the location of the field can be one or more of 'body', 'cookies',
     // 'headers', 'params' or 'query'.
     // If omitted, all request locations will be checked
     in: ['params','query'],
     isInt: true,
     errorMessage: 'ID is wrong'
   },
   password: {
      isLength: {
         errorMessage: 'Password should be at least 7 characters long',
         options: {min: 7}
      }
   }
}), function (req, res) {
   // the request has been validated at this point; handle it here
});

express-validator offers many useful keys and functions such as isIn(), exists(), isUUID(), isPostalCode(), trimming functions etc. It also allows us to implement custom validation and sanitisation logic.
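To show what the two rules of the check schema above boil down to, here is a framework-independent sketch in Python. The function name and the shape of the params dictionary are hypothetical, and this is not how express-validator is implemented.

```python
import re


def validate_password_change(params):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    # Rule 1: the id must be an integer (digits only in this simplified check).
    if not re.fullmatch(r"[0-9]+", str(params.get("id", ""))):
        errors.append("ID is wrong")
    # Rule 2: the password must be a string of at least 7 characters.
    password = params.get("password")
    if not isinstance(password, str) or len(password) < 7:
        errors.append("Password should be at least 7 characters long")
    return errors


valid_errors = validate_password_change({"id": "42", "password": "s3cretpw"})
invalid_errors = validate_password_change({"id": "4 2", "password": "short"})
print(valid_errors)    # []
print(invalid_errors)
```

The second call fails both rules, so both error messages are returned to the caller.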

That's all folks (for now...)

I hope you found it interesting and that it will help you build more secure and robust Node.JS and Express apps.

Security in Node.JS and Express: The bare minimum - Part 2.

Petros Demetrakopoulos — Fri, 10 Apr 2020 09:28:43 +0000

In the previous part, we covered

Server side JS injection
“Use strict”
Helmet
Changing default error pages
Proper session management

Security in Node.JS and Express: The bare minimum - Part 1.

Petros Demetrakopoulos ・ Apr 6 '20

#javascript #node #security #webdev

In this part, we will cover

XSS Attacks
SQL injections
RegEx Denial of Service

XSS Attacks in general

XSS (Cross-Site Scripting) attacks allow intruders to execute scripts in the victims' browsers. In that way, they can access cookies, session tokens and other sensitive info or redirect users to malicious sites. It is one of the most common ways an intruder can take over a webpage.

Example:
Let's say we have the following sign-up form that sends data to our Express server:

If we do nothing about it, a payload such as <script>alert(document.cookie)</script> typed into the username field will be saved in our database backend, and when we fetch and render the username of that user in the future, the browser will run the script and the user will see an alert displaying their cookies.

As you can imagine, this vulnerability may have catastrophic consequences as it may expose critical information and data. Actually, some of the most famous attacks on the web have been performed by exploiting this vulnerability. A classic example is this 2014 attack on Twitter.

XSS Attacks - How to prevent them

There are a bunch of things we can do to secure our Express server against XSS attacks. First of all, we should always perform data validation and sanitization. This means that for every incoming request we should check that the input parameters given by the user are in the format that the server and the database expect. Another useful tip is to set the cookie httpOnly value to true because it prevents cookies from being accessed by browser JS scripts.

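// express.session was removed in Express 4; the middleware now lives in the express-session package
var session = require('express-session');

app.use(session({
    secret: "s3Cur3",
    cookie: {
        httpOnly: true,
        secure: true
    }
}));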

Also, we should always HTML-escape data before inserting it into HTML elements (e.g. convert & to &amp; and < to &lt; etc). This will probably neutralize some XSS threats. We should also do this for JSON values presented in an HTML context and read the data with JSON.parse().
Finally, we should use the “xss” npm package, which performs many of the counter-measures mentioned above for us.
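
As a rough illustration of what HTML escaping means, here is a minimal sketch. The escapeHtml name and the exact set of escaped characters are our own choices for this example; a maintained library is still the better option in production.

```javascript
// Minimal HTML-escaping helper (illustrative; the function name is our own).
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

function escapeHtml(value) {
  // String() guards against non-string input such as numbers or null
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

const username = '<script>alert(document.cookie)</script>';
// The browser now shows the payload as text instead of running it
console.log('<p>Hello, ' + escapeHtml(username) + '</p>');
```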

SQL injections in general

Let's say that in a login endpoint, we receive the username and the password of the user in the following way (to simplify the case, let's assume that no password hashing policy is applied).

app.post('/login', function (req, res) {
    var username = req.body.username;
    var password = req.body.password;

    // vulnerable: user input is concatenated directly into the SQL string
    var sql = 'SELECT * FROM Users WHERE Name ="' + username + '" AND Pass ="' + password + '"';
    // and then executing the query in our SQL database
});

What if a malicious user types " or ""=" in both the username and password fields?
The SQL query that we are about to execute will look like this:

SELECT * FROM Users WHERE Name ="" or ""="" AND Pass ="" or ""=""

The OR ""="" condition is always true!
So the query returns all the rows of the “Users” table.

SQL injections - How to prevent them

Once again, data validation and sanitization is the best way to eliminate these threats. npm packages like sqlstring escape user input values, which makes the vulnerability very difficult for a malicious user to exploit. Also, packages like sql-query-builder that let you build SQL queries in a more structured way, like this

query().select([users.id.as('User'), users.id.count(1)])
    .from(users).join(posts)
    .on(posts.user_id).equals(users.id)
    .groupBy(users.id);

are far better in security terms than string concatenated SQL queries.
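
The underlying idea is to keep the SQL text constant and send the user input separately as data. Below is a small sketch of that idea. buildLoginQuery is a hypothetical helper, and the "?" placeholder style is an assumption (it is the style used by drivers such as mysql); check what your own driver expects.

```javascript
// Sketch only: buildLoginQuery is a hypothetical helper and the "?" placeholder
// style is an assumption borrowed from drivers such as mysql.
function buildLoginQuery(username, password) {
  return {
    // the SQL text is constant; user input never becomes part of it
    sql: 'SELECT * FROM Users WHERE Name = ? AND Pass = ?',
    values: [username, password]
  };
}

// The vulnerable string-concatenated version, for comparison
function buildLoginQueryUnsafe(username, password) {
  return 'SELECT * FROM Users WHERE Name ="' + username + '" AND Pass ="' + password + '"';
}

const payload = '" or ""="';
console.log(buildLoginQueryUnsafe(payload, payload)); // the input rewrites the query logic
console.log(buildLoginQuery(payload, payload));       // the input stays in `values` as plain data
```

With a driver that supports placeholders, the pair would typically be handed over together, along the lines of connection.query(q.sql, q.values, callback), so the driver does the escaping for us.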

RegEx Denial of Service

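Some regular expressions may be “unsafe” for some inputs. For example, the regex (a+)+, when it has to match the whole input (i.e. ^(a+)+$), is unsafe for the input aaaaaaaaaaaaaaaaaaaaa! because the regex engine backtracks through exponentially many ways of splitting the run of a's before giving up, which can block the server and cause a Denial of Service.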

Fortunately there is an NPM package that helps us detect vulnerable RegExes and it is called “safe-regex”
It is used like this:

var safe = require('safe-regex');
var regex = new RegExp('(a+)+');
console.log(safe(regex));

It will return a boolean value indicating if the regex is safe or not.
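
To get a feeling for the problem, we can time a vulnerable pattern against a harmless rewrite of it. This is only a sketch: it anchors the pattern as ^(a+)+$ so that the whole input has to match (which is what forces the backtracking), the timeMatch helper is our own, and the timings will differ from machine to machine.

```javascript
// Illustrative timing of catastrophic backtracking (numbers vary by machine).
function timeMatch(regex, input) {
  const start = process.hrtime.bigint();
  const matched = regex.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, elapsedMs };
}

const vulnerable = /^(a+)+$/; // nested quantifiers
const rewritten = /^a+$/;     // accepts the same strings without nesting

// Keep n small: the work for the vulnerable pattern roughly doubles with every extra "a"
for (const n of [14, 18, 22]) {
  const input = 'a'.repeat(n) + '!';
  const slow = timeMatch(vulnerable, input);
  const fast = timeMatch(rewritten, input);
  console.log(n, slow.matched, slow.elapsedMs.toFixed(2) + ' ms', fast.elapsedMs.toFixed(2) + ' ms');
}
```

Both patterns reject the input, but only the nested one needs more and more time as the input grows, so rewriting the pattern is often the simplest fix.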

That's all folks (for now...)

I hope you find it interesting and it will help you build more secure and robust Node.JS and Express apps.
In the next part we will cover Cross-Site Request Forgery, Rate Limiting and Data Sanitization.

Security in Node.JS and Express: The bare minimum - Part 1.

Petros Demetrakopoulos — Mon, 06 Apr 2020 08:43:48 +0000

Node.JS is one of the most popular platforms for developing the back-end part of an application. However, this does not mean that it is free of vulnerabilities; there are many that the developer should be aware of and take action to neutralize.

What we will cover in this article

Server side JS injection
“Use strict”
Helmet
Changing default error pages
Proper session management

Server side JS injection

Also known as SSJS.
It is the case where user input is passed directly into native JS functions like eval(), setTimeout(), setInterval() or Function(). These functions enable the client to execute malicious JavaScript code on the server side. It could be a process.exit() command that would kill the server, or a call to the file system. So we should avoid passing user input to these functions at any cost. Doing so is bad practice even if we validate and sanitize user input data. If the goal is to parse user-supplied data, just use JSON.parse(); it is much safer.
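
A small sketch of the difference: JSON.parse() only ever produces data, so code sent by the client is rejected instead of executed. parseUserInput is a hypothetical helper name used just for this example.

```javascript
// Sketch: treat user input as data, never as code. parseUserInput is a hypothetical helper.
function parseUserInput(raw) {
  try {
    return JSON.parse(raw); // only accepts JSON data, never executes anything
  } catch (err) {
    return null; // not valid JSON: reject it instead of falling back to eval()
  }
}

console.log(parseUserInput('{"quantity": 3}'));
console.log(parseUserInput('process.exit()')); // would kill the server if passed to eval()
```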

Use strict

The "use strict" directive should be declared at the top of every JS file of our Node.JS application. It enables "strict mode", which does not allow some actions such as using a variable without declaring it (e.g. x = 5.2) or deleting variables and functions. It also limits eval() use cases and possible exploits.

Helmet

Helmet is an npm package (you can install it by typing npm i helmet) that sets various HTTP headers that may pose a threat if left with their default values. It sets the Content-Security-Policy header and it also lets the developer remove the X-Powered-By header or set it to something other than the default value, so the intruder is not aware of the real stack behind the application running on the server. Finally, it protects you from a bunch of other things like clickjacking, and it can disable client-side caching.

Changing default error pages

There is no point in hiding the X-Powered-By header if we keep the default error pages of Express, because the intruder can still tell that our server runs on Express. We can replace the default error pages as shown in the snippet below:

// Handle 404
app.use(function(req, res) {
    res.status(404);
    res.render('404', {title: '404: File Not Found'});
});

// Handle 500
app.use(function(error, req, res, next) {
    res.status(500);
    res.render('500', {title: '500: Internal Server Error', error: error});
});

Proper session management

Session management may pose a threat too.
The cookies we use in Express should always have these 2 properties set to true:
1) httpOnly
2) secure
The first one prevents cookies from being accessed by browser JS scripts and the second one ensures that cookies are only sent over secure HTTPS connections.
The correct cookies setup is shown in the snippet below:

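// In Express 4 and later the session middleware comes from the express-session package
// (express.cookieParser() and express.session() were removed from Express itself)
var session = require('express-session');

app.use(session({
    secret: "s3Cur3",
    cookie: {
        httpOnly: true,
        secure: true
    }
}));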

The ephemeral cookie property (offered by session libraries such as client-sessions) is also very useful for security, as it deletes the cookie when the browser is closed (if set to true). So, it is very useful for apps that are accessed from public computers.
Finally, we should always destroy session and cookies on logout.
Example:

req.session.destroy(function() {
    res.redirect("/");
});

That's all folks (for now...)

I hope you find it interesting and it will help you build more secure and robust Node.JS and Express apps.
In the next part we will cover XSS Attacks, SQL and No-SQL injections and RegEx Denial of Service.

Part 2 is also available in the link below:

Security in Node.JS and Express: The bare minimum - Part 2.

Petros Demetrakopoulos ・ Apr 10 '20 ・ 3 min read

#javascript #node #security #webdev

Generating Beatles-like lyrics with RNNs

Petros Demetrakopoulos — Fri, 03 Apr 2020 10:13:22 +0000

Project link in Github

The project is implemented in Python using Tensorflow.
It is a word-based Recurrent neural network (RNN) inspired by this Tensorflow tutorial.
The model is trained on a corpus containing the song titles and the lyrics of many famous Beatles songs.
So, given a sequence of words from Beatles' lyrics, it can predict the next words.
We could say that it is a model generating Beatles-like lyrics.

The corpus (aka training dataset)

As we said the corpus (beatles_lyrics.txt) contains the song titles and the lyrics of many Beatles songs in the following format:

Song title 1
-----------------
Lyric line 1
Lyric line 2
Lyric line 3
Lyric line 4
etc...

Song title 2
-----------------
Lyric line 1
Lyric line 2
Lyric line 3
Lyric line 4
etc...

etc...

I have manually created the corpus with the lyrics I found in this amazing site

Data preprocessing

As with almost every machine learning model, the training data needs a bit of preprocessing before it is ready to be used as training input for our RNN.
We preprocess the initial corpus by doing the following tasks:

Convert all letters to lowercase
Remove blank lines
Remove special characters (such as ',' , '(' , ')' , '[' , ']' etc)

The following function performs the preprocessing:

stopChars = [',','(',')','.','-','[',']','"']
# preprocessing the corpus by converting all letters to lowercase, 
# replacing blank lines with blank string and removing special characters
def preprocessText(text):
  text = text.replace('\n', ' ').replace('\t','')
  processedText = text.lower()
  for char in stopChars:
    processedText = processedText.replace(char,'')
  return processedText

After the text preprocessing step, we need to convert it to a list of words.
This procedure is also known as "Tokenization".

The following function performs the tokenization:

def corpusToList(corpus):
  corpusList = [w for w in corpus.split(' ')]
  corpusList = [i for i in corpusList if i] # removing empty strings from list
  return corpusList

Then, we trim each word for leading or trailing spaces / tabs.

corpus_words = list(map(str.strip, corpus_words)) # trim words (map is lazy in Python 3, so we keep the result)

Now, it is time to find the unique words (aka the vocabulary) of which our dataset is composed.

vocab = sorted(set(corpus_words))

In order to train our model, we need to represent words with numbers. So we map a specific number to each unique word of our corpus and vice versa by creating the following lookup tables. Then we represent the whole corpus as a list of numbers (word_as_int).

print('Corpus length (in words):', len(corpus_words))
print('Unique words in corpus: {}'.format(len(vocab)))
word2idx = {u: i for i, u in enumerate(vocab)}
idx2words = np.array(vocab)
word_as_int = np.array([word2idx[c] for c in corpus_words])

The prediction process

Our goal is to predict the next words that will follow in a sequence, given some starting words (a start sequence).
In layman's terms, RNNs are able to maintain an internal state that depends on the elements (in our case elements = sequences of words) that the RNN has previously "seen".
So, we train the RNN to take as an input a sequence of words and predict the output, which is the following word at each time step. As you can easily understand, if we run the model for many time steps we generate sequences of words!

In order to train it, we have to split our training dataset (aka corpus) into "batches" of sequences of words (as this is what we also want to predict). Then, we need to shuffle them, because we want the order in which the songs have been placed in the dataset to be irrelevant to the RNN (and thus to the predictions it will make). If we do not shuffle them, the RNN may learn the order of the songs in the corpus too, and that may lead to overfitting.

Creating training batches

Now it is time to slice the corpus into training batches. Each batch should contain seqLength words from the corpus.
For each split sequence of words, there is also a target sequence which has the same length as the training one and it is the same but one word shifted to the right. So, we slice the text into seqLength+1 words slices and we use the first seqLength words as training sequence and we extract the target sequence as mentioned.

Example:
Let's say our corpus contains the following verse:

I read the news today oh boy
About a lucky man who made the grade

Now, if the seqLength is 14, the training sequence will be :

I read the news today oh boy
About a lucky man who made the

and the target sequence will be:

read the news today oh boy
About a lucky man who made the grade.

We do so with the following lines:

# The maximum length sentence we want for a single input in words
seqLength = 10
examples_per_epoch = len(corpus_words)//(seqLength + 1) # number of seqLength+1 sequences in the corpus

# Create training / targets batches
wordDataset = tf.data.Dataset.from_tensor_slices(word_as_int)
sequencesOfWords = wordDataset.batch(seqLength + 1, drop_remainder=True) # grouping the word ids into chunks of seqLength + 1 (here 11) words each

def split_input_target(chunk): # This is where right shift happens
  input_text = chunk[:-1]
  target_text = chunk[1:]
  return input_text, target_text # returns training and target sequence for each batch

dataset = sequencesOfWords.map(split_input_target) # dataset now contains a training and a target sequence (10 words each) for every 11-word slice of the corpus

Shuffling the batches

As we mentioned earlier, before we feed our training batches in our RNN, we have to shuffle them to prevent the RNN from learning the order of the songs in the corpus which may lead it to overfitting.

BATCH_SIZE = 64 # each batch contains 64 sequences. Each sequence contains 10 words (seqLength)
BUFFER_SIZE = 100 # Size of the shuffle buffer: each sequence is drawn at random from a buffer holding this many elements

dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

dataset now contains batches of 64 sequences each; each sequence was filled in the previous step with 10 words.

The model

Our RNN is composed of 3 layers:

Input layer. It maps the number representing each word to a vector with known dimensions (that are explicitly set)
GRU (middle) layer: GRU stands for Gated Recurrent Units. The number of units that this layer contains is also explicitly set. This layer could also be replaced by a Long Short-Term Memory (LSTM) layer. More on LSTMs and GRUs in this useful link
Output layer: It has as many units as the size of the vocabulary

The model definition code:

# Length of the vocabulary in words
vocab_size = len(vocab)
# The embedding dimension
embedding_dim = 256
# Number of GRU units
rnn_units = 1024

def createModel(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

model = createModel(vocab_size = len(vocab), embedding_dim=embedding_dim, rnn_units=rnn_units, batch_size=BATCH_SIZE)

How the RNN works

For each word in the input layer, the model passes its embedding to the GRU layer for one step of time.
The output of the GRU is then passed to the dense layer, which outputs a score (logit) for every word in the vocabulary; these scores determine the probability of each candidate next word.

The schematic below is a bit more descriptive.

Training the model

From now on, we can consider the problem as a simple classification problem.
If you think about it, our model predicts the "class" of the next word based on its current state and the input words during a time step.

So, as in every classification model, in order to train it we need to calculate the loss in each time step.

We do so by defining the following function:

def loss(labels, logits):
  return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

Then we compile the model using the 'adam' optimizer.

model.compile(optimizer='adam', loss=loss)

During the training process, we should save checkpoints of the training in a directory that we have manually created

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

Now it is time to execute training.
We explicitly set the number of epochs.
At this point I would like to remind you that an epoch is one forward pass and one backward pass of all the training examples.

We choose to train for 20 epochs. Note that as you increase the number of epochs, the training time will increase too.
You should experiment with this number in order to fine tune the model.
But be careful, training for too many epochs may lead to overfitting.

EPOCHS = 20
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])

After the training process is over, we restore the trained model from the latest checkpoint.
It is time to generate the lyrics!

tf.train.latest_checkpoint(checkpoint_dir)
model = createModel(len(vocab), embedding_dim, rnn_units, batch_size=1)

model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))

model.build(tf.TensorShape([1, None]))
model.summary()

Generating the lyrics

RNNs (like most neural network types in general) need an initial state to start predicting.
In our case, this initialization is represented by a starting string with which we want the generated lyrics to start.
The model generates the probability distribution of the next word using the start string and the RNN state.

Then, with the help of categorical distribution, the index of the predicted word is calculated and the predicted word is used as the input for the next time step of the model

The state that the RNN returns is then fed back to the input of the RNN, in order to help it by providing more context (not just one word). This process continues as it generates predictions, and this is why its predictions improve as it gets more context from the previously predicted words.

The following function performs the task mentioned above:

def generateLyrics(model, startString, temp):
  print("---- Generating lyrics starting with '" + startString + "' ----")
  # Number of words to generate
  num_generate = 30

  # Converting our start string to numbers (vectorizing)
  start_string_list =  [w for w in startString.split(' ')]
  input_eval = [word2idx[s] for s in start_string_list]
  input_eval = tf.expand_dims(input_eval, 0)

  text_generated = []

  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # remove the batch dimension
      predictions = tf.squeeze(predictions, 0)

      # temp represent how 'conservative' the predictions are. 
      # Lower temp leads to more predictable (or correct) lyrics
      predictions = predictions / temp 
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # We pass the predicted word as the next input to the model
      # along with the previous hidden state
      input_eval = tf.expand_dims([predicted_id], 0)
      text_generated.append(' ' + idx2words[predicted_id])
  return (startString + ''.join(text_generated))

As you can see, many factors may influence the accuracy of the predictions.
temp parameter for example represents how 'conservative' the predictions are.
This means that lower temp values lead to more predictable (or correct) lyrics.

Running the model

model.py also contains a "demo part".
After the training process is finished, it saves the model in a binary file (you can then restore it in one line of code and use it instantly to predict values) for future use so we do not have to train it every time we want to generate lyrics.

#save trained model for future use (so we do not have to train it every time we want to generate text)
model.save('saved_model.h5')

Then it calls the generateLyrics function with the start string "love" (we all know how much the Beatles used this word in their songs).
Then it prompts the user to enter a start string and a temp value to generate lyrics.

Some examples that the model gave me:

Start string: "love"
Generated lyrics: "love youyouyouyou as i write this letter send my love to give you all my loving i will send to you all my loving i will send to you don't bother"

Start string: "boys and girls"
temp: 0.4
Generated lyrics: "boys and girls make me sing and shout that georgia's always be blind love is here to stay and that's enough to make you my girl be the only one love me hold"

Start string: "day"
temp: 0.8
Generated lyrics: "day tripper night at my own it will take a walk on home loretta get back get back get back get back get back get back get back get back get"

Hm... Not that bad.

Optimizing the model

You can easily fine-tune the model by changing the variables that influence the accuracy of the predictions. Using a stopwords set with the very common words in the corpus and removing them from the training data may help the predictions as it will eliminate the noise and the features that do not offer a significant info gain.

BATCH_SIZE, temp, embedding dimensions, sequence length can all influence the prediction and thus the generated lyrics.
You can also try to use LSTM units in the middle layer.
So you can experiment yourself and comment with the results.

Project is available on this Github Repo

petrosDemetrakopoulos / RNN-Beatles-lyrics-generator

A word-based Recurrent neural network (RNN) trained to generate Beatle-like lyrics

SIR.js . An epidemic simulation library in JS.

Petros Demetrakopoulos — Sun, 29 Mar 2020 16:53:33 +0000

Over the last few days, I have seen more and more projects related to epidemic simulations, SIR modeling etc. It is obvious that the COVID-19 pandemic has motivated developers around the world to offer projects and solutions that try to predict the evolution of the pandemic in the following weeks or months.

Probably the simplest model someone can use to model an epidemic is the SIR model.

But how is the SIR model defined?

(From Wolfram MathWorld)
An SIR model is an epidemiological model that computes the theoretical number of people infected with a contagious illness in a closed population over time. The name of this class of models derives from the fact that they involve coupled equations relating the number of susceptible people S(t), number of people infected I(t), and number of people who have recovered R(t).

The model needs the initial values of the Susceptible persons in a population (S0), Infected (I0) and Recovered (R0). It also needs the beta factor, which is a constant determining how often a susceptible-infected contact results in a new infection, and the gamma factor, which is the rate at which an infected person recovers and moves into the resistant phase.

The simulation runs for N iterations (which represent the time span of the simulation).
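
To see how these quantities interact, here is a minimal sketch that integrates the standard SIR equations (dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I) with simple forward-Euler steps. It is an illustration under our own assumptions (the simulateSIR name, the dt and steps parameters and the choice of Euler integration) and not necessarily how SIR.js solves the model internally.

```javascript
// Forward-Euler integration of the SIR equations (illustrative, not SIR.js internals).
function simulateSIR({ S0, I0, R0, beta, gamma, dt, steps }) {
  const result = [{ S: S0, I: I0, R: R0 }];
  let S = S0, I = I0, R = R0;

  for (let k = 0; k < steps; k++) {
    const newInfections = beta * S * I * dt; // susceptible -> infected
    const newRecoveries = gamma * I * dt;    // infected -> recovered
    S -= newInfections;
    I += newInfections - newRecoveries;
    R += newRecoveries;
    result.push({ S, I, R });
  }
  return result;
}

const series = simulateSIR({ S0: 0.9, I0: 0.1, R0: 0.0, beta: 0.35, gamma: 0.1, dt: 1, steps: 500 });
const last = series[series.length - 1];
console.log(last);
```

Whatever leaves one compartment enters another, so S + I + R stays constant during the whole simulation.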

Installation and example

SIR.js can be installed via npm by typing npm install sir.js .

Code is available in this GitHub repo and you are welcome to contribute!

It is ultrasimple to set it up and use it:

let SIRjs = require('sir.js');

let solution = SIRjs.solve({S0: 0.9, I0: 0.1, R0: 0.0, t:1, N: 500, beta: 0.35, gamma: 0.1});
SIRjs.printChart(solution);

It has only 2 functions: solve() and printChart()
solve() function takes an object argument with the following keys:

S0: Initial S (Susceptible) value
I0: Initial I (Infectious) value
R0: Initial R (Recovered) value
t: The time step
N: The time span (in units of time) aka the length of the simulation
beta: The parameter controlling how often a susceptible-infected contact results in a new infection
gamma: The rate an infected recovers and moves into the resistant phase

It returns an array of objects that contain S,I and R values for each moment.

printChart() function prints an ASCII chart for each one of the S,I,R variables

You can then plot the results in any way you want.
Here, for example, I have plotted the results using Google Sheets

That's all folks!

I hope you find it interesting and that it will be useful!

COVID-19: An opportunity for the developers’ community and Open Source Software.

Petros Demetrakopoulos — Sun, 15 Mar 2020 19:25:07 +0000

Facts: COVID-19 is already a global pandemic. Many software developers and ICT professionals in general have already turned to work-from-home mode as the official guideline is to STAY HOME.

Most developers are really passionate about their job and as a result they have a really strong community that is the moving force of the software evolution. I am also pretty sure that for most developers it will be really difficult to watch tons of series and films on Netflix during the hard days of quarantine. Sooner or later they will have the need to code during their spare time in their homes.

So here are a few things (many are obvious) we can do to make the extra time we will spend in our homes count and to offer more to our community.

Learn a new programming language/technology or framework.
By watching youtube videos, taking courses in Udemy / Coursera / edx etc. You know the ways are infinite.
Answer questions in StackOverflow and other community forums.
Start that side project you always wanted to do but could never find the time for.
Write tests for your existing projects (if you have not done it already).
The most important one: Contribute to Open Source projects.

There are millions of open source projects out there. Find the one you like or you use in your projects, visit its repository on GitHub, check the open issues, try to resolve one and then make a pull request. I am sure that you will learn many things during the process.

Conclusion

You will have to spend a lot of time staying at home. Try not to waste it and spend it as creatively as you can while giving back to the community too.

So what do you think? What other opportunities emerge in the community due to COVID-19? Feel free to discuss in the comment section.

Stay Safe,
#StayHome ,
And keep coding : )

Using data models over Ethereum blockchain with EthAir Balloons

Petros Demetrakopoulos — Mon, 03 Feb 2020 10:57:59 +0000

Back to basics:
First of all, let's remember what blockchains are: a blockchain, in layman's terms, is a kind of distributed database that offers more transparency and more security than other databases. This means that its main role is to store data.

Ethereum blockchain is amazing because it was the first to offer the capability of running code over it with Smart contracts and Solidity, the language behind them. But Solidity smart contracts may be a real hell even for experienced developers as their development is very time consuming, they cannot be easily tested and they have many limitations such as not many available data types, the limited number of parameters you can pass into a function, lack of object-oriented concepts, etc. It feels more like a very primitive programming language than a modern one that allows more complex data structures and functions.

And here comes EthAir Balloons, a spin-off project of my thesis as an undergraduate student in CS School of Athens University of Economics and Business.

EthAir Balloons is a strictly typed ORM library for Ethereum blockchain. It allows you to use Ethereum blockchain as persistent storage in an organized and model-oriented way without writing custom complex Smart contracts. We could say it is for Ethereum-based blockchains what Mongoose is for MongoDB.

We will do a walkthrough of the library by showing how you can very easily create and deploy a new model and then perform all the CRUD operations.

Assuming we have already created a new Node.js project and an index.js file, we can proceed to the installation of the library by typing npm i --save ethairballoons in the root directory of the project.

Now, in the index.js file we add the following lines:

var ethAirBalloons = require('ethairballoons');
var path = require('path');
var savePath = path.resolve(__dirname + '/contracts');

var ethAirBalloonsProvider = ethAirBalloons('http://localhost:8545', savePath); 
//ethereum blockchain provider URL, path to save auto generated smart contracts

var Car = ethAirBalloonsProvider.createSchema({
        name: "Car",
        contractName: "carsContract",
        properties: [
            {
                name: "model",
                type: "bytes32",
                primaryKey: true
            },
            { 
                name: "engine",
                type: "bytes32",
            },
            {   name: "cylinders",
                type: "uint"
            }
        ]
    });

As you can see, we can initiate an instance of ethairballoons (or what I like to call an "ethAirBalloons provider") using only 2 arguments:

1) the URL of the Ethereum blockchain provider that we want to use (in the example it is set to a local ganache-cli provider),

2) the path where we want to save the automatically generated smart contracts of your models.

After the provider initialisation, we can create new data schemas using the createSchema() function and pass the schema details in JS object format. Of course you can (and it is advised to) keep the schema definitions in separate .JSON files and then import them using the require() statement at the top of your file.

Now that our data schema is set, it is time to deploy it to the blockchain. At this point I would like to remind you that we do so in a local ganache-cli instance (which is an Ethereum blockchain simulator) and not in the actual Ethereum network. As transaction fees may be huge, it is strongly advised to only deploy EthAir Balloons models in private Ethereum blockchains or locally using ganache-cli.

We deploy our model by calling the deploy() function as shown below:

Car.deploy(function (success, err) {
    if (!err) {
        console.log('Deployed successfully');
    }
});

This function generates the Solidity smart contract of our model and deploys it to the Ethereum-based blockchain that we set in the previous step. It returns a boolean indicating whether the deploy was successful and an error object that will be undefined if the deploy was successful. After the deploy completes, we can call the other functions of the model.

Model functions

EthAir Balloons implements all the functions needed to perform CRUD operations.

save()

This function saves a new record in the blockchain. Make sure to set the primary key field in the object you want to save, otherwise an error will be returned. It returns the saved object and an error object that will be undefined if the object is saved successfully.

An example is shown below:

var newCarObject = {model: 'Audi A4', engine: 'V8', cylinders: 8}; // keys must match the schema properties (model, engine, cylinders)
Car.save(newCarObject, function (objectSaved, err) {
   if (!err) {
       console.log('object saved');
   }
});

find()

This function returns all the records of our Model.

Car.find(function (allRecords, err) {
   if (!err) {
       console.log(allRecords);
   }
});

findById()

This function returns the record with a specific primary key value if it exists. Otherwise it will return an error object mentioning that 'record with this id does not exist'.

Car.findById('Audi A4', function (record, err) {
   if (!err) {
       console.log(record);
   } 
});

deleteById()

Deletes the record with a specific primary key value if it exists. Otherwise it will return an error object mentioning that 'record with this id does not exist'.

Car.deleteById('Audi A4', function (success, err) {
   if (!err) {
       console.log('object deleted successfully');
   } 
});

updateById()

Updates the record with a specific primary key value if it exists and returns the updated record. Otherwise, it returns an error object stating that 'record with this id does not exist'.

The first parameter is the primary key value of the record we want to update. The second parameter is the updated object.

var updatedCarObject = { engine: 'V9', wheels: 4 };
Car.updateById('Audi A4', updatedCarObject, function (updatedObject, err) {
   if (!err) {
       console.log('object updated successfully');
   } 
});

That's all folks!

I hope you find it interesting and that it will be useful in future projects!
