Hoang Le for INNOMIZE

Quickly Installing And Running Neo4j Using Ansible On AWS Cloud

Recently, we worked on a client project that uses Neo4j to store and process large graph data. The client asked for a solution to launch, install, and configure a single Neo4j node (for the development environment) and a High Availability Neo4j cluster (for the production environment).

Before getting started, note that there are several options for deploying Neo4j on AWS. It is worth reviewing them so that you can select the one that works best for you.

This article provides a step-by-step guide on how to launch, install, and configure a high availability Neo4j cluster (aka HA cluster) using Ansible on AWS. We use AWS for demonstration, but you can customize the playbooks and configurations for other cloud vendors such as Google Cloud or Azure.

Pre-requisites

To use this Ansible playbook on AWS, you need the following:

  • An AWS account with a user's access key and secret key.
  • An IAM policy attached to the above user that allows launching new EC2 instances and authorizing ports in security groups.
  • An EC2 Key-Pair to allow SSH to EC2 instances.
  • git installed on your machine
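
The local tooling from the list above can be checked before going further. The following is a minimal sketch, not part of the repository: the helper name `missing_cmds` is hypothetical, and it only checks that commands are on the PATH (it cannot verify the AWS account, IAM policy, or key pair).

```shell
# Print every command from the argument list that is not found on PATH.
# (missing_cmds is a hypothetical helper, not something the repository ships.)
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Usage: report anything missing before continuing.
# missing=$(missing_cmds git ssh)
# [ -z "$missing" ] || echo "Please install: $missing"
```

An empty result means every listed command was found.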

The steps

To deploy Neo4j, we are going to build the following deployment flow:

  1. Set up security groups and authorize port communication
  2. Launch EC2 instance(s) - optional
  3. Update OS
  4. Install Neo4j Enterprise on EC2 instance(s)
  5. Install HAProxy and configure HA cluster on EC2 instance(s) - only required for HA cluster

Getting started

Before deploying, you need to create a security group for the Neo4j cluster/instance to use. You could create multiple security groups for different purposes, such as allowing SSH to the instances or allowing the Neo4j nodes to communicate with each other. To simplify the process, we will use one security group that allows the following ports:

  • 22 (SSH)
  • Neo4j Ports listed on this page
  • 8000 - HA admin port for HA cluster deployment

Log in to the AWS Management Console, create a security group, and open the above ports as in the screenshot below:

Security group inbound ports
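
If you prefer the command line over the console, the same inbound rules can be added with the AWS CLI. The sketch below only prints the `aws ec2 authorize-security-group-ingress` commands so you can review them before running anything. The security group id and CIDR range are placeholders, and the Neo4j ports shown (7474, 7473, 7687) are the default HTTP, HTTPS, and Bolt ports; check the Neo4j ports page for the additional ports an HA cluster needs.

```shell
# Placeholder values (assumptions for illustration); replace with your own.
SG_ID="sg-0123456789abcdef0"   # id of the security group you created
CIDR="203.0.113.0/24"          # source range allowed to connect

# 22 = SSH, 7474/7473/7687 = Neo4j HTTP/HTTPS/Bolt defaults, 8000 = HA admin port.
PORTS="22 7474 7473 7687 8000"

# Print one authorize-security-group-ingress command per port.
ingress_cmds() {
  for port in $PORTS; do
    echo "aws ec2 authorize-security-group-ingress --group-id $SG_ID --protocol tcp --port $port --cidr $CIDR"
  done
}

ingress_cmds
```

Once the printed commands look right, they can be executed by piping the output to `sh`.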

Project structure

A well-defined project structure makes each part of the solution easier to understand, reuse, and customize. If you have experience working with Ansible, you probably already know how to organize an Ansible project. I followed the alternative approach mentioned in this article, but feel free to choose your own.

Ansible project structure

The above project structure contains the following:

  • extension/setup: contains scripts to install Ansible and the required Python packages
  • inventories/[env]: defines all variables for the playbooks, so they can be customized for each environment
  • roles: predefined and reusable roles for our playbooks. In this solution, we use the following roles:
    • common: the common role that installs common packages and updates the OS to the latest version.
    • haproxy: the role that installs and configures HAProxy.
    • launch-ec2: the role that launches EC2 instances in multiple Availability Zones.
    • neo4j: the role that installs and configures Neo4j on a single instance.
  • templates: template files for configuring Neo4j instances as well as HAProxy config files
  • There are two main playbooks:
    • neo4j.single.yml: the playbook to launch and install a single Neo4j node.
    • neo4j.cluster.yml: the playbook to launch and install an HA Neo4j cluster.

Preparation

Step 1 - Clone/download the source code from GitHub using Git

git clone https://github.com/innomizetech/neo4j-ansible.git

Change into the newly created directory:

cd neo4j-ansible

Step 2 - Install Ansible and the required Python packages

chmod +x extension/setup/setup.sh
./extension/setup/setup.sh

Step 3 - Decrypt Ansible Vault file

A vault file contains sensitive information that we shouldn't commit to source control in plaintext, so it needs to be encrypted before committing. Ansible Vault solves this problem. In this repo, we committed the password file for demo purposes; please note that you should not commit the password file to source control.

Run the command below to decrypt the vault.yml file in the inventory directory:

ansible-vault decrypt inventories/dev/group_vars/vault.yml --vault-password-file ansible-vault.pass

Step 4 - Update vault.yml file

---
# Sensitive variables required to deploy the application

aws_access_key: <<your access key>>
aws_secret_key: <<your secret access key>>

# The security group id to be attached to new instance
security_group: <<your security group id>>
# An Amazon Linux image
image: <<AMI id e.g. ami-048a01c78f7bae4aa>>
# The first subnet to launch instances in; it should be a public subnet if you allow public access
vpc_az1_subnet_id: <<your subnet 1 id>>
# The second subnet to launch instances in; it should be a public subnet if you allow public access
vpc_az2_subnet_id: <<your subnet 2 id>>

# Set the initial password for Neo4j instances
initial_password: <<your password>>

# HAProxy configuration (required for cluster mode with HAProxy)
stats_user: <<your HAProxy username>>
stats_pass: <<your HAProxy user password>>
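
Before running a playbook, it can help to confirm that no `<<...>>` placeholder is left in the file. The sketch below is an optional check, not part of the repository; the function name `has_placeholders` is hypothetical, and it assumes placeholders keep the `<<...>>` form shown above.

```shell
# Succeeds (exit status 0) if the given file still contains a <<...>> placeholder.
# (has_placeholders is a hypothetical helper, not something the repository ships.)
has_placeholders() {
  grep -q '<<.*>>' "$1"
}

# Usage:
# if has_placeholders inventories/dev/group_vars/vault.yml; then
#   echo "vault.yml still has unfilled values" >&2
# fi
```

Once the values are filled in, `ansible-vault encrypt` with the same `--vault-password-file` option can be used to encrypt the file again before it goes anywhere near source control.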

Deploy a single Neo4j Node

Step 1 - Update groups variables

Review and update the variables in the inventories/dev/group_vars/neo4j-single.yml file. Some important variables are:

  • region: an AWS region to launch and deploy Neo4j
  • keypair: an existing key-pair on the above region
  • other variables: feel free to update according to your requirement

Step 2 - Run the Ansible playbook

Run the command below to deploy a single Neo4j instance for the dev environment. Replace dev with any existing inventory in the inventories directory (e.g. staging, prod).

ansible-playbook neo4j.single.yml -e env=dev --vault-password-file ansible-vault.pass

# Or add -b -K to run with privilege escalation and be prompted for the sudo password
ansible-playbook neo4j.single.yml -e env=dev --vault-password-file ansible-vault.pass -b -K

Wait until the command finishes, then open the Neo4j browser at http://public-ip:7474
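
Neo4j may take a little while to start accepting connections after the playbook completes. The sketch below polls the HTTP port with `curl` until it responds or the attempts run out. It is an illustration only: the function name `wait_for_neo4j`, the retry count, and the delay are assumptions, not something the playbook provides.

```shell
# Poll a URL until it answers successfully or the attempts are exhausted.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 10).
# (wait_for_neo4j is a hypothetical helper, not part of the repository.)
wait_for_neo4j() {
  url=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl return a non-zero status on HTTP errors.
    if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (replace public-ip with the address of your instance):
# wait_for_neo4j "http://public-ip:7474" && echo "Neo4j is up"
```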

Deploy an HA Neo4j cluster

Execute the same steps as above, using:

  • neo4j-cluster.yml as the group variables file
  • neo4j.cluster.yml as the Ansible playbook
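
As a sketch, the two deployments can be wrapped in one small helper that picks the playbook by mode. It assumes the cluster playbook is the neo4j.cluster.yml file listed in the project structure and reuses the options from the single-node command. The function names `playbook_for` and `deploy_cmd` are hypothetical, and the helper only prints the command instead of running it.

```shell
# Map a deployment mode to its playbook file.
playbook_for() {
  case "$1" in
    single)  echo "neo4j.single.yml" ;;
    cluster) echo "neo4j.cluster.yml" ;;
    *)       echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Print the ansible-playbook command for a mode and environment (default: dev).
deploy_cmd() {
  mode=$1
  env_name=${2:-dev}
  playbook=$(playbook_for "$mode") || return 1
  echo "ansible-playbook $playbook -e env=$env_name --vault-password-file ansible-vault.pass"
}

deploy_cmd cluster dev
```

Printing the command first makes it easy to double-check the playbook and environment before deploying.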

Check out the result of each case by watching the videos on our YouTube channel.

If you run into any issues while following these instructions, feel free to let us know by leaving a comment.

Visit our blog for more interesting articles. If you have any questions or need help you can contact me via Twitter.