
Marcelo Costa

Quickly set up a Hive environment on GCP

This quick-start guide is part of a series that shows how to set up databases on Google Cloud Platform for development and testing purposes.

This guide will show you how to create a Hive environment running inside your Google Cloud Project.

Create a Compute Engine VM

Using Cloud Shell:

# Create the Hive GCE instance
gcloud compute instances create hive \
  --zone=us-central1-c \
  --machine-type=n1-standard-1 \
  --image-project=debian-cloud --boot-disk-size=30GB \
  --image=debian-9-stretch-v20190916 \
  --boot-disk-type=pd-standard \
  --boot-disk-device-name=hive \
  --scopes=cloud-platform 

Configure your VM with Hive

Using Cloud Shell:

# Connect to the Hive VM
gcloud compute ssh --zone=us-central1-c hive

# Login as super user
sudo -s

# Install Docker
curl -sSL https://get.docker.com/ | sh

# Install Docker Compose
curl -L https://github.com/docker/compose/releases/download/1.18.0/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose

chmod +x /usr/local/bin/docker-compose

# Test installation
docker-compose --version

Create the Docker environment

Inside the Hive VM, over SSH:

# Install git if you don't have it
apt-get install git

# Clone the GitHub repository
git clone https://github.com/mesmacosta/docker-hive

# Go inside the created directory
cd docker-hive

# Start docker compose
docker-compose up -d
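The services can take a little while to come up; a quick way to confirm the containers are running (run from the same `docker-hive` directory):

```shell
# List the compose services and check that they report an "Up" state
docker-compose ps
```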

Creating tables (internal, managed by Hive)

Inside the Hive VM, over SSH:

# Connect to hive-server with the command
docker-compose exec hive-server bash

# Connect using the Beeline CLI:
/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000

# Create an internal table named funds
CREATE TABLE funds (code INT, opt STRING);

# Load table with data (Optional)
LOAD DATA LOCAL INPATH '/opt/hive/examples/files/kv1.txt' OVERWRITE INTO TABLE funds;

# Test the table
select * from funds;
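Beeline accepts any HiveQL statement, so you can also inspect what was just created; for example (optional):

```sql
-- List tables in the default database and show the funds table's schema
SHOW TABLES;
DESCRIBE FORMATTED funds;
```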

Creating tables (external)

Inside the Hive VM, over SSH:

# Connect to the Hadoop namenode
docker-compose exec namenode bash

# Create a new file with some data, in any directory
echo '1,2,3,4' > csvFile
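A single row is enough for the demo; if you'd like a few more rows in the same ID,DEPT,CODE,DIGIT shape, a small loop can generate them (just a sketch, appending to the same file):

```shell
# Append a few more rows in the same four-column integer format
for i in 2 3 4; do
  echo "$i,$((i*10)),$((i*100)),$((i%10))"
done >> csvFile
cat csvFile
```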

# Create a directory inside hdfs
hdfs dfs -mkdir -p /test/another_test/one_more_test

# Add the file to the directory
hdfs dfs -put csvFile /test/another_test/one_more_test/csvFile

# Exit the namenode container, then connect to hive-server
exit
docker-compose exec hive-server bash

# Open Beeline again
/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000

# Create external table
CREATE EXTERNAL TABLE IF NOT EXISTS store
(ID int,
DEPT int,
CODE int,
DIGIT int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/test/another_test/one_more_test/';

# You can then query it inside Hive to see that it worked
select * from store;
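As a side note on how this works: FIELDS TERMINATED BY ',' tells Hive to split each line of csvFile on commas, one value per column, which is the same split awk performs below (illustration only, not part of the setup):

```shell
# Split '1,2,3,4' on commas, as Hive does when mapping the row to ID, DEPT, CODE, DIGIT
parsed=$(echo '1,2,3,4' | awk -F',' '{print $1" "$2" "$3" "$4}')
echo "$parsed"
```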

And that's it!

If you run into difficulties, don't hesitate to reach out. I would love to help you!
