Marcelo Costa
Quickly set up a Hive environment on GCP

This quick-start guide is part of a series that shows how to set up databases on Google Cloud Platform for development and testing purposes.

This guide will show you how to create a Hive environment running inside your Google Cloud Project.

Create a Compute Engine VM

Using Cloud Shell:

# Create the Hive GCE instance
gcloud compute instances create hive \
  --zone=us-central1-c \
  --machine-type=n1-standard-1 \
  --image-project=debian-cloud --boot-disk-size=30GB \
  --image=debian-9-stretch-v20190916 \
  --boot-disk-type=pd-standard \
  --boot-disk-device-name=hive \
  --scopes=cloud-platform 

Configure your VM with Hive

Using Cloud Shell:

# Connect to the Hive VM
gcloud compute ssh --zone=us-central1-c hive

# Log in as the superuser
sudo -s

# Install Docker
curl -sSL https://get.docker.com/ | sh

# Install Docker Compose (pinned to release 1.18.0)
curl -L https://github.com/docker/compose/releases/download/1.18.0/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose

# Make it executable (we are already root, so no sudo needed)
chmod +x /usr/local/bin/docker-compose

# Test installation
docker-compose --version
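The version check prints a line like `docker-compose version 1.18.0, build <hash>`. If you ever need just the version number in a script, it can be extracted with `sed`; the sketch below runs against a sample string (with an illustrative build hash), so it works even before Docker is installed:

```shell
# Sample of the line docker-compose --version prints
# (the build hash here is illustrative)
sample='docker-compose version 1.18.0, build 1719ceb'

# Extract just the version number, e.g. for use in scripts
version=$(echo "$sample" | sed -E 's/.*version ([0-9.]+),.*/\1/')
echo "$version"
```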

Create the Docker environment

Inside the Hive VM (over SSH):

# Install Git if you don't have it
apt-get install -y git

# Clone the GitHub repository
git clone https://github.com/mesmacosta/docker-hive

# Enter the created directory
cd docker-hive

# Start the containers with Docker Compose
docker-compose up -d

Create internal tables (managed by Hive)

Inside the Hive VM (over SSH):

# Connect to the hive-server container
docker-compose exec hive-server bash

# Start the Beeline CLI
/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000

-- Inside Beeline, create an internal table named funds
CREATE TABLE funds (code INT, opt STRING);

-- Load the table with sample data (optional)
LOAD DATA LOCAL INPATH '/opt/hive/examples/files/kv1.txt' OVERWRITE INTO TABLE funds;

-- Test it
SELECT * FROM funds;
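To confirm the table really is managed by Hive (as opposed to external), you can inspect its metadata from the same Beeline session. The `Table Type` line of the output should read `MANAGED_TABLE`:

```
-- Still inside Beeline: show the table's metadata, including Table Type
DESCRIBE FORMATTED funds;
```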

Create external tables

Inside the Hive VM (over SSH):

# Connect to the Hadoop namenode container
docker-compose exec namenode bash

# Create a file with some sample data (any directory works)
echo '1,2,3,4' > csvFile

# Create a directory inside HDFS
hdfs dfs -mkdir -p /test/another_test/one_more_test

# Put the file into that directory
hdfs dfs -put csvFile /test/another_test/one_more_test/csvFile

# Exit the namenode and connect to the hive-server container
exit
docker-compose exec hive-server bash

# Start the Beeline CLI
/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000

-- Create an external table over the HDFS directory
CREATE EXTERNAL TABLE IF NOT EXISTS store
(ID int,
DEPT int,
CODE int,
DIGIT int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/test/another_test/one_more_test/';

-- Query it to see that it worked
SELECT * FROM store;
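Because the external table declares four comma-delimited columns, each line of the backing file must split into exactly four fields, or Hive will fill the missing columns with NULLs. A quick local sanity check of the sample row (plain shell, no Hive needed):

```shell
# Same sample row used for the external table above
echo '1,2,3,4' > csvFile

# Count the fields using the same ',' delimiter the table declares;
# the count should match the four columns (ID, DEPT, CODE, DIGIT)
fields=$(awk -F',' '{print NF}' csvFile)
echo "$fields"
```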

And that's it!

If you run into difficulties, don't hesitate to reach out. I would love to help you!
