DEV Community

Abidakun Kemisola

TITANIC DATA SET

Introduction
The Titanic dataset is widely used for data analysis. It contains information about the passengers on the Titanic, including gender, age, passenger class, and survival status. In this project, we perform data analysis on the Titanic dataset using a spreadsheet.
Dataset
The Titanic dataset describes the passengers who were aboard the Titanic, and it consists of the variables in the following table:
| Variable | Definition | Key |
| --- | --- | --- |
| Survival | Survival | 0 = No, 1 = Yes |
| Pclass | Ticket class | 1 = 1st, 2 = 2nd, 3 = 3rd |
| Sex | Passenger’s gender | |
| Sibsp | Number of siblings / spouses aboard the Titanic | |
| Age | Passenger’s age | |
| Parch | Number of parents / children aboard the Titanic | |
| Ticket | Ticket number | |
| Fare | Passenger fare | |
| Cabin | Cabin number | |
| Embarked | Port of embarkation | C = Cherbourg, Q = Queenstown, S = Southampton |
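
The coded values in the Key column can be turned into readable labels with simple lookup tables. The sketch below is only an illustration: the row is a made-up example, not a real passenger, and the dictionary keys (`survival`, `pclass`, `embarked`) are assumed names for the corresponding columns.

```python
# Lookup tables copied from the Key column of the variable table.
SURVIVAL = {0: "No", 1: "Yes"}
PCLASS = {1: "1st", 2: "2nd", 3: "3rd"}
PORTS = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def decode_row(row):
    """Return a copy of `row` with coded values replaced by readable labels."""
    decoded = dict(row)
    decoded["survival"] = SURVIVAL[row["survival"]]
    decoded["pclass"] = PCLASS[row["pclass"]]
    # Fall back to "Unknown" if the port code is blank or unexpected.
    decoded["embarked"] = PORTS.get(row["embarked"], "Unknown")
    return decoded


# Hypothetical row, for illustration only.
example = {"survival": 1, "pclass": 3, "embarked": "Q"}
print(decode_row(example))
```

In a spreadsheet, the same idea can be expressed with a small lookup range instead of dictionaries.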

Variable Note
Pclass : A proxy for socio-economic status (SES)
1st = upper
2nd = middle
3rd = lower
Age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5
Sibsp: The dataset defines family relations in this way:
Sibling = brother, sister, stepbrother, stepsister
Spouse = husband, wife (mistresses and fiancés were ignored)
Parch: The dataset defines family relations in this way:
Parent = mother, father
Child = daughter, son, stepdaughter, stepson
Some children traveled only with a nanny, therefore parch=0 for them.
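
The variable notes above can be expressed as a few small helper functions. This is a minimal sketch, not part of the original spreadsheet analysis: the function names are hypothetical, and treating only ages of 1 or more ending in .5 as "estimated" is an assumption made here so that infant ages below 1 are not misread as estimates.

```python
# Pclass as a proxy for socio-economic status, per the variable notes.
SES = {1: "upper", 2: "middle", 3: "lower"}


def is_infant(age):
    """Ages below 1 are recorded as fractions."""
    return age < 1


def is_estimated_age(age):
    """Estimated ages have the form xx.5 (assumed here to apply from age 1 up)."""
    return age >= 1 and age % 1 == 0.5


def relatives_aboard(sibsp, parch):
    """Total relatives aboard: siblings/spouses plus parents/children."""
    return sibsp + parch


# Made-up values, for illustration only.
print(SES[3], is_estimated_age(28.5), relatives_aboard(1, 2))
```

A child travelling only with a nanny would have `parch` equal to 0, so `relatives_aboard` would count only any siblings recorded in `sibsp`.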
Data visualization

Observation
After a first look at the data, several key variables were observed. These observations form the findings on the Titanic dataset and lead to the summary of the data.
Conclusion
Based on this analysis, we can summarise the key findings and insights for further investigation.

You can register to enjoy this great privilege

HNG internship
website
