Introduction
The Titanic dataset is widely used for practising data analysis. It contains information about the passengers aboard the Titanic, including gender, age, passenger class and survival status. In this project, we perform data analysis on the Titanic dataset using a spreadsheet.
Dataset
The dataset describes the passengers who were on board the Titanic when it sank at sea, and it consists of the following variables:
| Variables | Definition | Key |
|---|---|---|
| Survival | Survival | 0 = No, 1 = Yes |
| Pclass | Ticket class | 1 = 1st, 2 = 2nd, 3 = 3rd |
| Sex | Passenger’s gender | |
| Sibsp | Number of siblings / spouses aboard the Titanic | |
| Age | Passenger’s age | |
| Parch | Number of parents / children aboard the Titanic | |
| Ticket | Ticket number | |
| Fare | Passenger fare | |
| Cabin | Cabin number | |
| Embark | Port of embarkation | C = Cherbourg, Q = Queenstown, S = Southampton |
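The coded values in the Key column can be turned into readable labels before analysis. In a spreadsheet this is typically a lookup formula; the same idea is sketched below in Python as an illustration only. The function name `decode_row` and the example row are hypothetical, and the column names are assumed to match the table above.

```python
# Lookup tables copied from the Key column of the data dictionary.
SURVIVAL_LABELS = {0: "No", 1: "Yes"}
PCLASS_LABELS = {1: "1st", 2: "2nd", 3: "3rd"}
PORT_LABELS = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def decode_row(row):
    """Return a copy of `row` with coded values replaced by labels.

    Codes that are missing or not in the lookup tables become "Unknown".
    """
    decoded = dict(row)
    decoded["Survival"] = SURVIVAL_LABELS.get(row.get("Survival"), "Unknown")
    decoded["Pclass"] = PCLASS_LABELS.get(row.get("Pclass"), "Unknown")
    decoded["Embark"] = PORT_LABELS.get(row.get("Embark"), "Unknown")
    return decoded


# Made-up example row for illustration, not a real passenger record.
example = {"Survival": 1, "Pclass": 3, "Embark": "Q"}
print(decode_row(example))
```

Falling back to "Unknown" keeps blank cells visible in the output instead of raising an error.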
Variable Notes
Pclass: A proxy for socio-economic status (SES)
1st = upper
2nd = middle
3rd = lower
Age: Age is fractional if it is less than 1. If the age is estimated, it is in the form xx.5.
Sibsp: The dataset defines family relations in this way:
Siblings = brother, sister, stepbrother, stepsister
Spouse = husband, wife (mistresses and fiancés were ignored)
Parch: The dataset defines family relations in this way:
Parent = mother, father
Child = daughter, son, stepdaughter, stepson
Some children traveled only with a nanny, therefore parch=0 for them.
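The notes on Age, Sibsp and Parch can be turned into simple helper columns. The sketch below is one possible reading of those notes, not the method used in this project: `age_kind` and `family_size` are hypothetical names, and counting the passenger in the family size is an assumption added here for illustration.

```python
def age_kind(age):
    """Classify an Age value using the rules in the variable notes."""
    if age is None:
        return "missing"      # blank cell in the spreadsheet
    if age < 1:
        return "infant"       # fractional ages are used for babies under 1
    if age % 1 == 0.5:
        return "estimated"    # estimated ages have the form xx.5
    return "recorded"


def family_size(sibsp, parch):
    """Family members aboard, counting the passenger (an assumption)."""
    return sibsp + parch + 1


# A child travelling only with a nanny has parch = 0 and, with no
# siblings aboard, sibsp = 0, so the family size is 1.
print(age_kind(28.5), family_size(0, 0))
```

In a spreadsheet, the same checks could be written with an IF formula on the Age column and a sum of the Sibsp and Parch columns.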
Data visualization
Observation
After a first look at the data, the keys and variables described above were examined, and the observations made from them lead to the summary of the data.
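One common way to summarise such observations is a survival rate per group, which is what a spreadsheet pivot table would calculate. The sketch below shows that calculation in Python; `survival_rate_by` is a hypothetical helper, and the rows are made-up sample values, not figures from the dataset.

```python
from collections import defaultdict


def survival_rate_by(rows, key):
    """Return the share of survivors for each distinct value of `key`."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        group = row[key]
        totals[group] += 1
        survivors[group] += row["Survival"]  # 0 = No, 1 = Yes
    return {group: survivors[group] / totals[group] for group in totals}


# Made-up rows for illustration only; they are not taken from the dataset.
sample_rows = [
    {"Pclass": 1, "Sex": "female", "Survival": 1},
    {"Pclass": 1, "Sex": "male", "Survival": 0},
    {"Pclass": 3, "Sex": "male", "Survival": 0},
    {"Pclass": 3, "Sex": "female", "Survival": 1},
    {"Pclass": 3, "Sex": "male", "Survival": 0},
]

print(survival_rate_by(sample_rows, "Pclass"))
print(survival_rate_by(sample_rows, "Sex"))
```

Because Survival is coded as 0 or 1, adding the values within a group counts the survivors directly.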
Conclusion
Based on this analysis, we can summarise the key findings and identify insights for further investigation.