DEV Community

Sarvesh Kesharwani

Gist for quick revision of statistics for data science.

Population: A population is a complete set of individuals or objects that share a common characteristic.

Sample: A sample is a subset of a population that is selected for analysis.
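As a rough illustration of the population/sample distinction, the sketch below draws a simple random sample from a made-up population. The population (the integers 1 to 1000), the sample size of 50, and the seed are all assumptions chosen for the example.

```python
import random

# Hypothetical population: the integers 1..1000 stand in for individuals.
population = list(range(1, 1001))

rng = random.Random(42)                # fixed seed so the draw is reproducible
sample = rng.sample(population, k=50)  # simple random sample, no replacement

population_mean = sum(population) / len(population)  # 500500 / 1000 = 500.5
sample_mean = sum(sample) / len(sample)

print("population mean:", population_mean)
print("sample mean:", sample_mean)  # near the population mean, rarely equal
```

The sample mean is an estimate of the population mean, and it changes from one draw to the next.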

Variable: A variable is a characteristic that can be measured or observed and can take different values.

Data: Data are the values of the variables collected from a sample or a population.

Descriptive statistics: Descriptive statistics summarize and describe the main features of the data.

Inferential statistics: Inferential statistics use the data from a sample to make inferences about a population.

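Hypothesis testing: Hypothesis testing is a statistical method that uses sample data to decide whether there is enough evidence to reject a null hypothesis about a population in favor of an alternative hypothesis.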
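One way to see a hypothesis test in action is an exact binomial test of whether a coin is fair. This is only a sketch: the observed counts (60 heads in 100 flips), the function name `binomial_two_sided_p`, and the 0.05 significance level are assumptions made for the example.

```python
from math import comb

def binomial_two_sided_p(heads: int, flips: int) -> float:
    """Exact two-sided p-value for H0: the coin is fair (p = 0.5)."""
    observed_weight = comb(flips, heads)
    # Under H0, outcome k has probability comb(flips, k) / 2**flips.
    # Add up every outcome that is at most as likely as the observed one.
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if comb(flips, k) <= observed_weight
    )
    return extreme / 2 ** flips

p_value = binomial_two_sided_p(heads=60, flips=100)
alpha = 0.05  # conventional significance level, an assumption here
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f}: {decision}")
```

A p-value below the chosen significance level counts as evidence against the null hypothesis; a larger one means the data are compatible with it, not that it has been proven.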

Confidence interval: A confidence interval is a range of values, computed from sample data, that is constructed to contain the true value of a population parameter with a stated level of confidence (for example, 95%).
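A minimal sketch of a confidence interval for a mean, using the normal approximation from Python's `statistics` module. The simulated heights, the seed, and the helper name `mean_confidence_interval` are assumptions; for small samples a t-distribution would usually be the better choice.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)  # sample std dev / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical sample: 200 simulated heights in cm.
rng = random.Random(0)
heights = [rng.gauss(170, 10) for _ in range(200)]

low, high = mean_confidence_interval(heights)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Raising the confidence level widens the interval, and a larger sample narrows it.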

Regression analysis: Regression analysis is a statistical method used to model the relationship between a dependent variable and one or more independent variables.
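The simplest case, a straight line fitted by ordinary least squares, can be sketched in a few lines. The `fit_line` helper and the hours/scores data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

For this toy data the slope comes out at about 4.1, meaning roughly 4.1 extra points per additional hour studied.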

Probability: Probability is a measure of the likelihood that an event will occur, expressed as a number between 0 (impossible) and 1 (certain).
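A quick simulation can make this concrete: the fraction of rolls of a fair die that land on six should settle near 1/6. The seed and the number of trials are arbitrary choices for the sketch.

```python
import random

rng = random.Random(1)  # fixed seed for reproducibility
trials = 100_000

# Count how often a simulated fair six-sided die shows a six.
sixes = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
estimate = sixes / trials

print("estimated P(six):", estimate)
print("theoretical P(six):", 1 / 6)
```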

Central tendency: Central tendency is a statistical measure that indicates the typical or central value of a distribution of data; the most common measures are the mean, median, and mode.
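Python's built-in `statistics` module covers the usual measures of central tendency. The data set below is invented for the example.

```python
from statistics import mean, median, mode

data = [2, 3, 3, 5, 7, 10]  # hypothetical observations

data_mean = mean(data)      # arithmetic average: 30 / 6 = 5
data_median = median(data)  # middle of the sorted data: (3 + 5) / 2 = 4.0
data_mode = mode(data)      # most frequent value: 3

print(data_mean, data_median, data_mode)
```

The three measures can disagree, as they do here; the median is less sensitive to extreme values than the mean.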

Standard deviation: Standard deviation is a measure of the spread or variability of a distribution of data, indicating how far values typically lie from the mean; it is the square root of the variance.
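The `statistics` module distinguishes between the population and sample versions of the standard deviation. The sketch below uses a small invented data set to show both.

```python
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, mean = 5

# Population standard deviation: squared deviations are divided by n.
population_sd = pstdev(data)  # sqrt(32 / 8) = 2.0

# Sample standard deviation: divided by n - 1 (Bessel's correction).
sample_sd = stdev(data)       # sqrt(32 / 7), about 2.14

print(population_sd, sample_sd)
```

Use `pstdev` when the data are the whole population and `stdev` when they are a sample used to estimate the population's spread.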

Correlation: Correlation is a statistical measure that indicates the strength and direction of a relationship between two variables, typically reported as a coefficient between -1 and 1.
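The most common correlation measure is the Pearson coefficient, which captures linear association. Below is a minimal hand-rolled sketch; the `pearson_r` helper and the data are assumptions made for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")  # close to +1: strong positive linear relationship
```

A value near 0 means little linear association, which does not rule out a non-linear relationship, and a strong correlation does not by itself show causation.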

Outliers: Outliers are data points that are significantly different from other data points in the sample or population.
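One common rule of thumb (not the only one) flags values more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies it to invented readings; the helper name `iqr_outliers` and the multiplier `k=1.5` are assumptions.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]

print(iqr_outliers(readings))  # [102]
```

Whether a flagged point is an error or a genuine extreme value is a judgment call that depends on the data's context.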

Data visualization: Data visualization is the graphical representation of data to help in understanding and analyzing the data.
