<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: SunnyTamang</title>
    <description>The latest articles on DEV Community by SunnyTamang (@sunnytamang).</description>
    <link>https://dev.to/sunnytamang</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F739054%2F125f141c-0849-41aa-a33c-f8fd560237f8.jpeg</url>
      <title>DEV Community: SunnyTamang</title>
      <link>https://dev.to/sunnytamang</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sunnytamang"/>
    <language>en</language>
    <item>
      <title>Statistics interview questions and answers (Part 1)</title>
      <dc:creator>SunnyTamang</dc:creator>
      <pubDate>Thu, 28 Oct 2021 13:50:53 +0000</pubDate>
      <link>https://dev.to/sunnytamang/statistics-interview-questions-and-answers-part-1-3n8p</link>
      <guid>https://dev.to/sunnytamang/statistics-interview-questions-and-answers-part-1-3n8p</guid>
      <description>&lt;h2&gt;
  
  
  These are some questions that are related to statistics
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;1. Where have you used hypothesis testing in your machine learning solution?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answer: Hypothesis testing is a statistical analysis in which we test an assumption made about a particular situation. When I had to test a claim that was asserted to be true, I performed a hypothesis test in which the null hypothesis was that the claim is true and the alternate hypothesis was that the claim is false.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;2. What do you understand by P-value? And what is the use of it in machine learning?&lt;/p&gt;
&lt;/blockquote&gt;

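&lt;p&gt;Answer: The p-value, also known as the probability value, is the probability of observing results at least as extreme as the ones in our sample, assuming the null hypothesis is true. It is not the probability that the null hypothesis itself is true. It gives us the rule for rejecting the null hypothesis.&lt;/p&gt;

&lt;p&gt;If the p-value is less than the significance level, we reject the null hypothesis; otherwise we fail to reject it.&lt;/p&gt;

&lt;p&gt;For example, at a 5% significance level (which corresponds to a 95% confidence level), a p-value of 0.05 or more means we fail to reject the null hypothesis.&lt;/p&gt;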
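
&lt;p&gt;The decision rule above can be sketched in a few lines of Python. This is only an illustration, assuming a two-sided Z-test with a known population standard deviation; the sample numbers and the helper name two_sided_p_value are made up for the example.&lt;/p&gt;

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    # P(|Z| >= |z|) equals erfc(|z| / sqrt(2)) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical data: claimed mean 50, observed sample mean 52,
# known population standard deviation 10, sample size 100.
z = (52 - 50) / (10 / math.sqrt(100))
p_value = two_sided_p_value(z)
alpha = 0.05  # significance level

# Reject the null hypothesis when the p-value is below alpha.
reject_null = alpha > p_value
print(round(p_value, 4), reject_null)
```

&lt;p&gt;With these made-up numbers the z statistic is 2.0 and the p-value is roughly 0.0455, so the null hypothesis would be rejected at the 5% level.&lt;/p&gt;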

&lt;blockquote&gt;
&lt;p&gt;3. Which type of error is more severe, Type 1 or Type 2? Explain why, with an example.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answer: The answer to this question is that it depends on the problem statement we are looking into.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;p&gt;In a disease vs. treatment setting, a &lt;strong&gt;false negative&lt;/strong&gt; (Type 2 error) can be fatal: the patient has the disease but the model predicts that the patient does not, so the patient won't get the treatment and might lose his/her life.&lt;/p&gt;

&lt;p&gt;Similarly, when deciding whether a person is guilty or innocent, a &lt;strong&gt;false positive&lt;/strong&gt; (Type 1 error) is much worse: the person is innocent but the model predicts that the person is guilty, so we end up punishing an innocent person.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;4. Can we use Chi-Squared with a numerical dataset? If yes, give an example. If no, give a reason.&lt;/p&gt;
&lt;/blockquote&gt;

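&lt;p&gt;Answer: The Chi-Squared test generally deals with categorical data rather than numerical data. Numerical data can be used only after it has been binned into categories, for example ages grouped into age bands.&lt;/p&gt;

&lt;p&gt;The Chi-Squared test compares observed frequencies with expected frequencies, either to compare one group against expected values (goodness of fit) or to compare two or more groups with each other (test of independence).&lt;/p&gt;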
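
&lt;p&gt;As a rough sketch of how the statistic is built from category counts, the snippet below computes it by hand for a small contingency table. The counts and the function name chi_squared_statistic are hypothetical, chosen only for illustration.&lt;/p&gt;

```python
def chi_squared_statistic(observed):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are two groups, columns are two categories.
table = [[30, 20],
         [20, 30]]
stat = chi_squared_statistic(table)
print(stat)
```

&lt;p&gt;Here every expected count is 25, so the statistic comes out to 4.0. It would then be compared against a Chi-Squared distribution with (rows - 1) * (columns - 1) degrees of freedom.&lt;/p&gt;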

&lt;blockquote&gt;
&lt;p&gt;5. What do you understand by ANOVA testing?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answer: ANOVA stands for Analysis of Variance. It is an extension of the T-Test.&lt;/p&gt;

&lt;p&gt;In a T-Test we test whether there is any difference in means, and it can only compare two groups at a time. So if there are more than two groups, instead of performing the T-Test multiple times we go for ANOVA testing.&lt;/p&gt;

&lt;p&gt;ANOVA testing looks for two different types of variations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;variation within groups&lt;/li&gt;
&lt;li&gt;variation between groups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To test with ANOVA, our hypotheses will be:&lt;/p&gt;

&lt;p&gt;Null hypothesis: there is no difference in the means.&lt;br&gt;
Alternate hypothesis: at least one mean differs from the other means.&lt;/p&gt;

&lt;p&gt;There are two types of ANOVA:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One way ANOVA&lt;/li&gt;
&lt;li&gt;Two way ANOVA&lt;/li&gt;
&lt;/ul&gt;

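&lt;p&gt;One way ANOVA: Used when the groups are defined by a single factor (one independent variable) and we want to see if there is any difference in their means.&lt;/p&gt;

&lt;p&gt;Two way ANOVA: Used when the groups are defined by two factors, so we can test the effect of each factor as well as their interaction.&lt;/p&gt;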
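
&lt;p&gt;To make the "variation between groups" and "variation within groups" idea concrete, here is a minimal sketch of the one way ANOVA F statistic computed by hand. The data and the function name one_way_anova_f are assumptions made for this example, not part of any library.&lt;/p&gt;

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Variation between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation within groups: how far each point is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical measurements for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(groups)
print(f_stat)
```

&lt;p&gt;For this made-up data the F statistic is about 13. A large F value means the variation between groups is large compared to the variation within groups, which is evidence against the null hypothesis.&lt;/p&gt;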

&lt;blockquote&gt;
&lt;p&gt;6. Give me a scenario where we can use the Z-test and T-test.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answer: We choose between the two tests depending on:&lt;/p&gt;

&lt;p&gt;&lt;u&gt;Sample size&lt;/u&gt;: &lt;/p&gt;

&lt;p&gt;When the sample size is large (a common rule of thumb is greater than 30), we use the Z-test; otherwise we use the T-test.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;Population variance&lt;/u&gt;:&lt;/p&gt;

&lt;p&gt;When the population variance is known, we use the Z-test; otherwise we use the T-test.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;Distribution&lt;/u&gt;:&lt;/p&gt;

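&lt;p&gt;Both tests assume the data is approximately normally distributed (or that the sample is large enough for the central limit theorem to apply). The Z-test uses the standard normal distribution, while the T-test uses the t-distribution, whose heavier tails account for estimating the variance from a small sample.&lt;/p&gt;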

&lt;blockquote&gt;
&lt;p&gt;7. What do you understand by inferential statistics?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answer: Inferential statistics is when we draw a conclusion about a population by conducting experiments on a sample taken from that population.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;8. When you are trying to calculate the sample standard deviation or variance, why do you use n-1 in the denominator?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answer: Using n-1 as the denominator corrects the bias in the estimate of the population variance.&lt;/p&gt;

&lt;p&gt;So, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;suppose we have a sample whose points do not happen to be centered on the population mean&lt;/li&gt;
&lt;li&gt;now, if we take the sample mean and measure the distances from the sample points to the sample mean, those distances will be smaller than the distances to the population mean&lt;/li&gt;
&lt;li&gt;this leads to underestimating the population variance&lt;/li&gt;
&lt;li&gt;hence we divide by n-1 instead of n, which makes the denominator smaller and in turn gives a larger value for the sample variance, making it an unbiased estimate&lt;/li&gt;
&lt;/ul&gt;
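
&lt;p&gt;The effect of the n-1 denominator can be checked on a tiny example. The sketch below assumes a made-up population of six values and enumerates every possible sample of size 2 drawn with replacement, then averages the two kinds of variance estimates using Python's standard statistics module.&lt;/p&gt;

```python
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical population, small enough to enumerate every sample.
population = [1, 2, 3, 4, 5, 6]
true_variance = pvariance(population)  # divides by N, the full population size

# Every possible sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# pvariance divides by n; variance divides by n-1.
avg_divide_by_n = mean(pvariance(s) for s in samples)
avg_divide_by_n_minus_1 = mean(variance(s) for s in samples)

print(true_variance, avg_divide_by_n, avg_divide_by_n_minus_1)
```

&lt;p&gt;Averaged over all samples, the n-1 version should match the population variance (about 2.92), while the version that divides by n comes out at about half of it (about 1.46), which is the underestimation described above.&lt;/p&gt;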

&lt;blockquote&gt;
&lt;p&gt;9. What do you understand by right skewness? Give an example.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answer: When the data is not symmetrically distributed and has a long, elongated tail on the right side, that is called right skewness (also known as positive skew).&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;p&gt;Income distribution.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XtwxK6BB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.mathsisfun.com/data/images/skewed-distribution.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XtwxK6BB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.mathsisfun.com/data/images/skewed-distribution.svg" alt="Income distribution" width="279" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;10. What is the difference between normal distribution, standard normal distribution and uniform distribution?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answer:&lt;/p&gt;

&lt;p&gt;&lt;u&gt;Normal distribution&lt;/u&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it is a density curve with a symmetric bell shape&lt;/li&gt;
&lt;li&gt;the data tend to cluster around the central value, which is the mean&lt;/li&gt;
&lt;li&gt;the total area under the curve is 1 (100%)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;u&gt;Standard normal distribution&lt;/u&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it is a special type of normal distribution which has a mean of 0 and a standard deviation of 1&lt;/li&gt;
&lt;/ul&gt;
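
&lt;p&gt;Values from any normal distribution can be mapped onto the standard normal scale by subtracting the mean and dividing by the standard deviation (the z-score). The snippet below is a small illustration with made-up values; it only shows that the rescaled values end up with mean 0 and standard deviation 1.&lt;/p&gt;

```python
from statistics import mean, pstdev

# Hypothetical measurements.
values = [10, 12, 14, 16, 18]

mu = mean(values)       # mean of the values
sigma = pstdev(values)  # population standard deviation of the values

# z-score: how many standard deviations each value is from the mean.
z_scores = [(x - mu) / sigma for x in values]

print(mean(z_scores), pstdev(z_scores))
```

&lt;p&gt;Standardizing changes only the location and scale of the data. It does not change the shape of the distribution.&lt;/p&gt;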

&lt;p&gt;&lt;u&gt;Uniform distribution&lt;/u&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;in this distribution the values lie within a certain range/boundary, and every value in that range is equally likely&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>datascience</category>
      <category>statistics</category>
      <category>interview</category>
    </item>
  </channel>
</rss>
