<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ruth</title>
    <description>The latest articles on DEV Community by Ruth (@rvtheverett).</description>
    <link>https://dev.to/rvtheverett</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F656536%2Fba8d2d01-58ea-4603-8b35-3fde547d26a0.jpg</url>
      <title>DEV Community: Ruth</title>
      <link>https://dev.to/rvtheverett</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rvtheverett"/>
    <language>en</language>
    <item>
      <title>Pandas: Importing Data, Indexing, Comparisons and Selectors (featuring adoptable dog data)</title>
      <dc:creator>Ruth</dc:creator>
      <pubDate>Sun, 14 Nov 2021 16:44:48 +0000</pubDate>
      <link>https://dev.to/rvtheverett/pandas-importing-data-indexing-comparisons-and-selectors-featuring-adoptable-dog-data-52l5</link>
      <guid>https://dev.to/rvtheverett/pandas-importing-data-indexing-comparisons-and-selectors-featuring-adoptable-dog-data-52l5</guid>
      <description>&lt;p&gt;In this second Python Pandas guide we will review how to import data and take a deeper dive into indexing, comparison statements and selecting subsets of data. For this we will be using a dataset with information about adoptable dogs from &lt;a href="https://www.kaggle.com/jmolitoris/adoptable-dogs"&gt;Kaggle&lt;/a&gt;. Kaggle has loads of datasets that you can easily download and use for projects. &lt;/p&gt;

&lt;h2&gt;
  
  
  Importing data from a csv
&lt;/h2&gt;

&lt;p&gt;To start, we need to import our downloaded data into a DataFrame. When working locally, it's simple to load the data from our downloads folder using the pd.read_csv function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
dogs = pd.read_csv(r'Downloads/ShelterDogs.csv')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once we have imported the data we want to inspect the DataFrame and make sure it contains all of the information that we need and in the correct format. We can use .head() to view the first 5 rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xhHPZjJQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308553267/eXd2t9j9c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xhHPZjJQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308553267/eXd2t9j9c.png" alt="Screenshot 2021-11-07 at 18.09.01.png" width="880" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In order to find out more about the data types of each column, we can use .info(), which displays each column name, the count of non-null values in that column and the data type of the column.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.info()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--auJpaRKw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308588956/VXK0rxapI.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--auJpaRKw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308588956/VXK0rxapI.png" alt="Screenshot 2021-11-07 at 18.09.38.png" width="824" height="906"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Types
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;int&lt;/strong&gt; - whole numbers &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;float&lt;/strong&gt; - decimal numbers &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;object&lt;/strong&gt; - usually strings (also used for columns with mixed types)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Changing data types
&lt;/h2&gt;

&lt;p&gt;There may be instances where we want to change a data type, in order to perform some operations or to make the data easier to view. &lt;/p&gt;

&lt;p&gt;To change a string to a number you can use the Pandas function pd.to_numeric, and to change a number into a string you can use .astype(str). &lt;/p&gt;
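
&lt;p&gt;As a minimal sketch of both conversions, assuming a small made-up DataFrame (the column names and values here are illustrative and are not taken from the dogs dataset):&lt;/p&gt;

```python
import pandas as pd

# Hypothetical example data: ages stored as text, weights stored as numbers
sample = pd.DataFrame({'age': ['2', '5', '11'], 'weight': [4.5, 12.0, 30.2]})

# String to number
sample['age'] = pd.to_numeric(sample['age'])

# Number to string
sample['weight'] = sample['weight'].astype(str)

print(sample.dtypes)
```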

&lt;p&gt;Within our DataFrame we can also change the date column from a string to an actual date format, using the pd.to_datetime function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs['date_found'] = pd.to_datetime(dogs['date_found'])
dogs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eVQa41nR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636314062876/wHx-FAI4g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eVQa41nR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636314062876/wHx-FAI4g.png" alt="Screenshot 2021-11-07 at 19.40.53.png" width="880" height="168"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, .shape will show us the number of rows and then the number of columns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.shape
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
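
&lt;p&gt;As a quick check of the ordering, here is a sketch on a small made-up DataFrame (the names and values are illustrative). The tuple returned by .shape holds the row count first and the column count second:&lt;/p&gt;

```python
import pandas as pd

# Hypothetical example data: 3 rows and 2 columns
sample = pd.DataFrame({'name': ['Rex', 'Bella', 'Max'], 'age': [2, 5, 11]})

# .shape is a tuple of (number of rows, number of columns)
rows, columns = sample.shape
print(rows, columns)
```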



&lt;h2&gt;
  
  
  Selecting multiple columns
&lt;/h2&gt;

&lt;p&gt;To select two or more columns from a DataFrame, we can use a list of the column names between a double set of square brackets. This will provide us a subset of the data, and create a new DataFrame containing only this information, leaving our original DataFrame untouched.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;type_breed = dogs[['name', 'sex', 'breed']]
type_breed.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kK0TgSzJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308652815/crsezoFiO.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kK0TgSzJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308652815/crsezoFiO.png" alt="Screenshot 2021-11-07 at 18.10.43.png" width="580" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Selecting one row using .iloc
&lt;/h2&gt;

&lt;p&gt;We can use .iloc, which is integer based, to select and display a single row by specifying the positional index of the row we want to view.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--M9LhiWhX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636314759657/bm__aTAPT.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--M9LhiWhX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636314759657/bm__aTAPT.png" alt="Screenshot 2021-10-31 at 14.24.13.png" width="808" height="386"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.iloc[1]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TZzHG_N3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308693583/Rg2FUR0A-.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TZzHG_N3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308693583/Rg2FUR0A-.png" alt="Screenshot 2021-11-07 at 18.11.24.png" width="646" height="708"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  iloc indexing
&lt;/h2&gt;

&lt;p&gt;In some cases, we might also want to select only certain rows. We can do this with iloc, which selects multiple rows by their position. This is similar to how we slice elements from a list, using the : operator and square brackets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.iloc[3:7]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ii6xEBRI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308725211/9YltEJXBy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ii6xEBRI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308725211/9YltEJXBy.png" alt="Screenshot 2021-11-07 at 18.11.47.png" width="880" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This will display the rows at positions 3 to 6, as the end position of a slice is not included. &lt;/p&gt;

&lt;p&gt;The : operator, which also comes from native Python, means "everything". When combined with other selectors, however, it can be used to indicate a range of values.&lt;/p&gt;

&lt;p&gt;iloc[:10] would select the first 10 rows (positions 0 to 9), stopping before the row at position 10.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.iloc[:10]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also use negative indexing, for example iloc[-2:] will display the last 2 rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.iloc[-2:]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
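
&lt;p&gt;To see the three slices side by side, here is a small sketch on a made-up DataFrame of 12 rows (the column name and values are illustrative only):&lt;/p&gt;

```python
import pandas as pd

# Hypothetical example data: 12 rows, positions 0 to 11
sample = pd.DataFrame({'value': list(range(100, 112))})

middle = sample.iloc[3:7]     # rows at positions 3, 4, 5 and 6
first_ten = sample.iloc[:10]  # rows at positions 0 to 9
last_two = sample.iloc[-2:]   # the final two rows

print(len(middle), len(first_ten), len(last_two))
```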



&lt;h2&gt;
  
  
  Renaming columns
&lt;/h2&gt;

&lt;p&gt;When we get data from different sources, we may need to rename the columns so they are easier to read or call from. &lt;/p&gt;

&lt;p&gt;There are a couple of methods to do this, depending on which column names we want to change. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;.columns will allow us to change all of the columns at once. However, it’s important to get the ordering right to avoid mislabeling them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;.rename will allow us to change individual columns. You can pass a single column to change or multiple columns, using a dictionary with the original name and the new name.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.rename(columns = {'posted' : 'date_posted', 
                         'breed' : 'dog_breed',
                         'adoptable_from' : 'date_adoptable',
                         'coat' : 'coat_type'},
                         inplace = True)

dogs.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bgfxlM3r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308801538/JYZZxvSvH.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bgfxlM3r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308801538/JYZZxvSvH.png" alt="Screenshot 2021-11-07 at 18.13.12.png" width="880" height="348"&gt;&lt;/a&gt;&lt;/p&gt;
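
&lt;p&gt;The .columns approach mentioned above can be sketched as follows, assuming a small made-up DataFrame with three columns (the short original names are invented for illustration):&lt;/p&gt;

```python
import pandas as pd

# Hypothetical example data with three columns
sample = pd.DataFrame({'n': ['Rex'], 's': ['male'], 'b': ['Beagle']})

# The new names must be listed in the same order as the existing columns
sample.columns = ['name', 'sex', 'breed']

print(sample.columns.tolist())
```

&lt;p&gt;Because every name is replaced by position, a list in the wrong order would silently mislabel the data, which is why .rename is usually the safer choice for changing only a few columns.&lt;/p&gt;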

&lt;h2&gt;
  
  
  Selecting with comparison statements
&lt;/h2&gt;

&lt;p&gt;We can also select a subset of data using comparison statements. &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;==&lt;/td&gt;
&lt;td&gt;Equal to&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;!=&lt;/td&gt;
&lt;td&gt;Not Equal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;gt;&lt;/td&gt;
&lt;td&gt;Greater than&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;lt;&lt;/td&gt;
&lt;td&gt;Less than&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;gt;=&lt;/td&gt;
&lt;td&gt;Greater than or Equal to&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;lt;=&lt;/td&gt;
&lt;td&gt;Less than or Equal to&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Equal to
&lt;/h3&gt;

&lt;p&gt;For example, if we wanted to only display female dogs from the dataset, we would use the double equal operator and define the column that we are selecting from.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;female = dogs[dogs.sex == 'female']
female
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SsQNmLC6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308861068/gRD2OFDNo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SsQNmLC6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308861068/gRD2OFDNo.png" alt="Screenshot 2021-11-07 at 18.14.11.png" width="880" height="208"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Not Equal to
&lt;/h3&gt;

&lt;p&gt;If we wanted to exclude all dogs which are an unknown mix breed, we could use the not equal to operator to select all rows except the ones whose breed is exactly Unknown Mix.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;breeds = dogs[dogs.dog_breed != 'Unknown Mix']
breeds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WsdlIGid--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308894969/FyFMJg523.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WsdlIGid--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308894969/FyFMJg523.png" alt="Screenshot 2021-11-07 at 18.14.32.png" width="880" height="213"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Greater than
&lt;/h3&gt;

&lt;p&gt;As we have numbers in our dataset we can use the greater than operator to select any rows which contain a number higher than the one we give it. We just need to specify which column we are selecting the data from, in this case, it's the age column.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;over_two = dogs[dogs.age &amp;gt; 2]
over_two
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YXRiNmpJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308919299/3Taa6eCQs3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YXRiNmpJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308919299/3Taa6eCQs3.png" alt="Screenshot 2021-11-07 at 18.15.09.png" width="880" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Less than
&lt;/h3&gt;

&lt;p&gt;Similarly, we can use the less than operator to select all of the dogs which have an age less than 4.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;under_four = dogs[dogs.age &amp;lt; 4]
under_four
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--QMXalvc8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308959357/EVhW633mn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--QMXalvc8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308959357/EVhW633mn.png" alt="Screenshot 2021-11-07 at 18.15.49.png" width="880" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Multiple statements
&lt;/h3&gt;

&lt;p&gt;It is also possible to use multiple comparison operators in a single selection, and we can require that a row meets both conditions by using the ampersand, which signifies an and rule. Each condition goes in its own set of brackets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;likes_people = dogs[(dogs.likes_people == 'yes') &amp;amp; (dogs.likes_children == 'yes')]
likes_people
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1NXEyK7O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308996244/gIuVo7B3_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1NXEyK7O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636308996244/gIuVo7B3_.png" alt="Screenshot 2021-11-07 at 18.16.27.png" width="880" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In addition, we can use the pipe, |, to define an or rule, where we want our final dataset to contain the rows that meet at least one of the conditions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cats_female = dogs[(dogs.get_along_females == 'yes') | (dogs.get_along_cats == 'yes')]
cats_female
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Mixed statements
&lt;/h2&gt;

&lt;p&gt;We can also mix different comparison operators within one statement. For example, if we want any dogs which are neutered and aged 4 or older, we can write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;suitable = dogs[(dogs.neutered == 'yes') &amp;amp; (dogs.age &amp;gt;= 4)]
suitable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mp_8j-8U--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636309109192/ljJVIqmaj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mp_8j-8U--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636309109192/ljJVIqmaj.png" alt="Screenshot 2021-11-07 at 18.18.17.png" width="880" height="169"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Built in selectors
&lt;/h2&gt;

&lt;p&gt;Pandas also comes with some conditional selectors that are built in and can be used in a similar way to logical statements. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;isin selects rows where the value matches any of the items in the list you provide &lt;/li&gt;
&lt;li&gt;isnull selects rows where the chosen column is null (i.e. displayed as NaN)&lt;/li&gt;
&lt;li&gt;notnull selects rows where the chosen column has a value, i.e. everything that is not null &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  isin
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;breed = dogs[(dogs.dog_breed.isin(['Staffordshire Terrier Mix', 'Labrador Retriever Mix', 'German Shepherd Dog Mix']))]
breed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--92YEUwag--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636311856198/3g-QybpgL.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--92YEUwag--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636311856198/3g-QybpgL.png" alt="Screenshot 2021-11-07 at 19.04.07.png" width="618" height="616"&gt;&lt;/a&gt;&lt;/p&gt;

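&lt;p&gt;Once we have selected a subset of data, the rows keep the index labels they had in the original DataFrame, so the index is no longer consecutive. If we want the index to run from 0 again, we can reset it using .reset_index(). By default this keeps the old labels in a new column called index; passing drop=True discards them.&lt;br&gt;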
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;breed = dogs[(dogs.dog_breed.isin(['Staffordshire Terrier Mix', 'Labrador Retriever Mix', 'German Shepherd Dog Mix']))].reset_index()
breed 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nVwYtOqM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636311911167/14MEpgc2i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nVwYtOqM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636311911167/14MEpgc2i.png" alt="Screenshot 2021-11-07 at 19.05.02.png" width="684" height="610"&gt;&lt;/a&gt;&lt;/p&gt;
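
&lt;p&gt;Here is a small sketch of the difference between the default behaviour and drop=True, using a made-up DataFrame rather than the dogs data (the breed values are illustrative):&lt;/p&gt;

```python
import pandas as pd

# Hypothetical example data
sample = pd.DataFrame({'breed': ['Beagle', 'Unknown Mix', 'Poodle', 'Beagle']})

# Filtering keeps the original index labels (0 and 3 here)
beagles = sample[sample.breed.isin(['Beagle'])]

# reset_index() renumbers from 0 and keeps the old labels in an 'index' column
renumbered = beagles.reset_index()

# drop=True renumbers from 0 and discards the old labels
clean = beagles.reset_index(drop=True)

print(clean)
```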

&lt;h3&gt;
  
  
  notnull
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;neutered_known = dogs.loc[dogs.neutered.notnull()]
neutered_known 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  isnull
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;neutered_unknown = dogs.loc[dogs.neutered.isnull()]
neutered_unknown 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KMOVA7X---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636312036391/hUumXRusa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KMOVA7X---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636312036391/hUumXRusa.png" alt="Screenshot 2021-11-07 at 19.07.06.png" width="880" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Dealing with null data
&lt;/h2&gt;

&lt;p&gt;It's likely that you will come across null data when dealing with big datasets, and this can cause issues when doing any selecting or mathematical operations. It can also sometimes make the tables look messy, hard to review and can skew final results. &lt;/p&gt;

&lt;p&gt;There is an easy way to replace null values using the Pandas method .fillna(). Within the brackets you will need to add an argument that states the replacement number or text. This will then replace all of the NaN fields in the DataFrame that is returned; the original is left unchanged unless you assign the result back.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dogs.fillna("not available")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--030WtKZh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636314520992/JO3YdpS4Y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--030WtKZh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1636314520992/JO3YdpS4Y.png" alt="Screenshot 2021-11-07 at 19.48.30.png" width="880" height="337"&gt;&lt;/a&gt; &lt;/p&gt;
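
&lt;p&gt;If different columns need different replacements, .fillna() also accepts a dictionary of column names and values. A minimal sketch, assuming a small made-up DataFrame (the column names and values are illustrative):&lt;/p&gt;

```python
import pandas as pd

# Hypothetical example data with missing values in both columns
sample = pd.DataFrame({'neutered': ['yes', None, 'no'], 'age': [2.0, None, 7.0]})

# fillna returns a new DataFrame, so assign the result to keep it
filled = sample.fillna({'neutered': 'not available', 'age': 0})

print(filled)
```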

&lt;p&gt;I hope this has been a helpful (and cute) way to understand more of the Pandas library in Python. I'm looking forward to making a few more posts in this series :) &lt;/p&gt;

&lt;p&gt;A notebook to download and play around with can be found &lt;a href="https://github.com/rvth/thinking-like-a-panda/blob/main/Pandas%20-%20Adoptable%20Dogs.ipynb"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>python</category>
      <category>pandas</category>
      <category>learning</category>
      <category>data</category>
    </item>
    <item>
      <title>Pandas: Creating, Modifying and Inspecting DataFrames (featuring data from Squid Game)</title>
      <dc:creator>Ruth</dc:creator>
      <pubDate>Sun, 31 Oct 2021 19:07:06 +0000</pubDate>
      <link>https://dev.to/rvtheverett/pandas-creating-modifying-and-inspecting-dataframes-featuring-data-from-squid-game-1d2c</link>
      <guid>https://dev.to/rvtheverett/pandas-creating-modifying-and-inspecting-dataframes-featuring-data-from-squid-game-1d2c</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7TbDLMG7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635697226433/p1kLqNC4g.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7TbDLMG7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635697226433/p1kLqNC4g.gif" alt="giphy (6).gif" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Inspired by &lt;a class="mentioned-user" href="https://dev.to/codechips"&gt;@codechips&lt;/a&gt;' SQL Squid Game guide, I thought it would be fun to use some Squid Game data to write a guide on the basics of the Pandas library in Python, including creating a DataFrame, modifying rows and columns and inspecting the data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Importing Pandas
&lt;/h2&gt;

&lt;p&gt;The first step is to import the Pandas library and alias it as pd. This means we don't have to type the full word when we run a Pandas function, we just need to type pd.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The main data structure of Pandas is a DataFrame, which is similar to an Excel spreadsheet, where we can store data. Once the data is within a DataFrame, there are several ways it can be used, for example to analyse or visualise it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing DataFrames
&lt;/h2&gt;

&lt;p&gt;DataFrames store data in rows and columns. Each column has a name, which is a string, and each row has an index, which is typically an integer. Like lists in Python, DataFrames also use 0 indexing, which means the first row is at index 0 instead of index 1. However, you can set the index to include extra information about what the row contains if you want. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5JKI-Y_r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684575976/YYzdNTC9B.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5JKI-Y_r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684575976/YYzdNTC9B.png" alt="Screenshot 2021-10-31 at 12.49.20.png" width="690" height="356"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;DataFrames can contain many different data types, including strings, integers and floats. &lt;/p&gt;

&lt;p&gt;You can create a DataFrame by loading data from a csv file, but you can also create one by typing values into lists and using a dictionary to transform them into a DataFrame. You can use multiple lists containing different data types, but the number of values in each list must be the same.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data = {'Name':  ['Oh Il-Nam', 'Kang Sae-byeok', 'Jang Deok-su', 'Abdul Ali', 'Han Mi-nyeo',  'Cho Sang-woo', 'Ji-yeong'],
'Number': [1, 67, 101, 199, 212, 218, 240],
        }

df = pd.DataFrame(data, index = ["player1", "player67", "player101", "player199", "player212", "player218", 
"player240"])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see above, we have one list with the players' names, typed as strings, and another with their numbers, which are integers. &lt;/p&gt;

&lt;p&gt;We then transform this dictionary of lists to a DataFrame using the &lt;em&gt;pd.DataFrame()&lt;/em&gt; command. We also set the index values, by passing the string names as a list argument. &lt;/p&gt;

&lt;h2&gt;
  
  
  Inspecting a DataFrame
&lt;/h2&gt;

&lt;p&gt;Once the data has been added, we want to make sure it looks correct and contains everything we want. A quick way to do this is to use &lt;em&gt;df.head()&lt;/em&gt;. This will show the first 5 rows of data. However, if you would like to see more data, e.g. 10 rows, you can pass this as an argument within the brackets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.head(10)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LXwlDgmQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684192998/BSI6OefFT.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LXwlDgmQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684192998/BSI6OefFT.png" alt="Screenshot 2021-10-31 at 12.20.59.png" width="468" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If we want to only view the data from one row, we can use the Pandas-specific access method &lt;em&gt;.loc&lt;/em&gt;. This is a label based method, meaning we have to specify the label of the row we want to view. Labels are often strings, but they can be other types, such as integers. Here we add in the index name that we specified when creating the DataFrame.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(df.loc["player67"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5OVkaGgO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684208937/PwpNHm1re.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5OVkaGgO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684208937/PwpNHm1re.png" alt="Screenshot 2021-10-31 at 12.21.49.png" width="524" height="134"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The other access method is &lt;em&gt;.iloc&lt;/em&gt;, which is integer based: we specify the positional index of the row we want to view, counting from 0. Because it works by position, it can be used whatever the index labels are, including when we haven't set a custom index. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--D-P55C79--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635690271373/FkLwSbFId.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--D-P55C79--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635690271373/FkLwSbFId.png" alt="Screenshot 2021-10-31 at 14.24.13.png" width="808" height="386"&gt;&lt;/a&gt;&lt;/p&gt;
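
&lt;p&gt;To compare the two access methods, here is a minimal sketch using a shortened version of the players data from above (the variable names players, by_label and by_position are chosen for illustration):&lt;/p&gt;

```python
import pandas as pd

# A shortened version of the players data, with string index labels
players = pd.DataFrame(
    {'Name': ['Oh Il-Nam', 'Kang Sae-byeok', 'Jang Deok-su'], 'Number': [1, 67, 101]},
    index=['player1', 'player67', 'player101'],
)

by_label = players.loc['player67']  # look up by index label
by_position = players.iloc[1]       # look up by position (second row)

print(by_label.equals(by_position))
```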

&lt;p&gt;We can also count the rows in the DataFrame by using the &lt;em&gt;count()&lt;/em&gt; function. This returns the number of non-missing values in each column, which matches the number of rows when no values are missing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.count()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WfHjHeGn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684222534/zlvMgpnzK.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WfHjHeGn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684222534/zlvMgpnzK.png" alt="Screenshot 2021-10-31 at 12.22.54.png" width="286" height="128"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If we only want a single number for the row count, we can pass the name of the column we want to count in square brackets before the count function. Storing the result in a variable also lets us reuse the number later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;player_count = df['Name'].count()
player_count
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
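
&lt;p&gt;One thing worth keeping in mind is that count() skips missing values, so it can differ from the true number of rows. The sketch below shows this with a hypothetical DataFrame in which one name is missing; the data is made up for illustration.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical data with one missing name, for illustration only.
df = pd.DataFrame(
    {"Name": ["Oh Il-nam", None, "Cho Sang-woo"],
     "Number": [1, 67, 218]},
    index=["player1", "player67", "player218"],
)

name_count = df["Name"].count()  # non-missing values in the Name column
row_count = len(df)              # total rows, missing values included

print(name_count, row_count)
```

&lt;p&gt;If the goal is simply the number of rows, len(df) is the safer choice when a column might contain gaps.&lt;/p&gt;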



&lt;h2&gt;
  
  
  Adding a new row
&lt;/h2&gt;

&lt;p&gt;There are a couple of different ways to add new rows to a DataFrame. If we have multiple rows within another DataFrame, we can add them to the end of our existing DataFrame with &lt;em&gt;pd.concat()&lt;/em&gt;. Older tutorials use the &lt;em&gt;.append()&lt;/em&gt; function for this, but it was deprecated in pandas 1.4 and removed in pandas 2.0. &lt;/p&gt;
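
&lt;p&gt;A minimal sketch of adding several rows from another DataFrame is shown below. It uses pd.concat(), since DataFrame.append() is no longer available in pandas 2.0 and later. Both DataFrames and the name new_players are hypothetical stand-ins for illustration.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical stand-in for the existing DataFrame.
df = pd.DataFrame(
    {"Name": ["Oh Il-nam", "Kang Sae-byeok"], "Number": [1, 67]},
    index=["player1", "player67"],
)

# Hypothetical second DataFrame holding the rows we want to add.
new_players = pd.DataFrame(
    {"Name": ["Cho Sang-woo", "Seong Gi-hun"], "Number": [218, 456]},
    index=["player218", "player456"],
)

# pd.concat returns a new DataFrame with the extra rows at the end.
df = pd.concat([df, new_players])
print(df)
```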

&lt;p&gt;However, if we have just one row to add, i.e. only one new player, we can again use the .loc method to add the row to the end of our original DataFrame. As we are naming our index labels ourselves, we pass the new label to .loc and then assign the values for the row. The number of values must match the number of columns we have, and they must be in the same order as the columns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.loc['player456'] = ['Seong Gi-hun', 456]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now if we print the DataFrame, we will see our new row added at the end (with a longer DataFrame, tail() is the quickest way to check, as head() only shows the first five rows) 🧑‍🦰&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ay6AvSye--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684246183/_tnVDlj_u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ay6AvSye--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684246183/_tnVDlj_u.png" alt="Screenshot 2021-10-31 at 12.24.04.png" width="506" height="606"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Adding a new column
&lt;/h2&gt;

&lt;p&gt;In addition to adding a new row, we can also add a new column. There are several ways to define the value this column holds for each row: it can be calculated from other columns, set with a conditional statement, or produced by a lambda function. However, in this case we want the value to be the same for every row, to show that every player is currently playing the game. &lt;/p&gt;

&lt;p&gt;For this we just need to add the name of our new column and assign it the value that we want to add.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df['Status'] = "Playing"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--AqHXtCrU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684288720/Hps6Ba-7n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--AqHXtCrU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684288720/Hps6Ba-7n.png" alt="Screenshot 2021-10-31 at 12.24.47.png" width="580" height="524"&gt;&lt;/a&gt;&lt;/p&gt;
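
&lt;p&gt;For comparison, here is a rough sketch of the other approaches mentioned at the start of this section, where the new column depends on an existing one. The column names Badge and Last Entrant, and the data itself, are hypothetical and only there to illustrate the idea.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical stand-in data for illustration.
df = pd.DataFrame(
    {"Name": ["Oh Il-nam", "Kang Sae-byeok", "Seong Gi-hun"],
     "Number": [1, 67, 456]},
    index=["player1", "player67", "player456"],
)

# Lambda function: build a three-digit badge string from the Number column.
df["Badge"] = df["Number"].apply(lambda n: str(n).zfill(3))

# Conditional statement: True only for the row whose Number is 456.
df["Last Entrant"] = df["Number"] == 456

print(df)
```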

&lt;h2&gt;
  
  
  Changing values
&lt;/h2&gt;

&lt;p&gt;Now we have a column with the status of all players that are playing the game. But what happens when we start to have eliminations? We need to update the value in that column 😬&lt;/p&gt;

&lt;p&gt;Again, we will use the &lt;em&gt;.loc&lt;/em&gt; method to define the row that we will be amending and the column that we will be changing the value of. &lt;/p&gt;

&lt;p&gt;In this case the row is chosen by the player's number, so we compare the Number column to that value with the equality operator (==); only rows where the comparison is true are changed. Next we pass in the name of the column that we are amending, before assigning the new value of Eliminated 😢&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.loc[df.Number == 1, "Status"] = "Eliminated"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Mlk65P-N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684316867/WkohnycJf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Mlk65P-N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684316867/WkohnycJf.png" alt="Screenshot 2021-10-31 at 12.30.54.png" width="626" height="524"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Deleting a row
&lt;/h2&gt;

&lt;p&gt;Once players start being eliminated, they will be removed from the game, so we will want an updated DataFrame that reflects this and only contains those who are currently playing. &lt;/p&gt;

&lt;p&gt;Because we want to keep track of all of the players in the original DataFrame, we don't want to delete anything from this one. However, we can use the data from our original DataFrame to make another one while leaving the original untouched. We can do this by defining the name of our new DataFrame and applying some logic to pull data from our original one. &lt;/p&gt;

&lt;p&gt;In this case, &lt;em&gt;.drop()&lt;/em&gt; allows us to drop (aka delete, but I guess in the instance of game 5, literally drop) the eliminated players. As we have changed their status to Eliminated, we can use this value to delete them all.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;round5 = df.drop(df.index[df['Status'] == "Eliminated"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is also possible to drop rows by their index label (for example player67), either one at a time or by passing a list of labels, but then we have to name each row ourselves rather than letting a condition pick them out. &lt;/p&gt;
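
&lt;p&gt;As a hedged illustration, the sketch below drops rows by passing a list of index labels to drop(), and compares the result with simply filtering on the Status column. The data and the statuses are hypothetical stand-ins.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical stand-in data for illustration.
df = pd.DataFrame(
    {"Name": ["Oh Il-nam", "Kang Sae-byeok", "Cho Sang-woo", "Seong Gi-hun"],
     "Number": [1, 67, 218, 456],
     "Status": ["Eliminated", "Eliminated", "Playing", "Playing"]},
    index=["player1", "player67", "player218", "player456"],
)

# Drop by index label; a list removes several rows in one call.
by_label = df.drop(["player1", "player67"])

# Same result: keep only the rows whose Status is "Playing".
by_filter = df[df["Status"] == "Playing"]

print(by_label.equals(by_filter))
```

&lt;p&gt;Neither line changes df itself; each returns a new DataFrame, which is what lets us keep the original intact.&lt;/p&gt;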

&lt;p&gt;Now we can view the new DataFrame, and see our remaining players for the game.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;round5.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Y5bk4Xf7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684404025/o_-4tbSEq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Y5bk4Xf7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684404025/o_-4tbSEq.png" alt="Screenshot 2021-10-31 at 12.38.31.png" width="588" height="256"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Using the data
&lt;/h2&gt;

&lt;p&gt;The final thing we can do is use the count function mentioned earlier, this time combined with a conditional statement so that only the matching rows are counted. For example, we can count the players who have a status of Playing, and then those who have a status of Eliminated.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eliminated = df[df['Status'] == "Eliminated"]['Name'].count()
playing = df[df['Status'] == "Playing"]['Name'].count()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
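
&lt;p&gt;An alternative sketch, assuming the same Status column, is to let value_counts() tally every status in one pass and then read off the ones we need. The data below is a hypothetical stand-in.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical stand-in data for illustration.
df = pd.DataFrame(
    {"Name": ["Oh Il-nam", "Kang Sae-byeok", "Cho Sang-woo", "Seong Gi-hun"],
     "Status": ["Eliminated", "Playing", "Playing", "Playing"]},
)

# One count per distinct value in the Status column.
status_counts = df["Status"].value_counts()

# .get() falls back to 0 if a status does not appear at all.
playing = status_counts.get("Playing", 0)
eliminated = status_counts.get("Eliminated", 0)

print(playing, eliminated)
```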



&lt;p&gt;Assigning these values to variables means we can also use the counts to create simple sentences using f-strings. As variables can be reassigned, we can recalculate them after every game when players are eliminated.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(f"there are currently {playing} players playing")
print(f"there are currently {eliminated} players eliminated")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hhl6eNNH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684418784/mo-Mb5B_3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hhl6eNNH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1635684418784/mo-Mb5B_3.png" alt="Screenshot 2021-10-31 at 12.40.05.png" width="708" height="94"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And not to forget some simple maths: we can subtract the number of eliminated players from the total number of players who started, to get a count of the current players in the game.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;total_players = 8
current = total_players - eliminated
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
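
&lt;p&gt;Rather than typing the total in by hand, one option is to take it from the DataFrame itself so it stays correct if rows are added. This is only a sketch with hypothetical stand-in data, and it assumes every player who started is still a row in df.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical stand-in data for illustration.
df = pd.DataFrame(
    {"Name": ["Oh Il-nam", "Kang Sae-byeok", "Cho Sang-woo", "Seong Gi-hun"],
     "Status": ["Eliminated", "Playing", "Playing", "Playing"]},
)

total_players = len(df)                             # every row is a player
eliminated = (df["Status"] == "Eliminated").sum()   # True counts as 1
current = total_players - eliminated

print(f"{current} of {total_players} players are still in the game")
```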



&lt;p&gt;I hope this has been a helpful (and fun) way to understand the basics of the Pandas library in Python. I'm also hoping to make more posts that go into further detail, covering logical statements, merging and visualisations in Pandas. &lt;/p&gt;

&lt;p&gt;A quick cheatsheet of these functions can be found &lt;a href="https://www.notion.so/Pandas-Creating-Modifying-and-Inspecting-DataFrames-featuring-data-from-Squid-Game-01ec20de1527400a8551aeb5a194fa54"&gt;here&lt;/a&gt; and a notebook to download and play around with can be found &lt;a href="https://github.com/rvth/squid-game-python/blob/main/Squid%20Game%20Python.ipynb"&gt;here&lt;/a&gt;. &lt;/p&gt;

</description>
      <category>python</category>
      <category>pandas</category>
      <category>datascience</category>
      <category>learning</category>
    </item>
  </channel>
</rss>
