<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Riley Molloy</title>
    <description>The latest articles on DEV Community by Riley Molloy (@rpm4real).</description>
    <link>https://dev.to/rpm4real</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F185873%2F644545af-78ca-4f2d-9d98-51212985a8f3.jpeg</url>
      <title>DEV Community: Riley Molloy</title>
      <link>https://dev.to/rpm4real</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rpm4real"/>
    <language>en</language>
    <item>
      <title>In-line Renaming of Pandas Aggregates</title>
      <dc:creator>Riley Molloy</dc:creator>
      <pubDate>Wed, 14 Aug 2019 00:48:37 +0000</pubDate>
      <link>https://dev.to/rpm4real/in-line-renaming-of-pandas-aggregates-3i6n</link>
      <guid>https://dev.to/rpm4real/in-line-renaming-of-pandas-aggregates-3i6n</guid>
      <description>&lt;h1&gt;
  
  
  The Problem
&lt;/h1&gt;

&lt;p&gt;When aggregating dataframes in pandas, I've found myself frustrated with how the resulting columns are named. By default, they inherit the name of the column you're aggregating. For example,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt; 
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;iris&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'species'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_length'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table class="dataframe"&gt;  &lt;thead&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;sepal_length&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;species&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;    &lt;/tr&gt;  &lt;/thead&gt;  &lt;tbody&gt;    &lt;tr&gt;      &lt;th&gt;setosa&lt;/th&gt;      &lt;td&gt;5.01&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;versicolor&lt;/th&gt;      &lt;td&gt;5.94&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;virginica&lt;/th&gt;      &lt;td&gt;6.59&lt;/td&gt;    &lt;/tr&gt;  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As the writers of the code above, we know that we took the mean of sepal length. But from the output alone, a reader has no idea what was done to the sepal length values. We can get around this by enclosing the aggregate function in a list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'species'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_length'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table class="dataframe"&gt;  &lt;thead&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;sepal_length&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;mean&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;species&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;    &lt;/tr&gt;  &lt;/thead&gt;  &lt;tbody&gt;    &lt;tr&gt;      &lt;th&gt;setosa&lt;/th&gt;      &lt;td&gt;5.01&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;versicolor&lt;/th&gt;      &lt;td&gt;5.94&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;virginica&lt;/th&gt;      &lt;td&gt;6.59&lt;/td&gt;    &lt;/tr&gt;  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Pandas adds a header row (technically a level, creating a &lt;code&gt;MultiIndex&lt;/code&gt; on the columns) that names the aggregate function applied to each column. Here we applied only one, but the same layout extends to multiple aggregation expressions per column. &lt;/p&gt;

&lt;p&gt;This approach works well. If you want to collapse the &lt;code&gt;MultiIndex&lt;/code&gt; into flat, more accessible column names, you can join the levels together, an approach inspired by &lt;a href="https://stackoverflow.com/questions/14507794/pandas-how-to-flatten-a-hierarchical-index-in-columns"&gt;this Stack Overflow post&lt;/a&gt; (note that other implementations similarly use &lt;code&gt;.ravel()&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'species'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_length'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'_'&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;gp&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table class="dataframe"&gt;  &lt;thead&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;sepal_length_mean&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;species&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;    &lt;/tr&gt;  &lt;/thead&gt;  &lt;tbody&gt;    &lt;tr&gt;      &lt;th&gt;setosa&lt;/th&gt;      &lt;td&gt;5.01&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;versicolor&lt;/th&gt;      &lt;td&gt;5.94&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;virginica&lt;/th&gt;      &lt;td&gt;6.59&lt;/td&gt;    &lt;/tr&gt;  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both of these solutions have a few immediate issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Column names can still be far from readable English; &lt;/li&gt;
&lt;li&gt;The concatenation approach may not scale for all applications; &lt;/li&gt;
&lt;li&gt;Pandas uses the &lt;code&gt;__name__&lt;/code&gt; attribute of each aggregation function as the column name. When aggregating with custom functions or lambda functions, those names are unlikely to make sense in either format. &lt;/li&gt;
&lt;/ul&gt;
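&lt;p&gt;To make the last point concrete, here is a minimal sketch in plain Python (no pandas needed) of the names a labelling step based on &lt;code&gt;__name__&lt;/code&gt; has to work with. The function &lt;code&gt;my_agg&lt;/code&gt; and the two lambdas are illustrative assumptions, not code from a real project.&lt;/p&gt;

```python
def my_agg(x):
    # Illustrative custom aggregate: scale each value, then add them up.
    return sum(value / 20 for value in x)


# Collect the __name__ of one named function and two different lambdas.
candidates = (my_agg, lambda x: max(x), lambda x: min(x))
names = [func.__name__ for func in candidates]

# The named function reports its own name; both lambdas share one
# generic placeholder name, so they cannot be told apart by name.
print(names)
```

&lt;p&gt;The named function at least yields a label, even if a terse one, while the two lambdas are indistinguishable by name.&lt;/p&gt;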

&lt;h1&gt;
  
  
  A Different Solution
&lt;/h1&gt;

&lt;p&gt;We can leverage the &lt;code&gt;__name__&lt;/code&gt; attribute to create a clearer column name, maybe even one that others can make sense of. 👍 &lt;/p&gt;

&lt;p&gt;To be clear: we could obviously rename any of these columns after the dataframe is returned, but in this case I wanted a solution where I could set column names on the fly. &lt;/p&gt;

&lt;h2&gt;
  
  
  Taking Advantage of the &lt;code&gt;__name__&lt;/code&gt; Attribute
&lt;/h2&gt;

&lt;p&gt;If you're unfamiliar, every function defined in Python, whether by you or by someone else, carries a &lt;code&gt;__name__&lt;/code&gt; attribute.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;this_function&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt; 

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;this_function&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;this_function&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We can change this attribute after the function is defined:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;this_function&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt; 

&lt;span class="n"&gt;this_function&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'that.'&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;this_function&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;that.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are also some great options for adjusting a function's &lt;code&gt;__name__&lt;/code&gt; as you define it, using decorators. More about that &lt;a href="https://stackoverflow.com/questions/10874432/possible-to-change-function-name-in-definition"&gt;here&lt;/a&gt;. &lt;/p&gt;
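&lt;p&gt;As a rough sketch of the decorator idea, the snippet below sets &lt;code&gt;__name__&lt;/code&gt; at definition time. The decorator factory &lt;code&gt;named&lt;/code&gt; and the function &lt;code&gt;scaled_total&lt;/code&gt; are hypothetical names chosen for illustration; this is one possible shape, not the only one.&lt;/p&gt;

```python
def named(desired_name):
    """Hypothetical decorator factory: set __name__ when the function is defined."""
    def decorator(func):
        func.__name__ = desired_name
        return func  # same function object, only its name attribute changed
    return decorator


@named('Scaled Total')
def scaled_total(x):
    # Works on any iterable of numbers.
    return sum(value / 20 for value in x)


print(scaled_total.__name__)  # Scaled Total
```

&lt;p&gt;The trade-off is that the name is fixed where the function is defined, so the same function cannot carry different labels in different aggregations.&lt;/p&gt;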

&lt;p&gt;Returning to our application, let's examine the following situation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;my_agg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; 
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'species'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_length'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;my_agg&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_width'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;my_agg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table class="dataframe"&gt;  &lt;thead&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;sepal_length&lt;/th&gt;      &lt;th&gt;sepal_width&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;my_agg&lt;/th&gt;      &lt;th&gt;my_agg&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;species&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;    &lt;/tr&gt;  &lt;/thead&gt;  &lt;tbody&gt;    &lt;tr&gt;      &lt;th&gt;setosa&lt;/th&gt;      &lt;td&gt;12.52&lt;/td&gt;      &lt;td&gt;8.57&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;versicolor&lt;/th&gt;      &lt;td&gt;14.84&lt;/td&gt;      &lt;td&gt;6.92&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;virginica&lt;/th&gt;      &lt;td&gt;16.47&lt;/td&gt;      &lt;td&gt;7.44&lt;/td&gt;    &lt;/tr&gt;  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We could add a line adjusting the &lt;code&gt;__name__&lt;/code&gt; of &lt;code&gt;my_agg()&lt;/code&gt; before we start our aggregation. But what if we could rename the function as we aggregate, similar to how we can alias columns in a SQL statement as we define them? &lt;/p&gt;

&lt;h2&gt;
  
  
  Higher-order Renaming Function
&lt;/h2&gt;

&lt;p&gt;To solve this problem, we can define a higher-order function that returns a new function wrapping our original one, but with the &lt;code&gt;__name__&lt;/code&gt; attribute changed. The original function is left untouched. It looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;renamer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agg_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;desired_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;return_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;agg_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;return_func&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;desired_name&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;return_func&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We can wrap &lt;code&gt;my_agg&lt;/code&gt; in this function inside the aggregation call to set the &lt;code&gt;__name__&lt;/code&gt; on the fly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'species'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_length'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;renamer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_agg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;'Cool Name'&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_width'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;renamer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_agg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;'Better Name'&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table class="dataframe"&gt;  &lt;thead&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;sepal_length&lt;/th&gt;      &lt;th&gt;sepal_width&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;Cool Name&lt;/th&gt;      &lt;th&gt;Better Name&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;species&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;    &lt;/tr&gt;  &lt;/thead&gt;  &lt;tbody&gt;    &lt;tr&gt;      &lt;th&gt;setosa&lt;/th&gt;      &lt;td&gt;12.52&lt;/td&gt;      &lt;td&gt;8.57&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;versicolor&lt;/th&gt;      &lt;td&gt;14.84&lt;/td&gt;      &lt;td&gt;6.92&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;virginica&lt;/th&gt;      &lt;td&gt;16.47&lt;/td&gt;      &lt;td&gt;7.44&lt;/td&gt;    &lt;/tr&gt;  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
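&lt;p&gt;One possible refinement, sketched here as an assumption rather than as part of the approach above, is to use &lt;code&gt;functools.wraps&lt;/code&gt; from the standard library so the wrapper also keeps the docstring of the original function. The name &lt;code&gt;renamer_with_metadata&lt;/code&gt; is hypothetical, and this version of &lt;code&gt;my_agg&lt;/code&gt; uses plain Python iteration so the snippet runs without pandas.&lt;/p&gt;

```python
import functools


def renamer_with_metadata(agg_func, desired_name):
    """Hypothetical variant of renamer that keeps the docstring of agg_func."""
    @functools.wraps(agg_func)  # copies __doc__, __module__, etc. onto the wrapper
    def return_func(*args, **kwargs):
        return agg_func(*args, **kwargs)
    # wraps also copied the original __name__, so override it afterwards.
    return_func.__name__ = desired_name
    return return_func


def my_agg(x):
    """Sum of the values after dividing each by 20."""
    return sum(value / 20 for value in x)


cool = renamer_with_metadata(my_agg, 'Cool Name')
print(cool.__name__)    # Cool Name
print(my_agg.__name__)  # my_agg
print(cool.__doc__)     # Sum of the values after dividing each by 20.
```

&lt;p&gt;Because the wrapper is a separate function object, the same aggregate can be given a different label for each column without the labels overwriting one another.&lt;/p&gt;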

&lt;h2&gt;
  
  
  Realistic Example
&lt;/h2&gt;

&lt;p&gt;Here's a scenario well suited to this solution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;percentile&lt;/span&gt;

&lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'species'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_length'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;renamer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;percentile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="s"&gt;'25th Percentile'&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_width'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;renamer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;percentile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;75&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="s"&gt;'75th Percentile'&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table class="dataframe"&gt;  &lt;thead&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;sepal_length&lt;/th&gt;      &lt;th&gt;sepal_width&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;25th Percentile&lt;/th&gt;      &lt;th&gt;75th Percentile&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;species&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;    &lt;/tr&gt;  &lt;/thead&gt;  &lt;tbody&gt;    &lt;tr&gt;      &lt;th&gt;setosa&lt;/th&gt;      &lt;td&gt;4.80&lt;/td&gt;      &lt;td&gt;3.68&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;versicolor&lt;/th&gt;      &lt;td&gt;5.60&lt;/td&gt;      &lt;td&gt;3.00&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;virginica&lt;/th&gt;      &lt;td&gt;6.22&lt;/td&gt;      &lt;td&gt;3.18&lt;/td&gt;    &lt;/tr&gt;  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;To get various percentiles of sepal widths and lengths, we can use lambda functions rather than defining named functions of our own. We use the renamer to give these lambda functions understandable names.&lt;/p&gt;

&lt;p&gt;To take this a step further, we can include the column name in the rename string and drop the top level of the column &lt;code&gt;MultiIndex&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;percentile&lt;/span&gt;

&lt;span class="n"&gt;df3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iris&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'species'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_length'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;renamer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;percentile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="s"&gt;'Length 25th Percentile'&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="s"&gt;'sepal_width'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;renamer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;percentile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;75&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="s"&gt;'Width 75th Percentile'&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

&lt;span class="n"&gt;df3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;droplevel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;df3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table class="dataframe"&gt;  &lt;thead&gt;    &lt;tr&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;Length 25th Percentile&lt;/th&gt;      &lt;th&gt;Width 75th Percentile&lt;/th&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;species&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;      &lt;th&gt;&lt;/th&gt;    &lt;/tr&gt;  &lt;/thead&gt;  &lt;tbody&gt;    &lt;tr&gt;      &lt;th&gt;setosa&lt;/th&gt;      &lt;td&gt;4.80&lt;/td&gt;      &lt;td&gt;3.68&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;versicolor&lt;/th&gt;      &lt;td&gt;5.60&lt;/td&gt;      &lt;td&gt;3.00&lt;/td&gt;    &lt;/tr&gt;    &lt;tr&gt;      &lt;th&gt;virginica&lt;/th&gt;      &lt;td&gt;6.22&lt;/td&gt;      &lt;td&gt;3.18&lt;/td&gt;    &lt;/tr&gt;  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h1&gt;
  
  
  Final Thoughts
&lt;/h1&gt;

&lt;p&gt;There are many ways to skin a cat when working with pandas dataframes, but I'm constantly looking for ways to simplify and speed up my workflow. This solution helps me work through aggregation steps and easily create shareable tables. It certainly won't work for all situations, but consider using it the next time you get frustrated with unhelpful column names! &lt;/p&gt;

</description>
      <category>python</category>
      <category>pandas</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
