<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vinicius Aguiar</title>
    <description>The latest articles on DEV Community by Vinicius Aguiar (@vncsna).</description>
    <link>https://dev.to/vncsna</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F478201%2F9331d27d-37d0-4646-9f50-40b7b4868f25.png</url>
      <title>DEV Community: Vinicius Aguiar</title>
      <link>https://dev.to/vncsna</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vncsna"/>
    <language>en</language>
    <item>
      <title>Como funciona o sistema de inserção de dados na BD</title>
      <dc:creator>Vinicius Aguiar</dc:creator>
      <pubDate>Fri, 28 May 2021 15:32:31 +0000</pubDate>
      <link>https://dev.to/basedosdados/como-funciona-o-sistema-de-insercao-de-dados-na-bd-25dk</link>
      <guid>https://dev.to/basedosdados/como-funciona-o-sistema-de-insercao-de-dados-na-bd-25dk</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Vamos conhecer um pouquinho sobre como funciona a infraestrutura de inserção de dados da Base dos Dados e como você pode melhorar seu portfólio de cientista de dados e/ou desenvolvedor contribuindo com a nossa missão de universalizar o acesso a dados.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Infraestrutura
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Pwd9BZ1m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/g5v7q6ubopmziloojgft.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Pwd9BZ1m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/g5v7q6ubopmziloojgft.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;O time de infraestrutura da Base dos Dados é responsável pelas ferramentas de ingestão de dados, que englobam desde o upload de dados até a disponibilização de dados no ambiente de produção; pelo acesso de dados através de pacotes em Python e R; e pelo website. Neste cenário o time é atualmente dividido em várias frentes, tratando da renovação do site e da implementação de pesos e contrapesos automatizados.&lt;/p&gt;

&lt;p&gt;Procuramos simplificar e automatizar todos os processos, começando com o &lt;a href="https://basedosdados.github.io/mais/colab_data/"&gt;upload de dados&lt;/a&gt; e inserção dos mesmos no &lt;strong&gt;Ambiente de Experimentação&lt;/strong&gt;. Neste ponto o colaborador pode adicionar dados em sua nuvem do Google, limpar e tratar os dados, e então criar as tabelas locais com a interface de linha de comando desenvolvida pela infra. Por fim, o colaborador pode submeter a base de dados para revisão, criando um pull request no &lt;a href="https://github.com/basedosdados/mais/pulls"&gt;Github&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Após o pull request de revisão entra em ação o sistema de &lt;strong&gt;Pesos e Contrapesos&lt;/strong&gt;, com o time de dados checando a qualidade dos dados e metadados. Esse ponto é crucial para manter a qualidade dos dados, um diferencial da BD. O time de infra atua procurando automatizar o máximo possível o processo de revisão dos dados, realizando a validação de metadados como descrições e nomes de colunas; e tipos de dados, como dados chaves primárias.&lt;/p&gt;

&lt;p&gt;Após a checagem dos dados, o pull request de inserção de dados é aprovado e os dados entram no &lt;strong&gt;Ambiente de Produção&lt;/strong&gt;. Logo podem ser acessados por uma de nossas ferramentas, como os pacotes em Python e R, ou diretamente pelo BigQuery.&lt;/p&gt;

&lt;p&gt;Paralelamente ao processo de inserção de dados, o time de Infra também trabalha com a renovação do portal, visando oferecer uma interface moderna.&lt;/p&gt;

&lt;h2&gt;
  
  
  Contribuindo com dados
&lt;/h2&gt;

&lt;p&gt;No caminho para se tornar um analista de dados ou desenvolvedor nos encontramos com certas dificuldades de entrada no mercado de trabalho. Por vezes não existe balanço entre estudo e aplicação prática, ou somente a análise de dados de brincadeirinha. Levante a mão quem não passou uma época paralisado em bases de dados como o Titanic ou Iris. E apesar dessas bases de dados serem uma boa alternativa para aprender novos métodos ou ferramentas, o conhecimento obtido trabalhando com as mesmas não é transferível para o mundo real.&lt;/p&gt;

&lt;p&gt;Uma boa alternativa para lidar com dados reais e melhorar seu portfólio é ajudar a Base dos Dados com a sua ingestão de dados.  No mínimo você irá lidar com a captura de dados, preferencialmente de forma automatizada, a arquitetura de dados, e a limpeza dos mesmos. Também vai interagir com ferramentas do dia a dia de um cientista de dados, como interfaces de linha de comando, YAML e BigQuery. A experiência conquistada pode ser crucial na entrada no mercado de trabalho.&lt;/p&gt;

&lt;p&gt;Descrevemos em detalhes esse processo em &lt;a href="https://basedosdados.github.io/mais/colab_data/"&gt;Colaborando com dados na BD+&lt;/a&gt;. Em resumo o processo é dividido em quatro partes. Inicialmente você informa seu interesse para a BD. Então limpa e trata os dados. Em seguida realiza upload dos dados em seu BigQuery pessoal. E por fim envia os dados para revisão.&lt;/p&gt;

&lt;h2&gt;
  
  
  Contribuindo com a infra
&lt;/h2&gt;

&lt;p&gt;Aliás, outra forma de contribuir e melhorar seu portfólio, mas agora de desenvolvedor, é colaborando com a infraestrutura da BD.&lt;/p&gt;

&lt;p&gt;A colaboração começa conversando conosco, no bate papo da infra ou nas reuniões às 19h da segunda-feira, ambas nos canais da infra no &lt;a href="https://discord.gg/huKWpsVYx4"&gt;Discord&lt;/a&gt;. Após este passo podemos escolher uma feature ou problema para desenvolvimento, isto é, caso ainda não tenha escolhido algum problema contido nas &lt;a href="https://github.com/basedosdados/mais/issues"&gt;issues&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Como você pode colaborar? Aqui estão algumas ideias:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adicionando novos conjuntos de dados &lt;/li&gt;
&lt;li&gt;Fazendo a revisão de submissões de dados&lt;/li&gt;
&lt;li&gt;Aprimorando e criando novas funcionalidades do pacote em Python&lt;/li&gt;
&lt;li&gt;Aprimorando e criando novas funcionalidades do pacote em R&lt;/li&gt;
&lt;li&gt;Criando um pacote em Stata&lt;/li&gt;
&lt;li&gt;Adicionando checagens automáticas de dados&lt;/li&gt;
&lt;li&gt;Adicionando checagens automáticas de metadados&lt;/li&gt;
&lt;li&gt;Desenvolvendo novas features para o site&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Nosso projeto já te ajudou de alguma forma?&lt;/strong&gt; Saiba como nos ajudar:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://apoia.se/basedosdados"&gt;Apoie o projeto&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://basedosdados.github.io/mais/colab_data/"&gt;Seja um(a) colaborador(a) de dados na BD&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/basedosdados/mais"&gt;Colabore com nossos pacotes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Compartilhe nas redes sociais!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Texto produzido por &lt;a href="https://github.com/vncsna"&gt;Vinicius&lt;/a&gt; e &lt;a href="https://github.com/fernandascovino"&gt;Fernanda&lt;/a&gt;, com apoio de &lt;a href="https://github.com/d116626"&gt;Diego&lt;/a&gt;, &lt;a href="https://github.com/JoaoCarabetta"&gt;João&lt;/a&gt; e &lt;a href="https://github.com/Hellcassius"&gt;Caio&lt;/a&gt; do time de Infra da Base dos Dados 💚&lt;/p&gt;

</description>
      <category>database</category>
      <category>systems</category>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Base dos Dados Python 102</title>
      <dc:creator>Vinicius Aguiar</dc:creator>
      <pubDate>Sat, 08 May 2021 15:51:41 +0000</pubDate>
      <link>https://dev.to/basedosdados/base-dos-dados-python-102-50k0</link>
      <guid>https://dev.to/basedosdados/base-dos-dados-python-102-50k0</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Neste artigo vamos demonstrar como acessar os dados do BD+ via consulta SQL pelo pacote da Base Dos Dados em Python. Em seguida vamos continuar as análises do post &lt;a href="https://dev.to/basedosdados/base-dos-dados-python-101-3l6-temp-slug-495353"&gt;Base Dos Dados Python 101&lt;/a&gt;, sobre a popularidade de nomes nos estados brasileiros, com o conteúdo adaptado do &lt;a href="https://www.youtube.com/watch?v=wI2xEioDPgM" rel="noopener noreferrer"&gt;Workshop "Brincando com dados da BD+ em Python"&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;geopandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;gpd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;basedosdados&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;bd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Consultas SQL na BD+
&lt;/h2&gt;

&lt;p&gt;Já vimos em um artigo anterior que podemos acessar a BD+ pela função &lt;code&gt;read_table&lt;/code&gt;, capaz de carregar uma tabela por vez. Hoje vamos acessar a BD+ por meio de consultas SQL, com a função &lt;code&gt;read_sql&lt;/code&gt;. Recomendamos essa opção, por contar com toda a capacidade da linguagem SQL, acesso amplo ao acervo da BD+ e retorno em formato de dataframe. Além do mais, sua sintaxe é análoga a &lt;code&gt;read_table&lt;/code&gt;. Isto é, precisamos apenas fornecer uma consulta válida e um id de projeto do Google Cloud.&lt;/p&gt;

&lt;p&gt;A seguir ilustramos uma consulta de dados da tabela de nomes brasileiros do IBGE (2010) cruzada com diretórios brasileiros, com fim de obter o nome mais popular de cada estado. Observe que precisamos fornecer o nome do datalake (basedosdados), dataset (br_bd_diretorios_brasil) e tabela (municipio).&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="s"&gt;
    SELECT estado.uf AS state, estado.municipio AS town, nomes.nome AS name, nomes.qtde_nascimentos_ate_2010 AS freq
    FROM  basedosdados.br_ibge_nomes_brasil.quantidade_municipio_nome_2010 AS nomes
    INNER JOIN basedosdados.br_bd_diretorios_brasil.municipio AS estado ON nomes.id_municipio = estado.id_municipio
    &lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;billing_project_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;basedosdados42&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Após lermos os dados realizamos um pré-processamento. Neste processo, eliminamos nomes únicos ou erros em municípios, como: Awo, Yorane e Zilneide. Por fim, também calculamos a frequência dos nomes nos estados por 100.000 pessoas.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;duplicated&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keep&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="c1"&gt;# seleciona nomes que aparecem mais de uma vez
&lt;/span&gt;
&lt;span class="n"&gt;total_by_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;freq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# agrupa o total por estado
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_by_state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffixes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;_total_by_state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# une os dados com total por estado
&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;freq_per_100k&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100_000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;freq&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;freq_total_by_state&lt;/span&gt; &lt;span class="c1"&gt;# cria o total por estado por 100k pessoas
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Em que estado Fred é mais famoso?
&lt;/h2&gt;

&lt;p&gt;Nossa primeira questão de análise é descobrir como está distribuída a frequência de um nome nos estados brasileiros. Em particular gostaríamos de saber em qual estado Fred, nome de um de nossos desenvolvedores, é o nome mais famoso. Então vamos contar a frequência do nome Fred por 100.000 pessoas em cada estado, e em seguida representar este dado com um mapa com gradiente de cores (coroplético). &lt;/p&gt;

&lt;p&gt;Para isso, previamente instalamos a biblioteca &lt;code&gt;geopandas&lt;/code&gt; para auxiliar o desenho do mapa e manualmente criamos um mapeamento entre o nome de um estado e sua abreviação nos dados geográficos &lt;code&gt;state2abbr&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;state2abbr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Acre&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.AC&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Alagoas&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.AL&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Amapá&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.AP&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Amazonas&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.AM&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Bahia&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.BA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Ceará&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.CE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Distrito Federal&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.DF&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Espírito Santo&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.ES&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Goiás&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.GO&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Maranhão&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.MA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Mato Grosso&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.MT&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Mato Grosso do Sul&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.MS&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Minas Gerais&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.MG&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Pará&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.PA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Paraíba&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.PB&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Paraná&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.PR&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Pernambuco&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.PE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Piauí&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.PI&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Rio de Janeiro&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.RJ&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Rio Grande do Norte&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.RN&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Rio Grande do Sul&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.RS&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Rondônia&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.RO&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Roraima&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.RR&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Santa Catarina&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.SC&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;São Paulo&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.SP&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Sergipe&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.SE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Tocantins&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BR.TO&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Em seguida criamos a função &lt;code&gt;visualize_name&lt;/code&gt; que filtra os dados do nome, agrupa a frequência por estados e desenha o mapa.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;visualize_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name == &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# filtra o nome escolhido
&lt;/span&gt;    &lt;span class="n"&gt;name_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;as_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# agrupa pelos estados
&lt;/span&gt;    &lt;span class="n"&gt;name_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state2abbr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="c1"&gt;# abrevia os estados
&lt;/span&gt;    &lt;span class="n"&gt;name_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;HASC_1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="c1"&gt;# renomeia a coluna
&lt;/span&gt;
    &lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;br_map&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gpd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;./shapes/gadm36_BRA_1.shp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# lê o mapa
&lt;/span&gt;    &lt;span class="n"&gt;br_map&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;br_map&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;HASC_1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;how&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;outer&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# une o mapa com os dados
&lt;/span&gt;    &lt;span class="n"&gt;br_map&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;column&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;freq_per_100k&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# desenha o gráfico
&lt;/span&gt;
&lt;span class="nf"&gt;visualize_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Fred&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsj44r5wocg1kbeqcs3o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsj44r5wocg1kbeqcs3o.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Interessante notar como Fred parece um nome comum em estados como Amazonas e Roraima, devido a população nestes estados ser menor quando comparada com São Paulo.&lt;/p&gt;
&lt;h2&gt;
  
  
  Quais são os nomes mais regionais?
&lt;/h2&gt;

&lt;p&gt;Essa é uma boa pergunta a se fazer: qual nome é mais exclusivo de uma região? Ou seja, é mais provável de ser encontrado numa determinada região e não em outros estados.&lt;/p&gt;

&lt;p&gt;Vamos solucionar esta dúvida agrupando a frequência de nomes dos municípios em estados. Em seguida excluímos os nomes que aparecem apenas em um estado. Para então calcular a mediana e desvio padrão das frequências de cada nome distinto, e por fim normalizamos as frequências de nomes.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;state_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;as_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# soma as frequências de municípios
&lt;/span&gt;&lt;span class="n"&gt;state_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;duplicated&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keep&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="c1"&gt;# elimina nomes não duplicados
&lt;/span&gt;
&lt;span class="n"&gt;median&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;as_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;freq_per_100k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;state_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;median&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffixes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;_median&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;std&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;as_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;freq_per_100k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;std&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;state_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffixes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;_std&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;freq_per_100k_zscore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;freq_per_100k&lt;/span&gt;
&lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;freq_per_100k_zscore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;freq_per_100k_median&lt;/span&gt;
&lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;freq_per_100k_zscore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;freq_per_100k_std&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Podemos então visualizar quais são os nomes mais regionais, procurando pela máxima frequência normalizada de cada estado.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;idxmax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;freq_per_100k_zscore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]].&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;idxmax&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;idxmax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idxmax&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ravel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;max_name_by_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idxmax&lt;/span&gt;&lt;span class="p"&gt;][[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;freq_per_100k_zscore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;max_name_by_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_name_by_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkm1mbxkqo90jo3z3556.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkm1mbxkqo90jo3z3556.PNG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Para aprender como funciona, nada melhor que brincar com o código. Agora é com você!! Como ficou seu nome no ranking? Em qual estado aparece com maior frequência? Fez uma análise da sua família e descobriu alguma coisa legal? Conta pra gente no twitter!&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Confira o notebook com a análise completa no nosso repositório:&lt;/strong&gt;&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/basedosdados" rel="noopener noreferrer"&gt;
        basedosdados
      &lt;/a&gt; / &lt;a href="https://github.com/basedosdados/analises" rel="noopener noreferrer"&gt;
        analises
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      📊 Repositório de códigos simples e replicáveis das análises publicadas. 
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;Texto produzido por &lt;a href="https://github.com/vncsna" rel="noopener noreferrer"&gt;Vinicius&lt;/a&gt;, &lt;a href="https://github.com/fernandascovino" rel="noopener noreferrer"&gt;Fernanda&lt;/a&gt; e Paolo, da Base dos Dados 💚&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
      <category>beginners</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Razões para se tornar um Alquimista I</title>
      <dc:creator>Vinicius Aguiar</dc:creator>
      <pubDate>Tue, 04 May 2021 18:19:08 +0000</pubDate>
      <link>https://dev.to/computandoarte/razoes-para-se-tornar-um-alquimista-i-33gh</link>
      <guid>https://dev.to/computandoarte/razoes-para-se-tornar-um-alquimista-i-33gh</guid>
      <description>&lt;h3&gt;
  
  
  Introdução a poções e receitas em Elixir
&lt;/h3&gt;

&lt;p&gt;Desenvolver em Elixir tem se mostrado uma experiência agradável e gostaria de compartilhar um pouco do meu aprendizado. Nesta série de posts vou apresentar as razões que me motivaram a continuar estudando esta linguagem e seu ecossistema. E neste post em específico vou apresentar o operador Pipe e Pattern Matching.&lt;/p&gt;




&lt;h3&gt;
  
  
  Digressões
&lt;/h3&gt;

&lt;p&gt;No final da minha última entrevista de emprego (em que não obtive sucesso) o gestor perguntou como eu enxergava meu futuro no cargo e o que gostaria de aprender. Para surpresa de ninguém respondi que gostaria de não estar trabalhando na área e sim estudando algo mais interessante (quem iria imaginar que eu não iria passar no processo seletivo hein), como processing ou programação funcional. Sem entrar no mérito da área ser interessante ou não, afinal é uma questão de preferência, foi uma péssima resposta que me fez refletir melhor se gostaria de trabalhar na área.&lt;/p&gt;

&lt;p&gt;Para entender melhor quais seriam meus próximos passos resolvi estudar uma linguagem funcional, e Elixir foi a escolhida. Esses são meus primeiros passos neste paradigma, e não é uma assunto simples. Citando &lt;strong&gt;Joe Armstrong&lt;/strong&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"When you start working with a new programming language, on the&lt;br&gt;
surface it seems that all you are doing is learning a new anguage. But at a deeper level, you are doing something much more profound - you are learning a new way of thinking."&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  Pipe
&lt;/h3&gt;

&lt;p&gt;O pipe é um operador que passa o resultado de uma expressão como primeiro parâmetro da próxima expressão, de acordo com a &lt;a href="https://elixirschool.com/en/lessons/basics/pipe-operator/"&gt;Elixir School&lt;/a&gt;. É bem conveniente quando precisamos passar um valor por uma série operações em cadeia, provendo uma forma de expressarmos melhor o processo. Através do pipe podemos evitar códigos escritos desta forma:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tSakhzvA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qmiq9b2c7aif05k5oedg.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tSakhzvA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qmiq9b2c7aif05k5oedg.PNG" alt="Série de operações em Elixir sem pipe"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;E abrir espaço para forma mais elegante:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wREUeqYn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/oq4kmeq2s2i8htnzmjj1.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wREUeqYn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/oq4kmeq2s2i8htnzmjj1.PNG" alt="Série de operações em Elixir com pipe"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note que o pipe é um operador bastante conhecido na comunidade de programação, já foi previamente adotado por F#, Ocaml e Julia. E há pouco tempo atrás o operador foi adotado como síntaxe nativa na linguagem R, apesar de ser utilizado anteriormente. Como programador usual de Python, eu sinto que gostaria de uma feature semelhante.&lt;/p&gt;




&lt;h3&gt;
  
  
  Pattern Matching
&lt;/h3&gt;

&lt;p&gt;Segundo a &lt;a href="https://elixirschool.com/pt/lessons/basics/pattern-matching/"&gt;Elixir School&lt;/a&gt;, em Elixir o operador de igualdade ou atribuição (=) é na verdade o operador de match. Ele compara os lados esquerdo e direito da expressão, tentando igualar os mesmos, e caso consiga retorna o valor da expressão. Desta forma ao compararmos uma variável não declarada previamente a um valor, atribuímos o valor a variável.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--aqCiJytB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ce9jbl4lh0f7jpf8qu0u.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--aqCiJytB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ce9jbl4lh0f7jpf8qu0u.PNG" alt="Pattern Matching com um tipo básico"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Apesar de parecer exatamente igual ao operador de atribuição clássico neste exemplo, pattern matching permite uma gama de operações interessantes. Por exemplo, é possível extrair valores de bearer tokens com:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--InIOmOLg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ob4rk36ryfdguhwby5vy.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--InIOmOLg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ob4rk36ryfdguhwby5vy.PNG" alt="Pattern Matching de um bearer token"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;E aplicar pattern matching no nome de funções:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--oDCa3STr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/103q4uxldtgjg1r2bkpd.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oDCa3STr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/103q4uxldtgjg1r2bkpd.PNG" alt="Pattern Matching na assinatura de uma função"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No primeiro exemplo a expressão à esquerda realiza o pattern matching com a entrada da forma ["Bearer password"]. Desta forma extraio o token e em seguida retorno uma tupla com uma flag de sucesso e o token. Já o segundo exemplo mostra uma forma de declarar a mesma função com entradas distintas. Desta forma evitamos a necessidade de criar funções com assinaturas distintas ou utilizar estruturas de controle dentro das funções.&lt;/p&gt;

&lt;p&gt;A capacidade de pattern matching se extende muito além dos exemplos apresentados, e em particular acho muito prático para extrair campos específicos de structs. Para conhecer melhor suas capacidades eu indico o próprio &lt;a href="https://elixir-lang.org/getting-started/pattern-matching.html"&gt;Getting Started da Elixir&lt;/a&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  Bônus: O tema drácula para Elixir do VSCode
&lt;/h3&gt;

&lt;p&gt;Uma sólida escolha para visualizar o código de noite. Esse tema é uma extensão do VSCode que permite uma boa visualização do código, em particular gosto do highlight amarelo dos módulos.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GjaTDAjb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/963vng1fcsqhdu3jqy8t.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GjaTDAjb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/963vng1fcsqhdu3jqy8t.PNG" alt="Exemplo do Dracula theme for Elixir"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  E por onde estou aprendendo?
&lt;/h3&gt;

&lt;p&gt;Comecei a jornada pelo livro Learn Functional Programming with Elixir de &lt;a href="https://github.com/ulissesalmeida"&gt;Ulisses Almeida&lt;/a&gt;, e daí segui pra implementação de programas simples com suporte do &lt;a href="https://elixir-lang.org/getting-started/introduction.html"&gt;Getting Started&lt;/a&gt;. Agora estou lendo Programming Phoenix 1.4 de Chris McCord, Bruce Tate e &lt;a href="https://github.com/josevalim"&gt;José Valim&lt;/a&gt; (O criador da linguagem é BR).&lt;/p&gt;

&lt;h3&gt;
  
  
  E quem pode ajudar?
&lt;/h3&gt;

&lt;p&gt;A comunidade é bem ativa e prestativa, apesar das perguntas meio bobas que me vem em mente. (Não se engane, existem perguntas ruins, mas nunca deixe de perguntar). Recomendo conversar em &lt;a href="https://t.me/elug_ce"&gt;Elug CE&lt;/a&gt;, &lt;a href="https://telegram.me/elixirbr"&gt;Elixir BR&lt;/a&gt; e &lt;a href="https://twitter.com/elixir_utfpr"&gt;Elixir UTFPR&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agradecimentos
&lt;/h3&gt;

&lt;p&gt;Obrigado.&lt;/p&gt;

</description>
      <category>elixir</category>
      <category>functional</category>
    </item>
  </channel>
</rss>
