<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Matt Grofsky</title>
    <description>The latest articles on DEV Community by Matt Grofsky (@code_munkee).</description>
    <link>https://dev.to/code_munkee</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F510496%2F48806bae-ced3-4c75-b2f0-8d47b27ef8ae.jpg</url>
      <title>DEV Community: Matt Grofsky</title>
      <link>https://dev.to/code_munkee</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/code_munkee"/>
    <language>en</language>
    <item>
      <title>Google AI Vision &amp; Text to Speech on a Raspberry Pi</title>
      <dc:creator>Matt Grofsky</dc:creator>
      <pubDate>Wed, 11 Nov 2020 16:05:31 +0000</pubDate>
      <link>https://dev.to/code_munkee/google-ai-vision-text-to-speech-on-a-raspberry-pi-48en</link>
      <guid>https://dev.to/code_munkee/google-ai-vision-text-to-speech-on-a-raspberry-pi-48en</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--okNZRtYz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/t5yol44bdvayp8759jfe.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--okNZRtYz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/t5yol44bdvayp8759jfe.jpg" alt="AI Vision"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As the CTO of &lt;a href="https://www.ytel.com"&gt;Ytel, Inc.&lt;/a&gt;, I work a lot with communications technology and machine learning. MMS is the de facto standard for sending photos back and forth on a mobile device, outside of downloaded OTT applications.&lt;/p&gt;

&lt;p&gt;RCS is now starting to appear on mobile devices, and media sharing is expected to accelerate. I thought it would be interesting to see how hard it would be to build an IoT-style device, outside of Google Cloud Platform proper, that can interact with some of Google’s prebuilt AI models and interpret this media.&lt;/p&gt;

&lt;p&gt;Below, I provide the tools and code to build a demo that takes a photo of a scene, analyzes it, and then speaks back the results.&lt;/p&gt;

&lt;p&gt;To fully build out the proof of concept you will need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A Raspberry Pi&lt;/li&gt;
&lt;li&gt;A Raspberry Pi Camera&lt;/li&gt;
&lt;li&gt;A Google Cloud Platform account&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first step is to make sure you have Python 3.7.x or higher installed on the Pi and create a requirements.txt with the following dependencies.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;google&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;cloud&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;vision&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;google&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;cloud&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;texttospeech&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;2.2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;picamera&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;1.13&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Next, let’s build out the application.&lt;/p&gt;

&lt;p&gt;In your main.py, declare your imports, provide your GCP credentials, and instantiate your Google SDK clients. Your credentials will reference a JSON file, and it should have permissions to the Cloud Vision and Cloud Text-to-Speech APIs.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;picamera&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;google.cloud&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;vision&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;google.cloud&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;texttospeech&lt;/span&gt;

&lt;span class="c1"&gt;# Needs permission for Cloud Vision API and Cloud Text-to-Speech API
&lt;/span&gt;
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"GOOGLE_APPLICATION_CREDENTIALS"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"YourServiceAccount.json"&lt;/span&gt;
&lt;span class="n"&gt;client_vision&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vision&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ImageAnnotatorClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;client_tts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;texttospeech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextToSpeechClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;To analyze a photo, you first have to take a picture. The beautiful thing about a Raspberry Pi camera is that this is a simple task. &lt;/p&gt;

&lt;p&gt;Once your camera is plugged in and enabled in the Raspberry Pi configuration (raspi-config), use the PiCamera library to take a photo. Below is a simple function for taking a picture with PiCamera.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;takephoto&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;camera&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;picamera&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PiCamera&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;camera&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resolution&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;768&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Show me a quick preview before snapping the photo (If you have a monitor)
&lt;/span&gt;
    &lt;span class="n"&gt;camera&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_preview&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Take the photo
&lt;/span&gt;    &lt;span class="n"&gt;camera&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;capture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'image.jpg'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The primary function executes &lt;code&gt;takephoto()&lt;/code&gt;, which writes an image.jpg file to the local drive. The file is then read into memory, sent through the Cloud Vision SDK, and analyzed by Google’s Cloud Vision AI service.&lt;/p&gt;

&lt;p&gt;In this instance, I chose to use the label_detection feature to help identify objects in the photo. The service also has separate functions to recognize the existence of faces, famous logos, and more. For some detailed info on what it can do, visit the official &lt;a href="https://cloud.google.com/vision/docs/labels"&gt;Google Cloud Vision AI&lt;/a&gt; docs page.&lt;/p&gt;

&lt;p&gt;The text-to-speech step utilizes SSML and Google’s premium WaveNet voices. I don’t fully use SSML in the example below, but documentation highlighting some of its deeper capabilities is available &lt;a href="https://cloud.google.com/text-to-speech/docs/ssml"&gt;here&lt;/a&gt;.&lt;/p&gt;
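&lt;p&gt;For illustration, here is a minimal SSML payload (the wording is hypothetical) of the kind the service accepts; it could be supplied by constructing the input with &lt;code&gt;texttospeech.SynthesisInput(ssml=...)&lt;/code&gt; instead of the plain-text form used in the code below:&lt;/p&gt;

```xml
&lt;speak&gt;
  I see: &lt;break time="300ms"/&gt;
  &lt;emphasis level="moderate"&gt;a dog, grass, and sky&lt;/emphasis&gt;
&lt;/speak&gt;
```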

&lt;p&gt;As for the voices, I highly recommend Google WaveNet &lt;a href="https://cloud.google.com/text-to-speech/docs/voices"&gt;voices&lt;/a&gt; for any TTS application that demands near-human-quality synthesis.&lt;/p&gt;

&lt;p&gt;The speech is streamed back and stored as an MP3 file on the local drive. Once saved, mpg123 plays the MP3 over any speaker hooked up to the Raspberry Pi. If you have not done so already, install it with the &lt;code&gt;apt install mpg123&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;takephoto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'image.jpg'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'rb'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;image_file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image_file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vision&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client_vision&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label_detection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client_vision&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label_annotations&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Labels:'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;synthesis_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;''&lt;/span&gt;

        &lt;span class="c1"&gt;# Make a simple comma delimited string type sentence.
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;synthesis_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;', '&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;synthesis_input&lt;/span&gt;

        &lt;span class="n"&gt;synthesis_in&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;texttospeech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SynthesisInput&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;synthesis_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Let's make this a premium Wavenet voice in SSML
&lt;/span&gt;        &lt;span class="n"&gt;voice&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;texttospeech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VoiceSelectionParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;language_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"en-US"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"en-US-Wavenet-A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;ssml_gender&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;texttospeech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SsmlVoiceGender&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MALE&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Select the type of audio file you want returned
&lt;/span&gt;        &lt;span class="n"&gt;audio_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;texttospeech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AudioConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;audio_encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;texttospeech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AudioEncoding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MP3&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Perform the text-to-speech request on the text input with the     selected
&lt;/span&gt;        &lt;span class="c1"&gt;# voice parameters and audio file type
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client_tts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;synthesize_speech&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;synthesis_in&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;voice&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;voice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;audio_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;audio_config&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# The response's audio_content is binary.
&lt;/span&gt;        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"output.mp3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"wb"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Write the response to the output file.
&lt;/span&gt;            &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;audio_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Audio content written to file "output.mp3"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"output.mp3"&lt;/span&gt;
        &lt;span class="c1"&gt;# apt install mpg123
&lt;/span&gt;        &lt;span class="c1"&gt;# Save the audio file to the dir
&lt;/span&gt;        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"mpg123 "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;'__main__'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
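&lt;p&gt;As an aside, two details of the script above can be tightened: the comma-building loop reverses the label order and leaves a trailing separator, and passing a filename straight into &lt;code&gt;os.system&lt;/code&gt; breaks on paths containing spaces. The sketch below (helper names are hypothetical, and a stand-in &lt;code&gt;Label&lt;/code&gt; class replaces the real Vision annotations) shows one way to handle both:&lt;/p&gt;

```python
from dataclasses import dataclass
import shlex


@dataclass
class Label:
    """Stand-in for a Cloud Vision label annotation."""
    description: str


def labels_to_sentence(labels):
    # Join descriptions in detection order, with no trailing separator.
    return ', '.join(label.description for label in labels)


def play_command(path):
    # Quote the path so the shell command survives spaces in filenames.
    return 'mpg123 ' + shlex.quote(path)


labels = [Label('Dog'), Label('Grass'), Label('Sky')]
print(labels_to_sentence(labels))   # Dog, Grass, Sky
print(play_command('output.mp3'))   # mpg123 output.mp3
```

&lt;p&gt;In &lt;code&gt;main()&lt;/code&gt;, these would slot in as &lt;code&gt;texttospeech.SynthesisInput(text=labels_to_sentence(labels))&lt;/code&gt; and &lt;code&gt;os.system(play_command("output.mp3"))&lt;/code&gt;.&lt;/p&gt;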


&lt;p&gt;All code for this tutorial is on GitHub. Feel free to take it and modify it into something better…stronger…faster. 💪&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--i3JOwpme--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/mgrofsky"&gt;
        mgrofsky
      &lt;/a&gt; / &lt;a href="https://github.com/mgrofsky/GoogleAI-Pi"&gt;
        GoogleAI-Pi
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Google AI Vision &amp;amp; Speech on a Raspberry Pi
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
GoogleAI-Pi&lt;/h1&gt;
&lt;p&gt;Google AI Vision &amp;amp; Speech on a Raspberry Pi&lt;/p&gt;
&lt;p&gt;A python demo that will take a photo of a scene, analyze it, and then speak back the results.&lt;/p&gt;
&lt;p&gt;This demo is for use on a Raspberry Pi with a Pi Camera attachment.&lt;/p&gt;
&lt;p&gt;A full breakdown of requirements can be found at:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://medium.com/@mgrofsky/google-ai-vision-text-to-speech-on-a-raspberry-pi-875dc13b3d73" rel="nofollow"&gt;https://medium.com/@mgrofsky/google-ai-vision-text-to-speech-on-a-raspberry-pi-875dc13b3d73&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;

  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/mgrofsky/GoogleAI-Pi"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



</description>
      <category>machinelearning</category>
      <category>googlecloud</category>
    </item>
    <item>
      <title>Building Scaleable .NET Apps Without Windows</title>
      <dc:creator>Matt Grofsky</dc:creator>
      <pubDate>Mon, 09 Nov 2020 15:20:31 +0000</pubDate>
      <link>https://dev.to/code_munkee/building-scaleable-net-apps-without-windows-2hoi</link>
      <guid>https://dev.to/code_munkee/building-scaleable-net-apps-without-windows-2hoi</guid>
      <description>&lt;p&gt;“I built a .NET Web App on macOS in Visual Studio and deployed a Linux Docker container to Google App Engine.”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F5000%2F1%2AsN0W0Ji2er3Lgd7YirLBhQ%402x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F5000%2F1%2AsN0W0Ji2er3Lgd7YirLBhQ%402x.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is a phrase you don’t hear too often, if at all. .NET Core was released only a few years ago. The open-source project freed .NET developers from the Windows platform and allowed Microsoft to expand into what was once unfriendly territory. Fast forward to current technology trends, and you will find .NET developers branching out to Linux and Mac-based machines. This once-paradoxical development flow led me on a personal path to see how these technologies would fit into my Google Cloud, Mac-centric world. As it turns out, it feels quite natural.&lt;/p&gt;

&lt;p&gt;Assumptions about which cloud provider is right for which purpose have persisted for years. When engineers think about deploying a Windows web application in the cloud, they automatically think of Microsoft Azure. When data scientists want best-of-breed machine learning and analytics, they automatically think of Google Cloud. Amazon AWS is known for its global reach, maturity, and reputation. Over the past few years, these lines of distinction for building cloud-native applications have started to blur, not just among cloud providers, but within the tools and operating systems used to create them. What follows is a brief breakdown of how you or your team can start working with .NET outside the Microsoft Windows ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deploying .NET in Google App Engine&lt;/strong&gt;&lt;br&gt;
For those who are unfamiliar with App Engine: it is a fully managed, serverless application platform that builds on Google’s years of experience running resilient, scalable architectures. It is an excellent solution if you want to start playing with .NET using Docker and don’t want to deal with Kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://visualstudio.microsoft.com/vs/mac/" rel="noopener noreferrer"&gt;Download&lt;/a&gt; Visual Studio for Mac&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://console.cloud.google.com/getting-started" rel="noopener noreferrer"&gt;Sign up&lt;/a&gt; for a Google Cloud Platform account. You will get $300 in free usage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://cloud.google.com/sdk/docs/downloads-interactive" rel="noopener noreferrer"&gt;Install&lt;/a&gt; the Google Cloud SDK.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://console.cloud.google.com/projectselector2/home/dashboard?_ga=2.16975289.-1403532037.1573498213" rel="noopener noreferrer"&gt;Create&lt;/a&gt; a project in Google Cloud.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open Terminal in Mac and run the following command to set your default project:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcloud confid set project &amp;lt;PROJECT-NAME&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The following link contains &lt;a href="https://github.com/mgrofsky/NET_Docker_Google-_App_Engine" rel="noopener noreferrer"&gt;sample code&lt;/a&gt; for those interested in trying out a pre-built Visual Studio Solution. Included is one example each for deploying a Docker container to Google App Engine as well as Kubernetes.&lt;/p&gt;

&lt;p&gt;Visual Studio comes with the Dockerfile templates needed to help any developer jump into building a container-ready application. When created, the Dockerfile will expose the necessary ports so that the web app is reachable. One essential item: when deploying a custom runtime, the App Engine front end routes incoming requests to the appropriate module on port 8080, so you must be sure that your application code is listening on that port.&lt;/p&gt;

&lt;p&gt;The default &lt;strong&gt;EXPOSE&lt;/strong&gt; configuration when adding Docker support to a .NET web application is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;FROM mcr.microsoft.com/dotnet/core/aspnet:2.2-stretch-slim AS base&lt;/span&gt;
    &lt;span class="s"&gt;WORKDIR /app&lt;/span&gt;
    &lt;span class="s"&gt;EXPOSE &lt;/span&gt;&lt;span class="m"&gt;80&lt;/span&gt;
    &lt;span class="s"&gt;EXPOSE &lt;/span&gt;&lt;span class="m"&gt;443&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;You will want to change this to:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;FROM mcr.microsoft.com/dotnet/core/aspnet:2.2-stretch-slim AS base&lt;/span&gt;
    &lt;span class="s"&gt;WORKDIR /app&lt;/span&gt;
    &lt;span class="s"&gt;EXPOSE &lt;/span&gt;&lt;span class="m"&gt;8080&lt;/span&gt;
    &lt;span class="s"&gt;ENV ASPNETCORE_URLS=http://*:8080&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Everything else will remain the same. You may then go ahead and create your .NET web application as usual. The final step before deploying to App Engine is to specify the custom runtime. Create a file called app.yaml and place it in your root directory.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="na"&gt;runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;custom&lt;/span&gt;
    &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;flex&lt;/span&gt;
    &lt;span class="na"&gt;manual_scaling&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;instances&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
      &lt;span class="na"&gt;memory_gb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;
      &lt;span class="na"&gt;disk_size_gb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;

    &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;service-test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The sample app.yaml above incurs costs to run on the App Engine flexible environment. The settings are to reduce costs during testing and are not appropriate for production use. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remember that the flex environment does not scale down to 0 and could become costly if you supply the wrong resources and forget about it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For more information, see: &lt;a href="https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml" rel="noopener noreferrer"&gt;https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Terminal on your Mac, browse to the root directory containing your app.yaml file and run the command:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud app deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;You can preview your running application by browsing to &lt;strong&gt;&lt;em&gt;App Engine &amp;gt; Services&lt;/em&gt;&lt;/strong&gt; in the Google Cloud Platform Console.&lt;/p&gt;

&lt;p&gt;You just deployed a highly resilient and scalable .NET application without running a Windows Client machine or bringing up any Windows Servers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sample Visual Studio Solution&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/mgrofsky" rel="noopener noreferrer"&gt;
        mgrofsky
      &lt;/a&gt; / &lt;a href="https://github.com/mgrofsky/NET_Docker_Google-_App_Engine" rel="noopener noreferrer"&gt;
        NET_Docker_Google-_App_Engine
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;.NET Web App in Google App Engine Flexible w/ Docker Support&lt;/h2&gt;
&lt;/div&gt;




&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;app.yaml&lt;/h3&gt;
&lt;/div&gt;

&lt;p&gt;This is required to deploy to GAE Flexible. Runtime will be &lt;code&gt;custom&lt;/code&gt; and env will be &lt;code&gt;flex&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can either add your service into the &lt;code&gt;app.yaml&lt;/code&gt; or specify it in the &lt;code&gt;gcloud app deploy&lt;/code&gt; command.&lt;/p&gt;

&lt;div class="snippet-clipboard-content notranslate position-relative overflow-auto"&gt;&lt;pre class="notranslate"&gt;&lt;code&gt;runtime: custom
env: flex

# This sample incurs costs to run on the App Engine flexible environment. 
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10

service: matt-test
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Remember that the flex environment does not scale down to 0 and could become costly if you supply the wrong resources and forget about it.&lt;/strong&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Dockerfile&lt;/h3&gt;

&lt;/div&gt;

&lt;p&gt;One key item here is that when creating and deploying a custom runtime, the App Engine front end will route incoming requests to…&lt;/p&gt;
&lt;/div&gt;


&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/mgrofsky/NET_Docker_Google-_App_Engine" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;


&lt;p&gt;Translated for Ukrainian audiences:&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="https://medium.com/temy-ukraine/%D1%81%D1%82%D0%B2%D0%BE%D1%80%D0%B5%D0%BD%D0%BD%D1%8F-%D0%BC%D0%B0%D1%81%D1%88%D1%82%D0%B0%D0%B1%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D1%85-%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC-net-%D0%B1%D0%B5%D0%B7-windows-31ddacac5dba" class="ltag__link__link" rel="noopener noreferrer"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fv2%2Fresize%3Afill%3A88%3A88%2F0%2AGcOXQAriwSkf935K.jpg" alt="Andrew Sheludenkov"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://medium.com/temy-ukraine/%D1%81%D1%82%D0%B2%D0%BE%D1%80%D0%B5%D0%BD%D0%BD%D1%8F-%D0%BC%D0%B0%D1%81%D1%88%D1%82%D0%B0%D0%B1%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D1%85-%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC-net-%D0%B1%D0%B5%D0%B7-windows-31ddacac5dba" class="ltag__link__link" rel="noopener noreferrer"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Building Scalable .NET Applications Without Windows | by Andrew Sheludenkov | Temy.co Ukraine | Medium&lt;/h2&gt;
      &lt;h3&gt;Andrew Sheludenkov ・ &lt;time&gt;Oct 7, 2020&lt;/time&gt; ・ 
      &lt;div class="ltag__link__servicename"&gt;
        &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fmedium-f709f79cf29704f9f4c2a83f950b2964e95007a3e311b77f686915c71574fef2.svg" alt="Medium Logo"&gt;
        Medium
      &lt;/div&gt;
    &lt;/h3&gt;
&lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>dotnet</category>
      <category>docker</category>
      <category>googlecloud</category>
    </item>
    <item>
      <title>Analyze Your Call Recordings With Google AI</title>
      <dc:creator>Matt Grofsky</dc:creator>
      <pubDate>Mon, 09 Nov 2020 04:32:44 +0000</pubDate>
      <link>https://dev.to/code_munkee/analyze-your-call-recordings-with-google-ai-2a7h</link>
      <guid>https://dev.to/code_munkee/analyze-your-call-recordings-with-google-ai-2a7h</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GANp5gHP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ajcug1elhpmd9wgg3d9q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GANp5gHP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ajcug1elhpmd9wgg3d9q.png" alt="Analyze Calls"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For most companies, the story usually goes like this.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A customer calls in to complain, praise, or ask for assistance.&lt;/li&gt;
&lt;li&gt;The call is recorded for further training or evaluation.&lt;/li&gt;
&lt;li&gt;The recording is typically picked at random, listened to by someone, and reviewed with the customer service representative.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This process can take anywhere from an hour to a week after a customer hangs up. During this time, a lot can go wrong. Compliance issues and poor service could leave you with some unhappy customers. &lt;/p&gt;

&lt;p&gt;I’ll show you how to work smarter, not harder, and identify problems as soon as they occur. What most developers don’t realize is that the intricate pieces are already pre-built inside the Google Cloud Platform.&lt;/p&gt;

&lt;p&gt;There are three essential items you will want to look for when evaluating a call.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Identity&lt;/strong&gt; — Separate the individuals on the call distinctly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sentiment&lt;/strong&gt; — Are these individuals generally positive or negative in the interaction?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trigger Words&lt;/strong&gt; — Were any words or phrases said that warrant further review?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s complicate this a bit and evaluate single-channel phone call audio. That means we are dealing not only with call-quality audio, but also with audio where both callers are co-mingled in a single channel, which makes it much harder to distinguish who is talking and when.&lt;/p&gt;

&lt;p&gt;A Google Cloud Function is the easiest way to trigger code execution at scale when a file is uploaded to Cloud Storage. &lt;a href="https://cloud.google.com/functions/docs/tutorials/storage#functions-change-directory-python"&gt;Setting up a Cloud Function&lt;/a&gt; for this purpose is straightforward.&lt;/p&gt;

&lt;p&gt;Let’s first start with the requirements.txt file and imports.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;requirements.txt&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;google&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;cloud&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;1.3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;google&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;cloud&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;storage&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;1.27&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;pathlab&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;imports&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this example, I will be using diarization to distinguish and separate the audio between the two callers. &lt;a href="https://en.wikipedia.org/wiki/Speaker_diarisation#:~:text=Speaker%20diarisation%20(or%20diarization)%20is,according%20to%20the%20speaker%20identity."&gt;Diarization&lt;/a&gt; is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The process of partitioning an input audio stream into homogeneous segments according to the speaker identity&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This process requires the Cloud Speech beta module, &lt;code&gt;speech_v1p1beta1&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;uuid&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;google.cloud&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;speech_v1p1beta1&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;google.cloud.speech_v1p1beta1&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;enums&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;google.cloud&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;storage&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Identifying the created file&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When the Cloud Function is triggered by a &lt;code&gt;google.storage.object.finalize&lt;/code&gt; event inside GCS, a dictionary with data specific to this type of event is sent.&lt;/p&gt;

&lt;p&gt;Grabbing the path of the file name is as easy as pulling out &lt;code&gt;file['name']&lt;/code&gt; from the event &lt;a href="https://cloud.google.com/functions/docs/calling/storage"&gt;dictionary&lt;/a&gt;. Knowing all this, we can build a gs:// URI that can be used across various Google AI services.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;BucketName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'gcs-bucket'&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transcribe_audio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;
    &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;FileName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;storage_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'gs://'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;BucketName&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;'/'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;FileName&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
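&lt;p&gt;As a side note, the hard-coded &lt;code&gt;BucketName&lt;/code&gt; above could instead be read from the event payload itself, since a &lt;code&gt;google.storage.object.finalize&lt;/code&gt; event carries both a &lt;code&gt;bucket&lt;/code&gt; and a &lt;code&gt;name&lt;/code&gt; field. A minimal sketch (the helper name is my own):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def gcs_uri(event):
    # "bucket" and "name" are standard fields of the GCS object
    # metadata dictionary passed to a finalize-triggered function
    return 'gs://' + event['bucket'] + '/' + event['name']

sample_event = {'bucket': 'gcs-bucket', 'name': 'calls/recording.mp3'}
print(gcs_uri(sample_event))  # gs://gcs-bucket/calls/recording.mp3
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;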



&lt;p&gt;&lt;strong&gt;Transcribing the Audio&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before transcribing the audio, I first want to make sure it is an actual audio file. In this example, I am only going to deal with mp3 audio. There are a tremendous number of options to choose from, and I will highlight a few. First, the hertz rate is essential and, more often than not, is 8000 for phone audio recordings. Second, because this is a phone call, Google has a dedicated machine learning model for phone call audio that produces a better transcription overall. Finally, for proper configuration, make sure to enable diarization and set the appropriate number of speakers on the call. If required, adjust your utterance dictionary to pick out specific proper nouns, business names, or phrases that can show up in conversation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="c1"&gt;# Let's process only mp3 files
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;storage_uri&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="s"&gt;".mp3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;speech_v1p1beta1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SpeechClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Sample rate in Hertz of the audio data sent
&lt;/span&gt;        &lt;span class="n"&gt;sample_rate_hertz&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8000&lt;/span&gt;

    &lt;span class="c1"&gt;# The language of the supplied audio
&lt;/span&gt;        &lt;span class="n"&gt;language_code&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"en-US"&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"phone_call"&lt;/span&gt;

    &lt;span class="c1"&gt;# Encoding of audio data sent. This sample sets this explicitly.
&lt;/span&gt;    &lt;span class="c1"&gt;# This field is optional for FLAC and WAV audio formats.
&lt;/span&gt;        &lt;span class="n"&gt;encoding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;enums&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RecognitionConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AudioEncoding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MP3&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;"sample_rate_hertz"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sample_rate_hertz&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"language_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;language_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"encoding"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"use_enhanced"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"enable_automatic_punctuation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"enable_speaker_diarization"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"diarization_speaker_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"speech_contexts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
                &lt;span class="s"&gt;"phrases"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"Thank you for calling ABC"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                &lt;span class="s"&gt;"Thank you for contacting ABC"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"Welcome to ABC"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"ABC customer service"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"Thank you for calling ABC customer support."&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;}]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;audio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"uri"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;storage_uri&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;long_running_recognize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;#print(u"Waiting for operation to complete...")
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;operation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;transcript&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;
        &lt;span class="n"&gt;transcriptw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;
        &lt;span class="n"&gt;sendtrans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="n"&gt;keyword&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Empty Audio"&lt;/span&gt;
        &lt;span class="n"&gt;speaker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;words_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alternatives&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word_info&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words_info&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;speaker_tag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;"0"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;speaker_tag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;speaker&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;#print(str(word_info.speaker_tag) + " is not " + str(speaker))
&lt;/span&gt;                    &lt;span class="n"&gt;speaker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;speaker_tag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;transcriptw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transcriptw&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;-------&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;*Speaker "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;speaker&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;":* "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;word_info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;
                 &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;#print(str(word_info.speaker_tag) + " is " + speaker)
&lt;/span&gt;                    &lt;span class="n"&gt;transcriptw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transcriptw&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;word_info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;
                    &lt;span class="n"&gt;speaker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;speaker_tag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;sendtrans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
    &lt;span class="n"&gt;keyword&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Empty Audio"&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcriptw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;transcriptw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;transcriptw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"*No Sound*"&lt;/span&gt;
        &lt;span class="n"&gt;sendtrans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"bitcoin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"payment"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"invoice"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"bill"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"utilities"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"utility"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"electricity"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"credit card"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"package"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"testing"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"kits"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"financial"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"supplies"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"mask"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"symptoms"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"isolate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"oxygen"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"ventilator"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"social security"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"government"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"internal revenue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"covid"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"world health"&lt;/span&gt;&lt;span 
class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"national institute"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"virus"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"corona"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"quarantine"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"stimulus"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"relief"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"cdc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"disease"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"pandemic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"epidemic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"sickness"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
        &lt;span class="c1"&gt;# Using for loop 
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; 
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;transcriptw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="n"&gt;keyword&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="n"&gt;sendtrans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;sendtrans&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;"Sending to Slack: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;send_slack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;keyword&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For longer audio, such as entire phone conversations, the best practice is to use the &lt;code&gt;client.long_running_recognize(config, audio)&lt;/code&gt; method, which performs asynchronous speech recognition.&lt;/p&gt;

&lt;p&gt;After transcribing, I check the transcript for any keyword triggers and, if any match, send the transcription to Slack for immediate notification.&lt;/p&gt;
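&lt;p&gt;One caveat with plain substring matching is that short triggers can fire inside longer words ("bill" inside "billion," for example). A hedged alternative using word-boundary regular expressions (the function name is my own):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re

def find_trigger(transcript, triggers):
    # Case-insensitive whole-word/phrase matching to avoid false hits
    text = transcript.lower()
    for phrase in triggers:
        if re.search(r'\b' + re.escape(phrase.lower()) + r'\b', text):
            return phrase.lower()
    return None  # no trigger words found

print(find_trigger("Your billion-dollar invoice is ready", ["bill", "invoice"]))  # invoice
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;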

&lt;p&gt;Below is the Slack function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_slack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;keyword&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"https://hooks.slack.com/services/ABCDEFG/123456/ABC123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="s"&gt;"Content-Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"application/json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"*Audio:* https://storage.cloud.google.com/"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;BucketName&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"/"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;*Transcription:*&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;transcript&lt;/span&gt; 
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Response HTTP Status Code: {status_code}'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Response HTTP Response Body: {content}'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exceptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestException&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'HTTP Request failed'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An open-source and simplified example of the above code is in one of &lt;a href="https://gitlab.com/ytelprojects/covid-19-compliance-module"&gt;Ytel’s public Gitlab repositories&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When the Covid-19 outbreak started, telecom companies quickly needed to identify and report certain types of scam-oriented communications.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>googlecloud</category>
    </item>
    <item>
      <title>Error Budgeting &amp; Site Reliability Engineering</title>
      <dc:creator>Matt Grofsky</dc:creator>
      <pubDate>Sun, 08 Nov 2020 17:54:38 +0000</pubDate>
      <link>https://dev.to/code_munkee/error-budgeting-site-reliability-engineering-3061</link>
      <guid>https://dev.to/code_munkee/error-budgeting-site-reliability-engineering-3061</guid>
      <description>&lt;p&gt;When most companies search for an online SaaS solution, Service Level Agreements (SLA) play a crucial role in influencing the sale. Downtime surrounding a company’s SLA typically contain calculations around minutes of uptime for the service over some time. One sure-fire way you can provide adequate, and actionable SLA numbers are by implementing Site Reliability Engineering methodologies (SRE) and Error Budgeting.&lt;/p&gt;

&lt;p&gt;As defined by &lt;a href="https://cloud.google.com/blog/products/gcp/sre-fundamentals-slis-slas-and-slos"&gt;Google&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;SRE begins with the idea that a prerequisite to success is availability. A system that is unavailable cannot perform its function and will fail by default. Availability, in SRE terms, defines whether a system is able to fulfill its intended function at a point in time. In addition to being used as a reporting tool, the historical availability measurement can also describe the probability that your system will perform as expected in the future.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are three tools at your disposal to help you identify and measure your SRE efforts.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SLA:&lt;/strong&gt; The Service Level Agreement is the contract in which the service provider promises customers a certain level of service availability and performance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SLO:&lt;/strong&gt; The Service Level Objective is a goal for a component that a service provider wants to reach. The SLO is not shared with the customer but is instead an internal goal.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SLI:&lt;/strong&gt; The Service Level Indicator is the measurement the service provider uses to track progress toward the SLO goal: a measurement that defines “good enough.” We need enough “good enough” events to meet our SLO.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using an example service that provides a REST API for sending SMS, we need to identify the customers and what a successful journey looks like.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example User and Service:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;What is my service?&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;The service is an API that sends an SMS.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Who uses my service?&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;The users consist of businesses that like to communicate via SMS to their customers.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Journey:&lt;/strong&gt;&lt;br&gt;
A business user can make a REST request to our API and send an SMS to a mobile phone. The recipient should be able to respond, and the inbound SMS can be forwarded to a URL endpoint when received.&lt;/p&gt;

&lt;p&gt;If I now rewrite this journey in terms of something measurable, I can create something that can be tracked and monitored.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Define your SLI&lt;/strong&gt;&lt;br&gt;
When a customer initiates a request to the SendSMS API endpoint, the SLI is the time it takes to get a response back (response time), measured from the moment the request arrives to the moment the response is sent back to the customer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Calculating SLI&lt;/strong&gt;&lt;br&gt;
When calculating reliability with your SLI, most take the approach of defining availability as:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Availability = (Number of minutes a system is working well / Total minutes) * 100&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;What you end up with is a fraction that defines your uptime percentage. This method of calculating reliability has some positives, but it also has some negatives. It’s straightforward for a human to understand the percentage and gauge reliability since the metric is binary: the service is up, or the service is down. The downside is that this approach doesn’t work well in distributed systems, where multiple systems and servers contribute to the calculation.&lt;/p&gt;

&lt;p&gt;A better approach to calculating SLI is to track events between your systems and not minutes.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Availability = (Number of good events / Total events) * 100&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;What happens here is that you gain some additional benefits by tracking events across servers versus tracking just time up and down. The number of servers in this scenario is irrelevant as you are measuring events that affect customers and their journey directly. This helps in situations where you use managed instance groups or preemptible machines in a cloud environment.&lt;/p&gt;
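
&lt;p&gt;The two availability formulas can be sketched in a few lines of Python (a minimal illustration; the function names and sample numbers are my own):&lt;/p&gt;

```python
def minutes_availability(good_minutes, total_minutes):
    """Time-based availability: up-or-down minutes, per the first formula."""
    return good_minutes / total_minutes * 100

def event_availability(good_events, total_events):
    """Event-based availability: counts customer-facing requests,
    so the number of servers involved is irrelevant."""
    return good_events / total_events * 100

# 14m 24s (14.4 minutes) of downtime in a 1,440-minute day is 99% availability
print(round(minutes_availability(1440 - 14.4, 1440), 2))  # 99.0

# 9,900 good SMS sends out of 10,000 total is also 99% availability
print(round(event_availability(9_900, 10_000), 2))        # 99.0
```

&lt;p&gt;Note that the second function needs no knowledge of how many machines served the 10,000 requests, which is exactly why it survives autoscaling and preemptible instances.&lt;/p&gt;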

&lt;p&gt;&lt;strong&gt;Define your SLO&lt;/strong&gt;&lt;br&gt;
An example SLO could be something as simple as, “&lt;em&gt;99% of SMS requests return in under 300ms&lt;/em&gt;”. As you create the SLO, understand that this number is not static. Over time, your customers help define the true metric and decide if it needs to be adjusted upward. You should adjust your calculations to match the level of the outage. As an example, if your outage is considered degraded, multiply your error budget consumption for the incident by 0.25. If your outage is considered partial, multiply your error budget consumption for the incident by 0.5. Understanding the basics of making calculations across your infrastructure will help guide your decisions, but sometimes these calculations should be modified to meet specific goals. One example is to apply a force multiplier to specific customers that pay more for your services, or a force divider if a “bad” event occurs off-hours.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--DRQFagcT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/4428/1%2A2zeu7In8gR87gyx1n2Nsrw%402x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--DRQFagcT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/4428/1%2A2zeu7In8gR87gyx1n2Nsrw%402x.png" alt="Error Budget Allocation Based on Uptime Percentage"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A 99% uptime allows for 14m 24s of downtime per day. This works out to roughly 1h 40m, 7h 18m, and 3d 15h of downtime per week, month, and year, respectively. These times, calculated from your SLO target, are your “Error Budget.” The error budget is your allowable downtime: the maximum amount of lowered performance you are willing to tolerate over 30 days, and it should be stricter than your public SLA. Calculate your SLO through your various SLIs. Each time an SLI check fails and an event comes back bad, you consume a portion of your allowed error budget.&lt;/p&gt;
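
&lt;p&gt;The downtime figures above fall out of a simple calculation. Here is a sketch (assuming a flat 30-day month, which yields 7h 12m rather than the calendar-month 7h 18m):&lt;/p&gt;

```python
def allowed_downtime_minutes(slo_percent, period_minutes):
    """Error budget, in minutes, for a given SLO target and period."""
    return period_minutes * (1 - slo_percent / 100)

DAY = 24 * 60  # 1,440 minutes

print(round(allowed_downtime_minutes(99.0, DAY), 1))        # 14.4   -> 14m 24s per day
print(round(allowed_downtime_minutes(99.0, 30 * DAY), 1))   # 432.0  -> 7h 12m per 30 days
print(round(allowed_downtime_minutes(99.0, 365 * DAY), 1))  # 5256.0 -> about 3d 15h 36m per year
```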

&lt;p&gt;It’s important to remember that technology services are complex, and complex systems fail. Embracing failure is essential to growth, as long as the failure is understood and effort is taken to fix the cause. Given this complexity, we can say that 100% reliability does not exist. As long as you are within your budget, failure is OK. It’s important to monitor as much as you can, as this helps you gain insight into your systems, and you can only improve upon what you measure.&lt;/p&gt;

&lt;p&gt;If you find you are quickly eating up your budget because a service is having problems, it’s time to re-prioritize your team to fix the budget eater. The budget also gives stakeholders guidance on when to reduce effort on non-reliability features and refocus on reliability work and infrastructure. One key benefit of utilizing error budgets is the reduction of paging-alert fatigue for engineers. Error budgeting makes it easier to set paging alerts based on the amount of error budget consumed in X minutes versus the traditional method of paging someone every time a failure is noticed.&lt;/p&gt;
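
&lt;p&gt;Paging on budget burn rather than on individual failures might look something like this (a hypothetical sketch, not the API of any particular monitoring product; the threshold and window values are illustrative):&lt;/p&gt;

```python
def should_page(bad_events, budget_events, window_fraction, burn_threshold=2.0):
    """Page only when the error budget is burning faster than
    burn_threshold times the sustainable rate for the window.

    window_fraction is the share of the SLO period the window covers,
    e.g. a 60-minute window in a 30-day period is 60 / (30 * 24 * 60).
    """
    sustainable = budget_events * window_fraction  # budget we may spend this window
    return bad_events > sustainable * burn_threshold

# 60-minute window, 30-day SLO period, budget of 432 bad events
window = 60 / (30 * 24 * 60)
print(should_page(5, 432, window))  # True: burning well above the sustainable rate
print(should_page(1, 432, window))  # False: within budget, no page
```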

&lt;p&gt;As you work to stay within your error budget, you will start to notice better reliability. Better reliability translates to increased uptime and an SLO that can be raised. Furthermore, as reliability increases, so does customer satisfaction, and this directly translates to increased revenue.&lt;/p&gt;

&lt;p&gt;Do you have error budgeting and SRE all figured out? Try shutting down a random server or zone and see what happens.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/uTEL8Ff1Zvk"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;This article was originally posted on Medium:&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag__link"&gt;
  &lt;a href="https://medium.com/swlh/error-budgeting-site-reliability-engineering-e71b104daa73" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--u3SKpWMR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/fit/c/96/96/2%2Ab7KP0_EEsEDHG1m4fEsgKA.jpeg" alt="Matt Grofsky"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://medium.com/swlh/error-budgeting-site-reliability-engineering-e71b104daa73" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Error Budgeting &amp;amp; Site Reliability Engineering | by Matt Grofsky | The Startup | Medium&lt;/h2&gt;
      &lt;h3&gt;Matt Grofsky ・ &lt;time&gt;Nov 28, 2019&lt;/time&gt; ・ 
      &lt;div class="ltag__link__servicename"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ze5yh_2q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/medium_icon-90d5232a5da2369849f285fa499c8005e750a788fdbf34f5844d5f2201aae736.svg" alt="Medium Logo"&gt;
        Medium
      &lt;/div&gt;
    &lt;/h3&gt;
&lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>sre</category>
      <category>monitoring</category>
    </item>
  </channel>
</rss>
