<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: João Moura</title>
    <description>The latest articles on DEV Community by João Moura (@joaopcm1996).</description>
    <link>https://dev.to/joaopcm1996</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F715601%2Fc83e3628-a143-4af7-987f-c477e8530aaf.jpeg</url>
      <title>DEV Community: João Moura</title>
      <link>https://dev.to/joaopcm1996</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/joaopcm1996"/>
    <language>en</language>
    <item>
      <title>NLP@AWS Newsletter 03/2022</title>
      <dc:creator>João Moura</dc:creator>
      <pubDate>Tue, 08 Mar 2022 17:59:07 +0000</pubDate>
      <link>https://dev.to/aws/aws-nlp-newsletter-february-2022-m6f</link>
      <guid>https://dev.to/aws/aws-nlp-newsletter-february-2022-m6f</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3b6fewaamn8259m0pyy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3b6fewaamn8259m0pyy.png" alt="Alt Text" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hello world. This is the monthly Natural Language Processing (NLP) newsletter covering everything related to NLP at AWS in the month of February. You can find previous months' newsletters &lt;a href="https://dev.to/search?q=aws%20nlp%20newsletter"&gt;here&lt;/a&gt;. Feel free to leave comments or share it on your social networks to celebrate this new launch with us. Let's dive in!&lt;/p&gt;




&lt;h2&gt;NLP Customer Success Stories&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/how-kustomer-utilizes-custom-docker-images-amazon-sagemaker-to-build-a-text-classification-pipeline/" rel="noopener noreferrer"&gt;&lt;strong&gt;How Kustomer utilizes custom Docker images &amp;amp; Amazon SageMaker to build a text classification pipeline&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Kustomer is the omnichannel SaaS CRM platform reimagining enterprise customer service to deliver standout experiences. Kustomer wanted the ability to rapidly analyze large volumes of support communications for their business customers — customer experience and service organizations — and automate discovery of information such as the end-customer’s intent, customer service issue, and other relevant insights related to the consumer.&lt;/p&gt;

&lt;p&gt;In this blog post, the authors describe how Kustomer uses custom Docker images for SageMaker training and inference, which eases integration and streamlines the process. With this approach, Kustomer’s business customers are automatically classifying over 50k support emails each month with up to 70% accuracy.&lt;/p&gt;
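
&lt;p&gt;As a rough illustration (not Kustomer's actual pipeline), the sketch below assembles the parameters for SageMaker's CreateTrainingJob API pointing at a custom Docker image in Amazon ECR; all ARNs, URIs, and names are hypothetical placeholders:&lt;/p&gt;

```python
# Hedged sketch: pointing a SageMaker training job at a custom Docker image.
# All ARNs, URIs, and names here are hypothetical placeholders.
def build_training_job_request(image_uri, role_arn, job_name, output_s3):
    """Assemble parameters for SageMaker's CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        # The custom image from Amazon ECR replaces the built-in containers.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": output_s3},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }

params = build_training_job_request(
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/text-classifier:latest",
    "arn:aws:iam::123456789012:role/SageMakerRole",
    "text-classification-train",
    "s3://my-bucket/output/",
)
# import boto3
# boto3.client("sagemaker").create_training_job(**params)
```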




&lt;h2&gt;Updates on AWS Language Services&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/apply-profanity-masking-in-amazon-translate/" rel="noopener noreferrer"&gt;&lt;strong&gt;Apply profanity masking in Amazon Translate&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Amazon Translate typically chooses clean words for your translation output. But in some situations, you want to prevent words that are commonly considered profane from appearing in the translated output.&lt;/p&gt;

&lt;p&gt;You can now apply profanity masking to both &lt;a href="https://docs.aws.amazon.com/translate/latest/dg/sync.html" rel="noopener noreferrer"&gt;real-time translation&lt;/a&gt; and &lt;a href="https://docs.aws.amazon.com/translate/latest/dg/async.html" rel="noopener noreferrer"&gt;asynchronous batch processing&lt;/a&gt; in Amazon Translate. When profanity masking is enabled, the five-character sequence ?$#@$ is used to mask each profane word or phrase, regardless of the number of characters. Amazon Translate detects each profane word or phrase literally, not contextually.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh42v7eb50tibs3zaa9x3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh42v7eb50tibs3zaa9x3.png" alt="Alt Text" width="800" height="561"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/control-formality-in-machine-translated-text-using-amazon-translate/" rel="noopener noreferrer"&gt;&lt;strong&gt;Control formality in machine translated text using Amazon Translate&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
This newly released feature in Amazon Translate allows you to customize the level of formality in your translation output. At the time of writing, the formality customization feature is available for six target languages: French, German, Hindi, Italian, Japanese, and Spanish. You can customize the formality of your translated output to suit your communication needs, at three different levels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Default&lt;/strong&gt; – No control over formality; the neural machine translation output is left unmodified&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Formal&lt;/strong&gt; – Useful in industries such as insurance and healthcare, where you may prefer a more formal translation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Informal&lt;/strong&gt; – Useful for customers in gaming and social media who prefer an informal translation&lt;/li&gt;
&lt;/ul&gt;
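
&lt;p&gt;A minimal sketch of requesting a formality level via the Settings field of a TranslateText call (the text and language codes are illustrative):&lt;/p&gt;

```python
# Hedged sketch: requesting formal output from Amazon Translate.
# Values here are illustrative, not from the post.
def build_formal_translate_request(text, source_lang, target_lang,
                                   formality="FORMAL"):
    """TranslateText parameters with the Formality setting applied."""
    return {
        "Text": text,
        "SourceLanguageCode": source_lang,
        "TargetLanguageCode": target_lang,
        # "FORMAL" or "INFORMAL"; omit Settings entirely for the default.
        "Settings": {"Formality": formality},
    }

params = build_formal_translate_request("How are you?", "en", "de")
# import boto3
# result = boto3.client("translate").translate_text(**params)
```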

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/announcing-the-launch-of-the-model-copy-feature-for-amazon-comprehend-custom-models/" rel="noopener noreferrer"&gt;&lt;strong&gt;Announcing the launch of the model copy feature for Amazon Comprehend custom models&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
AWS launched the Amazon Comprehend custom model copy feature this past month, unlocking the important capability of automatically copying your Amazon Comprehend custom models from a source account to designated target accounts in the same Region, without requiring access to the datasets the models were trained and evaluated on. The feature is available for both Amazon Comprehend custom classification and custom entity recognition models, and it also unlocks benefits such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-account MLOps strategy&lt;/strong&gt; – Train a model one time, deploy in multiple accounts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Faster deployment&lt;/strong&gt; – No need to retrain in every account&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protect sensitive datasets&lt;/strong&gt; – No need to share datasets between accounts or users – especially important for industries bound to regulatory requirements around data isolation and sandboxing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Easy collaboration&lt;/strong&gt; – Partners or vendors can now easily train in Amazon Comprehend Custom and share the models with their customers.&lt;/li&gt;
&lt;/ul&gt;
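
&lt;p&gt;From the target account's side, the copy boils down to a single ImportModel call, assuming the source account has already shared the model version via a resource-based policy. A hedged sketch with hypothetical ARNs and names:&lt;/p&gt;

```python
# Hedged sketch: importing a shared Comprehend custom model into this
# (target) account. The ARNs and names below are hypothetical placeholders;
# the source account must first attach a resource-based policy to the model.
def build_import_model_request(source_model_arn, model_name, role_arn):
    """Parameters for Amazon Comprehend's ImportModel API."""
    return {
        "SourceModelArn": source_model_arn,
        "ModelName": model_name,
        # Role that grants Comprehend permission to read the source model.
        "DataAccessRoleArn": role_arn,
    }

params = build_import_model_request(
    "arn:aws:comprehend:us-east-1:111122223333:document-classifier/src-model/version/v1",
    "copied-classifier",
    "arn:aws:iam::444455556666:role/ComprehendImportRole",
)
# import boto3
# boto3.client("comprehend").import_model(**params)
```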




&lt;h2&gt;NLP on Amazon SageMaker&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/train-175-billion-parameter-nlp-models-with-model-parallel-additions-and-hugging-face-on-amazon-sagemaker/" rel="noopener noreferrer"&gt;&lt;strong&gt;Train 175+ billion parameter NLP models with model parallel additions and Hugging Face on Amazon SageMaker&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
In this blog post, the authors briefly summarize the rise of NLP models at small and large scale, primarily through the abstraction provided by Hugging Face and the modular backend of Amazon SageMaker. The post highlights the launch of four additional features in the SageMaker model parallel library, which unlock pretraining and fine-tuning of NLP models with 175 billion or more parameters.&lt;/p&gt;

&lt;p&gt;The SageMaker model parallel library is used on the SageMaker training platform, achieving a throughput of 32 samples per second with 175 billion parameters on 120 ml.p4d.24xlarge instances. The authors extrapolate that, if compute were increased to 240 instances, the full model would take 25 days to train.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyfdwj1uzltl0x3tqtpt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyfdwj1uzltl0x3tqtpt.png" alt="Alt Text" width="691" height="404"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/amazon-sagemaker-examples/tree/main/training/distributed_training/pytorch/model_parallel" rel="noopener noreferrer"&gt;In this repo&lt;/a&gt; you will find sample code for training BERT, GPT-2, and the recently released GPT-J models using model parallelism on Amazon SageMaker.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/improve-high-value-research-with-hugging-face-and-amazon-sagemaker-asynchronous-inference-endpoints/" rel="noopener noreferrer"&gt;&lt;strong&gt;Improve high-value research with Hugging Face and Amazon SageMaker asynchronous inference endpoints&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Many of our AWS customers provide research, analytics, and business intelligence as a service. This type of research and business intelligence enables their end customers to stay ahead of markets and competitors, identify growth opportunities, and address issues proactively. The NLP models used for these research tasks are large and typically need to summarize long articles, given the size of the corpus; serving them from dedicated endpoints is not cost-optimized, because these applications receive bursts of incoming traffic at different times of the day.&lt;/p&gt;

&lt;p&gt;We believe customers would greatly benefit from the ability to scale down to zero and ramp up their inference capability on an as-needed basis. This optimizes research costs without compromising inference quality. This post discusses how Hugging Face, along with Amazon SageMaker asynchronous inference, can help achieve this.&lt;/p&gt;
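
&lt;p&gt;A minimal sketch of the asynchronous-inference piece of an endpoint configuration; the bucket path is a hypothetical placeholder:&lt;/p&gt;

```python
# Hedged sketch: the AsyncInferenceConfig portion of a CreateEndpointConfig
# request. The bucket path is a hypothetical placeholder.
async_inference_config = {
    "OutputConfig": {
        # Results are written to S3 instead of being returned inline, so
        # requests can carry large payloads and long-running inferences.
        "S3OutputPath": "s3://my-bucket/async-results/",
    },
    "ClientConfig": {
        # Queue depth per instance before autoscaling should add capacity.
        "MaxConcurrentInvocationsPerInstance": 4,
    },
}
# Scale-to-zero is what makes this cost-efficient for bursty research
# workloads: register the endpoint variant with Application Auto Scaling
# and set MinCapacity to 0 so instances are released when the queue drains.
```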

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/choose-the-best-data-source-for-your-amazon-sagemaker-training-job/" rel="noopener noreferrer"&gt;&lt;strong&gt;Choose the best data source for your Amazon SageMaker training job&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Data ingestion is an integral part of any training pipeline, and SageMaker training jobs support a variety of data storage and input modes to suit a wide range of training workloads.&lt;/p&gt;

&lt;p&gt;This post helps you choose the best data source for your SageMaker ML training use case. We introduce the data source options that SageMaker training jobs support natively. For each data source and input mode, we outline its ease of use, performance characteristics, cost, and limitations. To help you get started quickly, we provide a diagram with a sample decision flow that you can follow based on your key workload characteristics. Lastly, we perform several benchmarks for realistic training scenarios to demonstrate the practical implications for the overall training cost and performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk5d32ft58wogbigpypjn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk5d32ft58wogbigpypjn.png" alt="Alt Text" width="506" height="656"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;Community Content&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.philschmid.de/terraform-huggingface-amazon-sagemaker-advanced" rel="noopener noreferrer"&gt;&lt;strong&gt;Hugging Face Inference Sagemaker Terraform Module&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Our partners at Hugging Face have released a Terraform module that makes it easy to deploy Hugging Face Transformer models, such as BERT, from either Amazon S3 or the &lt;a href="https://huggingface.co/models" rel="noopener noreferrer"&gt;Hugging Face Model Hub&lt;/a&gt; to Amazon SageMaker. They have packed it with great features, such as deploying private Transformer models from hf.co/models, directly adding an autoscaling configuration for the deployed Amazon SageMaker endpoints, and even deploying &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference.html" rel="noopener noreferrer"&gt;Asynchronous Inference Endpoints&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;Check out the Terraform module &lt;a href="https://registry.terraform.io/modules/philschmid/sagemaker-huggingface/aws/latest" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.toNLP%20Data%20Augmentation%20on%20Amazon%20SageMaker"&gt;&lt;strong&gt;NLP Data Augmentation on Amazon SageMaker&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Machine learning models are very data-intensive – which is especially true for Natural Language Processing (NLP) models; at the same time, data scarcity is a common challenge in NLP, especially for low-resource languages. This is where data augmentation can greatly help – it is the process of enriching or synthetically enlarge the dataset that a machine learning model is trained on.&lt;/p&gt;

&lt;p&gt;In this blog post, the authors explain how to efficiently perform data augmentation – namely using back translation – by leveraging SageMaker Processing Jobs and pre-trained Hugging Face translation models.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8um90pm5mvz6a8dz1lf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8um90pm5mvz6a8dz1lf.png" alt="Alt Text" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>aws</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>AWS - NLP newsletter September 2021</title>
      <dc:creator>João Moura</dc:creator>
      <pubDate>Thu, 30 Sep 2021 15:24:09 +0000</pubDate>
      <link>https://dev.to/aws/aws-nlp-newsletter-2021-sep-34o2</link>
      <guid>https://dev.to/aws/aws-nlp-newsletter-2021-sep-34o2</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzp6oesygzu431ar44wcb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzp6oesygzu431ar44wcb.png" alt="Alt Text" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hello world. This is the second monthly Natural Language Processing (NLP) newsletter, covering everything related to NLP at AWS, and more. Feel free to leave comments or share it on your social networks. Let's dive in!&lt;/p&gt;




&lt;h2&gt;AWS NLP Services&lt;/h2&gt;

&lt;h3&gt;Feature Releases&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/aws/amazon-textract-updates-up-to-32-price-reduction-in-8-aws-regions-and-up-to-50-reduction-in-asynchronous-job-processing-times/" rel="noopener noreferrer"&gt;&lt;strong&gt;Amazon Textract announcements price reductions, reduction in processing time for asynchronous operations up to 50% worldwide, US FedRAMP authorization&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
The usage of the AnalyzeDocument and DetectDocumentText API’s in eight AWS regions will now be billed at the same rates as prices in the US East (N.Virginia) region (not inclusive of the recently launched &lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/07/amazon-textract-announces-specialized-support-automated-processing-invoices-receipts/" rel="noopener noreferrer"&gt;AnalyzeExpense API&lt;/a&gt;), posing a price reduction of up to 32%. Based on costumer feedback, enhancements made to Textract’s asynchronous operations reduced latency by as much as 50 percent worldwide. Finally, Textract achieved US FedRAMP authorization and added IRAP compliance support. &lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/08/amazon-textract-reduced-pricing-analyzedocument-detectdocumenttext-region-expansion/" rel="noopener noreferrer"&gt;What’s New&lt;/a&gt;, &lt;a href="https://aws.amazon.com/blogs/aws/amazon-textract-updates-up-to-32-price-reduction-in-8-aws-regions-and-up-to-50-reduction-in-asynchronous-job-processing-times/" rel="noopener noreferrer"&gt;AWS News Blog&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/textract/latest/dg/what-is.html" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/08/amazon-transcribe-speech-text-new-languages/" rel="noopener noreferrer"&gt;&lt;strong&gt;Amazon Transcribe adds support for 6 new languages&lt;/strong&gt;&lt;/a&gt;, &lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/09/amazon-lex-launches-support-korean/" rel="noopener noreferrer"&gt;&lt;strong&gt;Amazon Lex adds support for Korean&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Amazon Transcribe now supports batch transcription in six new languages - Afrikaans, Danish, Mandarin Chinese (Taiwan), Thai, New Zealand English, and South African English. Additionally, Amazon Lex has just added support for Korean. &lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/08/amazon-transcribe-speech-text-new-languages/" rel="noopener noreferrer"&gt;What’s New (Transcribe)&lt;/a&gt;, &lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/09/amazon-lex-launches-support-korean/" rel="noopener noreferrer"&gt;What’s New (Lex)&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/transcribe/latest/dg/transcribe-whatis.html" rel="noopener noreferrer"&gt;Transcribe Documentation&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/lexv2/latest/dg/what-is.html" rel="noopener noreferrer"&gt;Lex Documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/transcribe/latest/dg/subtitles.html" rel="noopener noreferrer"&gt;&lt;strong&gt;Amazon Transcribe can now generate subtitles for your video files&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Amazon Transcribe now supports the generation of WebVTT (.vtt) and SubRip (.srt) output for use as video subtitles during a batch transcription job. You can select one or both options when you submit the job, and the resulting subtitle files are generated in the same destination as the underlying transcription output file. Find more details in the title link above.&lt;/p&gt;
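
&lt;p&gt;A minimal sketch of such a request via boto3's StartTranscriptionJob parameters; the job name, bucket, and file are hypothetical placeholders:&lt;/p&gt;

```python
# Hedged sketch: a StartTranscriptionJob request that asks for both
# subtitle formats. Job name, bucket, and file are hypothetical placeholders.
def build_subtitle_job_request(job_name, media_uri, formats=("vtt", "srt")):
    """Parameters for Transcribe's StartTranscriptionJob API with subtitles."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "Media": {"MediaFileUri": media_uri},
        # Subtitle files land in the same destination as the transcript.
        "Subtitles": {"Formats": list(formats)},
    }

params = build_subtitle_job_request(
    "lecture-001", "s3://my-bucket/videos/lecture.mp4"
)
# import boto3
# boto3.client("transcribe").start_transcription_job(**params)
```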

&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/09/amazon-transcribe-pii-streaming-transcriptions/" rel="noopener noreferrer"&gt;&lt;strong&gt;Amazon Transcribe now supports redaction of personal identifiable information (PII) for streaming transcriptions&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
You can now use Amazon Transcribe to automatically identify and redact PII - such as Social Security numbers, credit card/bank account information, and contact information (i.e. name, email address, phone number, and mailing address) - from your streaming transcription results. In addition, granular PII categories are now provided, instead of the single [PII] tag available when redacting PII in a batch transcription job. With this new feature, companies can provide their contact center agents with valuable transcripts of ongoing conversations while maintaining privacy standards. &lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/09/amazon-transcribe-pii-streaming-transcriptions/" rel="noopener noreferrer"&gt;What’s New&lt;/a&gt;, &lt;a href="https://aws.amazon.com/blogs/machine-learning/introducing-pii-identification-and-redaction-in-streaming-transcriptions-using-amazon-transcribe/" rel="noopener noreferrer"&gt;AWS ML Blog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/09/amazon-comprehend-extract-entities-native-format/" rel="noopener noreferrer"&gt;&lt;strong&gt;Extract custom entities from documents in their native format with Amazon Comprehend&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Amazon Comprehend now allows you to extract custom entities from documents in a variety of formats (PDF, Word, plain text) and layouts (e.g., bullets, lists). Prior to this announcement, you could only use Comprehend on plain text documents, which required you to flatten documents into machine-readable text; this feature combines the power of NLP and Optical Character Recognition (OCR) to extract custom entities from your documents using the same API and with no preprocessing required. &lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/09/amazon-comprehend-extract-entities-native-format/" rel="noopener noreferrer"&gt;What’s New&lt;/a&gt;, &lt;a href="https://aws.amazon.com/blogs/machine-learning/extract-custom-entities-from-documents-in-their-native-format-with-amazon-comprehend/" rel="noopener noreferrer"&gt;Getting Started (blog)&lt;/a&gt;, &lt;a href="https://aws.amazon.com/blogs/machine-learning/custom-document-annotation-for-extracting-named-entities-in-documents-using-amazon-comprehend/" rel="noopener noreferrer"&gt;Document Annotation for new feature (blog)&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Blog posts/demos&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/transcribe-class-lectures-accurately-using-amazon-transcribe-with-custom-language-models/" rel="noopener noreferrer"&gt;&lt;strong&gt;Boost transcription accuracy of class lectures with custom language models for Amazon Transcribe&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Practical example of how training a custom language model in Amazon Transcribe can help improve transcription accuracy on difficult specialized topics, such as biology lectures.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4g5873kyisq6cibxkrr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4g5873kyisq6cibxkrr.png" alt="Alt Text" width="800" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Read more about how to leverage custom language models in the &lt;a href="https://docs.aws.amazon.com/transcribe/latest/dg/custom-language-models.html" rel="noopener noreferrer"&gt;Transcribe documentation&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;NLP on Amazon SageMaker&lt;/h2&gt;

&lt;h3&gt;Feature Releases&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2021/09/amazon-sagemaker-studio-inference-endpoint-testing/" rel="noopener noreferrer"&gt;&lt;strong&gt;Amazon SageMaker now supports inference endpoint testing from SageMaker Studio&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Once a model is deployed to Amazon SageMaker, customers can get predictions from it via SageMaker real-time endpoints. Previously, customers used third-party tooling such as curl, or wrote code in Jupyter notebooks, to invoke the endpoints for inference. Now, customers can provide a JSON payload and send the inference request to the endpoint from within SageMaker Studio; the results are displayed there and can be downloaded for further analysis.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/announcing-the-amazon-s3-plugin-for-pytorch/" rel="noopener noreferrer"&gt;&lt;strong&gt;Amazon S3 plugin for PyTorch&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
This is an open-source library built for use with the deep learning framework PyTorch for streaming data from Amazon S3. The feature is also available in the PyTorch Deep Learning Containers, and with it you can use data from S3 buckets directly with the PyTorch dataset and dataloader APIs without needing to download it to local storage first. &lt;a href="https://aws.amazon.com/blogs/machine-learning/announcing-the-amazon-s3-plugin-for-pytorch/" rel="noopener noreferrer"&gt;AWS ML Blog&lt;/a&gt;, &lt;a href="https://github.com/aws/amazon-s3-plugin-for-pytorch" rel="noopener noreferrer"&gt;Plugin Github&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Blog posts/demos&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/aws-samples/detecting-data-drift-in-nlp-using-amazon-sagemaker-custom-model-monitor" rel="noopener noreferrer"&gt;&lt;strong&gt;Detecting Data Drift in NLP using SageMaker Custom Model Monitor&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Detecting data drift in NLP is a challenging task. Model monitoring is an important aspect of MLOps, because a change in data distribution from the training corpus to real-world data at inference time can cause model performance to decay. This distribution shift is called data drift. This demo focuses on detecting that drift, making use of the custom monitoring capabilities of SageMaker Model Monitor.&lt;/p&gt;




&lt;h2&gt;Upcoming events&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.nlpsummit.org/nlp-2021/" rel="noopener noreferrer"&gt;&lt;strong&gt;NLP Summit 2021&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Oct 05-07, 2021 &lt;br&gt;
Join the NLP Summit: two weeks of immersive, industry-focused content. Week one will include over 30 unique sessions, with a special track on NLP in Healthcare. Week two will feature beginner to advanced training workshops with certifications. Attendees can also participate in coffee chats with speakers, committers, and industry experts. &lt;a href="https://www.nlpsummit.org/nlp-2021/" rel="noopener noreferrer"&gt;Registration&lt;/a&gt; is free.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws-startuploft-emea.com/e/dcb4f/aws-startup-accelerate-start-your-nlp-journey-on-aws-level-200-300" rel="noopener noreferrer"&gt;&lt;strong&gt;AWS Startup Accelerate: Start your NLP journey on AWS&lt;/strong&gt;&lt;/a&gt;&lt;br&gt;
Oct 11, 2021 &lt;br&gt;
AWS will be running a Technical talk on "Starting your NLP journey with AWS". Based on feedback from lead NLP ML Core startups, we see that developing NLP models is a complex and costly process, which is why we’d like to engage with Data Scientists and ML engineers to help them in their adoption journey. We would love to have you there! Register &lt;a href="https://aws-startuploft-emea.com/e/dcb4f/aws-startup-accelerate-start-your-nlp-journey-on-aws-level-200-300" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;Miscellaneous&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;🤗 Hugging Face: Hardware Partner Program, Optimum, and Infinity&lt;/strong&gt;&lt;br&gt;
A trio of announcements from Hugging Face this month:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hugging Face has launched a &lt;a href="https://huggingface.co/blog/hardware-partners-program" rel="noopener noreferrer"&gt;Hardware Partner Program&lt;/a&gt;, partnering with AI hardware accelerator companies to make state-of-the-art production performance accessible with Transformers.&lt;/li&gt;
&lt;li&gt;In this context, Hugging Face has released &lt;a href="https://huggingface.co/hardware" rel="noopener noreferrer"&gt;Optimum&lt;/a&gt;, an ML optimization toolkit that enables maximum efficiency when training and running models on specific hardware. As of today, you can use it to easily prune and/or quantize Transformer models for Intel Xeon CPUs using the Intel Low Precision Optimization Tool (LPOT), and later this year the first models &lt;a href="https://huggingface.co/blog/graphcore" rel="noopener noreferrer"&gt;optimized for Graphcore’s Intelligence Processing Unit (IPU)&lt;/a&gt; will be added.&lt;/li&gt;
&lt;li&gt;Finally, &lt;a href="https://huggingface.co/infinity" rel="noopener noreferrer"&gt;Infinity&lt;/a&gt; - Hugging Face’s enterprise-scale inference solution - was officially announced on September 28th: a containerized solution that promises Transformers’ accuracy at 1ms latency.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>nlp</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
