
Mia Chang for AWS


AWS - NLP newsletter October 2021

Hello world. This is the monthly Natural Language Processing (NLP) newsletter covering everything related to NLP at AWS. This is our third newsletter on Dev.to. If you missed our earlier episodes, here are Ep01 and Ep02. Feel free to leave comments and share it on your social network to celebrate this new launch with us!

Service updates about NLP on AWS

  • Amazon Lex launches progress updates for fulfillment

    You can now configure your Amazon Lex bots to provide periodic updates to users while their requests are processed. Customer support conversations often require execution of business logic that can take some time to complete. For example, updating an itinerary on an airline reservation system may take a couple of minutes during peak hours. Typically, support agents put the call on hold and provide periodic updates (e.g., “We are still processing your request; thank you for your patience”) until the request is fulfilled. Now, you can easily configure your bot to automatically provide such periodic updates in a conversation. With progress updates capability, bot builders can quickly enhance the ability of virtual contact center agents and smart assistants.

  • New AWS Solution: AWS QnABot, a self-service conversational chatbot built on Amazon Lex

    The AWS QnABot has now been released as an official AWS Solution Implementation. The AWS QnABot is an open source, multichannel, multi-language conversational chatbot built on Amazon Lex, that responds to your customer’s questions, answers, and feedback. Without programming, the AWS QnABot solution allows customers to quickly deploy self-service conversational AI on multiple channels including their contact centers, websites, social media channels, SMS text messaging, or Amazon Alexa.

  • Amazon Transcribe now supports custom language models for streaming transcription

    Amazon Transcribe now supports custom language models (CLM) for streaming transcription. Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for you to add speech-to-text capabilities to your applications. CLM allows you to leverage pre-existing data to build a custom speech engine tailored for your transcription use case. No prior machine learning experience is required. Further reading: AWS ML Blog, Transcribe Documentation.
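As a rough illustration of the Lex progress updates item above, the sketch below builds the kind of settings structure a bot builder might attach to an intent's fulfillment code hook. The field names follow our reading of the Lex V2 model-building API (`fulfillmentUpdatesSpecification`) and should be checked against the current documentation; the helper name, timings, and message text are hypothetical. Nothing here calls AWS.

```python
def build_fulfillment_updates(start_delay_s, update_every_s, timeout_s,
                              start_text, update_text):
    """Build a progress-update spec as a plain dict (hypothetical helper).

    Assumption: the keys mirror the Lex V2 fulfillmentUpdatesSpecification
    shape; verify them against the official API reference before use.
    """
    def message_group(text):
        # One plain-text message per group, with no variations.
        return [{"message": {"plainTextMessage": {"value": text}}}]

    return {
        "active": True,
        # Sent once if fulfillment is still running after the initial delay.
        "startResponse": {
            "delayInSeconds": start_delay_s,
            "messageGroups": message_group(start_text),
            "allowInterrupt": True,
        },
        # Repeated periodically while the request is being processed.
        "updateResponse": {
            "frequencyInSeconds": update_every_s,
            "messageGroups": message_group(update_text),
            "allowInterrupt": True,
        },
        # Upper bound on how long fulfillment may run.
        "timeoutInSeconds": timeout_s,
    }


spec = build_fulfillment_updates(
    start_delay_s=5,
    update_every_s=30,
    timeout_s=300,
    start_text="Let me update your itinerary.",
    update_text="We are still processing your request; thank you for your patience.",
)
```

A structure like this would typically sit inside the intent's fulfillment code hook settings when the intent is created or updated.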


NLP on SageMaker

  • Amazon SageMaker JumpStart introduces new multimodal (long-form text, tabular) financial analysis tools

    With this release, you can use a new set of multimodal financial analysis tools within Amazon SageMaker JumpStart. These tools let you enhance your tabular ML workflows with insights from financial text documents and can potentially save weeks of development time. Using the new SageMaker JumpStart Industry SDK, you can easily retrieve common public financial documents, including SEC filings, and further process financial text documents with features such as summarization and scoring for sentiment, litigiousness, risk, and readability. In addition, you can access language models pre-trained on financial text for transfer learning, and use example notebooks for data retrieval, text feature engineering, and multimodal classification and regression models. Further reading: AWS ML Blog #1, AWS ML Blog #2, AWS ML Blog #3, JumpStart Documentation.

  • Organize product data to your taxonomy with Amazon SageMaker

    When companies deal with data that comes from various sources, or when the way this data is collected has changed over time, the data often becomes difficult to organize. Perhaps you have product category names that are similar but don’t match, and on your website you want to surface these products as a group. You then need to go through the tedious work of manually creating a map from source to target to transform the data into your own taxonomy. In these cases, we’re not talking about a few hundred rows of data, but more often many hundreds of thousands of rows, with new data flowing in regularly. In this post, we discuss how to organize product data to your classification needs with Amazon SageMaker.

  • Bring structure to diverse documents with Amazon Textract and transformer-based models on Amazon SageMaker

    From application forms, to identity documents, recent utility bills, and bank statements, many business processes today still rely on exchanging and analyzing human-readable documents—particularly in industries like financial services and law. In this post, we show how you can use Amazon SageMaker, an end-to-end platform for machine learning (ML), to automate especially challenging document analysis tasks with advanced ML models.
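To make the source-to-target mapping problem from the taxonomy post above concrete, here is a minimal sketch. It is not the method used in the post: it swaps in simple string similarity from Python's standard `difflib` module as a stand-in for a trained model, and the category names, cutoff, and function name are all made up for illustration.

```python
import difflib


def map_to_taxonomy(source_names, target_names, cutoff=0.8):
    """Map each source category name to the closest target category.

    Stand-in technique: character-level similarity via difflib, not an
    ML model. Names with no sufficiently close target map to None so
    they can be reviewed by hand.
    """
    # Compare case-insensitively, but return the target's original spelling.
    lookup = {target.lower(): target for target in target_names}
    mapping = {}
    for name in source_names:
        matches = difflib.get_close_matches(
            name.lower(), list(lookup), n=1, cutoff=cutoff
        )
        mapping[name] = lookup[matches[0]] if matches else None
    return mapping


# Hypothetical target taxonomy and incoming category names.
TARGETS = ["Men's Shoes", "Women's Shoes", "Kitchen Appliances"]
mapping = map_to_taxonomy(["mens shoes", "Kitchen appliance", "qqq"], TARGETS)
```

String similarity only catches near-duplicate spellings; category names that mean the same thing but share few characters are the harder case that motivates a learned approach on hundreds of thousands of rows.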

AWS Blog posts, papers, and more

  • Create a dashboard with SEC text for financial NLP in Amazon SageMaker JumpStart

    In this post, the author shows how to curate a dataset of Securities and Exchange Commission (SEC) filings, use NLP for feature engineering on the dataset, and present the features in a dashboard.

    To get started, you can refer to the example notebook in JumpStart titled Dashboarding SEC Filings. You can also refer to the example notebook in JumpStart titled Create a TabText Dataset of SEC Filings in a Single API Call, which contains more details of SEC forms retrieval, summarization, and NLP scoring.

  • Amazon Science Publication: Sample selection guided by domain and task for cross-domain targeted sentiment analysis

    Building supervised targeted sentiment analysis models for a new target domain requires substantial annotation effort since most datasets for this task are domain-specific. Domain adaptation for this task has two dimensions: the nature of targets and the opinion words used to describe sentiment towards the target. We present a data sampling strategy informed by domain differences across these two dimensions with the goal of selecting a small number of examples, thereby minimizing annotation effort. This obtains performance in the 86-100% range compared to the full supervised model using only ∼4-15% of the full training data.


  • YouTube demo video "Amazon Transcribe video snacks: Using vocabulary filters"

    Amazon Transcribe is an automatic speech recognition service that can be used when you have audio and video that contains speech you want to convert to text. You can mask, remove, or tag words you don't want in your transcription results with vocabulary filtering. For example, you can use vocabulary filtering to prevent the display of offensive or profane terms. In the demo, we customize Transcribe to mask swear words that we recently encountered in a famous play written by William Shakespeare.

  • 4 ways conversational AI and Amazon Lex help the public sector transform customer engagement

    Conversational artificial intelligence (AI) and chatbots can be used to transform the customer experience, enhance engagement, improve services, and help scale more simply. Learn how conversational AI and chatbots help public sector organizations.
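For the vocabulary filtering demo mentioned above, the following sketch shows how a batch transcription request with a filter might be assembled. The `Settings` keys `VocabularyFilterName` and `VocabularyFilterMethod` (with values `mask`, `remove`, or `tag`) are parameters of the Transcribe `StartTranscriptionJob` API; the job name, S3 URI, filter name, and helper function are hypothetical, and the filter is assumed to have been created beforehand. The block only builds the request and does not call AWS.

```python
VALID_METHODS = {"mask", "remove", "tag"}


def build_transcription_request(job_name, media_uri, filter_name,
                                method="mask", language_code="en-US"):
    """Return keyword arguments for a transcription job (hypothetical helper)."""
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {sorted(VALID_METHODS)}")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            # Name of a vocabulary filter that already exists in the account.
            "VocabularyFilterName": filter_name,
            # mask: replace filtered words; remove: drop them; tag: flag them.
            "VocabularyFilterMethod": method,
        },
    }


request = build_transcription_request(
    job_name="shakespeare-demo",                  # hypothetical
    media_uri="s3://example-bucket/play.mp3",     # hypothetical
    filter_name="swear-words-filter",             # hypothetical
)

# With credentials configured, the request could then be submitted with:
#   import boto3
#   boto3.client("transcribe").start_transcription_job(**request)
```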

Community content

SageMaker and Hugging Face

  • Workshop: Getting started with Amazon SageMaker: Train a Hugging Face Transformers model and deploy it

    Learn how to use Amazon SageMaker to train a Hugging Face Transformer model and deploy it afterward. The workshop covers four steps: prepare and upload a test dataset to S3, prepare a fine-tuning script to be used with Amazon SageMaker Training jobs, launch a training job and store the trained model in S3, and deploy the model after successful training. GitHub Repository

  • October “HuggingFace Blog” entries:

    1. Showcase Your Projects in Spaces using Gradio
    2. Hosting your Models and Datasets on Hugging Face Spaces using Streamlit
    3. Fine-tuning CLIP with Remote Sensing (Satellite) images and captions
    4. The Age of Machine Learning As Code Has Arrived
    5. Train a Sentence Embedding Model with 1B Training Pairs
    6. Large Language Models: A New Moore’s Law?
    7. Course Launch Community Event
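As a companion to the SageMaker and Hugging Face workshop listed above, the sketch below collects the arguments one might pass to the SageMaker Python SDK's `HuggingFace` estimator. It is not taken from the workshop repository: the script name, instance type, framework versions, hyperparameters, and bucket layout are all assumptions and should be matched to a supported Hugging Face container. The block builds a plain dict so it runs without the SDK or AWS credentials.

```python
def build_estimator_args(role_arn, bucket, epochs=1):
    """Collect keyword arguments for a HuggingFace estimator (hypothetical helper)."""
    return {
        "entry_point": "train.py",         # fine-tuning script (assumed name)
        "source_dir": "./scripts",         # local folder containing the script
        "instance_type": "ml.p3.2xlarge",  # assumed GPU training instance
        "instance_count": 1,
        "role": role_arn,                  # IAM role with SageMaker permissions
        # Assumed versions; check the currently supported combinations.
        "transformers_version": "4.6",
        "pytorch_version": "1.7",
        "py_version": "py36",
        # Passed to the training script as command-line arguments.
        "hyperparameters": {
            "epochs": epochs,
            "model_name": "distilbert-base-uncased",
        },
        # Where the trained model artifact is stored.
        "output_path": f"s3://{bucket}/models",
    }


args = build_estimator_args(
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # hypothetical
    bucket="example-bucket",                                         # hypothetical
    epochs=3,
)

# With the SageMaker SDK installed and credentials configured:
#   from sagemaker.huggingface import HuggingFace
#   estimator = HuggingFace(**args)
#   estimator.fit({"train": "s3://example-bucket/train",
#                  "test": "s3://example-bucket/test"})
#   predictor = estimator.deploy(initial_instance_count=1,
#                                instance_type="ml.m5.xlarge")
```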

Upcoming NLP events

Both community events and AWS events

Stay in touch with NLP on AWS

Our contact: aws-nlp@amazon.com
Email us to (1) tell us about your awesome NLP project on AWS, (2) let us know which post in the newsletter helped your NLP journey, or (3) suggest other things you want us to cover in the newsletter. Talk to you soon.
