Hello world. This is the November 2022 edition of the AWS Natural Language Processing (NLP) newsletter covering everything related to NLP at AWS. Feel free to leave comments & share it on your social network.
AWS re:Invent 2022 starts today!
Explore the AWS AI/ML Services Attendee Guide: your complete resource for the AI & machine learning sessions at re:Invent 2022, which you can share with your customers.
Check out Your guide to AI/ML at AWS re:Invent 2022 to get a sense of how the AI/ML track is organized and to see some key sessions.
NLP@AWS Customer Success Stories
ByteDance saves up to 60% on inference costs while reducing latency and increasing throughput using AWS Inferentia. "The ByteDance AML team focuses on the research and implementation of cutting-edge ML systems and the heterogeneous computing resources they require. We create large-scale training and inference systems for a wide variety of recommender, natural language processing (NLP), and computer vision (CV) models. These models are highly complex and process a huge amount of data from the many content platforms ByteDance operates. Deploying these models requires significant GPU resources, whether in the cloud or on premises. Therefore, the compute costs for these inference systems are quite high […]
Ultimately, after evaluating several options, we chose EC2 Inf1 instances for their better performance/price ratio compared to G4dn instances and NVIDIA T4 on premises. We engaged in a cycle of continuous iteration with the AWS team to unlock the price and performance benefits of Inf1."
Real estate brokerage firm John L. Scott uses Amazon Textract and Amazon Comprehend to strike racially restrictive language from property deeds for homeowners. When company operating officer Phil McBride joined the company in 2007, one of his initial challenges was to shift the company's public website from an on-premises environment to a cloud-hosted one. According to McBride, a world of resources opened up to John L. Scott once the company started working with AWS to build an easily controlled, cloud-enabled environment.
AI Language Services
Amazon Textract launches the ability to detect signatures on any document. Amazon Textract is a machine learning service that automatically extracts printed text, handwriting, and data from any document or image. Textract now lets you detect handwritten signatures, e-signatures, and initials on documents such as loan application forms, checks, claim forms, and more. AnalyzeDocument Signatures reduces the need for human reviewers and helps customers reduce costs, save time, and build scalable solutions for document processing.
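For readers who want to try signature detection, here is a minimal Python sketch. It assumes the boto3 Textract client, the SIGNATURES feature type of AnalyzeDocument, and the SIGNATURE block type in the response. The helper names (analyze_signatures, extract_signatures), the confidence threshold, and the sample response are hypothetical and only for illustration.

```python
from typing import Any, Dict, List


def analyze_signatures(document_bytes: bytes) -> Dict[str, Any]:
    """Call Textract AnalyzeDocument asking only for signatures.

    Needs boto3 and configured AWS credentials. boto3 is imported lazily
    so the parsing helper below works without it.
    """
    import boto3

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": document_bytes},
        FeatureTypes=["SIGNATURES"],
    )


def extract_signatures(
    response: Dict[str, Any], min_confidence: float = 0.0
) -> List[Dict[str, Any]]:
    """Return page, confidence, and bounding box for each SIGNATURE block."""
    signatures = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "SIGNATURE":
            continue
        confidence = block.get("Confidence", 0.0)
        if confidence < min_confidence:
            continue
        signatures.append(
            {
                "page": block.get("Page", 1),
                "confidence": confidence,
                "bounding_box": block.get("Geometry", {}).get("BoundingBox"),
            }
        )
    return signatures


# Hand-written sample shaped like an AnalyzeDocument response (illustrative only).
sample_response = {
    "Blocks": [
        {"BlockType": "LINE", "Text": "Applicant signature", "Confidence": 99.1},
        {
            "BlockType": "SIGNATURE",
            "Confidence": 98.5,
            "Page": 1,
            "Geometry": {
                "BoundingBox": {"Left": 0.1, "Top": 0.8, "Width": 0.3, "Height": 0.05}
            },
        },
        {"BlockType": "SIGNATURE", "Confidence": 42.0, "Page": 2, "Geometry": {}},
    ]
}

confident_signatures = extract_signatures(sample_response, min_confidence=90.0)
```

Filtering on confidence is one way to decide which detections still go to a human reviewer; the 90.0 threshold here is arbitrary.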
Introducing DTMF slot settings within Amazon Lex. Amazon Lex is a service for building conversational interfaces into any application using voice and text. With Amazon Lex, you can quickly and easily build conversational bots ("chatbots"), virtual agents, and interactive voice response (IVR) systems. Amazon Lex is excited to launch DTMF-only slot settings and configurable session attributes within the Lex console.
Amazon Translate Enables Tagging Support for Parallel Data and Custom Terminology. Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation. Today, we are launching tagging support for custom terminology and parallel data resources, so you can allow or restrict access to them based on their tags.
Amazon Transcribe now supports Thai and Hindi languages for streaming audio. Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for you to add speech-to-text capabilities to your applications. Today, we are excited to announce Thai and Hindi language support for streaming audio transcriptions. These new languages expand the coverage of Amazon Transcribe streaming and enable customers to reach a broader global audience.
Amazon Kendra is now FedRAMP High Compliant. Amazon Kendra is now authorized as FedRAMP High in the AWS GovCloud (US-West) Region. Amazon Kendra is a highly accurate intelligent search service powered by machine learning. Kendra reimagines enterprise search for your websites and applications so your employees and customers can easily find the content they are looking for, even when it's scattered across multiple locations and content repositories within your organization.
Virtual Private Cloud (VPC) support is generally available for Amazon Polly. Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Starting today, you can use Amazon Polly inside an Amazon Virtual Private Cloud (VPC) instead of connecting over the internet, which gives you better control over your network environment.
Intelligent document processing with AWS AI services in the insurance industry. In this two-part series (Part 1, Part 2), the authors take you through how you can automate and intelligently process documents at scale using AWS AI services for an insurance claims processing use case.
Improve data extraction and document processing with Amazon Textract. In this post, the authors demonstrate how to use Amazon Textract to extract meaningful, actionable data from a wide range of complex multi-format PDF files.
Train gigantic models with near-linear scaling using sharded data parallelism on Amazon SageMaker. Learn how to train a 30B parameter GPT-2 model on SageMaker using sharded data parallelism, a new memory-saving distributed training technique in the SageMaker model parallel (SMP) library. Sharded data parallelism is purpose-built for extreme-scale models and uses Amazon's in-house MiCS technology under the hood, a science effort to minimize the communication scale by bringing down the expensive communication overhead rooted in parameter gathering and gradient synchronization.
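To get a feel for why sharding the model state saves memory, here is a rough back-of-the-envelope sketch in Python. It assumes mixed-precision training with Adam at roughly 16 bytes of model state per parameter (a commonly cited rule of thumb, not a figure from the post), ignores activations, and uses a hypothetical sharding degree of 16. The function names are illustrative and are not part of the SMP library.

```python
def model_state_gb(num_params: float, bytes_per_param: int = 16) -> float:
    """Total parameter, gradient, and optimizer state in GB.

    Assumption: 16 bytes per parameter for mixed-precision Adam
    (2 fp16 weights + 2 fp16 gradients + 12 for fp32 optimizer state).
    """
    return num_params * bytes_per_param / 1e9


def per_gpu_state_gb(
    num_params: float, sharding_degree: int, bytes_per_param: int = 16
) -> float:
    """Model state held by each GPU when state is split evenly across a group."""
    if sharding_degree < 1:
        raise ValueError("sharding_degree must be at least 1")
    return model_state_gb(num_params, bytes_per_param) / sharding_degree


# 30B parameters, as in the GPT-2 example; sharding degree 16 is hypothetical.
total_gb = model_state_gb(30e9)
sharded_gb = per_gpu_state_gb(30e9, sharding_degree=16)
print(f"unsharded: {total_gb:.0f} GB, per GPU with 16-way sharding: {sharded_gb:.0f} GB")
```

Under these assumptions the unsharded state is far larger than a single accelerator's memory, which is the motivation for splitting it across GPUs. The communication cost of gathering parameters is what techniques like MiCS aim to reduce.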
AlexaTM 20B is now available in Amazon SageMaker JumpStart. Amazon's state-of-the-art Alexa Teacher Model with 20 billion parameters (AlexaTM 20B) is now available through Amazon SageMaker JumpStart, SageMaker's machine learning hub. You can use AlexaTM 20B for a wide range of industry use-cases, from summarizing financial reports to question answering for customer service chatbots. It can be applied even when there are only a few available training examples, or even none at all. AlexaTM 20B outperforms a 175-billion-parameter GPT-3 model on zero-shot learning tasks such as SuperGLUE and shows state-of-the-art performance for multilingual zero-shot tasks such as XNLI.
Deploy BLOOM-176B and OPT-30B on Amazon SageMaker with large model inference Deep Learning Containers and DeepSpeed. Learn how to use a new SageMaker large model inference Deep Learning Container to deploy two of the most popular large NLP models: BigScience's BLOOM-176B and Meta's OPT-30B from the Hugging Face repository.
Transfer learning for TensorFlow text classification models in Amazon SageMaker. SageMaker now provides a new built-in algorithm for text classification using TensorFlow. This supervised learning algorithm supports transfer learning for many pre-trained models available in TensorFlow Hub. It takes a piece of text as input and outputs the probability for each of the class labels. You can fine-tune these pre-trained models using transfer learning even when a large corpus of text isn't available.
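As a small illustration of what "the probability for each of the class labels" means, the Python sketch below turns raw classifier scores into per-label probabilities with a softmax. This is the generic technique, shown under the assumption that a classifier emits one score per label. The scores, labels, and function names are made up, and the sketch does not reproduce the built-in algorithm's actual output format.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def class_probabilities(scores: List[float], labels: List[str]) -> Dict[str, float]:
    """Pair each label with its probability."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return dict(zip(labels, softmax(scores)))


# Hypothetical scores for one input text and a hypothetical label set.
probabilities = class_probabilities([2.0, 0.5, -1.0], ["positive", "neutral", "negative"])
predicted_label = max(probabilities, key=probabilities.get)
```

Here predicted_label is simply the label with the highest probability; a real application might also apply a minimum-probability threshold before acting on a prediction.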
Transformers and Large Language Models (LLMs): A meeting of minds. Conjecture, Amazon Web Services (AWS), and NLP London co-hosted an informal meetup on 22nd November in London, UK to connect Large Language Model (LLM) researchers, practitioners, start-up builders, investors, and business innovators to ignite new ideas and accelerate innovation.
Stability AI releases Stable Diffusion 2.0. Stable Diffusion 2.0 is out! Read about the new features and improvements it delivers.
Stay in touch with NLP on AWS
Our contact: firstname.lastname@example.org
Email us about (1) your awesome NLP project on AWS, (2) which post in the newsletter helped your NLP journey, and (3) anything else you would like us to cover in the newsletter. Talk to you soon.
- Main Editor - Anastasia Tzeveleka
- Reviewed by - Mia Chang