
crowintelligence


Active Learning for Natural Language Processing

More than 90% of machine learning applications improve with human feedback. For example, a model that classifies news articles into pre-defined topics has been trained on thousands of examples where humans manually annotated the topics. However, if there are tens of millions of news articles, it might not be feasible to manually annotate even 1% of them. If we only sample randomly, we will mostly get popular topics like “politics” that the model can already identify accurately. So, we need to be smarter about how we sample. This talk is about “Active Learning”, the process of deciding which raw data is most valuable for human review, covering Uncertainty Sampling, Diversity Sampling, and more advanced methods such as Active Transfer Learning.
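To make the idea concrete, here is a minimal sketch of least-confidence sampling, one of the simplest forms of Uncertainty Sampling mentioned above. It is not code from the talk: it assumes a scikit-learn-style classifier exposing `predict_proba` and a vectorizer exposing `transform`, and the names `select_for_annotation` and `budget` are illustrative.

```python
import numpy as np

def least_confidence_scores(probs: np.ndarray) -> np.ndarray:
    """Uncertainty score per example: 1 minus the probability of the
    top predicted class. Higher scores mean the model is less sure,
    so the example is a better candidate for human annotation."""
    return 1.0 - probs.max(axis=1)

def select_for_annotation(model, vectorizer, unlabeled_texts, budget=100):
    """Rank unlabeled items by uncertainty and return the `budget`
    most uncertain ones for human review."""
    features = vectorizer.transform(unlabeled_texts)
    probs = model.predict_proba(features)   # shape: (n_items, n_classes)
    scores = least_confidence_scores(probs)
    ranked = np.argsort(scores)[::-1]       # most uncertain first
    return [unlabeled_texts[i] for i in ranked[:budget]]
```

Diversity Sampling would typically add a second pass over this ranked pool so the selected items are not near-duplicates of each other, but that is beyond this sketch.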

Robert Munro has worked as a leader at several Silicon Valley machine learning companies and also led AWS’s first Natural Language Processing and Machine Translation solutions. Robert is the author of Human-in-the-Loop Machine Learning, covering practical methods for Active Learning, Transfer Learning, and Annotation. Robert organizes Bay Area NLP, the world’s largest community of Language Technology professionals. Robert is also a disaster responder and is currently helping with the response to COVID-19.

The slides are available at this link.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
