DEV Community

Machine Learning

A branch of artificial intelligence (AI) and computer science that uses data and algorithms to imitate the way humans learn, gradually improving in accuracy.

Posts

Context Injection Attacks on Large Language Models

4 min read
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks

5 min read
Oil & Water? Diffusion of AI Within and Across Scientific Fields

3 min read
The Impacts of Data, Ordering, and Intrinsic Dimensionality on Recall in Hierarchical Navigable Small Worlds

4 min read
Training-Free Long-Context Scaling of Large Language Models

4 min read
Easy Problems That LLMs Get Wrong

3 min read
Kotlin ML Pack: Technical Report

4 min read
BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B

4 min read
Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration

5 min read
Is Complexity an Illusion?

2 min read
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

4 min read
Grokfast: Accelerated Grokking by Amplifying Slow Gradients

4 min read
Human vs. Machine: Behavioral Differences Between Expert Humans and Language Models in Wargame Simulations

4 min read
The rising costs of training frontier AI models

5 min read
The Road Less Scheduled

4 min read
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

3 min read
Look Once to Hear: Target Speech Hearing with Noisy Examples

5 min read
NPGA: Neural Parametric Gaussian Avatars

3 min read
MoEUT: Mixture-of-Experts Universal Transformers

3 min read
AnyLoss: Transforming Classification Metrics into Loss Functions

4 min read
Diffusion On Syntax Trees For Program Synthesis

4 min read
PlaceFormer: Transformer-based Visual Place Recognition using Multi-Scale Patch Selection and Fusion

3 min read
Formalizing and Benchmarking Prompt Injection Attacks and Defenses

4 min read
Neural Network Parameter Diffusion

4 min read
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

4 min read