DEV Community

Machine Learning

A branch of artificial intelligence (AI) and computer science that focuses on using data and algorithms to imitate the way humans learn, gradually improving in accuracy.

Posts

Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings

Comments
3 min read
YouTube Video Transcripts Using LangChain

14
Comments 3
2 min read
Kotlin ML Pack: Technical Report

Comments
4 min read
Training-Free Long-Context Scaling of Large Language Models

Comments
4 min read
Faithful Logical Reasoning via Symbolic Chain-of-Thought

2
Comments
3 min read
Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration

Comments
5 min read
BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B

Comments
4 min read
Oil & Water? Diffusion of AI Within and Across Scientific Fields

1
Comments
3 min read
Grokfast: Accelerated Grokking by Amplifying Slow Gradients

1
Comments
4 min read
Is Complexity an Illusion?

Comments
2 min read
The Impacts of Data, Ordering, and Intrinsic Dimensionality on Recall in Hierarchical Navigable Small Worlds

Comments
4 min read
Easy Problems That LLMs Get Wrong

5
Comments
3 min read
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

Comments
4 min read
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks

Comments
5 min read
Certifiably Robust RAG against Retrieval Corruption

1
Comments
4 min read
AnyLoss: Transforming Classification Metrics into Loss Functions

Comments
4 min read
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

Comments
3 min read
The Road Less Scheduled

Comments
4 min read
Look Once to Hear: Target Speech Hearing with Noisy Examples

Comments
5 min read
NPGA: Neural Parametric Gaussian Avatars

2
Comments
3 min read
The rising costs of training frontier AI models

Comments
5 min read
MoEUT: Mixture-of-Experts Universal Transformers

Comments
3 min read
Human vs. Machine: Behavioral Differences Between Expert Humans and Language Models in Wargame Simulations

Comments
4 min read
Diffusion On Syntax Trees For Program Synthesis

2
Comments
4 min read
Towards Lightweight Super-Resolution with Dual Regression Learning

Comments
3 min read
PlaceFormer: Transformer-based Visual Place Recognition using Multi-Scale Patch Selection and Fusion

Comments
3 min read
Neural Network Parameter Diffusion

Comments
4 min read
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

Comments
4 min read
Formalizing and Benchmarking Prompt Injection Attacks and Defenses

2
Comments
4 min read
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

Comments
4 min read
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities

Comments
4 min read
Assessing Large Language Models on Climate Information

Comments
3 min read
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

1
Comments
4 min read
Sparse maximal update parameterization: A holistic approach to sparse training dynamics

Comments
5 min read
Is In-Context Learning Sufficient for Instruction Following in LLMs?

Comments
4 min read
Learning to Model the World with Language

Comments
4 min read
There and Back Again: The AI Alignment Paradox

Comments
4 min read
Large Language Models Can Self-Improve At Web Agent Tasks

Comments
3 min read
Metaheuristics and Large Language Models Join Forces: Towards an Integrated Optimization Approach

Comments
4 min read
Privacy-Aware Visual Language Models

Comments
4 min read
LLMs achieve adult human performance on higher-order theory of mind tasks

Comments
4 min read
Executable Code Actions Elicit Better LLM Agents

Comments
4 min read
ToonCrafter: Generative Cartoon Interpolation

3
Comments
4 min read
LLaMA Pro: Progressive LLaMA with Block Expansion

Comments
5 min read
Text clustering with LLM embeddings

14
Comments
4 min read
Arrows of Time for Large Language Models

Comments
5 min read
Evaluating AI-generated code for C++, Fortran, Go, Java, Julia, Matlab, Python, R, and Rust

1
Comments
4 min read
Simplifying Transformer Blocks

Comments
3 min read
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models

Comments
4 min read
You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism

Comments
4 min read
Generate and Pray: Using SALLMS to Evaluate the Security of LLM Generated Code

Comments
4 min read
gzip Predicts Data-dependent Scaling Laws

Comments
4 min read
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

3
Comments 1
1 min read
Implementation of Perceptron...

3
Comments
4 min read
Side Quest Devblog #2: Virtual Insanity

Comments
10 min read
Ai Will Live and Die By Trust it

1
Comments 1
3 min read
Random Forest Algorithm in Machine Learning

Comments
1 min read
CVPR 2024 Datasets and Benchmarks - Part 2: Benchmarks

1
Comments 1
15 min read
Intro to Deep Learning...

6
Comments 2
9 min read
5 Must-Do FREE Microsoft Learning Paths on Machine Learning!

Comments
1 min read