DEV Community

Data Science

Data Science allows us to extract meaning from and interpret data.

Posts

CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification
3 min read
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
3 min read
Virtual avatar generation models as world navigators
4 min read
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
4 min read
ChatDev: Communicative Agents for Software Development
3 min read
Do Llamas Work in English? On the Latent Language of Multilingual Transformers
4 min read
SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales
4 min read
Evaluating Quantized Large Language Models
5 min read
Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks
4 min read
Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs
3 min read
Empirical influence functions to understand the logic of fine-tuning
4 min read
RAFT: Adapting Language Model to Domain Specific RAG
4 min read
SqueezeLLM: Dense-and-Sparse Quantization
4 min read
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
5 min read
LLark: A Multimodal Instruction-Following Language Model for Music
3 min read
Bridging the Empirical-Theoretical Gap in Neural Network Formal Language Learning Using Minimum Description Length
4 min read
Large Language Models for Generative Information Extraction: A Survey
5 min read
A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism
4 min read
Position: Categorical Deep Learning is an Algebraic Theory of All Architectures
1 min read
Optimizing Matplotlib Performance: Handling Memory Leaks Efficiently
3 min read
Data Mesh: An Executive Guide to Modern Data Architecture in Manufacturing
13 min read
Book Recommendations
12 min read
ValueError: A given column is not a column of the dataframe
1 min read
Navigating the Data Jungle. Data Analysis Software: A Comprehensive Guide
11 min read
🎬 From image to text data... to image...to movie clip.
1 min read
Pedaling Through Data: A Wheelie Fun Bike Rental Analysis with Python and PostgreSQL
4 min read
Uncovering and Solving Data Wrangling Issues with Property-Based Testing
13 min read
Evaluation Metrics: Machine Learning Models 🤖🐍
3 min read
Difference between Data Analysts, Data Scientists, and Data Engineers
1 min read
Elementary Logic And Proof Techniques
7 min read
Recapping the AI, Machine Learning and Data Science Meetup - May 30, 2024
6 min read
On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models
4 min read
An Introduction to Vision-Language Modeling
4 min read
Relightable Gaussian Codec Avatars
4 min read
Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings
3 min read
Context Injection Attacks on Large Language Models
4 min read
Contextual Position Encoding: Learning to Count What's Important
4 min read
Is Complexity an Illusion?
2 min read
Oil & Water? Diffusion of AI Within and Across Scientific Fields
3 min read
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
5 min read
Training-Free Long-Context Scaling of Large Language Models
4 min read
Certifiably Robust RAG against Retrieval Corruption
4 min read
Grokfast: Accelerated Grokking by Amplifying Slow Gradients
4 min read
Easy Problems That LLMs Get Wrong
3 min read
Kotlin ML Pack: Technical Report
4 min read
BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B
4 min read
Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration
5 min read
Faithful Logical Reasoning via Symbolic Chain-of-Thought
3 min read
The Impacts of Data, Ordering, and Intrinsic Dimensionality on Recall in Hierarchical Navigable Small Worlds
4 min read
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
4 min read
The Road Less Scheduled
4 min read
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
3 min read
AnyLoss: Transforming Classification Metrics into Loss Functions
4 min read
Look Once to Hear: Target Speech Hearing with Noisy Examples
5 min read
NPGA: Neural Parametric Gaussian Avatars
3 min read
MoEUT: Mixture-of-Experts Universal Transformers
3 min read
Human vs. Machine: Behavioral Differences Between Expert Humans and Language Models in Wargame Simulations
4 min read
The rising costs of training frontier AI models
5 min read
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
4 min read
Neural Network Parameter Diffusion
4 min read