DEV Community

Mike Young

Building indie hacker stuff in my free time, focusing on AI. Launching https://aimodels.fyi - find the right AI model for your project!

Location: Washington, DC
Personal website: https://aimodels.fyi

Education

Purdue

Work

Indie hacking stuff!

Know Your Neighborhood: General and Zero-Shot Capable Binary Function Search Powered by Call Graphlets

Comments
4 min read

Scalable MatMul-free Language Modeling

Comments
4 min read
Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning

Comments
3 min read
Bootstrap3D: Improving 3D Content Creation with Synthetic Data

Comments
4 min read
Graph Convolutional Branch and Bound

Comments
3 min read
Scalable Detection of Salient Entities in News Articles

Comments
3 min read
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Comments
5 min read
Vision-LSTM: xLSTM as Generic Vision Backbone

Comments
4 min read
Improving Text Embeddings with Large Language Models

Comments
3 min read
Ask LLMs Directly, What shapes your bias?: Measuring Social Bias in Large Language Models

Comments
1 min read
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks

Comments
3 min read
Approximate Nearest Neighbor Search with Window Filters

Comments
3 min read
TinyLlama: An Open-Source Small Language Model

Comments
4 min read
CityDreamer: Compositional Generative Model of Unbounded 3D Cities

Comments
3 min read
Improving Alignment and Robustness with Short Circuiting

Comments
4 min read
Open-Endedness is Essential for Artificial Superhuman Intelligence

Comments
4 min read
ReGAL: Refactoring Programs to Discover Generalizable Abstractions

Comments
4 min read
The Geometry of Categorical and Hierarchical Concepts in Large Language Models

Comments
4 min read
Knockout: A simple way to handle missing inputs

Comments
4 min read
Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Comments
4 min read
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Comments
4 min read
S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Comments
4 min read
LLMs cannot find reasoning errors, but can correct them given the error location

Comments
5 min read
Gated Linear Attention Transformers with Hardware-Efficient Training

Comments
4 min read
WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning

Comments
3 min read
Deep Learning for Camera Calibration and Beyond: A Survey

Comments
4 min read
Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study

Comments
4 min read
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning

Comments
4 min read
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models

Comments
5 min read
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?

Comments
4 min read
Virtual avatar generation models as world navigators

Comments
4 min read
InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification

Comments
3 min read
Harvard Undergraduate Survey on Generative AI

Comments
2 min read
Will we run out of data? Limits of LLM scaling based on human-generated data

Comments
4 min read
Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs

Comments
3 min read
RAFT: Adapting Language Model to Domain Specific RAG

Comments
4 min read
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

Comments
3 min read
SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales

Comments
4 min read
SqueezeLLM: Dense-and-Sparse Quantization

Comments
4 min read
ChatDev: Communicative Agents for Software Development

Comments
3 min read
Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings

Comments
4 min read
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

Comments
4 min read
Do Llamas Work in English? On the Latent Language of Multilingual Transformers

Comments
4 min read
LLark: A Multimodal Instruction-Following Language Model for Music

Comments
3 min read
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning

Comments
5 min read
REBUS: A Robust Evaluation Benchmark of Understanding Symbols

Comments
4 min read
Empirical influence functions to understand the logic of fine-tuning

Comments
4 min read
Evaluating Quantized Large Language Models

Comments
5 min read
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification

Comments
3 min read
To Believe or Not to Believe Your LLM

Comments
4 min read
Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks

Comments
4 min read
Position: Categorical Deep Learning is an Algebraic Theory of All Architectures

Comments
1 min read
A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism

Comments
4 min read
Large Language Models for Generative Information Extraction: A Survey

Comments
5 min read
Bridging the Empirical-Theoretical Gap in Neural Network Formal Language Learning Using Minimum Description Length

Comments
4 min read
Towards Lightweight Super-Resolution with Dual Regression Learning

Comments
3 min read
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

Comments
4 min read
Relightable Gaussian Codec Avatars

Comments
4 min read
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities

Comments
4 min read
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

Comments
4 min read