DEV Community

Mike Young

Building indie hacker stuff in my free time, focusing on AI. Launching https://aimodels.fyi - find the right AI model for your project!

Location: Washington, DC

Personal website: https://aimodels.fyi

Education

Purdue

Work

Indie hacking stuff!

Models That Prove Their Own Correctness

5 min read

ZigZag: Universal Sampling-free Uncertainty Estimation Through Two-Step Inference

4 min read
Surveilling the Masses with Wi-Fi-Based Positioning Systems

5 min read
Lessons from the Trenches on Reproducible Evaluation of Language Models

5 min read
XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection

4 min read
Not All Language Model Features Are Linear

4 min read
Many-Shot In-Context Learning

4 min read
Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks

3 min read
Ephemeral Rollups are All you Need

3 min read
Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling

4 min read
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

4 min read
Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition

4 min read
Chain-of-Thought Reasoning Without Prompting

4 min read
AstroPT: Scaling Large Observation Models for Astronomy

3 min read
Extracting Prompts by Inverting LLM Outputs

4 min read
Bring Your Own KG: Self-Supervised Program Synthesis for Zero-Shot KGQA

4 min read
Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer

4 min read
DarkDNS: Revisiting the Value of Rapid Zone Update

4 min read
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

4 min read
LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B

4 min read
Levels of AGI for Operationalizing Progress on the Path to AGI

4 min read
The CAP Principle for LLM Serving

4 min read
ColorFoil: Investigating Color Blindness in Large Vision and Language Models

4 min read
Increasing the LLM Accuracy for Question Answering: Ontologies to the Rescue!

5 min read
Self-playing Adversarial Language Game Enhances LLM Reasoning

4 min read
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness

4 min read
Training Language Models to Generate Text with Citations via Fine-grained Rewards

3 min read
ARAIDA: Analogical Reasoning-Augmented Interactive Data Annotation

4 min read
InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars

4 min read
Why are Sensitive Functions Hard for Transformers?

4 min read
From Sparse to Soft Mixtures of Experts

4 min read
ExcelFormer: Can a DNN be a Sure Bet for Tabular Prediction?

3 min read
VecFusion: Vector Font Generation with Diffusion

4 min read
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

3 min read
Transformers Can Do Arithmetic with the Right Embeddings

4 min read
Print-N-Grip: A Disposable, Compliant, Scalable and One-Shot 3D-Printed Multi-Fingered Robotic Hand

4 min read
Thermodynamic Natural Gradient Descent

5 min read
Can a Transformer Represent a Kalman Filter?

4 min read
Representation noising effectively prevents harmful fine-tuning on LLMs

5 min read
AI-Assisted Assessment of Coding Practices in Modern Code Review

4 min read
A Declarative System for Optimizing AI Workloads

4 min read
YOLOv10: Real-Time End-to-End Object Detection

3 min read
BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once

4 min read
Demo Paper: A Game Agents Battle Driven by Free-Form Text Commands Using Code-Generation LLM

4 min read
Attention as an RNN

4 min read
Evaluation of the Programming Skills of Large Language Models

3 min read
Deceptive, Disruptive, No Big Deal: Japanese People React to Simulated Dark Commercial Patterns

4 min read
Pareto Optimal Learning for Estimating Large Language Model Errors

4 min read
Fractal Patterns May Illuminate the Success of Next-Token Prediction

5 min read
Influencer Cartels

4 min read
As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making

4 min read
TimeGPT-1

4 min read
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

5 min read
BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry

4 min read
VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search

4 min read
Robust Classification via a Single Diffusion Model

4 min read
Your Transformer is Secretly Linear

4 min read
Track Anything Rapter(TAR)

3 min read
UFO: A UI-Focused Agent for Windows OS Interaction

4 min read
Power Hungry Processing: Watts Driving the Cost of AI Deployment?

4 min read