DEV Community

Mike Young

Building indie hacker stuff in my free time, focusing on AI. Launching https://aimodels.fyi - find the right AI model for your project!

Location

Washington, DC

Personal website

https://aimodels.fyi

Education

Purdue

Work

Indie hacking stuff!

Do not think pink elephant!

5 min read

Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting

4 min read
Removing Reflections from RAW Photos

3 min read
An Analysis of the Math Requirements of 199 CS BS/BA Degrees at 158 U.S. Universities

3 min read
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

4 min read
SpaceByte: Towards Deleting Tokenization from Large Language Modeling

5 min read
Pixel is a Barrier: Diffusion Models Are More Adversarially Robust Than We Think

4 min read
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models

4 min read
RETVec: Resilient and Efficient Text Vectorizer

3 min read
From LLM to NMT: Advancing Low-Resource Machine Translation with Claude

4 min read
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

3 min read
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

4 min read
A Survey on Self-Evolution of Large Language Models

3 min read
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

4 min read
Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis

4 min read
Does GPT-4 pass the Turing test?

4 min read
Think before you speak: Training Language Models With Pause Tokens

4 min read
Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming

4 min read
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

4 min read
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

4 min read
Matching Patients to Clinical Trials with Large Language Models

4 min read
Bot or Human? Detecting ChatGPT Imposters with A Single Question

4 min read
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences

4 min read
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

5 min read
HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

4 min read
Microsoft’s Phi-3 model is cool tech, but local LLMs are useless

6 min read
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

4 min read
Interpretable Graph Neural Networks for Tabular Data

4 min read
AI Consciousness is Inevitable: A Theoretical Computer Science Perspective

4 min read
Ten Hard Problems in Artificial Intelligence We Must Get Right

4 min read
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

4 min read
Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves

4 min read
Deep Neural Networks via Complex Network Theory: a Perspective

4 min read
Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding

4 min read
From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function

4 min read
Language Imbalance Can Boost Cross-lingual Generalisation

3 min read
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

4 min read
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

4 min read
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations

3 min read
Online Advertisements with LLMs: Opportunities and Challenges

4 min read
mABC: multi-Agent Blockchain-Inspired Collaboration for root cause analysis in micro-services architecture

4 min read
Information Retrieval with Entity Linking

4 min read
Twenty Constructionist Things to Do with Artificial Intelligence and Machine Learning

4 min read
A decoder-only foundation model for time-series forecasting

4 min read
LLM Agents can Autonomously Exploit One-day Vulnerabilities

4 min read
Efficient Sentiment Analysis: A Resource-Aware Evaluation of Feature Extraction Techniques, Ensembling, and Deep Learning Models

4 min read
A Closer Look at AUROC and AUPRC under Class Imbalance

4 min read
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

4 min read
What are human values, and how do we align AI to them?

4 min read
The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text

4 min read
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

4 min read
Confidential Federated Computations

4 min read
Born With a Silver Spoon? Investigating Socioeconomic Bias in Large Language Models

4 min read
Long-form music generation with latent diffusion

4 min read
Chinchilla Scaling: A replication attempt

3 min read
AutoCodeRover: Autonomous Program Improvement

3 min read
The Illusion of State in State-Space Models

4 min read
Zero-shot Building Age Classification from Facade Image Using GPT-4

4 min read
H2O-Danube-1.8B Technical Report

4 min read
Dataset Reset Policy Optimization for RLHF

4 min read