DEV Community

Mike Young

Building indie hacker stuff in my free time, focusing on AI. Launching https://aimodels.fyi - find the right AI model for your project!

Location: Washington, DC
Personal website: https://aimodels.fyi

Education

Purdue

Work

Indie hacking stuff!

A beginner's guide to the Codellama-13b model by Meta on Replicate
3 min read

A beginner's guide to the Codellama-34b model by Meta on Replicate
2 min read
A beginner's guide to the Video-Retalking model by Xiankgx on Replicate
3 min read
A beginner's guide to the Codellama-34b-Instruct model by Meta on Replicate
3 min read
A beginner's guide to the Codellama-13b-Python model by Meta on Replicate
2 min read
A beginner's guide to the Codellama-7b-Instruct model by Meta on Replicate
2 min read
A beginner's guide to the Codellama-13b-Instruct model by Meta on Replicate
2 min read
A beginner's guide to the Llava-V1.6-Mistral-7b model by Yorickvp on Replicate
3 min read
A beginner's guide to the Llava-V1.6-Vicuna-7b model by Yorickvp on Replicate
3 min read
A beginner's guide to the Llava-V1.6-Vicuna-13b model by Yorickvp on Replicate
3 min read
CNN-Based Equalization for Communications: Achieving Gigabit Throughput with a Flexible FPGA Hardware Architecture
4 min read
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
5 min read
Language Modeling Using Tensor Trains
4 min read
A Survey on the Real Power of ChatGPT
4 min read
Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models
3 min read
You Only Cache Once: Decoder-Decoder Architectures for Language Models
4 min read
Evaluating Large Language Models for Material Selection
4 min read
Linearizing Large Language Models
4 min read
Motorway: Seamless high speed BFT
3 min read
From Interpolation to Extrapolation: Complete Length Generalization for Arithmetic Transformers
4 min read
Memory Mosaics
4 min read
The Power of Training: How Different Neural Network Setups Influence the Energy Demand
4 min read
Chain of Thoughtlessness: An Analysis of CoT in Planning
4 min read
The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates
4 min read
LLMs Can Patch Up Missing Relevance Judgments in Evaluation
4 min read
Neural Networks Make Approximately Independent Errors Over Repeated Training
4 min read
Large Language Models can Strategically Deceive their Users when Put Under Pressure
4 min read
TIM: An Efficient Temporal Interaction Module for Spiking Transformer
5 min read
OptPDE: Discovering Novel Integrable Systems via AI-Human Collaboration
4 min read
A beginner's guide to the Meta-Llama-3-8b model by Meta on Replicate
2 min read
Mitigating LLM Hallucinations via Conformal Abstention
4 min read
Can't say cant? Measuring and Reasoning of Dark Jargons in Large Language Models
3 min read
The Psychosocial Impacts of Generative AI Harms
4 min read
Generative Multimodal Models are In-Context Learners
4 min read
Assemblage: Automatic Binary Dataset Construction for Machine Learning
4 min read
HCC Is All You Need: Alignment-The Sensible Kind Anyway-Is Just Human-Centered Computing
4 min read
xLSTM: Extended Long Short-Term Memory
3 min read
CascadedGaze: Efficiency in Global Context Extraction for Image Restoration
3 min read
Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents
4 min read
TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness
3 min read
AlphaMath Almost Zero: process Supervision without process
4 min read
FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes
4 min read
SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking
4 min read
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
4 min read
A Simple and Effective Pruning Approach for Large Language Models
3 min read
Prompt Design and Engineering: Introduction and Advanced Methods
4 min read
ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing
3 min read
SAR image matching algorithm based on multi-class features
4 min read
Generative AI Beyond LLMs: System Implications of Multi-Modal Generation
3 min read
PopulAtion Parameter Averaging (PAPA)
3 min read
Porting HPC Applications to AMD Instinct™ MI300A Using Unified Memory and OpenMP
4 min read
Network reconstruction via the minimum description length principle
3 min read
Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks
4 min read
Are aligned neural networks adversarially aligned?
4 min read
Poisoning Web-Scale Training Datasets is Practical
3 min read
Circuit Component Reuse Across Tasks in Transformer Language Models
4 min read
R-Tuning: Instructing Large Language Models to Say 'I Don't Know'
4 min read
FMGS: Foundation Model Embedded 3D Gaussian Splatting for Holistic 3D Scene Understanding
4 min read
Beyond Memorization: Violating Privacy Via Inference with Large Language Models
3 min read
FLAME: Factuality-Aware Alignment for Large Language Models
4 min read