Alex Spinov

Together AI Has a Free API — Here's How to Run Open-Source LLMs 4x Cheaper Than OpenAI

A startup CTO told me: 'Our OpenAI bill was $2,000/month for GPT-4. We switched the non-critical tasks to Llama 3 on Together AI — same quality for those use cases, $200/month.'

What Together AI Offers

Together AI provides:

  • Free $5 credit on signup: enough for 10M+ tokens on smaller models
  • 100+ open-source models: Llama 3, Mixtral, CodeLlama, DBRX, and others
  • OpenAI-compatible API: keep the OpenAI SDK and change only the base URL, API key, and model name
  • Fast inference: low-latency serving, particularly for Llama models
  • Fine-tuning: train custom models on your data
  • Embeddings: text embeddings for RAG and search
  • Image generation: Stable Diffusion, FLUX

Quick Start

npm install openai  # Together exposes an OpenAI-compatible API, so the official SDK works
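// Run as an ES module (.mjs or "type": "module"), since this uses top-level await
import OpenAI from 'openai';

// Reuse the OpenAI SDK, pointed at Together's endpoint
const together = new OpenAI({
  baseURL: 'https://api.together.xyz/v1',
  apiKey: process.env.TOGETHER_API_KEY
});

// Send a single-turn chat request to Llama 3 70B
const response = await together.chat.completions.create({
  model: 'meta-llama/Llama-3-70b-chat-hf',
  messages: [{ role: 'user', content: 'Explain Docker in simple terms' }],
  max_tokens: 500
});

// Print the assistant's reply
console.log(response.choices[0].message.content);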
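
The CTO's approach from the intro (keep critical work on OpenAI, move the rest to Together) could be sketched roughly like this. Treat it as an illustration only: the `PROVIDERS` table, the `settingsFor` helper, the `critical` flag, and the `OPENAI_API_KEY` variable name are all hypothetical names of mine, not part of either SDK.

```javascript
// Connection settings per provider. The Together URL and model are the ones
// used in Quick Start; the OpenAI URL is the SDK's default endpoint.
const PROVIDERS = {
  openai: {
    baseURL: 'https://api.openai.com/v1',
    apiKeyEnv: 'OPENAI_API_KEY', // assumed environment variable name
    model: 'gpt-4o'
  },
  together: {
    baseURL: 'https://api.together.xyz/v1',
    apiKeyEnv: 'TOGETHER_API_KEY',
    model: 'meta-llama/Llama-3-70b-chat-hf'
  }
};

// Hypothetical routing rule: critical tasks stay on OpenAI,
// everything else goes to Together.
function settingsFor(task, env = process.env) {
  const provider = task.critical ? PROVIDERS.openai : PROVIDERS.together;
  return {
    baseURL: provider.baseURL,
    apiKey: env[provider.apiKeyEnv],
    model: provider.model
  };
}

// Example: a non-critical summarization job is routed to Together.
const settings = settingsFor({ name: 'summarize-logs', critical: false });
console.log(settings.baseURL, settings.model);
```

You would then pass `baseURL` and `apiKey` to `new OpenAI({ ... })` and `model` to `chat.completions.create`, exactly as in Quick Start.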

REST API

# Chat completion
curl 'https://api.together.xyz/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta-llama/Llama-3-70b-chat-hf",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 200
  }'

# Streaming
curl 'https://api.together.xyz/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta-llama/Llama-3-70b-chat-hf",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": true
  }'

# Embeddings
curl 'https://api.together.xyz/v1/embeddings' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "togethercomputer/m2-bert-80M-8k-retrieval",
    "input": "What is machine learning?"
  }'

# Image generation
curl 'https://api.together.xyz/v1/images/generations' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "stabilityai/stable-diffusion-xl-base-1.0",
    "prompt": "A robot painting a landscape",
    "n": 1
  }'
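
The streaming curl call above has a JavaScript equivalent: with `stream: true`, the OpenAI SDK returns an async iterable of chunks, each carrying a small piece of text in `choices[0].delta.content`. Here is a minimal sketch of collecting those pieces. `collectStream` is a helper name I made up, and the commented usage assumes the `together` client from Quick Start.

```javascript
// Collect the text from a stream of OpenAI-style chat completion chunks.
// `stream` is any async iterable whose items look like
// { choices: [{ delta: { content: '...' } }] }.
// `onToken` is an optional callback invoked with each non-empty piece of text.
async function collectStream(stream, onToken = () => {}) {
  let text = '';
  for await (const chunk of stream) {
    // Some chunks (role announcements, the final chunk) carry no content.
    const token = chunk.choices?.[0]?.delta?.content ?? '';
    if (token) {
      onToken(token);
      text += token;
    }
  }
  return text;
}

// Usage with the `together` client (needs network access and an API key):
//
// const stream = await together.chat.completions.create({
//   model: 'meta-llama/Llama-3-70b-chat-hf',
//   messages: [{ role: 'user', content: 'Tell me a joke' }],
//   stream: true
// });
// const joke = await collectStream(stream, (t) => process.stdout.write(t));
```

Because the helper only depends on the chunk shape, you can exercise it with a fake async generator before spending any credit.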

Code Generation

// CodeLlama for coding tasks
const code = await together.chat.completions.create({
  model: 'codellama/CodeLlama-34b-Instruct-hf',
  messages: [{
    role: 'user',
    content: 'Write a Python function that finds all prime numbers up to n using the Sieve of Eratosthenes'
  }],
  max_tokens: 500
});

// Print the generated code
console.log(code.choices[0].message.content);

Embeddings for RAG

// Generate embeddings for semantic search
const embedding = await together.embeddings.create({
  model: 'togethercomputer/m2-bert-80M-8k-retrieval',
  input: ['How to deploy Docker containers', 'Kubernetes vs Docker Swarm']
});

// Use embeddings for similarity search
const vectors = embedding.data.map(d => d.embedding);
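
Once you have the vectors, the usual way to do similarity search is cosine similarity between a query vector and each document vector. Below is a small sketch of that step in plain JavaScript. `cosineSimilarity` and `rankBySimilarity` are my own helper names, and the tiny 3-dimensional vectors are made-up stand-ins for real embeddings.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns a value in [-1, 1]; returns 0 if either vector is all zeros.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document vector against the query, best match first.
function rankBySimilarity(queryVector, docVectors) {
  return docVectors
    .map((vector, index) => ({ index, score: cosineSimilarity(queryVector, vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy example with made-up vectors (real embeddings have hundreds of dimensions).
const query = [1, 0, 0];
const docs = [[0, 1, 0], [0.9, 0.1, 0], [0.5, 0.5, 0]];
console.log(rankBySimilarity(query, docs).map((r) => r.index));
```

In a real RAG pipeline you would embed the user's question with the same model, then pass it as `queryVector` with the `vectors` array from the snippet above.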

JSON Mode

// Force structured JSON output
const result = await together.chat.completions.create({
  model: 'meta-llama/Llama-3-70b-chat-hf',
  messages: [{
    role: 'user',
    content: 'Extract the product name, price, and rating from: "The Sony WH-1000XM5 headphones are $349 with 4.7 stars"'
  }],
  response_format: { type: 'json_object' }
});
// result.choices[0].message.content holds a JSON string, for example:
// {"product": "Sony WH-1000XM5", "price": 349, "rating": 4.7}
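
JSON mode gives you a string, so you still need to parse it, and it is worth checking that the keys you expect are really there. A minimal sketch follows. `parseProduct` is a hypothetical helper, and the key names `product`, `price`, and `rating` are an assumption taken from the example output above; the model may pick different names unless your prompt spells them out.

```javascript
// Parse a JSON-mode reply and check that the expected keys are present.
// Returns { ok: true, value } on success or { ok: false, error } on failure.
function parseProduct(content) {
  let data;
  try {
    data = JSON.parse(content);
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${err.message}` };
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return { ok: false, error: 'Expected a JSON object' };
  }
  // Assumed key names, matching the example output in this article.
  const missing = ['product', 'price', 'rating'].filter((key) => !(key in data));
  if (missing.length > 0) {
    return { ok: false, error: `Missing keys: ${missing.join(', ')}` };
  }
  return { ok: true, value: data };
}

// Example with the reply shown above.
const parsed = parseProduct('{"product": "Sony WH-1000XM5", "price": 349, "rating": 4.7}');
console.log(parsed.ok, parsed.value.price);
```

In practice you would call it as `parseProduct(result.choices[0].message.content)` and retry or fall back when `ok` is false.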

Cost Comparison (per 1M tokens)

| Model | Provider | Input cost (USD per 1M tokens) |
|---|---|---|
| GPT-4o | OpenAI | $5.00 |
| Claude 3.5 Sonnet | Anthropic | $3.00 |
| Llama 3 70B | Together AI | $0.90 |
| Llama 3 8B | Together AI | $0.20 |
| Mixtral 8x7B | Together AI | $0.60 |
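
To turn these per-million prices into a monthly number, you can run the arithmetic yourself. The sketch below uses only the input prices listed above (output tokens are not covered by this comparison). The helper names and the 100M-token workload are hypothetical, picked purely for illustration.

```javascript
// Input prices in USD per 1M tokens, copied from the comparison above.
const INPUT_PRICE_PER_MILLION = {
  'GPT-4o': 5.0,
  'Claude 3.5 Sonnet': 3.0,
  'Llama 3 70B': 0.9,
  'Llama 3 8B': 0.2,
  'Mixtral 8x7B': 0.6
};

// Look up a model's price, failing loudly on unknown names.
function priceFor(model) {
  const price = INPUT_PRICE_PER_MILLION[model];
  if (price === undefined) throw new Error(`Unknown model: ${model}`);
  return price;
}

// Cost in USD of sending `tokens` input tokens to `model`.
function inputCost(model, tokens) {
  return (tokens / 1_000_000) * priceFor(model);
}

// Roughly how many input tokens a budget in USD buys on `model`.
function tokensForBudget(model, budgetUsd) {
  return Math.round((budgetUsd / priceFor(model)) * 1_000_000);
}

// Hypothetical workload: 100M input tokens per month.
const monthlyTokens = 100_000_000;
for (const model of Object.keys(INPUT_PRICE_PER_MILLION)) {
  console.log(`${model}: $${inputCost(model, monthlyTokens).toFixed(2)}`);
}

// How far the $5 signup credit goes on the smallest model listed.
console.log(tokensForBudget('Llama 3 8B', 5));
```

At these prices the $5 credit covers about 25M input tokens on Llama 3 8B, which is consistent with the "10M+ tokens on smaller models" figure earlier in the post.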

Need AI-powered scraping? Check out my web scraping actors on Apify — intelligent data extraction.

Need custom AI solutions? Email me at spinov001@gmail.com.
