💰 Cut Your Claude API Costs: Save Up to 90% with Official Techniques!
Still paying full price for the Claude API? Three officially supported features are all it takes to cut your API costs by up to 90%.
🎯 Introduction: why cost reduction matters
At Washinmura, our animal shelter, we run daily AI analysis on photos and videos of our 28 cats and dogs. Animal identification, content generation, automated posting... our Claude API calls add up to several hundred per day.
When the first month's bill arrived, I was honestly shocked: "Can we really keep this up...?"
But after digging into Anthropic's official documentation, I found three cost-saving techniques. Today we handle the same workload at over 80% lower cost.
In this post I'll share everything we learned.
📚 The three cost-saving techniques
1️⃣ Batch API (50% off)
Best for workloads that don't need an immediate response!
The Batch API gives you a 50% discount in exchange for processing your requests asynchronously, within 24 hours.
Use cases
- Bulk image analysis
- Batch translation jobs
- Nightly scheduled processing
- Report generation
Python example
```python
import anthropic

client = anthropic.Anthropic()

def create_batch_request(prompts: list[str]) -> str:
    """Process multiple prompts as a single batch."""
    requests = []
    for i, prompt in enumerate(prompts):
        requests.append({
            "custom_id": f"request-{i}",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 1024,
                "messages": [
                    {"role": "user", "content": prompt}
                ]
            }
        })
    # Create the batch (note: batches live under client.messages)
    batch = client.messages.batches.create(requests=requests)
    print(f"✅ Batch created: {batch.id}")
    print(f"📋 Request count: {len(prompts)}")
    print(f"💰 Savings: 50%!")
    return batch.id

# Usage
prompts = [
    "Describe this cat's distinguishing features.",
    "Identify this dog's breed.",
    "Analyze this animal's behavior."
]
batch_id = create_batch_request(prompts)
```
💡 Tips
- Results come back within 24 hours
- The bigger the batch, the bigger the payoff
- Check status with client.messages.batches.retrieve(batch_id)
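Since batch results arrive asynchronously, you eventually need to collect them. Here is a minimal polling sketch, assuming the SDK's `client.messages.batches.retrieve` / `client.messages.batches.results` methods and the `"ended"` processing status described in the docs; the client is passed in so it works with whatever client you created above:

```python
import time

def wait_for_batch(client, batch_id: str, poll_seconds: int = 60):
    """Poll until the batch reaches the "ended" status, then return its results."""
    while True:
        batch = client.messages.batches.retrieve(batch_id)
        if batch.processing_status == "ended":
            # Each result carries a .custom_id and a .result
            # (succeeded / errored / expired), so you can match
            # answers back to the prompts you submitted.
            return list(client.messages.batches.results(batch_id))
        time.sleep(poll_seconds)
```

In production you would also cap the number of polls and handle errored results per `custom_id`.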
2️⃣ Prompt Caching (up to 90% off)
Cache the prompts you reuse!
Sending the same system prompt or the same long context over and over? With Prompt Caching, the cached portion of your input costs 90% less on every cache hit.
Use cases
- Long system prompts
- Document analysis (multiple questions about the same document)
- Reusing few-shot examples
- RAG context
Python example
```python
import anthropic

client = anthropic.Anthropic()

def analyze_with_cache(document: str, questions: list[str]):
    """Ask multiple questions about the same document, leveraging the cache."""
    results = []
    for i, question in enumerate(questions):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            system=[
                {
                    "type": "text",
                    "text": "You are an expert in animal behavior analysis.",
                },
                {
                    "type": "text",
                    "text": document,
                    "cache_control": {"type": "ephemeral"}  # 👈 mark for caching
                }
            ],
            messages=[
                {"role": "user", "content": question}
            ]
        )
        # Inspect cache usage
        usage = response.usage
        cache_read = getattr(usage, 'cache_read_input_tokens', 0)
        cache_creation = getattr(usage, 'cache_creation_input_tokens', 0)
        if cache_read > 0:
            print(f"✅ Question {i+1}: cache hit! {cache_read} tokens (90% off)")
        elif cache_creation > 0:
            print(f"📝 Question {i+1}: cache created, {cache_creation} tokens")
        results.append(response.content[0].text)
    return results
```
```python
# Usage: multiple questions about one long document
document = """
[Shelter animal profiles - a 10,000-character document...]
"""
questions = [
    "Tell me about Jelly's personality.",
    "What is Gold's favorite food?",
    "What are Ariel's characteristic behavior patterns?"
]
results = analyze_with_cache(document, questions)
# → From the second question on, the cache hits and you save 90%!
```
💡 Tips
- The cache lives for 5 minutes
- Content must be at least 1024 tokens to be cached
- Just add cache_control: {"type": "ephemeral"}
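To see what a cache hit is actually worth, you can price a request from its usage fields. A minimal sketch, assuming Claude Sonnet 4 list prices ($3/MTok input, $3.75/MTok cache write, $0.30/MTok cache read); check the current pricing page before relying on these numbers:

```python
# Prices are per million tokens (assumed Claude Sonnet 4 list prices).
PRICE_INPUT = 3.00        # uncached input tokens
PRICE_CACHE_WRITE = 3.75  # tokens written to the 5-minute cache (25% premium)
PRICE_CACHE_READ = 0.30   # tokens read from the cache (90% discount)

def input_cost(input_tokens: int, cache_creation: int, cache_read: int) -> float:
    """Dollar cost of the input side of a single request."""
    return (
        input_tokens * PRICE_INPUT
        + cache_creation * PRICE_CACHE_WRITE
        + cache_read * PRICE_CACHE_READ
    ) / 1_000_000

# A 10,000-token document: the first request writes the cache,
# later requests read it at a tenth of the normal input price.
first = input_cost(input_tokens=50, cache_creation=10_000, cache_read=0)
later = input_cost(input_tokens=50, cache_creation=0, cache_read=10_000)
print(f"first: ${first:.6f}, later: ${later:.6f}")
# → first: $0.037650, later: $0.003150
```

Note the trade-off visible in the numbers: the first write costs 25% more than plain input, so caching only pays off when the context is reused within the cache lifetime.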
3️⃣ Extended Thinking (thinking tokens ~80% off)
Best for complex reasoning tasks!
Extended Thinking is a feature that gives Claude dedicated time to "think" before answering; thinking tokens come at a special rate of roughly 80% off the normal price.
Use cases
- Complex logical reasoning
- Code generation and debugging
- Mathematical problem solving
- Strategic planning
Python example
```python
import anthropic

client = anthropic.Anthropic()

def solve_complex_problem(problem: str):
    """Solve a hard problem with Extended Thinking."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=16000,
        thinking={
            "type": "enabled",
            "budget_tokens": 10000  # token budget for thinking
        },
        messages=[
            {"role": "user", "content": problem}
        ]
    )
    # Separate the thinking process from the final answer
    thinking_content = None
    answer_content = None
    for block in response.content:
        if block.type == "thinking":
            thinking_content = block.thinking
        elif block.type == "text":
            answer_content = block.text
    # Cost accounting
    usage = response.usage
    input_tokens = usage.input_tokens
    output_tokens = usage.output_tokens
    print(f"📊 Input tokens: {input_tokens}")
    print(f"📊 Output tokens: {output_tokens}")
    print(f"💭 Thinking tokens at ~80% off!")
    return {
        "thinking": thinking_content,
        "answer": answer_content
    }
```
```python
# Usage
problem = """
Design an optimal feeding schedule for the 28 animals at our shelter.
Constraints:
- 23 cats, 5 dogs
- Two feedings a day, morning and evening
- 3 animals need special diets
- 2 staff members
"""
result = solve_complex_problem(problem)
print(f"\n🎯 Answer:\n{result['answer']}")
```
💡 Tips
- Thinking tokens are not included in the visible output
- Control the amount of thinking with budget_tokens
- The more complex the problem, the bigger the payoff
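One constraint worth encoding: the docs require budget_tokens to be at least 1024 and strictly less than max_tokens. A small guard like this (a hypothetical helper, not part of the SDK) catches misconfiguration before the API call:

```python
def thinking_params(budget_tokens: int, max_tokens: int) -> dict:
    """Build a `thinking` config, enforcing the documented limits:
    budget_tokens must be >= 1024 and strictly less than max_tokens."""
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}

# Matches the call above: 10,000 thinking tokens inside a 16,000-token response
params = thinking_params(10000, 16000)
```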
📊 Savings report: real-world results
Here are our shelter's actual numbers for one month:

💰 Claude API Cost Savings Report

📊 Workload breakdown
- Animal identification (Batch): 3,000 calls/month → 50% off
- Content generation (Cache): 1,500 calls/month → 90% off
- Strategic planning (Thinking): 100 calls/month → 80% off

💵 Cost comparison
- Full price: $450.00
- After optimization: $89.50
- Savings: $360.50 (80.1% reduction!)

🎯 Effect per technique
- Batch API: -$75.00 (50% reduction)
- Prompt Caching: -$243.00 (90% reduction)
- Extended Thinking: -$42.50 (80% reduction)

✨ Projected annual savings: $4,326.00
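The totals in the report follow directly from the per-technique savings, which is easy to double-check:

```python
# Re-derive the report's totals from the per-technique savings.
full_price = 450.00
savings = {"Batch API": 75.00, "Prompt Caching": 243.00, "Extended Thinking": 42.50}

total_saved = sum(savings.values())          # 360.50
after = full_price - total_saved             # 89.50
reduction = total_saved / full_price * 100   # 80.1%
annual = total_saved * 12                    # 4326.00

print(f"after: ${after:.2f}, reduction: {reduction:.1f}%, annual: ${annual:.2f}")
# → after: $89.50, reduction: 80.1%, annual: $4326.00
```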
🛠️ Bonus: a unified helper class
Let's wrap the three techniques in one convenient class!
```python
import anthropic
from dataclasses import dataclass
from typing import Optional
from enum import Enum

class OptimizationMode(Enum):
    BATCH = "batch"        # 50% off, within 24 hours
    CACHED = "cached"      # 90% off, repeated processing
    THINKING = "thinking"  # 80% off, complex reasoning

@dataclass
class CostOptimizedRequest:
    """A cost-optimized API request."""
    mode: OptimizationMode
    prompt: str
    system_prompt: Optional[str] = None
    cache_context: Optional[str] = None
    thinking_budget: int = 10000

class ClaudeCostOptimizer:
    """Cost optimizer for the Claude API."""

    def __init__(self):
        self.client = anthropic.Anthropic()
        self.stats = {
            "total_requests": 0,
            "estimated_savings": 0.0
        }

    def process(self, request: CostOptimizedRequest):
        """Dispatch to the handler for the chosen optimization mode."""
        self.stats["total_requests"] += 1
        if request.mode == OptimizationMode.BATCH:
            return self._process_batch(request)
        elif request.mode == OptimizationMode.CACHED:
            return self._process_cached(request)
        elif request.mode == OptimizationMode.THINKING:
            return self._process_thinking(request)

    def _process_batch(self, request):
        """Batch API path (50% off)."""
        batch = self.client.messages.batches.create(
            requests=[{
                "custom_id": "opt-request",
                "params": {
                    "model": "claude-sonnet-4-20250514",
                    "max_tokens": 1024,
                    "messages": [{"role": "user", "content": request.prompt}]
                }
            }]
        )
        self.stats["estimated_savings"] += 0.50  # rough estimate
        return {"batch_id": batch.id, "mode": "batch", "savings": "50%"}

    def _process_cached(self, request):
        """Prompt Caching path (90% off)."""
        system = []
        if request.system_prompt:
            system.append({"type": "text", "text": request.system_prompt})
        if request.cache_context:
            system.append({
                "type": "text",
                "text": request.cache_context,
                "cache_control": {"type": "ephemeral"}
            })
        kwargs = {}
        if system:
            kwargs["system"] = system  # omit the param entirely when empty
        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=[{"role": "user", "content": request.prompt}],
            **kwargs
        )
        cache_read = getattr(response.usage, 'cache_read_input_tokens', 0)
        if cache_read > 0:
            self.stats["estimated_savings"] += 0.90
        return {"response": response.content[0].text, "mode": "cached", "savings": "90%"}

    def _process_thinking(self, request):
        """Extended Thinking path (80% off)."""
        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=16000,
            thinking={"type": "enabled", "budget_tokens": request.thinking_budget},
            messages=[{"role": "user", "content": request.prompt}]
        )
        self.stats["estimated_savings"] += 0.80
        answer = next(
            (b.text for b in response.content if b.type == "text"),
            None
        )
        return {"response": answer, "mode": "thinking", "savings": "80%"}

    def get_stats(self):
        """Return usage statistics."""
        return self.stats

# Usage
optimizer = ClaudeCostOptimizer()

# Batch (bulk work with no urgency)
result1 = optimizer.process(CostOptimizedRequest(
    mode=OptimizationMode.BATCH,
    prompt="Identify the animal in this image."
))

# Cache (several questions against the same context)
result2 = optimizer.process(CostOptimizedRequest(
    mode=OptimizationMode.CACHED,
    prompt="What are Jelly's distinguishing traits?",
    cache_context="[The shelter's animal database...]"
))

# Thinking (complex reasoning)
result3 = optimizer.process(CostOptimizedRequest(
    mode=OptimizationMode.THINKING,
    prompt="Design an optimal health-management plan for our 28 animals.",
    thinking_budget=15000
))

print(f"📊 Stats: {optimizer.get_stats()}")
```
🎯 Which technique should you use?
Decision flowchart:

Need an immediate response?
├─ No → Batch API (50% off) ✅
└─ Yes
   ├─ Reusing the same context repeatedly?
   │   └─ Yes → Prompt Caching (90% off) ✅
   └─ Need complex reasoning?
      ├─ Yes → Extended Thinking (80% off) ✅
      └─ No → Standard API
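The flowchart can be sketched as a small selector function (a hypothetical helper that mirrors the chart, not anything from the SDK):

```python
def choose_technique(needs_immediate: bool, reuses_context: bool,
                     complex_reasoning: bool) -> str:
    """Map the flowchart's three questions to a recommended technique."""
    if not needs_immediate:
        return "Batch API"            # 50% off
    if reuses_context:
        return "Prompt Caching"       # 90% off
    if complex_reasoning:
        return "Extended Thinking"    # 80% off
    return "Standard API"

print(choose_technique(needs_immediate=False, reuses_context=False,
                       complex_reasoning=False))
# → Batch API
```

The question order matters: batching is checked first because its discount applies regardless of content, while caching and thinking depend on what the request contains.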
| Scenario | Recommended technique | Savings |
|---|---|---|
| Nightly batch jobs | Batch API | 50% |
| Document analysis | Prompt Caching | 90% |
| Code generation | Extended Thinking | 80% |
| Chatbots | Prompt Caching | 90% |
| Bulk image analysis | Batch API | 50% |
| Strategic planning | Extended Thinking | 80% |
🔗 References
- Anthropic docs: Batch API
- Anthropic docs: Prompt Caching
- Anthropic docs: Extended Thinking
- Claude API Pricing
🐾 Closing thoughts
At Washinmura, these techniques let us run AI analysis for our 28 cats and dogs in a sustainable way.
"With the power of AI, make the animals happier."
Cutting costs is an essential step toward making that dream real.
Please give these techniques a try yourself, and if you have questions, feel free to ask in the comments!
🐾 by Washinmura washinmura.jp
Together, healing the whole world.