Thanawat Wongchai

Posted on May 8 • Originally published at apidog.com

How to Use the Grok 4.3 API

xAI rolled out Grok 4.3 in stages: beta on April 17, 2026, API access on April 30, and GA on May 6, 2026. The highlights for developers are a 1,000,000-token context window, native video input, always-on reasoning, and prices roughly 40% lower than Grok 4.20. If you are still on the grok-3 series or older grok-4 models, plan your migration before eight legacy models are retired on May 15, 2026.

This article is a hands-on guide to Grok 4.3 with working code: the endpoint, authentication, the OpenAI-compatible base URL, reasoning_effort, video input, function calling, and testing requests with Apidog.

For the voice side of the same release, see How to Use Grok Voice for Free. To compare it with OpenAI's flagship voice model, see Grok Voice vs GPT-Realtime.

TL;DR

Grok 4.3 reached GA on May 6, 2026
Eight legacy models will be retired on May 15, 2026
Pricing: $1.25 / 1M input tokens, $2.50 / 1M output tokens, cached input $0.20 / 1M
1M-token context window, native video input, and always-on reasoning
Reasoning levels: low, medium, high; the default is medium
Main endpoint: https://api.x.ai/v1/chat/completions
Works with the OpenAI SDK by changing base_url
Throughput of about 159 tokens/second on the standard tier
Intelligence Index of 53 according to Artificial Analysis, ranked 10th of 146 models worldwide
Use Apidog to save requests, switch reasoning_effort with environment variables, and replay requests side by side

What Changed in Grok 4.3

The changes that matter most to dev teams:

Prices down about 40%

Input is 37.5% cheaper than on Grok 4.20 and output is 58.3% cheaper. Cached input costs $0.20/1M, which makes long, stable system prompts much cheaper.

1M-token context window

Up from 256k in Grok 4.20. This suits a mid-sized codebase, long transcripts, legal documents, or multi-turn agent context in a single request.

Native video input

Grok 4.3 is the first Grok model that accepts a video URL directly and reasons over the video frames.

Always-on reasoning

Every request includes baseline reasoning. The reasoning_effort parameter adjusts the depth, but reasoning cannot be switched off entirely.

Better agent workflows

The GDPval-AA score rose by 300 Elo points over Grok 4.20, which makes tool selection and multi-step workflows more stable.

An Intelligence Index of 53 puts Grok 4.3 above the average of 35 for models in the same price range and ranks it 10th of the 146 models tracked.

Prerequisites

Before calling the API, prepare:

An xAI Console account at console.x.ai
An API key on an active billing tier
A project-scoped API key is recommended for production
The OpenAI SDK or the xAI SDK installed
An API client for replaying requests and comparing results

Set the environment variable:

export XAI_API_KEY="xai-..."

Endpoint and Authentication

Grok 4.3 uses an OpenAI-compatible Chat Completions API through xAI's base URL:

POST https://api.x.ai/v1/chat/completions

Standard headers:

Authorization: Bearer $XAI_API_KEY
Content-Type: application/json

Because the API is OpenAI-compatible, you can keep your existing OpenAI SDK code and change only base_url and model.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

response = client.chat.completions.create(
    model="grok-4.3",
    messages=[
        {
            "role": "user",
            "content": "Summarize the trade-offs of GraphQL vs REST in three bullets."
        }
    ],
    reasoning_effort="medium",
)

print(response.choices[0].message.content)

With the xAI SDK the call looks similar; the main differences are the package import and the client setup.
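Where no SDK is installed, the same request can be assembled with the standard library alone. The following is a minimal sketch: `build_chat_request` and `send` are hypothetical helper names, and the endpoint, headers, and body fields are the ones quoted in this article.

```python
import json
import urllib.request

XAI_CHAT_URL = "https://api.x.ai/v1/chat/completions"  # endpoint quoted in this article


def build_chat_request(api_key, prompt, reasoning_effort="medium"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    body = {
        "model": "grok-4.3",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,
    }
    return urllib.request.Request(
        XAI_CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any tokens are billed.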

Request Parameters

Parameter	Type	Values	Notes
`model`	string	`grok-4.3`	Required
`messages`	array	OpenAI message format	Required. Supports `system`, `user`, `assistant`
`reasoning_effort`	string	`low`, `medium`, `high`	Optional. Defaults to `medium`
`max_tokens`	int	1–32768	Caps output length
`temperature`	float	0.0–2.0	Defaults to 1.0
`top_p`	float	0.0–1.0	Nucleus sampling
`stream`	bool	`true` / `false`	Enables server-sent events
`tools`	array	OpenAI tools format	Used for function calling
`tool_choice`	string / object	`auto`, `none`, or a specific tool	Same semantics as OpenAI
`response_format`	object	`{ "type": "json_object" }`	Structured output
`seed`	int	Any value	For reproducibility with `temperature: 0`

A curl example you can run as-is:

curl https://api.x.ai/v1/chat/completions \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4.3",
    "messages": [
      {
        "role": "system",
        "content": "You are a senior backend engineer."
      },
      {
        "role": "user",
        "content": "Review this query plan and flag the bottleneck."
      }
    ],
    "reasoning_effort": "high"
  }'

The response uses the standard OpenAI shape:

The main text is at choices[0].message.content
Token usage is in usage
The token breakdown includes prompt_tokens, completion_tokens, reasoning_tokens, and total_tokens
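The fields listed above can be pulled out of a parsed response as sketched below. `summarize_usage` is a hypothetical helper and the sample body is invented for illustration. Because the exact position of `reasoning_tokens` is an assumption here, the helper looks for it both at the top level of `usage` and under `completion_tokens_details`, which is where OpenAI's format places it.

```python
def summarize_usage(response):
    """Return token counts from a chat completions response parsed as a dict."""
    usage = response.get("usage", {})
    # Assumption: reasoning_tokens may be top-level (as described in this article)
    # or nested under completion_tokens_details (as in OpenAI's format).
    details = usage.get("completion_tokens_details") or {}
    reasoning = usage.get("reasoning_tokens", details.get("reasoning_tokens", 0))
    return {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "reasoning_tokens": reasoning,
        "total_tokens": usage.get("total_tokens", 0),
    }


# Hypothetical response body, shaped like the fields listed above.
sample = {
    "choices": [{"message": {"role": "assistant", "content": "..."}}],
    "usage": {
        "prompt_tokens": 120,
        "completion_tokens": 80,
        "reasoning_tokens": 45,
        "total_tokens": 245,
    },
}

answer_text = sample["choices"][0]["message"]["content"]
print(summarize_usage(sample))
```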

Choosing the Right `reasoning_effort`

Reasoning is always on in Grok 4.3, but you can choose among three levels:

`low`

Use it for fast, straightforward work such as:

classification
summarization
rule extraction
simple Q&A
intent routing

Best when latency matters more than deep reasoning.

`medium`

The default. It suits most production traffic, such as:

customer support
simple function calling
data analysis
single-step tool use
code explanation

`high`

Use it for tasks that need planning or multi-step reasoning, such as:

multi-step agents
long code reviews
math reasoning
workflows that use several tools
debugging from a large amount of context

An example that compares the results:

for effort in ["low", "medium", "high"]:
    response = client.chat.completions.create(
        model="grok-4.3",
        messages=[
            {
                "role": "user",
                "content": "Find the likely root cause from this incident timeline: ..."
            }
        ],
        reasoning_effort=effort,
    )

    print("EFFORT:", effort)
    print(response.choices[0].message.content)
    print(response.usage)

Do not expect to cut costs by turning reasoning off entirely: Grok 4.3 has no no-reasoning mode, and even low still performs baseline reasoning.
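One way to apply these levels consistently is a small routing table in application code. This is only a sketch: the task labels and the helper names `pick_reasoning_effort` and `build_params` are hypothetical, and the grouping simply mirrors the lists in this section.

```python
# Hypothetical task labels, grouped as in the lists in this section.
EFFORT_BY_TASK = {
    "classification": "low",
    "summarization": "low",
    "intent_routing": "low",
    "customer_support": "medium",
    "code_explanation": "medium",
    "multi_step_agent": "high",
    "long_code_review": "high",
    "math_reasoning": "high",
}


def pick_reasoning_effort(task_type):
    """Return the reasoning_effort for a task label, defaulting to "medium"."""
    return EFFORT_BY_TASK.get(task_type, "medium")


def build_params(task_type, messages):
    """Assemble keyword arguments for client.chat.completions.create()."""
    return {
        "model": "grok-4.3",
        "messages": messages,
        "reasoning_effort": pick_reasoning_effort(task_type),
    }
```

A call site would then use `client.chat.completions.create(**build_params("classification", messages))`, so the level for a whole class of traffic can be changed in one place.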

Function Calling

Grok 4.3 accepts the OpenAI tools format directly:

Declare the tools
Send the request
Read tool_calls from the assistant message
Execute the tool in your application
Send the result back with role tool

Example:

tools = [
    {
        "type": "function",
        "function": {
            "name": "lookup_user",
            "description": "Look up a user by ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "user_id": {
                        "type": "string"
                    }
                },
                "required": ["user_id"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="grok-4.3",
    messages=[
        {
            "role": "user",
            "content": "Find user u_42 and tell me their last login."
        }
    ],
    tools=tools,
    reasoning_effort="medium",
)

tool_calls = response.choices[0].message.tool_calls

# tool_calls is None when the model answers without calling a tool
for call in tool_calls or []:
    print(call.function.name)
    print(call.function.arguments)

After executing the function, send the result back:

messages = [
    {
        "role": "user",
        "content": "Find user u_42 and tell me their last login."
    },
    response.choices[0].message,
    {
        "role": "tool",
        "tool_call_id": tool_calls[0].id,
        "content": '{"user_id":"u_42","last_login":"2026-05-04T10:31:00Z"}'
    }
]

final_response = client.chat.completions.create(
    model="grok-4.3",
    messages=messages,
    tools=tools,
)

print(final_response.choices[0].message.content)

Thanks to the agent workflow improvements, Grok 4.3 picks tools more accurately, makes fewer unnecessary repeat calls, and recovers better from tool errors. If you are testing MCP-style flows, see the guide Testing MCP Servers in Apidog.
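The "execute the tool in your application" step can be sketched as a small dispatcher. Everything here is illustrative: `TOOL_REGISTRY` and `run_tool_call` are hypothetical names, and `lookup_user` is a stand-in that returns the sample record from this section instead of querying a real system.

```python
import json


def lookup_user(user_id):
    """Stand-in for a real lookup; returns the sample record from this section."""
    return {"user_id": user_id, "last_login": "2026-05-04T10:31:00Z"}


# Registry of local functions the model is allowed to call.
TOOL_REGISTRY = {"lookup_user": lookup_user}


def run_tool_call(call_id, name, arguments_json):
    """Execute one tool call and return the role="tool" message to send back."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        result = {"error": f"unknown tool: {name}"}
    else:
        try:
            # The model sends arguments as a JSON string.
            result = func(**json.loads(arguments_json))
        except (TypeError, ValueError) as exc:
            result = {"error": f"bad arguments: {exc}"}
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}
```

With the SDK objects from the example, each call would be handled as `run_tool_call(call.id, call.function.name, call.function.arguments)`. Returning errors as tool output, rather than raising, gives the model a chance to correct its arguments on the next turn.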

Video Input

Grok 4.3 is the first Grok model with native video support. Pass the video URL in a content block:

response = client.chat.completions.create(
    model="grok-4.3",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe what happens in this clip and flag any anomalies."
                },
                {
                    "type": "video_url",
                    "video_url": {
                        "url": "https://example.com/clip.mp4"
                    }
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)

Things to watch:

Video tokens are counted as input tokens
Long videos consume context quickly
If cost matters, trim or compress the video before sending
You do not need to extract keyframes yourself, because the model reasons over the video frames directly
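If video messages are built in several places, a small helper keeps the content-block shape consistent. This is a sketch: `video_message` is a hypothetical name, and rejecting anything other than an http(s) URL is an assumption of this example rather than a documented API rule.

```python
from urllib.parse import urlparse


def video_message(prompt, video_url):
    """Build a user message with a text block and a video_url block."""
    parsed = urlparse(video_url)
    # Assumption: only absolute http(s) URLs are passed through.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an http(s) URL, got: {video_url!r}")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "video_url", "video_url": {"url": video_url}},
        ],
    }
```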

How to Use the 1M-Token Context Window

The 1M-token context window suits jobs that used to need several rounds of chunking, such as:

1. Whole-codebase code review

Put the relevant material in a single prompt:

diff
modified files
test output
lint results
architectural notes

Example prompt:

You are reviewing a backend pull request.

Context:
- Repository structure
- Modified files
- Test output
- Lint output

Task:
1. Find correctness issues.
2. Find security risks.
3. Suggest minimal patches.
4. Return output as JSON.

2. Long-document Q&A

Use it with:

200-page contracts
policy documents
financial reports
technical specs

Ask specific questions rather than requesting broad summaries, to save output tokens.

3. Long-running agent memory

Keep the agent's conversation history in context for personalization or decision continuity.

Cached input at $0.20/1M makes long, stable prompts much cheaper. For example, a cacheable 400k-token system prompt costs $0.08 per cached call instead of $0.50 as fresh input.
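The arithmetic behind those figures can be checked with a short estimator. This is a sketch using the prices quoted in this article; `estimate_cost` is a hypothetical helper, and how reasoning tokens are billed is not covered here, so pass whatever your usage report counts as output.

```python
# Prices in USD per 1M tokens, as quoted in this article.
PRICE_INPUT = 1.25
PRICE_CACHED_INPUT = 0.20
PRICE_OUTPUT = 2.50


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one call; cached_tokens is the cached part of the input."""
    fresh_tokens = input_tokens - cached_tokens
    return (
        fresh_tokens * PRICE_INPUT
        + cached_tokens * PRICE_CACHED_INPUT
        + output_tokens * PRICE_OUTPUT
    ) / 1_000_000


print(round(estimate_cost(400_000, 0), 2))                         # 0.5
print(round(estimate_cost(400_000, 0, cached_tokens=400_000), 2))  # 0.08
```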

Migrating from Older Grok Models

Eight legacy Grok models will be retired on May 15, 2026 at 12:00 PT.

Minimum migration steps:

Find existing model strings such as grok-3 or grok-4
Change them to grok-4.3
Add or adjust reasoning_effort
Test output format and latency
Deploy before the deadline

Example:

 response = client.chat.completions.create(
-    model="grok-4",
+    model="grok-4.3",
     messages=messages,
+    reasoning_effort="medium",
 )

What to retest:

Latency: reasoning is always on in Grok 4.3. If the old path did not need reasoning, start with low
Output format: Grok 4.3 tends to give more structured answers. If a backend parses responses with regex, run regression tests before switching
Token usage: check usage.reasoning_tokens to estimate the real cost
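If request parameters are built in one place, the model swap can be applied there during the transition. This is a minimal sketch: `migrate_request` is a hypothetical helper, and `LEGACY_MODELS` contains only the two names mentioned in this article, not the full list of eight retired models.

```python
# Assumption: only the two legacy names mentioned in this article are listed;
# extend the set with whichever retired model strings your code actually uses.
LEGACY_MODELS = {"grok-3", "grok-4"}


def migrate_request(params, default_effort="medium"):
    """Return a copy of request kwargs pointing at grok-4.3.

    Legacy requests get an explicit reasoning_effort unless one is already set.
    """
    migrated = dict(params)
    if migrated.get("model") in LEGACY_MODELS:
        migrated["model"] = "grok-4.3"
        migrated.setdefault("reasoning_effort", default_effort)
    return migrated
```

For latency-sensitive paths that previously ran without reasoning, passing `default_effort="low"` matches the advice above.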

For pricing on the OpenAI side, see GPT-5.5 Pricing. For a direct reasoning-model comparison, see How to Use the GPT-5.5 API.

Testing in Apidog

A quick way to test Grok 4.3 against a real use case:

Create an environment in Apidog
- XAI_API_KEY = xai-...
- BASE_URL = https://api.x.ai/v1
Create a request:

   POST {{BASE_URL}}/chat/completions

Set the headers:

   Authorization: Bearer {{XAI_API_KEY}}
   Content-Type: application/json

Create a body for the different reasoning levels:

   {
     "model": "grok-4.3",
     "messages": [
       {
         "role": "system",
         "content": "You are a senior backend engineer."
       },
       {
         "role": "user",
         "content": "Review this API design and list production risks."
       }
     ],
     "reasoning_effort": "medium"
   }

Clone the request into three copies:
- reasoning_effort: low
- reasoning_effort: medium
- reasoning_effort: high
Run them and compare:
- response quality
- latency
- usage.reasoning_tokens
- usage.total_tokens
Add a request that uses the OpenAI base URL to compare against GPT-5.5 on the same prompt

Download Apidog to manage collections and replay requests across providers with ease. For more API testing strategies, see API Testing Tools for QA Engineers.

Rate Limits

Rate limits in the xAI Console start at a baseline in the thousands of requests/minute on Tier 1 and go up to the hundreds of thousands of requests/minute on enterprise tiers. Actual numbers may change, so always check the console dashboard.

The 159 tokens/second throughput that xAI quotes is the output speed per stream, not the account's aggregate throughput. Concurrent requests scale within your tier's limits.

When you hit a rate limit, the API returns 429 with a retry-after header. Use exponential backoff:

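import time
from openai import RateLimitError

for attempt in range(5):
    try:
        response = client.chat.completions.create(
            model="grok-4.3",
            messages=[{"role": "user", "content": "Summarize this log: ..."}],
        )
        break
    except RateLimitError:
        # Wait 1, 2, 4, 8, 16 seconds between attempts
        wait = 2 ** attempt
        time.sleep(wait)
else:
    # Every attempt was rate limited, so there is no response to use
    raise RuntimeError("Still rate limited after 5 attempts")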
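A reusable variant can prefer the server's retry-after hint and fall back to exponential backoff with jitter. This is a sketch, not xAI's recommended client logic: `call_with_backoff` and its parameters are hypothetical, and the error check and hint reader are injected so the function does not depend on a particular SDK.

```python
import random
import time


def call_with_backoff(make_request, is_rate_limit, retry_after=None,
                      max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call make_request(), retrying on rate-limit errors.

    is_rate_limit(exc) decides whether an exception is retryable;
    retry_after(exc) may return a server-suggested wait in seconds, or None.
    """
    for attempt in range(max_attempts):
        try:
            return make_request()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, and the final failure.
            if not is_rate_limit(exc) or attempt == max_attempts - 1:
                raise
            hinted = retry_after(exc) if retry_after else None
            if hinted is not None:
                delay = hinted
            else:
                # Exponential backoff plus up to one second of random jitter.
                delay = base_delay * 2 ** attempt + random.uniform(0, 1)
            sleep(delay)
```

With the OpenAI SDK, `is_rate_limit` could be `lambda e: isinstance(e, RateLimitError)`, and `retry_after` could be a function that reads the retry-after header from the error's response and converts it to seconds. Jitter spreads out retries when many workers hit the limit at the same moment.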

FAQ

Is Grok 4.3 fully OpenAI-compatible?

For Chat Completions, yes. Use the OpenAI SDK, change base_url and model, and function calling, structured output, and streaming work the same way.

Does it support the Responses API?

xAI's interface is currently Chat Completions; the Responses API is OpenAI's.

What is the real context limit?

1,000,000 tokens. Long inputs still cost real money even at $1.25/1M, so use caching when the prompt is stable.

How does always-on reasoning affect latency?

First-token latency is slightly higher than on non-reasoning models, but Grok 4.3 streams output at about 159 tokens/second, so end-to-end latency stays competitive, especially for tasks that need high accuracy.

Can I use Grok 4.3 with Grok Voice?

Yes. Voice agents such as grok-voice-think-fast-1.0 call Grok 4.3 behind the scenes when reasoning is needed, and you can also call Grok 4.3 directly from a voice loop you build yourself with TTS/STT.

What happens to existing Grok 3 / Grok 4 calls after May 15?

They will fail with status 410 because the models have been retired. Migrate before the deadline.

Does Grok 4.3 support image input?

Yes. In addition to the new video input, you can pass an image URL in a content block using the same format as OpenAI.

Conclusion

Grok 4.3 is a release aimed at both price and capability for production work: prices about 40% lower, a 1M-token context, always-on reasoning, and native video input. The key point for dev teams is OpenAI compatibility, which makes most migrations a matter of changing base_url and model rather than writing a new integration.

A recommended way to start: create three requests in Apidog for low, medium, and high, use real prompts from your system, measure latency, output quality, and reasoning_tokens, then migrate before May 15.
