You're probably only using model and messages. Here are 5 advanced parameters that'll make your AI app faster, cheaper, and smarter.
Most developers treat AI APIs like a black box: send a prompt, get a response.
But the request parameters decide how tokens are sampled, how long the reply can get, and how it is delivered. Here are 5 you should be using:
1. temperature — Control Randomness
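from openai import OpenAI

# Assumption: an OpenAI-compatible Python SDK; pass api_key/base_url for your provider.
client = OpenAI()

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Write a poem"}],
    temperature=0.2,  # 0.0 = most predictable, higher = more varied
)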
Use case: 0.0 for code generation, 0.7 for creative writing.
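To see why lower values give more predictable output, here is a minimal sketch of the usual temperature-scaled softmax. The logits and the function name `softmax_with_temperature` are made up for illustration; real providers do this inside the model server and typically treat a temperature of 0 as a special case (always pick the most likely token).

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.2 nearly all of the probability lands on the top token; at 1.5 the three candidates are much closer together, which is where the extra variety comes from.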
2. max_tokens — Prevent Runaway Costs
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Explain AI"}],
    max_tokens=200,  # ← Limit response length
)
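Why it matters: Output tokens are billed, so a verbose model can burn through your budget. Capping max_tokens bounds the worst case per request, at the price of replies that may be cut off mid-sentence.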
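A quick back-of-the-envelope sketch of what the cap buys you. The price and the request volume below are hypothetical placeholders, not any provider's real numbers, and the helper name is made up. Note that max_tokens only limits output; input tokens are typically billed separately.

```python
def worst_case_output_cost(max_tokens, price_per_million_output_tokens):
    """Upper bound on output cost for one request, in the same currency as the price."""
    return max_tokens * price_per_million_output_tokens / 1_000_000

# Hypothetical price: 2.00 per million output tokens.
per_request = worst_case_output_cost(200, 2.00)
per_day = per_request * 10_000  # assuming 10,000 requests a day
print(f"{per_request:.6f} per request, {per_day:.2f} per day at most")
```

Plug in your provider's actual pricing to get a ceiling you can budget against.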
3. top_p — Nucleus Sampling
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Suggest a startup idea"}],
    top_p=0.1,  # Sample only from the tokens covering the top 10% of probability mass
)
What it does: Instead of sampling from all possible next tokens, the model samples only from the smallest set of most-likely tokens whose combined probability reaches top_p. Lower values make output more focused.
Rule of thumb: Use temperature OR top_p, not both.
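Here is a minimal sketch of the filtering step behind nucleus sampling, assuming a toy next-token distribution. The token names, probabilities, and the function `nucleus_filter` are invented for illustration; the real thing happens server-side over the full vocabulary.

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1 again.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Toy distribution over four candidate next tokens.
probs = {"app": 0.5, "tool": 0.3, "game": 0.15, "zebra": 0.05}
print(nucleus_filter(probs, 0.1))  # only the single most likely token survives
print(nucleus_filter(probs, 0.9))  # the unlikely tail ("zebra") is dropped
```

With top_p=0.1 the model is effectively forced onto its top choice, which is why very low values behave a lot like a very low temperature.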
4. frequency_penalty — Reduce Repetition
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "List 10 AI tools"}],
    frequency_penalty=0.5,  # Penalize tokens more the more often they've already appeared
)
Use case: Great for listicles, brainstorming, or any task where repetition sucks.
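Roughly how the penalty works, as a sketch modeled on the adjustment described in OpenAI's API documentation: each token's score is lowered by the number of times it has already been generated, multiplied by the penalty. Other providers may implement it differently, and the token names, scores, and the function `apply_frequency_penalty` are hypothetical.

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, frequency_penalty):
    """Lower each token's score by (times already generated) * frequency_penalty."""
    counts = Counter(generated_tokens)
    return {
        token: score - counts[token] * frequency_penalty
        for token, score in logits.items()
    }

# Hypothetical scores for the next token, and what has been generated so far.
logits = {"tool_a": 3.0, "tool_b": 2.8, "tool_c": 2.5}
generated = ["tool_a", "tool_a", "tool_b"]
adjusted = apply_frequency_penalty(logits, generated, 0.5)
print(adjusted)
```

The token that already appeared twice loses the most, so the not-yet-used option ends up with the highest score and the list is less likely to repeat itself.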
5. stream — Make Your App Feel 3x Faster
response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,  # ← Stream tokens as they're generated
)
for chunk in response:
    # Some chunks carry no choices or an empty delta; skip them.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
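Why it matters: Users can see the first token in under 500 ms instead of staring at a blank screen for, say, 3 seconds until the full response is ready. Total generation time stays about the same; what drops is the perceived latency.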
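If you want to measure this rather than guess, here is a small sketch that prints text as it arrives and records the time to first token. It assumes chunks shaped like the ones in the loop above (`choices[0].delta.content`); `consume_stream` and `fake_stream` are hypothetical helpers, and the fake stream only exists so the sketch runs offline.

```python
import time
from types import SimpleNamespace

def consume_stream(chunks):
    """Print text deltas as they arrive; return (full_text, seconds_to_first_token)."""
    start = time.perf_counter()
    first_token_at = None
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks carry no choices
        text = chunk.choices[0].delta.content
        if not text:
            continue  # role-only or final chunks have no text
        if first_token_at is None:
            first_token_at = time.perf_counter() - start
        parts.append(text)
        print(text, end="", flush=True)
    return "".join(parts), first_token_at

def fake_stream(pieces):
    """Stand-in for an API stream so the sketch runs offline (hypothetical helper)."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

text, ttft = consume_stream(fake_stream(["Quantum ", None, "computing"]))
```

In a real app you would pass the streaming response from the API call instead of `fake_stream(...)`, then log `ttft` to see what your users actually experience.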
The "Pro" Setup
Here's how I combine them for a production app (top_p is left out, following the rule of thumb above):
def ask_ai(prompt, task_type="general"):
    """Return a streaming response tuned for the given task type."""
    params = {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
        "max_tokens": 500,
    }
    if task_type == "code":
        params["temperature"] = 0.0
    elif task_type == "creative":
        params["temperature"] = 0.8
        params["frequency_penalty"] = 0.3
    # stream=True, so this returns an iterator of chunks, not a finished message.
    return client.chat.completions.create(**params)
Try It
- Get a free API key → aibridge-api.com
- Copy the code above
- Experiment with parameters (it's free to test)
Your AI app will feel smarter, faster, and cheaper.



