I'm using the Anthropic Claude API and I'm trying to generate multiple completions (n completions) for a given prompt in a single API call. OpenAI's API provides an n parameter in their sampling settings to achieve this, but I can't find an equivalent option in the Claude API.
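For reference, with OpenAI's Python client this is just one extra argument on the call. A simplified sketch of the behaviour I'm looking for (not my actual code; the model name is just a placeholder):

# Sketch of the OpenAI-style behaviour I'm after: one call, n choices back.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": "Give me a one-line summary of tenacity."}],
    n=3,  # ask for 3 completions in a single request
)
for choice in response.choices:
    print(choice.message.content)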
My Current Approach:
I'm currently using a retry mechanism to handle potential errors during API calls, which looks like this:
from tenacity import retry, stop_after_attempt, wait_exponential


def before_sleep(retry_state):
    print(f"(Tenacity) Retry, error that caused it: {retry_state.outcome.exception()}")


def retry_error_callback(retry_state):
    # Called once all retries are exhausted; re-raise only the errors that
    # retrying can't fix (e.g. the prompt exceeding the context window).
    exception = retry_state.outcome.exception()
    exception_str = str(exception)
    if "prompt is too long" in exception_str and "400" in exception_str:
        raise exception
    return 'No error that requires us to exit early.'


@retry(stop=stop_after_attempt(20), wait=wait_exponential(multiplier=2, max=256),
       before_sleep=before_sleep, retry_error_callback=retry_error_callback)
def call_to_anthropic_client_api_with_retry(gen: AnthropicGenerator, prompt: str) -> dict:
    response = gen.llm.messages.create(
        model=gen.model,
        max_tokens=gen.sampling_params.max_tokens,
        system=gen.system_prompt,
        messages=[
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
        temperature=gen.sampling_params.temperature,
        top_p=gen.sampling_params.top_p,
        n=gen.sampling_params.n,  # Intended to generate multiple completions
        stop_sequences=gen.sampling_params.stop[:3],
    )
    return response
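For completeness, here's roughly how I invoke it. AnthropicGenerator is my own thin wrapper (not an SDK class) that holds an anthropic.Anthropic() client as gen.llm, plus the model name, system prompt, and sampling params; construction details omitted:

# Rough usage sketch; AnthropicGenerator is my own wrapper, details omitted.
gen = AnthropicGenerator(...)
response = call_to_anthropic_client_api_with_retry(gen, "Explain retries in one sentence.")
# The Messages API returns a list of content blocks; the text lives in .text
print(response.content[0].text)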
Problem:
I can't find an n parameter in the Anthropic API documentation that allows generating multiple completions in one request.
Questions:
- Does the Claude API support generating multiple completions (n completions) directly within a single API call?
- If not, is there a recommended workaround or best practice to achieve this without looping over multiple separate requests (a sketch of the fallback I'd like to avoid is below)? Any guidance or suggestions would be greatly appreciated!
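For context, my current fallback is simply calling the function above n times. A simplified sketch of what I'm trying to avoid, reusing call_to_anthropic_client_api_with_retry from earlier (error handling omitted):

# Fallback sketch: one request per completion. A thread pool keeps it from being
# strictly sequential, but it is still n separate API calls (and n times the
# prompt tokens billed).
from concurrent.futures import ThreadPoolExecutor

def generate_n_completions(gen, prompt: str, n: int) -> list:
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(call_to_anthropic_client_api_with_retry, gen, prompt)
                   for _ in range(n)]
        return [future.result() for future in futures]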