The AI Revolution with GPT-4
The tech world has been buzzing ever since GPT-4 hit the scene. It's not just another advancement; it's a revolution. Businesses and developers are eager to harness its power for generative AI applications. Yet, amidst this excitement lies a subtle but critical challenge – the need for structured outputs like JSON from GPT-3.5/4.
Need for Structured Data Output
ChatGPT, in its essence, is a conversational maestro, designed to return textual responses. This works perfectly for chat applications but presents a hurdle when we need structured data formats like JSON for more complex AI applications. It's akin to fitting a square peg into a round hole – a delicate and often frustrating process.
The Early Struggles with JSON
Initially, extracting structured JSON from GPT-4 was a task marked with trials and tribulations. Regardless of how meticulously prompts were crafted, the responses occasionally lacked the necessary keys or returned them in unexpected formats. This was particularly problematic in applications heavily reliant on precise LLM outputs.
Invalid JSON output examples:
- Not a valid JSON format at all (a JSON parse error).
- Returning the JSON wrapped in extra text or Markdown code fences (such as a ```json prefix) – see the short snippet after this list.
- Missing keys, or different key names than the ones requested.
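To make the second failure mode concrete, here is a minimal, hypothetical illustration: the model returns the right data but wraps it in Markdown code fences, so a plain json.loads call raises a JSONDecodeError. The raw_reply string is an assumption for illustration, not an actual API response.

import json

# Hypothetical raw reply: the data is fine, but it is wrapped in ```json fences
raw_reply = '```json\n{"browsers": [{"name": "Chrome"}]}\n```'

try:
    json.loads(raw_reply)  # fails: the surrounding fences are not valid JSON
except json.JSONDecodeError as err:
    print("Parse error:", err)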
JSON Mode: A Step Forward
The introduction of JSON mode by OpenAI seemed like a beacon of hope. It certainly made strides in offering structured outputs, but the issue of accurate key representation persisted. As a developer, my experience mirrored this – receiving a structured JSON response was one thing, but ensuring it contained the right keys was another challenge altogether.
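To illustrate why valid JSON alone is not enough, here is a small, hypothetical sketch of what checking the keys after parsing looks like; the expected_keys set and the sample reply are illustrative assumptions, not real API output.

import json

# Hypothetical JSON-mode reply: syntactically valid, but the model renamed a key
reply = '{"browsers": [{"browser_name": "Chrome", "launched_on": "2008"}]}'

expected_keys = {"name", "launched_on", "owned_by", "no_of_downloads"}

data = json.loads(reply)  # parses fine thanks to JSON mode
for entry in data.get("browsers", []):
    missing = expected_keys - set(entry.keys())
    if missing:
        print("Missing keys:", missing)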
A Shift in Perspective: Developer to User
The breakthrough came when I shifted my approach. I began viewing the issue not just through a developer's lens but from a user's standpoint. In regular chats, if GPT-4 slips up, we simply ask it to correct itself. Why not apply the same principle to JSON errors?
I started sending the flawed JSON outputs back to GPT-4, asking it to rectify the errors, much like pointing out a mistake in a casual conversation. It was almost like asking a colleague to double-check their work. And to my surprise, it worked.
This journey taught me an invaluable lesson: sometimes, the best solutions are born from simplicity and a bit of creative thinking. It's about stepping back, reevaluating, and approaching the problem from a different angle.
If you're navigating the same waters, give this method a shot. It might just make your work with GPT-4 a whole lot smoother. Remember, in the world of AI and coding, a touch of human ingenuity can make all the difference.
Below is a simple Python code example with the mentioned solution in action. First, install the OpenAI Python package:
pip install openai
import openai
import json
import traceback

# Replace 'your-api-key' with your actual OpenAI API key
# (this example uses the pre-1.0 interface of the openai package: openai.ChatCompletion)
openai.api_key = 'your-api-key'


def get_response_from_gpt(messages):
    # Call the chat completion endpoint with JSON mode enabled
    try:
        response = openai.ChatCompletion.create(
            model="gpt-4-1106-preview",
            messages=messages,
            response_format={"type": "json_object"},
        )
        return response.choices[0].message['content']
    except Exception as e:
        print(f"Error with OpenAI API: {e}")
        return None


def handle_json_error(traceback_error, messages):
    # Formulate a new user prompt asking the assistant to fix the JSON error, using the traceback
    fix_json_prompt = f"""
    Got the below JSON error with your response. Fix it and return valid JSON without any quotes or json tags.
    {traceback_error}
    """
    messages.append(
        {
            "role": "user",
            "content": fix_json_prompt
        }
    )
    return get_response_from_gpt(messages)


# Define the initial prompts and messages
system_prompt = """
You are a helpful assistant who answers users' queries in JSON format.
"""

user_prompt = """
Tell me the details of modern browsers like Chrome, Safari in the below format.
{
    "browsers": [
        {
            "name": "", # name of browser,
            "launched_on": "", # year in which it is launched,
            "owned_by": "", # who owns the browser,
            "no_of_downloads": "" # the total number of times this browser is downloaded
        }
    ]
}
"""

# Initiate the conversation
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

response = get_response_from_gpt(messages)
print(response)

try:
    # Try to parse the GPT-4 response as JSON
    json_data = json.loads(response)
    print("Successfully received JSON:", json_data)
except json.JSONDecodeError as json_err:
    # Handle the JSON parsing error: add the broken reply to the history,
    # then ask the model to fix it using the traceback
    error_traceback = traceback.format_exc()
    print("JSON error encountered. Asking GPT-4 to fix...")
    messages.append(
        {
            "role": "assistant",
            "content": response
        }
    )
    fixed_response = handle_json_error(error_traceback, messages)
    print(fixed_response)
    fixed_json_data = json.loads(fixed_response)
    print("Successfully fixed and received JSON:", fixed_json_data)
A similar flow can be achieved with function calling as well; a rough sketch of that variant follows below. The main reason I am writing this article is to highlight the shift in perspective.
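For completeness, here is a rough sketch of the same idea with function calling. It reuses the openai, json, and messages objects from the example above and the same legacy openai.ChatCompletion interface; the function name and schema are illustrative assumptions, not part of the code above.

# Illustrative function schema (assumption) describing the browser details we want back
browser_function = {
    "name": "report_browsers",
    "description": "Report details of modern browsers",
    "parameters": {
        "type": "object",
        "properties": {
            "browsers": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "launched_on": {"type": "string"},
                        "owned_by": {"type": "string"},
                        "no_of_downloads": {"type": "string"},
                    },
                },
            }
        },
    },
}

response = openai.ChatCompletion.create(
    model="gpt-4-1106-preview",
    messages=messages,
    functions=[browser_function],
    function_call={"name": "report_browsers"},  # force the model to call this function
)

# The arguments come back as a JSON string; if json.loads fails here,
# the error can be fed back to the model exactly like handle_json_error above.
arguments = response.choices[0].message["function_call"]["arguments"]
browsers = json.loads(arguments)
print(browsers)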
You might also be thinking that this will increase cost and latency. In my testing and usage, though, I found that trade-off less of a problem than the accuracy of the response and user satisfaction with my application.
Thank you for staying till the end; I hope it was worth your time.
Please comment with your suggestions or the practices you followed in building GenAI applications.