You open the app, fill fields, select options, and submit.
It works, but it adds friction.
I wanted something simpler.
What if a user could just say:
_**“Netflix ₹499 monthly”** …and the system handles everything?_
The Core Idea
Instead of forcing users to adapt to the system…
Make the system adapt to the user.
The pipeline looks like this:
Voice or text input → normalized text → regex filter → LLM parsing → structured event → database
Each step reduces ambiguity and moves toward structured data.
Step 1 — Handling Voice & Text Input
The system doesn’t just rely on one type of input.
Users can either:
- Speak (“Netflix ₹499 monthly”)
- Type a quick message (just like a notification or note)
So the first step is to normalize everything into plain text.
If the input is voice, we convert it using a speech-to-text service.
If it’s already text, we process it directly.
The goal is simple: everything becomes text before any processing begins.
Example Input
    # Case 1: User typed a message (like a quick note)
    user_input = "Netflix 499 monthly"

    # Case 2: Voice input (after speech-to-text conversion)
    voice_transcribed = "Spotify 199 per month"
Basic Handling Layer
    def normalize_input(input_data, input_type="text"):
        if input_type == "voice":
            # Simulated speech-to-text (replace with real API)
            text = input_data  # already transcribed
        else:
            text = input_data
        return text.lower().strip()

    # Example usage
    text_input = normalize_input(user_input, "text")
    voice_input = normalize_input(voice_transcribed, "voice")
    print(text_input)   # netflix 499 monthly
    print(voice_input)  # spotify 199 per month
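The voice branch above assumes the transcript already exists. Here is a minimal sketch of how a real speech-to-text call could plug in, assuming a hypothetical `transcribe` callable that takes raw audio bytes and returns text. The names `normalize_any` and `fake_transcribe` are illustrative and do not correspond to any specific vendor API.

```python
from typing import Callable, Optional, Union

# Hypothetical signature: raw audio bytes in, transcript out.
Transcriber = Callable[[bytes], str]

def normalize_any(input_data: Union[str, bytes],
                  transcribe: Optional[Transcriber] = None) -> str:
    """Return lowercase, trimmed text for either typed or spoken input."""
    if isinstance(input_data, bytes):
        if transcribe is None:
            raise ValueError("audio input needs a transcribe function")
        text = transcribe(input_data)
    else:
        text = input_data
    # Collapse repeated whitespace so "Netflix   499" equals "netflix 499".
    return " ".join(text.lower().split())

# Stand-in transcriber for local testing; swap in a real service here.
def fake_transcribe(audio: bytes) -> str:
    return "Spotify 199 per month"

print(normalize_any("  Netflix   499 monthly "))
print(normalize_any(b"\x00\x01", fake_transcribe))
```

Passing the transcriber in as a parameter keeps the single entry point while letting you change speech providers without touching downstream steps.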
Why This Step Matters
This step might look simple, but it’s critical.
Because:
- It creates a single entry point for all inputs
- It keeps downstream logic clean
- It allows you to support multiple input methods easily
And more importantly:
It makes the system feel natural — users can just “say” or “type” what they did.
Step 2 — Lightweight Regex Filtering
Before sending everything to the LLM, I added a simple filter.
Why?
Because not all inputs are subscription-related.
Skipping the LLM call for unrelated input saves cost and keeps noise away from the parser.
    import re

    def is_subscription(text):
        patterns = [
            r'\b(monthly|yearly|weekly)\b',
            r'₹\d+',
            r'\b(netflix|spotify|amazon|prime)\b'
        ]
        return any(re.search(p, text.lower()) for p in patterns)

    # Example
    print(is_subscription(user_input))  # True
If it’s not a subscription, we can route it elsewhere.
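As a rough sketch of that routing step, the filter result can pick a destination for each message. The route names `subscription_parser` and `general_notes` are hypothetical placeholders, and the patterns are copied from the filter above.

```python
import re

SUBSCRIPTION_PATTERNS = [
    r'\b(monthly|yearly|weekly)\b',
    r'₹\d+',
    r'\b(netflix|spotify|amazon|prime)\b',
]

def looks_like_subscription(text: str) -> bool:
    return any(re.search(p, text.lower()) for p in SUBSCRIPTION_PATTERNS)

def route(text: str) -> str:
    """Return the name of the handler that should receive this input."""
    # Both route names are hypothetical; use whatever your app defines.
    if looks_like_subscription(text):
        return "subscription_parser"
    return "general_notes"

print(route("Netflix 499 monthly"))  # subscription_parser
print(route("call mom tomorrow"))    # general_notes
```

The patterns are deliberately loose: a message that merely mentions Amazon also passes, so the parsing step still needs a way to reject input that turns out not to be a subscription.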
Step 3 — LLM Parsing
Now comes the important part — extracting structured data.
We send the filtered input to an LLM with a strict prompt.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def parse_subscription(text):
        prompt = f"""
        Extract subscription details from the input.
        Return JSON with fields:
        name, cost, billing_cycle
        Input: "{text}"
        """
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            # JSON mode, so the reply can be passed to json.loads in Step 4
            response_format={"type": "json_object"},
        )
        return response.choices[0].message.content

    # Example
    result = parse_subscription(user_input)
    print(result)
Expected output:
    {
        "name": "Netflix",
        "cost": 499,
        "billing_cycle": "monthly"
    }
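The model's reply is still just a string, so it is worth checking before anything else trusts it. Here is a minimal validation sketch, assuming the three fields from the prompt and an allowed set of billing cycles that mirrors the regex filter. `validate_subscription` and `ALLOWED_CYCLES` are illustrative names, not part of the original pipeline.

```python
import json

# Assumed set of cycles; it mirrors the words the regex filter looks for.
ALLOWED_CYCLES = {"weekly", "monthly", "yearly"}
REQUIRED_FIELDS = {"name", "cost", "billing_cycle"}

def validate_subscription(raw: str) -> dict:
    """Parse the model's reply and enforce basic rules; raise ValueError on bad data."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    name = str(data["name"]).strip()
    if not name:
        raise ValueError("name is empty")

    try:
        cost = float(data["cost"])
    except (TypeError, ValueError) as exc:
        raise ValueError("cost is not a number") from exc
    if not cost > 0:  # also rejects NaN
        raise ValueError("cost must be positive")

    cycle = str(data["billing_cycle"]).strip().lower()
    if cycle not in ALLOWED_CYCLES:
        raise ValueError(f"unsupported billing cycle: {cycle!r}")

    return {"name": name, "cost": cost, "billing_cycle": cycle}

print(validate_subscription('{"name": "Netflix", "cost": 499, "billing_cycle": "Monthly"}'))
```

Raising a single exception type keeps the caller simple: anything that fails validation can be logged or sent back to the user for clarification.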
Step 4 — Structuring the Event
Now we convert this into a system event.
    import json
    from datetime import datetime, timezone

    def create_event(parsed_json):
        # The LLM returns a JSON string; turn it into a dict first
        data = json.loads(parsed_json)
        event = {
            "type": "SUBSCRIPTION_CREATED",
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "payload": {
                "name": data["name"],
                "cost": data["cost"],
                "billing_cycle": data["billing_cycle"]
            }
        }
        return event

    event = create_event(result)
    print(event)
Step 5 — Saving to Database
Finally, store it.
    def save_to_db(event):
        # Replace with actual DB logic
        print("Saving to DB:", event)

    save_to_db(event)
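To make the placeholder concrete, here is one possible version using SQLite from the standard library. The `events` table layout and the `save_event` name are assumptions for illustration, and any database would work the same way.

```python
import json
import sqlite3

def save_event(conn: sqlite3.Connection, event: dict) -> int:
    """Insert one event row and return its row id."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               type TEXT NOT NULL,
               timestamp TEXT NOT NULL,
               payload TEXT NOT NULL
           )"""
    )
    cur = conn.execute(
        "INSERT INTO events (type, timestamp, payload) VALUES (?, ?, ?)",
        (event["type"], event["timestamp"], json.dumps(event["payload"])),
    )
    conn.commit()
    return cur.lastrowid

# In-memory database so the sketch runs without creating a file.
conn = sqlite3.connect(":memory:")
sample_event = {
    "type": "SUBSCRIPTION_CREATED",
    "timestamp": "2024-01-01T00:00:00+00:00",
    "payload": {"name": "Netflix", "cost": 499, "billing_cycle": "monthly"},
}
row_id = save_event(conn, sample_event)
print(row_id, conn.execute("SELECT type, payload FROM events").fetchone())
```

Storing the payload as a JSON string keeps the table generic, so new event types can reuse it without a schema change.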
Why This Works
This system feels simple, but a few design decisions make it powerful:
1. Regex Before LLM
- Filters irrelevant input
- Reduces cost
- Improves signal
2. LLM for Structure, Not Logic
- LLM extracts meaning
- System enforces rules
3. Event-Based Design
- Everything becomes an event
- Easy to extend (notifications, analytics, etc.)
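The event-based point can be sketched with a tiny in-process dispatcher. `subscribe` and `publish` are hypothetical helper names, and a real system might use a message queue instead.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]
_handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

def subscribe(event_type: str, handler: Handler) -> None:
    """Register a handler for one event type."""
    _handlers[event_type].append(handler)

def publish(event: dict) -> int:
    """Call every handler registered for the event's type; return how many ran."""
    handlers = _handlers.get(event["type"], [])
    for handler in handlers:
        handler(event)
    return len(handlers)

log: List[str] = []
subscribe("SUBSCRIPTION_CREATED", lambda e: log.append(f"notify: {e['payload']['name']}"))
subscribe("SUBSCRIPTION_CREATED", lambda e: log.append(f"analytics: {e['payload']['cost']}"))

publish({"type": "SUBSCRIPTION_CREATED", "payload": {"name": "Netflix", "cost": 499}})
print(log)  # ['notify: Netflix', 'analytics: 499']
```

Adding a new feature then means registering one more handler, without touching the parsing or storage code.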
Where This Gets Interesting
Once this pipeline is in place, you can extend it easily:
- Add reminders automatically
- Trigger notifications
- Detect duplicates
- Categorize spending
And most importantly:
The user doesn’t feel like they’re using a system.
They just type or speak naturally. With their permission, the app could also extract subscription messages directly from their phone.
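As one example of the extensions listed above, duplicate detection can start as a simple key comparison. This is a sketch under the assumption that two entries with the same name and billing cycle are the same subscription. `subscription_key` and `is_duplicate` are illustrative names.

```python
from typing import List, Tuple

def subscription_key(payload: dict) -> Tuple[str, str]:
    """Key that treats 'Netflix ' and 'netflix' on the same cycle as one subscription."""
    return (payload["name"].strip().lower(),
            payload["billing_cycle"].strip().lower())

def is_duplicate(payload: dict, existing: List[dict]) -> bool:
    key = subscription_key(payload)
    return any(subscription_key(p) == key for p in existing)

existing = [{"name": "Netflix", "cost": 499, "billing_cycle": "monthly"}]

print(is_duplicate({"name": "netflix ", "cost": 499, "billing_cycle": "Monthly"}, existing))  # True
print(is_duplicate({"name": "Spotify", "cost": 199, "billing_cycle": "monthly"}, existing))   # False
```

Cost is left out of the key on purpose, so a price change on an existing subscription still counts as the same entry.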
Final Thought
This isn’t about AI.
It’s about reducing friction.
Forms make users adapt to systems.
Natural language lets systems adapt to users.
And that small shift makes everything feel… effortless.

