Building AIFiqh: Our Journey with Islamic Knowledge and AI
How we're trying to make authentic Islamic scholarship more accessible (and learning a lot along the way)
Why We Built This Thing
Let me be straight here. When we started AIFiqh, we weren't trying to "disrupt" Islamic knowledge or whatever. We just noticed something annoying: finding reliable Islamic rulings online was a mess.
You'd search for something simple like "Can I pray with nail polish?" and get buried in forum discussions, random blog posts, and that one guy who's really confident but probably wrong. Meanwhile, there are literally hundreds of thousands of authentic Islamic texts sitting there, but good luck finding what you need without spending hours digging through PDFs.
So we thought: what if we could build something that actually knows this stuff?
When We Realized This Was Actually Hard
Turns out, building AI for Islamic knowledge isn't like building a chatbot for customer service. Islamic jurisprudence is nuanced. Context matters. Scholarly opinions differ. And you can't just throw GPT at it and hope for the best.
We learned this the hard way when our first prototype confidently told someone that "all fish are halal" without mentioning the Hanafi school's very specific opinions about shellfish. Oops.
That's when we realized we needed to get serious about this.
Why GPT and Claude Fall Short (And Why We Had to Build Our Own)
Before building AIFiqh, we tested the big players. Here's what we found:
| Challenge | GPT-4 | Claude | AIFiqh (Current) |
|---|---|---|---|
| Source Attribution | Generic responses, no citations | Limited Islamic sources | Every answer traces to specific kitab + page number |
| Madhab Awareness | Mixes schools without distinction | Occasional mention of differences | Clearly separates Hanafi, Maliki, Shafi'i, Hanbali opinions |
| Arabic Authenticity | Often paraphrases or translates incorrectly | Better than GPT but still limited | Original Arabic text + verified translations |
| Classical vs Contemporary | Can't distinguish scholarly weight | Similar issue | Properly weights classical scholars over modern opinions |
| Controversial Topics | Avoids or gives watered-down answers | Sometimes too cautious | Presents authentic scholarly positions with context |
| Update Frequency | Knowledge cutoff limitations | Same issue | Live updates from contemporary fatwa councils |
| Specialized Terminology | Generic Islamic terms | Better understanding but limited | Precise fiqh terminology with explanations |
| Infrastructure | Billion-dollar backing, global CDN | Well-funded, enterprise-grade | Running on startup budget (hoping investors are reading this 👀) |
The Real Problem: General AI models are trained on everything: Wikipedia articles, random blogs, social media posts about Islam. They can't tell the difference between an authentic hadith commentary and someone's opinion on Reddit.
Our Problem: We know exactly what we need to build to compete with the big players, but we're currently bootstrapping this on DigitalOcean droplets while OpenAI has data centers around the world.
We needed something that actually knows the difference between Ibn Taymiyyah and that guy on IslamQA who's really confident but questionably qualified. But let's be honest, we also need the resources to scale this properly.
The Data Mission: 500K Texts and Counting
First things first: we needed the real deal. Not summaries, not interpretations, but actual source texts. So we went hunting.
What we digitized:
- The entire Mausu'ah Fiqhiyyah Kuwaitiyah (all 45 volumes)
- Al-Qardhawi's Fiqh Zakat
- Hundreds of contemporary fatwas
- Classical texts on muamalat
- Tabung Haji documentation
- And about 495,000 other texts that nearly broke our OCR budget
Converting these wasn't just scanning PDFs. Classical Arabic with diacritics, different fonts, handwritten marginalia, weird page layouts—our OCR pipeline had to handle it all. Fun times.
The Tech Stack (Or: How We Keep This Thing Running)
Frontend: React + Next.js
Because life's too short for vanilla JavaScript, and we needed SSR for performance.
Backend: The fun part. We went multi-service:
- Python + Flask for the heavy AI lifting
- NestJS for our main API (TypeScript makes debugging less painful)
- PostgreSQL + Prisma because we like our databases relational and our queries type-safe
- ChromaDB for vector storage (more on this later)
AI Stack:
- TensorFlow for our custom models
- Google Gemini for the really tricky stuff
- text-embedding-004 for turning Arabic text into numbers that actually mean something
Infrastructure: DigitalOcean everywhere
S3-compatible storage, droplets, managed databases—the works. No vendor lock-in headaches.
The Two-Tier Cache
Here's where it gets interesting. We built this two-tier caching system that's honestly kind of elegant:
Tier 1: The Speed Demon
LRU cache in memory. Sub-100ms responses for anything we've seen before. Because nobody wants to wait 3 seconds to find out if their prayer is valid.
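For readers who haven't built one, here's a minimal sketch of an in-memory LRU cache in the spirit of Tier 1. It is an illustration, not our production code: the `LRUCache` class, the `normalize_key` helper, and the tiny capacity are all assumptions made for the example, and it's synchronous for simplicity.

```python
from collections import OrderedDict
from typing import Optional


def normalize_key(question: str) -> str:
    """Collapse case and whitespace so trivially different phrasings share a key."""
    return " ".join(question.lower().split())


class LRUCache:
    """Tiny least-recently-used cache (hypothetical stand-in for Tier 1)."""

    def __init__(self, capacity: int = 1024) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, question: str) -> Optional[str]:
        key = normalize_key(question)
        if key not in self._items:
            return None
        # A hit makes this entry the most recently used one.
        self._items.move_to_end(key)
        return self._items[key]

    def set(self, question: str, answer: str) -> None:
        key = normalize_key(question)
        self._items[key] = answer
        self._items.move_to_end(key)
        # Evict the least recently used entry once we exceed capacity.
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def __len__(self) -> int:
        return len(self._items)


# Demo with a deliberately tiny capacity so eviction is visible.
cache = LRUCache(capacity=2)
cache.set("Can I pray with nail polish?", "answer A")
cache.set("Is shellfish halal?", "answer B")
cache.get("can i pray with   nail polish?")        # hit: refreshes this entry
cache.set("How is zakat calculated?", "answer C")  # evicts the shellfish entry
```

Normalizing the key before lookup is a cheap way to raise the hit rate; anything smarter (matching paraphrases, for instance) is a job for the semantic tier.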
Tier 2: The Brain
ChromaDB vector database with all our embeddings. When someone asks something new, we do semantic search across our entire corpus. Cosine similarity, 80% threshold, the works.
    # This is simplified, but you get the idea
    async def find_answer(question):
        # Check the fast cache first
        cached = await tier1_cache.get(question)
        if cached:
            return cached
        # Semantic search in the vector DB
        embedding = await embed_text(question)
        similar_texts = await vector_db.search(embedding, threshold=0.8)
        if similar_texts:
            answer = synthesize_answer(similar_texts)
            await tier1_cache.set(question, answer)
            return answer
        # Fall back to Gemini for novel questions
        return await gemini_generate(question, context=our_knowledge_base)
The cool part? If we don't have a good match in our vector DB, we fall back to Gemini but feed it relevant context from our texts. Best of both worlds.
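To make the "cosine similarity, 80% threshold" step concrete, here's a small pure-Python sketch of the filtering logic. The function names, the toy three-dimensional vectors, and the text IDs are assumptions for illustration; the real pipeline uses ChromaDB and text-embedding-004, and neither is called here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: Sequence[float],
    corpus: Dict[str, Sequence[float]],
    threshold: float = 0.8,
) -> List[Tuple[str, float]]:
    """Return (text_id, score) pairs at or above the threshold, best first."""
    scored = [(text_id, cosine_similarity(query, vec)) for text_id, vec in corpus.items()]
    hits = [(text_id, score) for text_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (hypothetical); real ones have hundreds of dimensions.
corpus = {
    "wudu-barriers": [0.9, 0.1, 0.0],
    "zakat-nisab": [0.0, 0.2, 0.9],
    "wudu-nullifiers": [0.7, 0.6, 0.1],
}
query = [1.0, 0.2, 0.0]
hits = search(query, corpus)  # an empty list here is what triggers the fallback
```

One thing to watch in practice: many vector stores report a distance rather than a similarity, so check which one yours returns before applying a 0.8 cut-off.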
Making Arabic Text Behave
Working with Arabic is... special. Different diacritics, right-to-left text, multiple valid spellings for the same word. Our preprocessing pipeline handles:
- Normalization: Converting different Arabic fonts and diacritics to a standard form
- Context preservation: Keeping track of which madhab (school of thought) each text comes from
- Citation tracking: Every piece of knowledge traces back to its source
We spent weeks just getting Arabic text to embed properly. Turns out, most embedding models are trained on English and get confused by Arabic morphology.
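Here's a rough sketch of what the normalization step can look like, using only Python's standard library. The `normalize_arabic` function and the specific rules it applies (strip diacritics and tatweel, fold alef variants) are assumptions for illustration, not our exact pipeline; which rules are safe depends on the corpus.

```python
import re
import unicodedata

# Tashkeel (U+064B to U+0652), superscript alef (U+0670), and tatweel (U+0640).
_DIACRITICS = re.compile("[\u064B-\u0652\u0670\u0640]")
# Alef with madda, hamza above, or hamza below, all folded to bare alef.
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")


def normalize_arabic(text: str) -> str:
    """Produce a search-friendly form of Arabic text (lossy by design)."""
    # NFKC folds presentation-form glyphs, common in OCR and PDF output,
    # back to their base letters.
    text = unicodedata.normalize("NFKC", text)
    text = _DIACRITICS.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)
    # Collapse runs of whitespace left behind by OCR line breaks.
    return " ".join(text.split())


plain = normalize_arabic("الْحَمْدُ لِلَّهِ")
```

Because this form is lossy, a sensible design is to use it only for indexing and search, and to keep the original vocalized text for display and citation. We deliberately left out more aggressive folds (such as ta marbuta to ha) in this sketch, since they can merge distinct words.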
The UI: Making Knowledge Accessible
The interface is deliberately simple. Chat-style interaction because that's what people expect from AI. But under the hood:
- Source attribution for every answer (with Arabic text + translation)
- Multiple perspectives when scholars disagree
- Related questions to guide learning
- Progressive disclosure so beginners don't get overwhelmed
We tried to make it clean, focused, and respectful. No flashy animations, just solid information architecture.
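To illustrate how "source attribution" and "multiple perspectives" can fit together, here's a sketch of one possible data shape: each ruling carries its own citation, and the answer is grouped by school before display. The `Ruling` fields, the helper names, and every sample value below are placeholders invented for the example, not real citations or our actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Ruling:
    """One extracted ruling plus where it came from (hypothetical schema)."""
    madhab: str
    summary: str
    arabic_text: str
    kitab: str
    volume: int
    page: int

    def citation(self) -> str:
        return f"{self.kitab}, vol. {self.volume}, p. {self.page}"


def group_by_madhab(rulings: List[Ruling]) -> Dict[str, List[Ruling]]:
    """Group rulings by school, keeping first-seen order of schools."""
    grouped: Dict[str, List[Ruling]] = defaultdict(list)
    for ruling in rulings:
        grouped[ruling.madhab].append(ruling)
    return dict(grouped)


def render(rulings: List[Ruling]) -> str:
    """Plain-text view: one heading per school, one cited line per ruling."""
    lines: List[str] = []
    for madhab, items in group_by_madhab(rulings).items():
        lines.append(f"{madhab}:")
        for ruling in items:
            lines.append(f"  - {ruling.summary} [{ruling.citation()}]")
    return "\n".join(lines)


# Placeholder data only; these are not real rulings or references.
rulings = [
    Ruling("Hanafi", "Placeholder summary 1", "<Arabic source text>", "Example Kitab A", 1, 10),
    Ruling("Shafi'i", "Placeholder summary 2", "<Arabic source text>", "Example Kitab B", 2, 20),
    Ruling("Hanafi", "Placeholder summary 3", "<Arabic source text>", "Example Kitab C", 3, 30),
]
text = render(rulings)
```

Keeping the citation on the ruling itself, rather than attaching it at render time, means an answer can't be displayed without its source.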
Testing Against the Big Players: A Reality Check
We ran some side-by-side comparisons during our beta. Here are real examples:
Query: "What's the ruling on cryptocurrency in Islam?"
GPT-4 Response: "Cryptocurrency is generally considered permissible in Islam, though some scholars have concerns about volatility and speculation..."
Claude Response: "Islamic scholars have different views on cryptocurrency. Some consider it halal while others have reservations due to gharar (uncertainty)..."
AIFiqh Response: "Contemporary scholars differ on cryptocurrency. Hanafi perspective (Dr. Monzer Kahf, 2018): Permissible as long as not used for gambling. Maliki perspective (European Council for Fatwa, 2019): Cautious approval with conditions. Concerns raised by Dar al-Ifta Egypt (2017): Excessive gharar and lack of intrinsic value. Source: Mausu'ah Fiqhiyyah Kuwaitiyah, Vol 31, pp. 234-237, plus contemporary fatwa compilation."
See the difference? We're not just giving opinions—we're showing you exactly where these opinions come from and how different schools approach the issue.
Real Talk: What We Got Wrong (And Fixed)
Performance Issues: Our first vector search was slow. Like, really slow. We fixed it with better indexing and parallel processing.
Cultural Blindspots: Beta users taught us things we never considered. Like how different communities have different transliteration preferences.
Source Weighting: Initially, we treated all texts equally. Bad idea. Classical scholars carry more weight than contemporary opinions, and our algorithm now reflects that.
Context Window: Arabic sentences can be really long. We had to tune our embedding strategy to capture complete thoughts without losing nuance.
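The source-weighting fix above can be sketched as a re-ranking step: multiply each hit's similarity by a weight that depends on the kind of source. Everything here is assumed for illustration, including the `Hit` shape, the era labels, and especially the weight values, which would need tuning with scholars rather than guessing.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights; real values should be set with scholarly input.
ERA_WEIGHTS = {"classical": 1.0, "contemporary": 0.8}
DEFAULT_WEIGHT = 0.5  # sources with no recognized label are trusted least


@dataclass
class Hit:
    text_id: str
    similarity: float  # cosine similarity from the vector search
    era: str


def weighted_score(hit: Hit) -> float:
    """Similarity scaled by how much weight the source's era carries."""
    return hit.similarity * ERA_WEIGHTS.get(hit.era, DEFAULT_WEIGHT)


def rerank(hits: List[Hit]) -> List[Hit]:
    """Order hits by weighted score, highest first."""
    return sorted(hits, key=weighted_score, reverse=True)


hits = [
    Hit("contemporary-fatwa", similarity=0.90, era="contemporary"),
    Hit("classical-commentary", similarity=0.85, era="classical"),
    Hit("unlabelled-text", similarity=0.95, era="unknown"),
]
ranked = rerank(hits)
```

With these toy numbers the classical text ranks first even though its raw similarity is the lowest of the three, which is exactly the behavior change the fix was after.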
The Numbers (Because Everyone Loves Metrics)
Current Performance:
- 555 active beta users across 50+ countries
- 33.35s average response time (we're working on this!)
- 4,562 total queries processed
- 66 queries today and growing
Data Scale:
- 500,000+ source texts processed
- 2.3M+ individual rulings extracted
- 50+ languages of source material
- 2MB average storage per user
What's Next (If We Can Keep the Servers Running)
Short term: We're working on a voice interface (imagine asking fiqh questions while driving) and better mobile optimization. Assuming our DigitalOcean bill doesn't get too scary.
Medium term: Multi-language support, starting with Malay and Urdu. Also planning a scholar verification system—verified Islamic scholars can review and validate AI responses. This stuff needs proper funding though.
Long term: We want to build a complete Islamic knowledge graph. Imagine connecting every hadith to related Quranic verses and fiqh rulings automatically. Think Google's knowledge graph, but for Islamic scholarship. Obviously, this requires resources that match the ambition.
The Reality Check: We're currently a free tool serving 5,000+ users with bootstrap-level infrastructure. ChatGPT has billions in funding and global data centers. We have passion, authentic sources, and really good coffee.
If any investors are reading this: we've proven the concept works. Now we need to scale it properly. The Muslim community deserves AI that actually understands Islamic knowledge, not generic responses from models trained on Wikipedia.
The Human Element
Here's the thing about AI and religious knowledge: the technology is just a tool. We're not trying to replace scholars or traditional learning. We're trying to make authentic knowledge more accessible.
Every response includes source citations. We encourage users to verify important matters with qualified scholars. When scholars disagree, we show multiple perspectives.
The goal isn't to be the final authority; it's to be a reliable starting point for learning.
Privacy & Ethics (The Boring But Important Stuff)
- We don't sell user data. Ever.
- Sensitive processing happens client-side when possible
- Users can delete their data completely
- We're transparent about our sources and limitations
Islamic knowledge is sacred. We treat it and our users with respect.
Behind the Scenes
Building this required more than just coding. We worked with Islamic scholars, studied classical Arabic, and learned about different schools of jurisprudence. Our team spent months just understanding the domain before writing any code.
The technical challenges were real, but the cultural responsibility was bigger. Every design decision considered: "Does this serve the Muslim community well?"
Try It Yourself
AIFiqh is live at aifiqh.com. We're still in beta, still learning, still improving.
If you're a developer interested in Islamic tech, hit me up. If you're a scholar who wants to help improve our accuracy, we'd love to collaborate.
And if you're just someone who's ever struggled to find a clear Islamic ruling online, well, that's exactly who we built this for.
Built with ❤️ by Muslims, for Muslims. Technical leadership by yours truly, with an amazing team of engineers and Islamic knowledge experts.
Current status: Beta (which means it's pretty good but we're still fixing things)
Questions? Feedback? Email us at hello@aiafiqh.com or better yet, try the platform and ask it directly.