The Problem: Reading Research Papers is Soul-Crushing
Picture this: You're deep in blockchain research. You need to understand consensus mechanisms. So you:
- Search arXiv for "blockchain consensus"
- Get 847 results
- Download 50 PDFs
- Realize you need to read all of them
- Cry
Then it hit me: What if an AI could do this entire workflow autonomously?
Not just "summarize papers" like ChatGPT. I mean:
- Search academic databases
- Download papers automatically
- Parse PDFs and extract knowledge
- Build a searchable knowledge base
- Generate hypotheses for novel protocols
- Run simulations to test ideas
- Write research papers with proper citations
All by itself. While I sleep.
That's ConsensusMind.
The Controversial Decision: Pure Rust (No Python)
Everyone builds AI tools in Python. Everyone.
I chose Rust.
"Are you insane?"
That's what I thought too. But here's the thing:
Python is slow:
# Python: PDF parsing
# (extract_text from pdfminer.six here — one common choice)
import time
from pdfminer.high_level import extract_text

start = time.time()
text = extract_text("paper.pdf")
print(f"Took {time.time() - start}s")
# Output: Took 0.5s
Rust is fast:
// Rust: Same PDF
use std::time::Instant;

let start = Instant::now();
let text = parser.extract_text(&pdf_path)?;
println!("Took {:?}", start.elapsed());
// Output: Took 100ms
5x faster. And that's just parsing ONE paper.
When you're processing thousands of papers, this compounds.
"But Python has better AI libraries!"
True. But I don't need them.
- LLM? Self-hosted vLLM via REST API (language-agnostic)
- Vector search? SQLite with vec0 extension (pure Rust)
- PDF parsing? pdf-extract crate (pure Rust)
- Everything else? Rust ecosystem has it
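Since the LLM sits behind vLLM's OpenAI-compatible REST API, any language that can send JSON over HTTP can drive it. A minimal sketch of building the chat-completions payload — model name and endpoint are placeholders, not the project's real config, and a real client would use reqwest + serde_json instead of hand-formatted strings:

```rust
// Sketch: build an OpenAI-compatible chat request body for a self-hosted
// vLLM endpoint. Dependency-free on purpose; "deepseek-r1" and the URL in
// the comment below are illustrative assumptions.
fn build_chat_request(model: &str, prompt: &str) -> String {
    // Minimal JSON escaping for the demo: backslashes and quotes only.
    let escaped = prompt.replace('\\', "\\\\").replace('"', "\\\"");
    format!(
        r#"{{"model":"{}","messages":[{{"role":"user","content":"{}"}}]}}"#,
        model, escaped
    )
}

fn main() {
    let body = build_chat_request("deepseek-r1", "Summarize this abstract.");
    // POST this body to e.g. http://localhost:8000/v1/chat/completions
    println!("{}", body);
}
```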
The real kicker: Rust gives me a single binary deployment.
# Python deployment
pip install -r requirements.txt
# Result: 50 packages, version conflicts, pray it works
# Rust deployment
./consensusmind
# Result: One file. Just works.
Week 1: Foundation (Avoiding Analysis Paralysis)
Day 1 was rough. I had two choices:
- Spend weeks designing the perfect architecture
- Ship something that works, iterate fast
I chose option 2.
The Minimum Viable Foundation
// Day 1: Just make it compile
pub struct ConsensusMind {
    config: Config,
    logger: Logger,
    llm_client: LlmClient,
}

impl ConsensusMind {
    pub fn new() -> Result<Self> {
        // Load config, set up logging, that's it
        Ok(Self { /* ... */ })
    }
}
Quality rule from day 1: Zero compiler warnings.
Not "we'll fix it later." Not "technical debt is fine for MVP."
Zero. Warnings.
This decision saved me later.
Week 2: The arXiv Integration (When APIs Fight Back)
The Challenge
Build a client that:
- Searches arXiv for papers
- Downloads PDFs
- Doesn't get rate-limited
- Doesn't crash
- Actually works
The Reality
// First attempt - DON'T DO THIS
pub async fn search(&self, query: &str) -> Result<Vec<Paper>> {
    let response = self.client.get(&url).send().await?;
    let papers = self.parse_response(&response)?;
    // Works! Ship it!
    Ok(papers)
}
What happened: Got rate-limited after 10 requests. arXiv blocked me.
The Fix
// What actually works
pub async fn search(&self, query: &str) -> Result<Vec<Paper>> {
    let response = self.client.get(&url).send().await?;
    let papers = self.parse_response(&response)?;

    // The magic: respect arXiv's rate limits (tokio::time::sleep)
    sleep(Duration::from_secs(3)).await;

    Ok(papers)
}
Lesson learned: Read the API docs. All of them. Twice.
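The fixed 3-second sleep paces normal requests; for retries after an actual rate-limit response, a sturdier pattern (my assumption here, not necessarily what the project ships) is exponential backoff with a cap:

```rust
use std::time::Duration;

// Sketch: exponential backoff schedule for retrying after a rate-limit
// error. The base delay doubles per attempt and is capped, so attempt 0
// waits 3s, attempt 1 waits 6s, ... up to the 60s ceiling.
fn backoff_delay(attempt: u32, base_secs: u64, cap_secs: u64) -> Duration {
    let secs = base_secs
        .saturating_mul(1u64 << attempt.min(16)) // clamp shift to avoid overflow
        .min(cap_secs);
    Duration::from_secs(secs)
}

fn main() {
    for attempt in 0..6 {
        println!("attempt {attempt}: wait {:?}", backoff_delay(attempt, 3, 60));
    }
}
```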
The Payoff
First successful test:
$ cargo test test_arxiv_search -- --ignored --nocapture
Downloaded: "Byzantine Fault Tolerance in Practice"
Downloaded: "Consensus in the Age of Blockchains"
Downloaded: "Practical Byzantine Fault Tolerance Revisited"
✓ 3 papers in 12 seconds
That moment when it works: Chef's kiss.
Week 2.5: PDF Parsing Hell
The Problem
Some PDFs are... weird.
- Scanned images (no text)
- Encrypted
- Malformed
- In Comic Sans (okay, not really, but felt like it)
My First Naive Attempt
let text = extract_text(&pdf_path)?;
// Assumes it just works
Spoiler: It didn't.
What Actually Worked
pub fn extract_text(&self, pdf_path: &Path) -> Result<String, ParserError> {
    // Check file exists (obvious but important)
    if !pdf_path.exists() {
        return Err(ParserError::FileNotFound);
    }

    // Try extraction via the pdf-extract crate
    let text = pdf_extract::extract_text(pdf_path)
        .map_err(|e| ParserError::ExtractionFailed(e.to_string()))?;

    // Sanity check: scanned PDFs yield no text at all
    if text.trim().is_empty() {
        warn!("Empty PDF or scanned document: {}", pdf_path.display());
        return Err(ParserError::EmptyDocument);
    }

    // More sanity: flag suspiciously short extractions
    let word_count = text.split_whitespace().count();
    if word_count < 100 {
        warn!("Suspiciously short: {} words", word_count);
    }

    Ok(text)
}
Real test result:
- Paper: 20 pages, dense academic writing
- Extracted: 12,973 words, 83,063 characters
- Time: 100ms
- Accuracy: Near-perfect
Victory.
Week 3: Vector Search (Making Papers Searchable)
The Vision
"Show me papers about Byzantine fault tolerance that mention network partitions"
Not keyword search. Semantic search.
The Implementation
// Store papers as vectors
pub struct VectorStore {
    db: Connection,
    embeddings: HashMap<String, Vec<f32>>,
}

impl VectorStore {
    pub fn search(&self, query: &str, top_k: usize) -> Result<Vec<Paper>> {
        // Convert query to vector
        let query_vec = self.embed(query)?;

        // Find similar vectors using cosine distance (vec0 extension)
        let results = self.db.query(
            "SELECT * FROM papers
             ORDER BY vec_distance_cosine(embedding, ?)
             LIMIT ?",
            params![query_vec, top_k],
        )?;

        Ok(results)
    }
}
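The SQL delegates the ranking to vec0, but conceptually it's just cosine similarity. A dependency-free sketch of the same top-k ranking over in-memory embeddings (names and toy vectors are illustrative):

```rust
// Sketch: brute-force top-k retrieval by cosine similarity — the same
// ranking the vec0-backed SQL query performs, shown in plain Rust.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn top_k<'a>(
    query: &[f32],
    papers: &'a [(String, Vec<f32>)],
    k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = papers
        .iter()
        .map(|(title, emb)| (title.as_str(), cosine_similarity(query, emb)))
        .collect();
    // Highest similarity first (cosine distance = 1 - similarity)
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}

fn main() {
    let papers = vec![
        ("BFT in practice".to_string(), vec![0.9, 0.1, 0.0]),
        ("Gossip protocols".to_string(), vec![0.0, 1.0, 0.0]),
    ];
    let hits = top_k(&[1.0, 0.0, 0.0], &papers, 1);
    println!("{} ({:.2})", hits[0].0, hits[0].1);
}
```

Brute force is O(n) per query, which is fine for thousands of papers; vec0 exists so you don't have to care once the corpus grows.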
The "Holy Shit It Works" Moment
// Query: "consensus under network partition"
let results = store.search("consensus under network partition", 5)?;
// Results (paraphrased):
// 1. "Partition-tolerant consensus protocols"
// 2. "Byzantine agreement with network delays"
// 3. "Consensus in asynchronous systems"
// 4. "Network partition recovery in distributed systems"
// 5. "Fault tolerance under partial connectivity"
Not a single result mentioned "network partition" explicitly in the title.
Semantic search works. Mind = blown.
The Brutal Truth: What Almost Killed The Project
Problem 1: Scope Creep
Week 2, 3am: "What if it also analyzed tweets and Reddit posts and…"
Solution: Slapped myself. Stuck to the plan.
Problem 2: Perfect Code Paralysis
Week 2, day 4: Spent 6 hours debating enum vs struct for error types.
Solution: Picked one. Shipped it. Moved on.
Problem 3: "Should I Use Python?"
Week 2, day 6: Seriously considered rewriting in Python because "that's what everyone uses."
Solution: Looked at my Rust code. Zero warnings. Fast as hell. Single binary.
Stayed with Rust.
The Numbers (Because People Love Numbers)
Development Timeline
- Week 1: Foundation + arXiv integration
- Week 2: PDF parsing + metadata tracking
- Week 3: Vector search + agent core
- Total: 3 weeks, ~150 hours
Code Quality
$ cargo clippy
# Result: 0 warnings
$ cargo test
# Result: 15 tests, 15 passed, 0 failed
$ cargo build --release
# Result: Binary size: 20MB (single file!)
Performance Benchmarks
| Task | Rust | Python | Speedup |
|---|---|---|---|
| PDF parsing | 100ms | 500ms | 5x faster |
| Vector search | 10ms | 100ms | 10x faster |
| Full pipeline | 2s | 10s+ | 5x faster |
| Memory usage | 50MB | 500MB | 10x less |
Business Metrics (Projected)
- Development cost: $0 (solo project)
- Hosting cost: ~$280/month (RunPod GPU)
- Year 1 revenue target: $140,000
- Break-even: 2-6 paying users
What I Learned
1. Rust Is Ready for AI
Myth: "You need Python for AI."
Reality: You need good libraries and APIs. Language doesn't matter.
Rust has:
- Excellent HTTP clients (reqwest)
- PDF processing (pdf-extract)
- Vector databases (SQLite + vec0)
- Async runtime (tokio)
- Everything you need
2. Quality Compounds
Day 1 decision: Zero warnings allowed.
Week 3 result: Zero refactoring needed. Code just worked.
The math:
- Fix warnings daily: 10 min/day × 21 days = 210 minutes
- Fix warnings at the end: 3-5 days of hell
Front-load the pain. Thank yourself later.
3. Ship Fast, But Ship Quality
Fast ≠ Sloppy
Fast = Efficient
I shipped in 3 weeks by:
- Making quick decisions (not perfect ones)
- Writing tests immediately (not "later")
- Maintaining quality gates (zero warnings)
- Iterating rapidly (ship, measure, improve)
4. Self-Hosted LLMs Work
RunPod + vLLM:
- $0.39/hour for A40 GPU
- DeepSeek-R1 quality responses
- Full control over prompts
- No OpenAI API bills
Total LLM cost during development: $47
vs. OpenAI API for same workload: $300+
5. Open Source Builds Credibility
Released on GitHub from day 1:
- Forces clean code (people will see it)
- Builds portfolio
- Attracts contributors
- Creates trust with users
Side benefit: Looks great on LinkedIn.
The Tech Stack (For the Curious)
Core Technologies:
- Language: Rust 2021
- Async Runtime: Tokio
- HTTP Client: Reqwest + Rustls (HTTPS only)
- Database: SQLite
- Vector Database: vec0 extension
- PDF Processing: pdf-extract
- XML Parsing: quick-xml
- LLM: Self-hosted vLLM (RunPod)
- Deployment: Single binary
- CI/CD: GitHub Actions
- Hosting: RunPod (GPU) + GitHub Pages (landing)
Notable absences: Python, Docker (for core app), Kubernetes, microservices
Why? They weren't needed. Simplicity wins.
What's Next?
Short Term (This Month)
- [x] GitHub release v1.0.0
- [ ] Landing page launch
- [ ] Waitlist setup
- [ ] First 100 signups
Q1 2026
- [ ] Authentication (GitHub, Google, Email)
- [ ] SaaS infrastructure
- [ ] Beta launch
- [ ] First paying customers
Q2 2026
- [ ] Public launch
- [ ] Enterprise tier
- [ ] Academic paper publication
- [ ] $10k MRR (Monthly Recurring Revenue)
The Dream
Platform for autonomous research. Not just blockchain. Any technical domain.
Imagine:
- AI researching quantum computing papers
- AI analyzing medical research
- AI exploring ML architectures
- All autonomous. All documented. All open source.
Try It Yourself
# Clone the repository
git clone https://github.com/ChronoCoders/consensusmind.git
cd consensusmind
# Build (requires Rust)
cargo build --release
# Run
./target/release/consensusmind
# Or just explore the code
Fair warning: You'll see that AI tools don't need Python. This might change your perspective on everything.
The Real Lesson
You don't need:
- Python for AI
- Months to build production software
- A team to ship something real
- Venture capital to start
- Permission to build the future
You DO need:
- A clear vision
- Quality standards
- Execution speed
- Willingness to learn
- Guts to ship
I built ConsensusMind in 3 weeks, solo, with zero budget.
What's your excuse?
Links and Resources
- Live Demo: chronocoders.github.io/consensusmind
- GitHub Repository: github.com/ChronoCoders/consensusmind
- Documentation: GitHub README
- Contact: hello@dslabs.network
Built something cool with Rust? Drop a comment. Let's talk.
About the Author
Altug Tatlisu builds autonomous research tools at Distributed Systems Labs. He believes the future is written in Rust, not Python. Follow him on GitHub to watch the journey unfold.
P.S. If you're still reading, you're probably going to build something awesome. When you do, tag me. I want to see it.