I’m trying to build something similar to Google Scholar for my university. Do you have any ideas?
This is my college project. We were assigned to build a pretty basic search engine with a crawler built on a tool like Selenium. Then I thought to myself, “Why stop there? Making a generic project won’t make me shine.” So I started researching.
What is a search engine, and why is it important? One thing led to another, and I found out there’s a newer name for this kind of product: an answer engine. That’s basically what Perplexity is. Then I heard the CEO of Perplexity on the Lex Fridman podcast. He talked about how Google makes its money, and search is not its only source of revenue: YouTube alone brings in tens of billions of dollars annually. My aim is also to build a billion-dollar company someday, LOL. But I went off-topic, so back to search engines.
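So, search engines take us back to the theory, where simple vector-based search and other algorithms were used early on. The ranking function I keep running into now is BM25 (not “BM22”), which is also the default scoring in Elasticsearch. Whether Google itself uses BM25 is something I still need to check.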
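To make BM25 less abstract, here is a minimal sketch of the scoring formula in plain Python. It is only an illustration, not how Elasticsearch or Google implement ranking: the function names `tokenize` and `bm25_scores` are made up for this post, tokenization is a naive whitespace split, and `k1=1.2` and `b=0.75` are the commonly used default parameters.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher weight (non-negative IDF variant).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates, and long documents are penalized.
            length_norm = 1 - b + b * len(d) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "deep learning for image search",
    "library opening hours",
    "image retrieval with deep learning models",
]
scores = bm25_scores("deep learning", docs)
```

Both matching documents contain each query term once, so the shorter one scores higher, and the document with no matching terms scores zero.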
I had an idea: what if I built a combined search engine and answer engine and named it Sonic? Sonic’s crawlers would crawl all the time and rate each webpage, adding an extra layer for better ranking. The crawlers would have to be unbiased.
So I started building Sonic, using AI and all the help I could get. Then I realized that I have a tendency to overload a project with ideas until I can’t carry it anymore, drop it, and forget about it. The same thing was about to happen, so I decided to build only an assignment-worthy project for now and add features later. This way, I can make it exist first and make it better later.
I have this highly distracted yet incredibly curious mind. I believe it’s only me now. I’m learning the basics of algorithms, mathematics, and programming that are necessary to complete this project.
Project Name: University Scholar
The project will be available on GitHub after I finish it. I’ll post updates here daily.
Tech Stack:
- Frontend: Next.js
- Backend: FastAPI (most of the AI stuff happens here)
- Data Indexing: Elasticsearch
- AI/NLP: I want to add an answering feature as well.
- Crawler: Selenium
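To picture how the crawler and the indexing layer fit together, here is a toy sketch in plain Python. It is a stand-in for the idea behind an inverted index, not for Elasticsearch itself, and everything in it is an assumption for illustration: the page records, the `example.edu` URLs, and the `build_index` and `search` helpers are hypothetical.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase token to the set of page URLs that contain it."""
    index = defaultdict(set)
    for page in pages:
        text = page["title"] + " " + page["abstract"]
        for token in text.lower().split():
            index[token].add(page["url"])
    return index


def search(index, query):
    """Return URLs containing every query token (boolean AND), sorted."""
    tokens = query.lower().split()
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


# Hypothetical records, shaped like what a crawler might hand to the indexer.
pages = [
    {"url": "https://example.edu/p/1", "title": "Graph neural networks",
     "abstract": "a survey of graph learning methods"},
    {"url": "https://example.edu/p/2", "title": "Neural machine translation",
     "abstract": "attention models for translation"},
]
index = build_index(pages)
hits = search(index, "neural networks")
```

In the real project, the crawler would produce records like these and Elasticsearch would take over both the indexing and the ranking.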
I’ve completed the frontend using Next.js. I’ll be updating it daily. I’m having issues with Docker and FastAPI.