I try to "dogfood" dev.to as much as possible. That means if I want the answer to a programming question, I use our search engine instead of Google. This is kind of silly, as Google has the infinite universe of search, and ours is a pretty basic, hacked-together feature that only includes posts from this one website, which is itself pretty new.
But it's actually getting pretty good. When I feel unsure about a topic, a lot of the time I can find a really nice personal take on how to explain it, and it's pretty reliable. It doesn't need to be better than Google, because that search giant will always be there when we need it, but it should be different in good ways.
Let's take a look at the git rebase query:
dev.to
How about serverless?
dev.to
Let's try blockchain:
dev.to
The other day, I wanted to look up some of Lorenzo Pasqualis's posts, so I searched his first name.
Let's compare the lorenzo query:
dev.to
The next nice feature, currently not available in the mobile UI, is some simple filtering, which is helpful depending on what you're looking for. For example, back to serverless: if we're trying to get deep into this subject, we may want to burn through some podcasts to soak up as much as we can. (At 2x speed for me 😋)
It's not that hard to code up a thing that does a thing. It's very hard to make that thing actually useful. Making the dev.to search useful took a lot of effort in growing the community, understanding where Google's search occasionally falls short, and having the humility to stay in our lane at the same time.
When you see a post from this community, you get some quality signals, some uniformity in terms of comments, and also the ability to ask a follow-up question in the comments with a decent chance of getting a reply. It's a lot more than just search.
In a year's time the dev.to search will be much more useful than it is now, but I'm proud to say it's actually pretty useful.
Happy coding.
Top comments (21)
Out of curiosity: what's the doc store? Or is the search being done in-memory by the backend? I used to work at a non-Google search company, and we had a lot of success using Elasticsearch as our doc store. I've even been able to write reasonable searches for the Vietnamese language with it. Handling human language is something it does really well, and something that is quite a pain to cobble together by hand. Then again, I'm not sure that level of complexity is necessarily needed for just searching blog posts :)
Good question. We actually use Algolia, which is a hosted search. It specializes in low latency search which you can use for stuff like autocomplete, and I'm really drawn to that because I figured it would take years before we'd have a great search index, but we could quickly get up and running with a fast search index. That gives people the chance to make a few quick searches if we don't give them the right answers the first time around. We use Algolia's global distribution so it's fast everywhere on earth. We wanted to be globally performant with the whole site from the get-go and the path to getting there is a bit more complicated with some of the other routes we could have chosen.
The last 10% of the work is going to be the hardest, so we went with a user-friendly solution that would get us most of the way there. I've used a few of the other search indexes and I'm pretty happy with the direction we chose for this project so far.
I'd love to see some kind of flowchart or diagram of dev.to's internals. Every time something like this comes up I become more and more fascinated by how every aspect of this site is fast, and how everything is built.
Thanks for the feedback. I've written about it a bit before, but there are definitely more ways to describe it.
Good job on picking Algolia. Elasticsearch under the covers, I believe. I've recently been designing the architecture for a client's website and recommended Algolia to them. It's a neat product and saves you having to roll your own Lucene engine. I particularly liked the automated indexing of your site if you are willing to pay a little for it.
Thanks for recommending Algolia ImTheDeveloper! If you don't mind, I'll ping you on twitter about sending you a t-shirt :)
In fact, Algolia is new search technology built from scratch, not based on Lucene or Elasticsearch. If you want to read about the design of our engine, I recommend the "Inside the Algolia Engine" series written by our CTO. Here's a link to the first part: blog.algolia.com/inside-the-algoli...
Just had a read through the blog post, including some of the previous ones. Great descriptions that really bring to life all of the design decisions that had to be made. Brave usage of ZooKeeper; I know it can sometimes be met with gasps due to being a bit tough to configure and manage, but it appears to fit the use case well.
Good information on nginx and using Redis for counters. I've used a similar architecture in the past and the key/value store is perfect. Are there any further details on the search algorithms themselves? The blog posts allude to the applications being written in pure C for speed, and also to some specialist techniques being used for relevance and the like.
I've asked a few of the team who are closer to the engine about the algorithms used. I'll post back the response. Also, there will be more posts in the Inside the Engine series next year, so stay tuned! :)
Interesting, that is a solution I am not familiar with. I'll definitely check it out. Setting up Elasticsearch (or any other type of search) is always a lot of trial and error, so if someone can take out the infrastructure tweaking part and just let you fiddle with the search part, that sounds ideal.
Yeah, it's kind of a whole different direction to go. I'd definitely recommend checking it out. You're not going to get all the flexibility, but you'll get a powerful product.
Totally off topic, but I <3 Algolia. I use their WordPress plugin on my own site, for type-to-search and autocomplete. They've been responsive and helpful, and their templating is easy to pick up.
It's like eli5 for developers. Good job.
🙌
Hi Ben! Just a passing thought: are you guys planning on adding tags to the search as well? Usually, the Top 100 section is more than helpful, but it would be awesome if the global search handled tags too! Otherwise, the search is pretty powerful.
Yep, should be on its way. Most likely after we open source in January. We're done with the search for a bit to focus on some other things.
@ben, dude. You guys need to go open source and unleash all the glory this app is. 💙🙌
Really like this PWA.
Indeed the search is really useful!
Oh My God this is cool
Off-Topic:
All images in the main post would be better delivered without the on-demand resizing.
It is presumably there for optimization, but it is actually counterproductive...
See also:
https://github.com/MasterInQuestion/talk/discussions/35
For such example... estimated ~ 80% size reduction potentially.
(without apparent quality loss; or 60%+ losslessly)
High resolution doesn't necessarily mean big size.
Properly processed, the media can be of both: small size + high fidelity.
Tested `cwebp` 1.4.0:
~ 70.3% size reduction (output: 107,302 B) losslessly.
~ 79.43% (output: 74,340 B) with "-near_lossless 20".
Further reduction (without impairing quality) is still possible with more sensible denoising, instead of using "-near_lossless".
[ ^ See also: https://github.com/MasterInQuestion/talk/discussions/22 ]
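As a quick sanity check on those figures, here is a minimal Python sketch that back-calculates the input size implied by each result. It assumes that the byte counts quoted above are output file sizes and that the percentages are reductions relative to the same input image; the function name is made up for illustration.

```python
def implied_original_size(output_bytes: int, reduction_percent: float) -> float:
    """Return the input size implied by an output size and a percent reduction.

    Assumption: reduction_percent is measured against the original file size.
    """
    return output_bytes / (1 - reduction_percent / 100)


# Figures copied from the comment above.
lossless = implied_original_size(107_302, 70.3)
near_lossless = implied_original_size(74_340, 79.43)

print(f"lossless run implies an input of about {lossless:,.0f} B")
print(f"near-lossless run implies an input of about {near_lossless:,.0f} B")
```

Both runs point to an input of roughly 361 kB, so the two percentages are consistent with the same source image.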
I find the search functionality quite broken and inconsistent...
For example, to accomplish something similar to:
https://hn.algolia.com/?query=author:MasterIQ&type=comment&dateRange=all&sort=byDate
https://news.ycombinator.com/threads?id=MasterIQ
All of the below worked in quite unexpected ways:
https://dev.to/search?q=@ben&filters=class_name:Comment
[ ^ Appeared to work somehow... but by coincidence? ]
https://dev.to/search?q=@ben&filters=class_name:Article
https://dev.to/search?q=@ben
.
https://dev.to/search?q=@ben&filters=class_name:Comment&sort_by=published_at&sort_direction=desc
https://dev.to/search?q=author:ben&filters=class_name:Comment&sort_by=published_at&sort_direction=desc
.
https://dev.to/search?q=author:ben&filters=class_name:Article
https://dev.to/search?q=author:@ben&filters=class_name:Article
Besides, the UI buttons for search control also appear sort of broken.
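To make variations like the ones above easier to compare, here is a small Python sketch that assembles the same kind of query string. The parameter names (`q`, `filters`, `sort_by`, `sort_direction`) are simply copied from the URLs above; whether the site actually honors them is exactly what is in question, and the helper name `search_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE = "https://dev.to/search"


def search_url(query, class_name=None, sort_by=None, sort_direction=None):
    """Build a search URL using the parameter names seen in the URLs above."""
    params = {"q": query}
    if class_name:
        params["filters"] = f"class_name:{class_name}"
    if sort_by:
        params["sort_by"] = sort_by
    if sort_direction:
        params["sort_direction"] = sort_direction
    # Keep "@" and ":" unescaped so the result matches the hand-written URLs.
    return f"{BASE}?{urlencode(params, safe='@:')}"


print(search_url("@ben", class_name="Comment",
                 sort_by="published_at", sort_direction="desc"))
```

Generating the URLs this way changes one parameter at a time, which helps separate a genuine filter effect from a coincidental match.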
Condamné useful... may I say?
Pardon.