Like any note-taker worth their salt, I’ve built up a cumbersome, often neglected pile of half-formed thoughts or ideas that crossed my mind throughout the day. I’ve always believed there were nuggets of value buried in that chaos.
Among these notes are recipes, tech experiments, and a plethora of computer science knowledge from college; many follow standardized templates I created. I wanted to share them online through a blog, making them searchable for anyone who might find them useful.
So instead of dumping these into a CMS by hand, I decided to repurpose my Obsidian vault into a searchable wiki I can embed on my blog. Since everything is already in Markdown and includes metadata in frontmatter, I could skip the copy-paste slog and get straight to building something useful.
This tutorial walks through how I turned my local notes into a web-accessible knowledge base, with:
- MongoDB Atlas Search for full-text, fuzzy, and autocomplete querying.
- Spring Boot for ingesting and formatting data, as well as the REST API.
- Custom weighting and boosting to prioritize my best and most useful notes.
I’ll be indexing my own technical notes on some frameworks as a demo, and I’ll show you how to expose and query them through an API, so you can do the same with yours.
If you just want the code, it's all available in this GitHub repo.
What is Obsidian?
Obsidian is a free note-taking app that lets you write and organize your thoughts in plain text using Markdown. It’s designed for people who want to build a personal knowledge base, with features like backlinks and graph views that help you structure your thoughts and knowledge in a flexible, non-linear way.
I like Obsidian as a note-taking app because it allows a lot of customization, and it’s really just a wrapper for viewing Markdown files. That means I never have to worry about my notes being locked away if Obsidian disappears. They’re still just plain text on disk. It also has a vast community plugin ecosystem and makes it easy to build your own, so I can optimize my workflow exactly how I want.
Why MongoDB Atlas Search makes sense
Since Obsidian stores everything as plain Markdown with optional frontmatter, the structure maps cleanly to documents in MongoDB. I don’t have to force a schema onto my notes and can store multiple formats in the same collection. Each one can have different tags, metadata, or headings, and MongoDB’s flexible model handles that without complaint. But the real reason this setup works so well is MongoDB Atlas Search.
It gives us full-text search, autocomplete, and fuzzy matching, all built in, no separate indexing service or extra infrastructure needed. We can even boost certain fields to prioritize titles, tags, or whatever matters most to us. Want to weigh title matches higher than body content? Or prioritize notes with certain tags? Totally up to us.
The core of it all is the $search stage. For something as messy and inconsistent as a personal vault of thoughts, it’s more than robust enough to build the search for my wiki.
With that out of the way, let’s get into how I actually built it, starting with parsing notes and getting them into MongoDB.
Prerequisites
Before we get started, there are a few things we'll need:
- A MongoDB Atlas account with a cluster set up (a free M0 cluster is perfect for this)
- Java on your machine, set up and ready to go (I use Java version 24, but for some of the code, you will need at least Java version 22!)
- Maven (I use version 3.9.10)
- An Obsidian vault, with some notes inside
Creating our Spring app
To get started, we’ll scaffold a simple Spring Boot project using Spring Initializr. We’re keeping things minimal, just enough to expose a REST API and connect to MongoDB.
Here’s what I selected on Spring Initializr:
- Project: Maven
- Language: Java
- Spring Boot: 3.x (any stable version works)
- Dependencies:
  - Spring Web: to expose our REST API
  - Spring Data MongoDB: to interact with our MongoDB collection
Once we've configured those options, give it a name (SpringSearch) and define the group (com.timkelly), click Generate, unzip the project, and open it in our IDE of choice.
From here, we’ll start wiring things up so that Spring can read our notes, store them in MongoDB, and expose them through a clean API.
Storing in MongoDB
Before we start importing notes, let’s make sure our app can actually talk to MongoDB.
In our application.properties, we add the connection string for our MongoDB Atlas cluster and specify the database we'd like to use:
spring.data.mongodb.uri=YOUR-CONNECTION-STRING
spring.data.mongodb.database=obsidian
Using MongoDB Atlas, you can grab your connection string from the cluster dashboard. Just make sure to whitelist your IP and create a database user.
Now, let’s define a simple repository so Spring Data MongoDB can handle the persistence for us. Create a repository package, and in it, we'll define a NoteRepository interface:
package com.timkelly.springsearch.repository;

import com.timkelly.springsearch.model.Note;
import org.springframework.data.mongodb.repository.MongoRepository;
import org.springframework.stereotype.Repository;

import java.util.Optional;

@Repository
public interface NoteRepository extends MongoRepository<Note, String> {
    Optional<Note> findByTitle(String title);
}
This gives us built-in CRUD operations out of the box (thanks, MongoRepository), and we add one custom method: findByTitle(String title), which we’ll use later to check if a note already exists before inserting or updating it.
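Spring Data derives the query from the method name, so findByTitle needs no implementation. As a quick illustration (a hypothetical snippet, not part of the project), the Optional return type pairs nicely with ifPresentOrElse, which is exactly the pattern the import service will use later:

// Hypothetical usage of the derived query method
// (noteRepository is an injected NoteRepository).
Optional<Note> existing = noteRepository.findByTitle("Spring Boot");
existing.ifPresentOrElse(
        note -> System.out.println("Found: " + note.getTitle()),
        () -> System.out.println("No note with that title yet"));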
Ingesting Obsidian notes
For this tutorial, I’m using a few sample notes on tech frameworks, each following a simple Markdown template, which I have included in the GitHub repo:
---
tags:
- List
- of
- items
created: YYYY-MM-DD HH:mm
---
# Title
Content Lorem Ipsum
This structure gives us a frontmatter block that we’ll treat as metadata. The tags array defines relevant topics for the note (e.g., AI, Java, REST), and created marks when the note was written. Below that, we assume a level-one heading (#) for the title, followed by the main body content.
If your notes are structured differently, you can adjust the parser to suit. There’s nothing stopping you from using the filename as a title, omitting the frontmatter, or extending the metadata. MongoDB is even happy to handle documents with varied structures in the same collection.
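For instance, here's a minimal sketch (hypothetical, not in the repo) of falling back to the file name when a note has no level-one heading, assuming java.nio.file.Path:

// Hypothetical fallback: derive the title from the file name
// when a note has no "# Heading" line.
static String titleFromFileName(Path path) {
    String name = path.getFileName().toString();
    return name.endsWith(".md")
            ? name.substring(0, name.length() - 3)
            : name;
}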
These notes will be mapped to MongoDB documents like this:
{
  "title": "String",
  "tags": ["String"],
  "createdAt": "ISODate",
  "content": "String"
}
Now, in order to interact with MongoDB and map our documents properly, we’ll need a Note model. Create a new class called Note inside a new package named model in our Spring app. We’ll use the @Document annotation from Spring Data MongoDB to let Spring know that this class represents a document in the "notes" collection.
Here’s what it looks like:
package com.timkelly.springsearch.model;

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

import java.util.Date;

@Document(collection = "notes")
public class Note {

    // Mapped to MongoDB's _id; without an id property, every save() would
    // insert a new document instead of updating the existing one.
    @Id
    private String id;

    private String title;
    private String content;
    private String[] tags;
    private Date createdAt;

    public Note() {
    }

    public Note(String title, String content, String[] tags, Date createdAt) {
        this.title = title;
        this.content = content;
        this.tags = tags;
        this.createdAt = createdAt;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    public String[] getTags() {
        return tags;
    }

    public void setTags(String[] tags) {
        this.tags = tags;
    }

    public Date getCreatedAt() {
        return createdAt;
    }

    public void setCreatedAt(Date createdAt) {
        this.createdAt = createdAt;
    }
}
Nothing fancy: we're defining the fields we want to store in MongoDB, along with an id property annotated with @Id (so Spring Data can track each document’s _id and update existing notes rather than insert duplicates), a no-args constructor (needed by Spring), and a full constructor for convenience. Then, we generate the usual getters and setters to interact with our object.
Now that we’ve got a Note model and a structure we’re happy with, let’s actually load the files from our Obsidian vault and push them into MongoDB.
First, let's define where our notes live. In our application.properties or application.yml, we need to add the following entry pointing to our notes folder. Here’s my dummy data in the GitHub repo, if you’re following along:
notes.folder.path=dummyData
Next, we’ll bind that value using a simple Spring @ConfigurationProperties class. Create a config package and add the class below, NotesFolderProperties:
package com.timkelly.springsearch.config;

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

@Configuration
@ConfigurationProperties(prefix = "notes.folder")
public class NotesFolderProperties {

    private String path;

    public String getPath() { return path; }
    public void setPath(String path) { this.path = path; }
}
This lets us inject the file path wherever we need it.
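For example, any Spring bean can receive it through constructor injection (a throwaway sketch; the real consumer in this tutorial is the NoteImportService we'll write shortly):

package com.timkelly.springsearch.service;

import com.timkelly.springsearch.config.NotesFolderProperties;
import org.springframework.stereotype.Service;

import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical consumer: Spring binds notes.folder.path and injects it here.
@Service
public class VaultLocator {

    private final NotesFolderProperties props;

    public VaultLocator(NotesFolderProperties props) {
        this.props = props;
    }

    public Path vaultRoot() {
        return Paths.get(props.getPath());
    }
}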
Parsing Markdown
After our config, we need some logic to parse our Markdown and extract our data. Since the format of our notes is predictable for this example, converting them into our document structure is straightforward. Below is an implementation of a MarkdownNoteParser class, which we can place in a new util package:
package com.timkelly.springsearch.util;

import com.timkelly.springsearch.model.Note;
import org.springframework.stereotype.Component;
import org.yaml.snakeyaml.Yaml;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.*;
import java.util.regex.*;

@Component
public class MarkdownNoteParser {

    private static final Pattern FRONTMATTER_PATTERN =
            Pattern.compile("^---\\s*\\n(.*?)\\n---\\s*\\n", Pattern.DOTALL);

    // Note: SimpleDateFormat is not thread-safe; fine for this single-threaded import.
    private static final SimpleDateFormat DATE_FORMAT = new SimpleDateFormat("yyyy-MM-dd HH:mm");

    public Note parse(String rawMarkdown) {
        Matcher matcher = FRONTMATTER_PATTERN.matcher(rawMarkdown);
        Map<String, Object> metadata = new HashMap<>();

        if (matcher.find()) {
            String frontmatter = matcher.group(1);
            rawMarkdown = rawMarkdown.substring(matcher.end());

            Yaml yaml = new Yaml();
            Map<String, Object> parsed = yaml.load(frontmatter);
            if (parsed != null) { // an empty frontmatter block parses to null
                metadata = parsed;
            }
        }

        String title = extractTitle(rawMarkdown);
        String content = rawMarkdown.trim();

        String[] tags = Optional.ofNullable(metadata.get("tags"))
                .map(t -> ((List<?>) t).stream().map(Object::toString).toArray(String[]::new))
                .orElse(new String[0]);

        Date createdAt = Optional.ofNullable(metadata.get("created"))
                .map(Object::toString)
                .map(this::parseDate)
                .orElse(null);

        return new Note(title, content, tags, createdAt);
    }

    private String extractTitle(String markdown) {
        Scanner scanner = new Scanner(markdown);
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine().trim();
            if (line.startsWith("# ")) {
                return line.substring(2).trim();
            }
        }
        return "Untitled";
    }

    private Date parseDate(String dateStr) {
        try {
            return DATE_FORMAT.parse(dateStr);
        } catch (ParseException e) {
            System.err.println("Failed to parse date: " + dateStr);
            return null;
        }
    }
}
This class uses a regular expression to detect the frontmatter block, SnakeYAML (which Spring Boot already pulls in transitively) to parse the metadata it contains, and a simple scanner to extract the title.
The FRONTMATTER_PATTERN constant defines a regex for identifying YAML frontmatter blocks in the Markdown content. For example, if the frontmatter is:
---
tags:
- java
- spring
created: 2025-07-23 12:00
---
It will be extracted as {tags=[java, spring], created="2025-07-23 12:00"}, and the remaining content will be stored in rawMarkdown after the frontmatter is removed.
Next, we use the extractTitle(rawMarkdown) helper method to detect the note’s title, provided the Markdown starts with a # Heading. We also check for a tags field in the metadata, converting it to a string array if present.
For the creation date, parseDate(String dateStr) parses the created value from the metadata into a Date object.
Finally, we trim any extra whitespace from the Markdown body using rawMarkdown.trim() before returning a fully populated Note object.
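To sanity-check the parser, we can feed it a small in-memory note, as in this hypothetical snippet (the text block needs Java 15+, and Arrays is java.util.Arrays):

// Quick parser sanity check (hypothetical; run inside a main method or test).
String raw = """
        ---
        tags:
          - java
        created: 2025-07-23 12:00
        ---
        # Hello World
        Some body text.
        """;
Note note = new MarkdownNoteParser().parse(raw);
System.out.println(note.getTitle());                 // Hello World
System.out.println(Arrays.toString(note.getTags())); // [java]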
Importing from our vault
Now that we can parse our Markdown notes, the next step is to scan an entire folder in our vault, parse every .md file, and insert or update each note in MongoDB. For this, we’ll create a service called NoteImportService inside a new service package:
package com.timkelly.springsearch.service;

import com.timkelly.springsearch.config.NotesFolderProperties;
import com.timkelly.springsearch.model.Note;
import com.timkelly.springsearch.repository.NoteRepository;
import com.timkelly.springsearch.util.MarkdownNoteParser;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.nio.file.*;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Stream;

@Service
public class NoteImportService {

    private final NotesFolderProperties props;
    private final MarkdownNoteParser parser;
    private final NoteRepository repository;

    public NoteImportService(NotesFolderProperties props, MarkdownNoteParser parser, NoteRepository repository) {
        this.props = props;
        this.parser = parser;
        this.repository = repository;
    }

    public void importNotes() throws IOException {
        List<Path> files;
        // Files.walk holds open directory handles, so close the stream when done.
        try (Stream<Path> paths = Files.walk(Paths.get(props.getPath()))) {
            files = paths.filter(p -> p.toString().endsWith(".md")).toList();
        }

        for (Path path : files) {
            String content = Files.readString(path);
            Note newNote = parser.parse(content);

            repository.findByTitle(newNote.getTitle())
                    .ifPresentOrElse(existing -> {
                        boolean changed =
                                !Objects.equals(existing.getContent(), newNote.getContent()) ||
                                !Arrays.equals(existing.getTags(), newNote.getTags()) ||
                                !Objects.equals(existing.getCreatedAt(), newNote.getCreatedAt());

                        if (changed) {
                            existing.setContent(newNote.getContent());
                            existing.setTags(newNote.getTags());
                            existing.setCreatedAt(newNote.getCreatedAt());
                            repository.save(existing);
                            System.out.println("Updated: " + existing.getTitle());
                        } else {
                            System.out.println("Skipped (no changes): " + existing.getTitle());
                        }
                    }, () -> {
                        repository.save(newNote);
                        System.out.println("Inserted: " + newNote.getTitle());
                    });
        }
    }
}
This service walks through your vault, finds every .md file, and parses it into a Note object. It then calls findByTitle(...) to check MongoDB for an existing document with the same title.
If no note with that title exists, the note is saved as a new document. If a note is found, the service compares three fields to detect changes:
- content: Compared using Objects.equals(...).
- tags: Compared using Arrays.equals(...).
- createdAt: Compared using Objects.equals(...).
If any of these fields differ, the existing note is updated with the new values. Otherwise, the write is skipped to avoid unnecessary database operations. This approach is rudimentary, but it ensures we can run the import multiple times without creating duplicates or rewriting unchanged notes.
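Nothing actually calls importNotes() yet, though. One minimal way to trigger it (a sketch, assuming you want the import to run at startup; the GitHub repo may wire this differently) is a CommandLineRunner bean:

package com.timkelly.springsearch.config;

import com.timkelly.springsearch.service.NoteImportService;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ImportRunner {

    // Re-imports the vault on every startup; the change detection above
    // keeps repeated runs cheap.
    @Bean
    CommandLineRunner importOnStartup(NoteImportService importService) {
        return args -> importService.importNotes();
    }
}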
The duplicate detection here is very basic. It assumes that titles are unique across all notes. While this is fine for smaller directories (as in this example), larger vaults may benefit from optimizations such as:
- Reducing the number of file checks by caching file modification times (see the sketch after this list), or by selectively importing only the files we know have changed.
- Minimizing database queries with batch operations, or by indexing regularly queried fields like title and createdAt.
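Here's the sketch mentioned in the first bullet: a hypothetical in-memory cache of last-modified timestamps inside NoteImportService, so unchanged files are skipped without touching the database (additional imports: java.nio.file.attribute.FileTime, java.util.Map, java.util.concurrent.ConcurrentHashMap):

// Hypothetical optimization: skip files whose modification time hasn't changed.
// In-memory only; a persistent cache would survive restarts.
private final Map<Path, FileTime> lastSeen = new ConcurrentHashMap<>();

private boolean shouldReimport(Path file) throws IOException {
    FileTime modified = Files.getLastModifiedTime(file);
    FileTime previous = lastSeen.put(file, modified);
    return previous == null || modified.compareTo(previous) > 0;
}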
Searching our database with MongoDB Atlas Search
We have our documents in our database, modeled as we wish them to be. Now, it is time to search using the $search operator. This gives us full-text search, autocomplete, fuzzy matching, and custom boosting. I'll introduce each capability individually so we can see how it works on its own, then wrap up with all of them in a single search query.
Our Atlas Search index
Before we can perform full-text queries, we need to configure an Atlas Search index for our collection. In our MongoDB Atlas cluster, go to the collection where you will be storing your documents and select the search index tab.
We need to create a new index for the notes collection in a database called obsidian with the following JSON configuration. Since we haven’t run our application yet, we need to manually create the obsidian database and the notes collection. Keep the index name as default (the Atlas default), since that’s the name our queries will reference. Then, we can add our search index like below:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": [
        { "type": "string" },
        { "type": "autocomplete" }
      ],
      "tags": {
        "type": "string"
      },
      "content": {
        "type": "string"
      }
    }
  }
}
The configuration above enables full-text search on all three fields, and we’ve added an autocomplete mapping on the title field. This gives us the foundation for type-ahead search later.
Simple search
We’ll start with a simple search method that queries across all three fields (title, content, and tags) using the $search operator. We’ll build from here as we add more advanced search capabilities. In the repository package, create a NoteSearchRepository class like below:
package com.timkelly.springsearch.repository;

import com.timkelly.springsearch.model.Note;
import org.bson.Document;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.*;
import org.springframework.stereotype.Repository;

import java.util.List;

@Repository
public class NoteSearchRepository {

    private final MongoTemplate mongoTemplate;

    @Autowired
    public NoteSearchRepository(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    public List<Note> search(String query) {
        AggregationOperation searchStage = _ -> new Document("$search",
                new Document("index", "default")
                        .append("text", new Document("query", query)
                                .append("path", List.of("title", "content", "tags")))
        );

        Aggregation aggregation = Aggregation.newAggregation(searchStage);
        return mongoTemplate.aggregate(aggregation, "notes", Note.class).getMappedResults();
    }

    // we'll add more search methods here...
}
This is our most basic search function. We use MongoTemplate to run an aggregation pipeline that includes a $search stage. The text operator tells MongoDB Atlas Search to look for the query string across the title, content, and tags fields. The results are then mapped directly into our Note model, giving us strongly typed objects ready for use in the rest of the application.
Next, we expose a REST endpoint so our frontend (or even just curl) can hit /api/notes/search?q=... and get matching results. Create a controller package and add a new class, NoteSearchController:
package com.timkelly.springsearch.controller;

import com.timkelly.springsearch.model.Note;
import com.timkelly.springsearch.repository.NoteSearchRepository;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;

@RestController
@RequestMapping("/api/notes")
public class NoteSearchController {

    private final NoteSearchRepository searchRepository;

    public NoteSearchController(NoteSearchRepository searchRepository) {
        this.searchRepository = searchRepository;
    }

    @GetMapping("/search")
    public List<Note> search(@RequestParam String q) {
        return searchRepository.search(q);
    }

    // we'll add more endpoints here
}
Here, we’ve wired up a simple /api/notes/search endpoint that takes a q parameter and passes it along to our repository. The controller returns a list of Note objects as JSON, making it easy for any frontend (or even a quick curl request) to query our indexed notes and get instant results.
Search boosted by title
By default, MongoDB Atlas Search treats all fields equally. But often, a match in the title should matter more than one in the body text. We can achieve this by using the compound operator and applying a boost to the score of title matches. We need to add the following to our NoteSearchRepository, below the earlier code:
public List<Note> searchBoostedByTitle(String query) {
    AggregationOperation searchStage = _ -> new Document("$search",
            new Document("index", "default")
                    .append("compound", new Document("should", List.of(
                            new Document("text", new Document("query", query)
                                    .append("path", "title")
                                    .append("score", new Document("boost", new Document("value", 5)))), // Boost title matches
                            new Document("text", new Document("query", query)
                                    .append("path", "content")),
                            new Document("text", new Document("query", query)
                                    .append("path", "tags"))
                    )))
    );

    // $search already returns results ordered by relevance, but surfacing the
    // score as a field gives the explicit sort below something to act on.
    AggregationOperation addScoreStage = _ -> new Document("$addFields",
            new Document("score", new Document("$meta", "searchScore")));
    AggregationOperation sortStage = Aggregation.sort(Sort.by(Sort.Order.desc("score")));

    Aggregation agg = Aggregation.newAggregation(searchStage, addScoreStage, sortStage);
    return mongoTemplate.aggregate(agg, "notes", Note.class).getMappedResults();
}
And in our NoteSearchController, we wire this up in a new endpoint:
@GetMapping("/search/boost-title")
public List<Note> searchBoostedByTitle(@RequestParam String q) {
    return searchRepository.searchBoostedByTitle(q);
}
Search boosted by title and tags
We can take this further by boosting both the title and tags fields. Here, the title gets the highest weight, tags are slightly less important, and content is left at default. In our NoteSearchRepository, add:
public List<Note> searchBoostedTitleAndTags(String query) {
    AggregationOperation searchStage = _ -> new Document("$search",
            new Document("index", "default")
                    .append("compound", new Document("should", List.of(
                            new Document("text", new Document("query", query)
                                    .append("path", "title")
                                    .append("score", new Document("boost", new Document("value", 5)))),
                            new Document("text", new Document("query", query)
                                    .append("path", "tags")
                                    .append("score", new Document("boost", new Document("value", 3)))),
                            new Document("text", new Document("query", query)
                                    .append("path", "content"))
                    )))
    );

    Aggregation agg = Aggregation.newAggregation(searchStage);
    return mongoTemplate.aggregate(agg, "notes", Note.class).getMappedResults();
}
And the controller endpoint:
@GetMapping("/search/boost-title-tags")
public List<Note> searchBoostedTitleAndTags(@RequestParam String q) {
    return searchRepository.searchBoostedTitleAndTags(q);
}
Search with autocomplete and fuzzy matching
Boosting lets us prioritize results based on where our search terms appear in our documents, but we're not quite there yet. Searching for exact matches has its limits, so let's bring in autocomplete and fuzzy search to round out the search experience.
Autocomplete, sometimes called type-ahead, matches our search queries before the full input string has been typed. We can use the autocomplete operator if we want our search bar to have a search-as-you-type component that predicts words with increasing accuracy as characters are entered in our application's search field.
Fuzzy search brings typo tolerance to our search mechanics. This way, even if we misspell our query, say Sprong AI, the search will still return inexact matches. Add the following method to our NoteSearchRepository:
public List<Note> searchAutocompleteAndFuzzy(String query) {
    AggregationOperation searchStage = _ -> new Document("$search",
            new Document("index", "default")
                    .append("compound", new Document("should", List.of(
                            new Document("autocomplete", new Document()
                                    .append("query", query)
                                    .append("path", "title")
                            ),
                            new Document("text", new Document()
                                    .append("query", query)
                                    .append("path", "title")
                                    .append("fuzzy", new Document("maxEdits", 2))
                            )
                    )))
    );

    AggregationOperation limitStage = Aggregation.limit(10);

    Aggregation aggregation = Aggregation.newAggregation(searchStage, limitStage);
    return mongoTemplate.aggregate(aggregation, "notes", Note.class).getMappedResults();
}
And the controller endpoint:
@GetMapping("/search/autocomplete-fuzzy")
public List<Note> searchAutocompleteAndFuzzy(@RequestParam String q) {
    return searchRepository.searchAutocompleteAndFuzzy(q);
}
Search with autocomplete, fuzzy matching, and boosting
So we've brought them in step by step, but what does this all look like together? In our NoteSearchRepository, we need to add a searchAutocompleteFuzzyBoosted(String query) method like in the code below:
public List<Note> searchAutocompleteFuzzyBoosted(String query) {
    AggregationOperation searchStage = _ -> new Document("$search",
            new Document("index", "default")
                    .append("compound", new Document("should", List.of(
                            // Autocomplete on title (highest boost)
                            new Document("autocomplete", new Document()
                                    .append("query", query)
                                    .append("path", "title")
                                    .append("score", new Document("boost", new Document("value", 6)))
                            ),
                            // Fuzzy text search on title
                            new Document("text", new Document()
                                    .append("query", query)
                                    .append("path", "title")
                                    .append("fuzzy", new Document("maxEdits", 2))
                                    .append("score", new Document("boost", new Document("value", 5)))
                            ),
                            // Fuzzy text search on tags
                            new Document("text", new Document()
                                    .append("query", query)
                                    .append("path", "tags")
                                    .append("fuzzy", new Document("maxEdits", 1))
                                    .append("score", new Document("boost", new Document("value", 3)))
                            ),
                            // Fuzzy text search on content (no boost = default weight)
                            new Document("text", new Document()
                                    .append("query", query)
                                    .append("path", "content")
                                    .append("fuzzy", new Document("maxEdits", 1))
                            )
                    )))
    );

    // Surface the relevance score so the explicit sort below has a field to act on.
    AggregationOperation addScoreStage = _ -> new Document("$addFields",
            new Document("score", new Document("$meta", "searchScore")));
    AggregationOperation sortByScore = Aggregation.sort(Sort.by(Sort.Order.desc("score")));
    AggregationOperation limitStage = Aggregation.limit(10);

    Aggregation aggregation = Aggregation.newAggregation(searchStage, addScoreStage, sortByScore, limitStage);
    return mongoTemplate.aggregate(aggregation, "notes", Note.class).getMappedResults();
}
And our last bit of code, the final endpoint in NoteSearchController:
@GetMapping("/search/autocomplete-fuzzy-boosted")
public List<Note> searchAutocompleteFuzzyBoosted(@RequestParam String q) {
    return searchRepository.searchAutocompleteFuzzyBoosted(q);
}
Testing our search
With all of our endpoints in place, it’s time to see them in action. Start your Spring Boot app:
mvn clean compile
mvn spring-boot:run
You can now test each search feature from your browser or using curl. For example:
Basic full-text search:
curl "http://localhost:8080/api/notes/search?q=Spring"
Boosted by title:
curl "http://localhost:8080/api/notes/search/boost-title?q=Spring"
Boosted by title and tags:
curl "http://localhost:8080/api/notes/search/boost-title-tags?q=Spring"
Autocomplete and fuzzy search (try misspelling the query):
curl "http://localhost:8080/api/notes/search/autocomplete-fuzzy?q=Sprong"
All-in-one (autocomplete + fuzzy + boosting):
curl "http://localhost:8080/api/notes/search/autocomplete-fuzzy-boosted?q=Sprong"
If you hit these endpoints from your browser (e.g., http://localhost:8080/api/notes/search?q=Spring Boot), you’ll see the raw JSON response, but this is exactly what a frontend search bar would consume.
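If you'd rather poke at the API from Java than the browser, a quick smoke test with the JDK's built-in HTTP client (Java 11+) could look like this (a hypothetical helper, not part of the project):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SearchSmokeTest {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:8080/api/notes/search?q=Spring"))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // raw JSON array of notes
    }
}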
Conclusion
What began as a neglected pile of half-thoughts is now a structured, searchable knowledge base, thanks to MongoDB Atlas Search and Spring Boot. We built a REST API that can handle everything from simple keyword queries to fuzzy, autocomplete-driven searches with custom boosting, making my notes far more discoverable.
By combining Atlas Search’s operators with Spring Boot’s simplicity, I’ve turned my Obsidian vault into a living wiki I can share online, without the manual hassle of a CMS. If you’ve ever wanted to make your notes or knowledge base truly searchable, this approach is simple to replicate.
If you found this useful, check out my other tutorials on MongoDB with Spring, like Spring AI and MongoDB: How to Build RAG Applications or Building a Real-Time Market Data Aggregator with Kafka and MongoDB. If you want to learn more about Search with MongoDB, check out this short skills course which covers everything you need to know.