Your Mind Doesn't Think in English
I built a second brain that stores concepts, not languages.
Last week, at 2 AM, I was deep in a Stack Overflow thread about WebSocket connection pooling. The answer was in English. The code was in TypeScript. But when my friend called the next morning and asked what I'd figured out, I explained the whole thing in Hindi. I didn't translate. I didn't pause to convert English words into Hindi words. The idea was already in Hindi. Or rather, it was never in any language at all. I'd understood the concept, and the concept has no language.
This isn't just something I noticed. It's something neuroscience has confirmed.
In 2024, Evelina Fedorenko at MIT published research that should have changed how we build software. Her lab used brain imaging to watch what happens when people think. Not talk to themselves, but actually reason through problems. The finding was blunt: "Your language system is basically silent when you do all sorts of thinking." The brain regions that handle language literally go quiet when you're reasoning or problem-solving. Thinking and language are different systems.
The philosopher Jerry Fodor called this deeper system "mentalese." An internal representation layer where thought actually happens, independent of any natural language. You don't think in English or Hindi. You think in concepts. Then, when you need to communicate, you translate from mentalese into whatever language the situation demands.
I built a second brain that works the same way.
The blind spot in every second brain tool
I love mymind.com. If you haven't used it, it's the best "save anything, find it later" tool out there. One click to save articles, images, notes. AI that auto-categorizes everything. A search engine that actually works. No folders, no manual tagging. It's what a second brain should feel like.
But mymind, and Notion, and Raindrop, and every other tool in this space, all share the same blind spot. Content stays in whatever language you saved it in.
Think about what that means for the majority of the internet's users. I'm a developer in India. I read English documentation, English blog posts, English API references. But when I want to recall what I learned, I think about it in Hindi. When I search my memory, the query in my head isn't in English. My brain already did the translation, silently, without me asking.
Every second brain tool today ignores this. They store files, not understanding. They give you back exactly what you put in, in the exact language you put it in. Your actual brain doesn't work that way. Why should your second brain?
That's the core idea behind YourMind. Save anything, in any language. Search in your language. Read summaries in your language. The knowledge layer is language-agnostic, just like your actual mind.
The model that made this possible
Here's where timing mattered more than talent.
I was building YourMind for the Lingo.dev Multilingual Hackathon. The hackathon started March 9th. I'd spent Day 1 setting up the stack: Next.js, Supabase, the usual. I knew the hardest part would be making cross-lingual search actually work. If someone saves an English article and searches in Hindi, how do you match those?
The traditional approach is brutal. You'd need language detection for every piece of content. Separate embedding models per language, or at least per language family. OCR pipelines for images. Transcription pipelines for audio. Multiple vector stores or some complex mapping layer. It's the kind of architecture that takes a team months.
On March 10th, Day 2 of the hackathon, Google released Gemini Embedding 2.
Gemini Embedding 2 is the first natively multimodal embedding model. Text, images, audio, documents, all mapped into a single vector space, across 100+ languages. One API call. One model. One vector space.
What this meant for YourMind: I didn't need separate pipelines for anything. An English article and a Hindi search query get embedded into the same space. An image gets embedded alongside text. The cosine similarity just works, across languages, across modalities. The architecture diagram I'd sketched on Day 1 with six boxes collapsed into one.
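The "just works" part is ordinary vector math. Once two inputs live in the same embedding space, comparing them is one formula, regardless of the source language or modality. A minimal illustrative sketch (not from the YourMind codebase):

```typescript
// Cosine similarity between two embedding vectors. By the time you're
// here, the language/modality of the original content is irrelevant —
// both inputs are just points in the same space.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

An English article about WebSocket pooling and a Hindi query about the same concept land close together; unrelated concepts land far apart. That single scalar is the entire cross-lingual matching logic.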
The embedding function is the core of cross-lingual search. I wrote a custom GeminiEmbeddingFunction that wraps Gemini Embedding 2 with native multimodal support:
```typescript
// lib/chroma.ts — custom embedding function
import { GoogleGenerativeAI } from "@google/generative-ai";

class GeminiEmbeddingFunction {
  private genAI: GoogleGenerativeAI;
  private modelName = "gemini-embedding-2-preview";

  constructor(apiKey: string) {
    this.genAI = new GoogleGenerativeAI(apiKey);
  }

  // Text embedding — articles, notes, search queries
  async generateText(text: string): Promise<number[]> {
    const model = this.genAI.getGenerativeModel({ model: this.modelName });
    const result = await model.embedContent(text);
    return result.embedding.values;
  }

  // Native multimodal — images and audio go in as raw bytes.
  // No text conversion, no intermediate description step.
  async generateMedia(buffer: Buffer, mimeType: string): Promise<number[]> {
    const model = this.genAI.getGenerativeModel({ model: this.modelName });
    const result = await model.embedContent([
      {
        inlineData: {
          mimeType,
          data: buffer.toString("base64"),
        },
      },
    ]);
    return result.embedding.values;
  }
}
```
Images and audio are base64-encoded and passed directly to embedContent(). No OCR, no transcription, no text description as an intermediate step. The model natively understands what's in the image and maps it to the same vector space as text. This is why a Hindi search query can find an English-captioned image.
The search function uses the same embedding space:
```typescript
// lib/pipeline.ts — semantic search with fallback
export async function semanticSearch(
  query: string, // in ANY language
  userId: string,
  limit = 20
) {
  const collection = await getUserCollection(userId);

  // Chroma handles embedding the query via our GeminiEmbeddingFunction
  const results = await collection.query({
    queryTexts: [query],
    nResults: limit,
    include: ["distances", "metadatas"],
  });

  // Convert cosine distance → similarity score, keyed by item id
  // (Supabase doesn't guarantee row order, so don't pair by array index)
  const similarityById = new Map<string, number>();
  results.ids[0].forEach((id: string, i: number) => {
    const d = results.distances[0][i];
    similarityById.set(id, Math.max(0, Math.min(1, 1 - d)));
  });

  // Fetch full translated metadata from Supabase
  const { data: items } = await supabase
    .from("items")
    .select("id, translated_title, translated_summary, ...")
    .in("id", results.ids[0])
    .eq("user_id", userId);

  return (items ?? []).map((item) => ({
    ...item,
    similarity: similarityById.get(item.id) ?? 0,
  }));
}
```
That's the whole trick. There's no language detection step. No routing logic. No per-language anything. You put content in, you get a vector out, and that vector lives in the same space regardless of whether the input was English text, a Japanese image, or a Spanish audio clip.
I'm not going to pretend this was brilliant engineering. It was brilliant timing. Gemini Embedding 2 dropped two days before I needed it. But recognizing that it eliminated an entire class of complexity, and betting the architecture on it? That's the decision that made YourMind possible in a hackathon timeline.
Making translations feel human
Getting cross-lingual search to work was the technical breakthrough. But there's a difference between "technically correct" and "actually good."
When I first wired up the translation pipeline using Lingo.dev's SDK to translate content metadata into the user's preferred language, the Hindi output read like Google Translate circa 2015. Technically accurate. Emotionally dead. Nobody in India actually talks like that.
Here's what I mean. When YourMind saves an article about React Server Components, it generates a title and summary in your language. The first version gave me:
रिएक्ट सर्वर कम्पोनेंट्स: सर्वर-साइड रेंडरिंग का नया प्रतिमान ("React Server Components: a new paradigm of server-side rendering")
That's Hindi, technically. But no developer in Bangalore talks like this. We speak Hinglish, a mix of Hindi and English where technical terms stay in English and the conversational wrapper is Hindi. "React Server Components ka naya tarika" — that's how you'd actually say it.
Lingo.dev has a feature called brand voice instructions. You configure it in the dashboard, tell it how your audience actually speaks. I set it up with instructions about natural Hinglish, about keeping technical terms in English, about matching the casual tone developers actually use. Then I ran a full locale rebuild:
```bash
npx lingo.dev@latest lockfile --force
git rm locales/{hi,es,fr,de,ja}.json
git commit -m "i18n: force full locale rebuild with brand voice"
git push  # CI retranslates all locales with brand voice applied
```
The after:
रिएक्ट सर्वर कॉम्पोनेंट्स: सर्वर-साइड रेंडरिंग का नया अप्रोच (the same title, but with the English loanword अप्रोच, "approach", in place of the formal Sanskrit-derived प्रतिमान, "paradigm")
That reads like a person wrote it. And this matters more than it sounds. If your second brain talks to you in a language that feels robotic, it doesn't feel like your mind. It feels like a database with a translation layer on top. The brand voice configuration turned "technically translated" into "naturally localized."
This is the part most people get wrong about i18n. They think the job is done when the words are in the right language. Localization is voice. It's the difference between a tool that speaks Hindi and a tool that speaks like you speak Hindi.
How it all fits together
The full architecture has three translation layers, and they fire at different moments for different reasons.
The first layer is build-time static UI. Every button, label, placeholder, and error message in YourMind exists in six languages: English, Hindi, Spanish, French, German, Japanese. These are JSON locale files generated by Lingo.dev's CLI and served through Next.js middleware based on the URL prefix (/hi/dashboard, /es/dashboard). A GitHub Action runs on every push and auto-translates any new or changed strings. This is the table-stakes layer. Most i18n stops here.
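The middleware's routing decision boils down to parsing a locale prefix off the pathname. Here's a sketch of that logic as a standalone helper (the function name and fallback behavior are my illustration, not YourMind's actual middleware):

```typescript
// Hypothetical helper: extract a supported locale prefix from a pathname
// like "/hi/dashboard". In Next.js middleware, the result would pick
// which locale's JSON strings to serve.
const SUPPORTED_LOCALES = ["en", "hi", "es", "fr", "de", "ja"];

function localeFromPath(pathname: string): { locale: string; rest: string } {
  const [, first = "", ...segments] = pathname.split("/");
  if (SUPPORTED_LOCALES.includes(first)) {
    return { locale: first, rest: "/" + segments.join("/") };
  }
  // No recognized prefix: fall back to the default locale, path unchanged
  return { locale: "en", rest: pathname };
}
```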
The second layer fires at save-time. When you save content, Gemini Flash extracts a title, summary, tags, and category. Immediately after, Lingo.dev's SDK translates that metadata into your preferred language. This is what makes the dashboard feel native. You're browsing your saved knowledge and everything is in your language, even though the source content was in English or Japanese or anything else.
```typescript
// lib/lingo.ts — Layer 2: translate metadata on every save
import { LingoDotDevEngine } from "lingo.dev";

const engine = new LingoDotDevEngine({
  apiKey: process.env.LINGODOTDEV_API_KEY!,
});

export async function translateMeta(
  meta: { title: string; summary: string },
  sourceLocale: string,
  targetLocale: string
) {
  if (sourceLocale === targetLocale) return meta;
  const result = await engine.localizeObject(meta, {
    sourceLocale,
    targetLocale,
    fast: true, // speed-optimized
  });
  return { title: result.title as string, summary: result.summary as string };
}

export async function translateTags(
  tags: string[],
  sourceLocale: string,
  targetLocale: string
) {
  if (sourceLocale === targetLocale || tags.length === 0) return tags;
  return engine.localizeStringArray(tags, {
    sourceLocale,
    targetLocale,
    fast: true,
  });
}
```
The third layer is on-demand. When you open an item to read the full article, it translates the complete content into your language, then caches the result. Second time you open the same article? Instant, served from cache. And there's a "View Original" toggle so you can always switch back.
```typescript
// lib/lingo.ts — Layer 3: full content translation (on-demand)
export async function translateContent(
  content: string,
  sourceLocale: string,
  targetLocale: string
) {
  if (sourceLocale === targetLocale) return content;
  return engine.localizeText(content, { sourceLocale, targetLocale });
}
```

```typescript
// api/items/[id]/translate/route.ts — with caching
export async function POST(
  req: Request,
  { params }: { params: { id: string } }
) {
  const { id } = params;
  const { targetLocale } = await req.json();

  // Cache hit? Return instantly
  const cached = await supabase
    .from("translations_cache")
    .select("translated_content")
    .eq("item_id", id)
    .eq("locale", targetLocale)
    .single();
  if (cached.data) return Response.json(cached.data);

  // Cache miss: translate, store, return
  const item = await supabase
    .from("items")
    .select("content")
    .eq("id", id)
    .single();
  const translated = await translateContent(item.data.content, "en", targetLocale);

  await supabase.from("translations_cache").insert({
    item_id: id,
    locale: targetLocale,
    translated_content: translated,
  });
  return Response.json({ translated_content: translated });
}
```
One design decision I'm proud of: Chroma (our vector store) is non-blocking. If the embedding service is down, the content still gets saved with all its AI-generated metadata. Search falls back to text matching in Supabase. The item is always accessible. You just lose semantic search until the vector catches up.
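The non-blocking behavior is just ordering plus a swallowed failure: persist first, embed best-effort. A rough sketch of the pattern (the function names are illustrative, not the actual codebase):

```typescript
// Illustrative pattern: the save is the source of truth; embedding is
// best-effort. If the vector store is down, the item still exists.
type Item = { id: string; content: string };

async function saveWithOptionalEmbedding(
  item: Item,
  persist: (item: Item) => Promise<void>, // e.g. the Supabase insert
  embed: (item: Item) => Promise<void>    // e.g. the Chroma upsert
): Promise<{ saved: boolean; embedded: boolean }> {
  await persist(item); // must succeed — the item is always accessible

  try {
    await embed(item); // may fail — search falls back to text matching
    return { saved: true, embedded: true };
  } catch {
    // Log and move on; a later pass can backfill the missing vector
    return { saved: true, embedded: false };
  }
}
```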
Where the clean design broke
The AI processing (Gemini Flash summarization, embedding, translation) runs in the background after content is saved. If any step fails, the item just sits there with a spinner forever. No error. No timeout. The user has no idea what happened.
The fix: a 3-minute timeout with an isFailed() check. If processing_status is still "processing" and created_at is more than 3 minutes old, the card shows a red error state with "Processing failed." Not elegant. But it means the user always knows what's going on, and it took about 10 lines of code.
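Those 10 lines look something like this (a sketch; the field names follow the description above, and the exact implementation may differ):

```typescript
const PROCESSING_TIMEOUT_MS = 3 * 60 * 1000; // 3 minutes

// An item is considered failed if it's still marked "processing" but
// was created more than the timeout ago — the background pipeline has
// almost certainly died silently.
function isFailed(
  item: { processing_status: string; created_at: string },
  now: number = Date.now()
): boolean {
  return (
    item.processing_status === "processing" &&
    now - new Date(item.created_at).getTime() > PROCESSING_TIMEOUT_MS
  );
}
```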
Durable workflows are next.
Trade-offs I accepted
Honest accounting of the technical debt in this codebase:
No per-user localization style. This is the one that bothers me most. Right now, Lingo.dev's brand voice is configured at the project level in the dashboard. Every user gets the same Hinglish tone, the same formality level, the same amount of English mixing. But that's not how people actually think. A senior engineer in Pune and a college student in Delhi speak very different Hindi. What I wanted to build was a style prompt during onboarding where you could configure your own localization: how formal, how much code-switching, how culturally localized. Your mind doesn't just think in your language, it thinks in your style. Lingo.dev's SDK doesn't support per-call style overrides yet (the engine config is project-wide, not per-request), so this stayed on the roadmap. But if they ever ship per-request instructions, this becomes the feature that separates a second brain from a translation layer.
Existing items don't re-translate when you change languages. If you switch from Hindi to Spanish, new saves get Spanish metadata, but your existing 50 items stay in Hindi. Re-translating everything would mean hitting the Lingo.dev API for every item in your library, and in a hackathon I chose not to build that queue. The right fix is a background job that re-translates in batches.
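That background job could be as simple as chunked sequential batches, so one user's language switch doesn't hammer the translation API. A sketch of the shape it might take (nothing here exists in the codebase yet):

```typescript
// Hypothetical backfill job: re-translate an existing library in small
// batches. Each batch runs concurrently; batches run one after another
// to cap the request rate. Not yet implemented in YourMind.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

async function retranslateLibrary<T>(
  items: T[],
  retranslate: (item: T) => Promise<void>, // e.g. wraps engine.localizeObject
  batchSize = 10
): Promise<void> {
  for (const batch of chunk(items, batchSize)) {
    await Promise.all(batch.map(retranslate));
  }
}
```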
No durable workflow for the pipeline. The AI processing (Gemini Flash, embedding, translation) runs fire-and-forget in the background. If any step fails mid-flight, there's no retry queue, no dead letter handling, no exponential backoff. That item just gets a "failed" badge after 3 minutes. For a production system you'd want a proper job queue with durable state. For a hackathon, the timeout and error card are honest enough.
What I'd build next
YourMind is an MVP. It handles articles, notes, images, and audio. But the vision is bigger.
PDF support is the obvious next step. Gemini's File API handles PDF parsing, and the rest of the pipeline stays the same. A right-click context menu in the Chrome extension, so you can save a text selection or an image without opening the full page. Custom spaces beyond the auto-generated categories. And eventually, a mobile app, because the best second brain is the one you actually use, and that means it needs to be in your pocket.
The codebase is open and forkable. If you want to add PDF support, the pipeline architecture means you just need to handle the extraction step. Everything downstream (Gemini Flash, Lingo.dev, Gemini Embedding 2, Chroma) works identically regardless of content type.
Takeaway
The second brain space is booming. There's a new app every week, and most of them are solving the same problem: better ways to store and organize information.
Nobody's solving understanding.
When you remember something, your brain doesn't pull up a file in the language it was written in. It gives you the concept, in the language you think in, shaped by the way you understand the world. That's not a feature request. That's how cognition works. Neuroscience proved it. Philosophy theorized it decades earlier.
YourMind doesn't have a language dropdown on its content. There's no "translate this" button. The knowledge is just there, in your language, because that's how it should have been all along.
Your mind doesn't think in English. Your tools shouldn't either.
YourMind is open source and built for the Lingo.dev Multilingual Hackathon #3. Try it at your-mind.vercel.app or fork it on GitHub.


