<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Weimo Liu</title>
    <description>The latest articles on DEV Community by Weimo Liu (@weimoliu).</description>
    <link>https://dev.to/weimoliu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3076809%2F00240436-7db0-42b2-aef0-9efb9a0cb1d2.png</url>
      <title>DEV Community: Weimo Liu</title>
      <link>https://dev.to/weimoliu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/weimoliu"/>
    <language>en</language>
    <item>
      <title>How Wiz Crushed Lacework: A Data Infrastructure Perspective</title>
      <dc:creator>Weimo Liu</dc:creator>
      <pubDate>Mon, 04 Aug 2025 19:35:56 +0000</pubDate>
      <link>https://dev.to/weimoliu/how-wiz-crushed-lacework-a-data-infrastructure-perspective-1jk8</link>
      <guid>https://dev.to/weimoliu/how-wiz-crushed-lacework-a-data-infrastructure-perspective-1jk8</guid>
      <description>&lt;p&gt;Google's acquisition of Wiz for $32 billion was a clear signal to the industry: the cloud security war has a winner. What's more interesting is how they won. Wiz wasn't the first mover. Lacework started five years earlier with a solid team, strong product vision, and top-tier VC backing. So what went wrong for Lacework? And what went right for Wiz?&lt;/p&gt;

&lt;p&gt;If you browse social media, you'll find engineers and CISOs asking the same thing. Threads on X, Reddit, and Hacker News have dozens of posts dissecting the matchup. The answer holds lessons not just for security vendors, but for anyone building modern data-intensive products.&lt;/p&gt;

&lt;p&gt;Obviously, Wiz did many things right, from product strategy and GTM to customer support and execution. But there's one angle I haven't seen discussed much in the usual analysis. It happens to be my niche: data infrastructure. It might just be their secret weapon, and that's what prompted me to write this breakdown.&lt;/p&gt;

&lt;p&gt;Take Reddit, for example: multiple posts compare Lacework and Wiz, with engineers sharing firsthand experiences from evaluations and deployments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1d7853dxwlvvhtyv53iz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1d7853dxwlvvhtyv53iz.png" alt=" " width="749" height="323"&gt;&lt;/a&gt;Source: &lt;a href="https://www.reddit.com/r/cybersecurity/comments/1crzc92/wiz_vs_lacework/" rel="noopener noreferrer"&gt;Reddit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvp7waat87bzokt63pvbx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvp7waat87bzokt63pvbx.png" alt=" " width="670" height="285"&gt;&lt;/a&gt;Source: &lt;a href="https://www.reddit.com/r/cybersecurity/comments/1c1s9r2/comment/kz5xqe5/?utm_source=share&amp;amp;utm_medium=web3x&amp;amp;utm_name=web3xcss&amp;amp;utm_term=1&amp;amp;utm_content=share_button" rel="noopener noreferrer"&gt;Reddit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzfed8i3bw1iqfxetdvkj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzfed8i3bw1iqfxetdvkj.png" alt=" " width="595" height="217"&gt;&lt;/a&gt;Source: &lt;a href="https://www.reddit.com/r/cybersecurity/comments/1c1s9r2/comment/l7q1oie/?utm_source=share&amp;amp;utm_medium=web3x&amp;amp;utm_name=web3xcss&amp;amp;utm_term=1&amp;amp;utm_content=share_button" rel="noopener noreferrer"&gt;Reddit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5mdo0by0dtu9qvr5hcv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5mdo0by0dtu9qvr5hcv.png" alt=" " width="724" height="159"&gt;&lt;/a&gt;Source: &lt;a href="https://www.reddit.com/r/cybersecurity/comments/1c1s9r2/comment/kz7xx7s/?utm_source=share&amp;amp;utm_medium=web3x&amp;amp;utm_name=web3xcss&amp;amp;utm_term=1&amp;amp;utm_content=share_button" rel="noopener noreferrer"&gt;Reddit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frt85nxekcshruk3c3y5d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frt85nxekcshruk3c3y5d.png" alt=" " width="689" height="202"&gt;&lt;/a&gt;Source: &lt;a href="https://www.reddit.com/r/cybersecurity/comments/1cxi15l/comment/l553xbg/?utm_source=share&amp;amp;utm_medium=web3x&amp;amp;utm_name=web3xcss&amp;amp;utm_term=1&amp;amp;utm_content=share_button" rel="noopener noreferrer"&gt;Reddit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm not a security guy. I come from a data infrastructure background. But this story is just as much about data architecture as it is about product strategy.&lt;/p&gt;

&lt;p&gt;Let's look at what each company built.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lacework: Graph Ideas, SQL Reality
&lt;/h2&gt;

&lt;p&gt;Lacework launched in 2015 with the Polygraph® Data Platform. It aimed to detect threats by mapping relationships and behaviors between cloud assets, a classic graph use case. But under the hood, Lacework didn't use a graph database. They built it on Snowflake.&lt;/p&gt;

&lt;p&gt;Why Snowflake? Probably because Sutter Hill Ventures incubated both companies. And to be fair, Snowflake made sense on paper. It offers strong scalability and relatively low cost. You can store huge volumes of cloud telemetry, and it scales elastically. That's helpful for cost control and data retention.&lt;/p&gt;

&lt;p&gt;But there's a catch. Snowflake isn't built for graph workloads. Writing a 3-hop relationship query in SQL can take 100+ lines of nested joins. Here's what a basic traversal looks like in SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;network_id&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; 
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;logins&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; 
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;devices&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device_id&lt;/span&gt; 
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;connections&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device_id&lt;/span&gt; 
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;networks&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;network_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;network_id&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now imagine debugging this at 10 hops, with filters, aggregations, and alert logic. Even the best engineers will slow down. Development becomes brittle and difficult to maintain.&lt;/p&gt;
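&lt;p&gt;SQL's usual escape hatch for variable-depth traversal is a recursive CTE, and it isn't much friendlier. Here's a sketch (not Lacework's actual code; it assumes you've already flattened every relationship into a generic &lt;code&gt;edges(src_id, dst_id)&lt;/code&gt; table, which is extra work in itself):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;WITH RECURSIVE reachable AS (
  SELECT u.user_id AS node_id, 0 AS hops
  FROM users u
  WHERE u.user_id = 'alice'
  UNION ALL
  SELECT e.dst_id, r.hops + 1
  FROM reachable r
  JOIN edges e ON e.src_id = r.node_id
  WHERE r.hops &amp;lt; 10   -- depth cap; real queries also need cycle handling
)
SELECT DISTINCT node_id FROM reachable;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;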

&lt;h2&gt;
  
  
  Wiz: Native Graph, Feature Velocity
&lt;/h2&gt;

&lt;p&gt;Wiz was founded in 2020 by Assaf Rappaport and his former team from Adallom. They chose a different path. From day one, Wiz used Amazon Neptune, a native graph database.&lt;/p&gt;

&lt;p&gt;In a joint blog post with AWS titled "The World is a Graph", Wiz CTO Ami Luttwak explained their approach:&lt;/p&gt;

&lt;p&gt;"The world is a graph, not a table. It's time our tooling reflected this."&lt;/p&gt;

&lt;p&gt;Wiz modeled everything (users, assets, roles, and flows) as nodes and edges. They queried it with Gremlin. Here's a real-world example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;V&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;hasLabel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"vm"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;has&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"public"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"connectedTo"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;hasLabel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"network"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"reachableBy"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;has&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"role"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"admin"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This kind of logic takes just a few lines of Gremlin. In SQL? It would be a nightmare.&lt;/p&gt;

&lt;p&gt;This architectural choice gave Wiz a massive edge in developer velocity. With Neptune and Gremlin, engineers could express complex security logic in concise, readable queries and ship them quickly. What would take days or weeks in SQL due to brittle joins and long query chains could be prototyped and pushed in hours. This mattered. Security is a fast-moving field, and Wiz's ability to ship features at startup speed meant it could respond to customer requests, compliance requirements, and threat intelligence faster than Lacework. Even with a smaller team, they consistently outpaced Lacework's product delivery cadence.&lt;/p&gt;

&lt;p&gt;By 2022, Wiz deepened its commitment to graph infrastructure by continuing to scale on Amazon Neptune. Their bet on native graph tech was not just architectural; it defined their velocity and differentiation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Graph Bet That Changed Everything
&lt;/h2&gt;

&lt;p&gt;Lacework prioritized cost efficiency. By using Snowflake, they could ingest and retain massive volumes of telemetry with elastic scaling and lower marginal cost. They didn't need to maintain a separate graph database or optimize for graph workloads. The tradeoff was in capability: Snowflake's tabular design wasn't built for deep relationship queries. Modeling graph logic in SQL, especially multi-hop joins, was verbose, fragile, and hard to iterate on. This slowed down development and made advanced threat modeling harder to execute.&lt;/p&gt;

&lt;p&gt;Wiz optimized for speed. By betting on a native graph engine, they gained fast iteration, concise query logic, and a security model grounded in relationships. They could express new detections or traversal-based insights in a few lines of Gremlin, prototype ideas quickly, and ship updates faster.&lt;/p&gt;

&lt;p&gt;In cybersecurity, speed wins. Customers care more about feature velocity and detection quality than marginal compute savings. Wiz took a costly but strategic path: they paid much more for infrastructure but delivered faster innovation and outpaced the field. And that cost is real; native graph databases bring their own scaling headaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A friend at a Series F cybersecurity startup told me they only store a single day's worth of graph data, because the graph database cannot scale out.&lt;/li&gt;
&lt;li&gt;Another company splits its graph workload: topology stays in a graph database, but all attributes are offloaded to a warehouse like Snowflake or Databricks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lacework's architecture helped them scale cheaply, but that same architecture made it difficult to build graph-native security features. Their bet optimized for storage and cost; Wiz's bet optimized for iteration and product value. The outcome was clear.&lt;/p&gt;

&lt;h2&gt;
  
  
  Can You Get Wiz's Speed With the Low Cost and Unlimited Scalability of Data Lakes?
&lt;/h2&gt;

&lt;p&gt;If you've made it this far, you might be wondering: is it possible to get the benefits of a native graph system (fast iteration, expressive multi-hop queries) without the painful cost and complexity of traditional graph databases?&lt;br&gt;
Plenty of cybersecurity unicorns have attempted creative workarounds to the scalability and cost challenges of traditional graph databases, but the wish list most teams carry around looks something like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No ETL&lt;/li&gt;
&lt;li&gt;No duplicated storage&lt;/li&gt;
&lt;li&gt;Query your Parquet files or Iceberg/Delta tables with Cypher or Gremlin&lt;/li&gt;
&lt;li&gt;Subsecond response times&lt;/li&gt;
&lt;li&gt;Lower cost than Snowflake&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those workarounds are clever tradeoffs. But they're still compromises.&lt;br&gt;
What if you didn't have to choose between fast iteration and low cost?&lt;/p&gt;

&lt;p&gt;(Trigger warning: Shameless plug coming)&lt;/p&gt;

&lt;p&gt;That's the question we asked ourselves when building PuppyGraph, a graph query engine designed to run directly on your data lake.&lt;/p&gt;

&lt;p&gt;Wiz chose graphs and shipped features fast. Lacework chose SQL and struggled with velocity.&lt;/p&gt;

&lt;p&gt;The best part of the Wiz story isn't just that they chose graphs; it's that they embraced the tradeoff. They paid more in infrastructure, but got faster iteration and better product velocity in return.&lt;/p&gt;

&lt;p&gt;Now imagine building at that speed, with a much smaller bill. If you're building the next Wiz, maybe you don't need a $32B exit. Perhaps you just need the right graph engine. (Okay fine, a $32B exit would be nice too.)&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Real-Time Threat Detection With MongoDB &amp; PuppyGraph</title>
      <dc:creator>Weimo Liu</dc:creator>
      <pubDate>Fri, 11 Jul 2025 03:37:19 +0000</pubDate>
      <link>https://dev.to/weimoliu/real-time-threat-detection-with-mongodb-puppygraph-29la</link>
      <guid>https://dev.to/weimoliu/real-time-threat-detection-with-mongodb-puppygraph-29la</guid>
      <description>&lt;p&gt;Security operations teams face an increasingly complex environment. Cloud-native applications, identity sprawl, and continuous infrastructure changes generate a flood of logs and events. From API calls in AWS to lateral movement between virtual machines, the volume of telemetry is enormous-and it's growing.&lt;/p&gt;

&lt;p&gt;The challenge isn't just scale. It's structure. Traditional security tooling often looks at events in isolation, relying on static rules or dashboards to highlight anomalies. But real attacks unfold as chains of related actions: a user assumes a role, launches a resource, accesses data, and then pivots again. These relationships are hard to capture with flat queries or disconnected logs.&lt;/p&gt;

&lt;p&gt;That's where graph analytics comes in. By modeling your data as a network of users, sessions, identities, and events, you can trace how threats emerge and evolve. And with PuppyGraph, you don't need a separate graph database or batch pipelines to get there.&lt;br&gt;
In this post, we'll show how to combine MongoDB and PuppyGraph to analyze AWS CloudTrail data as a graph, without moving or duplicating data. You'll see how to uncover privilege escalation chains, map user behavior across sessions, and detect suspicious access patterns in real time.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why MongoDB for cybersecurity data
&lt;/h2&gt;

&lt;p&gt;MongoDB is a popular choice for managing security telemetry. Its document-based model is ideal for ingesting unstructured and semi-structured logs like those generated by AWS CloudTrail, GuardDuty, or Kubernetes audit logs. Events are stored as flexible JSON documents, which evolve naturally as logging formats change.&lt;/p&gt;
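&lt;p&gt;For a sense of the shape involved, here is a heavily abridged CloudTrail event as it might sit in a collection (the field names are standard CloudTrail; the values are invented):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "eventTime": "2017-02-12T21:59:18Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "ListBuckets",
  "awsRegion": "us-west-2",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {
    "type": "IAMUser",
    "accountId": "811596193553",
    "sessionContext": { "attributes": { "mfaAuthenticated": "false" } }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;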

&lt;p&gt;This flexibility matters in security, where schemas can shift as providers update APIs or teams add new context to events. MongoDB handles these changes without breaking pipelines or requiring schema migrations. It also supports high-throughput ingestion and horizontal scaling, making it well-suited for operational telemetry.&lt;/p&gt;

&lt;p&gt;Many security products and SIEM backends already support MongoDB as a destination for real-time event streams. That makes it a natural foundation for graph-based security analytics: The data is already there—rich, semi-structured, and continuously updated.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why graph analytics for threat detection
&lt;/h2&gt;

&lt;p&gt;Modern security incidents rarely unfold as isolated events. Attackers don’t just trip a single rule—they navigate through systems, identities, and resources, often blending in with legitimate activity. Understanding these behaviors means connecting the dots across multiple entities and actions. That’s precisely what graph analytics excels at. By modeling users, sessions, events, and assets as interconnected nodes and edges, analysts can trace how activity flows through a system. This structure makes it easy to ask questions that involve multiple hops or indirect relationships—something traditional queries often struggle to express.&lt;/p&gt;

&lt;p&gt;For example, imagine you’re investigating activity tied to a specific AWS account. You might start by counting how many sessions are associated with that account. Then, you might break those sessions down by whether they were authenticated using MFA. If some weren’t, the next question becomes: What resources were accessed during those unauthenticated sessions?&lt;/p&gt;

&lt;p&gt;This kind of multi-step investigation is where graph queries shine. Instead of scanning raw logs or filtering one table at a time, you can traverse the entire path from account to identity to session to event to resource, all in a single query. You can also group results by attributes like resource type to identify which services were most affected.&lt;/p&gt;
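&lt;p&gt;As a hedged sketch, that entire path might read as a single Gremlin traversal. (&lt;code&gt;HasIdentity&lt;/code&gt; and &lt;code&gt;HasSession&lt;/code&gt; match the demo schema used later in this post; &lt;code&gt;RecordsEvent&lt;/code&gt;, &lt;code&gt;AccessesResource&lt;/code&gt;, and &lt;code&gt;resource_type&lt;/code&gt; are hypothetical names for the remaining hops.)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g.V("Account[811596193553]")
  .out("HasIdentity").out("HasSession")
  .has("mfa_authenticated", false)             // only non-MFA sessions
  .out("RecordsEvent").out("AccessesResource") // hypothetical edge labels
  .groupCount().by("resource_type")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;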

&lt;p&gt;And when needed, you can go beyond metrics and pivot to visualization, mapping out full access paths to see how a specific user or session interacted with sensitive infrastructure. This helps surface lateral movement, track privilege escalation, and uncover patterns that static alerts might miss.&lt;/p&gt;

&lt;p&gt;Graph analytics doesn’t replace your existing detection rules; it complements them by revealing the structure behind security activity. It turns complex event relationships into something you can query directly, explore interactively, and act on with confidence.&lt;/p&gt;
&lt;h2&gt;
  
  
  Query MongoDB data as a graph without ETL
&lt;/h2&gt;

&lt;p&gt;MongoDB is a popular choice for storing security event data, especially when working with logs that don’t always follow a fixed structure. Services like AWS CloudTrail produce large volumes of JSON-based records with fields that can differ across events. MongoDB’s flexible schema makes it easy to ingest and query that data as it evolves.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://hubs.ly/Q03q_lFd0" rel="noopener noreferrer"&gt;PuppyGraph&lt;/a&gt; builds on this foundation by introducing graph analytics—without requiring any data movement. Through the &lt;a href="https://www.mongodb.com/products/platform/atlas-sql-interface" rel="noopener noreferrer"&gt;MongoDB Atlas SQL Interface&lt;/a&gt;, PuppyGraph can connect directly to your collections and treat them as relational tables. From there, you define a graph model by mapping key fields into nodes and relationships.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Figure 1.&lt;/strong&gt; Architecture of the integration of MongoDB and PuppyGraph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkehjez8lwavysvblmb6w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkehjez8lwavysvblmb6w.png" alt="Figure 1. Architecture of the integration of MongoDB and PuppyGraph." width="800" height="285"&gt;&lt;/a&gt;&lt;br&gt;
This makes it possible to explore questions that involve multiple entities and steps, such as tracing how a session relates to an identity or which resources were accessed without MFA. The graph itself is virtual. There’s no ETL process or data duplication. Queries run in real time against the data already stored in MongoDB.&lt;/p&gt;

&lt;p&gt;While PuppyGraph works with tabular structures exposed through the SQL interface, many security logs already follow a relatively flat pattern: consistent fields like account IDs, event names, timestamps, and resource types. That makes it straightforward to build graphs that reflect how accounts, sessions, events, and resources are linked. By layering graph capabilities on top of MongoDB, teams can ask more connected questions of their security data, without changing their storage strategy or duplicating infrastructure.&lt;/p&gt;
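&lt;p&gt;Concretely, the graph model is a JSON file that maps collections (exposed as tables through the SQL interface) onto node and edge labels with key and join columns. The sketch below shows the idea only, not PuppyGraph's exact schema syntax, and the table names are hypothetical; the working &lt;code&gt;schema.json&lt;/code&gt; ships with the demo materials linked below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Conceptual sketch only -- not the exact PuppyGraph schema format.
node  Account   from table "accounts"    key account_id
node  Identity  from table "identities"  key identity_id
node  Session   from table "sessions"    key session_id   attrs: mfa_authenticated
edge  HasIdentity  Account  -&amp;gt; Identity  join accounts.account_id = identities.account_id
edge  HasSession   Identity -&amp;gt; Session   join identities.identity_id = sessions.identity_id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;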
&lt;h2&gt;
  
  
  Investigating CloudTrail activity using graph queries
&lt;/h2&gt;

&lt;p&gt;To demonstrate how graph analytics can enhance security investigations, we’ll explore a real-world dataset of AWS CloudTrail logs. This dataset originates from &lt;a href="https://summitroute.com/blog/2020/10/09/public_dataset_of_cloudtrail_logs_from_flaws_cloud/" rel="noopener noreferrer"&gt;flaws.cloud&lt;/a&gt;, a security training environment developed by Scott Piper.&lt;/p&gt;

&lt;p&gt;The dataset comprises anonymized CloudTrail logs collected over 3.5 years, capturing a wide range of simulated attack scenarios within a controlled AWS environment. It includes over 1.9 million events, featuring interactions from thousands of unique IP addresses and user agents. The logs encompass various AWS API calls, providing a comprehensive view of potential security events and misconfigurations.&lt;/p&gt;

&lt;p&gt;For our demonstration, we imported a subset of approximately 100,000 events into MongoDB Atlas. With PuppyGraph's graph analytics applied on top, we can model and analyze complex relationships between accounts, identities, sessions, events, and resources.&lt;/p&gt;
&lt;h3&gt;
  
  
  Demo
&lt;/h3&gt;

&lt;p&gt;Let’s walk through the demo step by step! We have provided all the materials for this demo on &lt;a href="https://github.com/puppygraph/puppygraph-getting-started/tree/main/use-case-demos/cloudtrail-mongodb-demo" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. Please download the materials or clone the repository directly.&lt;/p&gt;

&lt;p&gt;If you’re new to integrating MongoDB Atlas with PuppyGraph, we recommend starting with the &lt;a href="https://hubs.ly/Q03q_lRV0" rel="noopener noreferrer"&gt;MongoDB Atlas + PuppyGraph Quickstart Demo&lt;/a&gt; to get familiar with the setup and core concepts.&lt;/p&gt;
&lt;h4&gt;
  
  
  Prerequisites
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;A MongoDB Atlas account (free tier is sufficient)&lt;/li&gt;
&lt;li&gt;Docker&lt;/li&gt;
&lt;li&gt;Python 3&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  Set up MongoDB Atlas
&lt;/h4&gt;

&lt;p&gt;Follow the &lt;a href="https://www.mongodb.com/docs/atlas/getting-started/" rel="noopener noreferrer"&gt;MongoDB Atlas Getting Started guide&lt;/a&gt; to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new cluster (free tier is fine).&lt;/li&gt;
&lt;li&gt;Add a database user.&lt;/li&gt;
&lt;li&gt;Configure IP access.&lt;/li&gt;
&lt;li&gt;Note your connection string for the MongoDB Python driver (you’ll need it shortly).&lt;/li&gt;
&lt;/ol&gt;
&lt;h4&gt;
  
  
  Download and import CloudTrail logs
&lt;/h4&gt;

&lt;p&gt;Run the following commands to fetch and prepare the dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget https://summitroute.com/downloads/flaws_cloudtrail_logs.tar
mkdir -p ./raw_data
tar -xvf flaws_cloudtrail_logs.tar --strip-components=1 -C ./raw_data
gunzip ./raw_data/*.json.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a virtual environment and install dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# On some Linux distributions, install `python3-venv` first.
sudo apt-get update
sudo apt-get install python3-venv
# Create a virtual environment, activate it, and install the necessary packages 
python3 -m venv venv
source venv/bin/activate
pip install ijson faker pandas pymongo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Import the first chunk of CloudTrail data (replace the connection string with your Atlas URI):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export MONGODB_CONNECTION_STRING="your_mongodb_connection_string"
python import_data.py raw_data/flaws_cloudtrail00.json --database cloudtrail
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a new cloudtrail database and loads the first chunk of data containing 100,000 structured events.&lt;/p&gt;

&lt;h4&gt;
  
  
  Enable Atlas SQL interface and get JDBC URI
&lt;/h4&gt;

&lt;p&gt;To enable graph access:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create an Atlas SQL Federated Database instance.&lt;/li&gt;
&lt;li&gt;Ensure the schema is available (generate from sample, if needed).&lt;/li&gt;
&lt;li&gt;Copy the JDBC URI from the Atlas SQL interface.
See PuppyGraph’s guide for &lt;a href="https://docs.puppygraph.com/getting-started/querying-mongodb-atlas-data-as-a-graph/" rel="noopener noreferrer"&gt;setting up MongoDB Atlas SQL&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Start PuppyGraph and upload the graph schema
&lt;/h4&gt;

&lt;p&gt;Start the PuppyGraph container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -p 8081:8081 -p 8182:8182 -p 7687:7687 \
  -e PUPPYGRAPH_PASSWORD=puppygraph123 \
  -d --name puppy --rm --pull=always puppygraph/puppygraph:stable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Log in to the web UI at &lt;a href="http://localhost:8081" rel="noopener noreferrer"&gt;http://localhost:8081&lt;/a&gt; with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Username: puppygraph&lt;/li&gt;
&lt;li&gt;Password: puppygraph123&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upload the schema:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open schema.json.&lt;/li&gt;
&lt;li&gt;Fill in your JDBC URI, username, and password.&lt;/li&gt;
&lt;li&gt;Upload via the &lt;strong&gt;Upload Graph Schema JSON&lt;/strong&gt; section or run:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -XPOST -H "content-type: application/json" \
  --data-binary @./schema.json \
  --user "puppygraph:puppygraph123" localhost:8081/schema
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait for the schema to upload and initialize (approximately five minutes).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Figure 2.&lt;/strong&gt; A graph visualization of the schema, which models the graph from relational data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F29q06nhhso0su5n2s393.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F29q06nhhso0su5n2s393.png" alt="Figure 2. A graph visualization of the schema, which models the graph from relational data." width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Run graph queries to investigate security activity
&lt;/h4&gt;

&lt;p&gt;Once the graph is live, open the Query panel in PuppyGraph’s UI.&lt;/p&gt;

&lt;p&gt;Let's say we want to investigate the activity of a specific account. First, we count the number of sessions associated with the account.&lt;/p&gt;

&lt;p&gt;Cypher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH (a:Account)-[:HasIdentity]-&amp;gt;(i:Identity)
  -[:HasSession]-&amp;gt;(s:Session)
WHERE id(a) = "Account[811596193553]"
RETURN count(s)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gremlin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g.V("Account[811596193553]")
 .out("HasIdentity").out("HasSession").count()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Figure 3.&lt;/strong&gt; Graph query in the PuppyGraph UI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhcoe8zsxfjbqsp2fdtwr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhcoe8zsxfjbqsp2fdtwr.png" alt="Figure 3. Graph query in the PuppyGraph UI." width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then, we break these sessions down by whether or not they are MFA-authenticated.&lt;/p&gt;

&lt;p&gt;Cypher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH (a:Account)-[:HasIdentity]-&amp;gt;(i:Identity)
  -[:HasSession]-&amp;gt;(s:Session)
WHERE id(a) = "Account[811596193553]"
RETURN s.mfa_authenticated AS mfaStatus, count(s) AS count
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gremlin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g.V("Account[811596193553]")
  .out("HasIdentity").out("HasSession")
  .groupCount().by("mfa_authenticated")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Figure 4.&lt;/strong&gt; Graph query results in the PuppyGraph UI.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rd9nuf5fw63lceynvp8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rd9nuf5fw63lceynvp8.png" alt="Figure 4. Graph query results in the PuppyGraph UI." width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;
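
&lt;p&gt;To make the traversal semantics concrete, here is a pure-Python sketch of what these chained &lt;code&gt;out()&lt;/code&gt; hops and the &lt;code&gt;groupCount()&lt;/code&gt; by property compute. The adjacency data below is a made-up toy example, not the actual CloudTrail graph:&lt;/p&gt;

```python
from collections import Counter

# Toy in-memory graph: adjacency lists keyed by (vertex, edge label).
# The vertices and sessions here are illustrative only.
edges = {
    ("Account[811596193553]", "HasIdentity"): ["Identity[admin]", "Identity[ci-bot]"],
    ("Identity[admin]", "HasSession"): ["Session[s1]", "Session[s2]"],
    ("Identity[ci-bot]", "HasSession"): ["Session[s3]"],
}

session_props = {
    "Session[s1]": {"mfa_authenticated": True},
    "Session[s2]": {"mfa_authenticated": False},
    "Session[s3]": {"mfa_authenticated": False},
}

def out(vertices, label):
    """Follow all outgoing edges with the given label (like Gremlin's .out(label))."""
    return [t for v in vertices for t in edges.get((v, label), [])]

# g.V("Account[...]").out("HasIdentity").out("HasSession").count()
sessions = out(out(["Account[811596193553]"], "HasIdentity"), "HasSession")
print(len(sessions))  # 3

# .groupCount().by("mfa_authenticated")
by_mfa = Counter(session_props[s]["mfa_authenticated"] for s in sessions)
print(dict(by_mfa))  # {True: 1, False: 2}
```

&lt;p&gt;On the real graph, PuppyGraph evaluates the same pattern against the underlying relational tables instead of an in-memory dictionary.&lt;/p&gt;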

&lt;p&gt;Next, we focus on the sessions that are not MFA-authenticated and see which resources they accessed.&lt;/p&gt;

&lt;p&gt;Cypher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH (a:Account)-[:HasIdentity]-&amp;gt;
  (i:Identity)-[:HasSession]-&amp;gt;
  (s:Session {mfa_authenticated: false})
  -[:RecordsEvent]-&amp;gt;(e:Event)
  -[:OperatesOn]-&amp;gt;(r:Resource)
WHERE id(a) = "Account[811596193553]"
RETURN r.resource_type AS resourceType, count(r) AS count
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gremlin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g.V("Account[811596193553]").out("HasIdentity")
  .out("HasSession")
  .has("mfa_authenticated", false)
  .out('RecordsEvent').out('OperatesOn')
  .groupCount().by("resource_type")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Figure 5.&lt;/strong&gt; PuppyGraph UI showing results that are not MFA authenticated.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Filxe85mozfbscymg1mr2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Filxe85mozfbscymg1mr2.png" alt="Figure 5. PuppyGraph UI showing results that are not MFA authenticated." width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, we visualize those access paths as a graph.&lt;/p&gt;

&lt;p&gt;Cypher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH path = (a:Account)-[:HasIdentity]-&amp;gt;
  (i:Identity)-[:HasSession]-&amp;gt;
  (s:Session {mfa_authenticated: false})
  -[:RecordsEvent]-&amp;gt;(e:Event)
  -[:OperatesOn]-&amp;gt;(r:Resource)
WHERE id(a) = "Account[811596193553]"
RETURN path
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gremlin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g.V("Account[811596193553]").out("HasIdentity").out("HasSession").has("mfa_authenticated", false)
  .out('RecordsEvent').out('OperatesOn')
  .path()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Figure 6.&lt;/strong&gt; Graph visualization in PuppyGraph UI.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdtykf6feuxwt47m1j1lr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdtykf6feuxwt47m1j1lr.png" alt="Figure 6. Graph visualization in PuppyGraph UI." width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Tear down the environment
&lt;/h4&gt;

&lt;p&gt;When you’re done:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker stop puppy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your MongoDB data will persist in Atlas, so you can revisit or expand the graph model at any time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Security data is rich with relationships between users, sessions, resources, and actions. Modeling these connections explicitly makes it easier to understand what’s happening in your environment, especially when investigating incidents or searching for hidden risks.&lt;/p&gt;

&lt;p&gt;By combining MongoDB Atlas and PuppyGraph, teams can analyze those relationships in real time without moving data or maintaining a separate &lt;a href="https://www.puppygraph.com/blog/graph-database?utm_campaign=12151601-MongoDB%20partner%20blog%20campaign&amp;amp;utm_source=https%3A%2F%2Fwww.mongodb.com%2F&amp;amp;utm_medium=MongoDB%20blog" rel="noopener noreferrer"&gt;graph database&lt;/a&gt;. MongoDB provides the flexibility and scalability to store complex, evolving security logs like AWS CloudTrail, while PuppyGraph adds a native graph layer for exploring that data as connected paths and patterns.&lt;/p&gt;

&lt;p&gt;In this post, we walked through how to import real-world audit logs, define a graph schema, and investigate access activity using graph queries. With just a few steps, you can transform a log collection into an interactive graph that reveals how activity flows across your cloud infrastructure.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>PuppyGraph on MongoDB: Native Graph Queries Without ETL</title>
      <dc:creator>Weimo Liu</dc:creator>
      <pubDate>Tue, 22 Apr 2025 22:13:34 +0000</pubDate>
      <link>https://dev.to/weimoliu/puppygraph-on-mongodb-native-graph-queries-without-etl-416l</link>
      <guid>https://dev.to/weimoliu/puppygraph-on-mongodb-native-graph-queries-without-etl-416l</guid>
<description>&lt;p&gt;MongoDB powers a wide range of workloads—from product catalogs to telemetry streams to user activity logs. Its schema-less structure and distributed architecture make it a natural fit for applications that demand both agility and scale.&lt;/p&gt;

&lt;p&gt;But in many real-world scenarios, data points aren’t just valuable on their own—they’re more powerful when understood in context. Connections between entities often reveal the patterns that matter most: how users interact, how systems behave, and how events unfold over time. While MongoDB provides expressive tools for working with nested documents and join operations across collections, some types of relationship analysis are more naturally expressed as graph queries.&lt;/p&gt;

&lt;p&gt;That’s where PuppyGraph comes in. It adds a real-time graph layer on top of your existing MongoDB deployment—no ETL, no separate &lt;a href="https://hubs.ly/Q03jp_470" rel="noopener noreferrer"&gt;graph databases&lt;/a&gt; needed. You can define a graph model across your collections and run queries using openCypher or Gremlin, all without modifying your source data.&lt;/p&gt;

&lt;p&gt;In this tutorial, we’ll walk through how PuppyGraph connects to MongoDB, how it complements document-based architectures with graph capabilities, and how you can get started running graph queries with minimal setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is MongoDB?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;MongoDB&lt;/a&gt; is a document-oriented NoSQL database designed to store and manage data in a flexible, JSON-like format. Unlike traditional relational databases that use tables and rows, MongoDB employs collections and documents, allowing for dynamic schemas that can evolve with application requirements. This flexibility makes it particularly well-suited for handling semi-structured and unstructured data, accommodating use cases such as content management systems, real-time analytics, AI vector search, and Internet of Things (IoT) applications.&lt;/p&gt;

&lt;p&gt;In MongoDB, data is organized into collections of documents, each containing key-value pairs. This structure enables developers to represent complex hierarchical relationships within a single document, reducing the need for expensive join operations. For example, a document representing a blog post can encapsulate not only the post content but also metadata like author information and comments, all within the same document.&lt;/p&gt;
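
&lt;p&gt;For instance, such a blog-post document might look like the following. The field names and values are illustrative (shown here as a Python dict, which maps directly to a BSON document):&lt;/p&gt;

```python
# An illustrative blog-post document: content, author metadata, and comments
# are all embedded in one document, so no join is needed to assemble the page.
blog_post = {
    "title": "Graph Queries on MongoDB",
    "body": "Lorem ipsum...",
    "author": {"name": "Weimo Liu", "handle": "weimoliu"},
    "comments": [
        {"user": "alice", "text": "Great walkthrough!"},
        {"user": "bob", "text": "Does this work with self-hosted clusters?"},
    ],
    "tags": ["mongodb", "graph"],
}

# Everything needed to render the post is one lookup away.
print(blog_post["author"]["name"], len(blog_post["comments"]))
```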

&lt;p&gt;The database offers a rich set of features, including a powerful query API that supports field searches, range queries, and regular expressions. Indexing capabilities enhance query performance, allowing developers to create indexes on any field. Additionally, MongoDB’s aggregation framework facilitates data transformation and analysis directly within the database, streamlining the development of analytics applications.&lt;/p&gt;
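
&lt;p&gt;As a sketch of the idea, an aggregation pipeline is an ordered list of stages. The hypothetical pipeline below (the stage operators are standard MongoDB; the collection and field names are made up) filters documents and then groups them, followed by a plain-Python rendering of what those two stages compute:&lt;/p&gt;

```python
from collections import Counter

# A hypothetical pipeline: keep active documents, then count per event type.
pipeline = [
    {"$match": {"status": "active"}},
    {"$group": {"_id": "$event_type", "count": {"$sum": 1}}},
]

# What $match + $group compute, expressed in plain Python over sample docs:
docs = [
    {"event_type": "login", "status": "active"},
    {"event_type": "login", "status": "active"},
    {"event_type": "purchase", "status": "active"},
    {"event_type": "login", "status": "archived"},
]
matched = [d for d in docs if d["status"] == "active"]   # $match
counts = Counter(d["event_type"] for d in matched)        # $group + $sum
print(dict(counts))  # {'login': 2, 'purchase': 1}
```

&lt;p&gt;In MongoDB itself, the same pipeline would run server-side via &lt;code&gt;db.collection.aggregate(pipeline)&lt;/code&gt;, avoiding the round trip of pulling documents into the application.&lt;/p&gt;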

&lt;h2&gt;
  
  
  MongoDB Atlas: Managed Cloud Database Service
&lt;/h2&gt;

&lt;p&gt;Recognizing the operational challenges associated with managing databases, MongoDB introduced &lt;a href="https://www.mongodb.com/products/platform/atlas-database?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;MongoDB Atlas&lt;/a&gt;, a fully managed cloud database service. MongoDB Atlas simplifies the deployment, scaling, and management of MongoDB databases, allowing developers to focus on building applications rather than handling database administration tasks.&lt;/p&gt;

&lt;p&gt;MongoDB Atlas provides automated deployment across major cloud providers, including AWS, Google Cloud Platform, and Microsoft Azure, offering flexibility and global reach. It features automated backups, ensuring data durability and facilitating disaster recovery. Built-in monitoring tools provide real-time insights into database performance, enabling proactive optimization and maintenance.&lt;/p&gt;

&lt;p&gt;Security is a core component of MongoDB Atlas, with features such as end-to-end encryption, network isolation, and fine-grained access controls to protect sensitive data. The platform also supports compliance with various industry standards, making it suitable for applications with stringent security requirements.&lt;/p&gt;

&lt;p&gt;By combining the flexibility of MongoDB’s document model with the operational simplicity of a managed service, MongoDB Atlas empowers organizations to build and scale applications with greater agility and confidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  Graph Analytics for MongoDB using PuppyGraph
&lt;/h2&gt;

&lt;p&gt;For teams working with MongoDB, many valuable insights come from understanding how entities relate across collections — whether it’s tracing user journeys, mapping operational dependencies, or detecting linked anomalies. These questions are most naturally expressed as graph queries when the goal is to trace connections, analyze paths, or detect patterns that span multiple entities.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://hubs.ly/Q03jp_C_0" rel="noopener noreferrer"&gt;PuppyGraph&lt;/a&gt; adds a real-time graph layer to MongoDB, allowing teams to query those relationships using graph-specific languages like Gremlin or openCypher. Without migrating or duplicating data, you can define how collections map to nodes and edges, then run graph queries directly against MongoDB Atlas or self-hosted deployments. Under the hood, PuppyGraph connects via the MongoDB Atlas SQL JDBC driver, querying live data and returning results with no ETL or transformation required.&lt;/p&gt;

&lt;p&gt;This integration offers several key benefits:&lt;/p&gt;

&lt;p&gt;Query Live Data, Not Snapshots: MongoDB often powers applications with dynamic, operational data — user interactions, catalogs, IoT streams, or content updates. PuppyGraph allows the execution of graph queries directly on this live data. This means you can immediately analyze emerging relationships, detect anomalies like fraud patterns as they happen, or power real-time recommendation engines without waiting for slow batch processes or dealing with data staleness.&lt;/p&gt;

&lt;p&gt;Traverse Relationships with Purpose-Built Queries: Go beyond simple document retrieval. Graph query languages like Gremlin and openCypher, supported by PuppyGraph, are purpose-built for traversing connections, finding paths, analyzing influence, and understanding network structures. This facilitates the application of graph algorithms for tasks like PageRank (identifying importance), community detection (finding clusters), pathfinding, and more, uncovering insights hidden within the relationships scattered across your MongoDB documents — insights that might be difficult or inefficient to obtain using standard document queries alone.&lt;/p&gt;
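
&lt;p&gt;To illustrate what an algorithm like PageRank actually computes, here is a minimal power-iteration sketch in plain Python on a three-node toy graph. This is illustrative only; it does not show how PuppyGraph executes such algorithms internally:&lt;/p&gt;

```python
# Minimal power-iteration PageRank on a toy directed graph.
def pagerank(edges, nodes, damping=0.85, iters=50):
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out_deg = {n: sum(1 for s, _ in edges if s == n) for n in nodes}
    for _ in range(iters):
        # Teleport term, then distribute each node's rank along its out-edges.
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for s, d in edges:
            new[d] += damping * rank[s] / out_deg[s]
        # Redistribute rank from dangling nodes (no out-edges) uniformly.
        dangling = sum(rank[n] for n in nodes if out_deg[n] == 0)
        for n in nodes:
            new[n] += damping * dangling / len(nodes)
        rank = new
    return rank

nodes = ["a", "b", "c"]
edges = [("a", "c"), ("b", "c"), ("c", "a")]
r = pagerank(edges, nodes)
print(max(r, key=r.get))  # 'c' collects the most rank: two in-links
```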

&lt;p&gt;No ETL, No New Stack: Instead of building a separate graph system and maintaining sync jobs, PuppyGraph works directly with MongoDB. This means lower operational overhead, fewer moving parts, and analytics that always reflect the current state of your operational data — a unified source of truth for both document-based and graph-based analysis, all from your existing data platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integration Architecture: PuppyGraph and MongoDB
&lt;/h2&gt;

&lt;p&gt;Integrating PuppyGraph with MongoDB Atlas involves a series of components working together to enable seamless graph analytics capabilities.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architectural Components
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.mongodb.com/cloud/atlas/register?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;MongoDB Atlas&lt;/a&gt;: A fully managed cloud database service that stores data in a flexible, document-oriented format.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.mongodb.com/try/download/jdbc-driver?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;MongoDB Atlas SQL JDBC Driver&lt;/a&gt;: Provides SQL-based access to MongoDB Atlas databases, facilitating connections with SQL-compatible tools and applications.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://hubs.ly/Q03jp_C_0" rel="noopener noreferrer"&gt;PuppyGraph&lt;/a&gt;: Connects to MongoDB Atlas via the JDBC driver, allowing users to define graph schemas over existing collections and execute graph queries using languages like Gremlin or openCypher.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Integration Steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Prepare data in the MongoDB Atlas cluster: Create or import the necessary collections into your MongoDB Atlas cluster.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Configure Connection Settings: Set up the connection in PuppyGraph using the JDBC connection string.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Define the Graph Schema: Map MongoDB collections to graph elements such as vertices and edges within PuppyGraph.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Execute Graph Queries and Algorithms: Use Gremlin or openCypher to perform complex graph traversals and run graph algorithms directly on MongoDB data.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This architecture allows organizations to leverage their existing MongoDB infrastructure to perform sophisticated graph analyses, enhancing their data analysis capabilities without the need for additional data processing steps.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnj03ikvev6ueqzel39v.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnj03ikvev6ueqzel39v.webp" alt="Figure: How PuppyGraph Integrates with MongoDB" width="800" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step-by-Step: Running Graph Queries on MongoDB Atlas with PuppyGraph
&lt;/h2&gt;

&lt;p&gt;Let's walk through a simple demo to see exactly how MongoDB integrates with PuppyGraph. We also recommend reading the &lt;a href="https://docs.puppygraph.com/getting-started/querying-mongodb-atlas-data-as-a-graph/" rel="noopener noreferrer"&gt;getting-started document&lt;/a&gt;; what we do here is essentially the same.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.mongodb.com/atlas?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;MongoDB Atlas Cluster&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.docker.com/" rel="noopener noreferrer"&gt;Docker&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.mongodb.com/products/tools/shell?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;MongoDB Shell&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Create a MongoDB Atlas Cluster
&lt;/h3&gt;

&lt;p&gt;See the &lt;a href="https://www.mongodb.com/docs/atlas/getting-started/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; to get started with MongoDB Atlas. You can use the MongoDB Atlas CLI or MongoDB Atlas UI to deploy a free cluster easily. Follow the detailed instructions in the document up to step 4, &lt;a href="https://www.mongodb.com/docs/atlas/getting-started/#manage-the-ip-access-list." rel="noopener noreferrer"&gt;Manage the IP access list&lt;/a&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a MongoDB Atlas cluster.&lt;/li&gt;
&lt;li&gt;Deploy a Free cluster.&lt;/li&gt;
&lt;li&gt;Manage database users for your cluster.&lt;/li&gt;
&lt;li&gt;Manage the IP access list.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Data Preparation
&lt;/h3&gt;

&lt;p&gt;See the &lt;a href="https://www.mongodb.com/docs/mongodb-shell/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim#mongodb-binary-bin.mongosh" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; to connect your cluster via MongoDB Shell. You need to get your connection string. After connecting successfully, run the following commands to create collections and insert data. Documents within a collection are flexible; they don’t have to adhere to the same schema. However, to mitigate potential errors, we create collections with schema validators.&lt;/p&gt;

&lt;p&gt;First, select the database, which will be created automatically once the collections are created.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;use modern
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then create collections with schema validators and insert data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;db.createCollection("person", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "id", "name", "age" ],
         properties: {
            id: { bsonType: "string" },
            name: { bsonType: "string" },
            age: { bsonType: "int"}
         }
      }
   }
})
db.person.insertMany([
  {id: 'v1', name: 'marko', age: 29},
  {id: 'v2', name: 'vadas', age: 27},
  {id: 'v4', name: 'josh', age: 32},
  {id: 'v6', name: 'peter', age: 35}
])
db.createCollection("software", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "id", "name", "lang" ],
         properties: {
            id: { bsonType: "string" },
            name: { bsonType: "string" },
            lang: { bsonType: "string" }
         }
      }
   }
})
db.software.insertMany([
  {id: 'v3', name: 'lop', lang: 'java'},
  {id: 'v5', name: 'ripple', lang: 'java'}
])
db.createCollection("created", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "id", "from_id", "to_id", "weight" ],
         properties: {
            id: { bsonType: "string" },
            from_id: { bsonType: "string" },
            to_id: { bsonType: "string" },
            weight: { bsonType: "double" }
         }
      }
   }
})
db.created.insertMany([
  {id: 'e9', from_id: 'v1', to_id: 'v3', weight: 0.4},
  {id: 'e10', from_id: 'v4', to_id: 'v5', weight: Double(1.1)},
  {id: 'e11', from_id: 'v4', to_id: 'v3', weight: 0.4},
  {id: 'e12', from_id: 'v6', to_id: 'v3', weight: 0.2}
])
db.createCollection("knows", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "id", "from_id", "to_id", "weight" ],
         properties: {
            id: { bsonType: "string" },
            from_id: { bsonType: "string" },
            to_id: { bsonType: "string" },
            weight: { bsonType: "double" }
         }
      }
   }
})
db.knows.insertMany([
  {id: 'e7', from_id: 'v1', to_id: 'v2', weight: 0.5},
  {id: 'e8', from_id: 'v1', to_id: 'v4', weight: Double(1.1)}
])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The data for this demo comes from the “modern” graph defined by Apache TinkerPop.&lt;/p&gt;
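
&lt;p&gt;As a sanity check on what these collections encode, here is a small pure-Python walk over the same edges, using the exact ids, names, and weights inserted above:&lt;/p&gt;

```python
# The vertices and edges exactly as inserted into MongoDB above.
names = {"v1": "marko", "v2": "vadas", "v3": "lop",
         "v4": "josh", "v5": "ripple", "v6": "peter"}
knows = [("v1", "v2", 0.5), ("v1", "v4", 1.1)]
created = [("v1", "v3", 0.4), ("v4", "v5", 1.1),
           ("v4", "v3", 0.4), ("v6", "v3", 0.2)]

# Who does marko know?  (like g.V('v1').out('knows').values('name'))
marko_knows = [names[dst] for src, dst, _ in knows if src == "v1"]
print(marko_knows)  # ['vadas', 'josh']

# Who created lop?  (like traversing 'created' edges into v3)
lop_creators = sorted(names[src] for src, dst, _ in created if dst == "v3")
print(lop_creators)  # ['josh', 'marko', 'peter']
```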

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftukf3o1aaoy1y8pln7d0.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftukf3o1aaoy1y8pln7d0.webp" alt="Figure: the " width="800" height="569"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployment
&lt;/h3&gt;

&lt;p&gt;Run the following command to start the PuppyGraph container. The PUPPYGRAPH_PASSWORD environment variable sets the password for the default user puppygraph to puppygraph123; change it to a password of your choice. The --rm flag ensures that the container is removed after it stops.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -p 8081:8081 -p 8182:8182 -p 7687:7687 -e PUPPYGRAPH_PASSWORD=puppygraph123 -d --name puppy --rm --pull=always puppygraph/puppygraph:stable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Modeling the Graph
&lt;/h3&gt;

&lt;p&gt;Log into the PuppyGraph Web UI at &lt;a href="http://localhost:8081" rel="noopener noreferrer"&gt;http://localhost:8081&lt;/a&gt; with the following credentials:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Username: puppygraph&lt;/li&gt;
&lt;li&gt;Password: puppygraph123&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are two methods to model the graph:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use the Graph Schema Builder to create the schema manually.&lt;/li&gt;
&lt;li&gt;Upload the schema JSON file. We have prepared a template; you only need to fill in a few connection fields. You can upload the schema in two ways:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;In Web UI, select the file schema.json under Upload Graph Schema JSON, then click on Upload.&lt;/li&gt;
&lt;li&gt;Run the following command in the terminal:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -XPOST -H "content-type: application/json" --data-binary @./schema.json --user "puppygraph:puppygraph123" localhost:8081/schema
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "catalogs": [
    {
      "name": "mongodb_data",
      "type": "mongodb",
      "jdbc": {
        "username": "[username]",
        "password": "[password]",
        "jdbcUri": "[jdbcUri]",
        "driverClass": "com.mongodb.jdbc.MongoDriver"
      }
    }
  ],
  "graph": {
    "vertices": [
      {
        "label": "person",
        "oneToOne": {
          "tableSource": {
            "catalog": "mongodb_data",
            "schema": "modern",
            "table": "person"
          },
          "id": {
            "fields": [
              {
                "type": "String",
                "field": "id",
                "alias": "id"
              }
            ]
          },
          "attributes": [
            {
              "type": "Long",
              "field": "age",
              "alias": "age"
            },
            {
              "type": "String",
              "field": "name",
              "alias": "name"
            }
          ]
        }
      },
      {
        "label": "software",
        "oneToOne": {
          "tableSource": {
            "catalog": "mongodb_data",
            "schema": "modern",
            "table": "software"
          },
          "id": {
            "fields": [
              {
                "type": "String",
                "field": "id",
                "alias": "id"
              }
            ]
          },
          "attributes": [
            {
              "type": "String",
              "field": "lang",
              "alias": "lang"
            },
            {
              "type": "String",
              "field": "name",
              "alias": "name"
            }
          ]
        }
      }
    ],
    "edges": [
      {
        "label": "knows",
        "fromVertex": "person",
        "toVertex": "person",
        "tableSource": {
          "catalog": "mongodb_data",
          "schema": "modern",
          "table": "knows"
        },
        "id": {
          "fields": [
            {
              "type": "String",
              "field": "id",
              "alias": "id"
            }
          ]
        },
        "fromId": {
          "fields": [
            {
              "type": "String",
              "field": "from_id",
              "alias": "from_id"
            }
          ]
        },
        "toId": {
          "fields": [
            {
              "type": "String",
              "field": "to_id",
              "alias": "to_id"
            }
          ]
        },
        "attributes": [
          {
            "type": "Double",
            "field": "weight",
            "alias": "weight"
          }
        ]
      },
      {
        "label": "created",
        "fromVertex": "person",
        "toVertex": "software",
        "tableSource": {
          "catalog": "mongodb_data",
          "schema": "modern",
          "table": "created"
        },
        "id": {
          "fields": [
            {
              "type": "String",
              "field": "id",
              "alias": "id"
            }
          ]
        },
        "fromId": {
          "fields": [
            {
              "type": "String",
              "field": "from_id",
              "alias": "from_id"
            }
          ]
        },
        "toId": {
          "fields": [
            {
              "type": "String",
              "field": "to_id",
              "alias": "to_id"
            }
          ]
        },
        "attributes": [
          {
            "type": "Double",
            "field": "weight",
            "alias": "weight"
          }
        ]
      }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When using the graph schema builder or the schema.json file, you need to fill in either the JDBC Connection String or the jdbcUri (they are the same thing). The &lt;a href="https://www.mongodb.com/docs/atlas/data-federation/query/sql/drivers/jdbc/connect/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;JDBC Connection String&lt;/a&gt; is used to connect to the MongoDB Atlas database. To find it, follow these &lt;a href="https://www.mongodb.com/docs/atlas/data-federation/query/sql/drivers/jdbc/connect/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim#connect-to-your-federated-database-instance" rel="noopener noreferrer"&gt;instructions&lt;/a&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In the MongoDB Atlas UI, go to the Data Federation page and click Connect for the federated database instance that you want to connect to.&lt;/li&gt;
&lt;li&gt;Under Access your data through tools, select Atlas SQL.&lt;/li&gt;
&lt;li&gt;Under Select your driver, select JDBC Driver from the dropdown.&lt;/li&gt;
&lt;li&gt;Under Get Connection String, select the database that you want to connect to and copy the connection string. In this demo, the database is modern.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1g1my6bzejoidnz45h64.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1g1my6bzejoidnz45h64.webp" alt="Figure: Get the JDBC Connection String." width="797" height="984"&gt;&lt;/a&gt;&lt;br&gt;
You also need to fill in the username and password fields according to your settings. Once complete, you will see the schema graph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3qv33hs6vwvakmizt7gk.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3qv33hs6vwvakmizt7gk.webp" alt="Figure: Schema graph in PuppyGraph UI" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Querying the Graph
&lt;/h3&gt;

&lt;p&gt;Go to Dashboard in the Web UI to see a dashboard like the one pictured below. Each tab represents a query; click a tab to view its details. To add a new tab, click the plus (+) symbol at the bottom right corner.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehtoov7ts89irdbbofup.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehtoov7ts89irdbbofup.webp" alt="Figure: PuppyGraph Dashboard" width="800" height="531"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Navigate to Query in the Web UI, where you can use Graph Query to run Gremlin or openCypher queries with visualization.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gih9fbg19ikiia8273b.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gih9fbg19ikiia8273b.webp" alt="Figure: Query the graph using Gremlin/openCypher." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are some example queries:&lt;/p&gt;

&lt;p&gt;Retrieve a vertex named “marko”.&lt;/p&gt;

&lt;p&gt;Gremlin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g.V().has("name", "marko").valueMap()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;openCypher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH (v {name: 'marko'}) RETURN v
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Retrieve the paths from “marko” to the software created by those whom “marko” knows.&lt;/p&gt;

&lt;p&gt;Gremlin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g.V().has("name", "marko")
.out("knows").out("created").path()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;openCypher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH p=(v {name: 'marko'})-[:knows]-&amp;gt;()-[:created]-&amp;gt;()
RETURN p
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;MongoDB’s flexible document model and robust query engine make it a strong foundation for modern applications, whether you’re powering transactional systems or real-time analytics. For use cases where understanding relationships between entities is key, adding graph capabilities can unlock a new class of insights.&lt;/p&gt;

&lt;p&gt;With PuppyGraph, teams can introduce real-time graph analytics into their MongoDB Atlas environment without modifying schemas, exporting data, or managing additional infrastructure. By connecting through the MongoDB Atlas SQL Interface, PuppyGraph lets you define graph models directly over your collections and query them using Gremlin or openCypher — while the data stays exactly where it is.&lt;/p&gt;

&lt;p&gt;If you’re working with connected data and want to explore graph queries on &lt;a href="https://www.mongodb.com/cloud/atlas/register?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=puppygraph&amp;amp;utm_term=tony.kim" rel="noopener noreferrer"&gt;MongoDB Atlas&lt;/a&gt;, try PuppyGraph’s &lt;a href="https://hubs.ly/Q03jpZGd0" rel="noopener noreferrer"&gt;free Developer Edition&lt;/a&gt; and experience what’s possible — no ETL required.&lt;/p&gt;

</description>
      <category>graph</category>
      <category>graphdatabase</category>
      <category>graphqueryengine</category>
      <category>mongodb</category>
    </item>
  </channel>
</rss>
