<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Volisoft</title>
    <description>The latest articles on DEV Community by Volisoft (@volisoft).</description>
    <link>https://dev.to/volisoft</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F9227%2F8dd3ad6a-4bb6-48ad-a7a0-6f300618cff8.jpg</url>
      <title>DEV Community: Volisoft</title>
      <link>https://dev.to/volisoft</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/volisoft"/>
    <language>en</language>
    <item>
      <title>DynamoDB design patterns considered harmful</title>
      <dc:creator>V</dc:creator>
      <pubDate>Tue, 18 Feb 2025 04:10:06 +0000</pubDate>
      <link>https://dev.to/volisoft/dynamodb-design-patterns-considered-harmful-bpe</link>
      <guid>https://dev.to/volisoft/dynamodb-design-patterns-considered-harmful-bpe</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://volisoft.org/blog-dynamodb-patterns-harmful.html" rel="noopener noreferrer"&gt;Volisoft&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Table of Contents&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Overview&lt;/li&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Deeper Dive: Quantification is Key&lt;/li&gt;
&lt;li&gt;Case Study: Online Team Game - Different Data, Different Designs&lt;/li&gt;
&lt;li&gt;The Twist: Evolving Data Changes Everything&lt;/li&gt;
&lt;li&gt;Revised Assumptions: Stats are Scarce&lt;/li&gt;
&lt;li&gt;Conclusion: Data-Driven Design is Key&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a id="orgc7576f8"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Overview&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Key Takeaway: DynamoDB design patterns are helpful illustrations, &lt;b&gt;not&lt;/b&gt; rigid rules.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Effective DynamoDB design requires quantitative analysis of your data and access patterns.&lt;br&gt;
Applying patterns without this analysis risks increased costs and degraded performance.&lt;br&gt;
This article presents an automated DynamoDB design approach to address these critical challenges.&lt;/p&gt;

&lt;p&gt;&lt;a id="org4c31352"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Official AWS documentation offers DynamoDB design patterns to guide users migrating from relational to NoSQL databases, specifically DynamoDB.&lt;br&gt;
However, these patterns are technique demonstrations, &lt;strong&gt;not&lt;/strong&gt; prescriptive solutions.&lt;br&gt;
Truly efficient and cost-effective DynamoDB design depends on a deep, quantifiable understanding of your data and anticipated access patterns.&lt;/p&gt;

&lt;p&gt;&lt;a id="orgdddba59"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Deeper Dive: Quantification is Key&lt;/h2&gt;

&lt;p&gt;“Understanding” here means quantification.&lt;br&gt;
For &lt;strong&gt;data&lt;/strong&gt;, this involves knowing the volume of each entity type and the distribution of key values.&lt;br&gt;
For &lt;strong&gt;access patterns&lt;/strong&gt;, it means determining data retrieval volumes and the frequency of each query.&lt;br&gt;
Ignoring these quantitative factors can lead to higher operational costs and reduced application performance.&lt;/p&gt;
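&lt;p&gt;To make "quantification" concrete, the distribution of a candidate key can be measured directly from a sample of records. Below is a minimal Python sketch; the sample records and their values are invented purely for illustration:&lt;/p&gt;

```python
from collections import Counter

# Hypothetical sample of Game records as (time, team) pairs
games = [
    ("2025-01-01", "red"), ("2025-01-01", "blue"),
    ("2025-01-01", "green"), ("2025-01-02", "red"),
]

# Items matched per candidate partition-key value:
# a key value that matches fewer items yields cheaper point queries.
per_time = Counter(t for t, _ in games)
per_team = Counter(team for _, team in games)

print(per_time["2025-01-01"], per_time["2025-01-02"])  # 3 1
print(per_team["red"], per_team["blue"])               # 2 1
```

&lt;p&gt;Counts like these, gathered over a realistic sample, are exactly the inputs a schema decision needs.&lt;/p&gt;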

&lt;p&gt;&lt;a id="orgfe00bd4"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Case Study: Online Team Game - Different Data, Different Designs&lt;/h2&gt;

&lt;p&gt;To illustrate the importance of data characteristics, let’s consider an online team game example.&lt;br&gt;
We’ll model two entity types: &lt;code&gt;Game&lt;/code&gt; and &lt;code&gt;Stats&lt;/code&gt;.&lt;br&gt;
Let’s define their attributes and expected volumes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;span&gt;Table 1:&lt;/span&gt; Game and Stats Entities: Data Volumes and Attributes

&lt;colgroup&gt;
&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Entity&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;th&gt;time+team/name[id]&lt;/th&gt;
&lt;th&gt;time&lt;/th&gt;
&lt;th&gt;team/name&lt;/th&gt;
&lt;th&gt;archived?&lt;/th&gt;
&lt;th&gt;game/data&lt;/th&gt;
&lt;th&gt;stats/data&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Game&lt;/td&gt;
&lt;td&gt;1000&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;500&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Stats&lt;/td&gt;
&lt;td&gt;1000&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In this online team game scenario, ’time’ represents the game timestamp.&lt;br&gt;
On average, each team generates 30 &lt;code&gt;Game&lt;/code&gt; records and 30 &lt;code&gt;Stats&lt;/code&gt; records.&lt;br&gt;
The fields ’time’ and ’team/name’ (represented as ’time+team/name[id]’) uniquely identify both &lt;code&gt;Game&lt;/code&gt; and &lt;code&gt;Stats&lt;/code&gt; entities.&lt;br&gt;
’archived?’, ’game/data’, and ’stats/data’ represent additional attributes associated with each entity type.&lt;/p&gt;

&lt;p&gt;The application needs to support the following queries.&lt;br&gt;
Understanding the frequency and expected return size of each query is crucial for optimal schema design:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;span&gt;Table 2:&lt;/span&gt; Application Query Profile: Frequency and Expected Return Sizes

&lt;colgroup&gt;
&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query Name&lt;/th&gt;
&lt;th&gt;Entity&lt;/th&gt;
&lt;th&gt;Partition Key&lt;/th&gt;
&lt;th&gt;Sort Key&lt;/th&gt;
&lt;th&gt;Frequency&lt;/th&gt;
&lt;th&gt;Return Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;time-&amp;gt;games&lt;/td&gt;
&lt;td&gt;Game&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;team+time-&amp;gt;games&lt;/td&gt;
&lt;td&gt;Game&lt;/td&gt;
&lt;td&gt;team/name&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;time+archived?-&amp;gt;game&lt;/td&gt;
&lt;td&gt;Game&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt;archived?&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;time-&amp;gt;stats&lt;/td&gt;
&lt;td&gt;Stats&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;team+time-&amp;gt;stats&lt;/td&gt;
&lt;td&gt;Stats&lt;/td&gt;
&lt;td&gt;team/name&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Based on these data characteristics and query patterns, a read-optimized indexing schema &lt;strong&gt;could be&lt;/strong&gt; structured as follows:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;span&gt;Table 3:&lt;/span&gt; Read-Optimized Schema (Initial Data Assumptions)

&lt;colgroup&gt;
&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;:table-cnt&lt;/th&gt;
&lt;th&gt;:table&lt;/th&gt;
&lt;th&gt;:pk&lt;/th&gt;
&lt;th&gt;:sk&lt;/th&gt;
&lt;th&gt;:entity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt;team/name&lt;/td&gt;
&lt;td&gt;Game&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt;team/name&lt;/td&gt;
&lt;td&gt;Stats&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;team/name&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;Game&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;team/name&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;Stats&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;span&gt;Table 4:&lt;/span&gt; Query Costs (Initial Data Assumptions)

&lt;colgroup&gt;
&lt;col&gt;

&lt;col&gt;

&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;:query&lt;/th&gt;
&lt;th&gt;:query-tbl&lt;/th&gt;
&lt;th&gt;:query-cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;time-&amp;gt;games&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;time+archived?-&amp;gt;game&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;time-&amp;gt;stats&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;team+time-&amp;gt;games&lt;/td&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;team+time-&amp;gt;stats&lt;/td&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
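&lt;p&gt;One way to sanity-check such a schema is to weight each query's per-execution cost (items read, Table 4) by its frequency (Table 2). A minimal, illustrative Python sketch follows; the helper function is our own shorthand, not part of any DynamoDB API:&lt;/p&gt;

```python
# (query, frequency, items read per execution) -- values from Tables 2 and 4
profile = [
    ("time->games",          1, 15),
    ("time+archived?->game", 1,  5),
    ("time->stats",          1, 15),
    ("team+time->games",    20, 20),
    ("team+time->stats",     1,  1),
]

def total_read_load(profile):
    """Frequency-weighted item reads across the whole query mix."""
    return sum(freq * cost for _, freq, cost in profile)

print(total_read_load(profile))  # 436 -- dominated by team+time->games
```

&lt;p&gt;The dominant term immediately shows which access pattern the schema should be optimized for.&lt;/p&gt;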

&lt;p&gt;&lt;a id="orga3943ec"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;The Twist: Evolving Data Changes Everything&lt;/h2&gt;

&lt;p&gt;Software applications evolve, and so does their data.&lt;br&gt;
Initial assumptions about data distribution may become outdated as requirements change.&lt;br&gt;
Optimization priorities can also shift, perhaps focusing on outlier cases rather than typical scenarios.&lt;br&gt;
Let’s examine how revised data assumptions impact database design.&lt;/p&gt;

&lt;p&gt;&lt;a id="org9fba138"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Revised Assumptions: Stats are Scarce&lt;/h2&gt;

&lt;p&gt;Previously, we assumed &lt;strong&gt;an average&lt;/strong&gt; of 30 &lt;code&gt;Stats&lt;/code&gt; records &lt;strong&gt;per team&lt;/strong&gt;.&lt;br&gt;
Now, let’s assume we still have 30 &lt;code&gt;Game&lt;/code&gt; records per team, but dramatically &lt;strong&gt;reduce&lt;/strong&gt; the &lt;code&gt;Stats&lt;/code&gt; records to just 3 per team.&lt;br&gt;
This seemingly small change has significant design implications.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;span&gt;Table 5:&lt;/span&gt; Read-Optimized Schema (Revised Data Assumptions)

&lt;colgroup&gt;
&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;

&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;:table-cnt&lt;/th&gt;
&lt;th&gt;:table&lt;/th&gt;
&lt;th&gt;:pk&lt;/th&gt;
&lt;th&gt;:sk&lt;/th&gt;
&lt;th&gt;:entity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt;team/name&lt;/td&gt;
&lt;td&gt;Game&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;&lt;b&gt;team/name&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;&lt;b&gt;time&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Stats&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;team/name&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;Game&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;Stats&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;span&gt;Table 6:&lt;/span&gt; Queries (Revised Data Assumptions)

&lt;colgroup&gt;
&lt;col&gt;

&lt;col&gt;

&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;:query&lt;/th&gt;
&lt;th&gt;:query-tbl&lt;/th&gt;
&lt;th&gt;:query-cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;time-&amp;gt;games&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;time+archived?-&amp;gt;game&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;&lt;b&gt;team+time-&amp;gt;stats&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;team+time-&amp;gt;games&lt;/td&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;&lt;b&gt;time-&amp;gt;stats&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This shift in &lt;code&gt;Stats&lt;/code&gt; record volume suggests a revised indexing strategy.&lt;br&gt;
With only a few &lt;code&gt;Stats&lt;/code&gt; records per team, indexing them with ’team/name’ as the partition key becomes more efficient: each key value now matches far fewer items.&lt;br&gt;
A partition key that matches fewer items per value keeps individual queries cheap and helps DynamoDB distribute items evenly across partitions.&lt;br&gt;
Consequently, the query mapping adapts: retrieving &lt;code&gt;Stats&lt;/code&gt; records by ’team/name’ [PK] and ’time’ [SK] for individual items can now be efficiently executed on the MAIN table.&lt;br&gt;
Conversely, retrieving &lt;code&gt;Stats&lt;/code&gt; records by ’time’ is now better served by querying the GSI1 index.&lt;/p&gt;

&lt;p&gt;&lt;a id="org8a25363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Conclusion: Data-Driven Design is Key&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Key Takeaway: Data-driven design is critical.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Different data characteristics suggest different design choices.&lt;br&gt;
Blindly applying patterns can be costly and inefficient.&lt;br&gt;
Embracing a data-centric approach, especially with &lt;a href="https://volisoft.org/ddb.html" rel="noopener noreferrer"&gt;automated analysis&lt;/a&gt;, leads to efficient, cost-effective, and performant DynamoDB database designs.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>aws</category>
      <category>productivity</category>
      <category>database</category>
    </item>
    <item>
      <title>Start from the Middle: Making Programming Easier (Part 1)</title>
      <dc:creator>V</dc:creator>
      <pubDate>Fri, 26 Jul 2024 00:00:00 +0000</pubDate>
      <link>https://dev.to/volisoft/start-from-the-middle-making-programming-easier-part-1-25kj</link>
      <guid>https://dev.to/volisoft/start-from-the-middle-making-programming-easier-part-1-25kj</guid>
      <description>&lt;h2&gt;Start from the Middle: Making Programming Easier (Part 1)&lt;/h2&gt;


&lt;p&gt;This post is a continuation of “Start from the Middle: How to Solve a Problem.”&lt;/p&gt;
&lt;p&gt;Programming is challenging for many reasons, including the multitude of decisions a programmer must make during development. Often, when faced with a problem, it’s not even clear where to start. My advice is to start from the middle.&lt;/p&gt;



&lt;h3&gt;Problem-solving process&lt;/h3&gt;

&lt;p&gt;The previous post described a ‘philosophical’ view on problem-solving. The idea is that starting from the middle of a problem is advantageous because it provides a retrospective view — looking backward as if the problem, or part of it, has already been solved. This allows us to make a better decision about where to start. It also makes it easier to think about the next steps, as we can abstract from the details of how we got to the middle and focus on the rest of the solution.&lt;/p&gt;
&lt;p&gt;While this approach is universal and can be applied to numerous problems, how can we apply it to programming? Here’s the outline of the proposed method.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Look at the state in the middle of computation&lt;/li&gt;
&lt;li&gt;Determine variables and their initial state&lt;/li&gt;
&lt;li&gt;Determine the final condition&lt;/li&gt;
&lt;li&gt;Code the rest&lt;/li&gt;
&lt;/ul&gt;





&lt;h3&gt;The Middle Part&lt;/h3&gt;

&lt;p&gt;Starting from the middle means looking at the problem as if some part of it has already been solved, visualizing &lt;b&gt;what has been done&lt;/b&gt; so far and &lt;b&gt;what remains to be done&lt;/b&gt;. What constitutes the middle of a computation depends on the program. To make things more concrete, let’s assume our program has a loop, and the loop is the central part of the algorithm. In this case, it is helpful to look at the state of the program at the beginning of the loop iteration.&lt;/p&gt;
&lt;p&gt;What has been computed so far? What variables are needed to represent that work? What does the value of each variable represent? Is this information sufficient to complete the computation and obtain the result?&lt;/p&gt;
&lt;p&gt;These are some questions we may need to ask ourselves.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Note:&lt;/b&gt; not all programs have loops, but the most useful and interesting ones use iteration/recursion.&lt;/p&gt;





&lt;h3&gt;Trivial Example&lt;/h3&gt;

&lt;p&gt;For the rest of this article, we’ll assume the reader has basic programming knowledge and some familiarity with the Python programming language. Python has a simple syntax, and hopefully, readers can easily translate the examples here to their preferred language. &lt;/p&gt; 
&lt;p&gt;As the first example, we will do a list summation problem: writing a program that computes the sum of all the numbers in a list &lt;code&gt;L&lt;/code&gt;. The result should be stored in the variable &lt;code&gt;s&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;s := ∑L&lt;/pre&gt;
&lt;p&gt;It’s clear that we need to look at each number in the list to find the sum, so we need a loop. Let’s imagine the state of the program in the middle of its execution. &lt;/p&gt;
&lt;p&gt;&lt;b&gt;What has been done so far? &lt;/b&gt;It is reasonable to think that we have a partial sum up to some index &lt;code&gt;i&lt;/code&gt; of the list &lt;code&gt;L&lt;/code&gt;. Therefore, we need the variable &lt;code&gt;i&lt;/code&gt; to represent this.&lt;/p&gt;
&lt;p&gt;Now, what exactly does the sum &lt;code&gt;s&lt;/code&gt; represent at this point? Should the sum include the value at &lt;code&gt;L[i]&lt;/code&gt; or not? Since it doesn’t seem to be important, we arbitrarily choose to exclude &lt;code&gt;L[i]&lt;/code&gt; from the sum at the beginning of the iteration. With this interpretation, &lt;code&gt;i&lt;/code&gt; represents the &lt;b&gt;number of items&lt;/b&gt; processed so far. If it’s not clear why &lt;code&gt;i&lt;/code&gt; represents the number of list items processed, consider the case when &lt;code&gt;i=0&lt;/code&gt;. This interpretation of &lt;code&gt;s&lt;/code&gt; affects the value of &lt;code&gt;s&lt;/code&gt; and &lt;code&gt;i&lt;/code&gt; at the beginning of the program.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Initial state. &lt;/b&gt;We initialize &lt;code&gt;i&lt;/code&gt; with 0 because we haven’t processed any list items yet. Since there are no numbers before &lt;code&gt;i=0&lt;/code&gt;, and the sum of an empty segment is 0, we initialize &lt;code&gt;s&lt;/code&gt; with 0.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;When should the computation stop? &lt;/b&gt;We assumed that we’re going to iterate through the list in a loop. We should stop the iteration when all numbers are included in the sum &lt;code&gt;s&lt;/code&gt;. We remind ourselves again about the meaning of &lt;code&gt;i&lt;/code&gt; — the number of list items processed so far. That means the program should stop when the whole list has been processed. In other words, the loop should continue while &lt;code&gt;i&lt;/code&gt; remains less than the length of the list: &lt;code&gt;i &amp;lt; len(L)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Here’s the program so far:&lt;/p&gt;

&lt;pre&gt;
def sum_of(L):
   i = 0 # number of processed list items
   s = 0 # sum of the first i items of L

   while i &amp;lt; len(L):
     # invariant: 0 &amp;lt;= i &amp;lt; len(L)
     # s = ∑ 0&amp;lt;=n&amp;lt;i, L[n] -- sum of L items up to index i exclusive
     ...

   # i = len(L) -- after loop exits
   return s
&lt;/pre&gt;

&lt;p&gt;&lt;b&gt;Filling in the details. &lt;/b&gt;Let’s take a look at the body of the loop. To make progress, we need to increase &lt;code&gt;i&lt;/code&gt;. The minimal step is to increase &lt;code&gt;i&lt;/code&gt; by 1, so we add the &lt;code&gt;i&lt;/code&gt; increment to the loop. We also need to update &lt;code&gt;s&lt;/code&gt; &lt;b&gt;before&lt;/b&gt; the &lt;code&gt;i&lt;/code&gt; increment, not after. (Do you see why? Hint: remember what &lt;code&gt;i&lt;/code&gt; means). We update the program.&lt;/p&gt;
&lt;pre&gt;
def sum_of(L):
   i = 0 # number of processed list items
   s = 0 # sum of the first i items of L

   while i &amp;lt; len(L):
     # invariant: 0 &amp;lt;= i &amp;lt; len(L)
     # s = ∑ 0&amp;lt;=n&amp;lt;i, L[n] -- sum of L items up to index i exclusive

     s = s+L[i]
     i = i+1

   # i = len(L) -- after loop exits
   return s
&lt;/pre&gt;
&lt;p&gt;This completes the development.&lt;/p&gt;
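&lt;p&gt;For a quick check, here is a self-contained copy of the finished function with a few example calls. The loop condition is written as i != len(L), which is equivalent to the comparison used above because i starts at 0 and increases by exactly 1 per iteration:&lt;/p&gt;

```python
def sum_of(L):
    i = 0  # number of processed list items
    s = 0  # sum of the first i items of L
    while i != len(L):  # same stop point: i grows by 1 from 0
        s = s + L[i]
        i = i + 1
    return s

print(sum_of([]))         # 0
print(sum_of([1, 2, 3]))  # 6
print(sum_of([5]))        # 5
```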
&lt;p&gt;As an exercise, try to solve the list summation problem, but now interpret &lt;code&gt;s&lt;/code&gt; as the sum of the list items in the range &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;i&lt;/code&gt; inclusive. This will yield a different program.&lt;/p&gt;





&lt;h3&gt;Summary&lt;/h3&gt;

&lt;p&gt;Starting from the middle means looking at the problem as if some part of it has already been solved, visualizing &lt;b&gt;what has been done&lt;/b&gt; so far and &lt;b&gt;what remains to be done&lt;/b&gt;. Using a simple list summation problem, we highlighted the steps to iteratively build solutions using this method. Each step is justified, minimizing arbitrary decisions and promoting a systematic approach.&lt;/p&gt;
&lt;p&gt;A common approach to programming involves guessing and debugging until the program works as expected. However, programming becomes much easier when approached methodically.&lt;/p&gt;
&lt;p&gt;The method discussed here transforms arbitrary decision-making into a structured activity. This method is not limited to programming but can also be applied to any problem-solving task.&lt;/p&gt;





</description>
      <category>problemsolving</category>
      <category>leetcode</category>
      <category>programming</category>
    </item>
    <item>
      <title>Start from the Middle: How to Solve a Problem (Part 0)</title>
      <dc:creator>V</dc:creator>
      <pubDate>Fri, 26 Jul 2024 00:00:00 +0000</pubDate>
      <link>https://dev.to/volisoft/look-in-the-middle-how-to-solve-a-problem-part-0-9jn</link>
      <guid>https://dev.to/volisoft/look-in-the-middle-how-to-solve-a-problem-part-0-9jn</guid>
      <description>&lt;p&gt;You are faced with a task, a problem you need to solve. There are many ways to start, making it hard to choose. Once a choice is made, further down the road it turns out to be unfit, and you reset back to the start. Idea after idea, option after option, there’s a slow progress. After each unsuccessful attempt, you are back to square one.&lt;/p&gt;

&lt;p&gt;If only there were a way to know which of the many options would work. A way to fast-forward and retrospectively discern the decisions that led to success. Rather than starting at the beginning, this crossroad of decisions, why not project ourselves a bit further along the trajectory?&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Why not start from the middle?&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;Overcoming hard challenges yields invaluable experience. In retrospect, it becomes evident which decisions were good and which were not. This retrospective clarity is the perspective we need when confronting new problems.&lt;/p&gt;

&lt;p&gt;Let's term this approach “starting from the middle”. Envisioning from the midpoint means to visualize &lt;b&gt;what has been done&lt;/b&gt; so far and &lt;b&gt;what remains to be done&lt;/b&gt;. From this vantage point, it becomes feasible to identify the probable steps taken and actions that are likely to lead forward.&lt;/p&gt;

&lt;p&gt;To illustrate, imagine setting a goal to write a novel. The prospect is thrilling yet overwhelming. You are inundated with advice on plot structure, character development, narrative style, and countless other aspects of novel-writing. Where do you begin? Each attempt to outline or draft the first chapter seems inadequate, leading you back to the beginning, frustrated and uncertain.&lt;/p&gt;

&lt;p&gt;Now let’s take a midpoint view. Project yourself some time into the future. Imagine you’ve written half of your novel. What does your manuscript look like? What themes have emerged? How have your characters evolved? What plot twists have captivated your readers? Do you feel the narrative flow is engaging and coherent?&lt;/p&gt;

&lt;p&gt;If envisioning the mid-term result proves elusive, perhaps there’s too much information missing. It’s important to start not too far from the beginning, ensuring it's possible to connect the dots between the middle and the start. Equally vital is recognizing that not all problems have a solution. Identifying an insurmountable problem early conserves time, energy, and mental well-being.&lt;/p&gt;

&lt;h2&gt;Summary&lt;/h2&gt;

&lt;p&gt;The beginning of problem-solving overwhelms our minds with a myriad of options. Working backwards from the goal can occasionally be effective, but often the disconnect is too vast to bridge the end result with the current state. Starting from the middle seems like an optimal strategy.&lt;/p&gt;



</description>
      <category>problemsolving</category>
    </item>
    <item>
      <title>Streamlining NoSQL Database Design with AI: A Case Study using Amazon DynamoDB</title>
      <dc:creator>V</dc:creator>
      <pubDate>Wed, 20 Mar 2024 00:00:00 +0000</pubDate>
      <link>https://dev.to/volisoft/dynamodb-design-automated-e-commerce-example-22pa</link>
      <guid>https://dev.to/volisoft/dynamodb-design-automated-e-commerce-example-22pa</guid>
      <description>&lt;p&gt;In this article, we explore a basic yet practical use case, and demonstrate how it can be modeled in Amazon DynamoDB with a single-table design approach. Leveraging &lt;a href="https://volisoft.org/ddb.html" rel="noopener noreferrer"&gt;NoSQL Architect&lt;/a&gt;, an AI-powered tool, we showcase the potential of automated database design.&lt;/p&gt;

&lt;h2&gt;E-commerce Application Example&lt;/h2&gt;

&lt;p&gt;Let's consider an e-commerce application for processing customer orders. Here's a breakdown of the entities and their attributes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Product: &lt;/strong&gt;(product_id, name, category, price, quantity)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Order: &lt;/strong&gt;(order_id, customer_id, order_date, status)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customer: &lt;/strong&gt;(customer_id, name, email, shipping_address)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt; The application needs to support these queries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query products by category&lt;/li&gt;
&lt;li&gt;Query orders based on customer ID and order status&lt;/li&gt;
&lt;li&gt;Query customer details along with all their associated orders&lt;/li&gt;
&lt;li&gt;Query the latest orders for a specific customer&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Automating Design with NoSQL Architect&lt;/h2&gt;

&lt;p&gt;With information about entities and queries, we can generate a database design using &lt;a href="https://volisoft.org/ddb.html" rel="noopener noreferrer"&gt;NoSQL Architect.&lt;/a&gt;

&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy3aknyjtfouknza75570.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy3aknyjtfouknza75570.png" width="800" height="206"&gt;&lt;/a&gt;

&lt;/p&gt;

&lt;p&gt;Here, the &lt;code&gt;Entities/Cardinalities&lt;/code&gt; table specifies entities, data fields and their &lt;em&gt;cardinalities&lt;/em&gt;. &lt;em&gt;Cardinality&lt;/em&gt; is the estimated number of unique entity items associated with a field. For example, in the &lt;code&gt;Product&lt;/code&gt; entity, the &lt;code&gt;p/id&lt;/code&gt; field has a cardinality of 1, because &lt;code&gt;p/id&lt;/code&gt; is a unique field: it is associated with exactly &lt;em&gt;one&lt;/em&gt; &lt;code&gt;Product&lt;/code&gt; record. The &lt;code&gt;Customer&lt;/code&gt; entity and the &lt;code&gt;p/id&lt;/code&gt; field have a cardinality of 0 because there is no association between them. Similarly, &lt;code&gt;Product&lt;/code&gt; and &lt;code&gt;p/name&lt;/code&gt; have a cardinality of 2 (max), because we estimate that at most 2 products can share the same name, based on our sample data. It's important to note that cardinality can be modeled as a maximum, average, minimum or any other relevant statistic. In our example we use &lt;em&gt;max&lt;/em&gt;.&lt;/p&gt;
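&lt;p&gt;Cardinality in this sense can be estimated straight from sample data: count how many records share each field value, then take the statistic of interest. A minimal Python sketch; the sample Product records below are hypothetical:&lt;/p&gt;

```python
from collections import Counter

# Hypothetical sample of Product records as (p_id, p_name) pairs
products = [("p1", "mug"), ("p2", "mug"), ("p3", "tee"), ("p4", "cap")]

def max_cardinality(values):
    """Largest number of records sharing one field value (the 'max' statistic)."""
    return max(Counter(values).values())

print(max_cardinality(pid for pid, _ in products))    # 1 -- p/id is unique
print(max_cardinality(name for _, name in products))  # 2 -- at most 2 share a name
```

&lt;p&gt;Swapping max for a mean or percentile over the same counts gives the other statistics mentioned above.&lt;/p&gt;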

&lt;p&gt;Every case is different, and these assumptions about data will likely not hold true for a different e-commerce system. Moreover, these assumptions may change with time, in which case the design decisions should be revised. This is where automation tools like &lt;a href="https://volisoft.org/ddb.html" rel="noopener noreferrer"&gt;NoSQL Architect&lt;/a&gt; are most useful. By simply updating the inputs, NoSQL Architect can automatically output an optimal schema for the database.&lt;/p&gt;

&lt;h2&gt;Optimized Design in Seconds&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://volisoft.org/ddb.html" rel="noopener noreferrer"&gt;NoSQL Architect&lt;/a&gt; delivers a cost-effective database design optimized for both read and storage efficiency, typically within seconds. Here's a sample solution:&lt;/p&gt;

&lt;pre&gt;
Read-optimized schema
| :table-cnt | :table |        :pk |       :sk |  :entity |
|------------+--------+------------+-----------+----------|
|    1101000 |   MAIN |       p/id |           |  Product |
|    1101000 |   MAIN |       o/id |           |    Order |
|    1101000 |   MAIN |       c/id |           | Customer |
|    1001000 |   GSI1 |       c/id |  o/status |    Order |
|    1001000 |   GSI1 |       c/id | c/address | Customer |
|     100000 |   GSI2 | p/category |           |  Product |


Queries
|              :query | :query-tbl | :query-cost |
|---------------------+------------+-------------|
|     Order by status |       GSI1 |           5 |
| All customer orders |       GSI1 |         100 |
|       Latest orders |       GSI1 |         100 |
|   Prod. by category |       GSI2 |       10000 |
&lt;/pre&gt;

&lt;h2&gt;Summary&lt;/h2&gt;


&lt;p&gt;&lt;a href="https://volisoft.org/ddb.html" rel="noopener noreferrer"&gt;NoSQL Architect&lt;/a&gt; offers a unique, free solution for generating optimized database schemas, setting a new standard in database design automation. This AI-powered tool lets you create efficient single-table designs that scale effortlessly with your application's growth. As software requirements evolve, the database schema must accommodate these changes. Redesigning is an extremely costly endeavor, and reducing development costs is the primary motivation behind &lt;a href="https://volisoft.org/ddb.html" rel="noopener noreferrer"&gt;NoSQL Architect&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>singletabledesign</category>
      <category>dynamodb</category>
      <category>datamodelling</category>
    </item>
    <item>
      <title>NoSQL Architect vs AWS expert</title>
      <dc:creator>V</dc:creator>
      <pubDate>Thu, 30 Nov 2023 00:00:00 +0000</pubDate>
      <link>https://dev.to/volisoft/nosql-architect-vs-aws-expert-28d6</link>
      <guid>https://dev.to/volisoft/nosql-architect-vs-aws-expert-28d6</guid>
      <description>&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Let's compare two AWS DynamoDB database schemas: one designed by an AWS expert and one generated by the automated &lt;a href="https://volisoft.org" rel="noopener noreferrer"&gt;NoSQL Architect&lt;/a&gt; tool.
The human expert’s schema incorporates best practices and years of experience in the field. Our tool, on the other hand, uses mathematical modeling to optimize the schema for the specific characteristics of the data.&lt;/p&gt;

&lt;p&gt;The primary focus of this comparison is to analyze cost savings, particularly in terms of read queries and storage costs.&lt;/p&gt;



&lt;h2&gt;Case study&lt;/h2&gt;

&lt;p&gt;For the experiment we take an example from the &lt;a href="https://aws.amazon.com/blogs/compute/creating-a-single-table-design-with-amazon-dynamodb/" rel="noopener noreferrer"&gt;AWS blog&lt;/a&gt;. The author discusses the concept of a single-table design: the idea is to store all application data in a single table. This may seem counterintuitive to those familiar with relational databases, yet Amazon uses this approach for its internal designs. It is also the approach our &lt;b&gt;NoSQL Architect&lt;/b&gt; tool uses to generate database schemas.&lt;/p&gt;

&lt;p&gt;To summarize, the blog post walks through converting a relational model into a single AWS DynamoDB table.&lt;/p&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--M__Mp5QA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://d2908q01vomqb2.cloudfront.net/1b6453892473a467d07372d45eb05abc2031647a/2021/07/21/single-table-1.png" class="article-body-image-wrapper"&gt;&lt;img alt="single-table-1.png" src="https://res.cloudinary.com/practicaldev/image/fetch/s--M__Mp5QA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://d2908q01vomqb2.cloudfront.net/1b6453892473a467d07372d45eb05abc2031647a/2021/07/21/single-table-1.png" width="800" height="202"&gt;&lt;/a&gt;Relational model of the Alleycat application from the AWS Blog.&lt;/p&gt;



&lt;h2&gt;Setup&lt;/h2&gt;

&lt;p&gt;We start by listing the access patterns from the article:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Get the results for each race by racer ID.&lt;/li&gt;
&lt;li&gt;Get a list of races by class ID.&lt;/li&gt;
&lt;li&gt;Get the best performance by racer for a class ID.&lt;/li&gt;
&lt;li&gt;Get the list of top scores by race ID.&lt;/li&gt;
&lt;li&gt;Get the second-by-second performance by racer for all races.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The database entities with their attributes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;classes: class-id, class-name&lt;/li&gt;
&lt;li&gt;races: race-id, class-id&lt;/li&gt;
&lt;li&gt;race-results: race-id, racer-id, second&lt;/li&gt;
&lt;li&gt;racers: racer-id, racer-name&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The last attribute appears later in the article but is not listed in the relational model; we include it here as part of the data description.&lt;/p&gt;

&lt;p&gt;The effectiveness of any database design depends on how the data is distributed and accessed. Key factors include query frequency, the average number of records returned per query, and the size of the data set. &lt;/p&gt;

&lt;p&gt;The AWS blog post does not provide details about data distribution or query behavior, so we made reasonable assumptions about query frequencies and record counts, summarized below. We assume a dataset of 1 million records.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;
&lt;col&gt;
&lt;col&gt;
&lt;col&gt;
&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;th&gt;Query #&lt;/th&gt;
&lt;th&gt;# of records returned&lt;/th&gt;
&lt;th&gt;Frequency&lt;/th&gt;
&lt;th&gt;PK&lt;/th&gt;
&lt;th&gt;SK&lt;/th&gt;
&lt;th&gt;Return&lt;/th&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;1000&lt;/td&gt;
&lt;td&gt;“racer-id”&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;“race-id”,“second”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;1000&lt;/td&gt;
&lt;td&gt;“class-id”&lt;/td&gt;
&lt;td&gt;“second”&lt;/td&gt;
&lt;td&gt;“race-id”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;20000&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;“class-id”&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;“racer-id”,“race-id”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;1000&lt;/td&gt;
&lt;td&gt;“race-id”&lt;/td&gt;
&lt;td&gt;“second”&lt;/td&gt;
&lt;td&gt;“racer-id”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1000&lt;/td&gt;
&lt;td&gt;“racer-id”&lt;/td&gt;
&lt;td&gt;“race-id”&lt;/td&gt;
&lt;td&gt;“second”&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;h2&gt;Cost criteria&lt;/h2&gt;

&lt;p&gt;To compare performance, we focused on the cost of reads and data storage. We used the following scoring method: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unique Record Queries: &lt;/strong&gt;For queries that return a single record (e.g., query #5), the schema must uniquely identify the record. If the schema allows for this, the query scores 1. Otherwise, the score reflects the actual number of records returned.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Projected Attributes: &lt;/strong&gt;For queries using indexes, we account for any missing attributes that require additional requests to the main table. Each extra request adds to the cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage: &lt;/strong&gt;Storage costs include both the main table and index storage. We assign one unit of cost for each record stored in the main table or index.&lt;/li&gt;
&lt;/ul&gt;
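
&lt;p&gt;The scoring method can be sketched in a few lines of Python (the function names are ours, for illustration only, not part of any tool):&lt;/p&gt;

```python
def query_cost(records_returned, unique_lookup=False, extra_fetches=0):
    """Cost of one read query under the scoring rules above.

    A query whose keys uniquely identify its record costs 1; otherwise
    it costs the number of records returned. Attributes missing from an
    index cost one extra request each against the main table.
    """
    base = 1 if unique_lookup else records_returned
    return base + extra_fetches

def storage_cost(record_counts):
    """One unit per record stored in the main table or any index."""
    return sum(record_counts)
```

For example, a unique lookup whose index is missing one projected attribute costs 2: one read plus one follow-up fetch from the main table.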

&lt;h2&gt;The AWS blogpost schema&lt;/h2&gt;

&lt;p&gt;Below are the cost estimates for the schema suggested in the AWS blogpost.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;th&gt;Data attribute&lt;/th&gt;
&lt;th&gt;Schema&lt;/th&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;:race-id&lt;/td&gt;
&lt;td&gt;(“gsi1_:pk” “gsi1_:f” “main1_:f” “lsi1_:sk”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;:class-id&lt;/td&gt;
&lt;td&gt;(“gsi1_:pk”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;:racer-id&lt;/td&gt;
&lt;td&gt;(“gsi1_:f” “main1_:pk” “main1_:sk”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;:racer-name&lt;/td&gt;
&lt;td&gt;(“main1_:f”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;:second&lt;/td&gt;
&lt;td&gt;(“main1_:f” “gsi1_:sk”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;:class-name&lt;/td&gt;
&lt;td&gt;(“gsi1_:f”)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Column prefixes indicate the table: &lt;code&gt;main1&lt;/code&gt; refers to the main table, &lt;code&gt;gsi1&lt;/code&gt; to the GSI1 index, and so on. The suffix denotes the column role: &lt;code&gt;pk&lt;/code&gt; (partition key), &lt;code&gt;sk&lt;/code&gt; (sort key) or &lt;code&gt;f&lt;/code&gt; (unindexed field).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;query::race-id,second-&amp;gt;(“racer-id”)&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;query::racer-id,race-id-&amp;gt;(“second”)&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;query::class-id,second-&amp;gt;(“race-id”)&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;query::racer-id-&amp;gt;(“race-id” “second”)&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;query::class-id-&amp;gt;(“racer-id” “race-id”)&lt;/td&gt;
&lt;td&gt;20000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;storage&lt;/td&gt;
&lt;td&gt;4000000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;total&lt;/td&gt;
&lt;td&gt;6044000&lt;/td&gt;
&lt;/tr&gt;&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Costs of individual queries are listed in the table above. The total accounts for the execution frequency of each query plus storage costs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;th&gt;Table&lt;/th&gt;
&lt;th&gt;Records #&lt;/th&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;2000000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LSI1&lt;/td&gt;
&lt;td&gt;1000000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;1000000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
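
&lt;p&gt;As a sanity check, the total can be reproduced from the tables above: multiply each query's cost by its assumed frequency from the Setup section and add one storage unit per stored record:&lt;/p&gt;

```python
# Per-query cost (Costs table) and execution frequency (Setup section).
query_cost_and_freq = [
    (7,     1000),   # race-id, second: return racer-id
    (2,     1000),   # racer-id, race-id: return second
    (30,    1000),   # class-id, second: return race-id
    (5,     1000),   # racer-id: return race-id, second
    (20000,  100),   # class-id: return racer-id, race-id
]
# One storage unit per record in the main table, GSI1 and LSI1.
storage = 1_000_000 + 2_000_000 + 1_000_000
total = sum(cost * freq for cost, freq in query_cost_and_freq) + storage
print(total)  # 6044000
```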

&lt;h2&gt;Optimized schema&lt;/h2&gt;

&lt;p&gt;From the cost breakdown, it is clear that the most expensive query is &lt;code&gt;class-id-&amp;gt;(“racer-id” “race-id”)&lt;/code&gt;. This is due to the small number of unique class-id values (50) and the large number of records returned. Based on these insights, &lt;b&gt;NoSQL Architect&lt;/b&gt; restructured the indexes and reduced costs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;th&gt;Data attribute&lt;/th&gt;
&lt;th&gt;Schema&lt;/th&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;:race-id&lt;/td&gt;
&lt;td&gt;(“gsi1_:f” “main1_:sk” “gsi2_:sk” “gsi2_:pk”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;:class-id&lt;/td&gt;
&lt;td&gt;(“gsi1_:pk”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;:racer-id&lt;/td&gt;
&lt;td&gt;(“gsi1_:f” “main1_:pk” “gsi2_:f”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;:second&lt;/td&gt;
&lt;td&gt;(“gsi1_:sk” “gsi2_:f” “gsi2_:sk”)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;query::race-id,second-&amp;gt;(“racer-id”)&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;query::racer-id,race-id-&amp;gt;(“second”)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;query::class-id,second-&amp;gt;(“race-id”)&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;query::racer-id-&amp;gt;(“race-id” “second”)&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;query::class-id-&amp;gt;(“racer-id” “race-id”)&lt;/td&gt;
&lt;td&gt;20000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;storage&lt;/td&gt;
&lt;td&gt;3000000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;total&lt;/td&gt;
&lt;td&gt;5043000&lt;/td&gt;
&lt;/tr&gt;&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Note that the optimized index structure also reduced storage from 4 million to 3 million records in total, a &lt;b&gt;25% reduction&lt;/b&gt; in storage costs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;
&lt;/colgroup&gt;
&lt;thead&gt;&lt;tr&gt;
&lt;th&gt;Table&lt;/th&gt;
&lt;th&gt;Records #&lt;/th&gt;
&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GSI1&lt;/td&gt;
&lt;td&gt;1000000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GSI2&lt;/td&gt;
&lt;td&gt;1000000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MAIN&lt;/td&gt;
&lt;td&gt;1000000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This optimized schema results in a &lt;b&gt;25% reduction in storage costs&lt;/b&gt; and an overall &lt;b&gt;16.5% reduction in total costs&lt;/b&gt; compared to the original schema!&lt;/p&gt;
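
&lt;p&gt;Both headline figures follow directly from the two totals, as a quick arithmetic check shows:&lt;/p&gt;

```python
expert_total = 6_044_000      # total cost of the AWS blogpost schema
optimized_total = 5_043_000   # total cost of the NoSQL Architect schema

storage_reduction = 1 - 3_000_000 / 4_000_000         # 0.25
total_reduction = 1 - optimized_total / expert_total  # roughly 0.166
```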

&lt;h2&gt;Summary&lt;/h2&gt;

&lt;p&gt;The optimized schema generated by &lt;b&gt;NoSQL Architect&lt;/b&gt; reduced costs by 16.5% compared to the schema created by an AWS expert. These savings were achieved by taking into account the unique characteristics of the data, such as query frequencies and result sizes.&lt;/p&gt;

&lt;p&gt;Beyond cost, &lt;b&gt;NoSQL Architect&lt;/b&gt; also offers significant time savings, generating the optimized schema in under a minute, whereas manual optimization and testing could take weeks or even months to achieve similar results.&lt;/p&gt;



</description>
      <category>dynamodb</category>
      <category>productivity</category>
      <category>aws</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
