<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Harshil Siyani</title>
    <description>The latest articles on DEV Community by Harshil Siyani (@harshil_siyani).</description>
    <link>https://dev.to/harshil_siyani</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3039736%2Fa4db2f1f-6f2a-49e2-91fb-c14e68d2cbc5.png</url>
      <title>DEV Community: Harshil Siyani</title>
      <link>https://dev.to/harshil_siyani</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/harshil_siyani"/>
    <language>en</language>
    <item>
      <title>Same Prompt. Same Model. Different Output. Every Time.</title>
      <dc:creator>Harshil Siyani</dc:creator>
      <pubDate>Tue, 15 Apr 2025 12:14:16 +0000</pubDate>
      <link>https://dev.to/harshil_siyani/same-prompt-same-model-different-output-every-time-3a3a</link>
      <guid>https://dev.to/harshil_siyani/same-prompt-same-model-different-output-every-time-3a3a</guid>
      <description>&lt;p&gt;AI is changing how we build software.&lt;/p&gt;

&lt;p&gt;But as builders, we’re quietly ignoring a major flaw:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We don’t test our prompts.
&lt;/li&gt;
&lt;li&gt;We just deploy them… and hope.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;The Moment It Broke&lt;/h3&gt;

&lt;p&gt;I was working on an AI feature that relied on a simple prompt to generate short summaries.&lt;/p&gt;

&lt;p&gt;Same model. Same prompt. Temperature 0.1.&lt;/p&gt;

&lt;p&gt;Ran it ten times and got five different outputs. Some subtle. Some wildly off.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcfqk9q5gxuurw6cz621n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcfqk9q5gxuurw6cz621n.png" alt="Why multiple tests are important" width="800" height="374"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It hit me: If your API response is unpredictable, your product is unreliable.&lt;/p&gt;

&lt;h3&gt;Why This Is Getting Worse&lt;/h3&gt;

&lt;p&gt;AI models are &lt;strong&gt;non-deterministic&lt;/strong&gt;, which means slight differences are expected. But that doesn't mean they're acceptable in production.&lt;/p&gt;

&lt;p&gt;To make matters worse:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gemini 1.0 is already deprecated&lt;/strong&gt; &lt;/li&gt;
&lt;li&gt;GPT-4 could be next&lt;/li&gt;
&lt;li&gt;Each update subtly changes model behavior&lt;/li&gt;
&lt;li&gt;Prompts that worked yesterday might break tomorrow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4p6e2uhqpyu3q513n66.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4p6e2uhqpyu3q513n66.png" alt="Gemini 1.0 Pro deprecation date" width="800" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're building AI-first apps, you're in a loop of:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Test → Fix prompt → Re-test → Cross fingers → Repeat&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s hours of work.&lt;br&gt;&lt;br&gt;
Every. Time. A. Model. Changes.&lt;/p&gt;




&lt;h3&gt;So I Built PromptPerf.dev&lt;/h3&gt;

&lt;p&gt;I needed a tool to help me &lt;em&gt;trust&lt;/em&gt; my AI outputs before shipping.&lt;/p&gt;

&lt;p&gt;PromptPerf.dev is a playground for prompt testing:&lt;/p&gt;

&lt;p&gt;✅ Test your prompt across multiple AI models&lt;br&gt;&lt;br&gt;
✅ Run it at different temperatures&lt;br&gt;&lt;br&gt;
✅ Compare outputs across multiple runs&lt;br&gt;&lt;br&gt;
✅ Track consistency + score against expected answers&lt;/p&gt;
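
&lt;p&gt;To make the idea of tracking consistency and scoring against an expected answer concrete, here is a minimal sketch. It is not PromptPerf's implementation: the function name &lt;code&gt;consistency_report&lt;/code&gt;, the whitespace and case normalisation, and the sample outputs are all assumptions made for illustration.&lt;/p&gt;

```python
from collections import Counter


def consistency_report(outputs, expected=None):
    """Summarise how stable a list of model outputs is across runs.

    Hypothetical helper for illustration only; exact-match comparison
    after lowercasing and collapsing whitespace is an assumption.
    """
    if not outputs:
        raise ValueError("need at least one output")
    normalised = [" ".join(o.lower().split()) for o in outputs]
    counts = Counter(normalised)
    # Share of runs that produced the single most common output.
    _, top_count = counts.most_common(1)[0]
    report = {
        "runs": len(outputs),
        "distinct_outputs": len(counts),
        "consistency": top_count / len(outputs),
    }
    if expected is not None:
        target = " ".join(expected.lower().split())
        # Share of runs that exactly matched the expected answer.
        report["accuracy"] = counts[target] / len(outputs)
    return report


# Made-up outputs standing in for five runs of the same prompt.
sample_outputs = ["Paris", "paris", "Paris.", "Paris", "Lyon"]
print(consistency_report(sample_outputs, expected="Paris"))
```

&lt;p&gt;Exact matching is deliberately strict here, so "Paris." counts as a different output from "Paris". A real scorer would probably need a looser similarity measure for free-form text such as summaries.&lt;/p&gt;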

&lt;p&gt;Here’s a sneak peek:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51me1w4d82jxpy869bc2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51me1w4d82jxpy869bc2.png" alt="Potential look of the new app" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;Where We're Headed&lt;/h3&gt;

&lt;p&gt;Right now, I’m building this in public. It’s early — but focused.&lt;/p&gt;




&lt;h3&gt;Why This Matters&lt;/h3&gt;

&lt;p&gt;If you're building with LLMs, you know the feeling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The "it worked locally" moment — but with GPT
&lt;/li&gt;
&lt;li&gt;A broken chain in LangChain or a RAG pipeline that fails silently
&lt;/li&gt;
&lt;li&gt;Users noticing weird output before you do&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PromptPerf doesn’t replace model tuning.&lt;br&gt;&lt;br&gt;
It makes &lt;strong&gt;prompt reliability visible&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;💬 I’d Love to Hear From You&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Have you run into inconsistency issues?
&lt;/li&gt;
&lt;li&gt;What’s your current prompt testing workflow (if any)?
&lt;/li&gt;
&lt;li&gt;Should prompt testing be part of CI/CD?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If this resonated, &lt;a href="https://promptperf.dev" rel="noopener noreferrer"&gt;&lt;strong&gt;join the waitlist&lt;/strong&gt;&lt;/a&gt; or just drop your thoughts below — I'd genuinely love feedback as we build.&lt;/p&gt;




&lt;p&gt;🧪 &lt;em&gt;PromptPerf.dev&lt;/em&gt; — build AI products you can trust.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>promptengineering</category>
      <category>buildinpublic</category>
      <category>devdiscuss</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Harshil Siyani</dc:creator>
      <pubDate>Mon, 14 Apr 2025 10:35:41 +0000</pubDate>
      <link>https://dev.to/harshil_siyani/-3ln8</link>
      <guid>https://dev.to/harshil_siyani/-3ln8</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/taradepan/5-ai-tools-to-build-your-first-mvp-in-days-not-months-38aj" class="crayons-story__hidden-navigation-link"&gt;5 AI Tools to Build Your First MVP in Days, Not Months🚀🚀🚀&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/taradepan" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F900098%2F17804f3d-a926-480f-838f-e37d4bd0964d.jpg" alt="taradepan profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/taradepan" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Taradepan R
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Taradepan R
                
              
              &lt;div id="story-author-preview-content-2398002" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/taradepan" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F900098%2F17804f3d-a926-480f-838f-e37d4bd0964d.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Taradepan R&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/taradepan/5-ai-tools-to-build-your-first-mvp-in-days-not-months-38aj" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Apr 11 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/taradepan/5-ai-tools-to-build-your-first-mvp-in-days-not-months-38aj" id="article-link-2398002"&gt;
          5 AI Tools to Build Your First MVP in Days, Not Months🚀🚀🚀
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/startup"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;startup&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/programming"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;programming&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/productivity"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;productivity&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/taradepan/5-ai-tools-to-build-your-first-mvp-in-days-not-months-38aj" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/fire-f60e7a582391810302117f987b22a8ef04a2fe0df7e3258a5f49332df1cec71e.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;216&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/taradepan/5-ai-tools-to-build-your-first-mvp-in-days-not-months-38aj#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              18&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            7 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>ai</category>
      <category>startup</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Day 3: 10 signups.</title>
      <dc:creator>Harshil Siyani</dc:creator>
      <pubDate>Mon, 14 Apr 2025 10:32:07 +0000</pubDate>
      <link>https://dev.to/harshil_siyani/day3-10-signups-4944</link>
      <guid>https://dev.to/harshil_siyani/day3-10-signups-4944</guid>
      <description>&lt;p&gt;I have been sharing the problem I'm working to solve, AI prompt optimisation, on Indie Hackers, Product Hunt, X, and Discord/Slack groups.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2uc5yl6se89lkuker75.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2uc5yl6se89lkuker75.png" alt="Image description" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;First, what am I building?&lt;br&gt;
I'm building &lt;a href="https://promptperf.dev/" rel="noopener noreferrer"&gt;PromptPerf.dev&lt;/a&gt;, a platform that will automate prompt testing. Why is this important? As more and more AI models become available, it's hard to find out which model will give the best performance and the most consistent output. My tool will let users run automated tests against multiple models and configs and, most importantly, across multiple runs. How is this useful? Take this example: a user wants to test 4 AI models at temperatures 0.1, 0.4, 0.7 and 1.0 and has 3 prompts to test. To check consistency, each config needs to be run 10 times to make sure a result isn't a one-off hallucination. That's 4 models x 4 temperatures x 3 prompts x 10 runs = 480 API calls, plus manually entering the results into a database or Excel sheet to compare them.&lt;/p&gt;
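
&lt;p&gt;The arithmetic above can be sketched as a small test matrix. This is only an illustration under stated assumptions: the model and prompt names are hypothetical placeholders, and &lt;code&gt;build_test_matrix&lt;/code&gt; is a made-up helper, not part of PromptPerf.&lt;/p&gt;

```python
from itertools import product

# Hypothetical placeholders: the post does not name the models or prompts.
MODELS = ["model-a", "model-b", "model-c", "model-d"]
TEMPERATURES = [0.1, 0.4, 0.7, 1.0]
PROMPTS = ["prompt-1", "prompt-2", "prompt-3"]
RUNS_PER_CONFIG = 10


def build_test_matrix(models, temperatures, prompts, runs):
    """Return one dict per API call needed to cover every config `runs` times."""
    return [
        {"model": m, "temperature": t, "prompt": p, "run": r}
        for m, t, p in product(models, temperatures, prompts)
        for r in range(1, runs + 1)
    ]


calls = build_test_matrix(MODELS, TEMPERATURES, PROMPTS, RUNS_PER_CONFIG)
print(len(calls))  # 4 * 4 * 3 * 10 = 480
```

&lt;p&gt;Each entry in &lt;code&gt;calls&lt;/code&gt; describes a single request, so the list doubles as a checklist of what would otherwise be run and recorded by hand.&lt;/p&gt;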

&lt;p&gt;Why now? AI model providers are starting to deprecate models. For example, Gemini 1.0 is being deprecated, which means every app running on Gemini 1.0 now needs to test a new model, and simply swapping in the newer model doesn't work. So prompt testing will be required to find the next AI model that can easily be swapped in its place.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febkb06db1wz7vds48i4l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febkb06db1wz7vds48i4l.png" alt="Image description" width="800" height="662"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let me share which platform has helped me get the most traction:&lt;/p&gt;

&lt;p&gt;Day 1: I first shared this on Indie Hackers and X and got 2 signups and about 30 visitors.&lt;/p&gt;

&lt;p&gt;Day 2: I looked for online communities on Discord and Slack. Most of these communities had a showcase channel where I could promote the product, but others only had introduction channels, where I got warnings from moderators when I shared the product I'm building. I also replied to quite a lot of tweets on X. I created a list of potential AI influencers and started replying to their tweets, plugging PromptPerf.dev where I could, and I updated my bio so that it directs users from my profile to the product. This brought another 2 signups. I'm unsure where they came from, but website visitors were up to 70 at this point.&lt;/p&gt;

&lt;p&gt;Day 3: I focused entirely on X and replied to multiple accounts, not pushing the product but replying to anything AI-related, which I assume helped me gain trust with the AI audience. I also got quite a lot of follows, which could have directed some visitors to the product. This got me to 10 subscribers who had opted in.&lt;/p&gt;

&lt;p&gt;I had created tags on the signup form so users could select whether they're interested in giving feedback or getting early access. Here's the breakdown:&lt;br&gt;
5 early adopters, 4 early adopters who will also provide feedback, and 1 feedback only.&lt;/p&gt;

&lt;p&gt;Do you think the problem I'm working to solve is real?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>buildinpublic</category>
      <category>sass</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
