<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Narimasa Sakurai</title>
    <description>The latest articles on DEV Community by Narimasa Sakurai (@wonyx).</description>
    <link>https://dev.to/wonyx</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1406331%2F7531c056-bd75-4f79-95f6-9e7ae3edd444.jpeg</url>
      <title>DEV Community: Narimasa Sakurai</title>
      <link>https://dev.to/wonyx</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/wonyx"/>
    <language>en</language>
    <item>
      <title>Cloudflare AI Challenge: Building a Feed Reader with Workers AI</title>
      <dc:creator>Narimasa Sakurai</dc:creator>
      <pubDate>Sun, 14 Apr 2024 12:51:59 +0000</pubDate>
      <link>https://dev.to/wonyx/cloudflare-ai-challenge-building-a-feed-reader-with-workers-ai-5bd7</link>
      <guid>https://dev.to/wonyx/cloudflare-ai-challenge-building-a-feed-reader-with-workers-ai-5bd7</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/devteam/join-us-for-the-cloudflare-ai-challenge-3000-in-prizes-5f99"&gt;Cloudflare AI Challenge&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Hello there! I'm Narimasa Sakurai (GitHub account: wonyx), a software engineer from Japan.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I built a simple feed reader that suggests related feed entries.&lt;br&gt;
The app uses the &lt;code&gt;@cf/baai/bge-large-en-v1.5&lt;/code&gt; model to generate a text embedding for each feed entry and then uses cosine similarity to suggest related entries.&lt;/p&gt;
&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;



&lt;p&gt;Cloudflare Pages Link&lt;br&gt;
&lt;a href="https://cloudfeed-app.pages.dev/"&gt;https://cloudfeed-app.pages.dev/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Demo video&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/4ZC6EXAnXvQ"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  My Code
&lt;/h2&gt;

&lt;p&gt;This is a link to the &lt;a href="https://github.com/wonyx/cloudfeed"&gt;repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Journey
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Process
&lt;/h3&gt;

&lt;p&gt;I saw an article about the Cloudflare AI Challenge and decided to participate.&lt;br&gt;
First, I tried to get a better understanding of Cloudflare by reading the blog posts published during Developer Week. I was a bit intimidated because I knew only a little about Cloudflare.&lt;br&gt;
Until then, I had only used Cloudflare Pages to deploy a static Next.js site.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What tasks can Workers AI handle?&lt;/li&gt;
&lt;li&gt;What kind of system configuration would be suitable?&lt;/li&gt;
&lt;li&gt;What is the difference between Pages and Workers?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I learned a lot about the differences between Pages and Workers.&lt;br&gt;
Next, I asked myself which tasks are suitable for AI to solve.&lt;br&gt;
I decided to create an RSS feed reader.&lt;br&gt;
Here's why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RSS provides an interface that makes it easy for machines to gather information.&lt;/li&gt;
&lt;li&gt;Since I use an RSS reader every day, I thought it would be nice if AI could suggest articles that match my interests.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then I started developing the app.&lt;/p&gt;

&lt;p&gt;First, I developed a simple pipeline that fetches feed entries and stores them in D1, using a Queue and Cron Triggers.&lt;br&gt;
Second, I evaluated which models are suitable for the task of suggesting feed entries.&lt;br&gt;
I tried text generation models such as &lt;code&gt;llama-2-7b-chat-fp16&lt;/code&gt;, &lt;code&gt;mistral-7b-instruct-v0.2&lt;/code&gt;, and &lt;code&gt;gemma-7b-it&lt;/code&gt;, but it was difficult to get them to suggest related feed entries.&lt;/p&gt;

&lt;p&gt;So I decided to use the &lt;code&gt;@cf/baai/bge-large-en-v1.5&lt;/code&gt; model to generate a text embedding for each feed entry and then use cosine similarity to suggest related entries.&lt;/p&gt;
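
&lt;p&gt;For readers unfamiliar with cosine similarity, here is a minimal sketch of the calculation in TypeScript. The function name &lt;code&gt;cosineSimilarity&lt;/code&gt; is an illustration only and is not taken from the repo; in the app itself the comparison is done through Vectorize rather than by hand.&lt;/p&gt;

```typescript
// Cosine similarity between two embedding vectors with the same number of dimensions.
// Returns a value between -1 and 1; returns 0 if either vector has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same number of dimensions");
  }
  // Dot product of the two vectors.
  const dot = a.reduce((sum, value, i) => sum + value * (b[i] ?? 0), 0);
  // Euclidean norms of each vector.
  const normA = Math.sqrt(a.reduce((sum, value) => sum + value * value, 0));
  const normB = Math.sqrt(b.reduce((sum, value) => sum + value * value, 0));
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (normA * normB);
}
```

&lt;p&gt;Vectors that point in the same direction score close to 1, and unrelated (orthogonal) vectors score close to 0, which is why a score threshold can be used to decide what counts as related.&lt;/p&gt;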

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;This is an overview of the part of the system that uses Workers AI and Vectorize.&lt;br&gt;
I don't think this is RAG, but I referred to &lt;a href="https://developers.cloudflare.com/reference-architecture/diagrams/ai/ai-rag/"&gt;the RAG Architecture&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Indexing Feed Entries
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz9c1b3z8gom5ihmq4we4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz9c1b3z8gom5ihmq4we4.jpg" alt="Indexing Feed Entries Diagram" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Cron Trigger runs a Worker that fetches the feeds periodically.&lt;/li&gt;
&lt;li&gt;It sends the feed entries to the Worker through the Queue.&lt;/li&gt;
&lt;li&gt;The Worker dequeues the feed entries from the Queue.&lt;/li&gt;
&lt;li&gt;It calculates feed entry vectors from the title and description using Workers AI.&lt;/li&gt;
&lt;li&gt;It stores the feed entry vectors in Vectorize.&lt;/li&gt;
&lt;li&gt;It stores the feed entries and their vectors in D1.&lt;/li&gt;
&lt;/ol&gt;
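
&lt;p&gt;Steps 4 and 5 above can be sketched roughly as follows. This is only an illustration under assumptions: the &lt;code&gt;FeedEntry&lt;/code&gt; shape, the &lt;code&gt;entryId&lt;/code&gt; metadata key, and the helper names are hypothetical and may differ from the repo. The sketch covers only the data shaping; in a Worker, the embeddings would come from Workers AI and the records would be passed to the Vectorize binding.&lt;/p&gt;

```typescript
// Hypothetical shape of a feed entry row; the real schema in the repo may differ.
type FeedEntry = { id: number; title: string; description: string };

// A vector record with id, values, and metadata fields.
// The metadata key "entryId" is an assumption made for this sketch.
type VectorRecord = {
  id: string;
  values: number[];
  metadata: { entryId: number };
};

// Step 4 input: the text passed to the embedding model for one entry,
// built from the title and description (empty parts are skipped).
function embeddingInput(entry: FeedEntry): string {
  return [entry.title, entry.description]
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join("\n");
}

// Step 5 input: pair each entry with its embedding, in the same order.
function toVectorRecords(entries: FeedEntry[], embeddings: number[][]): VectorRecord[] {
  if (entries.length !== embeddings.length) {
    throw new Error("expected one embedding per feed entry");
  }
  return entries.map((entry, i) => ({
    id: String(entry.id),
    values: embeddings[i] ?? [],
    metadata: { entryId: entry.id },
  }));
}
```

&lt;p&gt;Keeping the D1 id in the metadata is what lets the suggestion flow look the full entries up again later.&lt;/p&gt;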

&lt;h4&gt;
  
  
  Suggesting Related Entries
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff56wow75htigssv8rbph.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff56wow75htigssv8rbph.jpg" alt="Suggesting Related Entries Diagram" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The user reads a feed entry, and the browser sends a request to Pages to get related entries.&lt;/li&gt;
&lt;li&gt;Pages sends a request to the Worker to get related entries.&lt;/li&gt;
&lt;li&gt;The Worker gets the vector of the feed entry the user read from D1.&lt;/li&gt;
&lt;li&gt;The Worker queries Vectorize to get similar feed entries with metadata.&lt;/li&gt;
&lt;li&gt;The Worker filters the entries by similarity score, gets the feed entry IDs from the metadata, fetches the similar feed entries from D1 by ID, and returns them to Pages.&lt;/li&gt;
&lt;li&gt;Pages returns the related entries.&lt;/li&gt;
&lt;/ol&gt;
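
&lt;p&gt;The filtering in step 5 might look something like the sketch below. The &lt;code&gt;Match&lt;/code&gt; type models only the fields used here, and the &lt;code&gt;entryId&lt;/code&gt; metadata key and the function name are assumptions for illustration, not the repo's actual code. The default threshold is the 0.66 value mentioned later in this post.&lt;/p&gt;

```typescript
// Subset of a similarity match used here; "entryId" in metadata is an assumption.
type Match = { id: string; score: number; metadata?: { entryId?: number } };

// Threshold mentioned in the post; not claimed to be the best value.
const SIMILARITY_THRESHOLD = 0.66;

// Keep matches at or above the threshold, order them best match first,
// read the D1 ids from the metadata, and drop the entry being read.
function relatedEntryIds(
  matches: Match[],
  currentEntryId: number,
  threshold: number = SIMILARITY_THRESHOLD,
): number[] {
  return [...matches]
    .filter((match) => match.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .map((match) => match.metadata?.entryId)
    .filter((id): id is number => typeof id === "number")
    .filter((id) => id !== currentEntryId);
}
```

&lt;p&gt;The returned ids would then be used to fetch the full entries from D1. The entry being read is excluded because it is always the closest match to its own vector.&lt;/p&gt;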

&lt;h3&gt;
  
  
  What I Learned
&lt;/h3&gt;

&lt;p&gt;Here is what I learned from this project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;How to use the Cloudflare tech stack, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pages&lt;/li&gt;
&lt;li&gt;Workers&lt;/li&gt;
&lt;li&gt;Workers AI&lt;/li&gt;
&lt;li&gt;Vectorize&lt;/li&gt;
&lt;li&gt;D1&lt;/li&gt;
&lt;li&gt;Queue&lt;/li&gt;
&lt;li&gt;Cron Triggers&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;How well these technologies work on Cloudflare:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Next.js on Pages with the App Router and Server Actions works fine!&lt;/li&gt;
&lt;li&gt;Hono: I chose Hono as the backend framework instead of &lt;a href="https://blog.cloudflare.com/javascript-native-rpc"&gt;RPC&lt;/a&gt; for now, because RPC doesn't work with &lt;code&gt;nextjs on pages&lt;/code&gt; in my local environment. I think RPC is better for production because it protects the Worker's fetch handler from the Internet.&lt;/li&gt;
&lt;li&gt;Drizzle ORM: it mostly works fine, but I ran into &lt;a href="https://github.com/drizzle-team/drizzle-orm/issues/555"&gt;this issue&lt;/a&gt;, which confused me a little.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;Vector similarity is hard.&lt;/p&gt;

&lt;p&gt;Its behavior is harder to predict than that of a full-text search engine, and it is difficult to judge whether the similarity results are correct.&lt;br&gt;
However, it can find similarities that full-text search engines cannot. Full-text search engines can also calculate similarity based on tokens, but they take a different approach and would probably need synonyms and similar resources to be provided.&lt;br&gt;
It is also difficult to choose the similarity threshold. I set it to 0.66, but I'm not sure that is the best value.&lt;/p&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What I'm Proud Of
&lt;/h3&gt;

&lt;p&gt;English is not my first language, so I'm proud of submitting &lt;code&gt;MY FIRST POST&lt;/code&gt; in English.&lt;br&gt;
I'm also proud of taking part in the Cloudflare AI Challenge.&lt;/p&gt;

&lt;h3&gt;
  
  
  Next
&lt;/h3&gt;

&lt;p&gt;These are the next steps I plan to take:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use RPC between Pages and Workers.&lt;/li&gt;
&lt;li&gt;Try building a RAG architecture.&lt;/li&gt;
&lt;li&gt;Improve quality.

&lt;ul&gt;
&lt;li&gt;I have not yet implemented error handling, test code, or basic features such as adding feed URLs and pagination, and the app does not work with some feed XML.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Thank You
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;dev.to&lt;/code&gt; users for reading this post.&lt;/li&gt;
&lt;li&gt;Cloudflare for hosting this challenge.&lt;/li&gt;
&lt;li&gt;GitHub Copilot for helping me with everything, such as writing this post and coding!&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>cloudflarechallenge</category>
      <category>devchallenge</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
