<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: guoliwu</title>
    <description>The latest articles on DEV Community by guoliwu (@guoliwu).</description>
    <link>https://dev.to/guoliwu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1023584%2F81a65458-e417-4a5b-9726-19e14e05dd20.png</url>
      <title>DEV Community: guoliwu</title>
      <link>https://dev.to/guoliwu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/guoliwu"/>
    <language>en</language>
    <item>
      <title>How to Manage Jina Resources with Namespace</title>
      <dc:creator>guoliwu</dc:creator>
      <pubDate>Fri, 24 Feb 2023 07:52:56 +0000</pubDate>
      <link>https://dev.to/guoliwu/how-to-manage-jina-resources-with-namespace-1hhb</link>
      <guid>https://dev.to/guoliwu/how-to-manage-jina-resources-with-namespace-1hhb</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1y89ranb69zmiv7sl0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1y89ranb69zmiv7sl0y.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Over the past year, we’ve been rapidly expanding Jina AI Cloud, starting with our &lt;a href="https://cloud.jina.ai/" rel="noopener noreferrer"&gt;Executor Hub&lt;/a&gt;, and now encompassing &lt;a href="https://docarray.jina.ai/fundamentals/cloud-support/data/" rel="noopener noreferrer"&gt;DocumentArray storage&lt;/a&gt;, &lt;a href="https://docs.jina.ai/fundamentals/jcloud/" rel="noopener noreferrer"&gt;hosted Flows&lt;/a&gt; and &lt;a href="https://now.jina.ai/" rel="noopener noreferrer"&gt;cloud apps&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Jina AI Cloud: &lt;a href="https://cloud.jina.ai/" rel="noopener noreferrer"&gt;https://cloud.jina.ai/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That’s a lot of stuff to manage! We’re introducing &lt;strong&gt;user namespaces&lt;/strong&gt; to make things easier for all our users.&lt;/p&gt;

&lt;h2&gt;
  
  
  Namespace
&lt;/h2&gt;

&lt;p&gt;Previously, if Alice and Bob both wanted to push a DocumentArray called &lt;code&gt;fashion_mnist&lt;/code&gt;, whoever pushed first would get the name. That means Bob might have had to settle for &lt;code&gt;fashion_mnist2&lt;/code&gt; or similar. With more people using Jina AI Cloud, naming conflicts could become commonplace.&lt;/p&gt;

&lt;p&gt;With user namespaces, both Alice and Bob can have their own &lt;code&gt;fashion_mnist&lt;/code&gt; DocumentArrays (or Executors with the same name) with no fear of naming conflicts.&lt;/p&gt;

&lt;p&gt;The new namespaces apply to two important resources in the Jina AI ecosystem: &lt;strong&gt;DocumentArray&lt;/strong&gt; and &lt;strong&gt;Executor&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;Moving forwards, names for DocumentArrays and Hub Executors will follow the new schema, &lt;code&gt;namespace/resource&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4uo6lrkq10r6sc77yxpa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4uo6lrkq10r6sc77yxpa.png" alt=" " width="800" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡Note that Executors now use &lt;code&gt;jinaai://&lt;/code&gt; as the prefix (not &lt;code&gt;jinahub://&lt;/code&gt;) and no longer need secrets (as you are already logged in).&lt;/p&gt;
&lt;/blockquote&gt;
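&lt;p&gt;To make the schema concrete, here is a short illustrative sketch in plain Python (the helper below is hypothetical and not part of any Jina library): a name is now two parts, the user namespace and the resource name, joined by a slash.&lt;/p&gt;

```python
# Hypothetical helper (not part of the Jina API), illustrating the
# new 'namespace/resource' naming scheme.
def split_resource_name(name: str) -> tuple:
    """Split a namespaced resource name into (namespace, resource)."""
    namespace, sep, resource = name.partition('/')
    if not sep or not namespace or not resource:
        raise ValueError(f"expected 'namespace/resource', got {name!r}")
    return namespace, resource

# Alice and Bob can now each own a resource called 'fashion_mnist':
print(split_resource_name('alice/fashion_mnist'))  # ('alice', 'fashion_mnist')
print(split_resource_name('bob/fashion_mnist'))    # ('bob', 'fashion_mnist')
```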

&lt;h3&gt;
  
  
  Access scope
&lt;/h3&gt;

&lt;p&gt;For both DocumentArrays and Executors, the user has the following access scopes:&lt;/p&gt;

&lt;h3&gt;
  
  
  Does this break anything?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;There are no breaking changes for existing Executors on Executor Hub.&lt;/li&gt;
&lt;li&gt;You can still pull old DocumentArrays by their original name, but you can’t update them. Newly-pushed DocumentArrays must follow the &lt;code&gt;username/da-name&lt;/code&gt; scheme.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Managing your resources under the namespace
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Manage DocumentArrays
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create a Jina AI account at &lt;a href="http://cloud.jina.ai/" rel="noopener noreferrer"&gt;cloud.jina.ai&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fibfmw3zlglcs9wxctvad.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fibfmw3zlglcs9wxctvad.png" width="445" height="743"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://jina.ai/news/docarray-0-19-1-update/" rel="noopener noreferrer"&gt;Upgrade &lt;code&gt;docarray&lt;/code&gt; to &lt;code&gt;&amp;gt;=0.19.1&lt;/code&gt; version&lt;/a&gt; with &lt;code&gt;pip install -U docarray&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Log in via &lt;code&gt;docarray.login()&lt;/code&gt; and start pushing and pulling DocumentArrays:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;

&lt;span class="n"&gt;docarray&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# log in as 'alice'
&lt;/span&gt;
&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;alice/fashion_mnist&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;alice/fashion_mnist_updated&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt;Manage your pushed DocumentArrays on Jina AI Cloud in the "Storage" tab:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdox1fn1paihuzsa7ruse.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdox1fn1paihuzsa7ruse.png" alt="Untitled" width="800" height="279"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Manage Executors
&lt;/h3&gt;

&lt;p&gt;Upgrade &lt;code&gt;jina&lt;/code&gt; to the latest version with &lt;code&gt;pip install -U jina&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;1. Create a Jina AI account at &lt;a href="http://cloud.jina.ai/" rel="noopener noreferrer"&gt;cloud.jina.ai&lt;/a&gt;.&lt;br&gt;
2. Use an existing Executor in your Flow (in Python):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;jina&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flow&lt;/span&gt;

&lt;span class="n"&gt;flow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Flow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;jinaai+docker://alice/MyExecutor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# note jinaai
&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;3. Or push your own Executor to Executor Hub (from the CLI):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;jina auth login
jina hub push MyExecutor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;4. Manage your Executors in the "Executors" tab on Executor Hub:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe712o6zpewcd5objhinr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe712o6zpewcd5objhinr.png" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>discuss</category>
    </item>
    <item>
      <title>This Week(s) in DocArray</title>
      <dc:creator>guoliwu</dc:creator>
      <pubDate>Thu, 23 Feb 2023 11:10:52 +0000</pubDate>
      <link>https://dev.to/guoliwu/this-weeks-in-docarray-485n</link>
      <guid>https://dev.to/guoliwu/this-weeks-in-docarray-485n</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1e09izeu9zx9zq4q5cbj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1e09izeu9zx9zq4q5cbj.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It’s already been a month since the &lt;a href="https://github.com/docarray/docarray/releases/tag/2023.01.18.alpha" rel="noopener noreferrer"&gt;last alpha release&lt;/a&gt; of DocArray v2. Since then, a lot has happened: we’ve merged features that we’re really proud of, and we keep crying tears of joy and misery while coercing Python into doing what we want. If you want to learn about interesting Python edge cases or follow the progress of DocArray v2 development, then you’re in the right place!&lt;/p&gt;

&lt;p&gt;For those who don’t know, DocArray is a library for &lt;strong&gt;representing, sending, and storing multi-modal data&lt;/strong&gt;, with a focus on applications in &lt;strong&gt;ML&lt;/strong&gt; and &lt;strong&gt;Neural Search&lt;/strong&gt;. The project just moved to the Linux Foundation’s LF AI &amp;amp; Data, and to celebrate its first birthday we decided to rewrite it from scratch, mainly because of a design shift and a desire to solidify the codebase from the ground up.&lt;/p&gt;

&lt;h2&gt;
  
  
  MultiModalDataset
&lt;/h2&gt;

&lt;p&gt;As part of our goal to make DocArray the go-to library for representing, sending, and storing multi-modal data, we‘ve added a &lt;code&gt;MultiModalDataset&lt;/code&gt; class to easily convert DocumentArrays into PyTorch-Dataset-compliant datasets that can be used with the PyTorch DataLoader.&lt;/p&gt;

&lt;p&gt;All you need is a DocumentArray and a dictionary of preprocessing functions and you’re up and running!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray.data&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MultiModalDataset&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray.documents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torch.utils.data&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataLoader&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Thesis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Student&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;thesis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Thesis&lt;/span&gt;

&lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Student&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_students&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;MultiModalDataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Student&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MultiModalDataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Student&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;
    &lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;preprocessing&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thesis.title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;embed_title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thesis&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;normalize_embedding&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DataLoader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;collate_fn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MultiModalDataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Student&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;collate_fn&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Use your loader just like any other dataloader for awesome DL training
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you’re interested in using DocArray for training, check out our &lt;a href="https://github.com/docarray/docarray/blob/feat-rewrite-v2/docs/tutorials/multimodal_training_and_serving.md" rel="noopener noreferrer"&gt;example notebook&lt;/a&gt;, or take a peek at &lt;a href="https://github.com/docarray/docarray/pull/1049" rel="noopener noreferrer"&gt;implementation details of MultiModalDataset&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  TensorFlow support
&lt;/h2&gt;

&lt;p&gt;After recently adding PyTorch support, we’ve now gone on to add TensorFlow support to DocArray v2. As with PyTorch, we planned to subclass the &lt;code&gt;tensorflow.Tensor&lt;/code&gt; class with our &lt;code&gt;TensorFlowTensor&lt;/code&gt; class. That way, DocArray could run operations on it while we could still hand a &lt;code&gt;TensorFlowTensor&lt;/code&gt; instance over to ML models or TensorFlow functions, with TensorFlow recognizing it as one of its own rather than being confused by its class. Since we had already implemented this for PyTorch, this should be easy, right?&lt;/p&gt;

&lt;p&gt;But stop, not so fast. At first glance, TensorFlow tensors seem to be of class &lt;code&gt;tf.Tensor&lt;/code&gt;, right?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;

&lt;span class="n"&gt;tensor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
&lt;span class="n"&gt;tensor&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When trying to subclass &lt;code&gt;tf.Tensor&lt;/code&gt; though, we notice that this does not seem to work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Union&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cast&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray.typing.tensor.abstract_tensor&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AbstractTensor&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;parse_obj_as&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TensorFlowTensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AbstractTensor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TensorFlowTensor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__class__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cls&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;cast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TensorFlowTensor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Expected a tf.Tensor, got &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;our_tensor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parse_obj_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TensorFlowTensor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,)))&lt;/span&gt;  &lt;span class="c1"&gt;# will fail
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Parsing a &lt;code&gt;tf.Tensor&lt;/code&gt; as &lt;code&gt;TensorFlowTensor&lt;/code&gt; will fail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pydantic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error_wrappers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;validation&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ParsingModel&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TensorFlowTensor&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;__root__&lt;/span&gt;
  &lt;span class="n"&gt;__class__&lt;/span&gt; &lt;span class="n"&gt;assignment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TensorFlowTensor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt; &lt;span class="n"&gt;layout&lt;/span&gt; &lt;span class="n"&gt;differs&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tensorflow.python.framework.ops.EagerTensor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;type_error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But wait, here they talk about an &lt;code&gt;EagerTensor&lt;/code&gt;, not &lt;code&gt;tf.Tensor&lt;/code&gt;. This is because TensorFlow supports both eager execution and graph execution. It defaults to eager execution, where operations are evaluated immediately; in graph execution, a computational graph is constructed for later evaluation.&lt;/p&gt;

&lt;p&gt;So maybe we just need to extend TensorFlow’s &lt;code&gt;EagerTensor&lt;/code&gt; then!&lt;/p&gt;

&lt;p&gt;This, however, doesn’t work either, because the class &lt;code&gt;EagerTensor&lt;/code&gt; is created on the fly, which is why trying to extend this class will fail with:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;TypeError: type 'tensorflow.python.framework.ops.EagerTensor' is not an acceptable base type&lt;/code&gt;.&lt;/p&gt;
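&lt;p&gt;You don’t need TensorFlow to see this kind of failure: some built-in Python types refuse subclassing for the same low-level reason (their type object does not permit subtyping). As an analogue, &lt;code&gt;bool&lt;/code&gt; raises the very same error:&lt;/p&gt;

```python
# Plain-Python analogue of the EagerTensor failure: 'bool', like
# TensorFlow's on-the-fly EagerTensor class, is not an acceptable base type.
try:
    class MyBool(bool):  # fails at class-creation time
        pass
except TypeError as e:
    print(e)  # type 'bool' is not an acceptable base type
```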

&lt;p&gt;With all that being said, we’ve decided to go with the following solution for now:&lt;/p&gt;

&lt;p&gt;Instead of extending TensorFlow’s tensor, we store a &lt;code&gt;tf.Tensor&lt;/code&gt; instance as an attribute of our &lt;code&gt;TensorFlowTensor&lt;/code&gt; class. Therefore if you want to perform operations on the tensor data or hand it over to your ML model, you have to explicitly access the &lt;code&gt;.tensor&lt;/code&gt; attribute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray.typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TensorFlowTensor&lt;/span&gt;

&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TensorFlowTensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tensor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;224&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;224&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="c1"&gt;# tensorflow functions
&lt;/span&gt;&lt;span class="n"&gt;broadcasted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;broadcast_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;224&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;224&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;broadcasted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;broadcast_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unwrap&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;224&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;224&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;broadcasted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;broadcast_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;224&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;224&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# this will fail
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the future, we plan to take a closer look and find a solution that lets us handle &lt;code&gt;TensorFlowTensor&lt;/code&gt;s just like our &lt;code&gt;TorchTensor&lt;/code&gt;s. In particular, we plan to investigate whether TensorFlow has an equivalent to Torch’s &lt;code&gt;__torch_function__()&lt;/code&gt;, which we told you about in the &lt;a href="https://jina.ai/news/this-week-in-docarray-1" rel="noopener noreferrer"&gt;previous blog post&lt;/a&gt;. With such an equivalent and some tricks here and there, we hope to enable smooth usage of our &lt;code&gt;TensorFlowTensor&lt;/code&gt; class and make it feel like a subclass of TensorFlow’s tensor, without it actually being one.&lt;/p&gt;
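&lt;p&gt;The interim &lt;code&gt;.tensor&lt;/code&gt; design can be approximated in plain Python: a thin wrapper stores the underlying object and forwards attribute lookups to it. The sketch below only illustrates that pattern (no TensorFlow involved) and is not DocArray’s actual implementation:&lt;/p&gt;

```python
# Sketch of the wrapper pattern (not DocArray's actual code): keep the
# wrapped object in an attribute and delegate unknown lookups to it.
class TensorWrapper:
    def __init__(self, tensor):
        self.tensor = tensor  # explicit access, like TensorFlowTensor.tensor

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails:
        # forward the lookup to the wrapped object.
        return getattr(self.tensor, name)

# Any object can stand in for a tensor; here, a list:
t = TensorWrapper([0.0] * 5)
print(t.tensor)      # [0.0, 0.0, 0.0, 0.0, 0.0]
print(t.count(0.0))  # delegated to the wrapped list: 5
```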

&lt;h2&gt;
  
  
  Nested class and multiprocessing
&lt;/h2&gt;

&lt;p&gt;As part of our goal to make DocArray the go-to library for representing, sending, and storing multi-modal data, it’s important that DocumentArrays support multiprocessing, i.e. processing across multiple CPU cores.&lt;/p&gt;

&lt;p&gt;In particular, we recently implemented a &lt;code&gt;MultiModalDataset&lt;/code&gt; class to easily convert a DocumentArray into a dataset that can be used in the PyTorch DataLoader. The PyTorch DataLoader wraps the Python multiprocessing module to implement preprocessing with multiple CPUs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the well-known issues with multiprocessing is that it doesn’t support classes that are declared inside a function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;MyClass&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;MyClass&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;multiprocessing&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fork&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nc"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Traceback &lt;span class="o"&gt;(&lt;/span&gt;most recent call last&lt;span class="o"&gt;)&lt;/span&gt;:
  File &lt;span class="s2"&gt;"/Users/jackmin/Jina/docarray/meow.py"&lt;/span&gt;, line 13, &lt;span class="k"&gt;in&lt;/span&gt; &amp;lt;module&amp;gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;p.map&lt;span class="o"&gt;(&lt;/span&gt;foo, range&lt;span class="o"&gt;(&lt;/span&gt;2&lt;span class="o"&gt;)))&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/pool.py"&lt;/span&gt;, line 367, &lt;span class="k"&gt;in &lt;/span&gt;map
    &lt;span class="k"&gt;return &lt;/span&gt;self._map_async&lt;span class="o"&gt;(&lt;/span&gt;func, iterable, mapstar, chunksize&lt;span class="o"&gt;)&lt;/span&gt;.get&lt;span class="o"&gt;()&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/pool.py"&lt;/span&gt;, line 774, &lt;span class="k"&gt;in &lt;/span&gt;get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: &lt;span class="s1"&gt;'[&amp;lt;__main__.get_class.&amp;lt;locals&amp;gt;.B object at 0x10152e950&amp;gt;]'&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; Reason: &lt;span class="s1"&gt;'AttributeError("Can'&lt;/span&gt;t pickle &lt;span class="nb"&gt;local &lt;/span&gt;object &lt;span class="s1"&gt;'get_class.&amp;lt;locals&amp;gt;.B'&lt;/span&gt;&lt;span class="s2"&gt;")'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pickling&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is because multiprocessing uses pickle to share objects with workers. Pickling saves only an object’s qualified class name, and unpickling re-imports the class by that name. For this to work, the class needs a globally reachable qualified name; classes defined inside functions are local and thus cannot be pickled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;MyClass&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pickle&lt;/span&gt;

&lt;span class="n"&gt;pickle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MyClass&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meow.pkl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;wb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Traceback &lt;span class="o"&gt;(&lt;/span&gt;most recent call last&lt;span class="o"&gt;)&lt;/span&gt;:
  File &lt;span class="s2"&gt;"/Users/jackmin/Jina/docarray/meow.py"&lt;/span&gt;, line 10, &lt;span class="k"&gt;in&lt;/span&gt; &amp;lt;module&amp;gt;
    pickle.dump&lt;span class="o"&gt;(&lt;/span&gt;MyClass&lt;span class="o"&gt;()&lt;/span&gt;, open&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"meow.pkl"&lt;/span&gt;, &lt;span class="s2"&gt;"wb"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
AttributeError: Can&lt;span class="s1"&gt;'t pickle local object '&lt;/span&gt;get_class.&amp;lt;locals&amp;gt;.B&lt;span class="s1"&gt;'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
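&lt;p&gt;Conversely, for a module-level class the pickle stream really does contain little more than the qualified name plus the instance state; no class body is stored. A minimal stdlib-only sketch (the &lt;code&gt;Point&lt;/code&gt; class here is purely illustrative):&lt;/p&gt;

```python
import pickle


class Point:
    """A module-level class: it has a globally reachable qualified name."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


data = pickle.dumps(Point(1, 2))

# The stream stores the class *name*, not its definition...
assert b'Point' in data

# ...plus the instance state, which is restored by re-importing the class:
restored = pickle.loads(data)
print(restored.x, restored.y)  # 1 2
```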




&lt;p&gt;In order to get around this, we need to make the declared class global:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;MyClass&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pickle&lt;/span&gt;

&lt;span class="n"&gt;pickle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MyClass&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meow.pkl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;wb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can now load the pickles in a separate process as long as the process has a declaration of our class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;MyClass&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pickle&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pickle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meow.pkl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It doesn’t really matter how it ends up in the global scope. We can even do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pickle&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pickle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meow.pkl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
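&lt;p&gt;Because the lookup happens purely by name at load time, the class found during unpickling can even be a &lt;em&gt;different object&lt;/em&gt; from the one that existed when the object was dumped. A minimal single-process sketch:&lt;/p&gt;

```python
import pickle


class B:
    VERSION = 1


blob = pickle.dumps(B())  # stores the name 'B' in this module, plus empty state


class B:  # re-declare: the global name B now points to a brand-new class
    VERSION = 2


obj = pickle.loads(blob)
# The name lookup finds the *new* class, so the attribute comes from it:
print(obj.VERSION)  # 2
```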



&lt;p&gt;&lt;strong&gt;The fix?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OK, so pickle just wants the class to be global. Simple enough, right? Let’s just plop &lt;code&gt;global&lt;/code&gt; in front of our declaration and be done with it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;MyClass&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;MyClass&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;multiprocessing&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fork&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nc"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yay, this runs fine! But what if our function returns a different class depending on the input arguments? After all, why else would we want to return a class from a function?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;C1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;C2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VERSION&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;multiprocessing&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fork&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nc"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;get_version&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;C1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;C2&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&amp;lt;class &lt;span class="s1"&gt;'__main__.B'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
Traceback &lt;span class="o"&gt;(&lt;/span&gt;most recent call last&lt;span class="o"&gt;)&lt;/span&gt;:
  File &lt;span class="s2"&gt;"/Users/jackmin/Jina/docarray/meow.py"&lt;/span&gt;, line 19, &lt;span class="k"&gt;in&lt;/span&gt; &amp;lt;module&amp;gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;p.map&lt;span class="o"&gt;(&lt;/span&gt;get_version, &lt;span class="o"&gt;[&lt;/span&gt;C1, C2]&lt;span class="o"&gt;))&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/pool.py"&lt;/span&gt;, line 367, &lt;span class="k"&gt;in &lt;/span&gt;map
    &lt;span class="k"&gt;return &lt;/span&gt;self._map_async&lt;span class="o"&gt;(&lt;/span&gt;func, iterable, mapstar, chunksize&lt;span class="o"&gt;)&lt;/span&gt;.get&lt;span class="o"&gt;()&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/pool.py"&lt;/span&gt;, line 774, &lt;span class="k"&gt;in &lt;/span&gt;get
    raise self._value
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/pool.py"&lt;/span&gt;, line 540, &lt;span class="k"&gt;in &lt;/span&gt;_handle_tasks
    put&lt;span class="o"&gt;(&lt;/span&gt;task&lt;span class="o"&gt;)&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/connection.py"&lt;/span&gt;, line 211, &lt;span class="k"&gt;in &lt;/span&gt;send
    self._send_bytes&lt;span class="o"&gt;(&lt;/span&gt;_ForkingPickler.dumps&lt;span class="o"&gt;(&lt;/span&gt;obj&lt;span class="o"&gt;))&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/reduction.py"&lt;/span&gt;, line 51, &lt;span class="k"&gt;in &lt;/span&gt;dumps
    cls&lt;span class="o"&gt;(&lt;/span&gt;buf, protocol&lt;span class="o"&gt;)&lt;/span&gt;.dump&lt;span class="o"&gt;(&lt;/span&gt;obj&lt;span class="o"&gt;)&lt;/span&gt;
_pickle.PicklingError: Can&lt;span class="s1"&gt;'t pickle &amp;lt;class '&lt;/span&gt;__main__.B&lt;span class="s1"&gt;'&amp;gt;: it'&lt;/span&gt;s not the same object as __main__.B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Can't pickle &amp;lt;class '__main__.B'&amp;gt;: it's not the same object as __main__.B&lt;/code&gt;. What does that mean?&lt;/p&gt;
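&lt;p&gt;We can reproduce the error without multiprocessing at all: when pickling a class, pickle checks that looking up the class’s qualified name yields the very same class object, and the second &lt;code&gt;get_class()&lt;/code&gt; call re-binds the global name. A single-process sketch:&lt;/p&gt;

```python
import pickle


def get_class(version: int):
    global B

    class B:
        VERSION: int = version

    return B


C1 = get_class(1)
C2 = get_class(2)  # this call re-binds the global name B to a new class

try:
    # C1's qualified name now resolves to C2, a different object:
    pickle.dumps(C1)
except pickle.PicklingError as e:
    print(e)  # "... it's not the same object as ...B"

# Pickling C2 works fine, since the global name B still points to it:
assert pickle.loads(pickle.dumps(C2)) is C2
```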

&lt;p&gt;&lt;strong&gt;Double declaration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Well, our little trick has some caveats. The &lt;code&gt;global&lt;/code&gt; declaration hoists the class into the top-level scope, so every call to &lt;code&gt;get_class()&lt;/code&gt; re-binds the same name &lt;code&gt;B&lt;/code&gt;. This means we’re essentially doing this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;C1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="n"&gt;C2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VERSION&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;multiprocessing&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fork&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nc"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;get_version&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;C1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;C2&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we run this code, we get the exact same error we got before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&amp;lt;class &lt;span class="s1"&gt;'__main__.B'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
Traceback &lt;span class="o"&gt;(&lt;/span&gt;most recent call last&lt;span class="o"&gt;)&lt;/span&gt;:
  File &lt;span class="s2"&gt;"/Users/jackmin/Jina/docarray/wow.py"&lt;/span&gt;, line 15, &lt;span class="k"&gt;in&lt;/span&gt; &amp;lt;module&amp;gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;p.map&lt;span class="o"&gt;(&lt;/span&gt;get_version, &lt;span class="o"&gt;[&lt;/span&gt;C1, C2]&lt;span class="o"&gt;))&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/pool.py"&lt;/span&gt;, line 367, &lt;span class="k"&gt;in &lt;/span&gt;map
    &lt;span class="k"&gt;return &lt;/span&gt;self._map_async&lt;span class="o"&gt;(&lt;/span&gt;func, iterable, mapstar, chunksize&lt;span class="o"&gt;)&lt;/span&gt;.get&lt;span class="o"&gt;()&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/pool.py"&lt;/span&gt;, line 774, &lt;span class="k"&gt;in &lt;/span&gt;get
    raise self._value
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/pool.py"&lt;/span&gt;, line 540, &lt;span class="k"&gt;in &lt;/span&gt;_handle_tasks
    put&lt;span class="o"&gt;(&lt;/span&gt;task&lt;span class="o"&gt;)&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/connection.py"&lt;/span&gt;, line 211, &lt;span class="k"&gt;in &lt;/span&gt;send
    self._send_bytes&lt;span class="o"&gt;(&lt;/span&gt;_ForkingPickler.dumps&lt;span class="o"&gt;(&lt;/span&gt;obj&lt;span class="o"&gt;))&lt;/span&gt;
  File &lt;span class="s2"&gt;"/Users/jackmin/miniconda3/envs/docarray/lib/python3.10/multiprocessing/reduction.py"&lt;/span&gt;, line 51, &lt;span class="k"&gt;in &lt;/span&gt;dumps
    cls&lt;span class="o"&gt;(&lt;/span&gt;buf, protocol&lt;span class="o"&gt;)&lt;/span&gt;.dump&lt;span class="o"&gt;(&lt;/span&gt;obj&lt;span class="o"&gt;)&lt;/span&gt;
_pickle.PicklingError: Can&lt;span class="s1"&gt;'t pickle &amp;lt;class '&lt;/span&gt;__main__.B&lt;span class="s1"&gt;'&amp;gt;: it'&lt;/span&gt;s not the same object as __main__.B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What happened here? By declaring the class twice, we’ve overwritten the first &lt;code&gt;Class B&lt;/code&gt; with a second &lt;code&gt;Class B&lt;/code&gt; in the global scope. Pickle notices this when it tries to serialize &lt;code&gt;C1&lt;/code&gt;: the &lt;code&gt;Class B&lt;/code&gt; that &lt;code&gt;C1&lt;/code&gt; refers to is no longer the top-level one, so it raises an exception.&lt;/p&gt;
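&lt;p&gt;The same identity check can be triggered without multiprocessing at all; a minimal sketch using &lt;code&gt;pickle&lt;/code&gt; directly:&lt;/p&gt;

```python
import pickle

class B:
    VERSION: int = 1

C1 = B  # keep a reference to the first definition

class B:  # redefines B under the same qualified name
    VERSION: int = 2

# pickle serializes classes by reference: it looks up "B" in this module
# and checks that the result is the very object being pickled. The lookup
# now finds the second class, so pickling C1 fails.
try:
    pickle.dumps(C1)
except pickle.PicklingError as e:
    print('PicklingError:', e)
```

&lt;p&gt;Pickling &lt;code&gt;B&lt;/code&gt; itself (the current definition) still works, since the lookup resolves back to the same object.&lt;/p&gt;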

&lt;p&gt;&lt;strong&gt;Qualified names must be unique&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The issue here is that both &lt;code&gt;Class B&lt;/code&gt;s have the same qualified name. Thus, both definitions are fighting over who gets to be the one the global dictionary knows about.&lt;/p&gt;

&lt;p&gt;We can resolve this conflict and allow our two classes to live together peacefully by moving them to different qualified names and thus, different keys in the global scope:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;B&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;

    &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__qualname__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__qualname__&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;globals&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;B&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;

&lt;span class="n"&gt;C1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;C2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_class&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Class Name:&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Class Qualified Name:&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__qualname__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Type repr&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VERSION&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;multiprocessing&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fork&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nc"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;get_version&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;C1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;C2&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Class Name: B
Class Qualified Name: B1
Type repr &amp;lt;class &lt;span class="s1"&gt;'__main__.B1'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
Class Name: B
Class Qualified Name: B2
Type repr &amp;lt;class &lt;span class="s1"&gt;'__main__.B2'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;1, 2]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that although the two classes have different qualified names, they can still share the same name with no issues. Printing the type does, however, show the qualified name.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Implementation example&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you’d like to see how we used this pattern to implement DocumentArrays that work with multiprocessing, check out &lt;a href="https://github.com/docarray/docarray/pull/1049" rel="noopener noreferrer"&gt;this PR&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Support Protobuf 3 and 4
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://protobuf.dev/" rel="noopener noreferrer"&gt;Protobuf&lt;/a&gt; introduced a &lt;a href="https://github.com/tensorflow/tensorflow/issues/56077" rel="noopener noreferrer"&gt;breaking change&lt;/a&gt; in their 4.21 release. This has had a big impact on the Python ecosystem, and a lot of libraries have not yet been updated to use version 4.x. Perhaps the biggest pain for the ML ecosystem is TensorFlow’s lack of support for Protobuf, as it’s a widely used library and many packages, including DocArray, depend on it.&lt;/p&gt;

&lt;p&gt;At the same time, DocArray can be used without TensorFlow; it’s just one of several available backends. To better support all users, we’ve decided to support both versions of Protobuf.&lt;/p&gt;

&lt;p&gt;This is actually easier than it may sound: we simply generated two Python files with &lt;code&gt;protoc&lt;/code&gt;, one for each of the Protobuf versions we want to support (3.x and 4.x).&lt;/p&gt;

&lt;p&gt;So, depending on the Protobuf version you have installed, we load either the first or the second generated file. It’s as straightforward as that. &lt;a href="https://github.com/docarray/docarray/pull/1078" rel="noopener noreferrer"&gt;Here&lt;/a&gt; is the PR for the curious.&lt;/p&gt;
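&lt;p&gt;A rough sketch of the idea (the module names below are made up for illustration, not DocArray’s actual ones): inspect the installed Protobuf version and pick the matching pre-generated file:&lt;/p&gt;

```python
def select_proto_module(protobuf_version: str) -> str:
    """Pick which pre-generated *_pb2 module to import based on the
    installed Protobuf version. Module names here are hypothetical."""
    major = int(protobuf_version.split('.')[0])
    return 'docarray_pb2' if major >= 4 else 'docarray_pb2_legacy'

# In real code the version string would come from google.protobuf.__version__
print(select_proto_module('4.21.12'))  # docarray_pb2
print(select_proto_module('3.20.3'))   # docarray_pb2_legacy
```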

&lt;h2&gt;
  
  
  Join the conversation
&lt;/h2&gt;

&lt;p&gt;Want to keep up to date or just have a chat with us? Join our Discord and say hi!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.gg/WaMp6PVPgR" rel="noopener noreferrer"&gt;Join Discord&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Author
&lt;/h2&gt;

&lt;p&gt;Sami Jaghouar, Alex C-G, Charlotte Gerhaher, Jack Min Ong&lt;/p&gt;

&lt;h2&gt;
  
  
  Original Link
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://jina.ai/news/this-week-in-docarray-2/https://jina.ai/news/this-week-in-docarray-2/" rel="noopener noreferrer"&gt;https://jina.ai/news/this-week-in-docarray-2/https://jina.ai/news/this-week-in-docarray-2/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>tutorial</category>
      <category>beginners</category>
      <category>automation</category>
    </item>
    <item>
      <title>Fine-tuning with Low Budget and High Expectations</title>
      <dc:creator>guoliwu</dc:creator>
      <pubDate>Wed, 22 Feb 2023 06:42:41 +0000</pubDate>
      <link>https://dev.to/guoliwu/fine-tuning-with-low-budget-and-high-expectations-35oc</link>
      <guid>https://dev.to/guoliwu/fine-tuning-with-low-budget-and-high-expectations-35oc</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hu6da32grkmj3uzz3ep.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hu6da32grkmj3uzz3ep.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Fine-tuning is a transfer learning technique developed as part of the Deep Learning revolution in artificial intelligence. Instead of learning a new task from scratch, fine-tuning takes a pre-trained model, trained on a related task, and then further trains it for the new task. Alternately, it can mean taking a model pre-trained for an open domain task, and further training it for a domain-specific one.&lt;/p&gt;

&lt;p&gt;Compared to training from scratch, fine-tuning is a much more cost-efficient solution whenever it is feasible. It requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;less labeled data,&lt;/strong&gt; as there is no need to learn everything all over again. All the training is devoted to acquiring domain-specific knowledge.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;less time to train,&lt;/strong&gt; since the number of variables is much smaller and most layers in the deep neural network freeze during fine-tuning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Leveraging and transferring pre-existing training to new problems is one of the major practical developments of the Deep Learning revolution. It is highly effective, economical, and environmentally friendly. This is especially true for small businesses and individuals that hope to take advantage of new AI technologies.&lt;/p&gt;

&lt;p&gt;Or at least that's what all the deep learning tweets will tell you.&lt;/p&gt;

&lt;p&gt;But if you think about it, or try to use fine-tuning in a real world use-case, you will quickly find out that the promise comes with a lot of caveats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Exactly &lt;em&gt;how much data&lt;/em&gt; do you need to get a good result?&lt;/strong&gt; One labeled data point? Ten? One thousand? Ten thousand?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exactly &lt;em&gt;how much time&lt;/em&gt; do you need to get good results?&lt;/strong&gt; One minute of fine-tuning? An hour? A day? A week?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are not trivial questions, even for large enterprises, but they are especially critical for SMEs and individuals who have limited resources to invest in AI. Domain-specific data is neither free nor error-free and requires costly human labor to generate. Top-of-the-line GPU pipelines are frighteningly expensive to buy and maintain, with most enterprises renting time on a cloud service. An unplanned AWS bill in the thousands of euros is unwelcome at the best of times.&lt;/p&gt;

&lt;p&gt;This article will give you a &lt;strong&gt;quantitative answer&lt;/strong&gt; to these questions, using the &lt;a href="https://rebrand.ly/jina-ai-finetune" rel="noopener noreferrer"&gt;Jina AI Finetuner&lt;/a&gt;. This tool is designed to improve the performance of pre-trained models and make them production-ready without expensive hardware.&lt;/p&gt;

&lt;h2&gt;
  
  
  Experiment design
&lt;/h2&gt;

&lt;p&gt;We designed two experiments to quantitatively study how &lt;strong&gt;labeled data&lt;/strong&gt; and &lt;strong&gt;training time&lt;/strong&gt; affect fine-tuning performance. For each experiment, we construct three multimodal search tasks by fine-tuning three deep neural networks. We chose seven datasets, two of which are non-domain-specific public datasets, to ensure the generality of our experiment.&lt;/p&gt;

&lt;p&gt;We measure the performance of fine-tuned models by evaluating their ability to perform search tasks, as measured by &lt;a href="https://en.wikipedia.org/wiki/Mean_reciprocal_rank" rel="noopener noreferrer"&gt;Mean Reciprocal Rank&lt;/a&gt; (mRR), Recall, and &lt;a href="https://stats.stackexchange.com/questions/127041/mean-average-precision-vs-mean-reciprocal-rank" rel="noopener noreferrer"&gt;Mean Average Precision&lt;/a&gt; (mAP). These metrics are calculated using the top 20 results of each search in the validation subset held out from each dataset.&lt;/p&gt;
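&lt;p&gt;For reference, a minimal sketch of how Mean Reciprocal Rank over the top-k results is computed (the document ids here are toy values, not from our datasets):&lt;/p&gt;

```python
def reciprocal_rank(ranked_ids, relevant_ids, k=20):
    """1/rank of the first relevant hit within the top-k results, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(rankings, relevances, k=20):
    """Average reciprocal rank over all queries."""
    scores = [reciprocal_rank(r, rel, k) for r, rel in zip(rankings, relevances)]
    return sum(scores) / len(scores)

# Toy example: first query's first relevant hit is at rank 2, second at rank 1
print(mean_reciprocal_rank([['a', 'b'], ['c', 'd']], [{'b'}, {'c'}]))  # 0.75
```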

&lt;p&gt;The table below summarizes the tasks, models and datasets used in our experiments, as well as their performance metrics without any fine-tuning.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpqomv9s2t93497pulwu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpqomv9s2t93497pulwu.png" alt=" " width="733" height="610"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We already knew, even before performing any experiments, that all else being equal, more &lt;strong&gt;labeled data&lt;/strong&gt; and more &lt;strong&gt;training time&lt;/strong&gt; positively influence performance. But it’s not enough to say that. We need to know: how much is enough?&lt;/p&gt;

&lt;p&gt;The overarching question of our experiment is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Can we estimate the minimum domain- and task-specific labeled data and training time to deliver an adequate performance?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How much labeled data is needed for good fine-tuning?
&lt;/h2&gt;

&lt;p&gt;We gradually increase the amount of labeled data fed to Finetuner from 100 items to 100,000 and see how this affects performance on the metrics described in the previous section.&lt;/p&gt;

&lt;p&gt;We further calculate the &lt;em&gt;return on investment&lt;/em&gt; (ROI), by dividing the relative improvement (a proxy for net profit) by the amount of labeled data (a proxy for investment cost). &lt;strong&gt;This is useful because it indicates the point at which adding more data is producing diminishing returns.&lt;/strong&gt;&lt;/p&gt;
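&lt;p&gt;Concretely, the ROI here is just a ratio; a sketch with purely illustrative numbers (not results from our experiments):&lt;/p&gt;

```python
def roi(relative_improvement: float, cost: float) -> float:
    """ROI as used in this article: relative improvement over the
    pre-trained model (a proxy for net profit) divided by the cost
    (here, the number of labeled data items)."""
    return relative_improvement / cost

# Illustrative only: a 30% relative improvement from 1,000 labeled items
print(roi(0.30, 1000))  # 0.0003 improvement per labeled item
```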

&lt;p&gt;In the figures below, the X-axis represents the amount of labeled data, and the Y-axis represents the relative improvement over the pre-trained model. The higher, the better.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx07euorg3iyx2p7zp4do.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx07euorg3iyx2p7zp4do.png" alt=" " width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fanuxhspyt7bs3pddgquc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fanuxhspyt7bs3pddgquc.png" alt=" " width="800" height="255"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F376k4pc5q2p301k4o3wj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F376k4pc5q2p301k4o3wj.png" alt=" " width="800" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These results are promising but &lt;em&gt;not&lt;/em&gt; particularly surprising. Performance improves with more labeled data on nearly all tasks and all datasets, more for some tasks and datasets than for others. However, the only conclusion we can draw from these figures is that the Finetuner works as advertised. So far so good.&lt;/p&gt;

&lt;p&gt;What is more interesting is the ROI curve. In the figures below, the X-axis represents the amount of labeled data, and the Y-axis represents the ROI per labeled data item. The higher, the better. In particular, &lt;code&gt;ROI=0&lt;/code&gt; means adding new labeled data at that point no longer contributes to any improvement.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc4j1o06dgorgwiivx0qh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc4j1o06dgorgwiivx0qh.png" alt=" " width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fexnag2ywmn5f4j6rq20q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fexnag2ywmn5f4j6rq20q.png" alt=" " width="800" height="243"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk39staqwgdv1f893anhp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk39staqwgdv1f893anhp.png" alt=" " width="800" height="357"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Surprisingly, the ROI per unit of new labeled data starts to drop almost immediately. We expected it to decrease eventually, but not this quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  How much time is needed for fine-tuning?
&lt;/h2&gt;

&lt;p&gt;To measure the value of added training time, we fixed the amount of new labeled data at 1,000 items, then gradually increased the number of training epochs from 1 to 10. At each increase, we measured the improvement over the pre-trained model and calculated the ROI. For these experiments, the ROI is calculated by dividing the relative improvement by the elapsed time in seconds. This means that when &lt;code&gt;ROI=0&lt;/code&gt;, adding training time no longer improves performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9kcmbbil9d2ueq5v5lj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9kcmbbil9d2ueq5v5lj.png" alt=" " width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fme89lbjkuwck70e2pvgs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fme89lbjkuwck70e2pvgs.png" alt=" " width="800" height="236"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcf0n0ihkrfpmut7bemwe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcf0n0ihkrfpmut7bemwe.png" alt=" " width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We knew in advance that adding more time does not guarantee any improvement at all. It can, in fact, reduce performance due to the &lt;em&gt;overfitting problem&lt;/em&gt;. Some models (e.g. CLIP) are more prone to overfitting than others. In principle, if we keep training with the same 1000 data points over and over, we are guaranteed to overfit the data and the overall performance will drop.&lt;/p&gt;

&lt;p&gt;Let's look at the ROI curves.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq71svumqo0j4qx40ew8o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq71svumqo0j4qx40ew8o.png" alt=" " width="800" height="356"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxi9w0mchpclq0gahidqp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxi9w0mchpclq0gahidqp.png" alt=" " width="800" height="246"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd2qby7lx8632c5dr9fbq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd2qby7lx8632c5dr9fbq.png" alt=" " width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The ROI drops immediately after the first epoch of fine-tuning. Unlike in the last experiment, where ROI approached zero but stayed positive as we added labeled data, here the ROI on added time can go negative due to the overfitting problem!&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;What does this mean for users looking to maximize gains and minimize costs?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Many state-of-the-art deep neural networks are capable of &lt;em&gt;few-shot&lt;/em&gt; learning. They are quick learners and can make large improvements with only a few hundred items of labeled data and only a few minutes of training time. You might have thought that deep neural network training requires millions of data items and a week of runtime, but we have shown in this article how that stereotype does not hold up to reality.&lt;/li&gt;
&lt;li&gt;Because they can learn so much, so fast, from so little data, ROI drops quickly as you put more time and data into fine-tuning. In the experiments above, ROI shrinks by 70% from its highest value after 500 labeled data items or 600 added seconds of GPU training time. Further investment beyond a few hundred items of training data and very minimal training time may not pay off as well as you would like.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All in all, fine-tuning isn’t even in the same economic league as training from scratch: it is far, far cheaper, especially with the help of &lt;a href="https://rebrand.ly/jina-ai-finetune" rel="noopener noreferrer"&gt;Finetuner&lt;/a&gt;. So the next time you receive a marketing email from the sales department of a GPU vendor or a company offering crowdsourced data acquisition, you know how to bargain with them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Author
&lt;/h2&gt;

&lt;p&gt;Han Xiao, Bo Wang, Scott Martens&lt;/p&gt;

&lt;h2&gt;
  
  
  Original Link
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://jina.ai/news/fine-tuning-with-low-budget-and-high-expectations/" rel="noopener noreferrer"&gt;https://jina.ai/news/fine-tuning-with-low-budget-and-high-expectations/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://rebrand.ly/jina-ai-finetune" rel="noopener noreferrer"&gt;https://rebrand.ly/jina-ai-finetune&lt;/a&gt;&lt;/p&gt;

</description>
      <category>firstpost</category>
      <category>posts</category>
      <category>introduction</category>
    </item>
    <item>
      <title>A Guide to Using OpenTelemetry in Jina for Monitoring and Tracing Applications</title>
      <dc:creator>guoliwu</dc:creator>
      <pubDate>Fri, 17 Feb 2023 05:45:17 +0000</pubDate>
      <link>https://dev.to/guoliwu/a-guide-to-using-opentelemetry-in-jina-for-monitoring-and-tracing-applications-15a4</link>
      <guid>https://dev.to/guoliwu/a-guide-to-using-opentelemetry-in-jina-for-monitoring-and-tracing-applications-15a4</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xzUy8YXU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/Jina-AI-Website-Banners-Templates--44--1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xzUy8YXU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/Jina-AI-Website-Banners-Templates--44--1.png" alt="" width="880" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As software (and the cloud) eats the world, more companies are using more microservice architectures, containerization, multi-cloud deployments and continuous deployment patterns. That means more points of failure. Failures, along with tight Service Level Objectives, stress out operations/SRE/DevOps teams, in turn increasing friction with development teams that want to deploy new features and launch new A/B tests as soon as possible.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/1Be4g2yeiJ1QfqaKvz/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/1Be4g2yeiJ1QfqaKvz/giphy.gif" alt="Jina developers when we can’t deploy new features" width="387" height="292"&gt;&lt;/a&gt;Jina developers when we can’t deploy new features&lt;/p&gt;

&lt;p&gt;CI/CD patterns have evolved a lot in the cloud era, helping teams to push improvements, fixes or new features to production almost instantaneously, giving users access to the latest goodies. One thing that enables this is the wide range of tools that both generate and collect information from running applications in real-time or near real-time.&lt;/p&gt;

&lt;p&gt;This information is in the form of signals indicating an application's health, performance and conformity. You can observe and analyze signal anomalies to catch misbehaving applications or features that need to be patched or disabled until further analysis. If you do it properly, you can detect anomalies more quickly, meaning happier customers, prevention of major outages and fewer security leaks.&lt;/p&gt;
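The kind of anomaly detection described here can be sketched with a rolling statistic: flag any observation that strays too far from the recent mean. Below is a minimal, stdlib-only illustration; the window size, threshold, and latency numbers are invented for the example, not taken from the article:

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(latencies_ms, window=5, threshold=3.0):
    """Flag observations more than `threshold` standard deviations
    above the rolling mean of the previous `window` samples."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(latencies_ms):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and (value - mu) / sigma > threshold:
                anomalies.append((i, value))
        recent.append(value)
    return anomalies

# A latency series with one obvious spike
series = [100, 102, 98, 101, 99, 100, 103, 5000, 97, 101]
print(detect_anomalies(series))  # the 5000 ms request is flagged
```

Real monitoring stacks use far more robust detectors, but the principle is the same: turn raw signals into a judgment about whether the application is misbehaving.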

&lt;p&gt;&lt;a href="https://i.giphy.com/media/J4JSpIwM6y3Q6xnHgg/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/J4JSpIwM6y3Q6xnHgg/giphy.gif" alt="A disembodied hand coming from a screen like the girl from the Ring is another way to make customers happy" width="320" height="244"&gt;&lt;/a&gt;A disembodied hand coming from a screen like the girl from the Ring is another way to make customers happy&lt;/p&gt;


&lt;p&gt;Back in the old days, signalling methods started as humble error/exception logging. They've now evolved to the latest OpenTelemetry standard.&lt;/p&gt;

&lt;p&gt;In this post we’ll explore the &lt;strong&gt;new tracing and monitoring features introduced in &lt;a href="https://jina.ai/news/jina-3-12-update/"&gt;Jina 3.12&lt;/a&gt;&lt;/strong&gt;, and use Sentry to track what’s happening when indexing or searching.&lt;/p&gt;

&lt;h2&gt;
  
  
  What problems can monitoring and tracing solve?
&lt;/h2&gt;

&lt;p&gt;We're going to use a fictional person to help explore the monitoring and tracing landscape. Meet Pamela NoLastName: she started a website called &lt;code&gt;Pamazon&lt;/code&gt; to help people buy products online. Let's walk through the evolution of &lt;code&gt;Pamazon&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--UFlo0rEg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-17.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--UFlo0rEg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-17.png" alt="Pamazon front page" width="880" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Pamazon 1.0: Logs as &lt;code&gt;stdout&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;As Pamela's site grows, she needs monitoring to generate, capture and analyze signals and improve site reliability. She starts with a pretty simple system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Application logs are stored on each local machine and rotated periodically to avoid using up all the available disk space. If Pamela wants a longer retention period, she has to export these logs.&lt;/li&gt;
&lt;li&gt;If a customer or tester notices an anomaly like the search engine constantly timing out, they create a support ticket to complain.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OQ7cyyes--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-18.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OQ7cyyes--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-18.png" alt="" width="880" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pamela checks the logs and matches the time and possible error that the customer saw.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a cumbersome process, and root cause analysis doesn't target the customer’s actual experience. Pamela needs a better way to do things.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pamazon 2.0: Structured and persistent logs
&lt;/h3&gt;

&lt;p&gt;Luckily for Pamela, application logging has evolved a lot with logging formats, structured logging (JSON) and better error messages. There are tools integrating code performance style measurements into running applications to measure near real-time performance of pieces of code at various layers (networking, disk, CPU) of the application stack. Pamela can capture, visualize and store these signals for a longer time. Big data processing frameworks let her aggregate and crunch data from different application landscapes (languages, architectures, device platforms) and deployment environments.&lt;/p&gt;
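The structured (JSON) logging upgrade Pamela adopts can be sketched with Python's standard `logging` module alone. The field names below are illustrative, not a standard schema:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""
    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "timestamp": self.formatTime(record),
        }
        # Attach structured extras passed via `extra=...`
        if hasattr(record, "order_id"):
            payload["order_id"] = record.order_id
        return json.dumps(payload)

logger = logging.getLogger("pamazon.checkout")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("payment failed", extra={"order_id": "A-1042"})
```

Because every record is one machine-readable object, downstream big-data tooling can filter and aggregate logs from heterogeneous services without per-format parsers.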

&lt;h3&gt;
  
  
  Pamazon 3.0: Tracing errors with logs
&lt;/h3&gt;

&lt;p&gt;Pamazon is becoming a big success. It has a cloud/hybrid deployment environment, dealing with millions of global users over multiple device types. This means Pamela needs a more nuanced approach for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generating valuable signals.&lt;/li&gt;
&lt;li&gt;Targeting likely root causes of issues.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Take the users Alice and Bob for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Alice is in the USA using up her data connection to shop for fancy earrings on the Pamazon Android app.&lt;/li&gt;
&lt;li&gt;Bob is shopping for a black jacket from a Chilean research base on an Ubuntu PC.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7QQr5Y__--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-22.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7QQr5Y__--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-22.png" alt="" width="880" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Will Alice and Bob have similar experiences? All answers point to no. Delivering localized search results to many different devices (not to mention different languages) worldwide is a complex and daunting endeavor. &lt;strong&gt;It requires precise ways to generate signals and target likely root causes of issues.&lt;/strong&gt; More users and device types mean more and more ways for things to go wrong, and more root causes of issues.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RnpvhW_Q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-32.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RnpvhW_Q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-32.png" alt="Bob is starting to get angry at the constant errors. You won’t like him when he’s angry. Mostly because he mopes about it on Facebook ALL the time. Get out of my feed Bob!" width="880" height="503"&gt;&lt;/a&gt;Bob is starting to get angry at the constant errors. You won’t like him when he’s angry. Mostly because he mopes about it on Facebook ALL the time. Get out of my feed Bob!&lt;/p&gt;

&lt;p&gt;In short, Pamela needs to up her game again.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is OpenTelemetry?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://opentelemetry.io/"&gt;OpenTelemetry&lt;/a&gt; is a project incubated by &lt;a href="https://www.cncf.io/"&gt;CNCF&lt;/a&gt;. It brings distributed tracing and metrics. Pamela has brought us on to implement this telemetry into her Jina Flow. Before getting started we have to learn a few new terms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Telemetry"&gt;&lt;strong&gt;Telemetry&lt;/strong&gt;&lt;/a&gt; is on-site collection of measurements or other data at remote points and automatically sending them to receiving equipment for monitoring. Simply put, you can consider a log message or raw observation value of the error count during a request as a measurement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instrumentation&lt;/strong&gt; is the process of using instruments that record/collect raw observations that are transformed into signals and transmitted for monitoring purposes.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Tracing_(software)"&gt;&lt;strong&gt;Tracing&lt;/strong&gt;&lt;/a&gt; involves a specialized use of logging to record information about a program's execution.&lt;/li&gt;
&lt;/ul&gt;
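To make these terms concrete, here is a toy, stdlib-only sketch of instrumentation, not Jina's or OpenTelemetry's actual API: a decorator that records a raw measurement (duration and error count) for each call, which a telemetry pipeline would then transmit for monitoring:

```python
import functools
import time

measurements = []  # in a real system these would be exported, not kept in memory

def instrument(fn):
    """Record duration and success/failure for every call of `fn`."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            error = 0
        except Exception:
            error = 1
            raise
        finally:
            measurements.append({
                "operation": fn.__name__,
                "duration_s": time.perf_counter() - start,
                "errors": error,
            })
        return result
    return wrapper

@instrument
def search(query):
    return f"results for {query!r}"

search("black jacket")
print(measurements[-1]["operation"])  # search
```

The raw observations here are the measurements; the decorator is the instrument; shipping the `measurements` list somewhere for analysis would be the telemetry.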

&lt;p&gt;Bear these in mind, as Pamela has roped us into doing this.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to use OpenTelemetry integration in Jina?
&lt;/h2&gt;

&lt;p&gt;Jina &amp;gt;=3.12 comes with built-in OpenTelemetry integrations and features. Let's see how it works for Pamela. We now outline how to build a text-to-image search system using the following components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/docarray/docarray"&gt;DocArray&lt;/a&gt; to manipulate data and interact with the storage backend using &lt;a href="https://docarray.jina.ai/advanced/document-store/"&gt;document store&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.jina.ai/fundamentals/flow/#flow"&gt;Jina Flow&lt;/a&gt; to orchestrate microservices for data loading, encoding and storage.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.jina.ai/fundamentals/executor/#executor"&gt;Jina Executor&lt;/a&gt; as the base to implement microservices.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/open-telemetry/opentelemetry-collector-contrib"&gt;OpenTelemetry Collector Contrib&lt;/a&gt; to collect traces from the microservices.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/getsentry/self-hosted"&gt;Sentry&lt;/a&gt; to visualise operations reported by the microservices.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--t2vNE5CV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-24.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--t2vNE5CV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-24.png" alt="" width="880" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡In this post we’re just building out a backend, and not touching on a frontend. To build your own low-code backend+frontend neural search solution, check out &lt;a href="https://now.jina.ai/"&gt;Jina NOW&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Preparing the data
&lt;/h3&gt;

&lt;p&gt;We derived the dataset by pre-processing the &lt;a href="https://paperswithcode.com/dataset/deepfashion"&gt;deepfashion&lt;/a&gt; dataset using &lt;a href="https://github.com/jina-ai/finetuner"&gt;Finetuner&lt;/a&gt;. The image label generated by Finetuner is extracted and formatted to produce the &lt;code&gt;text&lt;/code&gt; attribute of each product.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1UJGQM1n--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-25.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1UJGQM1n--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-25.png" alt="" width="880" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Building a Flow with tracing enabled
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;💡You'll need Jina version ≥3.12.0 to use OpenTelemetry features.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We use a Jina Flow as a pipeline to connect our microservices (a.k.a. Executors) together. Since we don't care too much about the nitty-gritty details, we won't dive into &lt;em&gt;all&lt;/em&gt; the code, but just give a high-level overview. After all, the telemetry is the thing. Let's define the following in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;jina&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Document&lt;/span&gt;

&lt;span class="n"&gt;flow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Flow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'localhost'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;tracing&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;traces_exporter_host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'localhost'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;traces_exporter_port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4317&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'clip_encoder'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# encode images/text into vectors
&lt;/span&gt;        &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'localhost'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;51000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;timeout_ready&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;uses_with&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'ViT-B-32::openai'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;external&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;tls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'qdrant_indexer'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;uses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'jinahub+docker://QdrantIndexer'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# store vectors and metadata on disk
&lt;/span&gt;        &lt;span class="n"&gt;uses_with&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;'collection_name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'collection_name'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;'distance'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'cosine'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;'n_dim'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;💡This Flow uses pre-built Executors from Jina's &lt;a href="https://cloud.jina.ai/"&gt;Executor Hub&lt;/a&gt;, saving you time in writing code.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1PU_By_v--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/flow-2.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1PU_By_v--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/flow-2.svg" alt="flow.plot('foo.svg') will give you this nice SVG image" width="724" height="120"&gt;&lt;/a&gt;flow.plot('foo.svg') will give you this nice SVG image&lt;/p&gt;

&lt;p&gt;The Flow is composed of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;a href="https://docs.jina.ai/fundamentals/gateway/#gateway"&gt;Gateway&lt;/a&gt; that manages request flows to the underlying microservices. The tracing arguments are provided to the Flow, enabling OpenTelemetry tracing features for each deployed microservice.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://rebrand.ly/clip-as-service"&gt;CLIP-as-service&lt;/a&gt; Executor that uses the default cpu torch runtime with the &lt;code&gt;ViT-L-14-336::openai&lt;/code&gt; model for encoding text and/or image data. The &lt;code&gt;clip_encoder&lt;/code&gt; service is run using an independent Flow with the required tracing arguments.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cloud.jina.ai/executor/j3u4uwje?random=4939"&gt;QdrantIndexer&lt;/a&gt; is the backend for storing and searching the dataset using Docarray. You'll need to provide the appropriate Qdrant &lt;code&gt;host&lt;/code&gt; and &lt;code&gt;port&lt;/code&gt; parameters if the database isn't running on localhost and the default port. The vector dimension is configured using the &lt;code&gt;uses_with&lt;/code&gt; argument.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;💡For more information on building a Flow read our &lt;a href="https://docs.jina.ai/fundamentals/flow/"&gt;docs&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Indexing and searching
&lt;/h3&gt;

&lt;p&gt;Now that the components are in place, let's start adding data to our database, and then we can search for our favorite products using text. We can easily index our product images using the following code snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;jina&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt;
&lt;span class="n"&gt;da&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'deepfashion-text-preprocessed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;show_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# connect to the Flow
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"grpc://0.0.0.0:8080"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# use only 100 Documents for demo
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/index"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A sample search request looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;from&lt;/span&gt; &lt;span class="n"&gt;jina&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;DocumentArray&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="nl"&gt;grpc:&lt;/span&gt;&lt;span class="c1"&gt;//0.0.0.0:8080')&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;post&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;DocumentArray&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;jacket&lt;/span&gt; &lt;span class="n"&gt;mens&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;💡For more information on Jina Client, check our &lt;a href="https://docs.jina.ai/fundamentals/client/client/"&gt;docs&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let’s break down the index and search processes described above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;clip_encoder&lt;/code&gt; generates an embedding for the &lt;code&gt;text&lt;/code&gt; attribute of each product. The &lt;code&gt;Flow(...).add(...).add(...)&lt;/code&gt; definition creates a sequential topology by default. Requests to the Flow first pass through the &lt;code&gt;clip_encoder&lt;/code&gt; service and then the results are passed to the &lt;code&gt;qdrant_indexer&lt;/code&gt; service.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;qdrant_indexer&lt;/code&gt; implements the &lt;code&gt;/index&lt;/code&gt; endpoint for index operations and the &lt;code&gt;/search&lt;/code&gt; endpoint for search operations. These operations work on embeddings, without regard to a product's other attributes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Enabling tracing on Executors
&lt;/h3&gt;

&lt;p&gt;Now is a good time to learn a few OpenTelemetry terms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Tracing_(software)"&gt;Tracing&lt;/a&gt; involves a specialized use of logging to record information about a program's execution.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;trace&lt;/strong&gt; represents a series, parallel set, or combination of operations that were involved in producing the end result.&lt;/li&gt;
&lt;li&gt;Every trace is made up of one or more spans. Each span represents one operation in a trace, like &lt;code&gt;process_docs&lt;/code&gt;, &lt;code&gt;sanitize_text&lt;/code&gt;, or &lt;code&gt;embed_text&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
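As a toy illustration of how spans nest into a single trace (this mimics the shape of tracing APIs, but is not OpenTelemetry's real interface), consider:

```python
import contextlib
import time

trace = []  # collected spans for one request

@contextlib.contextmanager
def span(name, _depth=[0]):  # mutable default doubles as a simple nesting counter
    start = time.perf_counter()
    _depth[0] += 1
    try:
        yield
    finally:
        _depth[0] -= 1
        trace.append({
            "name": name,
            "depth": _depth[0],
            "duration_s": time.perf_counter() - start,
        })

# One trace for a document-processing request
with span("process_docs"):
    with span("sanitize_text"):
        pass
    with span("embed_text"):
        pass

print([s["name"] for s in trace])  # inner spans finish before the outer one
```

Each `with` block is one operation; the whole nested structure, outer span plus its children, is the trace for that request.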

&lt;p&gt;The Hub Executors in the above example have instrumentation integrated by default. Let’s look at a simple example of providing the span with useful tags for the &lt;code&gt;/index&lt;/code&gt; operation. The below code snippet from the &lt;code&gt;QdrantIndexer&lt;/code&gt; creates two spans:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Jina automatically creates the &lt;code&gt;/index&lt;/code&gt; span as part of the &lt;code&gt;@requests&lt;/code&gt; decorator. This is one of Jina's automatic instrumentation features, providing value out of the box.&lt;/li&gt;
&lt;li&gt;You can track more fine-grained operations such as the &lt;code&gt;qdrant_index&lt;/code&gt; span, which records the number of documents received in the request that must be indexed in Qdrant. Suspiciously quick indexing operations could be due to an empty request. On the flip side, very slow requests could be caused by too many large documents in the request. You can add more information to the span tags, such as the target &lt;a href="https://qdrant.tech/documentation/collections/"&gt;Qdrant Collection&lt;/a&gt; and other deployment-related information.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'/index'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tracing_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Index new documents
    :param docs: the Documents to index
    """&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_as_current_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;'qdrant_index'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tracing_context&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'len_docs'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that the &lt;code&gt;service.name&lt;/code&gt; attribute can be helpful if a single Qdrant cluster is used by different Flows to store different Documents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Collecting and analyzing traces
&lt;/h3&gt;

&lt;p&gt;Right now, Jina only supports the export mechanism for pushing telemetry signals to external systems. It uses the &lt;a href="https://github.com/open-telemetry/opentelemetry-collector-contrib"&gt;OpenTelemetry Collector Contrib&lt;/a&gt; as the unified component for collecting telemetry signals before exporting them to downstream components that transform the data for visualization and analysis. The collector setup is very basic and functions only as the uniform intermediary for collecting and exporting data.&lt;/p&gt;
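As a rough sketch, a minimal collector configuration for this setup might look like the following: an OTLP gRPC receiver on port 4317 (matching the `traces_exporter_port` used earlier) feeding the `sentry` exporter that ships with the contrib distribution. The DSN is a placeholder you'd replace with your own Sentry project's value:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  sentry:
    dsn: https://<key>@<organization>.ingest.sentry.io/<project>

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [sentry]
```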

&lt;p&gt;We use the self-hosted &lt;a href="https://github.com/getsentry/self-hosted"&gt;Sentry&lt;/a&gt; application landscape to set up the actual APM or SPM. We'll only explore a small set of features supported by Sentry to preserve the focus of this post. Refer to the &lt;a href="https://sentry.io/features/distributed-tracing/"&gt;documentation&lt;/a&gt; for more details.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡Application Performance Monitoring (APM), or System Performance Monitoring (SPM), is the monitoring of the performance and availability of applications. Service Level Indicators (SLIs) are used to detect and diagnose complex application performance problems and maintain the expected Service Level Objectives (SLOs).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How to use Sentry to visualize and analyze the collected data?
&lt;/h2&gt;

&lt;p&gt;Sentry has many features and custom definitions to translate telemetry signals into business terms. We'll just focus on the &lt;a href="https://docs.sentry.io/product/performance/"&gt;Performance&lt;/a&gt;, &lt;a href="https://docs.sentry.io/product/performance/transaction-summary/"&gt;Transaction Summary&lt;/a&gt;, &lt;a href="https://docs.sentry.io/product/sentry-basics/tracing/trace-view/"&gt;Trace&lt;/a&gt; and Dashboard views, using them to monitor the Flow and all Executors.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance view
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;performance view&lt;/strong&gt; gives the overall view of metric signals received by Sentry:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--T9DlVVu8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-27.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--T9DlVVu8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-27.png" alt="Click on the image to read the explanatory labels." width="880" height="480"&gt;&lt;/a&gt;Click on the image to read the explanatory labels.&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary view
&lt;/h3&gt;

&lt;p&gt;You can click a span to view the transaction summary page. In our case, we clicked the &lt;code&gt;/index&lt;/code&gt; span to bring up the &lt;code&gt;index&lt;/code&gt; &lt;strong&gt;summary:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--E1GcNTdc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-28.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--E1GcNTdc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-28.png" alt="Click on the image to read the explanatory labels." width="880" height="480"&gt;&lt;/a&gt;Click on the image to read the explanatory labels.&lt;/p&gt;

&lt;p&gt;In the above screenshot:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The TRACE ID column shows the span’s parent trace.&lt;/li&gt;
&lt;li&gt;The &lt;em&gt;green section&lt;/em&gt; lists the spans that may be slow. By clicking the trace ID we can see the full trace of the operation.&lt;/li&gt;
&lt;li&gt;The &lt;em&gt;red section&lt;/em&gt; helps diagnose and drill down to the root cause of abnormalities, errors or issues produced during this operation. The duration and status of each span are used to detect suspects and display the tags that belong to the suspect spans. We can add useful tags as span attributes to make suspect spans and tags easier to detect.&lt;/li&gt;
&lt;li&gt;The &lt;em&gt;blue section&lt;/em&gt; shows the top tags in the selected time frame. This is a more general overview to gain quick insights into different attributes that you could use to drill down to the root cause of abnormalities.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Trace view
&lt;/h3&gt;

&lt;p&gt;You can click a span’s event ID in the summary view to see its &lt;strong&gt;&lt;em&gt;trace view&lt;/em&gt;&lt;/strong&gt;. Let’s look at the trace view of a single &lt;code&gt;/search&lt;/code&gt; operation:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VDMx3qC5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-29.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VDMx3qC5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-29.png" alt="" width="880" height="233"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the figure above, the trace &lt;code&gt;f1adb01bcb9fd18f59a5b38745f07e39&lt;/code&gt; shows the end-to-end request flow with spans that were involved in generating the response. Span names are prefixed with &lt;code&gt;rpc&lt;/code&gt; or &lt;code&gt;default&lt;/code&gt; tags. These prefixes are determined by Sentry from the span tag attributes.&lt;/p&gt;

&lt;p&gt;In this screenshot, Jina automatically creates the &lt;code&gt;rpc&lt;/code&gt; spans, which represent the internal request flow. The &lt;code&gt;rpc&lt;/code&gt; spans fill in the gaps and provide a complete view on top of the user-added spans &lt;code&gt;encode&lt;/code&gt;, &lt;code&gt;inference&lt;/code&gt;, &lt;code&gt;txt_minibatch_encoding&lt;/code&gt;, &lt;code&gt;/search&lt;/code&gt; and &lt;code&gt;qdrant_search&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Before drilling down into the user-created spans, let’s look at the trace view in more depth. It is essentially a directed acyclic graph that connects the dots from the beginning of the request to its end (the bars run left to right). All of these operations are sequential, as shown by the top-down rendering of the colored bars. Each operation's duration is displayed, which is most useful when there are parallel operations or slow spans. On the left is a tree-style (top-down) representation of the high-level spans and the spans nested under each operation. There are three top-level spans because of the topology generated by the Flow.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The client request reaches the Gateway, and the Gateway triggers a call to the &lt;code&gt;clip_encoder&lt;/code&gt; Executor.&lt;/li&gt;
&lt;li&gt;The Executor creates an embedding of the search text and forwards it to the search endpoint of the &lt;code&gt;qdrant_indexer&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The search endpoint retrieves products for the search query.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;💡You can find more details and features such as displaying spans with errors in Sentry’s &lt;a href="https://docs.sentry.io/product/sentry-basics/tracing/trace-view/"&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can click any span for more information. Let's click the &lt;code&gt;clip_encoder&lt;/code&gt;'s &lt;code&gt;inference&lt;/code&gt; span, which generates the embedding for the &lt;code&gt;text&lt;/code&gt; attribute of the Documents in the request:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--L15xw1IV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-30.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--L15xw1IV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-30.png" alt="" width="880" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;has_img_da&lt;/code&gt; and &lt;code&gt;has_txt_da&lt;/code&gt; span attributes show whether the Documents contain image and/or text data.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;minibatch_size&lt;/code&gt; shows the batch size used to generate embeddings using the configured thread pool.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Since the demo dataset of 100 Documents provides only text data, we see that &lt;code&gt;has_txt_da&lt;/code&gt; is set to &lt;code&gt;True&lt;/code&gt; and that the next span contains only the &lt;code&gt;txt_minibatch_encoding&lt;/code&gt; span, in which the text embeddings are actually generated using the thread pool.&lt;/p&gt;
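&lt;p&gt;As a rough illustration of what these attribute values capture, one could compute them from the request batch like this (plain dicts stand in for Documents; the helper is hypothetical, not Jina's code):&lt;/p&gt;

```python
# Hypothetical sketch: derive the has_img_da / has_txt_da / minibatch_size
# span attributes from a batch of Documents, modeled here as plain dicts.

def span_attributes(docs, minibatch_size=32):
    return {
        "has_img_da": any(doc.get("image") is not None for doc in docs),
        "has_txt_da": any(doc.get("text") is not None for doc in docs),
        "minibatch_size": minibatch_size,
    }

# The demo dataset carries only text, so has_txt_da is True and has_img_da is False.
docs = [{"text": "red sneakers"}, {"text": "denim jacket"}]
print(span_attributes(docs))
```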

&lt;h3&gt;
  
  
  Customize your Sentry dashboard
&lt;/h3&gt;

&lt;p&gt;Lastly, let's look at a sample performance monitoring dashboard for our example Flow. The various telemetry signals are combined into a single view of the key indicators, from which we can spot discrepancies or abnormal behavior.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--DkrCEpkC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-31.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--DkrCEpkC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/11/image-31.png" alt="Click on the image to read the explanatory labels." width="880" height="480"&gt;&lt;/a&gt;Click on the image to read the explanatory labels.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡Refer to the &lt;a href="https://docs.sentry.io/product/dashboards/"&gt;documentation&lt;/a&gt; for more details on how to customize the dashboard and the error analysis graphs provided by Sentry.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;By using OpenTelemetry, we've helped Pamela build a reliable product search system for her shopping site. Pamela can now get reports and/or alerts for errors as they happen, and tracing issues becomes much easier, leading to a faster turnaround time for fixing them.&lt;/p&gt;

&lt;p&gt;She now has more time to focus on improving the website, increasing the assortment or adding new search features for her customers. This means happier customers all round, more business, and improved profits for Pamela.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/wlCWTqb3bvZkkt3Ins/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/wlCWTqb3bvZkkt3Ins/giphy.gif" alt="Yes, this really is Pamela flashing her OpenTelemetry-funded riches around. She’s tacky like that." width="480" height="270"&gt;&lt;/a&gt;Yes, this really is Pamela flashing her OpenTelemetry-funded riches around. She’s tacky like that.&lt;/p&gt;

&lt;p&gt;If you’re interested in making your Jina applications more observable and reliable with OpenTelemetry (and replicating Pamela's success), read our &lt;a href="https://docs.jina.ai/cloud-nativeness/opentelemetry/"&gt;docs&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>microservices</category>
      <category>cloudskills</category>
      <category>opentelemetry</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>This week(s) in DocArray</title>
      <dc:creator>guoliwu</dc:creator>
      <pubDate>Thu, 16 Feb 2023 08:55:25 +0000</pubDate>
      <link>https://dev.to/guoliwu/this-weeks-in-docarray-3dj9</link>
      <guid>https://dev.to/guoliwu/this-weeks-in-docarray-3dj9</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk86febag7yrsrm6b178v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk86febag7yrsrm6b178v.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡Thanks to the &lt;a href="https://www.docarray.org/" rel="noopener noreferrer"&gt;DocArray team&lt;/a&gt; for this guest blog post!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It's already been two weeks since the &lt;a href="https://github.com/docarray/docarray/releases/tag/2023.01.18.alpha" rel="noopener noreferrer"&gt;last alpha release&lt;/a&gt; of DocArray v2. A lot has happened since then: we've merged features we're really proud of, and we've cried tears of joy and misery trying to coerce Python into doing what we want. If you want to learn about interesting Python edge cases, or follow the development of DocArray v2, you’ve come to the right place!&lt;/p&gt;

&lt;p&gt;For those who don’t know, DocArray is a library for &lt;strong&gt;representing, sending, and storing multi-modal data&lt;/strong&gt;, with a focus on applications in &lt;strong&gt;ML&lt;/strong&gt; and &lt;strong&gt;Neural Search.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;👉 DocArray link: &lt;a href="https://rebrand.ly/devTo-docarray" rel="noopener noreferrer"&gt;https://rebrand.ly/devTo-docarray&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The project just moved to the &lt;a href="https://lfaidata.foundation/" rel="noopener noreferrer"&gt;Linux Foundation AI and Data&lt;/a&gt;, and to celebrate its first birthday we decided to rewrite it from scratch, mainly because of a design shift and a desire to solidify the codebase from the ground up. Also because it can’t eat cake and we had to give it &lt;em&gt;something&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;So, what's been happening in the past two weeks?&lt;/p&gt;

&lt;h2&gt;
  
  
  Less verbose API
&lt;/h2&gt;

&lt;p&gt;One of DocArray's goals is to give our users powerful abstractions to represent nested data. To do this in v2 we allow nesting of &lt;code&gt;BaseDocument&lt;/code&gt;. (Well, this is actually just a feature of &lt;a href="https://docs.pydantic.dev/" rel="noopener noreferrer"&gt;pydantic&lt;/a&gt;, and one of the reasons its design seduced us into using it as a backend.)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseDocument&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray.documents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyBanner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyPoster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;MyBanner&lt;/span&gt;
    &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;MyBanner&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a powerful design pattern, but the API is a bit too verbose when using our predefined Document class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;banner_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyBanner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hello&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;myimage.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;banner_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyBanner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bye bye&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;myimage2.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;poster&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyPoster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;banner_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;banner_2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The new API looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;banner_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyBanner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hello&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;myimage.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;banner_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyBanner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bye bye&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;myimage2.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;poster&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyPoster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;banner_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;banner_2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's waaay less verbose. Under the hood, we override pydantic's predefined document validator to allow this smart casting. But we didn't make it fully automatic: if you create your own Document, you still need to use the verbose API, because this casting isn't always obvious. For instance, look at this Document:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyDoc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
   &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyDoc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hello&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# won't work
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, where should &lt;code&gt;'hello'&lt;/code&gt; be assigned: title or description? There's no obvious way to decide, so we'd rather let the user define it, at least until we find a better way.&lt;/p&gt;

&lt;p&gt;We're thinking about either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Going by field order and making the first string field the “main” one. But this goes against one of the core values of this rewrite: “we don’t do things implicitly”.&lt;/li&gt;
&lt;li&gt;Allowing the user to mark a "main" field somehow, either with a Field object or a function.&lt;/li&gt;
&lt;/ul&gt;
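&lt;p&gt;To sketch what the second option might look like, here is a toy, plain-Python version of a "main" field marker. This is not DocArray's API, just an illustration of the design being discussed:&lt;/p&gt;

```python
# Toy illustration (NOT DocArray's API): a class attribute marks the "main"
# field, and a lone positional argument is routed to it.

class MainFieldDoc:
    main_field = "title"

    def __init__(self, *args, **kwargs):
        if args and not kwargs:
            # A single positional argument goes to the marked main field.
            kwargs = {self.main_field: args[0]}
        for name, value in kwargs.items():
            setattr(self, name, value)

class MyDoc(MainFieldDoc):
    main_field = "title"

doc = MyDoc("hello")  # unambiguous: 'hello' goes to the marked main field
print(doc.title)
```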

&lt;p&gt;From the outside, it looks like a minor problem. But we believe the real devil is in the details, so we spent countless hours arguing over such a simple API. Man, that's time we won't get back. 💁‍♂️&lt;/p&gt;

&lt;p&gt;Curious? Check out this PR:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;👉 DocArray PR: &lt;a href="https://rebrand.ly/docarray-PR" rel="noopener noreferrer"&gt;https://rebrand.ly/docarray-PR&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;__torch_function__&lt;/code&gt; , or: How to give PyTorch a little bit more confidence
&lt;/h2&gt;

&lt;p&gt;We had a lot of fun wrapping our heads around the &lt;code&gt;__torch_function__&lt;/code&gt; concept.&lt;/p&gt;

&lt;p&gt;Our &lt;code&gt;TorchTensor&lt;/code&gt; class is a subclass of &lt;code&gt;torch.Tensor&lt;/code&gt; that injects some useful functionality (mainly the ability to express its shape at the type level: &lt;code&gt;TorchTensor[3, 224, 224]&lt;/code&gt;, and protobuf serialization), and PyTorch comes with a whole machinery around subclassing, dynamic dispatch and all that jazz.&lt;/p&gt;

&lt;p&gt;One part of this machinery is &lt;code&gt;__torch_function__&lt;/code&gt;, a magic method that allows all kinds of objects to be treated like Torch Tensors. Want instances of your class to be processed by functions like &lt;code&gt;torch.stack([your_instance, another_instance])&lt;/code&gt;, or to be added directly to a &lt;code&gt;torch.Tensor&lt;/code&gt;? No problem: just implement &lt;code&gt;__torch_function__&lt;/code&gt; in your class, handle it there, and off you go! There's no need to even subclass &lt;code&gt;torch.Tensor&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyClass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;others&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_others&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;others&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__torch_function__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stack&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# we know how to handle these!
&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;combine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# ... but are clueless about the rest
&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;NotImplemented&lt;/span&gt;

    &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;combine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;others&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;others&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;others&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nc"&gt;MyClass&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nc"&gt;MyClass&lt;/span&gt;&lt;span class="p"&gt;()]))&lt;/span&gt;
&lt;span class="c1"&gt;# outputs:
# &amp;lt;__main__.MyClass object at 0x7fd290c55190&amp;gt;
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nc"&gt;MyClass&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="c1"&gt;# outputs:
# &amp;lt;__main__.MyClass object at 0x7f363e2ed0d0&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, the example above isn’t a very useful one, but you get the idea: &lt;code&gt;__torch_function__&lt;/code&gt; lets you create objects that &lt;em&gt;behave&lt;/em&gt; like Torch Tensors without &lt;em&gt;being&lt;/em&gt; Torch Tensors.&lt;/p&gt;

&lt;p&gt;But hold on. Instances of &lt;code&gt;TorchTensor&lt;/code&gt; &lt;em&gt;are&lt;/em&gt; Torch Tensors, since they directly inherit from &lt;code&gt;torch.Tensor&lt;/code&gt;! So all the functionality is already there, we inherit &lt;code&gt;__torch_function__&lt;/code&gt; from &lt;code&gt;torch.Tensor&lt;/code&gt;, and we don’t need to care about any of this, right?&lt;/p&gt;

&lt;p&gt;Well, not quite.&lt;/p&gt;

&lt;p&gt;The thing is, we don’t just have one subclass of &lt;code&gt;torch.Tensor&lt;/code&gt;; we have many: &lt;code&gt;TorchTensor&lt;/code&gt; is the obvious one, but there's also &lt;code&gt;TorchTensor[3, 224, 224]&lt;/code&gt;, &lt;code&gt;TorchTensor[128]&lt;/code&gt; and &lt;code&gt;TorchTensor['batch', 'c', 'w', 'h']&lt;/code&gt;, etc. All of these are separate classes!&lt;/p&gt;

&lt;p&gt;To be a bit more precise, all the parameterized classes (the ones with &lt;code&gt;[...]&lt;/code&gt; at the end) are direct subclasses of &lt;code&gt;TorchTensor&lt;/code&gt; and are &lt;strong&gt;siblings of one another&lt;/strong&gt; (this becomes important later on).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;                                    torch.Tensor
                                         ^
                                         |
       ---------------------------&amp;gt; TorchTensor &amp;lt;------
      ^                   ^                            ^
      |                   |           ....             |
TorchTensor[128] TorchTensor[1, 128]  ....   TorchTensor['batch', 'c', 'w', 'h']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So where's the problem?&lt;/p&gt;

&lt;p&gt;The problem essentially lies in the &lt;code&gt;types&lt;/code&gt; argument to &lt;code&gt;__torch_function__&lt;/code&gt;. It contains the types of all the arguments that were passed to the original PyTorch function call. In the &lt;code&gt;stack&lt;/code&gt; example above, this would just be the tuple &lt;code&gt;(MyClass, MyClass)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is meant purely as a convenience for the implementer of &lt;code&gt;__torch_function__&lt;/code&gt;: it lets them quickly decide, based on the types, whether they can handle a given input.&lt;/p&gt;

&lt;p&gt;Let’s take a look at how the default PyTorch (&lt;code&gt;torch.Tensor&lt;/code&gt;) implementation of &lt;code&gt;__torch_function__&lt;/code&gt; makes that decision:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@classmethod&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__torch_function__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# ... some stuff here
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;issubclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;NotImplemented&lt;/span&gt;
    &lt;span class="c1"&gt;# ... more stuff here
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Can you already guess where things go wrong?&lt;/p&gt;

&lt;p&gt;Let me give you a hint by showing a failure case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TorchTensor&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;TorchTensor&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When this call is handled in &lt;code&gt;__torch_function__&lt;/code&gt; , as inherited from &lt;code&gt;torch.Tensor&lt;/code&gt;, &lt;code&gt;cls&lt;/code&gt; will be &lt;code&gt;TorchTensor[128]&lt;/code&gt; and &lt;code&gt;types&lt;/code&gt; will contain &lt;code&gt;TorchTensor[1, 128]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That makes sense: those are the two classes involved in this addition.&lt;/p&gt;

&lt;p&gt;But what will PyTorch do?&lt;/p&gt;

&lt;p&gt;It will throw up its hands and give up!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;TypeError: unsupported operand &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; +: &lt;span class="s1"&gt;'TorchTensor[128]'&lt;/span&gt; and &lt;span class="s1"&gt;'TorchTensor[1, 128]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;TorchTensor[128]&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; a subclass of &lt;code&gt;TorchTensor[1, 128]&lt;/code&gt;; they're siblings! So the subclass check above will fail and PyTorch will announce that it has &lt;em&gt;absolutely no clue&lt;/em&gt; about how to combine instances of these two classes.&lt;/p&gt;
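&lt;p&gt;To see why the check fails, here's a minimal plain-Python sketch (no torch involved; the class names are illustrative stand-ins, not DocArray's actual classes): two subclasses of a common base are siblings, so neither passes an &lt;code&gt;issubclass&lt;/code&gt; check against the other.&lt;/p&gt;

```python
# Plain-Python stand-ins for the parametrized tensor classes; the names
# are hypothetical, chosen only to mirror the situation in the text.
class Tensor:
    pass

class Tensor128(Tensor):       # stands in for TorchTensor[128]
    pass

class Tensor1x128(Tensor):     # stands in for TorchTensor[1, 128]
    pass

# This mirrors the subclass check in __torch_function__ above:
cls = Tensor128
types = (Tensor128, Tensor1x128)
print(all(issubclass(cls, t) for t in types))  # False: siblings, not subclasses
```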

&lt;p&gt;But c'mon PyTorch! Both these classes inherit from &lt;code&gt;torch.Tensor&lt;/code&gt;! Believe in yourself, you &lt;em&gt;do&lt;/em&gt; know how to deal with them! Just treat them like normal tensors!&lt;/p&gt;

&lt;p&gt;And that’s already the solution to the entire problem: We need to give PyTorch a little confidence boost, by telling it to treat our custom classes just like the &lt;code&gt;torch.Tensor&lt;/code&gt; class it already knows and loves.&lt;/p&gt;

&lt;p&gt;So how do we give it this metaphorical pep talk? It’s actually quite simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@classmethod&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__torch_function__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# this tells torch to treat all of our custom tensors just like
&lt;/span&gt;    &lt;span class="c1"&gt;# torch.Tensor's. Otherwise, torch will complain that it doesn't
&lt;/span&gt;    &lt;span class="c1"&gt;# know how to handle our custom tensor type.
&lt;/span&gt;    &lt;span class="n"&gt;docarray_torch_tensors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TorchTensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;__subclasses__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;types_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;docarray_torch_tensors&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__torch_function__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;types_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the implementation of &lt;code&gt;__torch_function__&lt;/code&gt; that currently powers &lt;code&gt;TorchTensor&lt;/code&gt;. It does just one thing: for any class that's a subclass of &lt;code&gt;TorchTensor&lt;/code&gt;, it changes the &lt;code&gt;types&lt;/code&gt; argument before passing it along to the default implementation of &lt;code&gt;__torch_function__&lt;/code&gt;. It replaces all such types with &lt;code&gt;torch.Tensor&lt;/code&gt;, telling PyTorch that it's got this!&lt;/p&gt;
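&lt;p&gt;The substitution trick itself is independent of torch. Here's a hedged, torch-free sketch of the same idea (hypothetical names, not the real implementation): map every custom sibling type back to the common base before the dispatcher runs its subclass check.&lt;/p&gt;

```python
class Tensor:
    pass

class Tensor128(Tensor):       # illustrative stand-in for TorchTensor[128]
    pass

class Tensor1x128(Tensor):     # illustrative stand-in for TorchTensor[1, 128]
    pass

def normalize_types(types, custom_types, base=Tensor):
    # Replace every custom subclass with the common base class, mirroring
    # what the TorchTensor implementation does with torch.Tensor.
    return tuple(base if t in custom_types else t for t in types)

types_ = normalize_types((Tensor128, Tensor1x128),
                         custom_types={Tensor128, Tensor1x128})
print(all(issubclass(Tensor128, t) for t in types_))  # True: check now passes
```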

&lt;p&gt;Et voilà, it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TorchTensor&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;TorchTensor&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# outputs:
# TorchTensor[128]([0.0454, 1.3724, ..., 1.3329, 0.9239,])
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This PR demonstrates how we coached PyTorch into having a little more self-esteem and being its truest, best self:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/docarray/docarray/pull/1037/files" rel="noopener noreferrer"&gt;https://github.com/docarray/docarray/pull/1037/files&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Early support for DocArray v2 in Jina
&lt;/h2&gt;

&lt;p&gt;Well, it's not exactly a new feature, but we've been working on early support for DocArray v2 in &lt;a href="https://github.com/jina-ai/jina/" rel="noopener noreferrer"&gt;Jina&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;DocArray’s relation to Jina is similar to pydantic’s relation to &lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FastAPI is an HTTP framework that uses pydantic models to define the API schema.&lt;/li&gt;
&lt;li&gt;Jina is a gRPC/HTTP framework that uses DocArray Documents to define the API schema.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are other conceptual differences, of course, but this analogy is a useful lens for understanding the new changes in Jina. DocArray is in fact built on top of pydantic, adding a layer of multi-modal machine learning on top.&lt;/p&gt;

&lt;p&gt;Here's an example of the new interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;jina&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Executor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray.documents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;docarray.typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnyTensor&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;InputDoc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OutputDoc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AnyTensor&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyExec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Executor&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nd"&gt;@requests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/bar&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;InputDoc&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;OutputDoc&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;docs_return&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;OutputDoc&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;OutputDoc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;docs_return&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The main difference is that an Executor doesn't have to modify Documents in place; it can return a different Document type. In the example above, a toy encoder takes an image as input and returns an embedding. Similar to FastAPI, we infer the input and output schema of the Executor by inspecting the type hints of the method. You can also pass this information as an argument if you don’t want to rely on type hints.&lt;/p&gt;
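&lt;p&gt;To illustrate the idea (a stdlib-only sketch, not Jina's actual inference logic), the input and output document types can be recovered from a method's annotations with the &lt;code&gt;typing&lt;/code&gt; module; here &lt;code&gt;list[...]&lt;/code&gt; stands in for &lt;code&gt;DocumentArray[...]&lt;/code&gt;:&lt;/p&gt;

```python
import typing

class InputDoc:
    pass

class OutputDoc:
    pass

# A hypothetical endpoint whose annotations carry the schema.
def bar(docs: list[InputDoc]) -> list[OutputDoc]:
    return [OutputDoc() for _ in docs]

hints = typing.get_type_hints(bar)
input_doc_type = typing.get_args(hints['docs'])[0]    # inner type of the input
output_doc_type = typing.get_args(hints['return'])[0]  # inner type of the output
print(input_doc_type.__name__, output_doc_type.__name__)  # InputDoc OutputDoc
```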

&lt;blockquote&gt;
&lt;p&gt;💡&lt;a href="https://feat-docarray-v2--jina-docs.netlify.app/concepts/executor/docarray-v2/" rel="noopener noreferrer"&gt;Check the v2 docs for more information&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Here's the PR:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://rebrand.ly/docarrayV2-PR" rel="noopener noreferrer"&gt;https://rebrand.ly/docarrayV2-PR&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Pretty printing
&lt;/h2&gt;

&lt;p&gt;We ported back the pretty printing from DocArray v1 to v2 and tidied it up a bit to reflect the new v2 schema! Under the hood, we're relying on the awesome &lt;a href="https://github.com/Textualize/rich" rel="noopener noreferrer"&gt;rich&lt;/a&gt; library for everything related to UI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcfacyuffet4bt37awj3x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcfacyuffet4bt37awj3x.png" width="800" height="508"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fac1p8ufwk54x9pymzxn7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fac1p8ufwk54x9pymzxn7.png" width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Check the PR for more info! &lt;br&gt;
👉&lt;a href="https://rebrand.ly/docarrayV2-Pretty-printing" rel="noopener noreferrer"&gt;https://rebrand.ly/docarrayV2-Pretty-printing&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Document Stores
&lt;/h2&gt;

&lt;p&gt;We’re currently completely rethinking &lt;a href="https://docs.docarray.org/advanced/document-store/" rel="noopener noreferrer"&gt;Document Stores&lt;/a&gt;. The main points are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every Document Store will have a &lt;strong&gt;schema&lt;/strong&gt; assigned, just like a DocumentArray, but with more (backend-dependent) options and configurations.&lt;/li&gt;
&lt;li&gt;First-class support for &lt;strong&gt;hybrid search and multi-vector search.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Support for &lt;strong&gt;search on nested Documents.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are curious about the &lt;strong&gt;full (preliminary) design&lt;/strong&gt; you can check it out in detail &lt;a href="https://lightning-scent-57a.notion.site/Document-Stores-v2-design-doc-f11d6fe6ecee43f49ef88e0f1bf80b7f" rel="noopener noreferrer"&gt;here&lt;/a&gt;. But here's a small taster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# define schema
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyDoc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseDocument&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ImageUrl&lt;/span&gt;
    &lt;span class="n"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TorchTensor&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;da&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MyDoc&lt;/span&gt;&lt;span class="p"&gt;](...)&lt;/span&gt;  &lt;span class="c1"&gt;# data to index
&lt;/span&gt;
&lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentStore&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MyDoc&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;storage&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;MyFavDB&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;

&lt;span class="c1"&gt;# index data
&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# search through query builder
&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query_builder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# build complex (composite) query
&lt;/span&gt;    &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(...),&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(...),&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price &amp;lt; 200&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;jeans&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
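&lt;p&gt;One way to picture the query builder above is as a context manager that simply records clauses for the store to translate into a backend query later. Here's a hedged, stdlib-only sketch of that pattern (a hypothetical class, not the actual DocumentStore API):&lt;/p&gt;

```python
class QueryBuilder:
    """Collects query clauses; a store backend would translate them later."""

    def __init__(self):
        self.clauses = []

    def find(self, vector, field, weight=1.0):
        # Vector search clause against one field, optionally weighted.
        self.clauses.append(('find', field, weight))

    def text_search(self, text, field):
        # Full-text search clause against one field.
        self.clauses.append(('text_search', field, text))

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False  # don't swallow exceptions

with QueryBuilder() as q:
    q.find([0.1, 0.2], field='image', weight=0.3)
    q.text_search('jeans', field='title')

print([clause[0] for clause in q.clauses])  # ['find', 'text_search']
```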



&lt;p&gt;Beyond the first designs that are just now finding their way into actual code, we're happy to share that we're &lt;strong&gt;closely collaborating with &lt;a href="https://weaviate.io/" rel="noopener noreferrer"&gt;Weaviate&lt;/a&gt;&lt;/strong&gt; to make our Document Stores as good as they can be!&lt;/p&gt;

&lt;p&gt;So far they’ve provided a lot of valuable input for our designs, and we’re looking forward to the collaboration during actual implementation.&lt;/p&gt;

&lt;p&gt;Lastly, a word about &lt;strong&gt;Document Store launch plans&lt;/strong&gt;: Our current plan is to launch this reincarnation of Document Stores with &lt;strong&gt;three supported backends: Weaviate, &lt;a href="https://www.elastic.co/" rel="noopener noreferrer"&gt;Elasticsearch&lt;/a&gt;&lt;/strong&gt;, and one &lt;strong&gt;on-device vector search&lt;/strong&gt; library (which one? That's still TBD).&lt;/p&gt;

&lt;p&gt;Unfortunately our capacity doesn't allow for more on launch day, but if you (yes, &lt;em&gt;you&lt;/em&gt;!) want to &lt;strong&gt;help us&lt;/strong&gt; accelerate development for one of the other vector databases, we would absolutely love that and accelerate our timelines accordingly. If you feel intrigued, &lt;strong&gt;&lt;a href="https://discord.gg/WaMp6PVPgR" rel="noopener noreferrer"&gt;reach out to us on Discord&lt;/a&gt;&lt;/strong&gt;!&lt;/p&gt;

&lt;h2&gt;
  
  
  Author
&lt;/h2&gt;

&lt;p&gt;Johannes Messner, Alex C-G, Sami Jaghouar&lt;/p&gt;

&lt;h2&gt;
  
  
  Original Link
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://jina.ai/news/this-week-in-docarray-1/" rel="noopener noreferrer"&gt;https://jina.ai/news/this-week-in-docarray-1/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>gratitude</category>
    </item>
    <item>
      <title>How to Personalize Stable Diffusion for ALL the Things</title>
      <dc:creator>guoliwu</dc:creator>
      <pubDate>Wed, 15 Feb 2023 06:33:41 +0000</pubDate>
      <link>https://dev.to/guoliwu/how-to-personalize-stable-diffusion-for-all-the-things-25m3</link>
      <guid>https://dev.to/guoliwu/how-to-personalize-stable-diffusion-for-all-the-things-25m3</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08kojxzub8cvce2phzfb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08kojxzub8cvce2phzfb.png" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp9o24c3hjkn917lp2mrw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp9o24c3hjkn917lp2mrw.png" alt=" " width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Jina AI is really into generative AI. It started out with &lt;a href="https://github.com/jina-ai/dalle-flow" rel="noopener noreferrer"&gt;DALL·E Flow&lt;/a&gt;, swiftly followed by &lt;a href="https://colab.research.google.com/github/jina-ai/discoart/blob/main/discoart.ipynb" rel="noopener noreferrer"&gt;DiscoArt&lt;/a&gt;. And then… &lt;strong&gt;🦗🦗&lt;/strong&gt; &lt;em&gt;&amp;lt;cricket sounds&amp;gt;&lt;/em&gt; &lt;strong&gt;🦗🦗&lt;/strong&gt;. At least for a while…&lt;/p&gt;

&lt;p&gt;That while has ended. We’re Back In the Game, baby. Big time, with our new BIG metamodel. You might be wondering: what’s so meta about it? Before, you needed multiple Stable Diffusion models; now, with BIG, you have one model for everything.&lt;/p&gt;

&lt;p&gt;BIG metamodel: &lt;a href="https://rebrand.ly/Big-MetaModel" rel="noopener noreferrer"&gt;https://rebrand.ly/Big-MetaModel&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;BIG stands for Back In (the) Game. If we ever get out of the game and get Back At the Game Again, we’ll have to go with BAGA. I hope we can afford the baseball caps.&lt;/p&gt;

&lt;p&gt;In short, BIG lets you fine-tune Stable Diffusion to the next level, letting you create images of multiple subjects and in any style you want. That means you can take a picture of you and a picture of your pooch and combine them into a composite image in the style of Picasso, Pixar or pop art.&lt;/p&gt;

&lt;p&gt;We created BIG by taking the &lt;a href="https://doi.org/10.48550/arXiv.2208.12242" rel="noopener noreferrer"&gt;DreamBooth paper&lt;/a&gt;, which allows fine-tuning with one subject, and leveling it up into a metamodel that can learn &lt;em&gt;multiple&lt;/em&gt; new objects without using up all your compute. In this blog post we’ll go over how we did that, and how well it works.&lt;/p&gt;

&lt;p&gt;But first, let’s take a quick look at how we got here, by starting off with Stable Diffusion itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stable Diffusion: Fine-tune to your favorite artist (but forget everyone else)
&lt;/h2&gt;

&lt;p&gt;In the beginning there was Stable Diffusion and it was good. “Create a Banksy picture” you would say, and verily a Banksy would be created. “Create an artwork in the style of Picasso” you would exclaim. And verily an image of a woman with too many angles would be created.&lt;/p&gt;

&lt;p&gt;“Generate an image in the style of &lt;a href="https://www.leonloewentraut.de/" rel="noopener noreferrer"&gt;Leon Löwentraut&lt;/a&gt;” you proclaim. And Stable Diffusion did say “uh, what? lol, I’ll give it my best.” And verily it was rubbish.&lt;/p&gt;

&lt;p&gt;Luckily, this can be fixed by fine-tuning (yeah, we’re dropping the Biblical speak). If you feed Stable Diffusion a Leon Löwentraut image it can learn his style (using, for example, &lt;a href="https://huggingface.co/docs/diffusers/training/text2image" rel="noopener noreferrer"&gt;text-to-image fine-tuning for Stable Diffusion&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzevp6npx56oto2um5j4c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzevp6npx56oto2um5j4c.png" alt="Left: Generated image by Stable Diffusionwith prompt  raw `a banksy painting` endraw , before fine-tuning; Right: Generated images for prompt  raw `a banksy painting` endraw , after fine-tuning to Löwentraut." width="800" height="388"&gt;&lt;/a&gt;Left: Generated image by Stable Diffusionwith prompt &lt;code&gt;a banksy painting&lt;/code&gt;, before fine-tuning; Right: Generated images for prompt &lt;code&gt;a banksy painting&lt;/code&gt;, after fine-tuning to Löwentraut.&lt;/p&gt;

&lt;p&gt;The only problem is it gets amnesia for everything else it’s learned before. So if you then try to create the style of Banksy or Picasso on your newly fine-tuned model they all turn out pretty Löwentrautian:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxdvxsjfwyzfueokcc03p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxdvxsjfwyzfueokcc03p.png" alt="Left: An actual painting from Leon Löwentraut, Right: Generated image for Leon Löwentraut." width="800" height="393"&gt;&lt;/a&gt;Left: An actual painting from Leon Löwentraut, Right: Generated image for Leon Löwentraut.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsctavyrx8zxkhuhdeg8i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsctavyrx8zxkhuhdeg8i.png" alt="Left: Generated image by Stable Diffusion with prompt  raw `a picasso painting` endraw , before fine-tuning; Right: Generated image for prompt  raw `a picasso painting` endraw , after fine-tuning to Löwentraut" width="800" height="396"&gt;&lt;/a&gt;Left: Generated image by Stable Diffusion with prompt &lt;code&gt;a picasso painting&lt;/code&gt;, before fine-tuning; Right: Generated image for prompt &lt;code&gt;a picasso painting&lt;/code&gt;, after fine-tuning to Löwentraut.&lt;/p&gt;

&lt;h2&gt;
  
  
  DreamBooth: Fine-tune to your favorite artist (and remember!)
&lt;/h2&gt;

&lt;p&gt;DreamBooth fixes that. At least to a point. You want to train it for your dog? Piece of cake. Fine-tune it for your favorite artist? A walk in the park. And Mona Lisa would still look like it came from Leonardo and Starry Night from Van Gogh.&lt;/p&gt;

&lt;p&gt;It does this by extending Stable Diffusion’s fine-tuning loss with a prior preservation loss, which trains the model to keep generating diverse images for the broader category of the new style (e.g. paintings) or object (e.g. dogs). The prior preservation loss is the mean squared error, computed in the latent space, between the images the model now generates for the category and the images the pre-trained model generated for it.&lt;/p&gt;
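&lt;p&gt;In pseudocode terms (a simplified sketch under our reading of the paper, not DreamBooth's actual training code), the objective combines a reconstruction term on the subject images with a weighted prior-preservation term, both mean squared errors over latents:&lt;/p&gt;

```python
def mse(a, b):
    # Mean squared error between two flat latent vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def dreambooth_loss(pred_subject, target_subject,
                    pred_prior, target_prior, prior_weight=1.0):
    # Reconstruction loss on the new subject, plus a prior-preservation
    # term that anchors the model to its pre-training outputs for the category.
    return (mse(pred_subject, target_subject)
            + prior_weight * mse(pred_prior, target_prior))

# A perfectly reconstructed subject still pays for drifting priors:
print(dreambooth_loss([1.0, 2.0], [1.0, 2.0], [0.0, 0.0], [2.0, 2.0]))  # 4.0
```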

&lt;p&gt;This fine-tuning involves two prompts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a &lt;code&gt;[CATEGORY]&lt;/code&gt;: The prompt for the prior preservation loss is the category of the style or object in question, like &lt;code&gt;dog&lt;/code&gt; or &lt;code&gt;painting&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;a &lt;code&gt;[RARE_IDENTIFIER] [CATEGORY]&lt;/code&gt;: The prompt for fine-tuning to a new object or style, generally a string that corresponds to a token the model is unfamiliar with. This is a unique reference to the object you want Stable Diffusion to learn. Example strings would be &lt;code&gt;sks&lt;/code&gt; or &lt;code&gt;btb&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
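&lt;p&gt;Putting the two prompts together (the identifiers below are just the examples from the text):&lt;/p&gt;

```python
category = 'dog'          # the [CATEGORY], used for prior preservation
rare_identifier = 'sks'   # the [RARE_IDENTIFIER], a token the model doesn't know

instance_prompt = f'a {rare_identifier} {category}'  # learns the new subject
class_prompt = f'a {category}'                       # keeps the category diverse

print(instance_prompt)  # a sks dog
print(class_prompt)     # a dog
```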

&lt;p&gt;So, to fine-tune Stable Diffusion to create images of your dog, you would:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Take 5-8 high quality images of your dog.&lt;/li&gt;
&lt;li&gt;Fine-tune the model to recreate these images for the prompt &lt;code&gt;a sks dog&lt;/code&gt; and at the same time still create diverse images for the prompt &lt;code&gt;a dog&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Creating diverse images is helped along the way by generating images for the prompt &lt;code&gt;a dog&lt;/code&gt; and using them as training images.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7o2gtxbb68qe8vwein3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7o2gtxbb68qe8vwein3.png" alt=" " width="800" height="627"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Unstable Confusion: The amnesia creeps back in
&lt;/h2&gt;

&lt;p&gt;So far, so good! But what if you first use DreamBooth to fine-tune Stable Diffusion on your dog, then train on Leon Löwentraut, &lt;em&gt;then&lt;/em&gt; ask it to create a picture of your dog in his style? Or train for &lt;code&gt;artist_1&lt;/code&gt;, then train for &lt;code&gt;artist_2&lt;/code&gt;, then try to create a new image by &lt;code&gt;artist_1&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;That shouldn’t be too hard, right?&lt;/p&gt;

&lt;p&gt;Too hard for DreamBooth unfortunately.&lt;/p&gt;

&lt;p&gt;DreamBooth falls over on this because it has a selective short-term memory. Using it to teach Stable Diffusion something new (let’s say the style of Löwentraut) works great. And then you can create images of all kinds of places and objects (already known to Stable Diffusion) in his style. But then if you decide to train it in the style of another artist it’ll forget everything it learned about Löwentraut’s style.&lt;/p&gt;

&lt;p&gt;That’s why we created BIG: To let you train on multiple objects and styles &lt;em&gt;without&lt;/em&gt; the amnesia. But more on that in a later section.&lt;/p&gt;

&lt;p&gt;To see DreamBooth’s amnesia in action, let’s use it to fine-tune a model for two different artists:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Leon Löwentraut (using the &lt;code&gt;RARE_IDENTIFIER&lt;/code&gt; of &lt;code&gt;lnl&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.instagram.com/vexx/" rel="noopener noreferrer"&gt;Vexx&lt;/a&gt; (using the &lt;code&gt;RARE_IDENTIFIER&lt;/code&gt; of &lt;code&gt;qzq&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsrdv06liselez8umumns.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsrdv06liselez8umumns.png" alt="Left: Another example of an actual Leon Löwentraut painting; Right: An image from Vexx.." width="800" height="408"&gt;&lt;/a&gt;Left: Another example of an actual Leon Löwentraut painting; Right: An image from Vexx.&lt;/p&gt;

&lt;p&gt;To generate a painting in one of the above styles, we’d use a prompt like &lt;code&gt;a lnl painting&lt;/code&gt; or &lt;code&gt;a qzq painting&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Using DreamBooth and the prompt &lt;code&gt;a lnl painting&lt;/code&gt; to fine-tune a model to fit the art style of Leon Löwentraut works great.  For this we used four training images and trained for 400 steps with a learning rate of &lt;code&gt;1e-6&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The left and center images show the model before fine-tuning. Note how it doesn’t know either &lt;code&gt;lnl&lt;/code&gt; or &lt;code&gt;loewentr&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhrttiqgp9fy1q6fojowa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhrttiqgp9fy1q6fojowa.png" alt="Left: Generated image by Stable Diffusion with prompt  raw `a qzq painting` endraw ; Center: Generated image by Stable Diffusion with prompt  raw `a vexx painting` endraw ; Right: Generated image for  raw `a qzq painting` endraw  after fine-tuning for Vexx." width="800" height="265"&gt;&lt;/a&gt;Left: Generated image by Stable Diffusion with prompt &lt;code&gt;a qzq painting&lt;/code&gt;; Center: Generated image by Stable Diffusion with prompt &lt;code&gt;a vexx painting&lt;/code&gt;; Right: Generated image for &lt;code&gt;a qzq painting&lt;/code&gt; after fine-tuning for Vexx.&lt;/p&gt;

&lt;p&gt;But can the model still produce images in the styles of Picasso, Matisse and Banksy? Yes!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13jjrztm2bim7g0s9zy4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13jjrztm2bim7g0s9zy4.png" alt="Left: Generated image for prompt  raw `a banksy painting` endraw  after fine-tuning for Vexx; Center: Generated image for prompt  raw `a mattisse painting` endraw  after fine-tuning for Vexx; Right: Generated image for prompt  raw `a picasso painting` endraw  after fine-tuning for Vexx." width="800" height="800"&gt;&lt;/a&gt;Left: Generated image for prompt &lt;code&gt;a banksy painting&lt;/code&gt; after fine-tuning for Vexx; Center: Generated image for prompt &lt;code&gt;a mattisse painting&lt;/code&gt; after fine-tuning for Vexx; Right: Generated image for prompt &lt;code&gt;a picasso painting&lt;/code&gt; after fine-tuning for Vexx.&lt;/p&gt;

&lt;p&gt;Now, after learning Vexx, does our model still remember Leon Löwentraut?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0vtgrgulikhf78e8xs8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0vtgrgulikhf78e8xs8.png" alt="Generated image for  raw `a lnl painting` endraw  after fine-tuning for Vexx." width="800" height="257"&gt;&lt;/a&gt;Generated image for &lt;code&gt;a lnl painting&lt;/code&gt; after fine-tuning for Vexx.&lt;/p&gt;

&lt;h2&gt;
  
  
  How can we cure DreamBooth’s amnesia?
&lt;/h2&gt;

&lt;p&gt;To solve the problem of forgetting Leon Löwentraut while learning Vexx, we included the images of Leon Löwentraut in the prior preservation loss during fine-tuning. This is equivalent to continuing to fine-tune on Löwentraut while fine-tuning on Vexx, though more weakly than the original Löwentraut fine-tuning. It works best to reuse the actual images of the style rather than the images the model generated.&lt;/p&gt;

&lt;p&gt;So, now we can generate all the artists we encounter in our travels. After teaching the Stable Diffusion model to create images in the style of Leon Löwentraut from the prompt &lt;code&gt;a lnl painting&lt;/code&gt;, we wanted to create images of our favourite mate tea bottle. So, again we used the Leon Löwentraut fine-tuned model as initialization and trained it to create images of &lt;a href="https://de.wikipedia.org/wiki/Mio_Mio_Mate" rel="noopener noreferrer"&gt;Mio Mio Mate&lt;/a&gt; for &lt;code&gt;a sks bottle&lt;/code&gt; (giving it a unique &lt;code&gt;RARE_IDENTIFIER&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmn4suef7au4s0ib0xk5.png%250A" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmn4suef7au4s0ib0xk5.png%250A" alt="Left: Picture taken of a mio mio mate bottle, used for fine-tuning; Center: Generated image by Stable Diffusion with prompt:  raw `a mio mio mate bottle` endraw ; Right: Generated image by Stable Diffusion with prompt:  raw `a sks bottle` endraw ." width="800" height="400"&gt;&lt;/a&gt;Left: Picture taken of a mio mio mate bottle, used for fine-tuning; Center: Generated image by Stable Diffusion with prompt: &lt;code&gt;a mio mio mate bottle&lt;/code&gt;; Right: Generated image by Stable Diffusion with prompt: &lt;code&gt;a sks bottle&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Again, this works for the new object, yet the model doesn’t quite remember how to produce images for Leon Löwentraut under &lt;code&gt;a lnl painting&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fms84pxnlx0jxa646eheb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fms84pxnlx0jxa646eheb.png" alt="Left: Generated image for  raw `a sks bottle` endraw  after fine-tuning for Mio Mio bottle; Right: Generated image for  raw `a lnl painting` endraw  after fine-tuning for Mio Mio bottle." width="800" height="400"&gt;&lt;/a&gt;Left: Generated image for &lt;code&gt;a sks bottle&lt;/code&gt; after fine-tuning for Mio Mio bottle; Right: Generated image for &lt;code&gt;a lnl painting&lt;/code&gt; after fine-tuning for Mio Mio bottle.&lt;/p&gt;

&lt;p&gt;To solve this issue, we can again use the previous images of Leon Löwentraut in the prior preservation loss. This helps the model remember his style. Yet, art styles with similar geometric features (like Picasso) are then not as accurately reproduced. This makes intuitive sense and is also the reason DreamBooth was introduced in the first place. Building on this, we need to incorporate not only the images of Leon Löwentraut into the prior preservation loss but also images of paintings in general, i.e., additionally include the previously learned objects/styles and their categories.&lt;/p&gt;

&lt;h2&gt;
  
  
  BIG Metamodel: Fine-tune Stable Diffusion to your favorite artist and dog
&lt;/h2&gt;

&lt;p&gt;Now, piecing together the above ideas raises the question: how do we allocate the images for the prior preservation loss, given a batch that always consists of &lt;code&gt;N&lt;/code&gt; instance images and &lt;code&gt;N&lt;/code&gt; prior preservation loss images? The following intuitive split works great:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use half of the prior preservation loss images for the current category and its previously learned instances

&lt;ul&gt;
&lt;li&gt;50% of those are generated images for the category&lt;/li&gt;
&lt;li&gt;Remaining 50% equally divided among previously used instance images&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Split the other half equally among the previously trained categories; for every such category, split its available images into:

&lt;ul&gt;
&lt;li&gt;50% generated images for category (prompt is e.g. &lt;code&gt;a painting&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Remaining 50% equally divided among previously used instance images&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
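&lt;p&gt;The allocation rule above can be sketched as follows (our own illustrative reading of the split, not the Executor’s actual code):&lt;/p&gt;

```python
# Allocate n prior-preservation slots: half to the current category, half
# split equally among the other previously trained categories; within each
# category's share, half are generated category images and half are divided
# among that category's previously learned instance images.

def split_prior_images(n, current_category, categories):
    """categories maps each category name to its previously learned instances.

    Returns {category: {'generated': slots, 'instances': slots}}.
    """
    others = [c for c in categories if c != current_category]
    shares = {current_category: n / 2}
    for c in others:
        shares[c] = n / 2 / len(others)
    return {c: {'generated': s / 2, 'instances': s / 2}
            for c, s in shares.items()}

# e.g. a metamodel that knows two bottles, two dogs and one painting,
# now learning another dog:
split = split_prior_images(
    16, 'dog',
    {'bottle': ['b1', 'b2'], 'dog': ['d1', 'd2'], 'painting': ['p1']},
)
```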

&lt;p&gt;To illustrate this, let’s assume we have a metamodel which has learnt:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two objects for category &lt;code&gt;bottle&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Two for &lt;code&gt;dog&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;One for &lt;code&gt;painting&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To learn another &lt;code&gt;dog&lt;/code&gt;, the top-level split between the categories and the splits within the individual categories are as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmll96533m736ihq1m43.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmll96533m736ihq1m43.png" alt=" " width="800" height="176"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Visualizing it as a pie chart, this is the split of all images for the prior preservation loss:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0jkithbp8xw0loosny5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0jkithbp8xw0loosny5.png" alt=" " width="800" height="473"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To abstract that logic away, we created an &lt;a href="https://docs.jina.ai/concepts/executor/" rel="noopener noreferrer"&gt;Executor&lt;/a&gt; to quickly fine-tune private (i.e. owned by a specific user) models for specific objects/styles, as well as create public and private metamodels. To do that it exposes the following &lt;a href="https://docs.jina.ai/concepts/executor/add-endpoints/" rel="noopener noreferrer"&gt;endpoints&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/finetune&lt;/code&gt; endpoint:

&lt;ul&gt;
&lt;li&gt;Fine-tunes a model for a particular style or object which is private (&lt;code&gt;private model&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Incrementally fine-tunes a model for various styles and objects which is only accessible by a particular user (&lt;code&gt;private metamodel&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Incrementally fine-tunes a model for various styles and objects which is accessible for everyone and to which everyone can contribute (&lt;code&gt;metamodel&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;/generate&lt;/code&gt; endpoint:

&lt;ul&gt;
&lt;li&gt;Generates images for any of above models as well as for a &lt;code&gt;pretrained&lt;/code&gt; model&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Hitchhiker's Guide to Building Your Own Metamodel
&lt;/h2&gt;

&lt;p&gt;So how do you train your metamodel?&lt;/p&gt;

&lt;p&gt;First, fit a private model to find the right learning rate and number of training steps. A low learning rate of &lt;code&gt;1e-6&lt;/code&gt; works best across different styles and objects. We found that starting from 200 training steps and slowly increasing towards 600 is the best way to find the sweet spot between fitting and overfitting for objects and styles. To recreate faces, we recommend starting from 600 training steps and increasing to 1,200.&lt;/p&gt;

&lt;p&gt;The second and final step is to reuse the same parameters for the request but change the &lt;code&gt;target_model&lt;/code&gt; to &lt;code&gt;meta&lt;/code&gt; or &lt;code&gt;private_meta&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now you have your (private) metamodel. In a script, fine-tuning is made very simple as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from jina import Client, DocumentArray 
import hubble 
# specify the path to the images 
path_to_instance_images = '/path/to/instance/images' 
# specify the category of the images, this could be e.g. 'painting', 'dog', 'bottle', etc. 
category = 'category of the objects' 
# 'private' for training private model from pretrained model, 'meta' for training metamodel 
target_model = 'private' 


# some custom parameters for the training 
max_train_steps = 300 
learning_rate = 1e-6 
docs = DocumentArray.from_files(f'{path_to_instance_images}/**') 

for doc in docs: 
    doc.load_uri_to_blob() 
    doc.uri = None client = 

Client(host='grpc://host_big_executor:port_big_executor') 

identifier\_doc = client.post( on='/finetune', inputs=docs, parameters={ 'jwt': { 'token': hubble.get\_token(), }, 'category': category, 'target\_model': target\_model, 'learning\_rate': learning\_rate, 'max\_train\_steps': max\_train\_steps, }, ) print(f'Finetuning was successful. The identifier for the object is "{identifier\_doc\[0\].text}"')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;With our new metamodel we taught Stable Diffusion to create images of the Mio Mio Mate tea bottle, a sparkling water bottle from a local manufacturer, a &lt;a href="https://nuphy.com/products/air75" rel="noopener noreferrer"&gt;NuPhy Air75&lt;/a&gt; keyboard, and an office desk chair, as well as artwork in the styles of Leon Löwentraut and Vexx:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnib4chn0kbt1q4eqlc8y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnib4chn0kbt1q4eqlc8y.png" alt=" " width="800" height="525"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note how for both bottles a hand appears holding them. Each bottle had six images used for fine-tuning, and for each only one of those images showed a hand holding the bottle. Yet, the model has somewhat collapsed to always showing this hand. This shows the importance of not only high-quality but also diverse, representative images. Here are the images of a hand holding the bottle that we used for fine-tuning:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F05kfl2cw3yy4b88uz28o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F05kfl2cw3yy4b88uz28o.png" alt="Left: Picture with hand used to fine-tune Stable Diffusion to sparkling water bottle; Right: Picture with hand used to fine-tune Stable Diffusion to mate bottle." width="800" height="397"&gt;&lt;/a&gt;Left: Picture with hand used to fine-tune Stable Diffusion to sparkling water bottle; Right: Picture with hand used to fine-tune Stable Diffusion to mate bottle.&lt;/p&gt;

&lt;p&gt;The model isn’t just able to memorize the objects, but also learns how newly-learned objects and styles interact:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxk4p477ayqsutbpk4l2h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxk4p477ayqsutbpk4l2h.png" alt="Left: Sparkling water bottle in the style of Vexx, generated with prompt:  raw `a qzq painting of a rvt bottle` endraw ; Right: Mate bottle in Leon Löwentraut's style, generated with prompt:  raw `a lnl painting of a pyr bottle` endraw ." width="800" height="391"&gt;&lt;/a&gt;Left: Sparkling water bottle in the style of Vexx, generated with prompt: &lt;code&gt;a qzq painting of a rvt bottle&lt;/code&gt;; Right: Mate bottle in Leon Löwentraut's style, generated with prompt: &lt;code&gt;a lnl painting of a pyr bottle&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvxdloe96076z9rc1a7p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvxdloe96076z9rc1a7p.png" alt="Left: NuPhy Air75 keyboard in style of Vexx, generated with prompt:  raw `a qzq painting of a sph keyboard` endraw ; Right: NuPhy Air75 keyboard in style of Leon Löwentraut, generated with prompt:  raw `a lnl painting of a sph keyboard` endraw ." width="800" height="392"&gt;&lt;/a&gt;Left: NuPhy Air75 keyboard in style of Vexx, generated with prompt: &lt;code&gt;a qzq painting of a sph keyboard&lt;/code&gt;; Right: NuPhy Air75 keyboard in style of Leon Löwentraut, generated with prompt: &lt;code&gt;a lnl painting of a sph keyboard&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Last but not least, we trained it to create images of Joschka's company dog, Briscoe:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3p87nlsupp5nic5mlx1h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3p87nlsupp5nic5mlx1h.png" alt="Left: A picture of Briscoe used for fine-tuning; Center: A generated image of Briscoe with prompt:  raw `a brc dog` endraw ; Right: A painting of Briscoe in the style of Leon Löwentraut generated with prompt:  raw `a lnl` endraw   raw `painting of a brc dog` endraw ." width="800" height="260"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s next?
&lt;/h2&gt;

&lt;p&gt;In the future, it would be interesting to see whether additionally applying &lt;a href="https://textual-inversion.github.io/" rel="noopener noreferrer"&gt;Textual Inversion&lt;/a&gt; to get better prompts for generating new images enhances performance. This might also change how previously-learned objects are forgotten.&lt;/p&gt;

&lt;p&gt;We could also explore other angles, like why previously learned objects and styles get overwritten, by understanding if the similarity in prompts is an issue or if semantic similarity of the new objects is a strong predictor of forgetting. The former can be solved by better sampling of the rare identifiers, using &lt;a href="https://arxiv.org/abs/2201.12086" rel="noopener noreferrer"&gt;BLIP&lt;/a&gt; to automatically generate captions, or adapting textual inversion to incorporate the forgetting effect.&lt;/p&gt;

&lt;p&gt;Another question is at what point the metamodel’s continual learning leads to overfitting to previously-learned objects and styles, since the model is continuously trained to minimize the loss on the images generated for them. Here it’s worth optimizing the allocation of images for the prior preservation loss in order to push the number of newly learnable objects even further.&lt;/p&gt;

&lt;p&gt;You can also start playing around with it yourself in &lt;a href="https://colab.research.google.com/github/jina-ai/big_creative_ai/blob/main/big_metamodel.ipynb" rel="noopener noreferrer"&gt;Google Colab&lt;/a&gt; or check out the &lt;a href="https://github.com/jina-ai/big_creative_ai" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Author
&lt;/h2&gt;

&lt;p&gt;Joschka Braun, Alex C-G&lt;/p&gt;

&lt;h2&gt;
  
  
  Original Link
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://jina.ai/news/how-to-personalize-stable-diffusion-for-all-the-things/" rel="noopener noreferrer"&gt;https://jina.ai/news/how-to-personalize-stable-diffusion-for-all-the-things/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>startup</category>
      <category>career</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Improving Search Quality for Non-English Queries with Fine-tuned Multilingual CLIP Models</title>
      <dc:creator>guoliwu</dc:creator>
      <pubDate>Fri, 10 Feb 2023 13:56:26 +0000</pubDate>
      <link>https://dev.to/guoliwu/improving-search-quality-for-non-english-queries-with-fine-tuned-multilingual-clip-models-2n9d</link>
      <guid>https://dev.to/guoliwu/improving-search-quality-for-non-english-queries-with-fine-tuned-multilingual-clip-models-2n9d</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ojZaCqIY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/Jina-AI-Website-Banners-Templates--63-.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ojZaCqIY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/Jina-AI-Website-Banners-Templates--63-.png" alt="" width="880" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since early 2021, &lt;a href="https://openai.com/blog/clip/"&gt;CLIP-style models&lt;/a&gt; have been the backbone of &lt;a href="https://jina.ai/news/what-is-multimodal-deep-learning-and-what-are-the-applications/"&gt;multimodal AI&lt;/a&gt;. They work by embedding inputs from more than one kind of media into a common high-dimensional vector space, using different models for different modalities. These different models are &lt;em&gt;co-trained&lt;/em&gt; with multimodal data. For CLIP-models, this means images with captions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--98en9v0M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/fashion-axes.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--98en9v0M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/fashion-axes.png" alt="" width="880" height="648"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A highly schematic representation of how CLIP embeddings make it possible to associate texts with images.&lt;/p&gt;

&lt;p&gt;The result? A pair of models that embed images and texts close to each other if the text is descriptive of the image, or the image contains things that match the text. So if we have a picture of a skirt and the word “Rock” (German for “skirt”), they would be close together, while the word “Hemd” (German for “shirt”) would be closer to a picture of a shirt.&lt;/p&gt;
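&lt;p&gt;“Close together” here just means high vector similarity, usually cosine similarity. A toy sketch with made-up 3-dimensional embeddings (real CLIP embeddings have hundreds of dimensions):&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up toy embeddings, purely illustrative:
skirt_image = [0.9, 0.1, 0.0]
rock_text = [0.8, 0.2, 0.1]   # "Rock" (German for "skirt")
hemd_text = [0.1, 0.9, 0.2]   # "Hemd" (German for "shirt")

# The skirt image should be closer to "Rock" than to "Hemd".
assert cosine_similarity(skirt_image, rock_text) > cosine_similarity(skirt_image, hemd_text)
```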

&lt;h2&gt;
  
  
  Towards multilingual CLIP
&lt;/h2&gt;

&lt;p&gt;However, CLIP text models have mostly been trained on English data, and that’s a big problem: The world is full of people who don’t speak English.&lt;/p&gt;

&lt;p&gt;Very recently, a few non-English and multilingual CLIP models have appeared, using various sources of training data. In this article, we’ll evaluate a multilingual CLIP model’s performance in a language other than English, and show how you can improve it even further using &lt;a href="https://github.com/jina-ai/finetuner"&gt;Jina AI’s Finetuner&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To make this happen, we’re collaborating with Toloka, a leading provider of data procurement services for machine learning, to create a dataset of images with high-quality German-language descriptions written by humans.&lt;/p&gt;

&lt;h2&gt;
  
  
  How does multilingual CLIP work?
&lt;/h2&gt;

&lt;p&gt;Multilingual CLIP is any CLIP model trained with more than one language. So that could be English+French, German+English, or even Klingon+Elvish.&lt;/p&gt;

&lt;p&gt;We’re going to look at a model that &lt;a href="https://laion.ai/"&gt;LAION&lt;/a&gt; has trained with a broad multilingual dataset: the &lt;a href="https://huggingface.co/laion/CLIP-ViT-B-32-xlm-roberta-base-laion5B-s13B-b90k"&gt;&lt;code&gt;xlm-roberta-base-ViT-B-32&lt;/code&gt;&lt;/a&gt; CLIP model, which uses the &lt;a href="https://github.com/google-research/vision_transformer"&gt;&lt;code&gt;ViT-B/32&lt;/code&gt;&lt;/a&gt; image encoder and the &lt;a href="https://huggingface.co/xlm-roberta-large"&gt;&lt;code&gt;XLM-RoBERTa&lt;/code&gt;&lt;/a&gt; multilingual language model. Both of these are pre-trained:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ViT-B/32&lt;/code&gt;, using the &lt;a href="https://github.com/Alibaba-MIIL/ImageNet21K"&gt;ImageNet-21k&lt;/a&gt; dataset&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;XLM-RoBERTa&lt;/code&gt;, using a multi-terabyte dataset of text from the Common Crawl, containing over 100 languages.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, from the outset, multilingual CLIP is different because it uses a multilingual text encoder, but can (and generally does) use the same image encoders as monolingual models.&lt;/p&gt;

&lt;p&gt;LAION then co-trained the two encoders with the multilingual &lt;a href="https://laion.ai/blog/laion-5b"&gt;&lt;code&gt;laion5b&lt;/code&gt;&lt;/a&gt; dataset, which contains 5.85 billion image-text pairs: 2.2 billion of these pairs are labelled in 100+ non-English languages, with the rest in English or containing text that can’t be nailed down to any one language (like place names or other proper nouns). These are taken from a sampling of images and their &lt;a href="https://www.w3schools.com/tags/att_img_alt.asp"&gt;HTML alt-text&lt;/a&gt; in the &lt;a href="https://commoncrawl.org/"&gt;Common Crawl web archive&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HctfOH8c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/alt-txt-image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HctfOH8c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/alt-txt-image.png" alt="" width="880" height="876"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some browsers will let you see the alt-text if you move your mouse over an image.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Q5m-d6r8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/Screenshot-2022-12-13-at-13.12.55-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Q5m-d6r8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/Screenshot-2022-12-13-at-13.12.55-1.png" alt="" width="858" height="144"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How an alt-text is encoded in HTML.&lt;/p&gt;

&lt;p&gt;This dataset isn’t balanced in the sense that no-one has tried to ensure that data for one language is comparable in size or scope to the data for any other. English still dominates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deep dive of the tokenizer inside multilingual models
&lt;/h3&gt;

&lt;p&gt;So, how is a multilingual text encoder different from a bog-standard monolingual one? One big difference is how it handles &lt;strong&gt;tokenization&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Text transformer models like &lt;code&gt;XLM-RoBERTa&lt;/code&gt; all start by tokenizing input texts — breaking them up into smaller parts — and replacing each part with an input vector constructed as part of the initial training. These input vectors are strung together and passed to the model to create an embedding vector.&lt;/p&gt;

&lt;p&gt;You might expect these smaller parts to match &lt;em&gt;words&lt;/em&gt;, and sometimes they do. But looking for words by just checking for spaces and punctuation doesn’t capture the fact that &lt;em&gt;call&lt;/em&gt;, &lt;em&gt;calls&lt;/em&gt;, &lt;em&gt;calling&lt;/em&gt;, and &lt;em&gt;called&lt;/em&gt; are not four totally different words, just like &lt;em&gt;small&lt;/em&gt;, &lt;em&gt;smaller&lt;/em&gt;, and &lt;em&gt;smallest&lt;/em&gt;, or &lt;em&gt;annoy&lt;/em&gt;, &lt;em&gt;annoyed&lt;/em&gt;, &lt;em&gt;annoyingly&lt;/em&gt;. In practice, this entire class of model uses, at least partly, a technique called &lt;em&gt;subword tokenization&lt;/em&gt;, which uses the statistical properties of sequences of characters to decide what units are the “right-size” for learning.&lt;/p&gt;

&lt;p&gt;It’s not really based in any linguistic theory, but doing it this way has many advantages for machine learning. Think of the suffix &lt;em&gt;-ed&lt;/em&gt; in English. You might expect that a “right-sized” statistical tokenizer would notice that many English words end in -ed, and break those words into two parts:&lt;/p&gt;

&lt;p&gt; &lt;code&gt;called → call -ed&lt;/code&gt;&lt;br&gt;&lt;br&gt;
 &lt;code&gt;asked  → ask  -ed&lt;/code&gt;&lt;br&gt;&lt;br&gt;
 &lt;code&gt;worked → work -ed&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;And this makes sense, &lt;em&gt;most&lt;/em&gt; of the time. But not always:&lt;/p&gt;

&lt;p&gt; &lt;code&gt;weed → we -ed&lt;/code&gt;&lt;br&gt;&lt;br&gt;
 &lt;code&gt;bed  → b  -ed&lt;/code&gt;&lt;br&gt;&lt;br&gt;
 &lt;code&gt;seed → se -ed&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Large language models are very robust, and they can learn that “weed” has a meaning different from “we” + “-ed”. Using this kind of tokenization, even new words that were never part of the pre-training data get a distinct representation for the model to learn.&lt;/p&gt;
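&lt;p&gt;The mis-splits above are easy to reproduce. Here is a toy sketch (our own illustration, not any real tokenizer) of a rule that blindly strips a trailing &lt;em&gt;-ed&lt;/em&gt;:&lt;/p&gt;

```python
def naive_ed_split(word):
    """Blindly split off a trailing '-ed'; no statistics involved."""
    if word.endswith("ed"):
        return [word[:-2], "-ed"]
    return [word]

# Reasonable for verbs:
for w in ["called", "asked", "worked"]:
    print(w, naive_ed_split(w))

# Wrong for words that merely happen to end in "ed":
for w in ["weed", "bed", "seed"]:
    print(w, naive_ed_split(w))
```

&lt;p&gt;Real subword tokenizers avoid committing to fixed rules like this by choosing splits statistically, which is exactly what lets a model learn that “weed” is not “we” + “-ed”.&lt;/p&gt;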

&lt;p&gt;Nonetheless, the more that the tokenization matches meaningful units of language, the faster and better the model learns.&lt;/p&gt;

&lt;p&gt;Let’s take a concrete example. The image below is from the data provided by Toloka, with the German caption “&lt;em&gt;Leichte Damenjacke Frühling Herbst braun&lt;/em&gt;” (“Light women's jacket spring autumn brown”):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0rWtWitc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/image-14.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0rWtWitc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/image-14.png" alt="“_Leichte Damenjacke Frühling Herbst braun”_" width="880" height="171"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If we pass this German phrase to &lt;code&gt;XLM-RoBERTa&lt;/code&gt;’s tokenizer, we get a very different result from when we pass it to a comparable tokenizer used for an English-only model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;multilingual-CLIP: leicht|e|Damen|jack|e|Frühling|Herbst|bra|un

english-only-CLIP: le|ich|te|dam|en|jac|ke|fr|ü|h|ling|her|bst|braun
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tokens found by the multilingual tokenizer much more closely match our intuitions about meaningful units in German, while the English-only-trained tokenizer produces almost random chunks. Yes, it is still possible for a large language model to learn from badly tokenized data, if it’s consistent, but it will be slower and/or less accurate.&lt;/p&gt;
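&lt;p&gt;The contrast above can be mimicked with a greedy longest-match subword tokenizer over two toy vocabularies. Everything below is hypothetical: real models such as &lt;code&gt;XLM-RoBERTa&lt;/code&gt; learn their subword vocabularies statistically from training data rather than having them written by hand.&lt;/p&gt;

```python
def subword_tokenize(text, vocab):
    """Greedy longest-match subword tokenization: at each position,
    take the longest piece found in the vocabulary, falling back to
    single characters for unknown material."""
    text = text.lower().replace(" ", "")
    tokens, i = [], 0
    while i != len(text):  # i only ever advances, one piece at a time
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown single character
            i += 1
    return tokens

# Toy vocabulary that, like a multilingual model, has seen German subwords:
multi_vocab = {"leicht", "e", "damen", "jacke"}
# Toy vocabulary that, like an English-only model, has not:
en_vocab = {"le", "ich", "te", "dam", "en", "jac", "ke"}

print(subword_tokenize("Leichte Damenjacke", multi_vocab))
# ['leicht', 'e', 'damen', 'jacke']
print(subword_tokenize("Leichte Damenjacke", en_vocab))
# ['le', 'ich', 'te', 'dam', 'en', 'jac', 'ke']
```

&lt;p&gt;A vocabulary rich in German subwords yields the meaningful pieces; a vocabulary built from English text can only cobble the phrase together from short, nearly random fragments.&lt;/p&gt;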

&lt;p&gt;In contrast, the English equivalent — a word-for-word translation — is clearly better tokenized by the English-only tokenizer, but is not so badly tokenized by the multilingual one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;multilingual-CLIP: light|women|'|s|ja|cket|spring|a|utum|n|brown

english-only-CLIP: light|women|'s|jacket|spring|autumn|brown
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even at this first step in producing text embeddings, we can see how much difference a multilingual language model makes to a multilingual CLIP model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multilingual vs. monolingual CLIP on search quality
&lt;/h2&gt;

&lt;p&gt;Large language models are famously good at transfer learning. For example, if a monolingual English-only CLIP model has learned what “jacket” means, you can further train it, with very few additional examples, to know that the German word “Jacke” means the same thing. Then, it can carry all its knowledge about the English word “jacket” over to German.&lt;/p&gt;

&lt;p&gt;It is possible that a model already trained on English could be retrained for German with less data than it would take to train a new German model from scratch.&lt;/p&gt;

&lt;p&gt;Therefore, it’s worth asking: &lt;strong&gt;How much do we really gain using a model trained to be multilingual from the outset?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this article, we will use the German fashion dataset provided by Toloka to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Compare the zero-shot performance (i.e. out-of-the-box, without fine-tuning) of the multilingual CLIP model &lt;code&gt;xlm-roberta-base-ViT-B-32&lt;/code&gt; and the English-only equivalent &lt;code&gt;clip-vit-base-patch32&lt;/code&gt;. These two use the same image embedding model, but different text embedding models.&lt;/li&gt;
&lt;li&gt;Attempt to improve both models by using a part of the German dataset to fine-tune them.&lt;/li&gt;
&lt;li&gt;Compare the fine-tuned models using the same metrics, so we can both contrast non-fine-tuned and fine-tuned models, and contrast the English-only and multilingual models after adaptation to the German data.&lt;/li&gt;
&lt;li&gt;Show how much advantage, if any, is gained from a multilingual CLIP model.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Experiment Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The German Fashion12k dataset
&lt;/h3&gt;

&lt;p&gt;We have collaborated with Toloka to curate a 12,000 item dataset of fashion images drawn from e-commerce websites, to which human annotators have added descriptive captions in German. Toloka has made the data &lt;a href="https://github.com/Toloka/Fashion12K_german_queries"&gt;available to the public on GitHub&lt;/a&gt;, but you can also download it from Jina directly in DocArray format by following the instructions in the &lt;a href="https://jina.ai/news/improving-search-quality-non-english-queries-fine-tuned-multilingual-clip-models/#download-the-dataset-via-docarray"&gt;next section&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The images are a subset of the &lt;a href="https://github.com/xthan/fashion-200k"&gt;xthan/fashion-200k dataset&lt;/a&gt;, and we commissioned their human annotation via Toloka’s crowdsourcing platform. Annotation was done in two steps. First, Toloka passed the 12,000 images to annotators in its large international user community, who added descriptive captions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6clMH_2X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/aem5r510qb03godhcwwu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6clMH_2X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/aem5r510qb03godhcwwu.png" alt="The Toloka app showing an item of clothing to a user and asking for a description." width="880" height="502"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The app prompted users to write descriptions that follow a common pattern, partially enforced by a simple pattern matcher. Specifically:&lt;/p&gt;

&lt;p&gt;Write a search query that would find this product: type, your guess about the material, where it might be worn, color, texture, details. […]&lt;/p&gt;

&lt;p&gt;Requirements for the query:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;·&lt;/strong&gt; At least SIX words&lt;br&gt;&lt;br&gt;
&lt;strong&gt;·&lt;/strong&gt; Words that are separated ONLY by spaces (or ", ")&lt;br&gt;&lt;br&gt;
&lt;strong&gt;·&lt;/strong&gt; Do NOT use "this is/these are"&lt;/p&gt;
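&lt;p&gt;A check like this is simple to implement. The sketch below encodes only the three rules quoted above; the actual pattern matcher Toloka used is not public, so this is purely illustrative:&lt;/p&gt;

```python
import re

def is_valid_query(text):
    """Toy check for the three caption rules quoted above."""
    words = re.split(r", | ", text.strip())
    if len(words) in range(6):          # fewer than six words
        return False
    if any(w == "" or "," in w for w in words):
        return False                    # separators other than " " or ", "
    if re.search(r"\b(this is|these are)\b", text, re.IGNORECASE):
        return False                    # forbidden "this is/these are"
    return True

print(is_valid_query("Leichte Damenjacke Frühling Herbst braun warm"))  # True
print(is_valid_query("this is a brown jacket for autumn"))              # False
```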

&lt;p&gt;Then, in the second stage, other, randomly chosen users validated each description.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2HAaot35--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/o58qlkkh8cd6rqcp7hqo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2HAaot35--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/o58qlkkh8cd6rqcp7hqo.png" alt="Validation screen in the Toloka app. The app presents the user with a text description created by someone else and asks if it’s an appropriate description, inappropriate description, or if the image failed to load." width="880" height="499"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some examples from the resulting dataset:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--aYyV7Ct---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ekkuugzaouvh2nit4ua6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--aYyV7Ct---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ekkuugzaouvh2nit4ua6.png" alt="Image description" width="671" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Of the 12,000 image-text pairs in the data from Toloka, we randomly selected 10,000 for training and held out the remaining 2,000 for evaluation. Because some items of clothing are very similar, there are a few duplicate descriptions. However, since there are 11,582 unique descriptions, we didn’t consider this an important factor in using the data.&lt;/p&gt;
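&lt;p&gt;The split and the duplicate check can be sketched in a few lines. The toy data here is hypothetical and merely stands in for the 12,000 real image-caption pairs:&lt;/p&gt;

```python
import random

def split_dataset(pairs, n_train=10_000, seed=42):
    """Shuffle reproducibly, then take the first n_train for training."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    return pairs[:n_train], pairs[n_train:]

def count_unique_captions(pairs):
    """How many distinct captions appear among (image, caption) pairs."""
    return len({caption for _image, caption in pairs})

# Hypothetical toy data: 10 images, only 4 distinct captions.
toy = [(f"img_{i}", f"caption {i % 4}") for i in range(10)]
train, test = split_dataset(toy, n_train=8)
print(len(train), len(test), count_unique_captions(toy))  # 8 2 4
```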
&lt;h3&gt;
  
  
  Download the dataset via DocArray
&lt;/h3&gt;

&lt;p&gt;The German Fashion12k dataset is available for free use by the Jina AI community. After logging into Jina AI Cloud, you can download it directly in &lt;a href="https://docarray.jina.ai/"&gt;DocArray&lt;/a&gt; format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;train_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'DE-Fashion-Image-Text-Multimodal-train'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;show_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;eval_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'DE-Fashion-Image-Text-Multimodal-test'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;show_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Load the multilingual CLIP model
&lt;/h3&gt;

&lt;p&gt;Because CLIP models are actually two different models that have been trained together, we have to load them as two models.&lt;/p&gt;

&lt;p&gt;In this article, we will use the &lt;a href="https://finetuner.jina.ai/"&gt;Finetuner interface&lt;/a&gt;. To use the &lt;code&gt;xlm-roberta-base-ViT-B-32&lt;/code&gt; CLIP model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;finetuner&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;build_model&lt;/span&gt;

&lt;span class="n"&gt;mCLIP_text_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'xlm-roberta-base-ViT-B-32::laion5b_s13b_b90k'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;select_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'clip-text'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;mCLIP_vision_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'xlm-roberta-base-ViT-B-32::laion5b_s13b_b90k'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;select_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'clip-vision'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For models supported directly by Jina AI, you can load them &lt;em&gt;by name&lt;/em&gt;, without having to directly deal with downloading or deserialization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load the English CLIP model
&lt;/h3&gt;

&lt;p&gt;For comparison, you can access the English-only &lt;code&gt;ViT-B-32::openai&lt;/code&gt; CLIP model in the same way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;finetuner&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;build_model&lt;/span&gt;

&lt;span class="n"&gt;enCLIP_text_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'ViT-B-32::openai'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;select_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'clip-text'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;enCLIP_vision_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'ViT-B-32::openai'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;select_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'clip-vision'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Evaluate the zero-shot performance
&lt;/h2&gt;

&lt;p&gt;We measured the zero-shot performance of both the Multilingual CLIP model and the English-only one on the German Fashion dataset: that is, how well they perform as downloaded, without additional training, on the 2,000 items we held out for evaluation.&lt;/p&gt;

&lt;p&gt;We embedded the text descriptions in the evaluation data and used them to search for matches among the embedded images in the evaluation data, taking the top 20 matches for each text description. We then performed a number of standard statistical tests on the results, including &lt;a href="https://en.wikipedia.org/wiki/Mean_reciprocal_rank"&gt;Mean Reciprocal Rank&lt;/a&gt; (mRR), &lt;a href="https://stats.stackexchange.com/questions/127041/mean-average-precision-vs-mean-reciprocal-rank"&gt;Mean Average Precision&lt;/a&gt; (mAP), &lt;a href="https://en.wikipedia.org/wiki/Discounted_cumulative_gain"&gt;Discounted Cumulative Gain&lt;/a&gt; (DCG), and the share of queries that return the exact image whose description matches the query (labeled “&lt;strong&gt;&lt;em&gt;Hits&lt;/em&gt;&lt;/strong&gt;”).&lt;/p&gt;
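&lt;p&gt;For readers unfamiliar with these metrics, here is a self-contained sketch of how they can be computed over ranked result lists with binary relevance (our own simplified formulation, not the exact evaluation code we ran):&lt;/p&gt;

```python
import math

def reciprocal_rank(ranked_ids, relevant_id):
    """1/rank of the first relevant result, 0 if it never appears."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of 0/1 relevances."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def hits_at_k(results_per_query, k=20):
    """Share of queries whose correct image appears in the top k."""
    found = [relevant in ranked[:k] for ranked, relevant in results_per_query]
    return sum(found) / len(found)

print(reciprocal_rank(["img3", "img7", "img1"], "img7"))  # 0.5
print(dcg([1, 0, 1]))                                     # 1.5
print(hits_at_k([(["a", "b", "c"], "b"), (["d", "e", "f"], "z")], k=3))  # 0.5
```

&lt;p&gt;Averaging these per-query values over all 2,000 evaluation queries gives mRR, mean DCG, and the “Hits” share reported below.&lt;/p&gt;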

&lt;p&gt;The performance results are:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ISRorKdq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vemnr38e6tlrqvpoi2p0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ISRorKdq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vemnr38e6tlrqvpoi2p0.png" alt="Image description" width="631" height="295"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not very surprisingly, the English CLIP model performed extremely poorly on German data.  Below are three examples from the evaluation set of queries in German, and the images it found to match:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3dblh-L3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zmq2jho4d29y82zm37qg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3dblh-L3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zmq2jho4d29y82zm37qg.png" alt="Image description" width="880" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Even though German is a relatively small part of the multilingual model’s training set, that is clearly more than enough to make a ten-fold difference in performance on German queries, lifting the CLIP model’s value from basically none to mediocre.&lt;/p&gt;

&lt;h2&gt;
  
  
  Improve the search quality via fine-tuning
&lt;/h2&gt;

&lt;p&gt;One of the main insights of large-model neural-network engineering is that it’s easier to start with models that are trained on general-purpose data and then further train them on domain-specific data, than it is to train models on domain-specific data from scratch. This process is called “fine-tuning” and it can provide very significant performance improvements over using models like CLIP &lt;em&gt;as is&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Fine-tuning can be a tricky process, and gains are highly dependent on the domain and the dataset used for further training.&lt;/p&gt;

&lt;h3&gt;
  
  
  Specify hyperparameters
&lt;/h3&gt;

&lt;p&gt;Fine-tuning requires a selection of hyperparameters that require some understanding of deep learning processes, and a full discussion of hyperparameter selection is beyond the scope of this article. We used the following values, based on empirical practice working with CLIP models:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OqjO3viE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/btwpaw4ac3gun895kttx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OqjO3viE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/btwpaw4ac3gun895kttx.png" alt="Image description" width="299" height="194"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These hyperparameters are part of the command below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Specify the evaluation data
&lt;/h3&gt;

&lt;p&gt;We fine-tuned using the data split described previously:  10,000 items were used as training data, and 2,000 as evaluation data. In order to evaluate models at the end of each training epoch, we turned the evaluation data into a “query” and “index” dataset. The “query” data consists of the German text descriptions in the evaluation data, and the “index” data contains the images.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'finetuner_label'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;
    &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'finetuner_label'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;

&lt;span class="n"&gt;query_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;index_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentArray&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are also passed to the fine-tuning command.&lt;/p&gt;

&lt;h3&gt;
  
  
  Put everything together in one call
&lt;/h3&gt;

&lt;p&gt;Running the command below uploads the training and evaluation data and fine-tunes the &lt;code&gt;xlm-roberta-base-ViT-B-32&lt;/code&gt; model on &lt;a href="https://cloud.jina.ai/"&gt;Jina AI Cloud&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;finetuner&lt;/span&gt;

&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;finetuner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'xlm-roberta-base-ViT-B-32::laion5b_s13b_b90k'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;train_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;toloka_train_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;eval_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;toloka_eval_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;learning_rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'CLIPLoss'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'cuda'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;callbacks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;EvaluationCallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;query_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;toloka_query_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;index_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;toloka_index_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'clip-text'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;index_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'clip-vision'&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;WandBLogger&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fine-tuning process may take considerable time, depending on the model and the amount of data. For this dataset and these models, it took roughly half an hour. Once fine-tuning is complete, we can compare the models’ performance at querying.&lt;/p&gt;

&lt;h2&gt;
  
  
  Qualitative study on fine-tuned models
&lt;/h2&gt;

&lt;p&gt;For example, here are the top four results for the query “&lt;em&gt;Spitzen-Midirock Teilfutter Schwarz&lt;/em&gt;” (”&lt;em&gt;Lace midi skirt partial lining black&lt;/em&gt;”):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--faphWqqJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0s8k8auul2par1rbd334.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--faphWqqJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0s8k8auul2par1rbd334.png" alt="Image description" width="880" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This kind of qualitative analysis gives us a sense for how fine-tuning improves the model’s performance. Before tuning, the model was able to return images of skirts that matched the description, but it also returned images of different items of clothing made of the same materials. It was insufficiently attentive to the most important part of the query.&lt;/p&gt;

&lt;p&gt;After fine-tuning, this query consistently returns skirts, and all four results match the description. That is not to say that every query returns only correct matches, but that on direct inspection we can see that it has a far better understanding of what the query is asking for.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quantitative study on fine-tuned models
&lt;/h2&gt;

&lt;p&gt;To make more concrete comparisons, we need to evaluate our models more formally over a collection of test items. We did this by passing each model test queries drawn from the evaluation data. The model then returned a set of results, on which we performed the same standard statistical tests as in the zero-shot evaluation.&lt;/p&gt;

&lt;p&gt;Here are the results for the Multilingual CLIP model, using the same measure of the top 20 results of each query:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--US8PRR0---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lg9qzgv8rzgp1fzcqdkf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--US8PRR0---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lg9qzgv8rzgp1fzcqdkf.png" alt="Image description" width="532" height="271"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The results show that fine-tuning has a significant effect in improving results for Multilingual CLIP, although not a spectacular one.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can English CLIP benefit from German data?
&lt;/h3&gt;

&lt;p&gt;We also decided to check whether the English-only CLIP model would improve if we fine-tuned it with German data: given a chance, it might catch up in performance with a pre-trained multilingual model. The results were interesting. We include the Multilingual CLIP results in this table for comparison:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KwOQUhWM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/m5scy9wiibz1lr3j38et.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KwOQUhWM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/m5scy9wiibz1lr3j38et.png" alt="Image description" width="730" height="258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using German training data, we were able to bring a vast improvement to the English-only CLIP model, although not enough to bring it level with even the zero-shot Multilingual CLIP model. Mean average precision for the English-only model jumped 420%, compared to 31% for Multilingual CLIP, although the overall performance of the monolingual model was still much worse.&lt;/p&gt;
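&lt;p&gt;Percentages like these are simply relative changes in the metric. The values below are hypothetical, chosen only to illustrate the calculation:&lt;/p&gt;

```python
def pct_gain(before, after):
    """Relative improvement of a metric, in percent."""
    return 100.0 * (after - before) / before

# Hypothetical mAP values before and after fine-tuning:
print(round(pct_gain(0.05, 0.26)))  # 420
```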

&lt;h3&gt;
  
  
  Does more labeled data improve the search quality?
&lt;/h3&gt;

&lt;p&gt;We also ran multiple fine-tuning experiments with differing amounts of training data, on both the Multilingual and English-only CLIP models, to see how effective using more data was.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--FPfthdcq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/Average-precision-for-Multilingual-and-English-CLIP--1-.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FPfthdcq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/Average-precision-for-Multilingual-and-English-CLIP--1-.svg" alt="" width="568" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In both cases, we see that most of the gain comes from the first few thousand items of training data, with gains coming more slowly after the initial fast learning. This confirms a conclusion Jina AI has already published.&lt;/p&gt;

&lt;p&gt;Read &lt;strong&gt;&lt;em&gt;&lt;a href="https://jina.ai/news/fine-tuning-with-low-budget-and-high-expectations/"&gt;Fine-tuning with Low Budget and High Expectations&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; for more discussion of the impressive results you can get with Finetuner and relatively little new training data.&lt;/p&gt;

&lt;p&gt;Adding more data may still improve results, but much more slowly. And in the case of fine-tuning the English-only CLIP model to handle German queries, we see the performance improvement level off at under 10,000 new items of data. It seems unlikely that we could ever train the English-only CLIP model to equal the Multilingual CLIP model on German data, at least not with these kinds of methods.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;What lessons can we take from all this?&lt;/p&gt;

&lt;h3&gt;
  
  
  Multilingual CLIP is the first choice for non-English queries
&lt;/h3&gt;

&lt;p&gt;The Multilingual CLIP model, trained from scratch with multilingual data, outperforms comparable English-only CLIP models by a very large margin on the German data we used. The same conclusion will likely apply to other non-English languages.&lt;/p&gt;

&lt;p&gt;Even in an unfair competition, where we fine-tuned the English model and vastly improved its performance on German data, the Multilingual CLIP model without further training outperformed it by a large margin.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fine-tuning improves search quality with little data
&lt;/h3&gt;

&lt;p&gt;We were shocked to see the English-only model improve its handling of German so much, and we could have gotten nearly the same result using half as much data. The basic assumptions that go into fine-tuning are clearly very robust if they can teach German to an English-only model with only a few thousand examples.&lt;/p&gt;

&lt;p&gt;On the other hand, we struggled to improve the performance of Multilingual CLIP, even with a fairly large quantity of high-quality, human-annotated training data. Although Finetuner makes a clear difference, you very rapidly reach the upper bound of how much you can improve a model that’s already pretty good.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6kYjgSF5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/chart--1-.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6kYjgSF5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://jina-ai-gmbh.ghost.io/content/images/2022/12/chart--1-.svg" alt="" width="600" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Trouble-free fine-tuning using Finetuner
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://finetuner.jina.ai/"&gt;Finetuner&lt;/a&gt; is easy enough to use that we could construct and perform all the experiments in this article in a few days. Although it does take some understanding of deep learning to make the best configuration choices, Finetuner greatly reduces the boring labor of running and paying attention to large-scale neural network models to mere parameter setting.&lt;/p&gt;

</description>
      <category>search</category>
      <category>finetune</category>
      <category>clip</category>
    </item>
    <item>
      <title>Want to Search Inside Videos Like a Pro? CLIP-as-service Can Help</title>
      <dc:creator>guoliwu</dc:creator>
      <pubDate>Thu, 09 Feb 2023 15:05:20 +0000</pubDate>
      <link>https://dev.to/guoliwu/want-to-search-inside-videos-like-a-pro-clip-as-service-can-help-lio</link>
      <guid>https://dev.to/guoliwu/want-to-search-inside-videos-like-a-pro-clip-as-service-can-help-lio</guid>
      <description>&lt;p&gt;Wouldn’t it be great if you could search through a video the way you search through a text?&lt;/p&gt;

&lt;p&gt;Imagine opening a digitized film, just hitting &lt;em&gt;ctrl-f&lt;/em&gt; and typing “Santa”, then getting all the parts of the video with Santa Claus in it. Or just going to the command line and using the &lt;code&gt;grep&lt;/code&gt; command:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;grep &lt;span class="hljs-string"&gt;"Santa Claus"&lt;/span&gt; Santa_Claus_conquers_The_Martians.mp4&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Normally, this would be impossible, or only possible if you had already gone through the film and carefully labeled all the parts with a Santa in them. But with Jina AI and CLIP-as-service, you can create a &lt;strong&gt;video grep&lt;/strong&gt; command for MP4 files with just a few Python functions and a standard computer setup. There is no need for a GPU and no complex AI tech stack to install: just off-the-shelf, open-source Python libraries, with Jina AI Cloud doing all the heavy lifting.&lt;/p&gt;

&lt;p&gt;This has immediate applications for anyone who has video data: film archivists, stock image vendors, news photographers, or even regular people who just keep videos from their cellphones around and post them to social media.&lt;/p&gt;

&lt;h2&gt;
&lt;span&gt;&lt;/span&gt;&lt;span&gt;Preliminaries&lt;/span&gt;&lt;span&gt;&lt;/span&gt;
&lt;/h2&gt;

&lt;p&gt;You need Python 3, and you might want to create a new virtual environment before starting. Then, install a few components at the command line with &lt;code&gt;pip&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;pip install clip_client &lt;span class="hljs-string"&gt;"docarray[full]&amp;gt;=0.20.0"&lt;/span&gt;&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This installs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Jina AI’s &lt;a href="https://docarray.jina.ai/" rel="noopener noreferrer"&gt;DocArray library&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Jina AI’s &lt;a href="https://clip-as-service.jina.ai/" rel="noopener noreferrer"&gt;CLIP-as-service&lt;/a&gt; client&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You'll also need an account for CLIP-as-service. If you don't already have one, there are &lt;a href="https://docs.jina.ai/jina-ai-cloud/login/" rel="noopener noreferrer"&gt;instructions in the Jina AI documentation&lt;/a&gt;. Once you have an account, you will need a token. You can get one from &lt;a href="https://cloud.jina.ai/settings/tokens" rel="noopener noreferrer"&gt;your token settings page at Jina AI Cloud&lt;/a&gt;, or at the command line:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&amp;gt; jina auth login ⤶&lt;br&gt;&lt;br&gt;Your browser is going to open the login page.&lt;br&gt;If this fails please open the following link: https://jina-ai.us.auth0.com/au....&lt;br&gt;🔐 Successfully logged &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; to Jina AI as....&lt;br&gt;&lt;br&gt;&amp;gt; jina auth token create video_search ⤶&lt;br&gt;╭───────────── 🎉 New token created ─────────────╮&lt;br&gt;│ 54f0f0ef5d514ca1908698fc6d9555a5               │&lt;br&gt;│                                                │&lt;br&gt;│ You can &lt;span class="hljs-built_in"&gt;set&lt;/span&gt; it as an env var JINA_AUTH_TOKEN   │&lt;br&gt;╰────── ☝️  This token is only shown once! ───────╯&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Your token should look something like this: &lt;code&gt;54f0f0ef5d514ca1908698fc6d9555a5&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Keep your token in an environment variable, or a Python variable if you're using a notebook. You will need it later.&lt;/p&gt;
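&lt;p&gt;For example, if you saved the token in the &lt;code&gt;JINA_AUTH_TOKEN&lt;/code&gt; environment variable suggested by the CLI output above, you can read it back in Python. This is a minimal sketch; the fallback placeholder is just the illustrative token from this article:&lt;/p&gt;

```python
import os

# Read the CLIP-as-service token saved earlier; fall back to the
# illustrative placeholder token from this article if the variable is unset.
jina_auth_token = os.environ.get("JINA_AUTH_TOKEN", "54f0f0ef5d514ca1908698fc6d9555a5")
print(jina_auth_token)
```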

&lt;h2&gt;
&lt;span&gt;&lt;/span&gt;&lt;span&gt;Getting the Data&lt;/span&gt;&lt;span&gt;&lt;/span&gt;
&lt;/h2&gt;

&lt;p&gt;Loading MP4 video takes just one line of code with DocArray:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; docarray &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; Document&lt;br&gt;&lt;br&gt;video_uri = &lt;span class="hljs-string"&gt;"https://archive.org/download/santa-clause-conquers-the-martians/Santa%20Clause%20Conquers%20The%20Martians.ia.mp4"&lt;/span&gt;&lt;br&gt;&lt;br&gt;video_data = Document(uri=video_uri).load_uri_to_video_tensor()&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Loading the trailer to the 1964 film, &lt;em&gt;Santa Claus Conquers the Martians&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This downloads the trailer to the public domain film &lt;em&gt;&lt;a href="https://www.imdb.com/title/tt0058548/" rel="noopener noreferrer"&gt;Santa Claus Conquers the Martians&lt;/a&gt;&lt;/em&gt; from the &lt;a href="https://archive.org/details/santa-clause-conquers-the-martians" rel="noopener noreferrer"&gt;Internet Archive&lt;/a&gt;. You can substitute another URL or a file path to a local file to use your own video instead.&lt;/p&gt;

&lt;p&gt;The video itself is stored as a &lt;code&gt;numpy&lt;/code&gt; array in the &lt;code&gt;Document.tensor&lt;/code&gt; attribute of the object:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;print(video_data.tensor.shape)&lt;br&gt;&lt;br&gt;(&lt;span class="hljs-number"&gt;4264&lt;/span&gt;, &lt;span class="hljs-number"&gt;640&lt;/span&gt;, &lt;span class="hljs-number"&gt;464&lt;/span&gt;, &lt;span class="hljs-number"&gt;3&lt;/span&gt;)&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The video has 4,264 frames (the first dimension of the tensor), each measuring 640 pixels by 464 pixels (the second and third dimensions), and each pixel has three color channels (conventional RGB in the fourth dimension).&lt;/p&gt;
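&lt;p&gt;To make those dimensions concrete, here is a minimal numpy sketch of unpacking the shape. The array here is a zero-filled stand-in for the real tensor, and the 24 frames-per-second playback rate is an assumption for illustration only; the actual rate depends on the file:&lt;/p&gt;

```python
import numpy as np

# Stand-in for video_data.tensor: (frames, height, width, RGB channels).
video_tensor = np.zeros((4264, 640, 464, 3), dtype=np.uint8)

n_frames, height, width, channels = video_tensor.shape
print(n_frames, height, width, channels)  # 4264 640 464 3

# Assuming 24 frames per second, the trailer runs roughly three minutes:
print(round(n_frames / 24), "seconds")  # 178 seconds
```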

&lt;p&gt;You can play the video in a notebook with the &lt;code&gt;Document.display()&lt;/code&gt; method:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;video_data.display()&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://archive.org/download/santa-clause-conquers-the-martians/Santa%20Clause%20Conquers%20The%20Martians.ia.mp4" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Ffiles.mdnice.com%2Fuser%2F39412%2Fb88fca35-e102-4cd3-9291-858b9445062d.png" alt="click on the image to Watch the video" width="741" height="510"&gt;&lt;/a&gt;click on the image to Watch the video&lt;/p&gt;

&lt;p&gt;You can also use DocArray to view individual frames by their frame number. For example, frame #1400:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; numpy&lt;br&gt;&lt;br&gt;Document(tensor=numpy.rot90(video_data.tensor[&lt;span class="hljs-number"&gt;1400&lt;/span&gt;], &lt;span class="hljs-number"&gt;-1&lt;/span&gt;)).display()&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F12%2Fframe1400.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F12%2Fframe1400.png" alt="Frame 1400 of the trailer to _Santa Claus conquers the Martians_" width="640" height="464"&gt;&lt;/a&gt;Frame 1400 of the trailer to &lt;em&gt;Santa Claus conquers the Martians&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And we can extract clips from the video by giving specific start and end frame numbers:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;clip_data = video_data.tensor[&lt;span class="hljs-number"&gt;3000&lt;/span&gt;:&lt;span class="hljs-number"&gt;3300&lt;/span&gt;]&lt;br&gt;Document(tensor=clip_data).save_video_tensor_to_file(&lt;span class="hljs-string"&gt;"clip.mp4"&lt;/span&gt;)&lt;br&gt;Document(uri=&lt;span class="hljs-string"&gt;"clip.mp4"&lt;/span&gt;).display()&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://jina-ai-gmbh.ghost.io/content/media/2022/12/clip-1.mp4" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F12%2Fmedia-thumbnail-ember331.jpg" alt="click the image, Watch the video" width="640" height="464"&gt;&lt;/a&gt;click the image, Watch the video&lt;/p&gt;

&lt;h2&gt;
&lt;span&gt;&lt;/span&gt;&lt;span&gt;Extracting Keyframes&lt;/span&gt;&lt;span&gt;&lt;/span&gt;
&lt;/h2&gt;

&lt;p&gt;The procedure we’re using to search in videos is to extract &lt;em&gt;keyframes&lt;/em&gt; and then perform our search on just those still images.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is a keyframe?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A keyframe is the first frame after a break in a video's smooth frame-to-frame transitions; this corresponds to a &lt;em&gt;cut&lt;/em&gt; in film editing. We can identify keyframes by going through the video frame by frame and comparing each frame to the next. If the difference between two consecutive frames exceeds some threshold, we take that to mean there was a cut, and the later frame is a keyframe.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F11%2FKeyframe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F11%2FKeyframe.png" alt="" width="800" height="174"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A visual illustration of keyframes.&lt;/p&gt;
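&lt;p&gt;DocArray handles this detection for you, as shown below, but the idea can be sketched in a few lines of numpy. This is an illustrative toy, not DocArray's actual implementation, and the threshold value is an assumption you would need to tune per video:&lt;/p&gt;

```python
import numpy as np

def find_keyframes(frames, threshold=30.0):
    """Return indices of frames that follow a cut.

    frames: array of shape (n_frames, height, width, channels).
    A frame counts as a keyframe when its mean absolute pixel
    difference from the previous frame exceeds the threshold.
    """
    keyframes = [0]  # the first frame always starts a shot
    for i in range(1, len(frames)):
        diff = np.abs(frames[i].astype(float) - frames[i - 1].astype(float)).mean()
        if diff > threshold:
            keyframes.append(i)
    return keyframes

# Toy video: 10 black frames, then 10 white frames, i.e. one cut at frame 10.
video = np.concatenate([
    np.zeros((10, 4, 4, 3), dtype=np.uint8),
    np.full((10, 4, 4, 3), 255, dtype=np.uint8),
])
print(find_keyframes(video))  # [0, 10]
```

&lt;p&gt;Real cut detectors are more sophisticated (comparing color histograms, for instance, to tolerate camera motion), but the frame-differencing idea is the same.&lt;/p&gt;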

&lt;p&gt;DocArray will automatically collect keyframes as it loads the video with the method &lt;code&gt;Document.load_uri_to_video_tensor()&lt;/code&gt; and store them in the &lt;code&gt;Document.tags&lt;/code&gt; dictionary under the key &lt;code&gt;keyframe_indices&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;print(video_data.tags[&lt;span class="hljs-string"&gt;'keyframe_indices'&lt;/span&gt;])&lt;br&gt;&lt;br&gt;[&lt;span class="hljs-number"&gt;0&lt;/span&gt;, &lt;span class="hljs-number"&gt;25&lt;/span&gt;, &lt;span class="hljs-number"&gt;196&lt;/span&gt;, &lt;span class="hljs-number"&gt;261&lt;/span&gt;, &lt;span class="hljs-number"&gt;325&lt;/span&gt;, &lt;span class="hljs-number"&gt;395&lt;/span&gt;, &lt;span class="hljs-number"&gt;478&lt;/span&gt;, &lt;span class="hljs-number"&gt;534&lt;/span&gt;, &lt;span class="hljs-number"&gt;695&lt;/span&gt;, &lt;span class="hljs-number"&gt;728&lt;/span&gt;, &lt;span class="hljs-number"&gt;840&lt;/span&gt;, &lt;span class="hljs-number"&gt;1019&lt;/span&gt;, &lt;span class="hljs-number"&gt;1059&lt;/span&gt;, &lt;span class="hljs-number"&gt;1131&lt;/span&gt;, &lt;span class="hljs-number"&gt;1191&lt;/span&gt;, &lt;span class="hljs-number"&gt;1245&lt;/span&gt;, &lt;span class="hljs-number"&gt;1340&lt;/span&gt;, &lt;span class="hljs-number"&gt;1389&lt;/span&gt;, &lt;span class="hljs-number"&gt;1505&lt;/span&gt;, &lt;span class="hljs-number"&gt;1573&lt;/span&gt;, &lt;span class="hljs-number"&gt;1631&lt;/span&gt;, &lt;span class="hljs-number"&gt;1674&lt;/span&gt;, &lt;span class="hljs-number"&gt;1750&lt;/span&gt;, &lt;span class="hljs-number"&gt;1869&lt;/span&gt;, &lt;span class="hljs-number"&gt;1910&lt;/span&gt;, &lt;span class="hljs-number"&gt;2010&lt;/span&gt;, &lt;span class="hljs-number"&gt;2105&lt;/span&gt;, &lt;span class="hljs-number"&gt;2184&lt;/span&gt;, &lt;span class="hljs-number"&gt;2248&lt;/span&gt;, &lt;span class="hljs-number"&gt;2335&lt;/span&gt;, &lt;span class="hljs-number"&gt;2585&lt;/span&gt;, &lt;span class="hljs-number"&gt;2618&lt;/span&gt;, &lt;span class="hljs-number"&gt;2648&lt;/span&gt;, &lt;span class="hljs-number"&gt;2706&lt;/span&gt;, &lt;span class="hljs-number"&gt;2756&lt;/span&gt;, &lt;span class="hljs-number"&gt;2788&lt;/span&gt;, &lt;span 
class="hljs-number"&gt;2906&lt;/span&gt;, &lt;span class="hljs-number"&gt;2950&lt;/span&gt;, &lt;span class="hljs-number"&gt;3050&lt;/span&gt;, &lt;span class="hljs-number"&gt;3100&lt;/span&gt;, &lt;span class="hljs-number"&gt;3128&lt;/span&gt;, &lt;span class="hljs-number"&gt;3190&lt;/span&gt;, &lt;span class="hljs-number"&gt;3216&lt;/span&gt;, &lt;span class="hljs-number"&gt;3314&lt;/span&gt;, &lt;span class="hljs-number"&gt;3356&lt;/span&gt;, &lt;span class="hljs-number"&gt;3421&lt;/span&gt;, &lt;span class="hljs-number"&gt;3514&lt;/span&gt;, &lt;span class="hljs-number"&gt;3586&lt;/span&gt;, &lt;span class="hljs-number"&gt;3760&lt;/span&gt;, &lt;span class="hljs-number"&gt;3828&lt;/span&gt;, &lt;span class="hljs-number"&gt;4078&lt;/span&gt;]&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The frame numbers of the keyframes are stored in the &lt;code&gt;Document.tags&lt;/code&gt; dictionary under the key &lt;code&gt;keyframe_indices&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
&lt;span&gt;&lt;/span&gt;&lt;span&gt;Performing Search&lt;/span&gt;&lt;span&gt;&lt;/span&gt;
&lt;/h2&gt;

&lt;p&gt;First, extract all the keyframes as images, and put each one into its own &lt;code&gt;Document&lt;/code&gt;. Then compile all the frames into a &lt;code&gt;DocumentArray&lt;/code&gt; object:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; docarray &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; Document, DocumentArray&lt;br&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; numpy &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; rot90&lt;br&gt;&lt;br&gt;keyframe_indices = video_data.tags[&lt;span class="hljs-string"&gt;'keyframe_indices'&lt;/span&gt;]&lt;br&gt;keyframes = DocumentArray()&lt;br&gt;&lt;span class="hljs-keyword"&gt;for&lt;/span&gt; idx &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; range(&lt;span class="hljs-number"&gt;0&lt;/span&gt;, len(keyframe_indices) - &lt;span class="hljs-number"&gt;1&lt;/span&gt;):&lt;br&gt; keyframe_number = keyframe_indices[idx]&lt;br&gt;    keyframe_tensor = rot90(video_data.tensor[keyframe_number], &lt;span class="hljs-number"&gt;-1&lt;/span&gt;)&lt;br&gt;    clip_indices = {&lt;br&gt;        &lt;span class="hljs-string"&gt;'start'&lt;/span&gt;: str(keyframe_number),&lt;br&gt;        &lt;span class="hljs-string"&gt;'end'&lt;/span&gt;: str(keyframe_indices[idx + &lt;span class="hljs-number"&gt;1&lt;/span&gt;]),&lt;br&gt;    }&lt;br&gt;    keyframe = Document(tags=clip_indices, tensor=keyframe_tensor)&lt;br&gt;    keyframes.append(keyframe)&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The code above uses the &lt;code&gt;Document.tags&lt;/code&gt; dictionary to store the frame number (as &lt;code&gt;start&lt;/code&gt;) and the frame number of the next keyframe (as &lt;code&gt;end&lt;/code&gt;) so that we can extract a video clip corresponding to that keyframe.&lt;/p&gt;

&lt;p&gt;Then access CLIP-as-service, passing it the query text – in the example &lt;em&gt;"Santa Claus"&lt;/em&gt; – and the collection of keyframe images, and it will return the images ranked by how well they match the query text:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; docarray &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; Document, DocumentArray&lt;br&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; clip_client &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; Client&lt;br&gt;&lt;br&gt;server_url = &lt;span class="hljs-string"&gt;"grpcs://api.clip.jina.ai:2096"&lt;/span&gt;&lt;br&gt;&lt;br&gt;&lt;span class="hljs-comment"&gt;# substitute your own token in the line below!&lt;/span&gt;&lt;br&gt;jina_auth_token = &lt;span class="hljs-string"&gt;"54f0f0ef5d514ca1908698fc6d9555a5"&lt;/span&gt;&lt;br&gt;&lt;br&gt;client = Client(server_url,&lt;br&gt;                credential={&lt;span class="hljs-string"&gt;"Authorization"&lt;/span&gt;: jina_auth_token})&lt;br&gt;query = Document(text=&lt;span class="hljs-string"&gt;"Santa Claus"&lt;/span&gt;, matches=keyframes)&lt;br&gt;ranked_result = client.rank([query])[&lt;span class="hljs-number"&gt;0&lt;/span&gt;]&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We transmit the query and the keyframe images to Jina AI Cloud, where CLIP-as-service calculates an embedding vector for the text query and for each keyframe. It then measures the distance between each keyframe vector and the text query vector, and returns the keyframes ordered by their proximity to the text query in the embedding space.&lt;/p&gt;

&lt;p&gt;You can see this represented in the figure below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F11%2Fsanta-and-martians-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F11%2Fsanta-and-martians-1.png" alt="" width="674" height="693"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;CLIP translates texts and images into vectors in a common embedding space where the distance between them reflects their semantic similarity. The embedding vectors of images with Santa Claus are much closer to the vector for the text "Santa Claus" than other images.&lt;/p&gt;
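&lt;p&gt;The ranking step amounts to a nearest-neighbor search in that embedding space. Here is a minimal sketch with made-up three-dimensional vectors and cosine similarity as the metric; real CLIP embeddings have hundreds of dimensions, and the function name here is hypothetical:&lt;/p&gt;

```python
import numpy as np

def rank_by_similarity(query_vec, candidate_vecs):
    """Return candidate indices sorted best-match-first by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q  # cosine similarity of each candidate to the query
    return [int(i) for i in np.argsort(-scores)]

# Toy 3-d "embeddings": candidate 2 points almost the same way as the query.
query = np.array([1.0, 0.0, 0.0])
candidates = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.5, 0.5, 0.0],   # partial match
    [0.9, 0.1, 0.0],   # close match
])
print(rank_by_similarity(query, candidates))  # [2, 1, 0]
```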

&lt;p&gt;The query reorders the keyframe images in &lt;code&gt;Document.matches&lt;/code&gt; in order from the best match to the worst.&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;print(ranked_result.matches[&lt;span class="hljs-number"&gt;0&lt;/span&gt;].tags)&lt;br&gt;&lt;br&gt;{&lt;span class="hljs-string"&gt;'start'&lt;/span&gt;: &lt;span class="hljs-string"&gt;'2105'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'end'&lt;/span&gt;: &lt;span class="hljs-string"&gt;'2184'&lt;/span&gt;}&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can see in the &lt;code&gt;Document.tags&lt;/code&gt; section that we've retained the information about this keyframe: It is frame #2105 and the next keyframe is at #2184. With this information, we can get the short video clip that this matches:&lt;/p&gt;

&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;match = ranked_result.matches[&lt;span class="hljs-number"&gt;0&lt;/span&gt;]&lt;br&gt;start_frame = int(match.tags[&lt;span class="hljs-string"&gt;'start'&lt;/span&gt;])&lt;br&gt;end_frame = int(match.tags[&lt;span class="hljs-string"&gt;'end'&lt;/span&gt;])&lt;br&gt;clip_data = video_data.tensor[start_frame:end_frame] &lt;br&gt;Document(tensor=clip_data).save_video_tensor_to_file(&lt;span class="hljs-string"&gt;"match.mp4"&lt;/span&gt;)&lt;br&gt;Document(uri=&lt;span class="hljs-string"&gt;"match.mp4"&lt;/span&gt;).display()&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://jina-ai-gmbh.ghost.io/content/media/2022/12/match.mp4" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F12%2Fmedia-thumbnail-ember857.jpg" alt="click the image, Watch the video" width="640" height="464"&gt;&lt;/a&gt;click the image, Watch the video&lt;/p&gt;

&lt;p&gt;The top five clips all contain Santa Claus:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Ffiles.mdnice.com%2Fuser%2F39412%2F98c20fcb-c940-4dd7-9ba5-9a9eed591579.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Ffiles.mdnice.com%2Fuser%2F39412%2F98c20fcb-c940-4dd7-9ba5-9a9eed591579.png" alt="" width="771" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This quick and simple technique brings very sophisticated multimodal AI to users with only a standard computer setup. CLIP-as-service is a powerful tool for anyone who needs to search through large volumes of digital media to find what they’re looking for. It saves time and helps you get the most out of your digital media collection.&lt;/p&gt;

&lt;p&gt;So the next time you need to find Santa Claus, CLIP-as-service is here to help you look!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F11%2FDALL-E-2022-11-28-18.01.17---Santa-Claus-with-Martians--painted-in-the-style-of-Norman-Rockwell.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F11%2FDALL-E-2022-11-28-18.01.17---Santa-Claus-with-Martians--painted-in-the-style-of-Norman-Rockwell.png" alt="“Santa Claus with Martians, painted in the style of Norman Rockwell” according to DALL-E 2." width="800" height="873"&gt;&lt;/a&gt;“Santa Claus with Martians, painted in the style of Norman Rockwell” according to &lt;a href="https://openai.com/dall-e-2/" rel="noopener noreferrer"&gt;DALL-E 2&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
&lt;span&gt;&lt;/span&gt;&lt;span&gt;Author:&lt;/span&gt;&lt;span&gt;&lt;/span&gt;
&lt;/h3&gt;

&lt;p&gt;Jie Fu, Scott Martens&lt;/p&gt;

&lt;h3&gt;
&lt;span&gt;&lt;/span&gt;&lt;span&gt;Original link:&lt;/span&gt;&lt;span&gt;&lt;/span&gt;
&lt;/h3&gt;

&lt;p&gt;https://jina.ai/news/guide-using-opentelemetry-jina-monitoring-tracing-applications/&lt;/p&gt;

</description>
      <category>marketing</category>
    </item>
  </channel>
</rss>
