<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: wree</title>
    <description>The latest articles on DEV Community by wree (@_1f0995eba7c81ed78c499).</description>
    <link>https://dev.to/_1f0995eba7c81ed78c499</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2569341%2F2c9a7b72-ffb0-4701-ac86-8dbe33b49ae0.jpg</url>
      <title>DEV Community: wree</title>
      <link>https://dev.to/_1f0995eba7c81ed78c499</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/_1f0995eba7c81ed78c499"/>
    <language>en</language>
    <item>
      <title>Issue with mismatched tensor sizes during training with DeepSpeed</title>
      <dc:creator>wree</dc:creator>
      <pubDate>Sat, 14 Dec 2024 07:56:50 +0000</pubDate>
      <link>https://dev.to/_1f0995eba7c81ed78c499/issue-with-mismatched-tensor-sizes-during-training-with-deepspeed-506k</link>
      <guid>https://dev.to/_1f0995eba7c81ed78c499/issue-with-mismatched-tensor-sizes-during-training-with-deepspeed-506k</guid>
      <description>&lt;p&gt;I'm currently training a model using Hugging Face's Trainer with DeepSpeed integration, and I'm encountering an error about mismatched tensor sizes. Specifically, I get the following error:&lt;/p&gt;

&lt;h2&gt;
  The size of tensor a (50) must match the size of tensor b (3) at non-singleton dimension 2
&lt;/h2&gt;

&lt;p&gt;I hope someone can help me fix it and share a working version, please! 😊😊😊&lt;/p&gt;

&lt;p&gt;&lt;a href="https://drive.google.com/file/d/1hjrvCaQFVE-EUk_-i_VEo950XY9F70tW/view?usp=drive_link" rel="noopener noreferrer"&gt;My data&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://colab.research.google.com/drive/1KUO_g239Ii7kdG_4j8L_jE0XbgrEm9uS?usp=sharing" rel="noopener noreferrer"&gt;Here is my setup:&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’ve checked that the input_ids and labels have the same shape. I've verified the batch size in both the Trainer configuration and the DeepSpeed config. I've also ensured that the model is correctly placed on the device (cuda or cpu).&lt;/p&gt;

&lt;p&gt;Maybe. I'm not sure.&lt;/p&gt;
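
&lt;p&gt;To make the shape check concrete, here is a minimal sketch of the comparison the error message implies. It is plain Python, not code from my notebook: the helper name first_broadcast_mismatch and the two example shapes are hypothetical, chosen only so that the sizes 50 and 3 meet at dimension 2. It assumes the usual broadcasting rule, where shapes are aligned from the last dimension and two sizes are compatible when they are equal or one of them is 1.&lt;/p&gt;

```python
from itertools import zip_longest


def first_broadcast_mismatch(shape_a, shape_b):
    """Return (dim, size_a, size_b) for the first incompatible pair of sizes,
    scanning from the last dimension, or None if the shapes can broadcast.

    Hypothetical helper for illustration; not part of any library.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Align the shapes from the right; a missing leading dimension counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for offset, (size_a, size_b) in enumerate(pairs):
        if size_a != size_b and size_a != 1 and size_b != 1:
            # Report the index counted from the left of the longer shape.
            return ndim - 1 - offset, size_a, size_b
    return None


# Hypothetical shapes (assumption): two tensors that disagree only in the last axis.
shape_a = (4, 16, 50)
shape_b = (4, 16, 3)
print(first_broadcast_mismatch(shape_a, shape_b))
```

&lt;p&gt;Feeding the helper the tuple form of tensor.shape for each pair of tensors that gets combined elementwise (not only input_ids and labels) may help narrow down where sizes 50 and 3 meet.&lt;/p&gt;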

</description>
      <category>ai</category>
      <category>python</category>
      <category>discuss</category>
      <category>development</category>
    </item>
  </channel>
</rss>
