<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Joy</title>
    <description>The latest articles on DEV Community by Joy (@joooyz).</description>
    <link>https://dev.to/joooyz</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F647073%2Fb08f045a-82e9-4256-8d7f-9c54a7df0805.png</url>
      <title>DEV Community: Joy</title>
      <link>https://dev.to/joooyz</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/joooyz"/>
    <language>en</language>
    <item>
      <title>I spent $15 in DALL·E 2 credits creating this AI image, and here’s what I learned</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Fri, 19 Aug 2022 05:19:48 +0000</pubDate>
      <link>https://dev.to/joooyz/i-spent-15-in-dalle-2-credits-creating-this-ai-image-and-heres-what-i-learned-4hl1</link>
      <guid>https://dev.to/joooyz/i-spent-15-in-dalle-2-credits-creating-this-ai-image-and-heres-what-i-learned-4hl1</guid>
      <description>&lt;p&gt;&lt;em&gt;Yes, that’s a llama dunking a basketball. A summary of the process, limitations, and lessons learned while experimenting with the closed Beta version of DALL·E 2.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--C9pB5XXZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2bp6jaqm1t1xol31qi40.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--C9pB5XXZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2bp6jaqm1t1xol31qi40.png" width="880" height="880"&gt;&lt;/a&gt;&lt;br&gt;Llama playing basketball, generated using DALL·E 2 by author.
  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This article was originally published by me on &lt;a href="https://pub.towardsai.net/i-spent-15-in-dall-e-2-credits-creating-this-ai-image-and-heres-what-i-learned-52f352912025"&gt;Medium&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;I’ve been dying to try DALL·E 2 ever since I first saw this &lt;a href="https://twitter.com/hardmaru/status/1522166259890151424"&gt;artificially generated image of a “Shiba Inu Bento Box”&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Wow — now that’s disruptive technology.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For those of you unfamiliar, &lt;strong&gt;DALL·E 2 is a system created by OpenAI that can generate original images from text.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It’s currently in closed Beta — I signed up for the waitlist in early May and got access at the end of July. During the Beta, users receive credits (50 free in the first month, then 15 free credits every month after); each generation costs 1 credit and produces 3–4 images. You can also purchase 115 credits for US$15.&lt;/p&gt;
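&lt;p&gt;&lt;em&gt;To put the pricing in perspective, here’s a rough back-of-envelope sketch (my own arithmetic based on the Beta pricing above, not an official calculator):&lt;/em&gt;&lt;/p&gt;

```python
# Rough DALL·E 2 Beta credit math (pricing as described above):
# 1 credit per prompt, 3-4 images per prompt, 115 credits for US$15.
PACK_PRICE_USD = 15
PACK_CREDITS = 115
IMAGES_PER_PROMPT = 4  # best case

cost_per_prompt = PACK_PRICE_USD / PACK_CREDITS
cost_per_image = cost_per_prompt / IMAGES_PER_PROMPT

print(f"~${cost_per_prompt:.2f} per prompt, ~${cost_per_image:.2f} per image")
```

&lt;p&gt;&lt;em&gt;At roughly 3 cents per image in the best case, the cost adds up quickly once trial-and-error enters the picture.&lt;/em&gt;&lt;/p&gt;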

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;P.S. If you can’t wait to try it, give &lt;a href="https://huggingface.co/spaces/dalle-mini/dalle-mini"&gt;DALL·E mini&lt;/a&gt; a go for free. However, the quality of its images is generally poorer (giving rise to a &lt;a href="https://www.wired.com/story/dalle-ai-meme-machine/"&gt;host of DALL·E memes&lt;/a&gt;), and it takes about 60 seconds per prompt (DALL·E 2, in comparison, takes only 5 seconds or so).&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You’ve probably seen various cherry-picked images online showing what DALL·E 2 is capable of (provided the right creative prompt). In this article, I share a candid walkthrough of what it takes to create a usable image from scratch for the subject matter: “a llama playing basketball”. &lt;strong&gt;You might find it useful if you’re thinking of trying out DALL·E 2 yourself, or you’re just interested in understanding what it’s capable of.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The starting point
&lt;/h2&gt;

&lt;p&gt;There’s both an art and science to knowing what prompt to feed DALL·E 2. To illustrate, here are the results for “&lt;em&gt;llama playing basketball&lt;/em&gt;”:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yEOOF-AA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dwbx1aamivz5t14zrrm5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yEOOF-AA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dwbx1aamivz5t14zrrm5.png" width="880" height="230"&gt;&lt;/a&gt;&lt;br&gt;Images generated by the author using DALL·E 2 with prompt “llama playing basketball.”
  &lt;/p&gt;

&lt;p&gt;Why is DALL·E 2 inclined to generate cartoon images for this prompt? I assume it’s because it saw few, if any, real images of a llama playing basketball during training.&lt;/p&gt;

&lt;p&gt;I attempted to go a step further by adding the key term ‘&lt;em&gt;realistic photo of&lt;/em&gt;’:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YWBqn1s4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/98p4ojbutj84i0l009zy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YWBqn1s4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/98p4ojbutj84i0l009zy.png" width="880" height="216"&gt;&lt;/a&gt;&lt;br&gt;Images generated by the author using DALL·E 2 with prompt “realistic photo of llama playing basketball”
  &lt;/p&gt;

&lt;p&gt;That llama’s looking more photorealistic, but the whole image is starting to look like a botched Photoshop job. In this case, DALL·E 2 clearly needed some hand-holding to create a cohesive scene.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompt engineering, aka the art of specifying exactly what you want
&lt;/h2&gt;

&lt;p&gt;In the context of DALL·E, &lt;strong&gt;prompt engineering refers to the process of designing prompts to give you the desired results.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://dallery.gallery/the-dalle-2-prompt-book/"&gt;DALL·E 2 Prompt Book&lt;/a&gt; is a fantastic resource for this. It contains a detailed list of inspirations for prompts using keywords from photography and art.&lt;/p&gt;

&lt;p&gt;Why is something like this necessary? &lt;strong&gt;Because getting a usable output from DALL·E 2 is finicky&lt;/strong&gt; (especially when you’re not sure what DALL·E 2 is capable of). So much so that a &lt;a href="https://techcrunch.com/2022/07/29/a-startup-is-charging-1-99-for-strings-of-text-to-feed-to-dall-e-2/"&gt;new startup is creating a marketplace charging $1.99 for prompts&lt;/a&gt; to save you the time and money from coming up with your own.&lt;/p&gt;
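&lt;p&gt;&lt;em&gt;One way I made the trial-and-error less random was to treat each prompt as a subject plus a stack of modifiers. Here’s a minimal sketch of that idea (a hypothetical helper of my own, not part of any DALL·E tooling):&lt;/em&gt;&lt;/p&gt;

```python
# Hypothetical helper: compose a DALL·E 2 prompt from a subject plus
# style/camera/lighting modifiers, so variants are easy to iterate on.
def build_prompt(subject, *modifiers):
    return ", ".join((subject,) + modifiers)

prompt = build_prompt(
    "film still of a llama dunking a basketball",
    "low angle",
    "extreme long shot",
    "indoors",
    "dramatic backlighting",
)
print(prompt)
```

&lt;p&gt;&lt;em&gt;Swapping a single modifier (say, ‘dramatic backlighting’ for ‘vaporwave’) then only costs one line, which makes it easier to see what each keyword actually contributes.&lt;/em&gt;&lt;/p&gt;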

&lt;p&gt;My personal favorite find is “&lt;em&gt;dramatic backlighting&lt;/em&gt;”:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jBPQp1dG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6tccem0nj2l8y8mm8tzw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jBPQp1dG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6tccem0nj2l8y8mm8tzw.png" width="834" height="836"&gt;&lt;/a&gt;&lt;br&gt;Now we’re talking! Images generated by the author using DALL·E 2 with prompt: “Film still of a llama dunking a basketball, low angle, extreme long shot, indoors, dramatic backlighting.”
  &lt;/p&gt;

&lt;p&gt;It’s important to tell DALL·E 2 &lt;strong&gt;exactly&lt;/strong&gt; what you want. Apparently, it’s not obvious from the context that this llama should be dressed for the occasion. However, DALL·E 2 does a great job realizing this fantasy scene when ‘&lt;em&gt;llama wearing a jersey&lt;/em&gt;’ is specified:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xx1WYuG2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j5lmj4z6x9oe9ajcbu5g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xx1WYuG2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j5lmj4z6x9oe9ajcbu5g.png" width="875" height="212"&gt;&lt;/a&gt;&lt;br&gt;Basketball dunking llama, now comes with jerseys. Images generated by author with DALL·E 2 using prompt: “film still of an alpaca wearing a jersey, dunking a basketball, low angle, long shot, indoors, dramatic backlighting, high detail.”
  &lt;/p&gt;

&lt;p&gt;It doesn’t stop there. To add some drama to the image and really get this llama flying, I needed to specify phrases such as ‘&lt;em&gt;dunking a basketball&lt;/em&gt;', ‘&lt;em&gt;action shot of…&lt;/em&gt;’, or my personal favorite: “&lt;em&gt;…llama in a jersey dunking a basketball like Michael Jordan&lt;/em&gt;”:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--b8pv3SY---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nbbpkh0ee02w21rr3o02.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--b8pv3SY---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nbbpkh0ee02w21rr3o02.png" width="880" height="214"&gt;&lt;/a&gt;&lt;br&gt;Michael Jordan — if he was a llama, according to DALL·E 2. Images generated by author with DALL·E 2 using prompt “film still of a llama in a jersey dunking a basketball like Michael Jordan, low angle, show from below, tilted frame, 35°, Dutch angle, extreme long shot, high detail, indoors, dramatic backlighting.”.
  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tip: DALL·E 2 only stores the previous 50 generations in your history tab. Make sure to save your favourite images as you go.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  You might have noticed: DALL·E 2 isn’t great at composition.
&lt;/h2&gt;

&lt;p&gt;You’d think that from the context of ‘dunking a basketball,’ it’d be obvious what the relative positions of the llama, ball, and hoop should be. More often than not, the llama dunks the wrong way, or the ball is positioned in such a way that the llama has no real hope of making the shot. &lt;strong&gt;Though all the elements of the prompt are there, DALL·E 2 doesn’t truly ‘understand’ the relationship between them.&lt;/strong&gt; &lt;a href="https://www.unite.ai/is-dall-e-2-just-gluing-things-together-without-understanding-their-relationships/"&gt;This article covers the topic in more depth&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uRXi6cbE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s7g7idcyo6gmjr5vvexu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uRXi6cbE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s7g7idcyo6gmjr5vvexu.png" width="875" height="213"&gt;&lt;/a&gt;&lt;br&gt;Image generated by author using DALL·E 2 with prompt: “Film still of a llama in a jersey dunking a basketball like Michael Jordan, low angle, shot from below, tilted frame, 35°, Dutch angle, extreme long shot, high detail, indoors, dramatic backlighting.”
  &lt;/p&gt;

&lt;p&gt;Another artifact of DALL·E 2 not really ‘understanding’ the scene is the occasional mix-up in textures. In the image below, the net is made out of fur (a morbid scene once you think about it):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RtsuwLOT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bryuh9hlcpisljaev4fm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RtsuwLOT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bryuh9hlcpisljaev4fm.png" width="880" height="880"&gt;&lt;/a&gt;&lt;br&gt;Image generated by author using DALL·E 2 with prompt: “Expressive photo of a llama wearing a jersey dunking a basketball like Michael Jordan, low angle, extreme wide shot, indoors, dramatic backlighting, high detail.”
  &lt;/p&gt;

&lt;h2&gt;
  
  
  DALL·E 2 struggles to generate realistic faces
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://spectrum.ieee.org/openai-dall-e-2"&gt;According to some sources&lt;/a&gt;, this may have been a deliberate attempt to avoid generating deepfakes. I thought that would only apply to human subjects, but apparently, it applies to llamas too.&lt;/p&gt;

&lt;p&gt;Some of the results were downright creepy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8m0zD-dP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kneax86idpz1ty7z60ks.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8m0zD-dP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kneax86idpz1ty7z60ks.png" width="875" height="875"&gt;&lt;/a&gt;&lt;br&gt;Image generated by author using DALL·E 2 with prompt: “Dramatic photo of an llama wearing a jersey dunking a basketball like Michael Jordan, low angle, wide shot, indoors, dramatic backlighting, high detail.”
  &lt;/p&gt;

&lt;h2&gt;
  
  
  Some other limitations of DALL·E 2
&lt;/h2&gt;

&lt;p&gt;Here are some other minor issues I experienced:&lt;/p&gt;

&lt;h3&gt;
  
  
  Angles and shots are interpreted loosely
&lt;/h3&gt;

&lt;p&gt;No matter how many variants of ‘&lt;em&gt;in the distance&lt;/em&gt;’ or ‘&lt;em&gt;extreme long shot&lt;/em&gt;’ I used, it was difficult to find images where the entire llama fit within the frame.&lt;/p&gt;

&lt;p&gt;In some cases, the framing was ignored entirely:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YUVNxc6i--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/uzkseqxw7tue9wnd5fny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YUVNxc6i--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/uzkseqxw7tue9wnd5fny.png" width="880" height="216"&gt;&lt;/a&gt;&lt;br&gt;Image generated by the author using DALL·E 2 with prompt: “Dramatic film still of a llama wearing a jersey dunking a basketball, low angle, shot from below, tilted frame, 35°, Dutch angle, extreme long shot, indoors, dramatic backlighting, high detail.”
  &lt;/p&gt;

&lt;h3&gt;
  
  
  DALL·E 2 can’t spell
&lt;/h3&gt;

&lt;p&gt;I guess this shouldn’t be too surprising given that DALL·E 2 struggles to ‘understand’ the relationship between components. It is, however, capable of attempting some fully formed letters in the right context:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ks0pnCCy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4zto8dz6h1r4ggbshg1g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ks0pnCCy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4zto8dz6h1r4ggbshg1g.png" width="875" height="875"&gt;&lt;/a&gt;&lt;br&gt;Image generated by author using DALL·E 2 with prompt: “Film still of a fluffy llama in a jersey dunking a basketball like Michael Jordan, low angle, shot from below, tilted frame, 35°, Dutch angle, extreme long shot, high detail, indoors, dramatic backlighting.”
  &lt;/p&gt;

&lt;h3&gt;
  
  
  DALL·E 2 can be temperamental with complex or poorly worded prompts
&lt;/h3&gt;

&lt;p&gt;Occasionally, adding keywords or phrasing the prompt in certain ways led to results that were completely different from what was expected.&lt;/p&gt;

&lt;p&gt;In this case, the real subject of the prompt (llama wearing a jersey) was completely ignored:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--C_1Vwd-Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bt4eje832evb9vyxe0oz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--C_1Vwd-Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bt4eje832evb9vyxe0oz.png" width="880" height="216"&gt;&lt;/a&gt;&lt;br&gt;Now that is an impressive dunk. Images generated by author using DALL·E 2 with prompt: “A low angle, long shot, indoors, dramatic backlighting, professional photo of a llama wearing a jersey, dunking a basketball.”
  &lt;/p&gt;

&lt;p&gt;Even adding the term ‘fluffy’ led to dramatically worse performance and multiple cases where it looked like DALL·E 2 just… &lt;em&gt;broke&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--UhFyaZcN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ohm658r8ksgrx5tz13fv.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--UhFyaZcN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ohm658r8ksgrx5tz13fv.jpeg" width="875" height="212"&gt;&lt;/a&gt;&lt;br&gt;Images generated by the author using DALL·E 2 with prompt: “Film still of a fluffy llama in a jersey dunking a basketball like Michael Jordan, high detail, indoors, dramatic backlighting.” (Image intentionally modified to blur and hide faces).
  &lt;/p&gt;

&lt;p&gt;When working with DALL·E 2, it’s important to be specific about what you want &lt;strong&gt;without&lt;/strong&gt; overstuffing the prompt or adding redundant words.&lt;/p&gt;

&lt;h2&gt;
  
  
  DALL·E 2’s ability to transfer styles is impressive
&lt;/h2&gt;

&lt;p&gt;You need to try this!&lt;/p&gt;

&lt;p&gt;Once you have your keyword subject matter, you can generate the image in an impressive number of other art styles.&lt;/p&gt;

&lt;h3&gt;
  
  
  ‘Abstract painting of…’
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bTc8tgFy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/e0wo7pt1cw9rrj3krx9n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bTc8tgFy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/e0wo7pt1cw9rrj3krx9n.png" width="880" height="214"&gt;&lt;/a&gt;&lt;br&gt;Images generated by the author using DALL·E 2 with prompt: “Abstract painting of a llama in a jersey dunking a basketball like Michael Jordan, shot from below, tilted frame, 35°, Dutch angle, extreme long shot, high detail, dramatic backlighting, indoors. In the background is a stadium full of people.”&lt;br&gt;

  &lt;/p&gt;

&lt;h3&gt;
  
  
  ‘Vaporwave’
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kbE4UKK_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jajouu8yjjsbwnq38gwo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kbE4UKK_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jajouu8yjjsbwnq38gwo.png" width="880" height="214"&gt;&lt;/a&gt;&lt;br&gt;Images generated by the author using DALL·E 2 with prompt: “Film still of a llama in a jersey dunking a basketball like Michael Jordan, dramatic backlighting, vibrant sunset, vaporwave.”&lt;br&gt;

  &lt;/p&gt;

&lt;h3&gt;
  
  
  ‘Digital art’
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nZ07LH5B--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/opd36n5yxem0vujvssf1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nZ07LH5B--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/opd36n5yxem0vujvssf1.png" width="875" height="215"&gt;&lt;/a&gt;&lt;br&gt;Images generated by the author using DALL·E 2 with prompt: “llama in a jersey dunking a basketball like Michael Jordan, shot from below, tilted frame, 35°, Dutch angle, extreme long shot, high detail, dramatic backlighting, epic, digital art”&lt;br&gt;

  &lt;/p&gt;

&lt;h3&gt;
  
  
  ‘Screenshots from the Miyazaki anime movie’
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--86uEK0LG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rn2y4vc6eq2bo1qf8ys3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--86uEK0LG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rn2y4vc6eq2bo1qf8ys3.png" width="875" height="212"&gt;&lt;/a&gt;&lt;br&gt;Images generated by the author using DALL·E 2 with prompt: “Llama in a jersey dunking a basketball like Michael Jordan, screenshots from the Miyazaki anime movie”. Thanks to the tip in &lt;a href="https://www.lesswrong.com/posts/uKp6tBFStnsvrot5t/what-dall-e-2-can-and-cannot-do#Art_style_transfer"&gt;this article&lt;/a&gt;.&lt;br&gt;

  &lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;After over 100 credits (~US$13) and a lot of trial-and-error, here’s my final image:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5XemmXSV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6tve9jxsyb0jcg18207g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5XemmXSV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6tve9jxsyb0jcg18207g.png" width="880" height="880"&gt;&lt;/a&gt;&lt;br&gt;My winning image. &lt;a href="https://labs.openai.com/s/HYv3Kp8ElKDAWKHq2vs76VXu"&gt;https://labs.openai.com/s/HYv3Kp8ElKDAWKHq2vs76VXu&lt;/a&gt;&lt;br&gt;

  &lt;/p&gt;

&lt;p&gt;The image isn’t perfect, but DALL·E 2 managed to fulfill about 80% of the brief.&lt;/p&gt;

&lt;p&gt;Most of the credits went towards trying to get the right combination of style, faces, and composition to work together.&lt;/p&gt;

&lt;p&gt;According to &lt;a href="https://openai.com/blog/dall-e-now-available-in-beta/#fn1"&gt;OpenAI’s DALL·E announcement&lt;/a&gt;,&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“&lt;em&gt;…users get full usage rights to commercialize the images they create with DALL·E, including the right to reprint, sell, and merchandise.&lt;/em&gt;”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Expect many users to play fast and loose with these rules.&lt;/p&gt;

&lt;p&gt;As a content creator, DALL·E 2 will be most useful for creating simple illustrations, photos, and graphics for blogs and websites. I’ll be using it as an alternative to Unsplash to create blog cover images that won’t look the same as everyone else’s.&lt;/p&gt;

&lt;p&gt;If you’re about to try out DALL·E 2 yourself, &lt;strong&gt;here’s a tl;dr of tips before you start:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check out the &lt;a href="https://dallery.gallery/the-dalle-2-prompt-book/"&gt;DALL·E 2 Prompt Book&lt;/a&gt;! (Also, the fan-made &lt;a href="https://www.reddit.com/r/dalle2/comments/v3jxud/me_and_someone_else_have_created_a_prompt/"&gt;Prompt Engineering Sheet&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;Be prepared to do some trial-and-error to get what you want. Fifteen free credits might sound like a lot, but it really isn’t. Expect to use &lt;strong&gt;at least&lt;/strong&gt; 15 credits to generate a usable image. DALL·E 2 is &lt;strong&gt;not&lt;/strong&gt; cheap.&lt;/li&gt;
&lt;li&gt;Don’t forget to save your favorite images as you go.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Thanks for reading!&lt;/strong&gt; I’d love to hear your experience with DALL·E 2 and welcome any thoughts or feedback.&lt;/p&gt;

&lt;p&gt;If you enjoyed reading this, here are some articles by other writers you might like as well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://jacobmartins.com/posts/how-i-used-dalle2-to-generate-the-logo-for-octosql/"&gt;How I used DALL-E 2 to Generate The Logo for OctoSQL&lt;/a&gt; by Jacob Martins&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://towardsdatascience.com/how-i-used-ai-to-reimagine-10-famous-landscape-paintings-3e2924e03f79"&gt;How I Used AI to Reimagine 10 Famous Landscape Paintings&lt;/a&gt; by Alberto Romero&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.lesswrong.com/posts/uKp6tBFStnsvrot5t/what-dall-e-2-can-and-cannot-do"&gt;What DALL-E 2 can and cannot do&lt;/a&gt; by Swimmer963&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dalle</category>
      <category>datascience</category>
      <category>ai</category>
    </item>
    <item>
      <title>7 Best machine learning communities to advance your skills in 2022</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Wed, 04 May 2022 06:13:17 +0000</pubDate>
      <link>https://dev.to/joooyz/7-best-machine-learning-communities-to-advance-your-skills-in-2022-4075</link>
      <guid>https://dev.to/joooyz/7-best-machine-learning-communities-to-advance-your-skills-in-2022-4075</guid>
      <description>&lt;p&gt;At some point during your machine learning journey you may get stuck on a problem, start to lose motivation, or find yourself unable to keep up the rapid rate of new developments. In these situations, I find communities have a lot to offer regardless of your skill level.&lt;/p&gt;

&lt;p&gt;There are tons of communities out there — however, many are inactive or poorly moderated. To help streamline your search, I’ve curated a list of what I think are some of the most active, helpful, and interesting communities to check out — selected not just on overall size. I’ve also included a couple of niche communities if you’re interested in discovering new topics to explore.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Did I miss any that you’d recommend? I’m actively on the look-out for other communities to continually improve this article. Let me know in the comments!&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  1. For general discussion and latest news: r/machinelearning
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Gj30Ydwn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tpun1icajds58qe3k1dk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Gj30Ydwn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tpun1icajds58qe3k1dk.png" alt="r/machinelearning subreddit home page (screenshot taken by Author)" width="880" height="548"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit is home to a whole host of forums (known as subreddits) covering various aspects of machine learning. Of these, &lt;a href="https://www.reddit.com/r/MachineLearning/"&gt;r/machinelearning&lt;/a&gt; is the go-to subreddit with over 2 million members sharing machine learning projects, latest research, and discussions. It’s well-moderated and regularly contributed to by industry veterans, meaning you’ll find plenty of quality content here.&lt;/p&gt;

&lt;p&gt;If you’re looking for something a bit more beginner-friendly, I’d recommend checking out &lt;a href="https://www.reddit.com/r/learnmachinelearning/"&gt;r/learnmachinelearning&lt;/a&gt; instead. This is where you can ask beginner questions and share beginner projects for feedback (they also have a &lt;a href="https://discord.gg/G3rvFKF"&gt;Discord server&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Some other related subreddits you might find useful include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/datascience/"&gt;r/datascience&lt;/a&gt;  (500K+ members) — discussion on data science careers&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/artificial/"&gt;r/artificial&lt;/a&gt; (145K+ members) — general AI news&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/reinforcementlearning/"&gt;r/reinforcementlearning&lt;/a&gt; (20K+ members) — focused on reinforcement learning&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  2. For competitions: Kaggle
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--y7oEB4mu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8yirs5qy9z98ai8xyb4u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--y7oEB4mu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8yirs5qy9z98ai8xyb4u.png" alt="Kaggle competitions page (screenshot taken by Author)" width="880" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://kaggle.com"&gt;Kaggle&lt;/a&gt; is the biggest data science competition platform. They partner with businesses to run challenges made up of a dataset and problem statement for anyone in the world to solve. Challenge topics vary from computer vision to stock exchange predictions. Joining a competition and contributing to the &lt;a href="https://www.kaggle.com/discussion"&gt;Kaggle Forums&lt;/a&gt; can be a useful way to collaborate with others working on the same project as you. Here you’ll be able to discuss approaches, algorithms, and advice for feature engineering. &lt;/p&gt;

&lt;p&gt;If the current competitions on Kaggle aren’t to your liking, some other data science competition platforms worth checking out include &lt;a href="https://www.aicrowd.com/"&gt;AICrowd&lt;/a&gt;, &lt;a href="https://omdena.com/"&gt;Omdena&lt;/a&gt;, &lt;a href="https://machinehack.com/"&gt;MachineHack&lt;/a&gt;, &lt;a href="https://www.drivendata.org/competitions/"&gt;DrivenData&lt;/a&gt;, and &lt;a href="https://zindi.africa/competitions"&gt;Zindi&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  3. For getting started: Learn AI Together Discord
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://discord.com/invite/learnaitogether"&gt;Learn AI together&lt;/a&gt; has over 24,000 members and is one of the largest AI communities on Discord. The community is managed by Louis at &lt;a href="https://www.youtube.com/channel/UCUzGQrN-lyyc0BWTYoJM_Sg"&gt;What’s AI&lt;/a&gt; —  a YouTube channel dedicated to beginner-friendly resources on getting started in machine learning. There’s a huge list of discussion topics from AGI to Kaggle competitions to healthcare (30+ and counting!), and dedicated sections to ask questions and share resources on the latest news and events.&lt;/p&gt;

&lt;h1&gt;
  
  
  4. For NLP: Hugging Face
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--FEuuBlHI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ju9ba909nq3f1kl1773g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FEuuBlHI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ju9ba909nq3f1kl1773g.png" alt="Hugging Face home page (screenshot taken by Author)" width="880" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://huggingface.co/"&gt;Hugging Face&lt;/a&gt; started originally with open-source tools for NLP projects, but has since expanded into fields such as Computer Vision and Reinforcement Learning. On the Hugging Face platform you can download and share models, and discuss projects on their &lt;a href="https://huggingface.co/join/discord"&gt;Discord&lt;/a&gt; or &lt;a href="https://discuss.huggingface.co/"&gt;Forum&lt;/a&gt;. If you’re having trouble figuring what type of project to build, heading over to Hugging Face may be a great source of inspiration.&lt;/p&gt;

&lt;h1&gt;
  
  
  5. For reinforcement learning: Reinforcement Learning Discussion Discord
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://discord.gg/xhfNqQv"&gt;Reinforcement Learning Discussion&lt;/a&gt; is an active Discord server with over 3,000 members. It’s managed by researchers in the reinforcement learning field, and they’re particularly friendly and catering for beginners. It can be a great place to ask questions on popular courses such as &lt;a href="https://www.deepmind.com/learning-resources/reinforcement-learning-lecture-series-2021"&gt;DeepMind’s Reinforcement Learning lectures&lt;/a&gt; or &lt;a href="https://spinningup.openai.com/en/latest/"&gt;Spinning Up by OpenAI&lt;/a&gt;, share progress and experiments with public reinforcement learning environments, and stay up-to-date on the latest research (many authors will share their latest papers directly in the server).&lt;/p&gt;

&lt;h1&gt;
  
  
  6. For DALL-E and similar generative projects: LAION
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://laion.ai/"&gt;LAION&lt;/a&gt; is a not-for-profit community whose main goal is to work together to replicate OpenAI’s DALL-E. They have an active &lt;a href="https://discord.gg/xBPBXfcFHd"&gt;Discord server&lt;/a&gt; with over 3,000 members at the time of writing. It’s a great place to keep up with (and contribute to) the open-source project, discuss related audio/video/3D topics, and share your own generative project for feedback.&lt;/p&gt;

&lt;h1&gt;
  
  
  7. For AI in gaming: StarCraft II AI Arena
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZegTU3iM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bzagmv19321iamyj7rwm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZegTU3iM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bzagmv19321iamyj7rwm.jpg" alt="AI Arena ladder homepage (screenshot taken by Author)" width="880" height="671"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If the achievements of DeepMind’s AlphaStar or OpenAI’s Dota 2 AI brought you into the space, you might be interested in checking out &lt;a href="https://aiarena.net/"&gt;AI Arena&lt;/a&gt;. They’re a community of researchers, practitioners, and hobbyists building both scripted and deep learning agents for StarCraft II. They have an open &lt;a href="https://discord.gg/Emm5Ztz"&gt;Discord&lt;/a&gt; for meeting others, run regular community streams on Twitch, and provide getting-started resources for creating your own agent to enter their ranked tournament ladders.&lt;/p&gt;

&lt;h1&gt;
  
  
  Closing Remarks
&lt;/h1&gt;

&lt;p&gt;I hope that this list has helped you find a new community to meet others on a similar journey and take your skills to the next level. Join one or try them all to see what suits you best. Good luck!&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>7 real-world applications of reinforcement learning</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Thu, 17 Feb 2022 03:30:33 +0000</pubDate>
      <link>https://dev.to/joooyz/7-real-world-applications-of-reinforcement-learning-3l9m</link>
      <guid>https://dev.to/joooyz/7-real-world-applications-of-reinforcement-learning-3l9m</guid>
      <description>&lt;p&gt;Reinforcement learning is a subdomain of machine learning in which agents learn to make decisions by interacting with their environment. It recently gained popularity through its ability to achieve superhuman-levels of play in games like Go, Chess, Dota, and StarCraft II.&lt;/p&gt;

&lt;p&gt;In this article, I’ve put together a list of 7 examples where reinforcement learning is being applied in real-world use cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Autonomous driving with Wayve
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fysg6upt6zewoz084oeea.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fysg6upt6zewoz084oeea.jpg" alt="Image of car"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Approaches to self-driving cars have historically involved defining logic rules. This can be difficult to scale to the countless situations that autonomous vehicles might encounter on public roads. This is where &lt;a href="https://arxiv.org/pdf/2002.00444.pdf" rel="noopener noreferrer"&gt;deep reinforcement learning may be promising&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://wayve.ai/" rel="noopener noreferrer"&gt;Wayve&lt;/a&gt; is a UK-based company that has been testing autonomous vehicles on public roads since 2018. In their paper, '&lt;a href="https://arxiv.org/pdf/1807.00412.pdf" rel="noopener noreferrer"&gt;Learning to Drive in a Day&lt;/a&gt;', they describe how they used deep reinforcement learning to train a model using a monocular image as input. The reward was the distance travelled by the vehicle without the safety driver taking control. The model was trained in a driving simulation and then deployed in the real world on a 250-meter section of road.&lt;/p&gt;

&lt;p&gt;While their autonomous vehicle technology continues to evolve, they claim that reinforcement learning continues to play a part in &lt;strong&gt;motion planning&lt;/strong&gt; (ensuring a feasible path exists between the vehicle and its destination).&lt;/p&gt;
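&lt;p&gt;&lt;em&gt;A toy sketch of the reward idea described in 'Learning to Drive in a Day': reward accrues with distance travelled, and the episode ends when the safety driver intervenes. The function and numbers below are illustrative, not taken from the paper.&lt;/em&gt;&lt;/p&gt;

```python
# A toy sketch of the reward described in 'Learning to Drive in a Day':
# reward accrues with distance travelled, and the episode ends when the
# safety driver takes control. Names and numbers are illustrative.

def episode_return(speeds_mps, intervened_at_step=None, dt=0.1):
    """Sum per-step distance rewards until the intervention step."""
    total = 0.0
    for step, speed in enumerate(speeds_mps):
        if step == intervened_at_step:
            break                  # safety driver took control
        total += speed * dt        # reward = metres travelled this step
    return total

# Driving further without intervention earns more reward, which is
# exactly the behaviour the training signal encourages.
print(episode_return([5.0] * 100, intervened_at_step=40))  # 20.0
print(episode_return([5.0] * 100))                         # 50.0
```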

&lt;h2&gt;
  
  
  2. Personalizing your Netflix recommendations
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcup7uqzgf2q36uaacmty.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcup7uqzgf2q36uaacmty.jpg" alt="Image of Netflix"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Netflix has 200 million users in over 190 countries. For each of these users, Netflix aims to present the most entertaining and relevant videos. In the presentation '&lt;strong&gt;&lt;a href="https://scale.com/blog/Netflix-Recommendation-Personalization-TransformX-Scale-AI-Insights" rel="noopener noreferrer"&gt;Netflix Explains Recommendations and Personalization&lt;/a&gt;&lt;/strong&gt;' by Justin Basilico (Director of Machine Learning and Recommender Systems at Netflix),  he describes how they achieve this by combining four key approaches: deep learning, causality, bandits &amp;amp; reinforcement learning, and objectives. &lt;/p&gt;

&lt;p&gt;The challenge is to train a model that optimizes for a user’s long-term satisfaction over immediate gratification. Reinforcement learning can help by introducing exploration, which lets the model learn about new interests over time. &lt;/p&gt;

&lt;p&gt;Justin notes that reinforcement learning is challenging to apply in this setting due to the high dimensionality and large problem space. To help with this, the team developed &lt;a href="https://dl.acm.org/doi/abs/10.1145/3460231.3474259" rel="noopener noreferrer"&gt;Accordion&lt;/a&gt; — a simulator for long-term training.&lt;/p&gt;
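&lt;p&gt;&lt;em&gt;The exploration idea can be illustrated with a toy epsilon-greedy bandit: mostly recommend the title with the best observed click-through rate, but occasionally try others so new interests can still be discovered. The titles and rates below are made up, and this is far simpler than Netflix's actual system.&lt;/em&gt;&lt;/p&gt;

```python
import random

# A toy epsilon-greedy bandit: mostly recommend the title with the best
# observed click-through rate (exploit), but sometimes pick a random
# title (explore) so that new interests can still be discovered.
# Titles and click-through rates are invented for illustration.

random.seed(42)
true_ctr = {"comedy": 0.05, "thriller": 0.12, "documentary": 0.08}
clicks = {title: 0 for title in true_ctr}
shows = {title: 0 for title in true_ctr}

def estimated_ctr(title):
    return clicks[title] / max(shows[title], 1)

def recommend(epsilon=0.1):
    if random.random() > epsilon:               # exploit
        return max(true_ctr, key=estimated_ctr)
    return random.choice(list(true_ctr))        # explore

for _ in range(50000):
    title = recommend()
    shows[title] += 1
    if random.random() > 1.0 - true_ctr[title]:  # simulated user click
        clicks[title] += 1

# Over time, recommendations concentrate on the highest-CTR title.
print(max(shows, key=lambda t: shows[t]))
```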

&lt;h2&gt;
  
  
  3. Optimizing inventory levels for Walmart
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjlgscalzrccdl5fnc493.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjlgscalzrccdl5fnc493.jpg" alt="Image of Walmart website"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Walmart is the world's largest retailer and grocer with over 4,650 stores. Walmart must constantly move unsold inventory to make space for new and better-selling items. The usual strategy to move unwanted stock is to implement price reductions. This is a time-consuming and laborious undertaking that requires re-labelling discounted merchandise multiple times on a store-by-store basis.&lt;/p&gt;

&lt;p&gt;To reduce operating costs, &lt;a href="https://www.youtube.com/watch?v=pxWkg2N0l9c" rel="noopener noreferrer"&gt;Walmart created an algorithm to optimize price reductions&lt;/a&gt;. The algorithm ingests sales data, operating costs, the number and type of merchandise, and the dynamic time frame by which the merchandise must be sold.&lt;/p&gt;

&lt;p&gt;The approach applies data analytics, reinforcement learning, and dynamic optimization to make automated decisions for each individual product, and is tailored to each store. The result is lowered operating costs and increased sales, with some stores experiencing up to 15% higher sales of the stock to be moved.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Improving search engine results with search.io
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qaj6dw8zdeaktyoxurh.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qaj6dw8zdeaktyoxurh.jpg" alt="Image of person with smartphone"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://Search.io" rel="noopener noreferrer"&gt;Search.io&lt;/a&gt; is an AI search engine for on-site search queries. They use &lt;a href="https://www.search.io/blog/reinforcement-learning-assisted-search-ranking" rel="noopener noreferrer"&gt;both 'learn-to-rank' and reinforcement learning techniques to improve their search ranking algorithm&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Learn-to-rank involves using a machine learning model trained on a dataset of query-result pairs scored based on their relevance. One disadvantage of this technique is that the inputs (query-result pair scores) remain static. &lt;/p&gt;

&lt;p&gt;Reinforcement learning helps to improve the search algorithm over time using feedback in the form of clicks, sales, signups, etc. The challenge with applying reinforcement learning in this setting is that the search result quality typically starts out low, and needs time and data before it starts to meet customer expectations. &lt;/p&gt;
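&lt;p&gt;&lt;em&gt;One way to picture this feedback loop: blend a static learn-to-rank score with an online boost that grows as users click a result. This is only a toy sketch under invented scores, not Search.io's actual algorithm.&lt;/em&gt;&lt;/p&gt;

```python
# A toy sketch of blending a static learn-to-rank score with an online
# click signal: each click nudges a document's boost upward and decays
# the others, so rankings adapt as feedback accumulates. The documents
# and scores are invented for illustration.

static_scores = {"doc_a": 0.9, "doc_b": 0.7, "doc_c": 0.6}
boosts = {doc: 0.0 for doc in static_scores}

def record_click(doc, rate=0.05):
    for d in boosts:
        if d == doc:
            boosts[d] += rate * (1.0 - boosts[d])   # reward the click
        else:
            boosts[d] *= 1.0 - rate                 # decay the rest

def ranked():
    return sorted(static_scores,
                  key=lambda d: static_scores[d] + boosts[d],
                  reverse=True)

print(ranked())            # ['doc_a', 'doc_b', 'doc_c']
for _ in range(20):
    record_click("doc_c")  # users keep clicking the third result
print(ranked())            # ['doc_c', 'doc_a', 'doc_b']
```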

&lt;h2&gt;
  
  
  5. Improving language models with OpenAI's WebGPT
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe6x2gcw2aouwkzqvrtze.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe6x2gcw2aouwkzqvrtze.jpg" alt="Image of person on computer"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GPT-3 is a language model used to generate human-like text. A downside of these language models is the tendency to 'hallucinate' information when performing tasks that require obscure real-world knowledge. To improve this, &lt;a href="https://openai.com/blog/webgpt/" rel="noopener noreferrer"&gt;OpenAI taught GPT-3 to use a text-based web browser&lt;/a&gt;. The model is able to search and collect information from web pages, and use it to compose answers to open-ended questions. &lt;/p&gt;

&lt;p&gt;The model is initially trained using human demonstrations. From there, the helpfulness and accuracy of the model are improved by training a reward model to predict human preferences. The system is then optimized against this reward model using either reinforcement learning or rejection sampling. The resulting system was found to be more 'truthful' than GPT-3.&lt;/p&gt;
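&lt;p&gt;&lt;em&gt;Rejection sampling (best-of-n) is the simpler of those two techniques to sketch: draw several candidate answers from the base model, score each with the reward model, and keep the best. Both functions below are stand-ins, not the real models.&lt;/em&gt;&lt;/p&gt;

```python
# A sketch of best-of-n rejection sampling: draw n candidate answers
# from the base model, score each with the learned reward model, and
# keep the highest-scoring one. Both models below are stand-ins; the
# fake reward simply prefers later drafts.

def sample_answers(question, n):
    # stand-in for sampling n candidate answers from the language model
    return [f"{question} draft {i}" for i in range(n)]

def reward_model(answer):
    # stand-in for the learned model of human preference
    return int(answer.split()[-1])

def best_of_n(question, n=4):
    candidates = sample_answers(question, n)
    return max(candidates, key=reward_model)

print(best_of_n("Why is the sky blue?"))  # Why is the sky blue? draft 3
```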

&lt;h2&gt;
  
  
  6. Trading on the financial markets with IBM's DSX platform
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqb5sel4nkfi7u7q9ls3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqb5sel4nkfi7u7q9ls3.jpg" alt="Image of financial market trading platform"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There has been reluctance in the financial industry to apply machine learning due to the high monetary risks. &lt;a href="https://medium.com/ibm-data-ai/reinforcement-learning-the-business-use-case-part-2-c175740999" rel="noopener noreferrer"&gt;In this article&lt;/a&gt;, IBM describes a trading system trained with reinforcement learning.&lt;/p&gt;

&lt;p&gt;The advantage of reinforcement learning in this setting is the ability to learn to make predictions that account for whatever effects the algorithm’s actions have had on the state of the market. This feedback loop allows the algorithm to auto-tune over time, continually making it more powerful and adaptable. The reward function is based on the profit or loss made in each trade.&lt;/p&gt;
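&lt;p&gt;&lt;em&gt;A minimal sketch of that per-trade reward signal: at each step, the agent's reward is the profit or loss realised by the position it held. Prices and positions below are invented for illustration.&lt;/em&gt;&lt;/p&gt;

```python
# A minimal sketch of a per-trade reward: at each step the agent's
# reward is the profit or loss realised by the position it held.
# Prices and positions are invented for illustration.

def trade_rewards(prices, positions):
    """positions[i] is held from prices[i] to prices[i + 1]:
    1 means long, -1 short, 0 flat."""
    rewards = []
    for i, pos in enumerate(positions):
        rewards.append(pos * (prices[i + 1] - prices[i]))
    return rewards

prices = [100.0, 102.0, 101.0, 104.0]
positions = [1, -1, 1]                   # long, short, long
print(trade_rewards(prices, positions))  # [2.0, 1.0, 3.0]
```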

&lt;p&gt;The model was assessed against a Buy-and-Hold strategy and ARIMA-GARCH (a forecasting model). They found that the model was able to capture head-and-shoulder patterns, which is a non-trivial feat.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Robotics with the University of California, Berkeley
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6z782z4k21y6zqug6nxt.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6z782z4k21y6zqug6nxt.jpg" alt="Image of manufacturing robot"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Developing controllers for robotics is a challenging task. Typical methods rely on careful modelling, but can fail when exposed to unexpected situations and environments.&lt;/p&gt;

&lt;p&gt;A team at the University of California, Berkeley tried to address this by training a &lt;a href="https://arxiv.org/pdf/2103.14295.pdf" rel="noopener noreferrer"&gt;real bipedal robot using reinforcement learning&lt;/a&gt;. The team was able to develop a model that resulted in a more diverse and robust walking control of a robot named Cassie. &lt;/p&gt;

&lt;p&gt;The deployed model was able to perform various behaviours such as changing walking heights, fast walking, walking sideways and turning in the real world. It was also robust to changes in the robot itself (e.g. partially damaged motors) and the environment (e.g. changes in ground friction and being pushed from different directions). You can watch Cassie in action in &lt;a href="https://www.youtube.com/watch?v=goxCjGPQH7U" rel="noopener noreferrer"&gt;this video&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;While reinforcement learning applications in the real world are still in their early days, I hope this list highlights the potential of the technology and the exciting progress that has already taken place. Who knows what else we might see in the next few years with ongoing developments in data collection, simulations, processing power, and research?&lt;/p&gt;

&lt;p&gt;If the field of reinforcement learning excites you, here are some of my other articles you might find useful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.gocoder.one/blog/rl-tutorial-with-openai-gym" rel="noopener noreferrer"&gt;Introduction to reinforcement learning with OpenAI Gym Taxi&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.gocoder.one/blog/hands-on-introduction-to-deep-reinforcement-learning" rel="noopener noreferrer"&gt;A hands-on introduction to deep reinforcement learning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.gocoder.one/blog/reinforcement-learning-project-ideas" rel="noopener noreferrer"&gt;8+ Reinforcement learning project ideas&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thanks for reading!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Active and upcoming reinforcement learning competitions</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Thu, 28 Oct 2021 09:05:39 +0000</pubDate>
      <link>https://dev.to/joooyz/active-and-upcoming-reinforcement-learning-competitions-kei</link>
      <guid>https://dev.to/joooyz/active-and-upcoming-reinforcement-learning-competitions-kei</guid>
      <description>&lt;p&gt;Reinforcement learning (RL) is a subdomain of machine learning which involves agents learning to make decisions by interacting with their environment. While popular competition platforms like Kaggle are mainly suited for supervised learning problems, RL competitions are harder to come by.&lt;/p&gt;

&lt;p&gt;In this post, I've compiled a list of 7 ongoing and annual competitions which are suitable for RL. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Criteria:&lt;/strong&gt; any active (or upcoming) event or platform which involves a large number of individuals/teams competing for some form of incentive (e.g. prize money, co-authorships, leaderboard ranking etc.).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;For AI competitions that are not necessarily tailored for RL, you might be interested in the list &lt;a href="https://www.gocoder.one/blog/ai-game-competitions-list"&gt;15 Active AI Game Competitions&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. &lt;a href="https://aws.amazon.com/deepracer/"&gt;AWS DeepRacer&lt;/a&gt; (2018 —, ongoing competition)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0tvkuoir--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dtj2sw3j90jvjasdi8of.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0tvkuoir--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dtj2sw3j90jvjasdi8of.png" alt="AWS DeepRacer" title="AWS DeepRacer 3D simulator" width="880" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/deepracer/"&gt;AWS DeepRacer&lt;/a&gt; is a beginner-friendly 3D racing simulator aimed at helping developers get started with RL. Participants can train models on Amazon SageMaker (first 10 hours are free) and enter monthly competitions in the form of an ongoing AWS DeepRacer League. &lt;/p&gt;

&lt;p&gt;The AWS DeepRacer League is run in a time trial format (although other challenges such as head-to-head racing exist). Top racers win prizes including merchandise, customizations, and an expenses-paid trip to Las Vegas to attend AWS re:Invent for the Championship Cup. Participants can also win or purchase a physical 1/18th-scale race car for USD 399 to test their models in the real world.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. &lt;a href="https://aiarena.net/"&gt;AIArena&lt;/a&gt; (2016 —, ongoing competition)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--infYciki--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3uygquulfg48sfm6wv83.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--infYciki--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3uygquulfg48sfm6wv83.png" alt="AI Arena" title="AI Arena StarCraft II stream" width="880" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You might remember when &lt;a href="https://deepmind.com/blog/article/AlphaStar-Grandmaster-level-in-StarCraft-II-using-multi-agent-reinforcement-learning"&gt;AlphaStar reached Grandmaster status&lt;/a&gt; and beat two of the world's top players in StarCraft II in 2019. StarCraft II was &lt;a href="https://deepmind.com/blog/announcements/deepmind-and-blizzard-open-starcraft-ii-ai-research-environment"&gt;originally open-sourced in 2017&lt;/a&gt; by Blizzard to accelerate AI research in highly complex environments.&lt;/p&gt;

&lt;p&gt;You can still get involved with training deep RL agents in StarCraft II with the community at &lt;strong&gt;&lt;a href="https://aiarena.net/"&gt;AIArena&lt;/a&gt;&lt;/strong&gt;. They run an ongoing ranked ladder where you can compete head-to-head against other teams. Matches are livestreamed to Twitch 24/7, with occasional community stream events.&lt;/p&gt;

&lt;p&gt;For original StarCraft, you can also check out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://schnail.com/"&gt;SCHNAIL&lt;/a&gt;: Human vs AI competitions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://sscaitournament.com/"&gt;SSCAIT&lt;/a&gt;: Student StarCraft AI Tournament&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. &lt;a href="https://www.gocoder.one/bomberland"&gt;Bomberland&lt;/a&gt; (2020—, ongoing competition)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0avm0MBB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/84d87fpo4fl4yeppw4oo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0avm0MBB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/84d87fpo4fl4yeppw4oo.jpg" alt="Bomberland" title="Bomberland" width="880" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.gocoder.one/bomberland"&gt;Bomberland&lt;/a&gt; is our own machine learning competition based on the classic console game, Bomberman. Teams build agents which compete head-to-head in an ongoing competition against other teams.&lt;/p&gt;

&lt;p&gt;The Bomberland environment is challenging for out-of-the-box machine learning, requiring planning, real-time decision making, and navigating both adversarial and cooperative play.&lt;/p&gt;

&lt;p&gt;The competition officially starts 3rd December 2021. Top teams win prizes including merchandise, customizations, cash, and are featured on the finale Twitch livestream.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. &lt;a href="https://www.aicrowd.com/challenges/flatland-3"&gt;Flatland&lt;/a&gt; (2019—, annual competition)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BRI84REY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ke4c9ik11yqjxlbq886a.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BRI84REY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ke4c9ik11yqjxlbq886a.JPG" alt="Flatland" title="Flatland" width="642" height="258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.aicrowd.com/challenges/flatland-3"&gt;Flatland&lt;/a&gt; is an annual competition featured as part of NeurIPS 2020. It is designed to tackle the problem of efficiently managing dense traffic on complex railway networks. The goal is to construct the best schedule that minimizes the delay in the requested arrival time of all trains.&lt;/p&gt;

&lt;p&gt;The 2021 competition is currently being run on the AICrowd platform. Submissions are evaluated and ranked according to the total reward accumulated in a controlled setting. RL approaches are encouraged, with a separate prize track for RL submissions. Prizes this year include drones and VR headsets.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. &lt;strong&gt;&lt;a href="https://minerl.io/"&gt;MineRL&lt;/a&gt;&lt;/strong&gt; (2019—, annual competition)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--r2QrMRPS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sy5s73k7p0e21vthedmi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--r2QrMRPS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sy5s73k7p0e21vthedmi.png" alt="MineRL" title="MineRL dataset example" width="579" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://minerl.io/"&gt;MineRL&lt;/a&gt;&lt;/strong&gt; is concerned with the development of sample-efficient deep RL algorithms which can solve hierarchical, sparse reward environments using human demonstrations in Minecraft. &lt;/p&gt;

&lt;p&gt;Participants have access to a large imitation learning dataset of over 60 million frames of recorded human player data in Minecraft. The goal is to develop systems that can complete tasks such as obtaining a diamond, building a house, searching for a cave, etc.&lt;/p&gt;

&lt;p&gt;The competition has run as part of NeurIPS from 2019 to 2021 on AICrowd. Prizes include co-authorships and over $10,000 cash.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. &lt;a href="https://nethackchallenge.com/"&gt;NetHack&lt;/a&gt; (2020—, annual competition)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GWywkfHd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nqx1eawgcda9bi1mnqil.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GWywkfHd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nqx1eawgcda9bi1mnqil.JPG" alt="NetHack" title="NetHack" width="751" height="424"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://nethackchallenge.com/"&gt;NetHack&lt;/a&gt; is another annual competition at NeurIPS 2021 held on AICrowd. Teams compete to build the best agents to play NetHack, an ASCII-rendered single-player dungeon crawl game. NetHack features procedurally-generated levels, with hundreds of complex scenarios, making it an extremely challenging environment for current state-of-the-art RL.&lt;/p&gt;

&lt;p&gt;Like Flatland and MineRL, submissions are ranked on a leaderboard based on score in a controlled test setting. The competition this year features a $20,000 USD cash prize pool. RL approaches are encouraged, but non-RL approaches are also accepted. &lt;/p&gt;

&lt;h2&gt;
  
  
  7. &lt;a href="https://github.com/facebookresearch/CompilerGym"&gt;CompilerGym&lt;/a&gt; (2021—, leaderboard)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YzfSI_zI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qz3zd3cc737xp90j3hxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YzfSI_zI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qz3zd3cc737xp90j3hxl.png" alt="CompilerGym" title="CompilerGym" width="880" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/facebookresearch/CompilerGym"&gt;CompilerGym&lt;/a&gt; is actually a toolkit for applying reinforcement learning to compiler optimizations, rather than a competition. However, users can submit algorithms to the &lt;a href="https://github.com/facebookresearch/CompilerGym#leaderboards"&gt;public repo leaderboard&lt;/a&gt; with their write-up and results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bonus: competition platforms and conferences
&lt;/h2&gt;

&lt;p&gt;I prioritized competitions that are ongoing or run regularly for this list. Another good way to keep track of running competitions is to follow the competition platforms and conferences they are run as part of. Here are some worth keeping an eye on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.aicrowd.com/"&gt;AICrowd&lt;/a&gt;: Runs a combination of supervised ML competitions as well as RL competitions.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.kaggle.com/"&gt;Kaggle&lt;/a&gt;: Mainly supervised ML/data science competitions, but also feature &lt;a href="https://www.kaggle.com/simulations"&gt;simulation competitions&lt;/a&gt; which can be good problems for RL.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://nips.cc/"&gt;NeurIPS&lt;/a&gt;: Annual conference with a competition track for various machine learning competitions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ieee-cog.org/2022/"&gt;IEEE CoGs&lt;/a&gt;: Annual conference with a competition track, specifically for research in games.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing remarks
&lt;/h2&gt;

&lt;p&gt;I hope this list has helped you find an interesting competition to check out and practise reinforcement learning in. As new competitions come and go, I'll aim to keep this list up-to-date. Good luck!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Competitive self-play with Unity ML-Agents</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Fri, 22 Oct 2021 06:47:04 +0000</pubDate>
      <link>https://dev.to/joooyz/competitive-self-play-with-unity-ml-agents-1nh6</link>
      <guid>https://dev.to/joooyz/competitive-self-play-with-unity-ml-agents-1nh6</guid>
      <description>&lt;h2&gt;
  
  
  An overview of self-play
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://openai.com/blog/competitive-self-play/" rel="noopener noreferrer"&gt;Competitive self-play&lt;/a&gt; involves training an agent against itself. It was used in famous systems such as &lt;a href="https://deepmind.com/research/case-studies/alphago-the-story-so-far" rel="noopener noreferrer"&gt;AlphaGo&lt;/a&gt; and &lt;a href="https://openai.com/blog/dota-2/" rel="noopener noreferrer"&gt;OpenAI Five (Dota 2)&lt;/a&gt;. By playing increasingly stronger versions of itself, agents can discover new and better strategies.&lt;/p&gt;

&lt;p&gt;In this post, we walk through using competitive self-play in Unity ML-Agents to train agents to play volleyball. This article is also part 5 of the series '&lt;strong&gt;&lt;a href="https://dev.to/joooyz/a-hands-on-introduction-to-deep-reinforcement-learning-using-unity-ml-agents-4f8i"&gt;A hands-on introduction to deep reinforcement learning using Unity ML-Agents&lt;/a&gt;&lt;/strong&gt;'. &lt;/p&gt;

&lt;h2&gt;
  
  
  The case for self-play
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dev.to/joooyz/how-to-train-agents-to-play-volleyball-using-deep-reinforcement-learning-417b"&gt;We previously trained agents using PPO&lt;/a&gt; with the following setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Symmetric environment&lt;/li&gt;
&lt;li&gt;Both agents shared the same policy&lt;/li&gt;
&lt;li&gt;Observations: velocity, rotation, and position vectors of the agent and ball&lt;/li&gt;
&lt;li&gt;Reward function: +1 for hitting the ball over the net&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This resulted in agents that could successfully volley the ball back and forth after ~20M training steps:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmoyaw03cwnheu96dk3f.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmoyaw03cwnheu96dk3f.gif" title="Trained agents playing volleyball" alt="PPO trained agents"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see that the agents make 'easy' passes by aiming the ball towards the centre of the court. This is because we set the reward function to incentivize keeping the ball in play.&lt;/p&gt;

&lt;p&gt;Our aim now is to train &lt;em&gt;competitive&lt;/em&gt; agents that are rewarded for &lt;em&gt;winning&lt;/em&gt; (i.e. landing the ball in the opponent's court). We expect this will lead to agents that learn interesting strategies and make passes that are harder to return.&lt;/p&gt;

&lt;h2&gt;
  
  
  Self-play setup in ML-Agents
&lt;/h2&gt;

&lt;p&gt;To follow along with this section, you will need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unity ML-Agents Release 18+ (&lt;a href="https://dev.to/joooyz/an-introduction-to-machine-learning-with-unity-ml-agents-3an5"&gt;getting started instructions&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;The latest version of the &lt;a href="https://github.com/CoderOneHQ/ultimate-volleyball" rel="noopener noreferrer"&gt;Ultimate Volleyball repo&lt;/a&gt; (or, you can use your own volleyball environment if you've been following the tutorial series)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 1: Put the agents on opposing teams
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Open the Ultimate Volleyball environment in Unity&lt;/li&gt;
&lt;li&gt;Open &lt;strong&gt;Assets&lt;/strong&gt; &amp;gt; &lt;strong&gt;Prefabs&lt;/strong&gt; &amp;gt; &lt;code&gt;2PVolleyballArea.prefab&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Select either the &lt;code&gt;PurpleAgent&lt;/code&gt; or &lt;code&gt;BlueAgent&lt;/code&gt; object&lt;/li&gt;
&lt;li&gt;In Inspector &amp;gt; Behavior Parameters, set &lt;code&gt;TeamId&lt;/code&gt; to 1 (the exact value doesn't matter, as long as the PurpleAgent and BlueAgent have different Team IDs):&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbofa2dc0j4uagfiybfiw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbofa2dc0j4uagfiybfiw.jpg" title="Team ID setting in ML-Agents" alt="ML-Agents Team ID"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Set up the self-play reward function
&lt;/h3&gt;

&lt;p&gt;Our previous reward function was +1 for hitting the ball over the net.&lt;/p&gt;

&lt;p&gt;For self-play, we'll switch to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;+1 to the winning team&lt;/li&gt;
&lt;li&gt;-1 to the losing team&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;Open &lt;code&gt;VolleyballEnvController.cs&lt;/code&gt; and add the rewards to the &lt;code&gt;ResolveEvent()&lt;/code&gt; method:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitBlueGoal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;// blue wins&lt;/span&gt;
    &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddReward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddReward&lt;/span&gt;&lt;span class="p"&gt;(-&lt;/span&gt;&lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// turn floor blue&lt;/span&gt;
    &lt;span class="nf"&gt;StartCoroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;GoalScoredSwapGroundMaterial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;blueGoalMaterial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RenderersList&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="m"&gt;5f&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="c1"&gt;// end episode&lt;/span&gt;
    &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nf"&gt;ResetScene&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitPurpleGoal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;// purple wins&lt;/span&gt;
    &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddReward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddReward&lt;/span&gt;&lt;span class="p"&gt;(-&lt;/span&gt;&lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// turn floor purple&lt;/span&gt;
    &lt;span class="nf"&gt;StartCoroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;GoalScoredSwapGroundMaterial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;purpleGoalMaterial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RenderersList&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="m"&gt;5f&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="c1"&gt;// end episode&lt;/span&gt;
    &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nf"&gt;ResetScene&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Remove &lt;code&gt;AddReward&lt;/code&gt; from the other cases&lt;/li&gt;
&lt;li&gt;You can also add a penalty for hitting the ball out of the court (in &lt;code&gt;case Event.HitOutOfBounds&lt;/code&gt;). In my experience, though, this can make it take longer for the agents to learn to hit the ball at all.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 3: Add self-play training parameters to the trainer config
&lt;/h3&gt;

&lt;p&gt;Create a new &lt;code&gt;.yaml&lt;/code&gt; file and copy in the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;behaviors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Volleyball&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;trainer_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ppo&lt;/span&gt;
    &lt;span class="na"&gt;hyperparameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;batch_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2048&lt;/span&gt;
      &lt;span class="na"&gt;buffer_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20480&lt;/span&gt;
      &lt;span class="na"&gt;learning_rate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.0002&lt;/span&gt;
      &lt;span class="na"&gt;beta&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.003&lt;/span&gt;
      &lt;span class="na"&gt;epsilon&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.15&lt;/span&gt;
      &lt;span class="na"&gt;lambd&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.93&lt;/span&gt;
      &lt;span class="na"&gt;num_epoch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
      &lt;span class="na"&gt;learning_rate_schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;constant&lt;/span&gt;
    &lt;span class="na"&gt;network_settings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;normalize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;hidden_units&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;256&lt;/span&gt;
      &lt;span class="na"&gt;num_layers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
      &lt;span class="na"&gt;vis_encode_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;simple&lt;/span&gt;
    &lt;span class="na"&gt;reward_signals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;extrinsic&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;gamma&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.96&lt;/span&gt;
        &lt;span class="na"&gt;strength&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt;
    &lt;span class="na"&gt;keep_checkpoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
    &lt;span class="na"&gt;max_steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80000000&lt;/span&gt;
    &lt;span class="na"&gt;time_horizon&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1000&lt;/span&gt;
    &lt;span class="na"&gt;summary_freq&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20000&lt;/span&gt;
    &lt;span class="na"&gt;self_play&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;window&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
      &lt;span class="na"&gt;play_against_latest_model_ratio&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;
      &lt;span class="na"&gt;save_steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20000&lt;/span&gt;
      &lt;span class="na"&gt;swap_steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10000&lt;/span&gt;
      &lt;span class="na"&gt;team_change&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Explaining self-play parameters
&lt;/h3&gt;

&lt;p&gt;During self-play, one of the agents will be set as the &lt;em&gt;learning agent&lt;/em&gt; and the other as the fixed policy &lt;em&gt;opponent&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;save_steps=20000&lt;/code&gt; steps, a snapshot of the learning agent's existing policy will be taken. Up to &lt;code&gt;window=10&lt;/code&gt; snapshots will be stored. When a new snapshot is taken, the oldest one is discarded. These past versions of itself become the 'opponents' that the learning agent trains against. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffq14ashtaitzhhgchknw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffq14ashtaitzhhgchknw.jpg" title="Self-play hyperparameters" alt="Self-play hyperparameters"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;swap_steps=10000&lt;/code&gt; steps, the opponent's policy is swapped out for a different snapshot. With probability &lt;code&gt;play_against_latest_model_ratio=0.5&lt;/code&gt;, the &lt;strong&gt;latest policy&lt;/strong&gt; (i.e. the &lt;strong&gt;strongest&lt;/strong&gt; opponent) is chosen instead of a past snapshot. This helps prevent &lt;strong&gt;overfitting&lt;/strong&gt; to a single opponent playstyle.&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;team_change=100000&lt;/code&gt; steps, the learning-agent and opponent roles are swapped between the two teams. &lt;/p&gt;

&lt;p&gt;Feel free to play around with these default hyperparameters (more information available in the official &lt;a href="https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Training-Configuration-File.md#self-play" rel="noopener noreferrer"&gt;ML-Agents documentation&lt;/a&gt;). &lt;/p&gt;
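&lt;p&gt;The interaction between these parameters can be sketched in a few lines of Python. This is a hypothetical illustration of the behaviour described above, not ML-Agents' actual internals; integers stand in for policy snapshots:&lt;/p&gt;

```python
import random
from collections import deque

# Sketch of the self-play snapshot bookkeeping (hypothetical, not
# ML-Agents' implementation). Integers stand in for policies.
save_steps, window = 20_000, 10
swap_steps, latest_ratio = 10_000, 0.5

snapshots = deque(maxlen=window)   # the oldest snapshot drops off automatically
opponent = None

for step in range(0, 200_001, 10_000):
    latest_policy = step                     # pretend the policy at this step is `step`
    if step % save_steps == 0:
        snapshots.append(latest_policy)      # snapshot the learning agent's policy
    if step % swap_steps == 0:
        if random.random() > 1.0 - latest_ratio:      # true with prob. latest_ratio
            opponent = snapshots[-1]                  # play against the latest snapshot
        else:
            opponent = random.choice(list(snapshots)) # or a random past snapshot

print(len(snapshots))   # never more than `window` snapshots are kept
```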

&lt;h2&gt;
  
  
  Training with self-play
&lt;/h2&gt;

&lt;p&gt;Training with self-play in ML-Agents is done the same way as any other form of training:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Activate the virtual environment containing your installation of &lt;code&gt;ml-agents&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Navigate to your working directory, and run in the terminal:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;mlagents-learn &amp;lt;path to config file&amp;gt; --run-id=VB_1 --time-scale=1&lt;/code&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;When you see the message "Start training by pressing the Play button in the Unity Editor", click ▶ within the Unity GUI.&lt;/li&gt;
&lt;li&gt;In another terminal window, run &lt;code&gt;tensorboard --logdir results&lt;/code&gt; from your working directory to observe the training process.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Self-play training results
&lt;/h2&gt;

&lt;p&gt;In a stable training run, you should see the Elo rating gradually increase. &lt;/p&gt;
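&lt;p&gt;For reference, Elo transfers rating points from the loser to the winner in proportion to how unexpected the result was. Here's a quick sketch of the standard update rule (a textbook formula; ML-Agents' exact bookkeeping may differ):&lt;/p&gt;

```python
# Standard Elo update rule (a sketch; ML-Agents' exact bookkeeping may differ).
def elo_update(rating_a, rating_b, score_a, k=16):
    """score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses."""
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    change = k * (score_a - expected_a)
    return rating_a + change, rating_b - change

# Two equally rated agents: the winner takes k/2 = 8 points from the loser.
a, b = elo_update(1200.0, 1200.0, 1.0)
print(a, b)   # 1208.0 1192.0
```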

&lt;p&gt;In the diagram below, the three inflexion points correspond to the agent:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Learning to serve &lt;/li&gt;
&lt;li&gt;Learning to return the ball&lt;/li&gt;
&lt;li&gt;Learning more competitive shots&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fszk3gdbjlwmodfdki1j5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fszk3gdbjlwmodfdki1j5.jpg" title="ELO and Episode Length" alt="Tensorboard results"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/joooyz/how-to-train-agents-to-play-volleyball-using-deep-reinforcement-learning-417b"&gt;Compared to our previous training results&lt;/a&gt;, I found that even after ~80M steps, the agents trained using self-play don't serve or return the ball as reliably. However, they do learn to hit some interesting shots, like hitting the ball towards the edge of the court:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1mejlpcsil20efq9faj4.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1mejlpcsil20efq9faj4.gif" title="Volleyball agents trained using PPO self-play after 80M steps" alt="Trained agents using self-play playing volleyball"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you discover any other interesting playstyles, let me know!&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;p&gt;Thanks for reading! I hope you found this post useful.&lt;/p&gt;

&lt;p&gt;If you have any feedback or questions, feel free to post them on the &lt;a href="https://github.com/CoderOneHQ/ultimate-volleyball" rel="noopener noreferrer"&gt;Ultimate Volleyball Repo&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>deeplearning</category>
      <category>unity3d</category>
    </item>
    <item>
      <title>8+ Reinforcement Learning Project Ideas</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Thu, 30 Sep 2021 07:00:29 +0000</pubDate>
      <link>https://dev.to/joooyz/7-reinforcement-learning-project-ideas-14fm</link>
      <guid>https://dev.to/joooyz/7-reinforcement-learning-project-ideas-14fm</guid>
      <description>&lt;p&gt;This blog post is a compilation of reinforcement learning (RL) project ideas to check out. I've tried to select projects covering a range of different difficulties, concepts, and algorithms in RL.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Solve toy problems with &lt;a href="https://gym.openai.com/" rel="noopener noreferrer"&gt;OpenAI Gym&lt;/a&gt; (beginner-friendly)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbh9xo8mqizlidg1l1172.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbh9xo8mqizlidg1l1172.JPG" title="Cartpole environment from OpenAI Gym" alt="Cartpole"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://gym.openai.com/" rel="noopener noreferrer"&gt;OpenAI Gym&lt;/a&gt; has become the de facto standard for reinforcement learning frameworks among researchers and practitioners. Solving toy problems from the gym library will help familiarize you with this popular framework. Good starting points include &lt;a href="https://gym.openai.com/envs/CartPole-v1/" rel="noopener noreferrer"&gt;Cartpole&lt;/a&gt;, &lt;a href="https://gym.openai.com/envs/LunarLander-v2/" rel="noopener noreferrer"&gt;Lunar Lander&lt;/a&gt; and &lt;a href="https://gym.openai.com/envs/Taxi-v3/" rel="noopener noreferrer"&gt;Taxi&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you're interested in a step-by-step walkthrough, check out our &lt;a href="https://www.gocoder.one/blog/rl-tutorial-with-openai-gym" rel="noopener noreferrer"&gt;introductory Q-learning tutorial with Taxi&lt;/a&gt;.&lt;/p&gt;
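&lt;p&gt;To give a flavour of what tabular Q-learning looks like, here is a minimal, self-contained sketch on a made-up 5-state corridor (a hypothetical toy environment, not one of the Gym tasks above): the agent starts at state 0 and earns a reward of 1 for reaching state 4.&lt;/p&gt;

```python
import random

random.seed(0)

# Tabular Q-learning on a hypothetical 5-state corridor:
# action 1 moves right, action 0 moves left, reaching state 4 ends the episode.
n_states, n_actions = 5, 2
q = [[1.0] * n_actions for _ in range(n_states)]   # optimistic initial values
alpha, gamma, epsilon = 0.1, 0.9, 0.1

for episode in range(500):
    s = 0
    for _ in range(100):                            # cap episode length
        if random.random() > 1.0 - epsilon:         # explore with prob. epsilon
            a = random.randrange(n_actions)
        else:                                       # otherwise act greedily
            a = max(range(n_actions), key=lambda i: q[s][i])
        s_next = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s_next == 4 else 0.0
        best_next = 0.0 if s_next == 4 else max(q[s_next])  # terminal state has value 0
        q[s][a] += alpha * (r + gamma * best_next - q[s][a])
        if s_next == 4:
            break
        s = s_next

# The greedy policy should move right (action 1) from every non-terminal state.
policy = [max(range(n_actions), key=lambda i: q[s][i]) for s in range(4)]
print(policy)   # typically [1, 1, 1, 1]
```

The same update rule drives the Taxi tutorial linked above; the environment is just larger.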

&lt;h2&gt;
  
  
  2. Play Atari games from pixel input with &lt;a href="https://gym.openai.com/envs/#atari" rel="noopener noreferrer"&gt;OpenAI Gym&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feoqwtm3ahn9vz1v9chop.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feoqwtm3ahn9vz1v9chop.jpg" title="Atari environments from OpenAI Gym" alt="Atari environments"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OpenAI Gym also contains a suite of &lt;a href="https://gym.openai.com/envs/#atari" rel="noopener noreferrer"&gt;Atari game environments&lt;/a&gt; as part of its Arcade Learning Environment (ALE) framework. Examples include &lt;a href="https://gym.openai.com/envs/Breakout-v0/" rel="noopener noreferrer"&gt;Breakout&lt;/a&gt;, &lt;a href="https://gym.openai.com/envs/MontezumaRevenge-v0/" rel="noopener noreferrer"&gt;Montezuma's Revenge&lt;/a&gt;, and &lt;a href="https://gym.openai.com/envs/SpaceInvaders-v0/" rel="noopener noreferrer"&gt;Space Invaders&lt;/a&gt;. Environment observations are available in the form of screen input or RAM (direct observation of the Atari 2600's 128 bytes of memory).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Additional resources:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Deep%20Q%20Learning/Space%20Invaders/DQN%20Atari%20Space%20Invaders.ipynb" rel="noopener noreferrer"&gt;Jupyter notebook tutorial for Space Invaders&lt;/a&gt; by Thomas Simonini&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Simulate control tasks with &lt;a href="https://github.com/bulletphysics/bullet3" rel="noopener noreferrer"&gt;PyBullet&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7c2gzfd980o43q6qsddf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7c2gzfd980o43q6qsddf.png" title="PyBullet environment examples" alt="PyBullet"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Gym provides a library of continuous physics simulations in the form of its &lt;a href="https://gym.openai.com/envs/#mujoco" rel="noopener noreferrer"&gt;MuJoCo&lt;/a&gt; environments. Since MuJoCo requires a paid license, I recommend checking out &lt;a href="https://github.com/bulletphysics/bullet3" rel="noopener noreferrer"&gt;PyBullet&lt;/a&gt; as a free open-source alternative. Using PyBullet/MuJoCo, you can teach a variety of robots to walk, run, or swim.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Create your own reinforcement learning environment with &lt;a href="https://github.com/Unity-Technologies/ml-agents" rel="noopener noreferrer"&gt;Unity ML-Agents&lt;/a&gt; (beginner-friendly)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8c0yn02hrgnf3iuxqsh1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8c0yn02hrgnf3iuxqsh1.png" title="Unity ML-Agents example environments" alt="Unity ML-Agents"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Unity-Technologies/ml-agents" rel="noopener noreferrer"&gt;Unity ML-Agents&lt;/a&gt; is a relatively new add-on to the Unity game engine. It allows game developers to train intelligent NPCs for games and enables researchers to create graphics- and physics-rich RL environments. Project ideas to explore include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Experimenting with algorithms like PPO, SAC, GAIL, and Self-Play provided out-of-the-box&lt;/li&gt;
&lt;li&gt;Training agents in a library of 18+ environments including &lt;a href="https://github.com/Unity-Technologies/ml-agents/tree/dodgeball-env" rel="noopener noreferrer"&gt;Dodgeball&lt;/a&gt;, Soccer, and classic control problems&lt;/li&gt;
&lt;li&gt;Creating your own custom 3D RL environment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Additional resources:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.gocoder.one/blog/hands-on-introduction-to-deep-reinforcement-learning" rel="noopener noreferrer"&gt;Build a 3D Volleyball RL environment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.immersivelimit.com/tutorials/reinforcement-learning-penguins-part-1-unity-ml-agents" rel="noopener noreferrer"&gt;Reinforcement Learning Penguins&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.unity.com/course/ml-agents-hummingbirds" rel="noopener noreferrer"&gt;Unity Hummingbirds Course&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  5. Race self-driving cars with &lt;a href="https://aws.amazon.com/deepracer/" rel="noopener noreferrer"&gt;AWS DeepRacer&lt;/a&gt; (beginner-friendly)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3h9x75fnhqg9m3wfbz5t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3h9x75fnhqg9m3wfbz5t.png" title="AWS DeepRacer simulation" alt="AWS DeepRacer"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/deepracer/" rel="noopener noreferrer"&gt;AWS DeepRacer&lt;/a&gt; is a 3D racing simulator designed to help developers get started with RL using Amazon SageMaker. You'll need to pay for training and evaluating your model on AWS. It features monthly competitive races as part of the &lt;a href="https://aws.amazon.com/deepracer/league/" rel="noopener noreferrer"&gt;AWS DeepRacer league&lt;/a&gt;, which awards prizes and the chance to compete at re:Invent.&lt;/p&gt;

&lt;p&gt;DeepRacer also gives you the option of purchasing a physical 1/18th-scale race car for USD 399, which lets you deploy your model in the real world.&lt;/p&gt;

&lt;p&gt;Some other open-source projects related to autonomous driving worth checking out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/microsoft/AirSim" rel="noopener noreferrer"&gt;AirSim&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/carla-simulator/carla" rel="noopener noreferrer"&gt;CARLA&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  6. Mine diamonds in Minecraft with &lt;a href="https://minerl.io/" rel="noopener noreferrer"&gt;MineRL&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0f297v2xbl4kfj9ixmec.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0f297v2xbl4kfj9ixmec.png" title="MineRL dataset example" alt="MineRL"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://minerl.io/" rel="noopener noreferrer"&gt;MineRL&lt;/a&gt; contains an imitation learning dataset of over 60 million frames of recorded human player data in Minecraft. The goal is to train agents that can navigate an open world and overcome inherent challenges such as tasks with lots of hierarchy and sparse rewards.&lt;/p&gt;

&lt;p&gt;MineRL is currently running two competition tracks as part of NeurIPS 2021:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://minerl.io/diamond/" rel="noopener noreferrer"&gt;Diamond&lt;/a&gt;: Obtain a diamond provided a fixed limit of raw pixel sample data and time training&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://minerl.io/basalt/" rel="noopener noreferrer"&gt;BASALT&lt;/a&gt;:  Solve almost-lifelike tasks (e.g. build a house, search for a cave)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  7. Join the community at &lt;a href="https://aiarena.net/" rel="noopener noreferrer"&gt;AIArena&lt;/a&gt; building agents for StarCraft II
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyxd21t0ehjoj2gs619i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyxd21t0ehjoj2gs619i.png" title="AI Arena StarCraft II stream" alt="AI Arena"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're looking to train agents to play highly complex mainstream games, you should check out &lt;a href="https://aiarena.net/" rel="noopener noreferrer"&gt;AIArena&lt;/a&gt;. They run regular streams and ladders for a community of researchers, practitioners, and hobbyists building deep learning agents for StarCraft II.&lt;/p&gt;

&lt;p&gt;Some other games with RL frameworks you might be interested in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://rlgym.github.io/" rel="noopener noreferrer"&gt;Rocket League&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://leaguesandbox.github.io/" rel="noopener noreferrer"&gt;League of Legends&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://games.mau.se/research/the-dota2-5v5-ai-competition/" rel="noopener noreferrer"&gt;Dota 2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  8. Build a Chess Bot with &lt;a href="https://github.com/deepmind/open_spiel" rel="noopener noreferrer"&gt;OpenSpiel&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s6qop66ghgiwbiciqlm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s6qop66ghgiwbiciqlm.jpg" title="Image credit: DeepMind" alt="OpenSpiel"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/deepmind/open_spiel/" rel="noopener noreferrer"&gt;OpenSpiel&lt;/a&gt; by DeepMind is worth taking a look at if you've been inspired by programs like &lt;a href="https://stockfishchess.org/" rel="noopener noreferrer"&gt;StockFish&lt;/a&gt; or AlphaGo. It contains a collection of environments and algorithms for general RL and planning/search in a variety of games including Chess, Go, Backgammon, and more.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bonus ideas
&lt;/h2&gt;

&lt;p&gt;Here are some additional project ideas that are also worth checking out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Predict stock prices with &lt;a href="https://github.com/tensortrade-org/tensortrade" rel="noopener noreferrer"&gt;TensorTrade&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Train cooperative agents with &lt;a href="https://www.pettingzoo.ml/" rel="noopener noreferrer"&gt;PettingZoo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Build a Poker bot with &lt;a href="https://github.com/datamllab/rlcard" rel="noopener noreferrer"&gt;RLCard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Join an &lt;a href="https://www.gocoder.one/blog/ai-game-competitions-list" rel="noopener noreferrer"&gt;AI Programming competition&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing remarks
&lt;/h2&gt;

&lt;p&gt;There's a huge range of exciting projects to explore in reinforcement learning. This list is by no means comprehensive, but I hope it's given you some inspiration for your own RL project!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>beginners</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>How to train agents to play volleyball using deep reinforcement learning</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Thu, 23 Sep 2021 01:48:10 +0000</pubDate>
      <link>https://dev.to/joooyz/how-to-train-agents-to-play-volleyball-using-deep-reinforcement-learning-417b</link>
      <guid>https://dev.to/joooyz/how-to-train-agents-to-play-volleyball-using-deep-reinforcement-learning-417b</guid>
      <description>&lt;p&gt;This article is part 4 of the series '&lt;strong&gt;&lt;a href="https://dev.to/joooyz/a-hands-on-introduction-to-deep-reinforcement-learning-using-unity-ml-agents-4f8i"&gt;A hands-on introduction to deep reinforcement learning using Unity ML-Agents&lt;/a&gt;&lt;/strong&gt;'. It's also suitable for anyone interested in using Unity ML-Agents for their own reinforcement learning project.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Recap and overview&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In parts &lt;strong&gt;&lt;a href="https://dev.to/joooyz/build-a-reinforcement-learning-environment-using-unity-ml-agents-112e"&gt;2&lt;/a&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;a href="https://dev.to/joooyz/design-reinforcement-learning-agents-using-unity-ml-agents-58f0"&gt;3&lt;/a&gt;&lt;/strong&gt;, we built a volleyball environment using Unity ML-Agents. &lt;/p&gt;

&lt;p&gt;To recap, here is the reinforcement learning setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent actions&lt;/strong&gt; (4 discrete branches):

&lt;ul&gt;
&lt;li&gt;Move forward/backward&lt;/li&gt;
&lt;li&gt;Rotate clockwise/anti-clockwise&lt;/li&gt;
&lt;li&gt;Move left/right&lt;/li&gt;
&lt;li&gt;Jump&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Agent observations&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Agent's y-rotation [1 float]&lt;/li&gt;
&lt;li&gt;Agent's x,y,z-velocity [3 floats]&lt;/li&gt;
&lt;li&gt;Agent's x,y,z-normalized vector to the ball (i.e. direction to the ball) [3 floats]&lt;/li&gt;
&lt;li&gt;Ball's x,y,z-velocity [3 floats]&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Reward function:&lt;/strong&gt; +1 for hitting the ball over the net&lt;/li&gt;

&lt;/ul&gt;
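&lt;p&gt;To make the observation layout concrete, here's a minimal plain-Python sketch of how those 10 floats could be assembled. The helper name and values are hypothetical, not part of the Unity project:&lt;/p&gt;

```python
import math

# Hypothetical sketch of the 10-float observation vector described above.
def make_observation(agent_y_rot, agent_vel, agent_pos, ball_pos, ball_vel):
    dx, dy, dz = (b - a for a, b in zip(agent_pos, ball_pos))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz) or 1.0
    dir_to_ball = (dx / dist, dy / dist, dz / dist)  # normalized direction to ball
    # 1 (y-rotation) + 3 (agent velocity) + 3 (direction) + 3 (ball velocity) = 10
    return [agent_y_rot, *agent_vel, *dir_to_ball, *ball_vel]

obs = make_observation(0.5, (0.0, 0.0, 1.0), (0.0, 1.0, 0.0),
                       (2.0, 3.0, 0.0), (0.1, -0.2, 0.0))
assert len(obs) == 10
```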

&lt;p&gt;In this tutorial, we'll use ML-Agents to train these agents to play volleyball using the PPO reinforcement learning algorithm.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4yvsewne8i8gkw3n9dl6.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4yvsewne8i8gkw3n9dl6.gif" title="Reinforcement learning Agents playing volleyball. Trained using PPO on ~20M steps." alt="Trained PPO agents"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A note on PPO
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://openai.com/blog/openai-baselines-ppo/" rel="noopener noreferrer"&gt;Proximal Policy Optimization (PPO) by OpenAI&lt;/a&gt; is an on-policy reinforcement learning algorithm. We won't go into detail, but we choose to use it here because ML-Agents provides an implementation of it out-of-the-box. It produces stable results in this environment and is also recommended by ML-Agents for use with Self-Play (which we'll cover in the next tutorial).&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up for training
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you &lt;em&gt;didn't&lt;/em&gt; follow along with the previous tutorials&lt;/strong&gt;, you can clone or download a copy of the volleyball environment here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/CoderOneHQ/ultimate-volleyball" rel="noopener noreferrer"&gt;Ultimate Volleyball Repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you &lt;em&gt;did&lt;/em&gt; follow along with the previous tutorials&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load the &lt;code&gt;Volleyball.unity&lt;/code&gt; scene&lt;/li&gt;
&lt;li&gt;Select the &lt;code&gt;VolleyballArea&lt;/code&gt; object&lt;/li&gt;
&lt;li&gt;Ctrl (or CMD) + D to duplicate the object&lt;/li&gt;
&lt;li&gt;Position the &lt;code&gt;VolleyballArea&lt;/code&gt; objects so that they don't overlap&lt;/li&gt;
&lt;li&gt;Repeat 2 - 4 until you have ~16 copies of the environment&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugx0huucq26p74pbuium.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugx0huucq26p74pbuium.JPG" title="Volleyball scene containing 16x copies of the same reinforcement learning environment" alt="Volleyball Scene"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Each &lt;code&gt;VolleyballArea&lt;/code&gt; object is an exact copy of the reinforcement learning environment. All these agents act independently but share the same model. This speeds up training, since all agents contribute to training in parallel.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Selecting hyperparameters
&lt;/h2&gt;

&lt;p&gt;In your project working directory, create a file called &lt;code&gt;Volleyball.yaml&lt;/code&gt;. If you've downloaded the full Ultimate-Volleyball repo earlier, this is located in the &lt;code&gt;config&lt;/code&gt; folder.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Volleyball.yaml&lt;/code&gt; is a &lt;strong&gt;trainer configuration file&lt;/strong&gt; that specifies all the hyperparameters and other settings used during training. Paste the following inside &lt;code&gt;Volleyball.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;behaviors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Volleyball&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;trainer_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ppo&lt;/span&gt;
    &lt;span class="na"&gt;hyperparameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;batch_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2048&lt;/span&gt;
      &lt;span class="na"&gt;buffer_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20480&lt;/span&gt;
      &lt;span class="na"&gt;learning_rate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.0002&lt;/span&gt;
      &lt;span class="na"&gt;beta&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.003&lt;/span&gt;
      &lt;span class="na"&gt;epsilon&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.15&lt;/span&gt;
      &lt;span class="na"&gt;lambd&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.93&lt;/span&gt;
      &lt;span class="na"&gt;num_epoch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
      &lt;span class="na"&gt;learning_rate_schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;constant&lt;/span&gt;
    &lt;span class="na"&gt;network_settings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;normalize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;hidden_units&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;256&lt;/span&gt;
      &lt;span class="na"&gt;num_layers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
      &lt;span class="na"&gt;vis_encode_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;simple&lt;/span&gt;
    &lt;span class="na"&gt;reward_signals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;extrinsic&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;gamma&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.96&lt;/span&gt;
        &lt;span class="na"&gt;strength&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt;
    &lt;span class="na"&gt;keep_checkpoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
    &lt;span class="na"&gt;max_steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20000000&lt;/span&gt;
    &lt;span class="na"&gt;time_horizon&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1000&lt;/span&gt;
    &lt;span class="na"&gt;summary_freq&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Descriptions of the configurations are available in the &lt;a href="https://github.com/Unity-Technologies/ml-agents/blob/release_18_docs/docs/Training-Configuration-File.md" rel="noopener noreferrer"&gt;ML-Agents official documentation&lt;/a&gt;.&lt;/p&gt;
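&lt;p&gt;As a quick intuition for one of these settings: &lt;code&gt;gamma: 0.96&lt;/code&gt; discounts future rewards, so a +1 earned many steps from now is worth far less than one earned soon. A small sketch with hypothetical reward sequences:&lt;/p&gt;

```python
def discounted_return(rewards, gamma=0.96):
    # Work backwards through time: G_t = r_t + gamma * G_{t+1}
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# With gamma = 0.96, a +1 reward 50 steps away is worth only about 0.13 now,
# so agents are nudged toward reaching the ball sooner rather than later.
soon = discounted_return([0.0] * 5 + [1.0])
late = discounted_return([0.0] * 50 + [1.0])
assert soon > late
```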

&lt;h2&gt;
  
  
  Training
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Make sure that Behavior Types are set to &lt;code&gt;Default&lt;/code&gt;:

&lt;ol&gt;
&lt;li&gt;Open Assets &amp;gt; Prefabs &amp;gt; &lt;code&gt;VolleyballArea.prefab&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Select the &lt;code&gt;PurpleAgent&lt;/code&gt; object&lt;/li&gt;
&lt;li&gt;Go to Inspector window &amp;gt; Behavior Parameters &amp;gt; Behavior Type &amp;gt; Set to &lt;code&gt;Default&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Repeat for Blue Agent&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zwy1iyg153kwra8gf0r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zwy1iyg153kwra8gf0r.jpg" title="Behavior Parameters panel" alt="Behavior Parameters"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; the Behavior Name (Volleyball) above &lt;strong&gt;must match&lt;/strong&gt; the behavior name in the &lt;code&gt;Volleyball.yaml&lt;/code&gt; trainer config file (line 2).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ol start="2"&gt;
&lt;li&gt;
&lt;p&gt;(Optional) Set up a training camera so that you can view the whole scene while training.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;If using the pre-built repo&lt;/strong&gt;, select the Main Camera and turn it off in the Inspector. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If using your own project,&lt;/strong&gt; create a camera object: right click in Hierarchy &amp;gt; Camera. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbpslmdztibi6u9kukbdr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbpslmdztibi6u9kukbdr.jpg" title="Setting up the training camera" alt="Training camera setup"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Activate the virtual environment containing your installation of &lt;code&gt;ml-agents&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Navigate to your working directory, and run in the terminal:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mlagents-learn &amp;lt;path to config file&amp;gt; &lt;span class="nt"&gt;--run-id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;VB_1 &lt;span class="nt"&gt;--time-scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Notes:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Replace &lt;code&gt;&amp;lt;path to config file&amp;gt;&lt;/code&gt; , e.g. &lt;code&gt;config/Volleyball.yaml&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;ML-Agents defaults to a time scale of 20x to speed up training. Setting the flag &lt;code&gt;--time-scale=1&lt;/code&gt; is important because the physics in this environment are time-dependent. Without it, you may notice that your agents perform differently during inference compared to training.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;ol start="5"&gt;
&lt;li&gt;
&lt;p&gt;When you see the message "Start training by pressing the Play button in the Unity Editor", click ▶ within the Unity GUI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiilj7s7p1cxon5v124c2.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiilj7s7p1cxon5v124c2.JPG" title="Unity ML-Agents interface" alt="Unity ML-Agents interface"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In another terminal window, run &lt;code&gt;tensorboard --logdir results&lt;/code&gt; from your working directory to observe the training process.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffj49yte0m1almp4x8vnc.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffj49yte0m1almp4x8vnc.JPG" title="Tensorboard dashboard" alt="Tensorboard"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="7"&gt;
&lt;li&gt;
&lt;p&gt;You can pause training at any time by clicking the ▶ button in Unity. To see how the agents are performing:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Locate the results in &lt;code&gt;results/VB_1/Volleyball.onnx&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Copy this .onnx model into the Unity project&lt;/li&gt;
&lt;li&gt;Drag the model into the &lt;code&gt;Model&lt;/code&gt; field of the Behavior Parameters component. &lt;/li&gt;
&lt;li&gt;Click ▶ to watch the agents use this model for inference.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqske0idzc1v9njwmnju4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqske0idzc1v9njwmnju4.jpg" title="Setting the model in Behavior Parameters" alt="Behavior Parameters"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;To resume training, add the &lt;code&gt;--resume&lt;/code&gt; flag (e.g. &lt;code&gt;mlagents-learn config/Volleyball.yaml --run-id=VB_1 --time-scale=1 --resume&lt;/code&gt;)&lt;/p&gt;&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4nnrf4xx50s2yvuzdjep.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4nnrf4xx50s2yvuzdjep.gif" title="Agents training in parallel" alt="Training agents"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="9"&gt;
&lt;li&gt;Leave the agents to train. At ~5M steps, you'll start to see the agents occasionally touching the ball. At ~10M steps, the agents can start to volley:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18s3jepqwe1nfm88zp8x.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18s3jepqwe1nfm88zp8x.gif" title="Agents after training for 10M steps" alt="Training agents after 10M steps"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="10"&gt;
&lt;li&gt;At ~20M steps, the agents should be able to successfully volley the ball back-and-forth!&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmsuo2w0bb4bcxgr6m7yu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmsuo2w0bb4bcxgr6m7yu.gif" title="Reinforcement learning Agents playing volleyball. Trained using PPO on ~20M steps." alt="Trained agents"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;p&gt;In this tutorial, you successfully trained agents to play volleyball in ~20M steps using PPO. Try playing around with the hyperparameters in &lt;code&gt;Volleyball.yaml&lt;/code&gt; or training for more steps to get a better result. &lt;/p&gt;

&lt;p&gt;These agents are trained to keep the ball in play. You won't be able to train &lt;em&gt;competitive&lt;/em&gt; agents (with the intention of &lt;em&gt;winning&lt;/em&gt; the game) with this setup, because it's a zero-sum game and both purple and blue agents share the same model. This is where competitive &lt;strong&gt;Self-Play&lt;/strong&gt; comes in.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>unity3d</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Bomberland: a competitive sandbox for practising machine learning</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Mon, 20 Sep 2021 00:48:51 +0000</pubDate>
      <link>https://dev.to/joooyz/bomberland-a-new-artificial-intelligence-competition-2i1k</link>
      <guid>https://dev.to/joooyz/bomberland-a-new-artificial-intelligence-competition-2i1k</guid>
      <description>&lt;h2&gt;
  
  
  Welcome to Bomberland
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.gocoder.one/bomberland?s=devto-blog" rel="noopener noreferrer"&gt;Bomberland&lt;/a&gt; is a new 1v1 AI competition developed by &lt;a href="https://www.gocoder.one?s=devto-blog" rel="noopener noreferrer"&gt;Coder One&lt;/a&gt;. It features a multi-agent adversarial environment inspired by the classic console game, Bomberman. &lt;/p&gt;

&lt;p&gt;Your task is to program an intelligent agent navigating a 2D grid world. Your agent controls a team of units collecting powerups and placing explosives, with the ultimate goal of taking your opponent down.&lt;/p&gt;

&lt;p&gt;Bomberland is a challenging problem for out-of-the-box machine learning algorithms. Be prepared to manage real-time decision making, planning, game theory, and both adversarial and cooperative play.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ybvj1iq76d3vpf87558.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ybvj1iq76d3vpf87558.gif" title="In Bomberland, each agent controls several units with the ultimate goal of taking down the opposing team." alt="Bomberland preview"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  An open Bomberland arena
&lt;/h2&gt;

&lt;p&gt;Bomberland will feature an ongoing, always-on arena with an active leaderboard. Participants can get direct feedback on their strategies in 1v1 matches against other players.&lt;/p&gt;

&lt;p&gt;From time to time, we'll hold tournaments featuring live streams and prizes. Check out our previous &lt;a href="https://twitch.tv/CoderOneHQ" rel="noopener noreferrer"&gt;AI Sports Challenge streams&lt;/a&gt; for a sneak peek of what's ahead.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyum6w3j7mjotdaudjnux.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyum6w3j7mjotdaudjnux.jpg" title="AI Sports Challenge 2021 live stream on Twitch" alt="AI Sports Challenge Live Stream"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Bomberland?
&lt;/h2&gt;

&lt;p&gt;We're creating Bomberland as a place for the community to explore and experiment with cutting-edge techniques, from tree search algorithms to deep reinforcement learning.&lt;/p&gt;

&lt;p&gt;You'll want to check it out if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're looking for a challenging hands-on ML project&lt;/li&gt;
&lt;li&gt;You're looking for a place to try out new libraries, frameworks, or research papers&lt;/li&gt;
&lt;li&gt;You've been fascinated by the work of companies like DeepMind and OpenAI&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The future
&lt;/h2&gt;

&lt;p&gt;We envision Bomberland evolving over time with new metas and challenges.&lt;/p&gt;

&lt;p&gt;Bomberland is part of our larger goal at Coder One to make cutting-edge ML accessible. We're focused on building out the right tools and infrastructure to support the community in progressively pushing the boundaries of what's possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Join us for Bomberland!
&lt;/h2&gt;

&lt;p&gt;The Bomberland competition is now live 🎉&lt;/p&gt;

&lt;p&gt;We have starter kits in Python and TypeScript to help you get started (and encourage any community contributions to the &lt;a href="https://github.com/CoderOneHQ/bomberland" rel="noopener noreferrer"&gt;starter kit repo&lt;/a&gt;).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.gocoder.one/bomberlands=devto-blog" rel="noopener noreferrer"&gt;Join Bomberland&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://discord.gg/DXpTKWQSpP" rel="noopener noreferrer"&gt;Join Discord&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>machinelearning</category>
      <category>programming</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Design reinforcement learning agents using Unity ML-Agents</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Wed, 08 Sep 2021 01:22:22 +0000</pubDate>
      <link>https://dev.to/joooyz/design-reinforcement-learning-agents-using-unity-ml-agents-58f0</link>
      <guid>https://dev.to/joooyz/design-reinforcement-learning-agents-using-unity-ml-agents-58f0</guid>
      <description>&lt;p&gt;This article is part 3 of the series '&lt;a href="https://dev.to/joooyz/a-hands-on-introduction-to-deep-reinforcement-learning-using-unity-ml-agents-4f8i"&gt;A hands-on introduction to deep reinforcement learning using Unity ML-Agents&lt;/a&gt;'. It's also suitable for anyone new to Unity interested in using ML-Agents for their own reinforcement learning project.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Recap and overview&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In &lt;a href="https://dev.to/joooyz/build-a-reinforcement-learning-environment-using-unity-ml-agents-112e"&gt;part 2&lt;/a&gt;, we built a 3D physics-based volleyball environment in Unity. We also added rewards to encourage agents to 'volley'.&lt;/p&gt;

&lt;p&gt;In this tutorial, we'll add agents to the environment. The goal is to let them observe and interact with the environment so that we can train them later using deep reinforcement learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Letting our agents make decisions
&lt;/h2&gt;

&lt;p&gt;We want our agent to learn which actions to take given a certain state of the environment — e.g. if the ball is on our side of the court, our agent should get it before it hits the floor.&lt;/p&gt;

&lt;p&gt;The goal of reinforcement learning is to learn the &lt;strong&gt;&lt;em&gt;best&lt;/em&gt; policy&lt;/strong&gt; (a mapping of states to actions) &lt;strong&gt;that will maximise possible rewards.&lt;/strong&gt; The theory behind how reinforcement learning algorithms achieve this is beyond the scope of this series, but the courses I shared in the &lt;a href="https://dev.to/joooyz/a-hands-on-introduction-to-deep-reinforcement-learning-using-unity-ml-agents-4f8i"&gt;series introduction&lt;/a&gt; will cover it in great depth.&lt;/p&gt;

&lt;p&gt;While training, the agent will either take actions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;At random (to explore which actions lead to rewards and which don't)&lt;/li&gt;
&lt;li&gt;From its current policy (the optimal action given the current state)&lt;/li&gt;
&lt;/ol&gt;
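&lt;p&gt;ML-Agents' PPO actually explores by sampling from a stochastic policy rather than flipping an explicit coin, but the trade-off between the two modes above can be sketched with a hypothetical epsilon-style chooser:&lt;/p&gt;

```python
import random

# Hypothetical epsilon-style chooser, purely for intuition.
def choose_action(policy_action, action_space, explore_prob):
    if random.random() >= explore_prob:
        return policy_action  # 2. exploit: the policy's preferred action
    return random.choice(action_space)  # 1. explore: act at random

random.seed(0)
picks = [choose_action(1, [0, 1, 2], explore_prob=0.2) for _ in range(1000)]
assert picks.count(1) > 800  # mostly exploits, occasionally explores
```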

&lt;p&gt;ML-Agents provides a convenient &lt;strong&gt;Decision Requester&lt;/strong&gt; component which will handle the alternation between these for us during training.&lt;/p&gt;

&lt;p&gt;To add a Decision Requester:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Select the &lt;strong&gt;PurpleAgent&lt;/strong&gt; game object (within the &lt;strong&gt;PurplePlayArea&lt;/strong&gt; parent).&lt;/li&gt;
&lt;li&gt;Add Component &amp;gt; Decision Requester. &lt;/li&gt;
&lt;li&gt;Leave the Decision Period at its default value of 5.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9zqswh97373ihi7xhu9.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9zqswh97373ihi7xhu9.JPG" title="Decision Requester" alt="Decision Requester"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Defining the agent behavior
&lt;/h2&gt;

&lt;p&gt;Both agents are already set up with the &lt;code&gt;VolleyballAgent.cs&lt;/code&gt; script and &lt;strong&gt;Behavior Parameters&lt;/strong&gt; component (which we'll come back to later).&lt;/p&gt;

&lt;p&gt;In this part we'll walk through &lt;code&gt;VolleyballAgent.cs&lt;/code&gt;.  This script contains all the logic that defines the agents' actions and observations. It contains some helper methods already:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Start()&lt;/code&gt; — called before the first frame update. Grabs the parent Volleyball environment and saves it to a variable &lt;code&gt;envController&lt;/code&gt; for easy reference to its methods later.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Initialize()&lt;/code&gt; — called when the &lt;strong&gt;agent&lt;/strong&gt; is first initialized. Grabs some useful constants and objects. Also sets &lt;code&gt;agentRot&lt;/code&gt; to ensure symmetry so that the same policy can be shared between both agents.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MoveTowards()&lt;/code&gt;, &lt;code&gt;CheckIfGrounded()&lt;/code&gt; &amp;amp; &lt;code&gt;Jump()&lt;/code&gt; — from ML-Agents sample projects. Used for jumping.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;OnCollisionEnter()&lt;/code&gt; — called when the Agent collides with something. Used to update &lt;code&gt;lastHitter&lt;/code&gt; to decide which agent gets penalized if the ball is hit out of bounds or rewarded if hit over the net.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adding an agent in Unity ML-Agents usually involves extending the base &lt;code&gt;Agent&lt;/code&gt; class, and implementing the following methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;OnActionReceived()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Heuristic()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CollectObservations()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;OnEpisodeBegin()&lt;/code&gt; (&lt;strong&gt;Note:&lt;/strong&gt; usually used for resetting starting conditions. We don't implement it here, because the reset logic is already defined at the environment-level in &lt;code&gt;VolleyballEnvController&lt;/code&gt;. This makes more sense for us since we also need to reset the ball and not just the agents.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Agent actions
&lt;/h2&gt;

&lt;p&gt;At a high level, the Decision Requester will select an action for our agent to take and trigger &lt;code&gt;OnActionReceived()&lt;/code&gt;. This in turn calls &lt;code&gt;MoveAgent()&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;MoveAgent()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This method resolves the selected action. &lt;/p&gt;

&lt;p&gt;Within the &lt;code&gt;MoveAgent()&lt;/code&gt; method, start by declaring vector variables for our agent's direction and rotation movements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt;&lt;span class="err"&gt; &lt;/span&gt;&lt;span class="n"&gt;dirToGo&lt;/span&gt;&lt;span class="err"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="err"&gt; &lt;/span&gt;&lt;span class="n"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt;&lt;span class="err"&gt; &lt;/span&gt;&lt;span class="n"&gt;rotateDir&lt;/span&gt;&lt;span class="err"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="err"&gt; &lt;/span&gt;&lt;span class="n"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll also add a 'grounded' check to see whether it's possible for the agent to jump:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;grounded&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;CheckIfGrounded&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
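&lt;p&gt;&lt;code&gt;CheckIfGrounded()&lt;/code&gt; isn't shown in this post. A common way to implement it is a short downward raycast from the agent (the ray length here is an illustrative assumption; tune it to your collider size):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;bool CheckIfGrounded()
{
    // Cast a short ray straight down; a hit means the agent is standing on something
    return Physics.Raycast(transform.position, Vector3.down, 0.55f);
}
&lt;/code&gt;&lt;/pre&gt;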



&lt;p&gt;The actions passed into this method (&lt;code&gt;actionBuffers.DiscreteActions&lt;/code&gt;) will be an array of integers, which we'll map to behaviors. The order in which we assign them isn't important, as long as it stays consistent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;dirToGoForwardAction&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;act&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;rotateDirAction&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;act&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;dirToGoSideAction&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;act&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;jumpAction&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;act&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Unity, every object has a &lt;code&gt;transform&lt;/code&gt; component that stores its position, rotation, and scale. We'll use it to create a vector pointing in the direction we want our agent to move.&lt;/p&gt;

&lt;p&gt;Based on the previous assignment order, this is how we'll map our actions to behaviors:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;dirToGoForwardAction&lt;/code&gt;: Do nothing [0] | Move forward [1] | Move backward [2]&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;rotateDirAction&lt;/code&gt;: Do nothing [0] | Rotate clockwise [1] | Rotate anti-clockwise [2]&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;dirToGoSideAction&lt;/code&gt;: Do nothing [0] | Move left [1] | Move right [2] &lt;/li&gt;
&lt;li&gt;
&lt;code&gt;jumpAction&lt;/code&gt;: Don't jump [0] | Jump [1]&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Add to the &lt;code&gt;MoveAgent()&lt;/code&gt; method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dirToGoForwardAction&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dirToGo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grounded&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="m"&gt;1f&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;forward&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dirToGoForwardAction&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dirToGo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grounded&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="m"&gt;1f&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;forward&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;speedReductionFactor&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rotateDirAction&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;rotateDir&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;up&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rotateDirAction&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;rotateDir&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;up&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dirToGoSideAction&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dirToGo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grounded&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="m"&gt;1f&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;speedReductionFactor&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dirToGoSideAction&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dirToGo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grounded&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="m"&gt;1f&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;speedReductionFactor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jumpAction&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(((&lt;/span&gt;&lt;span class="n"&gt;jumpingTime&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="m"&gt;0f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;grounded&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;Jump&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;volleyballSettings.speedReductionFactor&lt;/code&gt; is a constant that slows backward and strafe movement to feel more 'realistic'.&lt;/p&gt;
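&lt;p&gt;For reference, &lt;code&gt;volleyballSettings&lt;/code&gt; is a plain component holding tunable constants. A minimal sketch might look like this (the field values are illustrative assumptions; see &lt;code&gt;VolleyballSettings.cs&lt;/code&gt; in the project repo for the real ones):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;public class VolleyballSettings : MonoBehaviour
{
    // Illustrative values only; check the project repo for the actual settings
    public float agentRunSpeed = 1.5f;
    public float speedReductionFactor = 0.75f; // slows backward/strafe movement
    public float agentJumpHeight = 2.75f;
    public float agentJumpVelocity = 777f;
    public float agentJumpVelocityMaxChange = 10f;
    public float fallingForce = 150f;
}
&lt;/code&gt;&lt;/pre&gt;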

&lt;p&gt;Next, apply the movement using Unity's provided &lt;code&gt;Rotate&lt;/code&gt; and &lt;code&gt;AddForce&lt;/code&gt; methods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Rotate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rotateDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fixedDeltaTime&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="m"&gt;200f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;agentRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddForce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agentRot&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;dirToGo&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agentRunSpeed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ForceMode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VelocityChange&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, add in the logic for controlling jump behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// makes the agent physically "jump"&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jumpingTime&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;jumpTargetPos&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agentRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;jumpStartingPos&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agentJumpHeight&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;agentRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;agentRot&lt;/span&gt;&lt;span class="p"&gt;*&lt;/span&gt;&lt;span class="n"&gt;dirToGo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="nf"&gt;MoveTowards&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jumpTargetPos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agentRb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agentJumpVelocity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agentJumpVelocityMaxChange&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// provides a downward force to end the jump&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!(&lt;/span&gt;&lt;span class="n"&gt;jumpingTime&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;!&lt;/span&gt;&lt;span class="n"&gt;grounded&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;agentRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddForce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;down&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fallingForce&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ForceMode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Acceleration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// controls the jump sequence&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jumpingTime&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;jumpingTime&lt;/span&gt; &lt;span class="p"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fixedDeltaTime&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
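&lt;p&gt;The &lt;code&gt;Jump()&lt;/code&gt; and &lt;code&gt;MoveTowards()&lt;/code&gt; helpers used above aren't shown in this section. Here's a hedged sketch, with the field names and jump duration inferred from how they're called (check the project repo for the actual implementations):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;void Jump()
{
    // Start the jump timer and remember where the jump began
    jumpingTime = 0.2f; // illustrative duration
    jumpStartingPos = agentRb.position;
}

void MoveTowards(Vector3 targetPos, Rigidbody rb, float targetVel, float maxVel)
{
    // Steer the rigidbody's velocity toward the jump target,
    // limiting how much the velocity can change per physics step
    Vector3 moveToPos = targetPos - rb.worldCenterOfMass;
    Vector3 velocityTarget = Time.fixedDeltaTime * targetVel * moveToPos;
    if (float.IsNaN(velocityTarget.x) == false)
    {
        rb.velocity = Vector3.MoveTowards(rb.velocity, velocityTarget, maxVel);
    }
}
&lt;/code&gt;&lt;/pre&gt;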



&lt;h3&gt;
  
  
  &lt;code&gt;Heuristic()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;To test that we've resolved the actions properly, let's implement the &lt;code&gt;Heuristic()&lt;/code&gt; method. This maps keyboard input to actions, so that we can playtest as a human controller.&lt;/p&gt;

&lt;p&gt;Add to &lt;code&gt;Heuristic()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;discreteActionsOut&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;actionsOut&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DiscreteActions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// rotate right&lt;/span&gt;
    &lt;span class="n"&gt;discreteActionsOut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;||&lt;/span&gt; &lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UpArrow&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// forward&lt;/span&gt;
    &lt;span class="n"&gt;discreteActionsOut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// rotate left&lt;/span&gt;
    &lt;span class="n"&gt;discreteActionsOut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;S&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;||&lt;/span&gt; &lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DownArrow&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// backward&lt;/span&gt;
    &lt;span class="n"&gt;discreteActionsOut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LeftArrow&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// move left&lt;/span&gt;
    &lt;span class="n"&gt;discreteActionsOut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RightArrow&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// move right&lt;/span&gt;
    &lt;span class="n"&gt;discreteActionsOut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;discreteActionsOut&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KeyCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Space&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save your script and return to the Unity editor.&lt;/p&gt;

&lt;p&gt;In the Behavior Parameters component of the PurpleAgent:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Set Behavior Type to Heuristic Only. This will call the &lt;code&gt;Heuristic()&lt;/code&gt; method.&lt;/li&gt;
&lt;li&gt;Set up the Actions:

&lt;ol&gt;
&lt;li&gt;Discrete Branches = 4

&lt;ol&gt;
&lt;li&gt;Branch 0 Size = 3 [No movement, move forward, move backward]&lt;/li&gt;
&lt;li&gt;Branch 1 Size = 3 [No rotation, rotate clockwise, rotate anti-clockwise]&lt;/li&gt;
&lt;li&gt;Branch 2 Size = 3 [No movement, move left, move right]&lt;/li&gt;
&lt;li&gt;Branch 3 Size = 2 [No jump, jump]&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92njkrobu5gf1fvjbiq7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92njkrobu5gf1fvjbiq7.jpg" title="How to set up actions in Behavior Parameters" alt="Setup actions"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Press ▶️ in the editor and you'll be able to control your agent: W/S (or up/down arrows) to move forward and backward, A/D to rotate, left/right arrows to strafe, and the space bar to jump! &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; It might be easier to playtest if you comment out the &lt;code&gt;EndEpisode()&lt;/code&gt; calls in &lt;code&gt;ResolveEvent()&lt;/code&gt; of &lt;code&gt;VolleyballEnvController.cs&lt;/code&gt; to stop the episode resetting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Observations
&lt;/h2&gt;

&lt;p&gt;Observations are how our agent 'sees' its environment. &lt;/p&gt;

&lt;p&gt;In ML-Agents, there are 3 types of observations we can use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vectors&lt;/strong&gt; — "direct" information about our environment (e.g. a list of floats containing the position, scale, velocity, etc of objects)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Raycasts&lt;/strong&gt; —  "beams" that shoot out from the agent and detect nearby objects&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Visual/camera input&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this project, we'll implement &lt;strong&gt;vector observations&lt;/strong&gt; to keep things simple. &lt;strong&gt;The goal is to include only the observations that are relevant for making an informed decision.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With some trial and error, here's what I decided to use for observations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent's y-rotation [1 float]&lt;/li&gt;
&lt;li&gt;Agent's x,y,z-velocity [3 floats]&lt;/li&gt;
&lt;li&gt;Agent's normalized x,y,z-vector to the ball (i.e. direction to the ball) [3 floats]&lt;/li&gt;
&lt;li&gt;Agent's distance to the ball [1 float]&lt;/li&gt;
&lt;li&gt;Ball's x,y,z-velocity [3 floats]&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a total of &lt;strong&gt;11 vector observations&lt;/strong&gt;. Feel free to experiment with different observations. For example, you might've noticed that the agent knows nothing about its opponent. This works fine for training a simple agent that can bounce the ball over the net, but it won't produce a competitive agent that plays to win.&lt;/p&gt;
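&lt;p&gt;For example, if you wanted the agent to 'see' its opponent, you could add something like this inside &lt;code&gt;CollectObservations()&lt;/code&gt; (&lt;code&gt;opponentRb&lt;/code&gt; is a hypothetical &lt;code&gt;Rigidbody&lt;/code&gt; reference you'd need to wire up yourself, and the observation Space Size would grow accordingly):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical extra observation: direction to the opponent (3 floats)
sensor.AddObservation((opponentRb.position - agentRb.position).normalized);
&lt;/code&gt;&lt;/pre&gt;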

&lt;p&gt;Also note that your choice of observations depends on your goal. If you're trying to replicate a 'real-world' scenario, these observations make less sense: it would be very unlikely for a player to 'know' these exact values about the environment.&lt;/p&gt;

&lt;p&gt;To add observations, you'll need to implement the Agent class &lt;code&gt;CollectObservations()&lt;/code&gt; method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;CollectObservations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;VectorSensor&lt;/span&gt; &lt;span class="n"&gt;sensor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Agent rotation (1 float)&lt;/span&gt;
    &lt;span class="n"&gt;sensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddObservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rotation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Vector from agent to ball (direction to ball) (3 floats)&lt;/span&gt;
    &lt;span class="n"&gt;Vector3&lt;/span&gt; &lt;span class="n"&gt;toBall&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;ballRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)*&lt;/span&gt;&lt;span class="n"&gt;agentRot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ballRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ballRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)*&lt;/span&gt;&lt;span class="n"&gt;agentRot&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;sensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddObservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;toBall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;normalized&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Distance from the ball (1 float)&lt;/span&gt;
    &lt;span class="n"&gt;sensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddObservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;toBall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;magnitude&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Agent velocity (3 floats)&lt;/span&gt;
    &lt;span class="n"&gt;sensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddObservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agentRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;velocity&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Ball velocity (3 floats)&lt;/span&gt;
    &lt;span class="n"&gt;sensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddObservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ballRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;velocity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;sensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddObservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ballRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;velocity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;*&lt;/span&gt;&lt;span class="n"&gt;agentRot&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;sensor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddObservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ballRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;velocity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;*&lt;/span&gt;&lt;span class="n"&gt;agentRot&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we'll finish setting up the Behavior Parameters:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Set &lt;strong&gt;Behavior Name&lt;/strong&gt; to 'Volleyball'. Later, this is how our trainer will know which agent to train.&lt;/li&gt;
&lt;li&gt;Set Vector Observation:

&lt;ol&gt;
&lt;li&gt;Space Size: 11&lt;/li&gt;
&lt;li&gt;Stacked Vectors: 1&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febyz3o4t8f8kspdp1ni6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febyz3o4t8f8kspdp1ni6.jpg" title="How to set up observations in Behavior Parameters" alt="Setup observations"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;p&gt;You're now all set up to train your reinforcement learning agents.&lt;/p&gt;

&lt;p&gt;If you get stuck, check out the pre-configured &lt;code&gt;BlueAgent&lt;/code&gt;, or see the full source code in the &lt;a href="https://github.com/CoderOneHQ/ultimate-volleyball" rel="noopener noreferrer"&gt;Ultimate Volleyball project repo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the next section, we'll train our agents using &lt;a href="https://openai.com/blog/openai-baselines-ppo/" rel="noopener noreferrer"&gt;PPO&lt;/a&gt; — a state-of-the-art RL algorithm provided out of the box by Unity ML-Agents.&lt;/p&gt;

&lt;p&gt;If you have any feedback or questions, please let me know!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>unity3d</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Build a reinforcement learning environment using Unity ML-Agents</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Thu, 02 Sep 2021 01:16:57 +0000</pubDate>
      <link>https://dev.to/joooyz/build-a-reinforcement-learning-environment-using-unity-ml-agents-112e</link>
      <guid>https://dev.to/joooyz/build-a-reinforcement-learning-environment-using-unity-ml-agents-112e</guid>
      <description>&lt;p&gt;This article is part 2 of the series '&lt;a href="https://dev.to/joooyz/a-hands-on-introduction-to-deep-reinforcement-learning-using-unity-ml-agents-4f8i"&gt;A hands-on introduction to deep reinforcement learning using Unity ML-Agents&lt;/a&gt;'. It's also suitable for anyone new to Unity interested in using ML-Agents for their own reinforcement learning project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recap and overview
&lt;/h2&gt;

&lt;p&gt;In my &lt;a href="https://dev.to/joooyz/an-introduction-to-machine-learning-with-unity-ml-agents-3an5"&gt;previous post&lt;/a&gt;, I went over how to set up ML-Agents and train an agent.&lt;/p&gt;

&lt;p&gt;In this article, I'll walk through how to build a 3D physics-based volleyball environment in Unity. We'll use this environment later to train agents that can successfully play volleyball using deep reinforcement learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up the court
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Download or clone the starter project from this &lt;a href="https://github.com/CoderOneHQ/ultimate-volleyball-starter" rel="noopener noreferrer"&gt;repo&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Open Unity Hub and go to Projects &amp;gt; Add.&lt;/li&gt;
&lt;li&gt;Select the 'ultimate-volleyball-starter' project folder. You might see some warning messages in the Console, but they're safe to ignore for now.&lt;/li&gt;
&lt;li&gt;From the &lt;strong&gt;Project&lt;/strong&gt; tab in Unity, navigate to &lt;strong&gt;Assets&lt;/strong&gt; &amp;gt; &lt;strong&gt;Scenes.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Load the &lt;code&gt;Volleyball.unity&lt;/code&gt; scene.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Project&lt;/strong&gt; tab go to &lt;strong&gt;Assets&lt;/strong&gt; &amp;gt; &lt;strong&gt;Prefabs&lt;/strong&gt; and drag the &lt;code&gt;VolleyballArea.prefab&lt;/code&gt; object into the scene.&lt;/li&gt;
&lt;li&gt;Save the project.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgtv5cxa19a3h04qaj9t.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgtv5cxa19a3h04qaj9t.gif" title="Dragging prefab into the Volleyball scene" alt="add-prefab-to-scene.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you click &lt;strong&gt;Play&lt;/strong&gt; ▶️ above the Scene viewer, you'll notice some odd behavior, because we haven't yet added any physics or logic to define how the game objects should interact. We'll do that in the next section.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up the environment
&lt;/h2&gt;

&lt;p&gt;⚠ Before we start, open the &lt;strong&gt;VolleyballArea prefab&lt;/strong&gt; (Project panel &amp;gt; Assets &amp;gt; Prefabs). We'll make our edits to the base prefab, so that they are reflected in all instances of this prefab. This will come in handy later when we duplicate our environment multiple times for parallel training.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F830nbzhz2ffghu4bpnk8.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F830nbzhz2ffghu4bpnk8.JPG" title="Editing a Prefab in Unity" alt="prefab-view.JPG"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Volleyball&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Make our volleyball subject to Unity's physics engine:&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In the &lt;strong&gt;Hierarchy&lt;/strong&gt; panel, expand the &lt;strong&gt;VolleyballArea&lt;/strong&gt; object and select the &lt;strong&gt;Volleyball&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;From the &lt;strong&gt;Inspector&lt;/strong&gt; panel, set the tag to &lt;code&gt;ball&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add Component&lt;/strong&gt; &amp;gt; &lt;strong&gt;Rigidbody&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Set Mass = 3, Drag = 1, and Angular Drag = 1. Feel free to experiment with these values: a heavier ball makes the environment 'harder'.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Add 'bounciness' to our ball:&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add a &lt;strong&gt;Sphere Collider&lt;/strong&gt; component.&lt;/li&gt;
&lt;li&gt;Set Radius to 0.15.&lt;/li&gt;
&lt;li&gt;From the &lt;strong&gt;Project&lt;/strong&gt; panel, go to &lt;strong&gt;Assets&lt;/strong&gt; &amp;gt; &lt;strong&gt;Materials &amp;gt; Physic Materials&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Drag &lt;code&gt;Bouncy.physicMaterial&lt;/code&gt; into the 'Material' slot.&lt;/li&gt;
&lt;li&gt;You can double-click &lt;code&gt;Bouncy.physicMaterial&lt;/code&gt; to change the 'bounciness'.&lt;/li&gt;
&lt;/ol&gt;
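&lt;p&gt;For reference, the same bouncy setup can also be built from code. This is only a sketch: the &lt;code&gt;bounciness&lt;/code&gt; value and combine mode below are assumptions, not necessarily what &lt;code&gt;Bouncy.physicMaterial&lt;/code&gt; actually uses.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;// Sketch only: create a bouncy physic material at runtime.
// The values are illustrative; inspect Bouncy.physicMaterial for the real ones.
var bouncy = new PhysicMaterial();
bouncy.bounciness = 0.8f;
bouncy.bounceCombine = PhysicMaterialCombine.Maximum;

// Assign it to the ball's sphere collider.
var sphereCollider = GetComponent&amp;lt;SphereCollider&amp;gt;();
sphereCollider.material = bouncy;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;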

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyavr4myfoqciyniltfan.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyavr4myfoqciyniltfan.jpg" title="Inspector panel for Volleyball object" alt="volleyball-components.jpg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Both blue and purple agent cubes have already been set up for you in a similar way to the Volleyball.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Ground&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Select the &lt;strong&gt;Ground&lt;/strong&gt; game object.&lt;/li&gt;
&lt;li&gt;From the &lt;strong&gt;Inspector&lt;/strong&gt; panel, set the tag to &lt;code&gt;walkableSurface&lt;/code&gt;. This is used later to check whether the agent is 'grounded' before allowing its jump action.&lt;/li&gt;
&lt;li&gt;Add a &lt;strong&gt;Box Collider&lt;/strong&gt; component. This is used to register collisions with other game objects containing Rigid Body components. Without it, they will just fall through the ground.&lt;/li&gt;
&lt;/ol&gt;
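&lt;p&gt;As a hedged sketch of how the &lt;code&gt;walkableSurface&lt;/code&gt; tag is typically used for a grounded check (the agent script in the repo may implement this differently, e.g. with a raycast):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;// Sketch: track whether the agent is touching the ground via collision tags.
bool isGrounded;

void OnCollisionStay(Collision collision)
{
    if (collision.gameObject.CompareTag("walkableSurface"))
    {
        isGrounded = true; // jumping is only allowed while grounded
    }
}

void OnCollisionExit(Collision collision)
{
    if (collision.gameObject.CompareTag("walkableSurface"))
    {
        isGrounded = false;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;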

&lt;h3&gt;
  
  
  &lt;strong&gt;Goals&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Goals are represented by a thin layer on top of the ground. &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Expand the &lt;strong&gt;BluePlayArea&lt;/strong&gt; and &lt;strong&gt;PurplePlayArea&lt;/strong&gt; parent objects.&lt;/li&gt;
&lt;li&gt;Add a &lt;strong&gt;Box Collider&lt;/strong&gt; to both the &lt;strong&gt;BlueGoal&lt;/strong&gt; and &lt;strong&gt;PurpleGoal&lt;/strong&gt; game objects.&lt;/li&gt;
&lt;li&gt;Check the 'Is Trigger' box for both goals.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgad8t0dnfcoufaw3gm7z.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgad8t0dnfcoufaw3gm7z.JPG" title="Trigger setting checked in Inspector panel" alt="collider-is-trigger.JPG"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When a collider is set as a trigger, it no longer registers physics-based collisions. So even though the goals are placed above the ground layer, the agents are really standing on the Ground collider we created earlier.&lt;/p&gt;

&lt;p&gt;Marking the goals as triggers lets us use the &lt;code&gt;OnTriggerEnter&lt;/code&gt; method later to detect when the ball has entered a goal's collider.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Net&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Select the &lt;strong&gt;Net&lt;/strong&gt; game object within &lt;strong&gt;VolleyballNet&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Add a &lt;strong&gt;Box Collider&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Click the '&lt;strong&gt;Edit Collider&lt;/strong&gt;' icon.&lt;/li&gt;
&lt;li&gt;Click and drag the bottom node of the green collider so that it covers the entire height of the net. Feel free to play around with the thickness. The intention here is to create a physical 'blocker' that will prevent the ball from going under or around the net.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxrspqoxb01b2p3aswdf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxrspqoxb01b2p3aswdf.gif" title="Adjusting collider of net" alt="net-collider.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Some shortcuts: Alt+click to rotate, middle-click to pan, middle mouse wheel to zoom in/out.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Boundaries&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;There are three invisible boundaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OuterBoundaries&lt;/strong&gt; (checks for ball going out of bounds)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BlueBoundary&lt;/strong&gt; (checks for ball going into the blue side of court)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PurpleBoundary&lt;/strong&gt; (checks for ball going into the purple side of court)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Colliders, tags, and triggers for these boundaries have already been set up for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scripting the environment
&lt;/h2&gt;

&lt;p&gt;In this section, we'll add scripts that define the environment behavior (e.g. what happens when the ball hits the floor or when the episode starts).&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;&lt;code&gt;VolleyballSettings.cs&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Our first script will simply hold some constants that we'll reuse throughout the project.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go back to the &lt;strong&gt;Volleyball&lt;/strong&gt; Scene and select the &lt;strong&gt;VolleyballSettings&lt;/strong&gt; game object.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Inspector&lt;/strong&gt;, you'll see a Script component attached. Double-click the &lt;strong&gt;VolleyballSettings&lt;/strong&gt; script to open it in your IDE of choice.&lt;/li&gt;
&lt;li&gt;You should see the following:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;agentRunSpeed&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1.5f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;agentJumpHeight&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;2.75f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;agentJumpVelocity&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;777&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;agentJumpVelocityMaxChange&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Slows down strafe &amp;amp; backward movement&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;speedReductionFactor&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0.75f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Material&lt;/span&gt; &lt;span class="n"&gt;blueGoalMaterial&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Material&lt;/span&gt; &lt;span class="n"&gt;purpleGoalMaterial&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Material&lt;/span&gt; &lt;span class="n"&gt;defaultMaterial&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// This is a downward force applied when falling to make jumps look less floaty&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;fallingForce&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;150&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; there is also a &lt;code&gt;ProjectSettingsOverride.cs&lt;/code&gt; script provided. This contains additional default settings related to time-stepping and resolving physics.&lt;/p&gt;
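&lt;p&gt;The exact contents of that script aren't shown here, but an override of this kind can be as simple as the following sketch (the 0.02s fixed timestep is the value this tutorial relies on; everything else about the real file may differ):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;// Sketch only: the real ProjectSettingsOverride.cs in the repo may differ.
using UnityEngine;

public class ProjectSettingsOverride : MonoBehaviour
{
    void Awake()
    {
        Time.fixedDeltaTime = 0.02f; // physics timestep used by FixedUpdate()
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;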

&lt;p&gt;Go back to the Unity editor and select the &lt;strong&gt;VolleyballSettings&lt;/strong&gt; game object. You should see that these variables are available in the &lt;strong&gt;Inspector&lt;/strong&gt; panel.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;&lt;code&gt;VolleyballController.cs&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This script is attached to the &lt;strong&gt;Volleyball&lt;/strong&gt; game object and lets us detect when the ball has hit our boundary or goal trigger.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open the &lt;code&gt;VolleyballController.cs&lt;/code&gt; script attached to the Volleyball.&lt;/li&gt;
&lt;li&gt;At the start of our &lt;code&gt;VolleyballController : MonoBehaviour&lt;/code&gt; class (above the &lt;code&gt;Start()&lt;/code&gt; method), declare the variables:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;HideInInspector&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;VolleyballEnvController&lt;/span&gt; &lt;span class="n"&gt;envController&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;GameObject&lt;/span&gt; &lt;span class="n"&gt;purpleGoal&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;GameObject&lt;/span&gt; &lt;span class="n"&gt;blueGoal&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;Collider&lt;/span&gt; &lt;span class="n"&gt;purpleGoalCollider&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;Collider&lt;/span&gt; &lt;span class="n"&gt;blueGoalCollider&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Save the script.&lt;/li&gt;
&lt;li&gt;In the Unity editor, click the &lt;strong&gt;Volleyball&lt;/strong&gt; game object.&lt;/li&gt;
&lt;li&gt;Drag the &lt;strong&gt;PurpleGoal&lt;/strong&gt; game object into the Purple Goal slot in the Inspector. &lt;/li&gt;
&lt;li&gt;Drag the &lt;strong&gt;BlueGoal&lt;/strong&gt; game object into the Blue Goal slot in the Inspector.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftccf32d1gc9q3loz2ri5.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftccf32d1gc9q3loz2ri5.JPG" title="Script component for Volleyball object" alt="volleyball-controller.JPG"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This will allow us to access their child objects later.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Start()&lt;/code&gt; &lt;/p&gt;

&lt;p&gt;This method is called once, before the first frame update. It will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch the PurpleGoal &amp;amp; BlueGoal Colliders themselves (the components that register physics-based collisions) using the &lt;code&gt;GetComponent&amp;lt;Collider&amp;gt;&lt;/code&gt; method:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;purpleGoalCollider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;purpleGoal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetComponent&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Collider&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
&lt;span class="n"&gt;blueGoalCollider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blueGoal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetComponent&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Collider&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Fetch the &lt;code&gt;VolleyballEnvController&lt;/code&gt; component from the parent &lt;strong&gt;VolleyballArea&lt;/strong&gt; game object and assign it to the variable &lt;code&gt;envController&lt;/code&gt; for easier reference later.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;envController&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;GetComponentInParent&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;VolleyballEnvController&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copy these statements into the &lt;code&gt;Start()&lt;/code&gt; method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;envController&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;GetComponentInParent&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;VolleyballEnvController&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
    &lt;span class="n"&gt;purpleGoalCollider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;purpleGoal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetComponent&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Collider&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
    &lt;span class="n"&gt;blueGoalCollider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blueGoal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetComponent&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Collider&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;OnTriggerEnter(Collider other)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This method is called whenever the ball enters a trigger collider.&lt;/p&gt;

&lt;p&gt;Some scenarios to detect are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ball hits the floor/goals&lt;/li&gt;
&lt;li&gt;Ball goes out of bounds&lt;/li&gt;
&lt;li&gt;Ball is hit over the net (to encourage volleying for training later)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This method will detect each scenario and pass this information to &lt;code&gt;envController&lt;/code&gt; (which we'll add in the next section). Copy the following block into this method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gameObject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CompareTag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"boundary"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ball went out of bounds&lt;/span&gt;
    &lt;span class="n"&gt;envController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ResolveEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitOutOfBounds&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gameObject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CompareTag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"blueBoundary"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ball hit into blue side&lt;/span&gt;
    &lt;span class="n"&gt;envController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ResolveEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitIntoBlueArea&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gameObject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CompareTag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"purpleBoundary"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ball hit into purple side&lt;/span&gt;
    &lt;span class="n"&gt;envController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ResolveEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitIntoPurpleArea&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gameObject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CompareTag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"purpleGoal"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ball hit purple goal (blue side court)&lt;/span&gt;
    &lt;span class="n"&gt;envController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ResolveEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitPurpleGoal&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gameObject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CompareTag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"blueGoal"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ball hit blue goal (purple side court)&lt;/span&gt;
    &lt;span class="n"&gt;envController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ResolveEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitBlueGoal&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;&lt;code&gt;VolleyballEnvController.cs&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This script holds all the main logic for the environment: the max steps it should run for, how the ball and agents should spawn, when the episode should end, how rewards should be assigned, etc.&lt;/p&gt;

&lt;p&gt;In the sample skeleton script, some variables and helper methods are already provided:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Start()&lt;/code&gt; — fetch the components and objects we'll need for later&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;UpdateLastHitter()&lt;/code&gt; — keeps track of which agent was last in control of the ball&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GoalScoredSwapGroundMaterial()&lt;/code&gt; — changes the color of the ground (helps us visualise which agent scored)&lt;/li&gt;
&lt;/ul&gt;
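&lt;p&gt;To give a sense of what these helpers do, here's a hedged sketch (signatures and details are assumptions; see the repo for the actual implementations):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;// Sketch: remember which team last touched the ball.
public void UpdateLastHitter(Team team)
{
    lastHitter = team;
}

// Sketch: flash the ground with the scoring team's material, then restore it.
// Assumes a groundRenderer reference and the materials from VolleyballSettings.
IEnumerator GoalScoredSwapGroundMaterial(Material mat, float time)
{
    groundRenderer.material = mat;
    yield return new WaitForSeconds(time);
    groundRenderer.material = defaultMaterial;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;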

&lt;p&gt;&lt;code&gt;FixedUpdate()&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This is called by the Unity engine on every physics update, i.e. every &lt;code&gt;Time.fixedDeltaTime&lt;/code&gt; = 0.02 seconds (set in &lt;code&gt;ProjectSettingsOverride.cs&lt;/code&gt;), independent of the rendered frame rate.&lt;/p&gt;

&lt;p&gt;This will control the max number of updates (i.e. 'steps') the environment takes before we interrupt the episode (e.g. if the ball gets stuck somewhere).&lt;/p&gt;

&lt;p&gt;Add the following to &lt;code&gt;void FixedUpdate()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;/// &amp;lt;summary&amp;gt;&lt;/span&gt;
&lt;span class="c1"&gt;/// Called every step. Control max env steps.&lt;/span&gt;
&lt;span class="c1"&gt;/// &amp;lt;/summary&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;FixedUpdate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resetTimer&lt;/span&gt; &lt;span class="p"&gt;+=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resetTimer&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;MaxEnvironmentSteps&lt;/span&gt; &lt;span class="p"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;MaxEnvironmentSteps&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EpisodeInterrupted&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EpisodeInterrupted&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nf"&gt;ResetScene&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ResetScene()&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This controls the starting spawn behavior.&lt;/p&gt;

&lt;p&gt;Our goal is to learn a model that allows our agent to return the ball from its side of the court no matter where the ball is sent. To help with training, we'll randomise the starting conditions of the agents and ball within some reasonable boundaries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;/// &amp;lt;summary&amp;gt;&lt;/span&gt;
&lt;span class="c1"&gt;/// Reset agent and ball spawn conditions.&lt;/span&gt;
&lt;span class="c1"&gt;/// &amp;lt;/summary&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;ResetScene&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resetTimer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;lastHitter&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Team&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Default&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// reset last hitter&lt;/span&gt;

    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;AgentsList&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// randomise starting positions and rotations&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;randomPosX&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Range&lt;/span&gt;&lt;span class="p"&gt;(-&lt;/span&gt;&lt;span class="m"&gt;2f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;2f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;randomPosZ&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Range&lt;/span&gt;&lt;span class="p"&gt;(-&lt;/span&gt;&lt;span class="m"&gt;2f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;2f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;randomPosY&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0.5f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;3.75f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// depends on jump height&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;randomRot&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Range&lt;/span&gt;&lt;span class="p"&gt;(-&lt;/span&gt;&lt;span class="m"&gt;45f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;45f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;localPosition&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;randomPosX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;randomPosY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;randomPosZ&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eulerAngles&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;randomRot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetComponent&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Rigidbody&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;().&lt;/span&gt;&lt;span class="n"&gt;velocity&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// reset ball to starting conditions&lt;/span&gt;
    &lt;span class="nf"&gt;ResetBall&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;/// &amp;lt;summary&amp;gt;&lt;/span&gt;
&lt;span class="c1"&gt;/// Reset ball spawn conditions&lt;/span&gt;
&lt;span class="c1"&gt;/// &amp;lt;/summary&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;ResetBall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;randomPosX&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Range&lt;/span&gt;&lt;span class="p"&gt;(-&lt;/span&gt;&lt;span class="m"&gt;2f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;2f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;randomPosZ&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;6f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;10f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;randomPosY&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;6f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;8f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// alternate ball spawn side&lt;/span&gt;
    &lt;span class="c1"&gt;// -1 = spawn blue side, 1 = spawn purple side&lt;/span&gt;
    &lt;span class="n"&gt;ballSpawnSide&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;ballSpawnSide&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ballSpawnSide&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;ball&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;localPosition&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;randomPosX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;randomPosY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;randomPosZ&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ballSpawnSide&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;ball&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;localPosition&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;randomPosX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;randomPosY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;randomPosZ&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;ballRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;angularVelocity&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;ballRb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;velocity&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ResolveEvent()&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This method will resolve the scenarios we defined earlier in &lt;code&gt;VolleyballController.cs&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We can use this method to assign rewards in different ways to encourage different types of behavior. In general, it's good practice to keep rewards within the range [-1, 1].&lt;/p&gt;

&lt;p&gt;To keep it simple, our goal for now is to train agents that can bounce the ball back and forth and keep the ball in play. We'll assign a reward of +1 each time an agent hits the ball over the net using the &lt;code&gt;AddReward(1f)&lt;/code&gt; method in the corresponding scenario:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitIntoBlueArea&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastHitter&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Team&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Purple&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddReward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitIntoPurpleArea&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastHitter&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Team&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Blue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddReward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We won't assign any rewards for now if a goal is scored or the ball is hit out of bounds. If either of these scenarios happens, we'll just end the episode. Add the following code block to the sections indicated by the &lt;code&gt;// end episode&lt;/code&gt; comment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nf"&gt;ResetScene&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's what &lt;code&gt;ResolveEvent&lt;/code&gt; should look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;/// &amp;lt;summary&amp;gt;&lt;/span&gt;
&lt;span class="c1"&gt;/// Resolves scenarios when ball enters a trigger and assigns rewards&lt;/span&gt;
&lt;span class="c1"&gt;/// &amp;lt;/summary&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;ResolveEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt; &lt;span class="n"&gt;triggerEvent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;triggerEvent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitOutOfBounds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastHitter&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Team&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Blue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="c1"&gt;// apply penalty to blue agent&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastHitter&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Team&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Purple&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="c1"&gt;// apply penalty to purple agent&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="c1"&gt;// end episode&lt;/span&gt;
            &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="nf"&gt;ResetScene&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitBlueGoal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;// blue wins&lt;/span&gt;

            &lt;span class="c1"&gt;// turn floor blue&lt;/span&gt;
            &lt;span class="nf"&gt;StartCoroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;GoalScoredSwapGroundMaterial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;blueGoalMaterial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RenderersList&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="m"&gt;5f&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

            &lt;span class="c1"&gt;// end episode&lt;/span&gt;
            &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="nf"&gt;ResetScene&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitPurpleGoal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;// purple wins&lt;/span&gt;

            &lt;span class="c1"&gt;// turn floor purple&lt;/span&gt;
            &lt;span class="nf"&gt;StartCoroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;GoalScoredSwapGroundMaterial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;volleyballSettings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;purpleGoalMaterial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RenderersList&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="m"&gt;5f&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

            &lt;span class="c1"&gt;// end episode&lt;/span&gt;
            &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EndEpisode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="nf"&gt;ResetScene&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

                &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitIntoBlueArea&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastHitter&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Team&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Purple&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;purpleAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddReward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

                &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HitIntoPurpleArea&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastHitter&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Team&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Blue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;blueAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddReward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now when you click Play ▶️ you should see the environment working correctly: the ball is affected by gravity, the agents can stand on the ground, and the episode resets when the ball hits the floor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrmuc0zfwavbxa7dzj6t.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrmuc0zfwavbxa7dzj6t.gif" title="Finished environment with physics" alt="finished-environment.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;p&gt;You should now have a volleyball environment ready for our agents to train in. It will assign our agents rewards to encourage a certain type of behavior (volleying the ball back and forth).&lt;/p&gt;

&lt;p&gt;In the next section, we'll design our agents, giving them actions to choose from and a way to observe their environment.&lt;/p&gt;

&lt;p&gt;If you have any feedback or questions, please let me know!&lt;/p&gt;

</description>
      <category>unity3d</category>
      <category>tutorial</category>
      <category>machinelearning</category>
      <category>reinforcementlearning</category>
    </item>
    <item>
      <title>A hands-on introduction to deep reinforcement learning using Unity ML-Agents</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Thu, 26 Aug 2021 09:21:48 +0000</pubDate>
      <link>https://dev.to/joooyz/a-hands-on-introduction-to-deep-reinforcement-learning-using-unity-ml-agents-4f8i</link>
      <guid>https://dev.to/joooyz/a-hands-on-introduction-to-deep-reinforcement-learning-using-unity-ml-agents-4f8i</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;⚠ Note: this series is still a work in progress.&lt;/p&gt;

&lt;p&gt;This series is up to date with the latest ML-Agents Release 18.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Purpose
&lt;/h2&gt;

&lt;p&gt;There are plenty of great reinforcement learning (RL) courses out there. Just to name a few:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://deepmind.com/learning-resources/-introduction-reinforcement-learning-david-silver"&gt;Introduction to Reinforcement Learning by David Silver&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://spinningup.openai.com/en/latest/"&gt;Spinning Up by OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.coursera.org/specializations/reinforcement-learning"&gt;Reinforcement Learning Specialization by University of Alberta&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But if you're anything like me, you might prefer a 'learning by doing' approach. With hands-on experience upfront, it may be easier for you to grasp the theory and math behind the algorithms later.&lt;/p&gt;

&lt;p&gt;In this series, I'll walk you through how to use &lt;a href="https://unity.com/products/machine-learning-agents"&gt;Unity ML-Agents&lt;/a&gt; to build a volleyball environment and train agents to play in it using deep RL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--52lWUlSw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lr2zmstgz4f1ppihy8g5.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--52lWUlSw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lr2zmstgz4f1ppihy8g5.gif" alt="Ultimate Volleyball" width="600" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why ML-Agents?
&lt;/h2&gt;

&lt;p&gt;ML-Agents is an add-on for Unity (a game development platform). &lt;/p&gt;

&lt;p&gt;It lets us design a complex physics-rich environment without needing to build any of the physics simulation logic ourselves. It also lets us experiment with state-of-the-art RL algorithms without having to set up any boilerplate code or install additional libraries. The nice graphics and interface are a plus.&lt;/p&gt;

&lt;h2&gt;
  
  
  A (very brief) overview of reinforcement learning
&lt;/h2&gt;

&lt;p&gt;In a nutshell, think about how you might teach a dog a new trick, like telling it to sit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If it performs the trick correctly (it sits), you’ll &lt;strong&gt;reward&lt;/strong&gt; it with a treat (&lt;em&gt;positive feedback&lt;/em&gt;) ✔️&lt;/li&gt;
&lt;li&gt;If it doesn’t sit correctly, it doesn’t get a treat (&lt;em&gt;negative feedback&lt;/em&gt;) ❌&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By continuing to do things that lead to positive outcomes, the dog will learn to sit when it hears the command in order to get its treat. Reinforcement learning is a subdomain of machine learning that involves training an ‘&lt;strong&gt;agent&lt;/strong&gt;’ (&lt;em&gt;the dog&lt;/em&gt;) to learn the correct sequences of &lt;strong&gt;actions&lt;/strong&gt; to take (&lt;em&gt;sitting&lt;/em&gt;) in its &lt;strong&gt;environment&lt;/strong&gt; (&lt;em&gt;in response to the command ‘sit’&lt;/em&gt;) in order to maximize its &lt;strong&gt;reward&lt;/strong&gt; (&lt;em&gt;getting a treat&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;This can be illustrated more formally as:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--okObq8DS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x1zs6o1jtkn5ccfxgnwr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--okObq8DS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x1zs6o1jtkn5ccfxgnwr.png" alt="Sutton and Barto" width="528" height="214"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Source: &lt;a href="http://incompleteideas.net/book/bookdraft2017nov5.pdf"&gt;Sutton &amp;amp; Barto&lt;/a&gt;&lt;/p&gt;
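&lt;p&gt;The interaction loop in the diagram above can be sketched in a few lines of Python. This is a minimal, hypothetical sketch: &lt;code&gt;env&lt;/code&gt; and &lt;code&gt;agent&lt;/code&gt; are stand-ins for any environment and policy, not a real library API.&lt;/p&gt;

```python
# Minimal sketch of the agent-environment loop from Sutton & Barto.
# `env` and `agent` are hypothetical stand-ins, not a real library API.

def run_episode(env, agent, max_steps=100):
    state = env.reset()                         # initial observation S_0
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)               # agent picks A_t from S_t
        state, reward, done = env.step(action)  # env returns S_{t+1}, R_{t+1}
        total_reward += reward                  # accumulate the return
        if done:                                # episode over (e.g. ball hit floor)
            break
    return total_reward
```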

&lt;p&gt;For more on the theory, check out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html"&gt;A (Long) Peek into Reinforcement Learning by Lilian Weng&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf"&gt;Reinforcement Learning: An Introduction by Sutton &amp;amp; Barto&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>unity3d</category>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>20+ Active machine learning and data science communities</title>
      <dc:creator>Joy</dc:creator>
      <pubDate>Wed, 18 Aug 2021 07:22:00 +0000</pubDate>
      <link>https://dev.to/joooyz/20-active-machine-learning-and-data-science-communities-21gk</link>
      <guid>https://dev.to/joooyz/20-active-machine-learning-and-data-science-communities-21gk</guid>
      <description>&lt;p&gt;Whether you're a beginner or veteran in machine learning and data science, you might be interested in a place to ask questions, share projects, or join discussions on the latest developments.&lt;/p&gt;

&lt;p&gt;There are many great communities out there for this, but it can be difficult to choose which one (and some may no longer be active or well-maintained). &lt;/p&gt;

&lt;p&gt;To help you, I've compiled an up-to-date list of 20+ active machine learning and data science communities grouped by platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. &lt;strong&gt;Reddit&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Reddit hosts many active forums dedicated to all areas of AI, machine learning, and data science.&lt;/p&gt;

&lt;p&gt;Here's a list:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/MachineLearning/"&gt;r/machinelearning&lt;/a&gt; (2M+ members)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/datascience/"&gt;r/datascience&lt;/a&gt;  (500K+ members)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/learnmachinelearning/"&gt;r/learnmachinelearning&lt;/a&gt; (200K+ members)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/artificial/"&gt;r/artificial&lt;/a&gt; (145K+ members)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/deeplearning/"&gt;r/deeplearning&lt;/a&gt; (60K+ members)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/ArtificialInteligence/"&gt;r/artificialinteligence&lt;/a&gt; (50K+ members)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.reddit.com/r/reinforcementlearning/"&gt;r/reinforcementlearning&lt;/a&gt; (20K+ members)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're just getting started, I recommend checking out &lt;a href="https://www.reddit.com/r/learnmachinelearning/"&gt;r/learnmachinelearning&lt;/a&gt;. It's a welcoming community for sharing beginner questions, projects, and resources (they also have a &lt;a href="https://discord.gg/G3rvFKF"&gt;Discord server&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;With over 2 million members, &lt;a href="https://www.reddit.com/r/MachineLearning/"&gt;r/machinelearning&lt;/a&gt; will likely be your go-to. It's more heavily moderated than the other subreddits, but you'll be sure to find all the latest important news, research papers, and discussions here (you might even bump into industry veterans like &lt;a href="https://twitter.com/hardmaru"&gt;@hardmaru&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Discord
&lt;/h2&gt;

&lt;p&gt;Discord is an instant messaging platform with private servers that anyone can join using an invite link. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.gg/R8Bcbf4"&gt;r/learnmachinelearning&lt;/a&gt; (7K+ members): a complimentary server for the &lt;a href="https://www.reddit.com/r/learnmachinelearning/"&gt;subreddit&lt;/a&gt; community, with dedicated channels for sharing projects, asking questions, and studying popular MOOC courses together.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.com/invite/learnaitogether"&gt;Learn AI together&lt;/a&gt; (16K+ members): the largest Discord community dedicated to AI with a &lt;strong&gt;heap&lt;/strong&gt; of great resources to check out. You'll find discussion topics for anything from memes to AGI here.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.com/invite/pQFXHK4"&gt;Fundamentals of ML&lt;/a&gt; (2K+ members): dedicated to those particularly interested in the theory and math behind ML, but also for general ML discussion, projects, and questions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discordapp.com/invite/UYNaemm"&gt;Data Science&lt;/a&gt; (12K+ members): a community of data science professionals and enthusiasts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.gg/7XWy7DW"&gt;The Data Share&lt;/a&gt; (6K+ members): a community-driven server moderated by part of team from &lt;a href="https://towardsdatascience.com/"&gt;Towards Data Science&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Facebook
&lt;/h2&gt;

&lt;p&gt;Facebook groups can be another way to meet others in the field. Here are some of the largest and most active groups:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.facebook.com/groups/machinelearningforum/"&gt;Data Mining / Machine Learning / Artificial Intelligence&lt;/a&gt; (130K+ members): an open group for discussing and sharing information across the general areas of data and AI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.facebook.com/groups/1955664064497065/"&gt;Artificial Intelligence and Machine Learning&lt;/a&gt; (170K+ members): a private beginner-friendly group for people to share resources and learnings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.facebook.com/groups/199938307171587/"&gt;Global Artificial Intelligence, Machine Learning and Deep Learning&lt;/a&gt; (20K+ members): a private group for data scientists, investors, researchers, and corporates to discuss the latest in AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Other platforms
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://kaggle.com/"&gt;Kaggle&lt;/a&gt; is a well-known data science competition platform. It boasts a community of over 5 million users, where you can compete and share data sets and projects (in the form of notebooks).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/groups/4298680/"&gt;The Machine Learning and Data Science LinkedIn group&lt;/a&gt; is a community of professionals interested in the space. This includes engineers, data scientists, recruiters, business leaders, and more. It might be particularly worth checking out if you are looking to network or find a new role.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;There are plenty of great communities out there to check out whether you're a beginner or an industry veteran. I'll be keeping this list up to date, so if there's something you think is missing, please let me know!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
