<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: n3wjack 👨‍💻</title>
    <description>The latest articles on DEV Community by n3wjack 👨‍💻 (@n3wjack).</description>
    <link>https://dev.to/n3wjack</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F467816%2F9fe05cbc-6181-4aa3-97ad-804d6f8582c6.png</url>
      <title>DEV Community: n3wjack 👨‍💻</title>
      <link>https://dev.to/n3wjack</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/n3wjack"/>
    <language>en</language>
    <item>
      <title>invoke-webrequest pro tips</title>
      <dc:creator>n3wjack 👨‍💻</dc:creator>
      <pubDate>Sat, 12 Sep 2020 12:49:56 +0000</pubDate>
      <link>https://dev.to/n3wjack/invoke-webrequest-pro-tips-c31</link>
      <guid>https://dev.to/n3wjack/invoke-webrequest-pro-tips-c31</guid>
      <description>&lt;p&gt;The &lt;code&gt;Invoke-WebRequest&lt;/code&gt; PowerShell cmdlet is great if you want to fetch and work with some web page's output without installing any extra tools like wget.exe for example. If you're planning to do some text parsing on a web page anyway, PowerShell is an excellent option, so why not go full PS mode?&lt;br&gt;
Unfortunately the command has some drawbacks, causing it to be a lot slower than it should be if you just want plain text and its response parsing can even cause it to lock up and not return a result at all.&lt;/p&gt;

&lt;p&gt;So here're some pro-tips for parsing the output using PowerShell fast and effectively:&lt;/p&gt;

&lt;h1&gt;
  1. Use basic parsing
&lt;/h1&gt;

&lt;p&gt;The cmdlet does some DOM parsing by default using Internet Explorer. This takes time and sometimes fails too, so if you want to skip that step and make things faster, simply add the &lt;code&gt;-UseBasicParsing&lt;/code&gt; switch:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$r = Invoke-WebRequest https://n3wjack.net -UseBasicParsing
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h1&gt;
  2. Split HTML in lines
&lt;/h1&gt;

&lt;p&gt;Parsing text in PS is easy, but it's even easier if the result is formatted like a text file with multiple lines instead of the full HTML in a single string. If you get the &lt;code&gt;Content&lt;/code&gt; property from your webpage, you can split it up into separate lines by splitting on the newline character:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(Invoke-WebRequest https://n3wjack.net -UseBasicParsing).Content -split "`n"
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Or, if you also want the HTTP header info to be included in the result, use &lt;code&gt;RawContent&lt;/code&gt; instead:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(Invoke-WebRequest https://n3wjack.net -UseBasicParsing).RawContent -split "`n"
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This can be really handy if you want to automatically check if the right response headers are set.&lt;br&gt;
But you can also use the &lt;code&gt;Headers&lt;/code&gt; collection on the result object, which is even easier.&lt;/p&gt;
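&lt;p&gt;As a quick sketch of that second approach (the header name here is just an example; which headers you get depends on the server):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$r = Invoke-WebRequest https://n3wjack.net -UseBasicParsing
# Look up a single response header by name, e.g. the content type.
$r.Headers["Content-Type"]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;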

&lt;h1&gt;
  3. Disable the download progress meter shizzle for large files (or always, to speed things up)
&lt;/h1&gt;

&lt;p&gt;That download progress bar is a nice visual when you're using &lt;code&gt;Invoke-WebRequest&lt;/code&gt; to download some large binaries and want to see their progress, but it significantly slows things down too. Set the &lt;code&gt;$ProgressPreference&lt;/code&gt; variable and you'll see your scripts download those files a lot faster.&lt;br&gt;
The larger the files (like big-ass log files, images, videos etc.), the more this matters, I've noticed.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$progressPreference = 'silentlyContinue'
invoke-webrequest $logurl -outfile .\logfile.log -UseBasicParsing
$progressPreference = 'Continue'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Be sure to reset this setting afterwards, because this affects any cmdlet using that progress-bar feature.&lt;/p&gt;

&lt;h1&gt;
  4. No redirects please
&lt;/h1&gt;

&lt;p&gt;&lt;code&gt;Invoke-WebRequest&lt;/code&gt; automatically follows an HTTP redirect (301/302), so in most cases you end up with the page you were looking for.&lt;br&gt;
If you want to test whether a URL is properly redirected (or not redirected), this just makes things harder. In that case you can turn off redirects by using the &lt;code&gt;MaximumRedirection&lt;/code&gt; parameter and setting it to 0.&lt;/p&gt;

&lt;p&gt;When you request a URL that returns a 301 this way, the command will throw an exception saying the maximum redirection count has been exceeded. That makes this case easy to test.&lt;br&gt;
The result object will also contain the redirect &lt;code&gt;StatusCode&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$r = Invoke-WebRequest http://n3wjack.net -MaximumRedirection 0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
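&lt;p&gt;One way to turn that exception into something testable is a plain try/catch. This is a sketch; reading the status code off the exception's response works in my experience, but the exact exception type differs between PowerShell versions:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;try {
    $r = Invoke-WebRequest http://n3wjack.net -MaximumRedirection 0 -UseBasicParsing
    $statusCode = $r.StatusCode
}
catch {
    # The redirect response is still available on the exception.
    $statusCode = [int]$_.Exception.Response.StatusCode
}
# $statusCode now holds e.g. 301 if the URL redirects.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;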
&lt;h1&gt;
  5. Use the PowerShell result object
&lt;/h1&gt;

&lt;p&gt;It's overkill in some cases, but in others this is pure win. The result object contains some really handy bits of the webpage, making a lot of tricky text and regex parsing obsolete.&lt;br&gt;
It's a piece of cake to list all images linked from a page using the &lt;code&gt;Images&lt;/code&gt; collection. Want to parse all outgoing links on a page? Use the &lt;code&gt;Links&lt;/code&gt; collection. There's also a &lt;code&gt;StatusCode&lt;/code&gt;, a &lt;code&gt;Headers&lt;/code&gt; collection, and &lt;code&gt;Forms&lt;/code&gt; and &lt;code&gt;InputFields&lt;/code&gt; collections for form parsing, and more.&lt;br&gt;
Check out what's available using &lt;code&gt;Get-Member&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Invoke-WebRequest https://n3wjack.net | get-members
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
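&lt;p&gt;For instance, grabbing the URL of every outgoing link boils down to a one-liner over the &lt;code&gt;Links&lt;/code&gt; collection (shown here as a sketch; the exact properties on each link object depend on the page):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# List the href of every link found on the page.
(Invoke-WebRequest https://n3wjack.net).Links | ForEach-Object { $_.href }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;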
&lt;h1&gt;
  6. If all else fails, use wget.exe
&lt;/h1&gt;

&lt;p&gt;Yep. Sometimes &lt;code&gt;Invoke-WebRequest&lt;/code&gt; simply doesn't cut it. I've seen it hang on some complex pages, trying to parse them and failing miserably.&lt;br&gt;
In that case you can fetch the page using the GNU Wget tool, download it as a text file and then parse that.&lt;br&gt;
You have to call wget with the &lt;code&gt;.exe&lt;/code&gt; extension, otherwise you'll be triggering the PowerShell alias for &lt;code&gt;Invoke-WebRequest&lt;/code&gt; again.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Install WGet with Chocolatey
choco install wget

# Get the page and save it as a text file
wget.exe https://n3wjack.net -O nj.html
# Read the file and parse it.
Get-Content nj.html | ForEach-Object { &lt;# parsing code goes here #&gt; }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Those are all the tips you need to make your web request parsing in PowerShell a breeze. If you know a good tip that isn't listed here, feel free to drop a comment below!&lt;/p&gt;

</description>
      <category>powershell</category>
      <category>scripting</category>
      <category>web</category>
    </item>
  </channel>
</rss>
