<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Olavo</title>
    <description>The latest articles on DEV Community by Olavo (@olavomello).</description>
    <link>https://dev.to/olavomello</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F370681%2F461d6120-53a6-41fe-8573-90ca49e837e2.jpg</url>
      <title>DEV Community: Olavo</title>
      <link>https://dev.to/olavomello</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/olavomello"/>
    <language>en</language>
    <item>
      <title>Old PHP 5, cURL, and TLS 1.2 = SSL connect error</title>
      <dc:creator>Olavo</dc:creator>
      <pubDate>Thu, 18 May 2023 20:01:41 +0000</pubDate>
      <link>https://dev.to/olavomello/old-php-5-curl-and-tls-12-ssl-connect-error-k04</link>
      <guid>https://dev.to/olavomello/old-php-5-curl-and-tls-12-ssl-connect-error-k04</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqr4wwtj0ym48ovk8elfv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqr4wwtj0ym48ovk8elfv.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You have good old PHP projects that never break, work well, and give you nothing to worry about. But you use cURL for API connections, and suddenly you receive this error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SSL connect error
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Immediately you think… it'll be easy to fix. Just force cURL to ignore SSL checks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CURLOPT_SSL_VERIFYPEER    =&amp;gt; false,
CURLOPT_SSL_VERIFYHOST    =&amp;gt; false,
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, just relax and test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SSL connect error
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At that moment you start to worry about how to fix it. Then you get an idea: force the TLS version and cross your fingers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ch = curl_init('https://google.com/');

// Force requests to use TLS 1.2 (6 is the value of CURL_SSLVERSION_TLSv1_2)
curl_setopt($ch, CURLOPT_SSLVERSION, 6);

$result = curl_exec($ch);
curl_close($ch);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SSL connect error
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ok, maybe now is a good time to start crying.&lt;/p&gt;

&lt;h2&gt;
  
  
  The solution
&lt;/h2&gt;

&lt;p&gt;Jokes aside, what happens is that cURL may be unable to negotiate the TLS version the server requires, even when it is told to skip certificate checks or to use a specific TLS version.&lt;br&gt;
On the other hand, updating the infrastructure of some systems may be impossible, especially if they are isolated and were built long ago on old versions of Apache and PHP.&lt;br&gt;
Luckily, we have stream contexts 👏👏👏&lt;br&gt;
The stream_context_create function creates and returns a stream context with any options supplied.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;stream_context_create(?array $options = null, ?array $params = null): resource
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As with cURL, a stream context lets you set headers, the request method, and other parameters, like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$opts = array(
  'http'=&amp;gt;array(
    'method'=&amp;gt;"GET",
    'header'=&amp;gt;"Accept-language: en\r\n" .
              "Cookie: foo=bar\r\n"
  )
);

$context = stream_context_create($opts);

/* Sends an http request to www.example.com 
   with additional headers shown above */
$fp = fopen('http://www.example.com', 'r', false, $context);
fpassthru($fp);
fclose($fp); 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One reason why stream_context_create might work when cURL does not is that a stream opened with such a context may be more forgiving of server or network issues that cause cURL to fail. For example, if a server certificate is invalid or expired, cURL may fail to connect, whereas the stream may still establish a connection if the verify_peer option is set to false.&lt;br&gt;
Wow, that was close!&lt;/p&gt;

&lt;p&gt;Finally, we have a solution and I wanna share it with you. Maybe it can help you to sleep better :)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Requiring TLS 1.2:
$ctx = stream_context_create([
    'ssl' =&amp;gt; [
        'crypto_method' =&amp;gt; STREAM_CRYPTO_METHOD_TLSv1_2_CLIENT
    ]
]);
$html = file_get_contents('https://google.com/', false, $ctx);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hope it helps you!&lt;br&gt;
Cheers, and let's stay connected on &lt;a href="https://www.linkedin.com/in/olavo-mello/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Nodejs Asynchronous Multithreading Web Scraping</title>
      <dc:creator>Olavo</dc:creator>
      <pubDate>Thu, 18 May 2023 19:57:01 +0000</pubDate>
      <link>https://dev.to/olavomello/nodejs-asynchronous-multithreading-web-scraping-244o</link>
      <guid>https://dev.to/olavomello/nodejs-asynchronous-multithreading-web-scraping-244o</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfyoiqapcfghbp449jsw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfyoiqapcfghbp449jsw.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Nodejs Asynchronous Multithreading Web Scraping&lt;br&gt;
Reading online data multiple times faster ;)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomb3eqncme9s3b4t6zci.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomb3eqncme9s3b4t6zci.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Web Scraping?
&lt;/h2&gt;

&lt;p&gt;Web scraping is the process of extracting data from websites. In today’s world, web scraping has become an essential technique for businesses and organizations to gather valuable data for their research and analysis. Node.js is a powerful platform that enables developers to perform web scraping in an efficient and scalable manner.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Multithreaded Web Scraping?
&lt;/h2&gt;

&lt;p&gt;Multithreaded web scraping is a technique that involves dividing the web scraping task into multiple threads. Each thread performs a specific part of the scraping process, such as downloading web pages, parsing HTML, or saving data to a database. By using multiple threads, the scraping process can be performed in parallel, which can significantly improve the speed and efficiency of the scraping task.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why use Multithreaded Web Scraping?
&lt;/h2&gt;

&lt;p&gt;There are several reasons why multithreaded web scraping is beneficial. Firstly, it can significantly reduce the time required to scrape large amounts of data from multiple websites. Secondly, it can improve the performance of the scraping process by utilizing the resources of the machine more efficiently. Keep in mind, however, that all workers on one machine still share the same IP address, so the combined request rate should stay moderate to avoid getting blocked by a website.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to implement Multithreaded Web Scraping in Node.js?
&lt;/h2&gt;

&lt;p&gt;To implement multithreaded web scraping in Node.js, we can use the built-in “cluster” module. It forks worker processes that run in parallel and communicate with the primary process by passing messages over an IPC channel; they do not share memory. Strictly speaking these are processes rather than threads, but the effect is the same: by creating multiple workers, we can distribute the scraping task across all available cores of the CPU.&lt;/p&gt;
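
&lt;p&gt;The forking pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the &lt;code&gt;partition&lt;/code&gt;, &lt;code&gt;scrapePage&lt;/code&gt; and &lt;code&gt;run&lt;/code&gt; names are hypothetical, the scraping step is a stand-in, and &lt;code&gt;cluster.isPrimary&lt;/code&gt; assumes Node.js 16 or newer.&lt;/p&gt;

```javascript
// Minimal sketch under the assumptions stated above, not the repository's code.
const cluster = require('node:cluster');
const os = require('node:os');

// Split the page numbers round-robin so every worker gets a similar share.
function partition(pages, workerCount) {
  const buckets = Array.from({ length: workerCount }, function () {
    return [];
  });
  pages.forEach(function (page, index) {
    buckets[index % workerCount].push(page);
  });
  return buckets;
}

// Hypothetical stand-in for the real scraping step.
function scrapePage(page) {
  return { page: page, scrapedBy: process.pid };
}

// Fork one worker per CPU core and hand each one its share of pages.
function run(pages) {
  if (cluster.isPrimary) {
    partition(pages, os.cpus().length).forEach(function (bucket) {
      if (bucket.length === 0) {
        return; // no pages left for this core
      }
      const worker = cluster.fork();
      worker.on('message', function (results) {
        console.log('worker ' + worker.process.pid + ' finished', results);
      });
      worker.send(bucket); // the work is passed as an IPC message
    });
  } else {
    process.on('message', function (bucket) {
      // Report back, then exit once the message has been sent.
      process.send(bucket.map(scrapePage), function () {
        process.exit(0);
      });
    });
  }
}
```

&lt;p&gt;Calling &lt;code&gt;run([1, 2, 3, 4, 5, 6, 7, 8])&lt;/code&gt; at the top level of the script would start it. Each forked worker executes the same file again, so the call has to be reachable from both the primary and the worker branch.&lt;/p&gt;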

&lt;p&gt;Running the code&lt;br&gt;
In this code example, we use &lt;code&gt;tabnews.com.br&lt;/code&gt; as a target. The objective is to generate a JSON file for each page, listing each article’s title and URL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Our code will:
&lt;/h2&gt;

&lt;p&gt;1 - Start the master process and fork one worker process per available CPU;&lt;/p&gt;

&lt;p&gt;2 - Run the web scraping engine in each worker;&lt;/p&gt;

&lt;p&gt;3 - Read the page, generate the screenshot, and break the content down into the article list;&lt;/p&gt;

&lt;p&gt;4 - Save a .json file with each article’s title and URL;&lt;/p&gt;

&lt;p&gt;5 - Finish the process and start another;&lt;/p&gt;
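
&lt;p&gt;As an illustration of step 4, saving one page of results could look like the sketch below. The &lt;code&gt;saveArticles&lt;/code&gt; name, the &lt;code&gt;page-N.json&lt;/code&gt; file name pattern and the shape of the article objects are assumptions made for this example; the repository may organize its output differently.&lt;/p&gt;

```javascript
// Minimal sketch: names, file pattern and object shape are assumptions.
const fs = require('node:fs');
const path = require('node:path');

// Write one JSON file per scraped page and return the path of that file.
function saveArticles(outputDir, page, articles) {
  // Create the output folder if it does not exist yet.
  fs.mkdirSync(outputDir, { recursive: true });
  const filePath = path.join(outputDir, 'page-' + page + '.json');
  // Keep only the two fields the listing needs.
  const entries = articles.map(function (article) {
    return { title: article.title, url: article.url };
  });
  fs.writeFileSync(filePath, JSON.stringify(entries, null, 2));
  return filePath;
}
```

&lt;p&gt;Because every worker writes to its own file, the processes never have to coordinate access to a shared output.&lt;/p&gt;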

&lt;h2&gt;
  
  
  The Code!
&lt;/h2&gt;

&lt;p&gt;Get all code on &lt;a href="https://github.com/olavomello/node-multithreading-webscraping" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let’s stay connected&lt;/p&gt;

&lt;p&gt;I hope this is useful and that you enjoy it!&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://www.linkedin.com/in/olavo-mello/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; and follow me to see what comes next ;)&lt;/p&gt;

&lt;p&gt;Cya ! :)&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>scraping</category>
      <category>programming</category>
    </item>
    <item>
      <title>Creating a Google Chrome Extension using HTML + CSS + Javascript</title>
      <dc:creator>Olavo</dc:creator>
      <pubDate>Thu, 18 May 2023 19:50:23 +0000</pubDate>
      <link>https://dev.to/olavomello/creating-a-google-chrome-extension-using-html-css-javascript-44d2</link>
      <guid>https://dev.to/olavomello/creating-a-google-chrome-extension-using-html-css-javascript-44d2</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qQpVlSAI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jzfen28o134g3dq3zbqn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qQpVlSAI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jzfen28o134g3dq3zbqn.png" alt="Image description" width="538" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An extension can be useful for several projects and specific situations, so I decided to share this one to show how useful and simple to implement a Google Chrome extension can be. In addition, if you are a regular TabNews reader, you will have an easy way to follow the articles.&lt;/p&gt;

&lt;p&gt;You can check this out on &lt;a href="https://github.com/olavomello/google-chrome-extension"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Feel free to send a PR and improve the implementation of this small project. Below I describe the project in a bit more detail, along with some challenges in implementing it.&lt;/p&gt;

&lt;h2&gt;
  
  
  TabNews Reader
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4L5uCZPY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/aubvg5kq0bz0d566xenx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4L5uCZPY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/aubvg5kq0bz0d566xenx.png" alt="Image description" width="720" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A TabNews RSS reader with a recent-article listing and a Dark Mode option that follows the user’s default setting.&lt;/p&gt;

&lt;p&gt;Challenges&lt;br&gt;
Since this is not an official TabNews project, access to the RSS feed is blocked by CORS. To work around this, I used a free routing proxy that loads the data and returns it to the application.&lt;/p&gt;
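
&lt;p&gt;The proxy workaround can be sketched as follows. The proxy address is a placeholder and the &lt;code&gt;buildProxyUrl&lt;/code&gt; and &lt;code&gt;loadFeed&lt;/code&gt; names are hypothetical; the sketch also assumes a proxy that takes the target address as an encoded query value, which may not match the service actually used in the extension.&lt;/p&gt;

```javascript
// Minimal sketch: the proxy format and all names here are assumptions.

// Build the proxied URL: the feed address travels as an encoded query value.
function buildProxyUrl(proxyBase, feedUrl) {
  return proxyBase + encodeURIComponent(feedUrl);
}

// Fetch the raw RSS text through the proxy instead of calling the feed directly.
async function loadFeed(proxyBase, feedUrl) {
  const response = await fetch(buildProxyUrl(proxyBase, feedUrl));
  if (!response.ok) {
    throw new Error('Proxy request failed with status ' + response.status);
  }
  return response.text();
}
```

&lt;p&gt;For example, &lt;code&gt;loadFeed('https://proxy.example/?url=', 'https://example.com/feed.xml')&lt;/code&gt; would request the feed through the placeholder proxy and resolve with the XML as a string.&lt;/p&gt;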

&lt;p&gt;Usage&lt;br&gt;
Sign in on Google Chrome&lt;br&gt;
More Tools;&lt;br&gt;
Extensions;&lt;br&gt;
Activate “Developer Mode” at the top right;&lt;br&gt;
Click the “Load unpacked” button at the top left and select the extension folder;&lt;br&gt;
Ready. The TabNews Reader extension will be installed in your browser. Just access it along with the other extensions.&lt;/p&gt;

&lt;p&gt;Google Chrome Store&lt;br&gt;
It is possible to package and submit the extension to the Chrome Store, but extension developers must pay a one-time fee. Since I am not a registered developer, I will not publish it to the store (for now) ;)&lt;/p&gt;

&lt;p&gt;For those who want to know more about uploading their extension to the Google Chrome Store: &lt;a href="https://developer.chrome.com/docs/webstore/register/"&gt;https://developer.chrome.com/docs/webstore/register/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finishing&lt;br&gt;
Well, that’s it. I hope this material is useful to you and your projects. Follow me on &lt;a href="https://www.linkedin.com/in/olavo-mello/"&gt;LinkedIn&lt;/a&gt; and stay on top of many things to come ;)&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>chrome</category>
      <category>development</category>
    </item>
  </channel>
</rss>
