This post gives a basic overview of using Mechanize with Ruby.
What is Mechanize?
- Mechanize is a Ruby gem (library) that makes automated web interactions easy.
- Mechanize is generally used for web scraping.
Why Mechanize?
- It automatically stores and sends cookies, follows redirects, and can submit forms after populating their fields.
- It also keeps track of the pages you have visited as a history.
Here is an example that uses a simple HTML page containing links and a form.
First, install the mechanize gem with the command below.
gem install mechanize
Create an HTML page with some links and a form. Submitting the form takes you to the Google home page.
<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    <title></title>
  </head>
  <body>
    <a href="demo.html"> Home</a>
    <a href="contact.html">contact</a>
    <form action="https://www.google.com/" method="get">
      <input type="text" name="username" required>
      <button type="submit" name="button">submit</button>
    </form>
  </body>
</html>
The page has two links: Home points back to the current page, and contact leads to the contact page.
When you submit the form with a username, it takes you to the Google home page.
Now create a Ruby file and put the code below in it.
require 'rubygems'
require 'mechanize'

# Create the Mechanize agent
agent = Mechanize.new

# Load the demo page through the agent
page = agent.get('file:///home/er/projects/mechanize/demo.html')

# Print all the links present on the page
puts "********* Links **********\n\n"
page.links.each do |link|
  puts link
end
puts "\n**************************\n"

# Find a specific link by its text
contact_link = page.link_with(text: 'contact')

# Follow the link
puts "You redirected to contact link"
contact_page = contact_link.click

# Extract data from the page (assumes contact.html contains a <span>)
puts "****** Page content ******\n\n"
puts contact_page.at('span').text
puts "\n**************************\n"

# Form operations
# Get the first form on the page
form = page.form

# Set a value on a form field. The demo input is named 'username';
# assigning to 'q', a name the form does not have, adds a new field.
form['q'] = 'write anything'

# Submit the form
submit_form = form.submit

# The resulting page is on Google
puts "*******Redirected URL*****\n\n"
puts submit_form.uri.to_s
puts "\n**************************\n"

This gives you a reference for the basic operations: listing all the links on a web page, following a link, finding and submitting a form, and extracting data from a page.
You can find more information about Mechanize in the posts below.
- http://ruby.bastardsbook.com/chapters/mechanize/
- https://readysteadycode.com/howto-scrape-websites-with-ruby-and-mechanize
Complete Code On Github