How to Scrape/Extract Data from any Website? (free & easy)

Main elements
1. How to auto scrape any site
2. “No configuration” alternative
3. Create bulk web pages using the collected data

Web scraping is the process of extracting content and data from websites with dedicated tools, for purposes such as market research, lead generation, price tracking, news monitoring, e-commerce, and data analysis. The extracted information is saved in a structured format that is more useful to the user.

Web scraping can be done manually by copying and pasting elements yourself, but that takes a lot of time and energy. Web scraping tools speed up the process: they are automated, cost less, and work more efficiently.

In this guide, we’ll show you how you can auto scrape data from any website for free using https://webscraper.io/.

No Python, PHP, JavaScript, or coding is needed.

How to scrape data from any site?

As we said, web scraping can be used to extract product data from e-commerce websites like Amazon, eBay, AliExpress, Etsy, Walmart, Alibaba, and Target: details such as price, name or title, description, pictures, reviews, and rating.

We will take Amazon.com as an example, but you can apply the same steps to any data on any website.

Automate Amazon product scraping in 4 easy steps.

Step #1: Install webscraper.io

First, you need to install https://webscraper.io/, a free and easy-to-use browser extension for data extraction, available for both Google Chrome and Mozilla Firefox.

(Screenshot: the Webscraper.io website)

Step #2: Configure webscraper.io

The setup takes a few minutes. It might seem hard, but it’s actually pretty easy.

  1. Once installed, head to the page you want to scrape, right-click with your mouse → Inspect.
  2. Click on the last tab “Web Scraper”.
  3. Click on Create new sitemap → Create Sitemap (a sitemap is a blueprint of how you want the scraper to navigate the website and scrape the data), give it any name, and add a start URL (a category page or a search page, for example).
  4. Click on Add new selector, name it, and choose “link” in the drop-down list (because it’s a product page link).
  5. Click Select to choose the selector, then point-and-click on the product (this will tell the scraper that this is the type of page to navigate to).
  6. Check the “Multiple” box (because there are multiple products on that page).
  7. Click on Save selector.

Now the scraper will start from the Start URL and navigate to all the products found on that page. Next, we have to define the details we want to extract from those products (name, price, image, etc.).

Now click on that selector you just saved, and add another selector inside of it.

  1. Click on Add new selector, name it, and choose “text” in the drop-down list (we are going to scrape the product name now).
  2. Open a product page as an example, click Select, then point-and-click on the product name.
  3. Click on Save selector.
  4. Repeat the same process for the image (choose “image” in the type), and for the price (choose “text” in the type).
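
Behind the point-and-click interface, the sitemap we just built is a small tree: one link selector under the root, with the name, image, and price selectors inside it. If you are curious, here is a minimal Python sketch of that structure. It is only an illustration: the field names and type names (SelectorLink, SelectorText, SelectorImage) follow our recollection of the JSON that Web Scraper exports and may differ between versions, the CSS selectors are made-up placeholders rather than Amazon's real markup, and build_sitemap and children_of are hypothetical helpers.

```python
import json


def build_sitemap(name, start_url):
    """Return a sitemap-like dict: one link selector with three children."""
    def selector(sel_id, sel_type, parent, css, multiple=False):
        return {
            "id": sel_id,
            "type": sel_type,             # assumed type names, may vary by version
            "parentSelectors": [parent],  # "_root" means the start URL page
            "selector": css,              # placeholder CSS, not Amazon's real markup
            "multiple": multiple,
        }

    return {
        "_id": name,
        "startUrl": [start_url],
        "selectors": [
            # Step 1: follow every product link found on the start page.
            selector("product", "SelectorLink", "_root", "a.product-link", multiple=True),
            # Step 2: on each product page, extract the details.
            selector("name", "SelectorText", "product", "h1.product-title"),
            selector("image", "SelectorImage", "product", "img.product-image"),
            selector("price", "SelectorText", "product", "span.product-price"),
        ],
    }


def children_of(sitemap, parent_id):
    """List the ids of the selectors nested directly under parent_id."""
    return [s["id"] for s in sitemap["selectors"] if parent_id in s["parentSelectors"]]


if __name__ == "__main__":
    sitemap = build_sitemap("amazon-macbook", "https://www.amazon.com/s?k=macbook")
    print(json.dumps(sitemap, indent=2))
```

Calling children_of(sitemap, "product") lists the three detail selectors, which mirrors what you see when you click into the link selector in the extension.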

Tip: For pagination, go back to the Start URL and add &page=[1-XX] to the end of the URL (e.g. https://www.amazon.com/s?k=macbook&page=[1-100]). The scraper will then be able to crawl the first 100 pages. With no pagination setup, the scraper will only extract data from products on the first page.
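
To see what the [1-XX] range stands for, here is a small Python sketch that expands such a URL into the list of pages it covers. It only illustrates the idea and is not how the extension is implemented; expand_range_url is a hypothetical helper, and it handles only a single plain numeric range.

```python
import re

# Matches a numeric range such as [1-100] inside a URL.
RANGE_PATTERN = re.compile(r"\[(\d+)-(\d+)\]")


def expand_range_url(url):
    """Expand the first [start-end] range in url into one URL per number."""
    match = RANGE_PATTERN.search(url)
    if match is None:
        return [url]  # no range: only this single page would be visited
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = url[:match.start()], url[match.end():]
    return [f"{prefix}{page}{suffix}" for page in range(start, end + 1)]


if __name__ == "__main__":
    for page_url in expand_range_url("https://www.amazon.com/s?k=macbook&page=[1-3]"):
        print(page_url)
```

A URL without a range comes back unchanged as a one-item list, which matches the behavior described in the tip: no pagination setup means only the first page.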

That’s it. If something isn’t clear, you can watch their video to get a clearer idea of how this works:

Video tutorial

Step #3: Start scraping

To start scraping, click on your sitemap name → Scrape → Start scraping.

(Screenshot: the Start scraping button)

Webscraper will open a new pop-up Chrome window to do its work. The window runs independently and won’t interrupt your browsing or other activities.

Step #4: Export Data

Once finished, click on your sitemap name → Export data, and download the data in XLSX or CSV format.
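
You don't need any code to use the exported file, but if you later want to clean it up, a few lines of Python are enough. The sketch below reads a CSV export and turns the price text into a number. It rests on several assumptions: the column names name, price, and image-src depend on what you called your selectors (they are our guess at what the export would contain), prices are assumed to use the US format like $1,299.00, and parse_price and load_products are hypothetical helpers.

```python
import csv
import io
import re


def parse_price(text):
    """Turn a price string like '$1,299.00' into a float, or None if absent."""
    cleaned = re.sub(r"[^\d.]", "", text or "")  # keep digits and the decimal point
    try:
        return float(cleaned)
    except ValueError:
        return None  # empty or unparsable price


def load_products(csv_file):
    """Read exported rows from an open CSV file into a list of clean dicts."""
    products = []
    for row in csv.DictReader(csv_file):
        products.append({
            "name": (row.get("name") or "").strip(),
            "price": parse_price(row.get("price")),
            "image": row.get("image-src") or "",  # assumed column name
        })
    return products


if __name__ == "__main__":
    # In-memory sample standing in for the downloaded file.
    sample = io.StringIO(
        "name,price,image-src\n"
        'MacBook Air,"$1,299.00",https://example.com/air.jpg\n'
        "Unknown item,,\n"
    )
    for product in load_products(sample):
        print(product)
```

For a real export, open the downloaded file with open(path, newline="", encoding="utf-8") and pass it to load_products instead of the in-memory sample.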

I don’t want to configure anything. Is there an alternative?

If you don’t want to do all this work or simply don’t have the time, there is Apify, a powerful web scraping and automation platform with flexible, ready-to-use tools that help you scrape and extract data from any site (Instagram, Google search results, Google Maps, Amazon, TikTok, YouTube, Twitter, Facebook, Reddit, etc.) quickly and accurately. No need to bother with configurations, proxies, browsers, or captchas.

(Screenshot: web scraper categories in the Apify Store)

How to create bulk web pages using the scraped data?

If you are a WordPress user, the MPG WP plugin is a great option: it lets you create pages in bulk and customize them with simple shortcodes. You can generate hundreds, thousands, or even hundreds of thousands of pages with a single setup.
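
The idea behind bulk page generation is simple: one template, one row of data per page. Here is a rough Python sketch of that idea. It is not how MPG works internally, and the {{column}} placeholder syntax is made up for this illustration rather than taken from MPG's shortcodes; render_page and render_pages are hypothetical helpers.

```python
import re

# Made-up placeholder syntax for this sketch: {{column_name}}
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_page(template, row):
    """Fill each {{column}} placeholder with the matching value from row."""
    def replace(match):
        key = match.group(1)
        # Leave unknown placeholders untouched so mistakes are easy to spot.
        return str(row[key]) if key in row else match.group(0)

    return PLACEHOLDER.sub(replace, template)


def render_pages(template, rows):
    """Build one page per data row from a single template."""
    return [render_page(template, row) for row in rows]


if __name__ == "__main__":
    template = "{{name}} is available for {{price}}."
    rows = [
        {"name": "MacBook Air", "price": "$999"},
        {"name": "MacBook Pro", "price": "$1,999"},
    ]
    for page in render_pages(template, rows):
        print(page)
```

Each row of the scraped file becomes one page, so a file with thousands of rows produces thousands of pages from a single template.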

Check out our step-by-step guide on how to create bulk WordPress web pages.


Happy scraping!

Post by: Anselh of Browntips, a blogging addict, WordPress fanatic, SEO writer, and online marketing expert.
