Microsys
  

Crawl HTTPS Pages and Websites with A1 Website Scraper

Scan and crawl websites that use HTTPS, or a mix of HTTP and HTTPS, with A1 Website Scraper.

To see all the available options, you will have to switch off easy mode.

For options that use a dropdown list, any [+] or [-] button next to the list adds or removes items in the list itself.

Configure Support for HTTPS

A1 Website Scraper lets you select an HTTP solution in Scan website | Crawler engine.

The default setting is Auto detect, which translates to:
  • Windows: The setting HTTP using Windows API
  • Mac: The setting HTTP using Mac API

It is also possible to select HTTP using Indy library, which is an alternative solution.
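The Auto detect rule described above can be pictured as a small lookup. The following is only an illustrative sketch in Python, not code from A1 Website Scraper: the function name `resolve_auto_detect` is hypothetical, and only the setting labels are taken from this page.

```python
import sys

# Setting labels as they appear on this help page.
ENGINE_WINDOWS = "HTTP using Windows API"
ENGINE_MAC = "HTTP using Mac API"


def resolve_auto_detect(platform_name: str = sys.platform) -> str:
    """Map a Python platform identifier to the engine that Auto detect stands for.

    Hypothetical helper for illustration only; the real program decides this internally.
    """
    if platform_name.startswith("win"):
        return ENGINE_WINDOWS
    if platform_name == "darwin":
        return ENGINE_MAC
    # This page only documents Windows and Mac, so anything else is rejected.
    raise ValueError(f"No documented Auto detect mapping for {platform_name!r}")
```

The Indy option is deliberately left out of the lookup, since according to the text above it is only used when selected explicitly.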

Tip: If you have problems getting the crawl to work, check whether A1 Website Scraper is being blocked by firewall software or similar.
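One rough way to tell whether anything on the machine can reach the website's HTTPS port is to try a plain TCP connection outside of A1 Website Scraper. The sketch below uses Python's standard `socket` module; the function name `can_reach` is hypothetical. Note the assumption: a successful result only shows that the network path works for the Python process, and a firewall rule aimed specifically at A1 Website Scraper could still block the crawler.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False
```

Calling `can_reach("www.example.com")` with your own host name returns False if the connection is refused, times out, or the name cannot be resolved.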


Crawler Engine Configuration: Indy

Note: This section is only relevant if both of the following apply:
  1. Your website uses HTTPS.
  2. You use Indy in Scan website | Crawler engine.

Configuring OpenSSL or LibreSSL for use with A1 Website Scraper helps with all HTTPS / SSL based websites.

To set this up, see General options and tools | Tool paths.

[Screenshot: get session cookies across https]

Clicking the button on the right shows a menu with information and links.

In newer versions of A1 Website Scraper, the menu also shows which version you should download for your computer system.


Crawler Engine Configuration: Windows API

While this usually works out of the box, older systems may sometimes need some configuration.

The relevant settings are mainly found in Windows / IE internet settings at Tools | Internet Options | Advanced | Security:

[Screenshot: Windows internet options for SSL and TLS]
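To judge whether a website actually requires the newer protocol versions, it can help to see which TLS version its server negotiates. The following is a minimal sketch using Python's standard `ssl` module; the function names `make_context` and `negotiated_tls_version` are hypothetical. It relies on the TLS library bundled with Python rather than the Windows API, so it shows what the server accepts, not what is enabled in Windows.

```python
import socket
import ssl
from typing import Optional


def make_context(minimum: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2) -> ssl.SSLContext:
    """Create a certificate-verifying client context that refuses versions below `minimum`."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = minimum
    return ctx


def negotiated_tls_version(host: str, port: int = 443, timeout: float = 10.0) -> Optional[str]:
    """Connect to host:port and return the negotiated protocol name, e.g. 'TLSv1.2'."""
    ctx = make_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

If `negotiated_tls_version` succeeds for a host that the Windows API engine cannot crawl, the server supports TLS 1.2 or newer, which suggests the Windows side is missing the protocol settings described below.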

Windows 8 without IE 11:
  1. Download and apply all Windows and IE updates, e.g. by using Windows Update.
  2. Enable TLS 1.1, TLS 1.2 and, if available, newer versions in Windows / IE internet settings at Tools | Internet Options | Advanced | Security.


Windows 7:
  1. Download and apply all Windows and IE updates, e.g. by using Windows Update. You need at least SP1 (Service Pack 1).
  2. Enable TLS 1.1, TLS 1.2 and, if available, newer versions in Windows / IE internet settings at Tools | Internet Options | Advanced | Security.


Windows Vista:
  1. Download and apply all Windows and IE updates, e.g. by using Windows Update.
  2. Enable TLS 1.1, TLS 1.2 and, if available, newer versions in Windows / IE internet settings at Tools | Internet Options | Advanced | Security.
  3. Experts: If you do not see these options then check these guides:


Windows XP and derivatives:
  1. Download and apply all Windows and IE updates, e.g. by using Windows Update.
  2. Experts: Investigate whether Windows Embedded POSReady 2009 is applicable to you and whether you are eligible.
  3. Experts: A registry setting essentially turns your system into a different edition for which updates were released until April 2019.
  4. Experts: You can get started learning more here:
  5. Download and apply all Windows and IE updates again, e.g. by using Windows Update.
  6. Enable TLS 1.1, TLS 1.2 and, if available, newer versions in Windows / IE internet settings at Tools | Internet Options | Advanced | Security.
  7. Experts: If you do not see these options then check this guide:


Note: Windows systems with IE 8 or older will still not work with websites that use elliptic curve cryptography.
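To get a rough idea of whether a given website relies on elliptic curve cryptography, you can look at the cipher suite its server negotiates. The sketch below uses Python's standard `ssl` module; both function names are hypothetical, and the name check is only a heuristic based on OpenSSL-style suite names. TLS 1.3 suite names such as TLS_AES_128_GCM_SHA256 do not name the key exchange at all, so the heuristic returns False for them even though TLS 1.3 handshakes commonly use elliptic curves.

```python
import socket
import ssl
from typing import Optional


def uses_elliptic_curves(cipher_name: str) -> bool:
    """Heuristic: True if an OpenSSL-style suite name mentions ECDHE, ECDH or ECDSA."""
    parts = cipher_name.upper().split("-")
    return any(part in ("ECDHE", "ECDH", "ECDSA") for part in parts)


def negotiated_cipher(host: str, port: int = 443, timeout: float = 10.0) -> Optional[str]:
    """Connect to host:port and return the name of the negotiated cipher suite."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            info = tls.cipher()  # (name, protocol, secret_bits), or None
            return info[0] if info else None
```

A result of True for the name returned by `negotiated_cipher` only describes the suite chosen for the Python client. The same server may or may not offer a non-elliptic-curve suite to an older Windows system.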

Alternative solution: Use Indy as the crawler engine and configure it as explained further above.

This help page is maintained by

As one of the lead developers, his hands have touched most of the code in the software from Microsys. If you email any questions, chances are that he will be the one answering.
 © Copyright 1997-2020 Microsys
