Microsys
  

Firewall Causes Problems for Crawler in Website Download

While our website download program uses normal HTTP internet connections for crawling websites, some firewall solutions will still block our software unless you take direct action.

 To see all the options available, you will have to switch off easy mode 

 With options that use a dropdown list, any [+] or [-] button next to it adds or removes items in the list itself 

How Firewalls Interact with Internet Enabled Software

Most firewall software defaults to silently blocking all internet-enabled applications unless the configuration explicitly allows them. This includes most Windows programs, such as our website download program.
  • If the website crawl and related tools find no URLs, a firewall is often the reason.
  • If you get flaky results, odd errors, or similar problems, network traffic filtering can be the reason.

One hint that firewall or internet security software is the cause is response codes like these in the website scan results:
  • 500 : Internal Server Error
  • 503 : Service Temporarily Unavailable
  • -4 : CommError

Note: Another possible reason for the above problems is modules installed on the webserver or website that block unknown crawlers.
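
As a rough illustration of the check described above, the following Python sketch counts how often the three hint codes occur in a list of response codes. It assumes you have the response codes from a scan available as plain integers. The names `FIREWALL_HINT_CODES` and `firewall_hints` are hypothetical and not part of A1 Website Download. A match is only a hint, since server-side modules that block unknown crawlers can produce the same codes.

```python
# Response codes this help page lists as hints of firewall interference.
# The names below are illustrative only, not part of A1 Website Download.
FIREWALL_HINT_CODES = {
    500: "Internal Server Error",
    503: "Service Temporarily Unavailable",
    -4: "CommError",
}


def firewall_hints(response_codes):
    """Return a dict mapping each listed hint code to how often it occurs."""
    counts = {}
    for code in response_codes:
        if code in FIREWALL_HINT_CODES:
            counts[code] = counts.get(code, 0) + 1
    return counts


if __name__ == "__main__":
    # Assumed sample data: response codes copied from a scan report.
    sample = [200, 503, -4, 503, 404]
    for code, count in firewall_hints(sample).items():
        print(f"{code} ({FIREWALL_HINT_CODES[code]}): {count} URL(s)")
```

If many URLs show these codes while the same pages open normally in a browser, the firewall settings below are worth checking first.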


Firewall Solutions to Get Website Download Working

NOD32 client version 3:
  • View advanced mode
  • Select Setup
  • Click Antivirus and antispyware
  • In Web access protection click Configure
  • Expand HTTP and click Web browsers
  • NOD32 automatically marks A1 Website Download as a web browser (checked). You must uncheck it for A1 Website Download to work.


Norton 360:
  • Whitelist / add the program A1 Website Download in Program Rules


ESET Smart Security:
Solution:
  • Set it to learning mode.


Kaspersky anti-virus:
Symptoms:
  • URL timeouts with Indy HTTP engine.
  • URL 404 response codes with WinInet HTTP engine
Solution:
  • Pause Kaspersky protection.


Other software and hardware firewall solutions:
Symptoms:
  • Various errors and/or no crawling
Solution: Mimic user browser behavior (like some other programs also do):
  • In Scan website | Crawler engine, switch to HTTP using WinInet engine and settings (Internet Explorer).
  • In General Options | Internet Crawler, set the user agent string to Mozilla/4.0 (compatible; MSIE 8.0; Win32).
  • In Scan website | Crawler engine, lower the number of simultaneous connections, possibly all the way down to one.
  • In Scan website | Crawler engine, increase the amount of time between active connections.
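
To show what mimicking browser behavior means at the HTTP level, here is a minimal Python sketch using only the standard library. It is not how A1 Website Download is implemented. It sends the browser-style user agent string from the list above, uses a single connection at a time, and waits between requests. The names `build_request` and `crawl_sequentially`, the 30-second timeout, and the one-second default delay are assumptions chosen for illustration.

```python
import time
import urllib.request

# User agent string taken from the settings listed above.
BROWSER_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 8.0; Win32)"


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Create a request that identifies itself like a web browser."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def crawl_sequentially(urls, delay_seconds=1.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between requests.

    Returns a dict mapping each URL to its HTTP status code, or to the
    exception raised if the request failed (for example when blocked).
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # time between active connections
        try:
            with opener(build_request(url), timeout=30) as response:
                results[url] = response.status
        except OSError as error:  # URLError and HTTPError are OSError subclasses
            results[url] = error
    return results


if __name__ == "__main__":
    # example.com is a placeholder; replace it with pages of your own site.
    print(crawl_sequentially(["http://example.com/"], delay_seconds=2.0))
```

Fetching strictly one URL at a time with a pause is the slowest setting, but it is also the traffic pattern least likely to be treated as unusual by filtering software.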
 © Copyright 1997-2016 Microsys