Understand Website Scan Information
The website crawler keeps total counts for the following states:
- Listed found:
Unique URLs located.
- Listed deduced:
Appears after the website scan has finished. Assume the crawler found links to
"example/somepage.html" during the scan, but none to "example/". The latter is then "deduced" to exist.
- Analyzed content:
Unique URLs with content analyzed.
- Analyzed references:
Unique URLs whose content has been analyzed and whose links have all been resolved (e.g. links to URLs that redirect).
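The "deduced" state can be illustrated with a short Python sketch. It is not how A1 Website Analyzer works internally. The function name deduce_directories, and the choice to include every parent directory up to the site root, are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def deduce_directories(found_urls):
    """Return directory URLs implied by the found URLs but never found themselves.

    Hypothetical helper for illustration only; not part of A1 Website Analyzer.
    """
    found = set(found_urls)
    deduced = set()
    for url in found:
        parts = urlsplit(url)
        # Keep only the directory segments: drop the leading empty string
        # and the final segment (the file name, or "" for a directory URL).
        segments = parts.path.split("/")[1:-1]
        path = "/"
        candidates = [path]
        for segment in segments:
            path += segment + "/"
            candidates.append(path)
        for candidate_path in candidates:
            candidate = f"{parts.scheme}://{parts.netloc}{candidate_path}"
            if candidate not in found:
                deduced.add(candidate)
    return deduced


found = ["http://www.example.com/example/somepage.html"]
for url in sorted(deduce_directories(found)):
    print(url)
```

In this sketch a directory counts as deduced only if no link to it was found, which mirrors the "example/" case described above.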
Progress Changes When Website Scan Finishes
By default, A1 Website Analyzer is configured to clean up after the website scan finishes.
This setting is controlled by:
Scan website | Crawler options | Apply "webmaster" and "output filters" after website scan stops.
Difference Between "Listed Found" and "Analyzed"
The difference in progress counts is much like the difference between output filters
and analysis filters. Imagine you want to list .pdf files but not have them analyzed or crawled: the two progress numbers would then differ.
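As a rough illustration of why the two numbers can differ, the following Python sketch counts URLs under two separate exclusion lists. The function scan_counts and the wildcard patterns are hypothetical and only mimic the idea. They do not reflect the actual filter syntax or the processing order used by A1 Website Analyzer.

```python
from fnmatch import fnmatchcase


def scan_counts(urls, output_exclude=(), analysis_exclude=()):
    """Return (listed, analyzed) counts for a set of URLs.

    Hypothetical model: output filters decide what is listed,
    analysis filters decide which listed URLs are also analyzed.
    """
    listed = [
        url for url in urls
        if not any(fnmatchcase(url, pattern) for pattern in output_exclude)
    ]
    analyzed = [
        url for url in listed
        if not any(fnmatchcase(url, pattern) for pattern in analysis_exclude)
    ]
    return len(listed), len(analyzed)


urls = [
    "http://www.example.com/",
    "http://www.example.com/page.html",
    "http://www.example.com/manual.pdf",
]
# List .pdf files, but do not analyze them.
listed, analyzed = scan_counts(urls, analysis_exclude=["*.pdf"])
print(listed, analyzed)
```

Here the .pdf file is listed but excluded from analysis, so the listed count is one higher than the analyzed count.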
Detailed Counts of URLs After Website Scan
If you want to see detailed counts, you can do so after the website scan has finished.
Just open the Analyze website tab, which shows the website scan results,
select the root URL, and then select
Extended data | Directory summary.
Log and Analyze Website Crawling Issues
If you experience strange problems spidering your website,
you can try enabling Scan website | Data collection | Logging of progress.
After the website scan, you can find a log file in the program data directory under logs/misc.
The log file can be useful in solving problems related to crawler filters, robots.txt, no-follow links etc.
For example, you can find out through which page the crawler first found a specific website section:
2007-07-28 10:56:14
CodeArea: InitLink:Begin
ReferencedFromLink: http://www.example.com/website/
LinkToCheck: http://www.example.com/website/scan.html
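A log like the excerpt above can be searched by hand, but a small script may help with large files. The Python sketch below assumes that every entry starts with a timestamp line and is followed by "Key: value" lines, as in the excerpt. The real log may contain other entry types, and the function name first_referrers is hypothetical.

```python
import re

# Assumed entry separator: a line such as "2007-07-28 10:56:14".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$")


def first_referrers(log_text):
    """Map each LinkToCheck URL to the first ReferencedFromLink seen for it."""
    first_seen = {}
    referrer = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if TIMESTAMP.match(line):
            referrer = None  # a new log entry begins
            continue
        key, separator, value = line.partition(": ")
        if not separator:
            continue  # not a "Key: value" line
        if key == "ReferencedFromLink":
            referrer = value
        elif key == "LinkToCheck" and referrer is not None:
            # setdefault keeps the earliest referrer for each URL.
            first_seen.setdefault(value, referrer)
    return first_seen


sample = """2007-07-28 10:56:14
CodeArea: InitLink:Begin
ReferencedFromLink: http://www.example.com/website/
LinkToCheck: http://www.example.com/website/scan.html
"""
for link, referrer in first_referrers(sample).items():
    print(link, "was first found via", referrer)
```

To use it on a real log, read the file contents into a string and pass that string to first_referrers.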