Sitemap URLs Have "&" Entity Escaped into "&amp;"
To comply with XML and HTML standards, TechSEO360 makes sure all generated sitemaps are valid and entity escaped, including the ampersand character "&".
Making Sure XML and HTML Documents Are Valid and Entity Escaped
There are many rules for XML documents and HTML files. While most internet browsers are forgiving,
most sitemap validators are not.
This also applies when submitting XML sitemaps to Google. When building HTML and XML sitemap files, TechSEO360 intentionally entity escapes the following characters when they occur in URLs, titles, descriptions etc.:
- & into &amp;
- < into &lt;
- > into &gt;
Depending on the sitemap file type, this usually also includes:
- " into &quot;
- ' into &apos;
By doing this, all HTML Validators, XML Validators, internet browsers etc. can correctly parse URLs when you submit/view sitemap files.
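The conversion described above can be illustrated with a minimal Python sketch. This is not TechSEO360's own code; the function name escape_for_sitemap and the example.com URL are hypothetical and used only for illustration. It relies on the standard library function xml.sax.saxutils.escape, which always converts &, < and >, and accepts an extra mapping for the two quote characters.

```python
from xml.sax.saxutils import escape

# Hypothetical helper, for illustration only (not TechSEO360's implementation).
def escape_for_sitemap(value: str) -> str:
    # escape() converts & first, then > and <; the extra mapping
    # additionally converts double and single quotes.
    return escape(value, {'"': "&quot;", "'": "&apos;"})

# Hypothetical URL containing an ampersand in its query string.
url = "https://example.com/list?category=books&page=2"
loc = f"<loc>{escape_for_sitemap(url)}</loc>"
print(loc)
```

Because the ampersand is converted before any other character, the "&" inside the newly produced entities is not escaped a second time.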
Notes:
- When viewing XML sitemap files in internet browsers, they will usually show &amp; as the & ampersand character. To see the actual code of generated XML sitemaps, use a text editor or view source in your internet browser.
- You should not copy and paste URLs in which & has been converted to &amp; into your internet browser address field. That will normally not work.
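The two notes above can be sketched in Python: an XML parser decodes &amp; back to & when it reads the sitemap, much like a browser does when rendering it, and a URL copied from the raw source must be unescaped before it is usable in an address field. The sitemap snippet and the example.com URL are hypothetical; only standard library calls are used.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical fragment of a generated XML sitemap, as seen in "view source".
snippet = "<url><loc>https://example.com/list?category=books&amp;page=2</loc></url>"

# An XML parser decodes the entity and returns the real URL.
loc = ET.fromstring(snippet).findtext("loc")
print(loc)

# A URL copied from the raw source still contains &amp; and must be
# unescaped before it can be pasted into a browser address field.
copied = "https://example.com/list?category=books&amp;page=2"
browser_url = html.unescape(copied)
print(browser_url)
```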
Information and references:
- Official sitemaps protocol: entity escaping.
As with all XML files, any data values (including URLs) must use entity escape codes
- Google webmaster support: general sitemap guidelines.
As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed in the table below
- W3C HTML4 document: character entity references.
Authors should use "&amp;" (ASCII decimal 38) instead of "&" to avoid confusion
"&amp;" represents the & sign.
