irwink edited this page Jun 16, 2015 · 1 revision

Link Checking

The Link Checker module of the WPSS Validation Tool performs a number of link violation checks in addition to the standard broken link check. These checks include:

  • broken links
  • cross language links
  • redirected links
  • bad network scope links
  • broken anchors
  • firewall blocked links
  • IPV4 links

Link checking is applied to all URL references found in any tag. This includes, but is not limited to:

  • anchor tags <a>
  • image tags <img>
  • link tags <link>
  • script tags <script>
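As a rough illustration of gathering URL references from any tag, the following Python sketch collects `href` and `src` attribute values. The attribute set and the names `LinkCollector` and `collect_links` are assumptions made for this example and do not describe the tool's actual implementation.

```python
from html.parser import HTMLParser

# Assumption: attributes that commonly carry URL references.
# The real tool may inspect more attributes than these two.
URL_ATTRIBUTES = {"href", "src"}


class LinkCollector(HTMLParser):
    """Collects (tag, attribute, url) tuples from every start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names are lowercased.
        for name, value in attrs:
            if name in URL_ATTRIBUTES and value:
                self.links.append((tag, name, value))


def collect_links(html_text):
    """Return all URL references found in html_text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links
```

Each collected URL would then be passed through the individual checks.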

Link violations are described in the following sections.

Broken Links

Broken links are links or references to URLs that do not exist. This is the most basic form of link checking. When a requested document does not exist, the web server returns a “404 (Not Found)” response.
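A minimal Python sketch of this check is shown here. The function names are hypothetical. Using a HEAD request and treating an unreachable server as broken are assumptions of this example, not documented behaviour of the tool.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived.

    Note: urlopen follows redirects, so this is the status of the final URL.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered with an error status such as 404.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def is_broken(status):
    """Treat 404 as broken; also treat 'no response' as broken (assumption)."""
    return status is None or status == 404
```

A caller would combine the two, for example `is_broken(fetch_status(url))`.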

Cross Language Links

A cross language link occurs when a document in one language references a document in another language. For example, an English web page contains a link to a French document.

The language of a document is determined by one of the following techniques:

  • File name suffix – use the characters before the file type extension to determine language.

    English: index-eng.html, report_eng.pdf, sign_e.gif, sign-e.gif

    French: index-fra.html, rapport_fra.pdf, sign_f.gif, sign-f.gif

  • Language variable in URL string – check for presence of the lang variable in URL string.

    English: index.cfm?lang=eng

    French: index.cfm?lang=fra

  • Content – determine language of content for HTML, PDF or text files.
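The first two techniques in the list above can be sketched in Python as follows. The suffix and `lang` values come from the examples in the list; the function name, the regular expression, and the order in which the techniques are tried are assumptions of this sketch. Content-based detection is not covered.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Suffixes and lang values taken from the examples in the list above.
SUFFIX_LANGUAGE = {"eng": "English", "e": "English",
                   "fra": "French", "f": "French"}
QUERY_LANGUAGE = {"eng": "English", "fra": "French"}

# A "-" or "_" followed by a language suffix, just before the file extension.
_SUFFIX_RE = re.compile(r"[-_](eng|fra|e|f)\.[A-Za-z0-9]+$", re.IGNORECASE)


def language_from_url(url):
    """Return "English", "French", or None if the URL gives no hint."""
    parts = urlsplit(url)

    # Technique 1: file name suffix.
    match = _SUFFIX_RE.search(parts.path)
    if match:
        return SUFFIX_LANGUAGE[match.group(1).lower()]

    # Technique 2: lang variable in the query string.
    values = parse_qs(parts.query).get("lang", [])
    if values:
        return QUERY_LANGUAGE.get(values[0].lower())

    return None
```

Returning `None` keeps "language unknown" distinct from either language, so a caller can decide what to do when no hint is found.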

If content is available in only one language, such as a document in English only, then any links to it from French documents would appear as cross language links. You can avoid errors from the Validation Tool by including a language attribute on the link's HTML tag.

For example, a link from a French web page would look like:

<html xmlns="http://www.w3.org/1999/xhtml" lang="fr" xml:lang="fr">
<a href="http://www.tpsgc-pwgsc.gc.ca/comm/index-eng.html" lang="en" xml:lang="en">English</a>

The lang and xml:lang attributes indicate the link references an English document, which is a different language than the main French document. If the Validation Tool cannot determine the language of either the source document or the target document, no cross language link violations are reported.
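The decision rule described here can be summarized in a short Python sketch. The function and parameter names are hypothetical, and reporting a link whose declared language does not match the target is an assumption, since the text does not cover that case.

```python
from typing import Optional


def is_cross_language_link(source_lang: Optional[str],
                           target_lang: Optional[str],
                           declared_lang: Optional[str] = None) -> bool:
    """Decide whether a link should be reported as a cross language link.

    source_lang / target_lang: detected document languages, None if unknown.
    declared_lang: value of the lang attribute on the link, if any.
    """
    # Unknown language on either side: nothing is reported.
    if source_lang is None or target_lang is None:
        return False
    # Same language: not a cross language link.
    if source_lang == target_lang:
        return False
    # The link declares the target's language, so the change is intentional.
    if declared_lang == target_lang:
        return False
    return True
```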

Redirected Links

A redirected link is one whose URL differs from the final location of the document. The final URL may differ in the domain name, the file name, or both. Redirects may be used to translate vanity domains into official domains or to direct users to new file names for documents. There is a risk that redirects may be removed on the target site, resulting in broken links.

Domain redirect : http://www.pwgsc.gc.ca/ > http://www.tpsgc-pwgsc.gc.ca/

File redirect : http://www.canada.gc.ca/main_e.html > http://www.canada.gc.ca/home.html

The Validation Tool reports redirect link violations only for permanent redirects, not for temporary redirects.
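The following Python sketch illustrates the distinction. HTTP status codes 301 and 308 are the permanent redirects, while 302, 303 and 307 are temporary. Whether the tool checks both permanent codes is an assumption, and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Permanent redirect status codes defined by HTTP.
# Assumption: the tool may only look at 301.
PERMANENT_REDIRECTS = {301, 308}


def is_reportable_redirect(status):
    """Only permanent redirects count as violations."""
    return status in PERMANENT_REDIRECTS


def redirect_differences(original_url, final_url):
    """Return which parts changed: a list containing "domain" and/or "file"."""
    original, final = urlsplit(original_url), urlsplit(final_url)
    differences = []
    if original.hostname != final.hostname:
        differences.append("domain")
    if original.path != final.path:
        differences.append("file")
    return differences
```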

Bad Network Scope Links

Network scope refers to a level of the network: PWGSC Intranet, Government of Canada Intranet, or Internet. A bad network scope link occurs when a document at a higher scope references a document at a lower scope. For example, an Internet document has a link to a Government of Canada Intranet document. Some users may experience a broken link when selecting the link while others get the referenced document. For example, public users cannot reach the document, while users on the Government of Canada network can.
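The comparison can be sketched in Python by ranking the scopes. The numeric ordering (PWGSC Intranet lowest, Internet highest) is inferred from the example in the text, and the names are hypothetical. How the tool maps a URL to a scope is not described here, so the scopes are taken as inputs.

```python
from enum import IntEnum


class Scope(IntEnum):
    # Assumed ordering: a larger value means a wider audience.
    PWGSC_INTRANET = 1
    GC_INTRANET = 2
    INTERNET = 3


def is_bad_scope_link(source, target):
    """A link is bad when the source is at a higher scope than the target."""
    return source > target
```

For example, `is_bad_scope_link(Scope.INTERNET, Scope.GC_INTRANET)` is true, while a link from the Government of Canada Intranet out to the Internet is not flagged.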

Broken Anchors

A named anchor is a link marker on an HTML page. If an anchor is referenced in a URL, the browser positions the user at the anchor in the document. This allows for linking to a particular section in a larger web document. If the named anchor does not exist in the target document, the browser places the user at the top of the document.
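A minimal Python sketch of a broken anchor check follows. It assumes anchors are defined either by an `id` attribute on any tag or by a `name` attribute on an `<a>` tag, and it compares the URL fragment literally without decoding percent-escapes. The names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class AnchorCollector(HTMLParser):
    """Collects anchor names defined in a document."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # id on any tag, or name on an <a> tag, defines an anchor.
            if value and (name == "id" or (name == "name" and tag == "a")):
                self.anchors.add(value)


def is_broken_anchor(url, target_html):
    """True if url references an anchor that target_html does not define."""
    fragment = urldefrag(url).fragment
    if not fragment:
        return False  # no anchor referenced, nothing to check
    collector = AnchorCollector()
    collector.feed(target_html)
    collector.close()
    return fragment not in collector.anchors
```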

Firewall Blocked Links

A firewall blocked link is a link to a site or document that is blocked by the departmental firewall. This is essentially a broken link, as the user cannot get to the target document.

IPV4 Links

An IPV4 link is a link that uses an IPv4 address rather than a domain name. For example, http://10.11.12.13/index.html.
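One way to detect such links, sketched in Python with the standard `ipaddress` module, is to test whether the host part of the URL parses as an IPv4 address. The function name is hypothetical and this is not necessarily how the tool performs the check.

```python
import ipaddress
from urllib.parse import urlsplit


def is_ipv4_link(url):
    """True if the host part of url is a literal IPv4 address."""
    host = urlsplit(url).hostname
    if host is None:
        return False  # relative URL, no host to inspect
    try:
        ipaddress.IPv4Address(host)
    except ValueError:
        # Not a valid dotted-quad address, so treat it as a domain name.
        return False
    return True
```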

WPSS Validator Tool Test Cases
