
328. Understanding the hierarchy of HTML Source






To parse HTML code using XPath, we first have to view the HTML page source of the page by following the steps below:

View the HTML Page Source of your page:

1. Open http://compendiumdev.co.uk/selenium/basic_web_page.html
2. Right-click on the Basic Web Page and select the 'View Page Source' option.

3. Ensure that the page source of the selected page is displayed as HTML.



Understand the HTML Code displayed in the above Page Source:

1. The following HTML Code is displayed in the above Page Source:

<html>
    <head>
        <title>Basic Web Page Title </title>
    </head>
    <body>
        <p id="para1" class="main">A paragraph of text</p>
        <p id="para2" class="sub">Another paragraph of text</p>
    </body>
</html>

2. After going through the HTML code, it is clear that <html> is the root tag.
3. <head> and <body> are the child tags of the <html> tag (i.e. they are inside the <html> </html> tags), so <html> is the parent tag of the <head> and <body> tags.
4. The <title> tag is a child of the <head> tag. Since the <head> tag is a child of the <html> tag, the <title> tag is a grandchild of the <html> tag.
5. Similarly, as the <p> tags are inside the <body> tag, they are child tags of the parent tag <body>. As the <body> tag is a child of the <html> tag, the <p> tags are grandchildren of the <html> tag.
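
The parent/child relationships described above can also be checked programmatically. The following is only a minimal sketch in Python and is not part of the original tutorial: it assumes the page source has been copied into a string (here named PAGE_SOURCE, a made-up name), and it uses the standard library's XML parser, which works only because this small page happens to be well-formed. The helper describe_hierarchy is likewise a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

# The page source from the post, copied into a string (assumption: pasted by hand).
PAGE_SOURCE = """
<html>
    <head>
        <title>Basic Web Page Title </title>
    </head>
    <body>
        <p id="para1" class="main">A paragraph of text</p>
        <p id="para2" class="sub">Another paragraph of text</p>
    </body>
</html>
"""


def describe_hierarchy(element, parent=None, depth=0):
    """Return (depth, tag, parent_tag) for the element and all its descendants."""
    parent_tag = parent.tag if parent is not None else None
    rows = [(depth, element.tag, parent_tag)]
    # Iterating over an Element yields its direct children in document order.
    for child in element:
        rows.extend(describe_hierarchy(child, element, depth + 1))
    return rows


# An XML parser is used here only because the sample page is well-formed;
# real-world HTML usually needs a dedicated HTML parser.
root = ET.fromstring(PAGE_SOURCE.strip())
hierarchy = describe_hierarchy(root)

# Print each tag indented by its depth, with its relation to its parent.
for depth, tag, parent_tag in hierarchy:
    relation = "root" if parent_tag is None else f"child of <{parent_tag}>"
    print("    " * depth + f"<{tag}> ({relation})")
```

A depth of 0 corresponds to the root tag, 1 to its children, and 2 to its grandchildren, which matches the description in points 2 to 5 above.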

The hierarchy of the above HTML code is as follows:

html
    head
        title
    body
        p (id="para1")
        p (id="para2")

Please comment below to give feedback or ask questions.

Parsing the HTML source using its hierarchy to find the absolute XPath will be explained in the next post.



