How to get the inner text of a span that contains another, hidden span?

nubclolug

I have a test HTML page:
\[code\]
<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta charset="utf-8" />
    <title>Page for test</title>
</head>
<body>
    <div class="r_tr">
        <span class="r_rs">Inner text<span class="otherSpan" style="display: none">text</span></span>
    </div>
</body>
</html>
\[/code\]
I want to get "Inner text". I am using HtmlAgilityPack, and I wrote this method:
\[code\]
public string GetInnerTextFromSpan(HtmlDocument doc)
{
    const string rowXPath = "//*[@class=\"r_tr\"]";
    const string spanXPath = "//*[@class=\"r_rs\"]";
    string text = null;
    HtmlNodeCollection rows = doc.DocumentNode.SelectNodes(rowXPath);
    foreach (HtmlNode row in rows)
    {
        text = row.SelectSingleNode(spanXPath).InnerText;
        Console.WriteLine("text: {0}", text);
    }
    return text;
}
\[/code\]
but this method returns "Inner texttext", because InnerText also includes the text of the nested hidden span. I wrote a unit test to demonstrate the problem:
\[code\]
[Test]
public void TestGetInnerTextFromSpan()
{
    var client = new PromtTranslatorClient();
    var doc = new HtmlDocument();
    doc.Load(@"testPage.html");
    var text = client.GetInnerTextFromSpan(doc);
    StringAssert.AreEqualIgnoringCase("Inner text", text);
}
\[/code\]
and this is the result:
\[code\]
  Expected string length 10 but was 14. Strings differ at index 10.
  Expected: "Inner text", ignoring case
  But was:  "Inner texttext"
  ---------------------^
\[/code\]
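One possible approach, sketched here as an assumption and not as a confirmed fix: read only the text nodes that are direct children of the outer span, so the text of any nested element is skipped. The class `SpanTextExample` and the helper `GetOwnText` are hypothetical names introduced for illustration. Note that HtmlAgilityPack does not evaluate CSS, so this skips every nested element, not only those with `display: none`.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class SpanTextExample
{
    // Hypothetical helper: returns only the text that sits directly inside
    // the node, ignoring text that belongs to nested elements.
    public static string GetOwnText(HtmlNode node)
    {
        var ownText = node.ChildNodes
            .Where(child => child.NodeType == HtmlNodeType.Text)
            .Select(child => child.InnerText);
        return string.Concat(ownText).Trim();
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        // Same markup as the test page, reduced to the relevant part.
        doc.LoadHtml(
            "<div class=\"r_tr\">" +
            "<span class=\"r_rs\">Inner text" +
            "<span class=\"otherSpan\" style=\"display: none\">text</span>" +
            "</span></div>");

        HtmlNode row = doc.DocumentNode.SelectSingleNode("//*[@class='r_tr']");
        // The leading "." keeps the search relative to the row node;
        // a bare "//" would search the whole document again.
        HtmlNode span = row.SelectSingleNode(".//*[@class='r_rs']");

        Console.WriteLine(GetOwnText(span));
    }
}
```

With the markup above, the outer span has one text child ("Inner text") and one element child, so the helper should yield "Inner text". If only hidden elements should be excluded, while visible nested elements are kept, the filter would have to inspect each child's `style` attribute instead.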
 