Arlene Andrews

Posted on • Originally published at distort-roving.blogspot.com on

Compare HTML in Playwright .NET

Gray letter blocks spelling out "HTML", offset from each other

Photo by Miguel Á. Padriñán from Pexels

Automating checks of text (and its HTML) can be a challenge if you aren't comfortable with the language and framework you are using. Having limited time for learning and exploring is its own challenge. One of my tasks in automating a smoke test for our project was to verify the text in each environment, with each specific set-up having its own welcoming message. And, I admit, I wasn't quite sure where to start. I knew what I needed, and found out where to fetch it. But getting all the parts together was not as simple as it looked.

I started off needing database access. Then I had to work with our setup to make sure that I could use application settings (which tell the framework where to find information, and work with the configuration settings to keep information private) to connect with the database during a run. I could then put the pieces together: get the information from the database, and compare it with the displayed information for testing.

Manually, it was simple - I should have guessed that this meant automating it wasn't going to be quite as easy as I'd hoped. Looking at the HTML and verifying that things were in the proper layout was straightforward. I understand enough basic HTML that this was a quick check to make sure that the words were correct and go on. I had made this smoke test with many pauses for manual verification for a reason. Between this and the list of things I needed to check, I was going to be encouraged to learn, explore, and ask for help to make this automation successful, and apply it in the other tests.

Once database access (and the proper connection string!) was set up, I was able to store the expected text of the pages in a variable. This was the easy part: I knew what I was doing here and felt accomplished in getting it done. I ran the Playwright test, expecting that the words on the page would match but the HTML components would not. And I was correct - the proper words were there, but the test failed with the HTML added in.

Research time! The first attempt used Page.ContentAsync(), which returns the full HTML of the page, including the doctype. This should allow me to search it, and find the sub-string of the text, correct? As a first idea, it wasn’t too horrid – I had the HTML I was looking for saved, and all I needed to do was walk the whole document contents to locate it. Not efficient, and certainly not good practice! But it should get me the result I needed, and could then be iterated on.

It did not. Finding a sub-string in the entire page was not quickly possible, and I wanted my automation to be fast. After a few dozen attempts to get this to work as I wanted it to, and keeping in mind the business’s rule (if you can’t solve it in 45 minutes of effort, it’s time to ask someone else), I set up a meeting with one of the developers. I know they are busy creating a much-needed update, so the meeting invitation went out with a “we can reschedule if needed” note attached.

While waiting for the meeting, I continued to fuss with it: one of the challenges in getting the area narrowed down was the class of the div – it was not named well, and with Bootstrap, the potential for duplicate divs with the same name had run me into problems on other pages. Talking to someone who had been here much longer than I had, I found out this was ALWAYS the third div on the page.

Now I had a new plan to find it – use the locator’s Nth() method to find the correct div. I’d have loved to solve this, write the post I am now typing up, and be on to the next issue before the meeting. As many of you may know and/or suspect, this is a good trigger for something urgent to come up, and it did. The plans got copied from page to page in the organizer for several days, until it was time for us to pair develop.

Working with this developer is always a treat: we have much in common and respect for each other. As a bonus, they are good at teaching! After a quick review of what the goal was, we went over the attempts I had made to solve this. I’d left the last one, with the errors in the IDE, in hopes that would be helpful. And now to make progress!

Using the debugger, we verified that the HTML was being pulled in correctly. This was one area I had not checked completely – and it thankfully was correct. We agreed that the div name was not very useful – the work they had done recently had created another div with this same name, but on a different page. That was noted but tabled until I got to that point in the test.

Their NUnit skills were needed – the easier way to verify this section was to use NUnit’s Assert.AreEqual, which let the test check that the two strings were equal. Playwright was being stubborn. It wanted locators rather than strings – or the other way around – and those were simply taking too much time to craft. And I was happy to learn this technique – I can see it being useful in the future!

After a few attempts to get Nth() to work, we resorted to using that odd div class – after making sure it was only used this one time on the page. That gave us a starting point – now to figure out how to get the HTML in there (thankfully, this was the only thing in that specific div). A few more false starts, and me finally giving up on the idea that ContentAsync() was going to work for that div, led them to the solution that I had tried and discarded.

InnerHTMLAsync() (Playwright .NET spells it with "HTML" in capitals) gave us the exact contents of the div! Spaces and all. And that was the next stumbling block – and we were out of time for the meeting. They were willing to give me a few more minutes, thankfully, mostly because this was an issue that they had solved before. I just needed the syntax to remove the spaces: Replace(" ", "") if you were curious. And that let the test run, until the next PauseAsync() I’d added for manual verification stopped it for a moment.
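
For anyone who wants to see the pieces in one place, here is a minimal sketch of how they might fit together. It is not our actual test: the fixture name, the welcome-box class, the sample HTML, and the StripSpaces helper are all made up for illustration, and it assumes the Microsoft.Playwright.NUnit package with its PageTest base class. It also uses Assert.That with Is.EqualTo rather than Assert.AreEqual, since newer NUnit versions moved AreEqual to ClassicAssert.

```csharp
using System.Threading.Tasks;
using Microsoft.Playwright.NUnit;
using NUnit.Framework;

[TestFixture]
public class WelcomeMessageTests : PageTest
{
    // Hypothetical helper: the same Replace(" ", "") call, wrapped so that
    // both sides of the comparison are treated identically.
    public static string StripSpaces(string html) => html.Replace(" ", "");

    [Test]
    public async Task WelcomeHtmlMatchesStoredValue()
    {
        // Hypothetical stand-in for the HTML read from the database.
        const string expectedHtml = "<p>Welcome to <strong>Staging</strong></p>";

        // Hypothetical stand-in page, so the sketch does not need a real site.
        await Page.SetContentAsync(
            "<div>Header</div><div>Menu</div>" +
            "<div class=\"welcome-box\">  <p>Welcome to <strong>Staging</strong></p>  </div>");

        // Locate the div by its class (assumed to appear once on the page).
        // A position-based alternative would be Page.Locator("div").Nth(2);
        // Nth() is zero-based, so 2 selects the third match.
        var actualHtml = await Page.Locator("div.welcome-box").InnerHTMLAsync();

        // Compare plain strings after stripping spaces from both of them.
        Assert.That(StripSpaces(actualHtml), Is.EqualTo(StripSpaces(expectedHtml)));
    }
}
```

One thing to keep in mind with this approach: Replace(" ", "") also removes the spaces between words, so it has to be applied to both the expected and the actual string, and it leaves tabs and line breaks alone.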

They went off to lunch, and I spent the next block of time getting notes ready. I had other things to find – and now more of a clue how to go about it.
