DEV Community

Discussion on: Create a Simple Web Scraper in C#

Aaron L Carrick

Follow the link Mathew F. posted, but edit these lines and everything will work!

Regarding:
gist.github.com/CodeCommissions/43...

Change this:

        foreach (var term in QueryTerms)
        {
            articleLink = document.All.Where(x =>
                x.ClassName == "views-field views-field-nothing" &&
                (x.ParentElement.InnerHtml.Contains(term) || x.ParentElement.InnerHtml.Contains(term.ToLower())));

            //Overwriting articleLink above means we have to print its result for all QueryTerms.
            //Appending to a pre-declared IEnumerable (like a List) could mean taking this out of the main loop.
            if (articleLink.Any())
            {
                PrintResults(articleLink);
            }
        }

To this:

        foreach (var term in QueryTerms)
        {
            articleLink = document.All.Where(x =>
                x.ClassName == "views-field views-field-nothing" &&
                (x.ParentElement.InnerHtml.Contains(term) || x.ParentElement.InnerHtml.Contains(term.ToLower()))).Skip(1);

            //Overwriting articleLink above means we have to print its result for all QueryTerms.
            //Appending to a pre-declared IEnumerable (like a List) could mean taking this out of the main loop.
            if (articleLink.Any())
            {
                PrintResults(articleLink);
            }
        }

Take note of the added call:

.Skip(1)

The output was ugly because the first element in the IEnumerable was not filtered properly, so instead of spending a lot of time filtering through that mess, we simply skip the first element :)
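
If you want to see what the Skip(1) is doing in isolation, here is a minimal sketch. It is not the code from the gist: plain strings stand in for the document elements, and the names SkipFirstDemo, FilterAndSkipFirst, and the sample data are made up for illustration. It only shows that Where keeps every match and Skip(1) then drops the first one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipFirstDemo
{
    // Hypothetical helper: strings stand in for the scraped elements.
    // Keeps items containing the term (as given or lower-cased),
    // then drops the first match, mirroring the .Skip(1) fix.
    public static List<string> FilterAndSkipFirst(IEnumerable<string> items, string term)
    {
        return items
            .Where(x => x.Contains(term) || x.Contains(term.ToLower()))
            .Skip(1)   // discard the first match (the badly filtered one)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data; the first match plays the role of the messy element.
        var items = new[]
        {
            "wrapper: Budget Budget speech",
            "Budget speech",
            "Sports news",
            "budget vote"
        };

        foreach (var item in FilterAndSkipFirst(items, "Budget"))
        {
            Console.WriteLine(item);
        }
    }
}
```

One thing to keep in mind: Skip(1) drops the first match no matter what it is, so this only works as long as the first element really is the unwanted one. If there are no matches or only one match, Skip(1) returns an empty sequence rather than throwing, so the articleLink.Any() check still behaves.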