DEV Community

Jordan Hansen
Jordan Hansen

Posted on • Originally published at javascriptwebscrapingguy.com on

Jordan Figures Out Why His Spies Weren’t Spying

Sample code here

Last post I went through all of the code of the link checker to try and productize it. I wanted it to be production ready and that included unit tests. Some of the tests weren’t acting how I expected they should. This post goes through some of the things I learned as I dug deeper into them and got them working.

Separate the function into its own file or module

The biggest problem I was having was with my spies. When spying, you have to spy on the module and then the function, like sinon.spy(moduleName, 'function/methodName'). I originally had a lot of my functions in the same file and this caused some problems.

For example, my checkLinks() function calls domainCheck(). Because both of these functions were in the same file and I needed a module, I simply did import * as findDeadLinksFunctions from './../findDeadLinks; to get a module and then would spy with sinon.spy(findDeadLinksFunctions, 'domainCheck');. The spy would never be called. The reason is because it was acting almost as if it was spying on something different.

The solution was to export this, and many other functions, to their own files. I put functions that didn’t call each other into a helpers.ts file and then spied like this:

import * as helpers from './../helpers';

...

    it('should call domainChecK()', async () => {
        const originalLinkObject: helpers.ILinkObject = {
            link: 'https://javascriptwebscrapingguy.com/jordan-takes-advantage-of-multithreaded-i-o-in-nodejs/',
            status: null,
            locationOfLink: 'https://javascriptwebscrapingguy.com'
        };
        const originalLinks = [];
        const domain = 'https://javascriptwebscrapingguy.com';
        const desiredIOThreads = 4;

        nock('https://javascriptwebscrapingguy.com').get('/jordan-takes-advantage-of-multithreaded-i-o-in-nodejs/').reply(200, '<button>click me</button>');

        domainCheckSpy = sinon.spy(helpers, 'domainCheck');

        await checkLinkFunction.checkLink(originalLinkObject, originalLinks, domain, desiredIOThreads);

        expect(domainCheckSpy.callCount).to.equal(1);

    });

It should be noted that I am still able to import domain check directly from helpers.ts inside the actual checkLink() function, like below. So as long as it is in its own module (or file that acts as a module in this case), it works great.

import { domainCheck, ILinkObject, getLinks } from './helpers';
...
    if (newDomain) {
        if (html && domainCheck(linkObject.link, domain, newDomain)) {
            newLinks = await getLinks(html, domain, linkObject.link, false);
        }
    }

Restoring stubs vs restoring spies

For some reason, I had to restore my stubs inside an afterEach. Originally, I would do something like this:

domainCheckSpy = sinon.spy(helpers, 'domainCheck');
getLinksStub = sinon.stub(helpers, 'getLinks');

// some test stuff

domainCheckSpy.restore();
getLinksStub.restore();

This worked great for spies. If I tried to do it with a stub, the function would never be restored and wherever getLinks was used it would return undefined like this stub was causing it to do.

If I did it within an afterEach it worked without a problem. I ended up doing this below. I have the conditional in place because not every function used the spy or stub.

describe('checkLink()', () => {
    let domainCheckStub;
    let domainCheckSpy;
    let getLinksSpy;
    let getLinksStub;
    let checkLinkSpy;

    afterEach(() => {
        if (domainCheckStub) {
            domainCheckStub.restore();
        }
        if (domainCheckSpy) {
            domainCheckSpy.restore();
        }
        if (getLinksSpy) {
            getLinksSpy.restore();
        }
        if (getLinksStub) {
            getLinksStub.restore();
        }
        if (checkLinkSpy) {
            checkLinkSpy.restore();
        }
    });
...

Testing recursive functions

checkLink() calls itself. Sometimes a lot. I wanted a way to be able to test that it was calling itself as often or as little as it should. In my test I imported it with import * as checkLinkFunction from "../checkLink"; and called it like promises.push(checkLink(linkToCheck, links, domain, desiredIOThreads));. When I expected it to call itself three times, including two of which would be recursive calls, it only called itself the original time.

This stackoverflow post was incredibly helpful. I just had to import the function from itself as its own module and call it recursively that way and then it worked just perfectly.

import * as checkLinkFunction from './checkLink';
...

            // Have to call the imported function so tests work: https://stackoverflow.com/a/51604652/2287595
            promises.push(checkLinkFunction.checkLink(linkToCheck, links, domain, desiredIOThreads));

Setting up tests found a big bug

This was pretty awesome. I had a big bug in my code that I had no idea was happening. The code appeared to be working and I may never have caught the bug. The test I was using that found the bug was this one in findDeadLinks.spec.ts.

it('should return the number of bad links (if one 404 and one 200, one bad link)', async () => {
        const returnLinks: helpers.ILinkObject[] = [
            { link: 'https://heyAnotherBuddy.com', status: null, locationOfLink: 'https://javascriptwebscrapingguy.com' },
            { link: 'https://heyBuddy.com', status: null, locationOfLink: 'https://javascriptwebscrapingguy.com' }
        ];
        getLinksStub = sinon.stub(helpers, 'getLinks').returns(Promise.resolve(returnLinks));

        nock(domainToSend).get('/').reply(200);
        nock("https://heyBuddy.com").get('/').reply(200);
        nock("https://heyAnotherBuddy.com").get('/').reply(400);

        const links = await findDeadLinks(domainToSend, desiredIOThreadsToSend);

        expect(links.length).to.equal(1);

    });

I had two links in my array and I expected it to return like I showed there. The links it returned should just be one since we are only returning bad links and there is only one with a status of 400 but instead it was returning 0 bad links.

Here’s the culprit:

let linkToReplaceIndex = links.findIndex(linkObject => linkObject.link === linkObject.link);
    links[linkToReplaceIndex] = linkObject;

See the issue? I didn’t. Not for a long time. I kept messing with this trying to figure out what was happening. If you look close, you can see the problem. linkObject => linkObject.link === linkObject.link. I’m checking it against itself so it’s going to return true at index 0 every time. It was always replacing the link at index 0.

In my case, I had heyAnotherBuddy.com at the first place and heyBuddy.com at the second spot. It would go through the first iteration and work great. Then on the second iteration, it would replace heyAnotherBuddy.com withheyBuddy.com and its status was 200.

The big thing that made it hard for me to find was how was heyBuddy.com‘s status was getting updated. It was never at index 0 but somehow its status got updated. The link I was passing to my checkLink function was still referenced to the one in the original links array. Updating its status automatically updated it in the array of links. So, I just ripped out the linkToReplaceIndex piece and it all worked perfect.

Conclusion

I learned a lot more about testing. I caught a big bug. And…I have a pretty impure function. checkLink definitely affects things outside of its function. I don’t love this. It’s something I’ll have to think about more and find a better way to do it.

Overall, good day. Got a lot of good stuff done.

The post Jordan Figures Out Why His Spies Weren’t Spying appeared first on JavaScript Web Scraping Guy.

Oldest comments (0)