Roman Dvornov

Posted on Sep 16, 2019 • Edited on Sep 25, 2019

Discovery.js tutorials: quick start

#discoveryjs #tutorial #node #javascript

This and the following tutorials will guide you through the process of building a solution based on discoveryjs projects. Our goal is an npm dependencies inspector, i.e. an interface for exploring the structure of node_modules.

Note: Discoveryjs is at an early stage, so some things may change to become simpler and more useful. If you have an idea how to make something better, let us know in the discoveryjs issues.

TL;DR

Below you will find an overview of discoveryjs key concepts. You can explore all the sources from the tutorial combined in a repo on GitHub or try how it works online.

Prerequisites

Before we start, we need a project to analyze. It may be a new project or an existing one; the only requirement is that it has a node_modules directory inside (the subject of our analysis).

As a first step, we need to install the discoveryjs view and CLI tools:

npm install @discoveryjs/discovery @discoveryjs/cli

As our next step, we need to launch the discovery server:

> npx discovery
No config is used
  Models are not defined (model free mode is enabled)
Init common routes ... OK
Server listen on http://localhost:8123

And open http://localhost:8123 in a browser to see something:

That's model-free mode, where nothing is pre-configured. You can choose any JSON file via the "Load data" button or drop it right on the page, and start exploring it.

However, we need something specific: the structure of node_modules. Let's add some configuration.

Add a configuration

As you might have noticed, there was a message "No config is used" when we first launched the server. So let's create a config file named .discoveryrc.js with the following content:

module.exports = {
    name: 'Node modules structure',
    data() {
        return { hello: 'world' };
    }
};

Note: If you create the config file in the current working directory (i.e. in the root of the project), no additional action is needed. Otherwise, you need to pass a path to the config file with the --config option, or specify it in package.json this way:

{
   ...
   "discovery": "path/to/discovery/config.js",
   ...
}

OK, let's restart the server to apply the config:

> npx discovery
Load config from .discoveryrc.js
Init single model
  default
    Define default routes ... OK
    Cache: DISABLED
Init common routes ... OK
Server listen on http://localhost:8123

As you can see, the config file that we created is being used now. And there is a default model, which we defined (discovery can run in multi-model mode; we'll cover this approach in later tutorials). Let's see what we get in the browser:

What we see here:

name is used as the header of the page;
the result of the data method invocation is displayed as the main content of the page.

Note: the data method must return either the data itself or a promise that resolves to the data.

Our basic setup is ready, so now we can move on to the next step.

Context

Before moving forward, let's look at the report page (click "Make report" to open it):

At first glance, it's the same as the index page... But here we can change everything! For example, we can recreate the index page, that's easy:

Notice how the header is defined: "h1:#.name". That's a level 1 header with #.name as its content, which is a Jora query. # refers to the context of the query. To see what it contains, just enter # in the query editor and use the default view:

So now you know where to get the current page ID, its params and other stuff.

Collecting data

Back to our project. So far we have used mock data, but we need to collect real data. So we should create a module and change the data value in the config (btw, you don't need to restart the server after such changes):

module.exports = {
    name: 'Node modules structure',
    data: require('./collect-node-modules-data')
};

The source of collect-node-modules-data.js:

const path = require('path');
const scanFs = require('@discoveryjs/scan-fs');

module.exports = function() {
    const packages = [];

    return scanFs({
        include: ['node_modules'],
        rules: [{
            test: /\/package\.json$/,
            extract: (file, content) => {
                const pkg = JSON.parse(content);

                if (pkg.name && pkg.version) {
                    packages.push({
                        name: pkg.name,
                        version: pkg.version,
                        path: path.dirname(file.filename),
                        dependencies: pkg.dependencies
                    });
                }
            }
        }]
    }).then(() => packages);
};

I used the @discoveryjs/scan-fs package, which simplifies file system scanning by defining rules. A usage example can be found in the package's readme, so I took it as a basis and reworked it as needed. And now we have some info about the node_modules content:
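For readers curious what such a scan amounts to, here is a rough equivalent that uses only Node's built-in fs and path modules. It is a sketch under assumptions: collectPackages is a hypothetical helper, it works synchronously, and it is not how @discoveryjs/scan-fs is implemented.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of @discoveryjs/scan-fs): walks `dir`
// recursively and collects basic info from every package.json it finds.
function collectPackages(dir, packages = []) {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
        const filename = path.join(dir, entry.name);

        if (entry.isDirectory()) {
            // symlinked directories are reported as symlinks, so they are not followed
            collectPackages(filename, packages);
        } else if (entry.isFile() && entry.name === 'package.json') {
            let pkg;

            try {
                pkg = JSON.parse(fs.readFileSync(filename, 'utf8'));
            } catch (e) {
                continue; // ignore unreadable or malformed files
            }

            // same condition as in the rule above: only named, versioned packages
            if (pkg && pkg.name && pkg.version) {
                packages.push({
                    name: pkg.name,
                    version: pkg.version,
                    path: path.dirname(filename),
                    dependencies: pkg.dependencies
                });
            }
        }
    }

    return packages;
}
```

Calling collectPackages('node_modules') from the project root should give a list in the same shape as the one produced by the rule-based version.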

Much better! Although it's just JSON, we can dig into it and gain some insights. For example, using the signature popup we can find out the number of packages and how many of them have more than one physical instance (due to different versions or problems with package deduplication).

Although we have got some data, we need more details. For example, it's good to know which physical instance each dependency resolves to. However, improving data fetching is out of scope for this tutorial. So we just take @discoveryjs/node-modules (which is also built on @discoveryjs/scan-fs) and get most of the details about packages with ease. collect-node-modules-data.js simplifies dramatically:

const fetchNodeModules = require('@discoveryjs/node-modules');

module.exports = function() {
    return fetchNodeModules();
};

And the data about node_modules now looks like this:

Preparation script

As you may have noticed, some packages contain deps – a list of dependencies. Each dependency has a resolved field, whose value is a reference to a package's physical instance. The reference is the path value of one of the packages, since each path value is unique. To resolve a reference to a package, we need additional code (e.g. #.data.pick(<path=resolved>)). It would certainly be more convenient to have such references already resolved.

Unfortunately, we can't resolve references at the data collecting stage, as this would lead to circular references and data duplication, and would make data transfer problematic. Nevertheless, there is a solution for this – a special script called prepare. That script is defined in the config and is invoked for any new data for a discovery instance. Let's start with the config:

module.exports = {
    ...
    prepare: __dirname + '/prepare.js', // Note: value is a path to a module
    ...
};

And then define a prepare.js:

discovery.setPrepare(function(data) {
    // do something with data or/and with discovery instance
});

In this module we specified a prepare function for the discovery instance. This function is invoked every time before data is applied to the discovery instance. That's a good place to resolve references:

discovery.setPrepare(function(data) {
    const packageIndex = data.reduce((map, pkg) => map.set(pkg.path, pkg), new Map());

    data.forEach(pkg =>
        pkg.deps.forEach(dep =>
            dep.resolved = packageIndex.get(dep.resolved)
        )
    );
});

Here we create a package index, where the key is a package's path value (which is unique). After that we go through all packages and each of their dependencies, and replace the resolved value with a reference to a package. Here is the result:
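The effect can also be checked outside of discovery. The sketch below runs the same logic on a small hand-made data set (the sample packages and the resolveReferences name are assumptions for illustration). It also shows why references can't be resolved before the data is transferred: once resolved, the data becomes circular and can no longer be serialized to JSON.

```javascript
// Hand-made sample in the shape described above: `resolved` holds a path
const data = [
    { name: 'a', version: '1.0.0', path: 'node_modules/a', deps: [
        { name: 'b', resolved: 'node_modules/b' }
    ] },
    { name: 'b', version: '2.0.0', path: 'node_modules/b', deps: [
        { name: 'a', resolved: 'node_modules/a' }
    ] }
];

// Same logic as the body of the prepare function
function resolveReferences(data) {
    const packageIndex = data.reduce((map, pkg) => map.set(pkg.path, pkg), new Map());

    data.forEach(pkg =>
        pkg.deps.forEach(dep =>
            dep.resolved = packageIndex.get(dep.resolved)
        )
    );
}

// Before: plain data that serializes without problems
const serializedLength = JSON.stringify(data).length;

resolveReferences(data);

// After: dependencies point to package objects, so the graph can be walked directly
const backToA = data[0].deps[0].resolved.deps[0].resolved; // a -> b -> a

// ...but `a` and `b` now reference each other, so serialization fails
let serializeError = null;
try {
    JSON.stringify(data);
} catch (e) {
    serializeError = e;
}
```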

It's much easier to query the dependency graph now. Here's how to get a dependency cluster (dependencies, dependencies of dependencies, etc.) for a specific package:
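In plain JavaScript, the same traversal could be sketched as follows. This is not a Jora query; dependencyCluster is a hypothetical helper that assumes references are already resolved to package objects by the prepare script.

```javascript
// Hypothetical helper: collects everything reachable from `pkg` through
// deps[].resolved, assuming references are already resolved to package objects
function dependencyCluster(pkg) {
    const visited = new Set();
    const pending = [pkg];

    while (pending.length > 0) {
        const current = pending.pop();

        for (const dep of current.deps || []) {
            const target = dep.resolved;

            // skip unresolved dependencies and packages we have already seen
            if (target && !visited.has(target)) {
                visited.add(target);
                pending.push(target);
            }
        }
    }

    visited.delete(pkg); // a cycle may lead back to the starting package
    return [...visited];
}

// Sample graph: app -> a -> b -> a (a cycle), c is unrelated
const a = { name: 'a', deps: [] };
const b = { name: 'b', deps: [{ resolved: a }] };
const c = { name: 'c', deps: [] };
const app = { name: 'app', deps: [{ resolved: a }] };
a.deps.push({ resolved: b });

const cluster = dependencyCluster(app).map(pkg => pkg.name).sort();
```

The visited set is what keeps the walk finite when packages depend on each other in a cycle.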

Unexpected success story: while exploring the tutorial's data I found a problem in @discoveryjs/cli (with the query .[deps.[not resolved]]): it had a typo in a peer dependency reference. This was fixed immediately. It's a case in point.

I suppose it's a good time to show some numbers and the packages with duplicates on the index page.
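For comparison, a rough plain JavaScript counterpart of the query .[deps.[not resolved]] might look like this. The helper name and the sample data are assumptions, and JavaScript truthiness differs from Jora's in edge cases such as empty objects.

```javascript
// Hypothetical helper mirroring .[deps.[not resolved]]: keep packages
// that have at least one dependency without a resolved target
function withUnresolvedDeps(packages) {
    return packages.filter(pkg =>
        (pkg.deps || []).some(dep => !dep.resolved)
    );
}

const packages = [
    { name: 'ok', deps: [{ name: 'x', resolved: { name: 'x' } }] },
    { name: 'broken', deps: [{ name: 'typo-pkg', resolved: undefined }] },
    { name: 'leaf', deps: [] }
];

const broken = withUnresolvedDeps(packages).map(pkg => pkg.name);
```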

Setup default page

First of all, we need to create a page module, e.g. pages/default.js. The default slug is used since the index page has that slug and we can override it (most things in discoveryjs can be overridden). We might start with something simple, like this:

discovery.page.define('default', [
    'h1:#.name',
    'text:"Hello world!"'
]);

Now we need to link the module in the config:

module.exports = {
    name: 'Node modules structure',
    data: require('./collect-node-modules-data'),
    view: {
        assets: [
            'pages/default.js'  // a reference to page's module
        ]
    }
};

Checking up in a browser:

It works!

Let's show some counters, by changing pages/default.js this way:

discovery.page.define('default', [
    'h1:#.name',
    {
        view: 'inline-list',
        item: 'indicator',
        data: `[
            { label: 'Package entries', value: size() },
            { label: 'Unique packages', value: name.size() },
            { label: 'Dup packages', value: group(<name>).[value.size() > 1].size() }
        ]`
    }
]);

Here we define an inline list of indicators. The data value is a Jora query that produces an array of entries. The package list is used as the data source (the data root), so we get the list length (size()), the number of unique names (name.size()) and the number of groups by name that have more than a single member (group(<name>).[value.size() > 1].size()).
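To make the three queries more tangible, here is what they compute, sketched in plain JavaScript on hand-made data. The sample packages and the groupByName helper are assumptions for illustration; Jora does this work for you.

```javascript
const packages = [
    { name: 'a', version: '1.0.0', path: 'node_modules/a' },
    { name: 'b', version: '1.0.0', path: 'node_modules/b' },
    { name: 'b', version: '2.0.0', path: 'node_modules/a/node_modules/b' },
    { name: 'c', version: '0.1.0', path: 'node_modules/c' }
];

// size(): the length of the list
const packageEntries = packages.length;

// name.size(): the number of unique names
const uniquePackages = new Set(packages.map(pkg => pkg.name)).size;

// group(<name>): one { key, value } entry per distinct name,
// where value is the list of packages with that name
function groupByName(list) {
    const groups = new Map();

    for (const pkg of list) {
        if (!groups.has(pkg.name)) {
            groups.set(pkg.name, []);
        }
        groups.get(pkg.name).push(pkg);
    }

    return [...groups].map(([key, value]) => ({ key, value }));
}

// group(<name>).[value.size() > 1].size(): names with more than one instance
const dupPackages = groupByName(packages)
    .filter(group => group.value.length > 1)
    .length;
```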

Not bad. However, it would be better to have a link to selected entries besides numbers:

discovery.page.define('default', [
    'h1:#.name',
    {
        view: 'inline-list',
        data: [
            { label: 'Package entries', value: '' },
            { label: 'Unique packages', value: 'name' },
            { label: 'Dup packages', value: 'group(<name>).[value.size() > 1]' }
        ],
        item: `indicator:{
            label,
            value: value.query(#.data, #).size(),
            href: pageLink('report', { query: value, title: label })
        }`
    }
]);

First of all, the data value was changed: now it's a regular array with a few objects. In addition, the size() method was removed from each value query.

Also, a subquery was added to the indicator view. Such a query produces a new object, where the value and href property values are computed. For value, it performs a query using the query() method, passing data to it from the context, and then applies the size() method to the query result. For href, it uses the pageLink() method to generate a link to the report page with a specific query and title. After those changes the indicators became clickable (notice that their values became blue) and much more functional.

To make the index page a bit more useful, let's add a table with duplicated packages.

discovery.page.define('default', [
    // ... the same as before

    'h2:"Packages with more than one physical instance"',
    {
        view: 'table',
        data: `
            group(<name>)
            .[value.size() > 1]
            .sort(<value.size()>)
            .reverse()
        `,
        cols: [
            { header: 'Name', content: 'text:key' },
            { header: 'Version & Location', content: {
                view: 'list',
                data: 'value.sort(<version>)',
                item: [
                    'badge:version',
                    'text:path'
                ]
            } }
        ]
    }
]);

The same data as for the Dup packages indicator is used for the table. Additionally, the package list is sorted by group size in reverse order. The rest of the setup is for columns (btw, often you don't need to set them up). For the Version & Location column we defined a nested list (sorted by version), where each item is a pair of a version badge and a path to the instance.

A package page

Currently we have only an overall view of the packages. It might be useful to have a page for a specific package. To achieve this we need to create a new module pages/package.js and define a new page:

discovery.page.define('package', {
    view: 'context',
    data: `{
        name: #.id,
        instances: .[name = #.id]
    }`,
    content: [
        'h1:name',
        'table:instances'
    ]
});

In this module we define a page with the slug package. The context view is used as the root view; it's a non-visual view which helps define common data for nested views. Notice that we use #.id to get the package name, which comes from the URL, i.e. http://localhost:8123/#package:{id}.

Don't forget to include the new module in the config:
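As a sketch of what the page's data query evaluates to, the hypothetical helper below builds the same object in plain JavaScript, with the package list and the page ID passed in explicitly. The sample data is made up for illustration.

```javascript
// Hypothetical helper: plain JavaScript counterpart of
// { name: #.id, instances: .[name = #.id] }
function packagePageData(packages, id) {
    return {
        name: id,
        instances: packages.filter(pkg => pkg.name === id)
    };
}

const packages = [
    { name: 'a', version: '1.0.0', path: 'node_modules/a' },
    { name: 'b', version: '1.0.0', path: 'node_modules/b' },
    { name: 'b', version: '2.0.0', path: 'node_modules/a/node_modules/b' }
];

// For the page ID "b", both physical instances end up in the table
const pageData = packagePageData(packages, 'b');
```

Nested views ('h1:name' and 'table:instances') then read their values from this object instead of the full package list.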

module.exports = {
    ...
    view: {
        assets: [
            'pages/default.js',
            'pages/package.js'  // here you go
        ]
    }
};

And here is the result in a browser:

It's not so impressive, but OK for now. More complex views will be created in the next tutorials.

Sidebar

Now that we have a package page, it would be nice to have a list of all the packages. We can define a special sidebar view for this, which is rendered when defined (it is not defined by default). Let's create a new module views/sidebar.js:

discovery.view.define('sidebar', {
    view: 'list',
    data: 'name.sort()',
    item: 'link:{ text: $, href: pageLink("package") }'
});

Now we have a sidebar with all the packages:

Looks good. But with a filter it might be much more user-friendly. Let's extend the sidebar definition:

discovery.view.define('sidebar', {
    view: 'content-filter',
    content: {
        view: 'list',
        data: 'name.[no #.filter or $~=#.filter].sort()',
        item: {
            view: 'link',
            data: '{ text: $, href: pageLink("package"), match: #.filter }',
            content: 'text-match'
        }
    }
});

Here we wrapped the list into a content-filter view, which provides the input value converted to a RegExp (or null when empty) as the filter value in the context (the name may be changed via the name option). We also used #.filter to filter the data for the list. Finally, the link view definition was extended to highlight matching parts using the text-match view. And here is the result:

In case you dislike the default style of something, you may tweak styles as you want. Suppose you want to change the sidebar width. Then you need to create a style file (views/sidebar.css would be a good choice):
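The filtering step can be sketched in plain JavaScript like this. Both helpers are hypothetical, and the way toFilter builds the pattern (case-insensitive, special characters escaped) is an assumption; the content-filter view may convert the input differently.

```javascript
// Assumption: the pattern is case-insensitive and special characters in the
// input are escaped; the real content-filter view may behave differently
function toFilter(input) {
    if (!input) {
        return null; // empty input means "no filter"
    }

    return new RegExp(input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'i');
}

// Plain JavaScript counterpart of name.[no #.filter or $~=#.filter].sort()
function sidebarNames(packages, filter) {
    const uniqueNames = [...new Set(packages.map(pkg => pkg.name))];

    return uniqueNames
        .filter(name => !filter || filter.test(name))
        .sort();
}

const packages = [
    { name: 'jora' },
    { name: '@discoveryjs/discovery' },
    { name: '@discoveryjs/cli' },
    { name: 'jora' } // a second physical instance of the same package
];

const all = sidebarNames(packages, toFilter(''));
const filtered = sidebarNames(packages, toFilter('DISC'));
```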

.discovery-sidebar {
    width: 300px;
}

And include a reference to this file in the config, just like with JavaScript modules:

module.exports = {
    ...
    view: {
        assets: [
            ...
            'views/sidebar.css',  // you may specify *.css files in assets too
            'views/sidebar.js'
        ]
    }
};

Auto linking

The last chapter of this tutorial is about links. As you can see above, we made a link to a package page via the pageLink() method. Besides that, we need to specify the link text as well. But how about making it a bit simpler?

To simplify linking, we need to define a link resolver. A good place for this is prepare script:

discovery.setPrepare(function(data) {
    ...

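    // named differently from the path index defined earlier,
    // so both can live in the same prepare function
    const packageRefs = data.reduce(
        (map, item) => map
            .set(item, item)        // key is item itself
            .set(item.name, item),  // and `name` value
        new Map()
    );
    discovery.addEntityResolver(value => {
        // guard against null/undefined before reading `name`
        value = packageRefs.get(value) || (value && packageRefs.get(value.name));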

        if (value) {
            return {
                type: 'package',
                id: value.name,
                name: value.name
            };
        }
    });
});

We added a new map (an index) for packages here and used it for an entity resolver. The entity resolver attempts to translate the passed value into a package descriptor when possible. A package descriptor contains:

type – slug of instance type
id – unique reference to an instance, used as page ID in links
name – used as captions for links
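To see which values such a resolver accepts, here is a standalone sketch that mirrors its logic on hand-made data. The sample packages and the resolvePackage name are assumptions, and a guard for null values is included.

```javascript
const packages = [
    { name: 'foo', version: '1.0.0' },
    { name: 'bar', version: '2.0.0' }
];

// Index with two kinds of keys: the package object itself and its name
const packageRefs = packages.reduce(
    (map, item) => map
        .set(item, item)
        .set(item.name, item),
    new Map()
);

// Mirrors the resolver logic as a standalone function
function resolvePackage(value) {
    const pkg = packageRefs.get(value) || (value && packageRefs.get(value.name));

    if (pkg) {
        return {
            type: 'package',
            id: pkg.name,
            name: pkg.name
        };
    }
    // anything else is not a package: undefined is returned
}

const byObject = resolvePackage(packages[0]);      // a package object
const byName = resolvePackage('bar');              // a package name
const byShape = resolvePackage({ name: 'foo' });   // any object with a known name
const unknown = resolvePackage('something else');  // undefined
```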

As the last step, we need to attach this type to a certain page (a link should lead somewhere, shouldn't it?).

discovery.page.define('package', {
    ...
}, {
    resolveLink: 'package'  // link `package` entities to this page
});

The first effect of those changes is that some values in the struct view are now marked with a badge link to the package page:

And now you may apply the auto-link view to a package object or name:

As an example, the sidebar can be slightly refactored:

    // before
        item: {
            view: 'link',
            data: '{ text: $, href: pageLink("package"), match: #.filter }',
            content: 'text-match'
        },

    // with `auto-link`
        item: {
            view: 'auto-link',
            content: 'text-match:{ text, match: #.filter }'
        }

Conclusion

Now you have a basic knowledge of discoveryjs key concepts. The next tutorials will continue to guide you through these topics in more depth.

You can explore all the sources from the tutorial combined in a repo on GitHub or try how it works online.

Follow @js_discovery on Twitter and stay tuned!

Top comments (1)

Eugene Karataev • Sep 16 '19

This is a great tool to analyze JSON data.
TIL that it's not only for analysis, but it's possible to extract and transform data as well! Custom views to display your data!
I guess it may take some time to feel comfortable with discovery.js, but the potential is really impressive.