This and the following tutorials will guide you through the process of building a solution based on discoveryjs projects. Our goal is an NPM dependency inspector, i.e. an interface for exploring the structure of node_modules.
Note: Discoveryjs is at an early stage, so some things may change to become more useful and simpler. If you have an idea how to make something better, let us know in the discoveryjs issues.
TL;DR
Below you will find an overview of discoveryjs key concepts. You can explore all the sources from the tutorial combined in a repo on GitHub or try how it works online.
Prerequisites
Before we start, we need a project to analyze. It may be a new project or an existing one; the only requirement is that it has a node_modules directory inside (the subject of our analysis).
As a first step, we need to install the discoveryjs view and CLI tools:
npm install @discoveryjs/discovery @discoveryjs/cli
As our next step, we need to launch the discovery server:
> npx discovery
No config is used
Models are not defined (model free mode is enabled)
Init common routes ... OK
Server listen on http://localhost:8123
Then open http://localhost:8123 in a browser to see something:
This is model-free mode, where nothing is pre-configured. You can choose any JSON file via the "Load data" button or drop it right onto the page, and start exploring it.
However, we need something specific: the structure of node_modules. Let's add some configuration.
Add a configuration
As you might have noticed, there was a message No config is used when we first launched the server. So let's create a config file named .discoveryrc.js with the following content:
module.exports = {
    name: 'Node modules structure',
    data() {
        return { hello: 'world' };
    }
};
Note: If you create the config file in the current working directory (i.e. in the root of the project), no additional action is needed. Otherwise, you need to pass the path to the config file with the --config option, or specify it in package.json this way:
{
    ...
    "discovery": "path/to/discovery/config.js",
    ...
}
Ok, let's restart the server to apply a config:
> npx discovery
Load config from .discoveryrc.js
Init single model
default
Define default routes ... OK
Cache: DISABLED
Init common routes ... OK
Server listen on http://localhost:8123
As you can see, the config file we created is now being used. There is also a default model, which we defined (discovery can run in multi-model mode; we'll cover this approach in later tutorials). Let's see what we get in the browser:
What we see here:
- name is used as the header of the page;
- the result of the data method invocation is displayed as the main content of the page.
Note: the data method must return either data or a promise that resolves to data.
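As a minimal sketch of that note, the config below returns a promise from data. The artificial delay and the loadData wrapper are hypothetical and exist only for illustration; they are not part of discoveryjs.

```javascript
// A stand-in for the content of .discoveryrc.js (no discoveryjs APIs are used here)
const config = {
    name: 'Node modules structure',
    // `data` may be async: the promise it returns resolves to the data
    async data() {
        // hypothetical delay that imitates slow data collection
        await new Promise(resolve => setTimeout(resolve, 10));
        return { hello: 'world' };
    }
};

// Hypothetical consumer: Promise.resolve() accepts both plain data and a promise,
// so a synchronous and an asynchronous `data` method can be handled the same way
function loadData(config) {
    return Promise.resolve(config.data());
}

loadData(config).then(data => console.log(data)); // prints { hello: 'world' }
```

Wrapping the call in Promise.resolve() is one simple way for a consumer to stay agnostic about whether data is synchronous.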
Our basic setup is ready, so now we can move on to the next step.
Context
Before moving forward, let's look at the report page (click Make report to open it):
At first glance, it's the same as the index page... But we can change everything! For example, we can recreate the index page, that's easy:
Notice how the header is defined: "h1:#.name". That's a level 1 header with #.name as its content, which is a Jora query. # refers to the context of the query. To see what it contains, just enter # in the query editor and use the default view:
So now you know where to get the current page ID, its params and other stuff.
Collecting data
Back to our project. So far we have used mock data, but we need to collect real data. So we should create a module and change the data value in the config (btw, you don't need to restart the server after such changes):
module.exports = {
    name: 'Node modules structure',
    data: require('./collect-node-modules-data')
};
The source of collect-node-modules-data.js:
const path = require('path');
const scanFs = require('@discoveryjs/scan-fs');

module.exports = function() {
    const packages = [];

    return scanFs({
        include: ['node_modules'],
        rules: [{
            test: /\/package\.json$/, // the dot is escaped to match a literal "."
            extract: (file, content) => {
                const pkg = JSON.parse(content);

                if (pkg.name && pkg.version) {
                    packages.push({
                        name: pkg.name,
                        version: pkg.version,
                        path: path.dirname(file.filename),
                        dependencies: pkg.dependencies
                    });
                }
            }
        }]
    }).then(() => packages);
};
I used the @discoveryjs/scan-fs package, which simplifies file system scanning by defining rules. A usage example can be found in the package's readme, so I took it as a basis and reworked it as needed. And now we have some info about the node_modules content:
Much better! Although it's just JSON, we can dig into it and gain some insights. For example, using the signature popup we can find out the number of packages and how many of them have more than one physical instance (due to different versions or problems with package deduplication).
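To make the collection step less of a black box, here is a dependency-free sketch of the same idea that uses only Node's fs and path modules. It is not how @discoveryjs/scan-fs works internally; collectPackages is a hypothetical helper written for this illustration, and it skips malformed package.json files, which the collector above does not do.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: recursively walk `dir` and collect a record
// for every package.json that has both `name` and `version`
function collectPackages(dir, packages = []) {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
        const fullPath = path.join(dir, entry.name);

        if (entry.isDirectory()) {
            // symlinks are not reported as directories here, so they are not followed
            collectPackages(fullPath, packages);
        } else if (entry.name === 'package.json') {
            let pkg;

            try {
                pkg = JSON.parse(fs.readFileSync(fullPath, 'utf8'));
            } catch (e) {
                continue; // skip unreadable or malformed files
            }

            if (pkg.name && pkg.version) {
                packages.push({
                    name: pkg.name,
                    version: pkg.version,
                    path: path.dirname(fullPath),
                    dependencies: pkg.dependencies
                });
            }
        }
    }

    return packages;
}

// Usage: only scan when node_modules exists in the current directory
if (fs.existsSync('node_modules')) {
    console.log(collectPackages('node_modules').length);
}
```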
Although we have got some data, we need more details. For example, it's good to know which physical instance each dependency resolves to. However, improving data fetching is out of the scope of this tutorial. So we just take @discoveryjs/node-modules (which is also built on @discoveryjs/scan-fs) and get most of the details about packages with ease. collect-node-modules-data.js simplifies dramatically:
const fetchNodeModules = require('@discoveryjs/node-modules');

module.exports = function() {
    return fetchNodeModules();
};
And the data about node_modules now looks like this:
Preparation script
As you may have noticed, some packages contain deps, a list of dependencies. Each dependency has a resolved field, whose value is a reference to a physical instance of a package. The reference is the path value of one of the packages, since each path value is unique. To resolve a reference to a package, we need to use additional code (e.g. #.data.pick(<path=resolved>)). But of course, it would be much more convenient to have such references already resolved.
Unfortunately, we can't resolve references at the data collecting stage, as this would lead to circular references and data duplication, and would make data transfer problematic. Nevertheless, there is a solution for this: a special script called prepare. This script is defined in the config and is invoked for any new data for a discovery instance. Let's start with the config:
module.exports = {
    ...
    prepare: __dirname + '/prepare.js', // Note: value is a path to a module
    ...
};
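To see why already-resolved references are a problem for data transfer, here is a small sketch with two hypothetical packages that depend on each other. While resolved holds a path string, the data serializes fine; once it holds the package object itself, JSON.stringify() fails on the cycle.

```javascript
// Two hypothetical packages that depend on each other
const a = { name: 'a', path: 'node_modules/a', deps: [] };
const b = { name: 'b', path: 'node_modules/b', deps: [] };

// Unresolved form: `resolved` is just a path string, so the data is plain JSON
a.deps.push({ name: 'b', resolved: b.path });
b.deps.push({ name: 'a', resolved: a.path });
const asPaths = JSON.stringify([a, b]); // works

// Resolved form: `resolved` points at the package object itself
a.deps[0].resolved = b;
b.deps[0].resolved = a;

let error = null;
try {
    JSON.stringify([a, b]);
} catch (e) {
    error = e; // a circular structure cannot be converted to JSON
}

console.log(error instanceof TypeError); // prints true
```

This is why the references are resolved only after the data has been delivered, in the prepare script.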
And then define a prepare.js:
discovery.setPrepare(function(data) {
    // do something with data or/and with discovery instance
});
In this module we specified the prepare function for a discovery instance. This function is invoked every time before data is applied to the discovery instance. That's a good place to resolve references:
discovery.setPrepare(function(data) {
    const packageIndex = data.reduce((map, pkg) => map.set(pkg.path, pkg), new Map());

    data.forEach(pkg =>
        pkg.deps.forEach(dep =>
            dep.resolved = packageIndex.get(dep.resolved)
        )
    );
});
Here we create a package index, where the key is a package's path value (which is unique). After that we go through all packages and each dependency, and replace the resolved value with a reference to a package. Here's the result:
It's much easier to query the dependency graph now. Here's how to get a dependency cluster (dependencies, dependencies of dependencies, etc.) for a specific package:
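In plain JavaScript, the same traversal can be sketched as follows. This is not a Jora query, only an illustration of what a dependency cluster means once resolved holds object references; the sample packages and the dependencyCluster helper are hypothetical, and leaving the starting package out of its own cluster is a choice made for this sketch.

```javascript
// Hypothetical sample in the shape produced by the prepare script:
// every `dep.resolved` is a reference to a package object (or undefined)
const c = { name: 'c', path: 'node_modules/c', deps: [] };
const b = { name: 'b', path: 'node_modules/b', deps: [{ name: 'c', resolved: c }] };
const a = {
    name: 'a',
    path: 'node_modules/a',
    deps: [{ name: 'b', resolved: b }, { name: 'missing', resolved: undefined }]
};
c.deps.push({ name: 'a', resolved: a }); // a cycle, to show that the traversal terminates

// Collect all packages reachable from `pkg` through `deps[].resolved`
function dependencyCluster(pkg) {
    const cluster = new Set();
    const pending = [pkg];

    while (pending.length > 0) {
        const current = pending.pop();

        for (const dep of current.deps || []) {
            const target = dep.resolved;

            // skip unresolved deps, the starting package and already visited packages
            if (target && target !== pkg && !cluster.has(target)) {
                cluster.add(target);
                pending.push(target);
            }
        }
    }

    return [...cluster];
}

console.log(dependencyCluster(a).map(pkg => pkg.name)); // prints [ 'b', 'c' ]
```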
Unexpected success story: while exploring the tutorial's data I found a problem in @discoveryjs/cli (with the query .[deps.[not resolved]]): it had a typo in a peer dependency reference. This was fixed immediately. This is a case in point.
I suppose it's a good time to show some numbers and packages with duplicates on the index page.
Setup default page
First of all we need to create a page module, e.g. pages/default.js. The default slug is used since the index page has that slug and we can override it (most things in discoveryjs can be overridden). We might start with something simple, like this:
discovery.page.define('default', [
    'h1:#.name',
    'text:"Hello world!"'
]);
Now we need to link the module in the config:
module.exports = {
    name: 'Node modules structure',
    data: require('./collect-node-modules-data'),
    view: {
        assets: [
            'pages/default.js' // a reference to page's module
        ]
    }
};
Checking up in a browser:
It works!
Let's show some counters, by changing pages/default.js this way:
discovery.page.define('default', [
    'h1:#.name',
    {
        view: 'inline-list',
        item: 'indicator',
        data: `[
            { label: 'Package entries', value: size() },
            { label: 'Unique packages', value: name.size() },
            { label: 'Dup packages', value: group(<name>).[value.size() > 1].size() }
        ]`
    }
]);
Here we define an inline list of indicators. The data value is a Jora query that produces an array of entries. The package list is used as the data source (the data root), so we get the list length (size()), the number of unique names (name.size()) and the number of groups by name that have more than a single member (group(<name>).[value.size() > 1].size()).
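For readers new to Jora, the three counters can be mirrored in plain JavaScript. This is only an illustration of what the queries compute, run against a hypothetical three-entry sample; it is not how Jora evaluates them.

```javascript
// Hypothetical sample of the package list (the data root)
const packages = [
    { name: 'foo', version: '1.0.0', path: 'node_modules/foo' },
    { name: 'foo', version: '2.0.0', path: 'node_modules/bar/node_modules/foo' },
    { name: 'bar', version: '1.0.0', path: 'node_modules/bar' }
];

// size(): the list length
const packageEntries = packages.length;

// name.size(): the number of distinct names
const uniquePackages = new Set(packages.map(pkg => pkg.name)).size;

// group(<name>).[value.size() > 1].size(): the number of names with several instances
const groups = new Map();
for (const pkg of packages) {
    if (!groups.has(pkg.name)) {
        groups.set(pkg.name, []);
    }
    groups.get(pkg.name).push(pkg);
}
const dupPackages = [...groups.values()].filter(group => group.length > 1).length;

console.log({ packageEntries, uniquePackages, dupPackages });
// prints { packageEntries: 3, uniquePackages: 2, dupPackages: 1 }
```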
Not bad. However, it would be better to have links to the selected entries in addition to the numbers:
discovery.page.define('default', [
    'h1:#.name',
    {
        view: 'inline-list',
        data: [
            { label: 'Package entries', value: '' },
            { label: 'Unique packages', value: 'name' },
            { label: 'Dup packages', value: 'group(<name>).[value.size() > 1]' }
        ],
        item: `indicator:{
            label,
            value: value.query(#.data, #).size(),
            href: pageLink('report', { query: value, title: label })
        }`
    }
]);
First of all, the data value was changed: now it's a regular array with a few objects. In addition, the size() method was removed from each value query.
Also, a subquery was added to the indicator view. Such queries produce a new object, where the value and href property values are computed. For value, it performs a query using the query() method, passing data to it from the context, and then applies the size() method to the query result. For href, it uses the pageLink() method to generate a link to the report page with a specific query and title. After those changes the indicators became clickable (notice that their values became blue) and much more functional.
To make the index page a bit more useful, let's add a table with duplicated packages.
discovery.page.define('default', [
    // ... the same as before

    'h2:"Packages with more than one physical instance"',
    {
        view: 'table',
        data: `
            group(<name>)
            .[value.size() > 1]
            .sort(<value.size()>)
            .reverse()
        `,
        cols: [
            { header: 'Name', content: 'text:key' },
            { header: 'Version & Location', content: {
                view: 'list',
                data: 'value.sort(<version>)',
                item: [
                    'badge:version',
                    'text:path'
                ]
            } }
        ]
    }
]);
The same data as for the Dup packages indicator is used for the table. Additionally, the package list is sorted by group size in reverse order. The rest of the setup is for columns (btw, often you don't need to set them up). For the Version & Location column we defined a nested list (sorted by version), where each item is a pair of a version badge and a path to the instance.
A package page
Currently we have only an overall view of the packages. It might be useful to have a specific package page. To achieve this we need to create a new module pages/package.js and define a new page:
discovery.page.define('package', {
    view: 'context',
    data: `{
        name: #.id,
        instances: .[name = #.id]
    }`,
    content: [
        'h1:name',
        'table:instances'
    ]
});
In this module we define a page with the slug package. The context view is used as the root view; that's a non-visual view which helps define common data for nested views. Notice that we use #.id to get the package name, which comes from the URL, i.e. http://localhost:8123/#package:{id}.
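The flow from URL to page data can be sketched in plain JavaScript. Both helpers below are hypothetical: parsePageHash only handles the simple #package:{id} form shown above and is not discoveryjs's actual routing code, and packagePageData mirrors what the data query computes.

```javascript
// Hypothetical parser for hashes of the form "#package:{id}"
function parsePageHash(hash) {
    const match = /^#([^:]+):(.*)$/.exec(hash);

    return match ? { page: match[1], id: match[2] } : null;
}

// Plain JavaScript counterpart of `{ name: #.id, instances: .[name = #.id] }`
function packagePageData(packages, context) {
    return {
        name: context.id,
        instances: packages.filter(pkg => pkg.name === context.id)
    };
}

// Hypothetical sample data
const packages = [
    { name: 'foo', version: '1.0.0', path: 'node_modules/foo' },
    { name: 'foo', version: '2.0.0', path: 'node_modules/bar/node_modules/foo' },
    { name: 'bar', version: '1.0.0', path: 'node_modules/bar' }
];

const context = parsePageHash('#package:foo');
const pageData = packagePageData(packages, context);

console.log(pageData.name, pageData.instances.length); // prints foo 2
```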
Don't forget to include the new module in the config:
module.exports = {
    ...
    view: {
        assets: [
            'pages/default.js',
            'pages/package.js' // here you go
        ]
    }
};
And here is the result in a browser:
It's not so impressive, but OK for now. More complex views will be created in the next tutorials.
Sidebar
Now that we have a package page, it would be nice to have a list of all the packages. We can define a special view, sidebar, for this, which is rendered when defined (it is not defined by default). Let's create a new module views/sidebar.js:
discovery.view.define('sidebar', {
    view: 'list',
    data: 'name.sort()',
    item: 'link:{ text: $, href: pageLink("package") }'
});
Now we have a sidebar with all the packages:
Looks good. But with a filter it might be much more user-friendly. Let's extend the sidebar definition:
discovery.view.define('sidebar', {
    view: 'content-filter',
    content: {
        view: 'list',
        data: 'name.[no #.filter or $~=#.filter].sort()',
        item: {
            view: 'link',
            data: '{ text: $, href: pageLink("package"), match: #.filter }',
            content: 'text-match'
        }
    }
});
Here we've wrapped the list into a content-filter view, which provides the input value converted to a RegExp (or null when empty) as the filter value in the context (the name may be changed via the name option). We also used #.filter to filter data for the list. Finally, the link view definition was extended to highlight matching parts using the text-match view. And here is the result:
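The filtering logic can be mirrored in plain JavaScript. How content-filter actually builds its RegExp is not shown in this tutorial, so the toFilter helper below is an assumption (escaped input, case-insensitive match); sidebarNames mirrors the query name.[no #.filter or $~=#.filter].sort(), with the caveat that Jora's sort() may order some values differently from Array.prototype.sort().

```javascript
// Hypothetical conversion of the input text: a RegExp for non-empty input, null otherwise.
// Escaping and the `i` flag are assumptions, not confirmed behavior of content-filter
function toFilter(input) {
    if (!input) {
        return null;
    }

    const escaped = input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

    return new RegExp(escaped, 'i');
}

// Plain JavaScript counterpart of `name.[no #.filter or $~=#.filter].sort()`
function sidebarNames(packages, filter) {
    const names = [...new Set(packages.map(pkg => pkg.name))];

    return names.filter(name => !filter || filter.test(name)).sort();
}

// Hypothetical sample data
const packages = [
    { name: 'jora' },
    { name: '@discoveryjs/discovery' },
    { name: '@discoveryjs/cli' },
    { name: 'jora' }
];

console.log(sidebarNames(packages, toFilter('DISC')));
// prints [ '@discoveryjs/cli', '@discoveryjs/discovery' ]
```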
In case you dislike the default style of something, you may tweak styles as you want. Suppose you want to change the sidebar width. Then you need to create a style file (views/sidebar.css would be a good choice):
.discovery-sidebar {
    width: 300px;
}
And include a reference to this file in the config, just like with JavaScript modules:
module.exports = {
    ...
    view: {
        assets: [
            ...
            'views/sidebar.css', // you may specify *.css files in assets too
            'views/sidebar.js'
        ]
    }
};
Auto linking
The last chapter of this tutorial is about links. As you can see above, we made a link to a package page via the pageLink() method. Besides that, we need to specify the link text as well. But how about making it a bit simpler?
To simplify linking, we need to define a link resolver. A good place for this is the prepare script:
discovery.setPrepare(function(data) {
    ...

    const packageIndex = data.reduce(
        (map, item) => map
            .set(item, item)       // key is item itself
            .set(item.name, item), // and `name` value
        new Map()
    );

    discovery.addEntityResolver(value => {
        // guard against null/undefined before reading `name`
        value = packageIndex.get(value) || (value && packageIndex.get(value.name));

        if (value) {
            return {
                type: 'package',
                id: value.name,
                name: value.name
            };
        }
    });
});
We added a new map (an index) for packages here, and used it for an entity resolver. The entity resolver attempts to translate the passed value into a package descriptor when possible. A package descriptor contains:
- type: the slug of the instance type
- id: a unique reference to an instance, used as the page ID in links
- name: used as the caption for links
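To see which values such a resolver recognizes, here is a standalone sketch that needs no discovery instance. resolvePackage is a hypothetical function that follows the resolver logic above (with a guard for null and undefined), and the sample packages are made up.

```javascript
// Hypothetical sample data
const packages = [
    { name: 'foo', version: '1.0.0', path: 'node_modules/foo' },
    { name: 'bar', version: '2.0.0', path: 'node_modules/bar' }
];

// The index accepts both a package object and a package name as a key
const packageIndex = packages.reduce(
    (map, item) => map.set(item, item).set(item.name, item),
    new Map()
);

// Returns a package descriptor, or undefined when the value is not recognized
function resolvePackage(value) {
    const pkg = packageIndex.get(value) || (value && packageIndex.get(value.name));

    if (pkg) {
        return { type: 'package', id: pkg.name, name: pkg.name };
    }
}

console.log(resolvePackage(packages[0]));     // a package object is resolved
console.log(resolvePackage('bar'));           // so is a package name
console.log(resolvePackage({ name: 'foo' })); // and any object with a known `name`
console.log(resolvePackage('unknown'));       // prints undefined
```

Indexing by both the object and its name is what lets the same resolver serve package objects, plain name strings and dependency-like objects.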
As the last step, we need to attach this type to a certain page (a link should lead somewhere, shouldn't it?).
discovery.page.define('package', {
    ...
}, {
    resolveLink: 'package' // link `package` entities to this page
});
The first effect of those changes is that some values in the struct view are now marked with a badge link to the package page:
And now you may apply the auto-link view to a package object or name:
As an example, sidebar can be slightly refactored:
// before
item: {
    view: 'link',
    data: '{ text: $, href: pageLink("package"), match: #.filter }',
    content: 'text-match'
},

// with `auto-link`
item: {
    view: 'auto-link',
    content: 'text-match:{ text, match: #.filter }'
}
Conclusion
Now you have a basic knowledge of discoveryjs key concepts. The next tutorials will continue to guide you through these topics in more depth.
You can explore all the sources from the tutorial combined in a repo on GitHub or try how it works online.
Follow @js_discovery on Twitter and stay tuned!