Parse shell commands in javascript with tagged templates

Heiker
Web developer from Venezuela. I like solving problems. Currently trying to improve my communication skills
Updated on ・4 min read

I'm here to share something that I think you might find useful, and to ask for help improving my code.

I want to parse commands using javascript's tagged templates. Something like this.

$`dep deploy --branch=${branch}`

Example taken from zx.

This isn't anything new. I've seen others try to do this before, but the thing that bothers me is that they use an actual shell to execute the commands. They have their methods to sanitize inputs and whatnot, but it still bothers me. For that particular case you don't need a shell: node and deno can call that command (dep) directly, in a way that is cross-platform.

In deno we can create a subprocess using Deno.run. Node has an entire module for that (child_process), though I would like to use execa because it looks like it has some good defaults in place.

And so what I want to do is create a tag function capable of parsing that command in a way that the result can be used with execa.sync or Deno.run.

This is what I got

I've divided this process into stages, so it's easier to code.

The tag template

The tag function itself. The thing that takes the command.

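function sh(pieces, ...args) {
  // pieces holds the static strings, args the interpolated values
  let cmd = pieces[0];
  let i = 0;
  while (i < args.length) {
    if(Array.isArray(args[i])) {
      // arrays are expanded into space-separated arguments
      cmd += args[i].join(' ');
      cmd += pieces[++i];
    } else {
      cmd += args[i] + pieces[++i];
    }
  }

  return exec(parse_cmd(cmd));
}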

Here the function takes the static strings and the dynamic values and puts the command together (credits to zx for this). I added some "support" for arrays for extra convenience. The next step is parsing the command.
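To see what the tag function receives, here is a minimal sketch that keeps the same joining logic but returns the assembled string instead of executing it. The name build_cmd is hypothetical and only exists for this illustration.

```javascript
// Hypothetical helper (not part of the article's code): same joining
// logic as sh, but it returns the command string instead of running it.
function build_cmd(pieces, ...args) {
  // pieces always has exactly one more element than args
  let cmd = pieces[0];
  args.forEach(function (arg, i) {
    // arrays become space-separated arguments, anything else is
    // converted to a string by the concatenation
    cmd += Array.isArray(arg) ? arg.join(' ') : arg;
    cmd += pieces[i + 1];
  });
  return cmd;
}

const branch = 'main';
const files = ['a.txt', 'b.txt'];

console.log(build_cmd`dep deploy --branch=${branch}`);
// dep deploy --branch=main
console.log(build_cmd`git add ${files}`);
// git add a.txt b.txt
```

Interpolated values are pasted in as plain text, so a value containing spaces is split by the parser later unless the template wraps it in quotes.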

Parsing

function parse_cmd(str) {
  let result = [];
  let log_matches = false;

  let regex = /(([\w-/_~\.]+)|("(.*?)")|('(.*?)'))/g;
  let groups = [2, 4, 6];
  let match;

  while ((match = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops 
    // with zero-width matches
    if (match.index === regex.lastIndex) {
      regex.lastIndex++;
    }

    // For this to work the regex groups need to 
    // be mutually exclusive
    groups.forEach(function(group) {
      if(match[group]) {
        result.push(match[group]);
      }
    });

    // show matches for debugging
    log_matches && match.forEach(function(m, group) {
      if(m) {
        console.log(`Match '${m}' found in group: ${group}`);
      }
    });
  }

  return result;
}

Yes, regex. Love me some regex. Here is how it works: first it tries to parse the "words" of a command, which is this [\w-/_~\.]+. If it can't do that, it checks whether the thing is inside double quotes "(.*?)" or single quotes '(.*?)'. So if the first pattern fails you can always wrap the argument in quotes and it should just work.

Notice all those parentheses? Each pair creates a group, and each time regex.exec finds a match it tells me which group the match fits in. The secret sauce is checking only the groups that are mutually exclusive (2, 4 and 6 in the code): if the match is in one of them I add it to the result.
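The group bookkeeping is easier to follow with a small experiment. This sketch reuses the same pattern and reports which capture group produced each token; tokens_with_groups is a hypothetical helper written for this illustration, and it uses matchAll instead of the exec loop.

```javascript
// Same pattern as in parse_cmd
const regex = /(([\w-/_~\.]+)|("(.*?)")|('(.*?)'))/g;

// Hypothetical helper: lists each token with the group it came from.
// Group 2 = bare word, 4 = inside double quotes, 6 = inside single quotes.
function tokens_with_groups(str) {
  const found = [];
  for (const match of str.matchAll(regex)) {
    for (const group of [2, 4, 6]) {
      if (match[group]) {
        found.push({ group, text: match[group] });
      }
    }
  }
  return found;
}

console.log(tokens_with_groups(`-i 'thing "what"' --some "stuff 'now'"`));
// groups in order: 2, 6, 2, 4

console.log(tokens_with_groups('--branch=main').map((t) => t.text));
// two tokens: --branch and main

console.log(tokens_with_groups('"--branch=main"').map((t) => t.text));
// one token: --branch=main
```

The last two calls show a side effect of the word pattern: = is not in the character class, so an unquoted --branch=main is split in two. Quoting the argument keeps it in one piece.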

Execute

This part will depend on the javascript runtime you use. I have two use cases and parse_cmd should work with both.

  • Deno
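// Note: newer Deno releases deprecate Deno.run in favor of Deno.Command
async function exec(cmd) {
  // cmd is the array returned by parse_cmd: [program, ...args]
  const status = await Deno.run({ cmd }).status();

  // on failure, exit with the same code as the subprocess
  if (status.success === false) {
    Deno.exit(status.code);
  }

  return status;
}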
  • Node
const execa = require('execa');

function exec([cmd, ...args]) {
  return execa.sync(cmd, args, { stdio: 'inherit' });
}
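
For Node without any dependency, the built-in child_process module mentioned earlier can fill the same role. This is only a sketch of that alternative, not the execa-based version above: spawnSync runs the program directly (no shell) and reports a non-zero exit through status instead of throwing, so the caller has to check it.

```javascript
const { spawnSync } = require('node:child_process');

// Sketch of an exec built on the standard library instead of execa
function exec([cmd, ...args]) {
  // No shell is involved: each argument reaches the program untouched
  const result = spawnSync(cmd, args, { stdio: 'inherit' });

  // result.error is set when the process could not be started at all
  if (result.error) {
    throw result.error;
  }

  return result;
}

// process.execPath is the node binary running this script, so the
// example does not depend on any other program being installed
const ok = exec([process.execPath, '-e', 'process.exit(0)']);
console.log(ok.status); // 0

const failed = exec([process.execPath, '-e', 'process.exit(3)']);
console.log(failed.status); // 3
```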

Test case

How do I test it? Well... I use this for now.

let args = ['query', '~/bin/st4f_f'];

let result = sh`node ./src/1-main-test2.js -i 'thing "what"' --some "stuff 'now'" HellO ${args}`;

The command should be parsed into these arguments.

{
  "0": "node",
  "1": "./src/1-main-test2.js",
  "2": "-i",
  "3": 'thing "what"',
  "4": "--some",
  "5": "stuff 'now'",
  "6": "HellO",
  "7": "query",
  "8": "~/bin/st4f_f"
}

I have a codepen for you to play with if you want.

What am I missing?

The biggest catch is that the regex doesn't handle escaped quotes. If you have "stuff \"what\"", it won't give you what you want. There is a solution for that, but it's a "userland" thing. Basically, you can let javascript handle the escaping, like this.

sh`node ./src/main.js --some '${"stuff \"what\""}'`

So as the user of sh you can take advantage of ${} to let javascript handle the weird stuff. It works but it makes the API a little bit awkward (not too much I would say).

If anyone knows how I can avoid using ${} to escape the quoting let me know in the comments.
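
One possible direction, offered as a sketch rather than a finished answer: inside a template literal javascript already turns \" into " before the tag sees it, so the parser never gets the backslash. The raw text is still available through pieces.raw, and the quoted patterns can be widened to accept backslash escapes. The names parse_escaped and sh_raw are hypothetical.

```javascript
// Hypothetical variant of parse_cmd: a backslash inside quotes
// protects the next character, so \" no longer ends the argument
function parse_escaped(str) {
  const regex = /([\w-/_~\.]+)|"((?:\\.|[^"\\])*)"|'((?:\\.|[^'\\])*)'/g;
  const result = [];
  for (const match of str.matchAll(regex)) {
    if (match[1] !== undefined) {
      result.push(match[1]);
    } else {
      const quoted = match[2] !== undefined ? match[2] : match[3];
      // drop each backslash and keep the character it protects
      result.push(quoted.replace(/\\(.)/g, '$1'));
    }
  }
  return result;
}

// Hypothetical tag: reads pieces.raw so backslashes reach the parser.
// It returns the parsed arguments instead of executing them.
function sh_raw(pieces, ...args) {
  let cmd = pieces.raw[0];
  args.forEach(function (arg, i) {
    cmd += Array.isArray(arg) ? arg.join(' ') : arg;
    cmd += pieces.raw[i + 1];
  });
  return parse_escaped(cmd);
}

console.log(sh_raw`node ./src/main.js --some "stuff \"what\""`);
// [ 'node', './src/main.js', '--some', 'stuff "what"' ]
```

The trade-offs: with the raw strings javascript no longer interprets sequences such as \n in the static parts, and this sketch unescapes inside single quotes too, which is not how a POSIX shell behaves. Comparing against undefined also lets an empty "" through as an empty argument.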


Thank you for your time. If you find this article useful and want to support my efforts, buy me a coffee ☕.
