Discussion on: Built an unconventional database thing

View post

Replies for: I'd love to have more exchanges about it. :) The tagged sql template isn't any more identifiable than Repository.query in terms of a token per se...

The tagged sql template isn't any more identifiable than Repository.query

In fact it is. It's subtle, but there is one major difference: Repository.query(sqlString) might be fine, but Repository.query(createSomeQuery(someVar)) will be hard to resolve, especially if createSomeQuery is imported from another file. The template tag and the string will be tightly coupled, though. No further indirections possible, just need to check if the template tag is an export of the squid package 👌

I imagine you will really need to parse out things well and guard against various things

It's really using Babel to parse the source code to an abstract syntax tree, analyzing it, and when it finds an sql query, it will take the template string, replace the interpolated expressions with placeholders and parse the resulting SQL string using the native Postgres SQL parser implementation.

So we are actually utilizing a proper compiler toolchain and as long as we can trust Babel and the Postgres parser to do their job, there is not that much that can go wrong in that regard :)

But I also am concerned about someone forgetting to wrap it in the literal function

User might forget to use the sql template tag... That's a very interesting point!

We can prevent that source of error, though, I think. Let's come up with a check that makes sure that every query invocation involving a template string requires the template string to be tagged with either sql or sql.raw. Anything else would be considered an error.

I'm intrigued as to the functional approach

Yeah, your code sample is basically it :)

The reason why I think it's a better choice than having a class: If you decide for the class-approach you will either

a) Keep the class completely stateless and all methods can be static. As you noted, this is basically equivalent to the exported functions approach.
b) Might introduce some statefulness to the class, maybe instantiate a singleton-ish instance, because that's what distinguishes the class from a set of static exports. That's what I would try to prevent in the first place. Introducing state here would lead to creating an ORM with most of its downsides. This is what we are trying to get away from ^{^}

The difference between the concepts of classes and static functions is that the classes can be stateful. Since I would do whatever I can to prevent any statefulness here, there is no reason for me to introduce classes here.

But I suppose you are coming from the other side: You have a lot of classes and you try to figure out why not to use one here...

I think we are laying the foundation for a whole new blog post here right now... :D

Sophie The Lionhart • Jan 17 '19

Haha I think we really are laying the foundation for a blog post indeed. I'm not sure what about yet, perhaps a few different kinds really at this point. But anyways...

I hear what you mean on statefulness. Statefulness is indeed public enemy number one, I like the class with static methods approach a bit more just from the standpoint of classes are approachable to non-JS devs. What I've done recently which I really like is the following:

function invisibleHelperFunction(tooShortWord : string) : string {
    return $var + 'adding random words to strings is always helpful.';
}

const ClassThatActsOnStuff = {
    exposedMethod : function (customObject : CustomObject ) : string {
        return invisibleHelperFunction(customObject.stringField);
    }
}

Object.freeze(ClassThatActsOnStuff); //prevents adding new methods and changing state and modification in general

export default ClassThatActsOnStuff;

I like this because it lets you see that you might have added stuff to the file that doesn't belong in the idea that you can throw a function into it, export it, and go on, where as if I throw a function into the class that isn't used outside, then something has gone wrong, its also easier to keep track of what's being exported and hat isn't. And if I need one of the helper functions somewhere else, then I might have some un-needed coupling or another. Obviously this isn't real OOP by any stretch of the word and is really just a slightly different way of doing things, but I'm a fan. :P

Repository.query(createSomeQuery(someVar))

being hard to resolve is true, but is

sql`${createSomeQuery(someVar)}`

actually easier to resolve?

Now full disclosure here, I haven't actually gotten to look through the source code yet so I'm not 100% on how squid does things, but I am familiar with how tagged templates functions work. I would assume that the sql part is just a tag to identify it later, and the template function doesn't do anything fancy at all to it. And so later when we parse everything out, we still have to resolve that function.

Unless the sql safety check is at run time? I just looked at the function in question I believe(raw.githubusercontent.com/andywer/...) and it seems to do the actual sql query parameter wrapping thing, but doesn't actually check the validity of the query in the DB for you.

But for the checking that the query is valid syntactically with the schema, that is an extra build step right? And that will require the same bit regardless of the identifier unless there's something I'm really truly missing. I get that the template literal functions(this is getting annoying to repeatedly type out lol) get the params passed in the second argument, but that only helps you providing like I said that someone hasn't forgotten the sql tag, or done like

sql`SELECT * FROM Users WHERE ` + `${invalidatedInput};`;

though at this point I suppose the developer is just trying to hurt themselves and we should allow darwinism to run its course.

Let's come up with a check that makes sure that every query invocation involving a template string requires the template string to be tagged with either sql or sql.raw

This is kind of iffy to me as a solution. Checking every query invocation means that squid HAS to be tied to a specific framework or set of frameworks. And only working on template strings means that someone not using one, because either they're not a fan of string interpolation or just kind of accidentally forgets they exists, is left in the dust.

I find trying to guard against developer accidents and incompetence as much as possible is good practice, so I like a more general solution. Also since its in a build step, even if this process takes a bit longer, I think its worth it. This is where having a Repository Interface that requires them to implement those methods can help I feel.

The above points I guess bring me to two other things that I'd like to say that maybe are the source of some of our disconnect here:

Squid seems kind of tightly coupled at the moment to a specific database library and sql implementation(postgres)
Squid seems to actually be doing two things, both really with the template literal function(typed it again....)
- At run time, wraps what anything it assumes is user supplied input to prevent sql injection and the like (this is where the functionality of the template literal function(again...) matters
- At compile time, validates the query contents to be sure that no queries are incorrect or outdated. (this is where using it as a tag matters)

I think perhaps that decoupling the two sub point, and the two main points could be quite beneficial. Also I feel that whatever querying library you use should be wrapping stuff for you so doing it yourself may not be necessary, though that might stick us back into query builder kind of land, I'm not sure about it without a lot more investigation, but I feel end of the day it's DB.query(sql, params); as the call, so allowing the developer to decide how to wrap it and call it is up to them, we just do some fancy validation and parsing. :P Not care about the implementation of it at all nor the library used.

This is a fun discussion that I hope can continue! Also I've been using the inclusive 'we' here, but Its not my project and I haven't made any OSS to it, so forgive me!

Andy Wermke • Jan 18 '19

Nahhh, I appreciate the "we". Sounds like a community effort already :)

I try to keep it shorter this time... ^{^}

So the big picture overview as of right now is:

Squid:

has two functionalities
- mark sql queries as such via the sql template tag
- turn the query string with the interpolated expressions into a query object (at runtime)
that query object varies between database drivers, so as of right now it requires either
- having a tiny adapter for each db in squid (maybe three lines of code each, but yeah, not 100% elegant)
- wrapping the db driver's query function into a three-line adapter function
it's the runtime side of the whole thing

Postguard

is only a static build step, no runtime
might make it more generic in the future, supporting other databases as well
or create sibling tools for other databases, sharing the babel/typescript parts of the code

Checking every query invocation means that squid HAS to be tied to a specific framework or set of frameworks

No, squid could basically be completely db-agnostic, since all it does is super generic. All the validation happens on build time by postguard.

The reason why it's SQL-injection proofed is that squid doesn't merge the expressions into the SQL query string yet, but produces a string with parameters á la $1 and an array of the expression values, so the database driver can figure out how to escape them in a water-proof way.

I hope that shines some light on things, because it seems there was some confusion about the big picture concepts (-> will need to improve the readme files, i guess).

But sure, there is still room for various improvements and I love to hear your thoughts :)