I'm a data analyst in an AWS Redshift SQL environment.
For context, the environments that analysts out in the field tend to use, in order of popularity are:
1 TSQL/ SQL Server
4 Spreadsheet / Google Sheet
5 Scrap pieces of paper
6 Crystal Reports
7 Lotus Notes
40 Using an Abacus
41 Cave Drawings
42 Redshift AWS
So suffice to say finding Redshift SPECIFIC technology is pretty hard to find on the interwebs.
When I search for Redshift on the VS Code Extension Store there is nothing!!!!
Well that's a slight lie, there is one, but that's also a lie, its just tagged as Redshift because SEO....
Now this hasn't been an overly big problem for me.
Redshift is very similar to PostgreSQL and Oracle and so I've been getting by using Oracle based extensions.
However the Oracle Extension I've been using this whole time has been getting bigger and buggier over time.
The extension also does like 60 different things and I only use it for 2.
So seeming all I need is SQL syntax highlighting and a snippet library I thought I'd give it a shot at making my own syntax highlighter extension and I can ditch bloaty-mcbloat face.
(And I would just use the internal global snippets file)
Here are the lessons I learned along the way.
So I totally thought I could git-bash my way to success here.
I 100% thought I could take the Oracle Extension and rip out the bits I needed and would be done in time for dinner.
I was very wrong.
There are 2 ways to do syntax highlighting.
Method One requires you to declare what you are highlighting in a JSON package using REGEX expressions. This is what the VSCode documentation tells you.
Method Two simply requires you to download a TextMate Language pack and then nothing else. That’s it, you are done. No dev work what so ever.
Every single extension I looked at uses the TextMate Language pack method. Could I find a language pack for Redshift? Nope. Could I alter an existing one? Not that I could figure out.
So this was mental breakdown number one, everyone on git-hub had taken the easy way out because the languages they supported where mainstream enough to have an easy to grab language pack.
I felt like I was trying to find a case for my Motorola phone when all I can find are Samsung and Apple cases.
So I had to take option one and Dev something myself.
So syntax highlighting isn’t just a highlight everything that's not freetext. You could do that but I could also call all my variables VAR, it gets the job done but it's hardly efficient.
Good syntax highlighting makes functions different colors from variables. Comments different from strings.
Having different highlighted zones allows your eyes to snap to the bits of the code you are looking for.
I knew for SQL I wanted to highlight the following
Commands like 'select' and 'where'
Data types like 'varchar'
Single and double quoted Strings
Single and block comments
Functions like 'row_number' and 'upper'
Special words like distkey
Special characters like ‘<>’
Active words like 'and', 'or', 'else'
By the time I had written out every object I wanted to highlight out of the AWS spec it was a very very long list, about 300 things.
Now I just assumed I would group these how I want and then apply the rules.
Again I was wrong...
Because Visual Studio Code can be themed so many ways there are already pre-set groups for you to match to. This makes the themers job easy as they just say what color a certain group will be and then it doesn’t matter which one of the 87547868545874 coding languages you are using, provided it’s matched to the predefined group it should change with the theme.
This meant I had to take my planned groups and smush them into the groups available to me.
This wasn’t so much as a disaster as I thought it would be, I ended up using ten groups which ends up close to what I originally wanted to do, its just not how I expected.
I'm not a dev, I understand basic HTML / CSS and SQL and I've honed those skills into the floor. I'm like one of those wood people who can make fucking awesome letter boxes but don't ask me to make a chair or a fence.
But I sort of just kept bumbling my way forward and found that it was easily very doable at my skill level.
- For example you can use Yeoman, a NPM package to build a working shell of a VSCode extension which works and debugs right away.
- Then I matched what I wanted to highlight to the predefined groups outlined here
- Then used the provided JSON template
- Along with these Regex guidelines
- And this handy regex editor to check that I was doing it right.
Next minute I was compiling the extension and its now running on my work machine.
A Visual Studio Code Extension to support working in a Redshift (AWS) SQL Environment
Syntax Highlighter for Redshift
Simply put there is no Syntax Highlighting for Redshift AWS SQL in Visual Studio code So I made my own It aint flashy, it never will be.
Syntax Highlighting for:
- SQL Commands like Select, Abort, Begin...
- SQL Keywords like Date_trunc, Distkey, lower...
- Single line comments
- Double line comments
- Single quote strings
- Double quote strings
- SQL Data types like BIGINT, BOOL, VARCHAR
- SQL Constants such as AND, BETWEEN, ILIKE
- Support for number highlighting
- Special Characters like =, <> , =!
Syntax highlighting is done by corresponding to textmate elements.
Patterns matched to keyword.control:
ABORT, ALTER, ANALYZE, BEGIN, CALL, CANCEL, CLOSE, COMMENT, COMMIT, COPY CREATE, DEALLOCATE, DECLARE, DELETE, DROP, END, EXECUTE, EXPLAIN, FETCH, GRANT, INSERT, LOCK, PREPARE, RESET, REVOKE, ROLLBACK, SELECT, SET, SHOW, START, TRUNCATE, UNLOAD, UPDATE, VACUUM
Patterns matched to keyword.other:
ABS, ACOS, ADD_MONTHS, APPROXIMATE, PERCENTILE_DISC, ASIN, ATAN, ATAN2, AVG BIT_AND, BIT_OR, BIT-WISE,…