<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: John Baker</title>
    <description>The latest articles on DEV Community by John Baker (@bakerjd99).</description>
    <link>https://dev.to/bakerjd99</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F431733%2Fd3dcdfd9-5f66-4583-9875-99fc8cbe9ed4.jpeg</url>
      <title>DEV Community: John Baker</title>
      <link>https://dev.to/bakerjd99</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bakerjd99"/>
    <language>en</language>
    <item>
      <title>Let ChatGPT Index Your LaTeX Documents</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Fri, 22 Sep 2023 21:16:55 +0000</pubDate>
      <link>https://dev.to/bakerjd99/let-chatgpt-index-your-latex-documents-3jei</link>
      <guid>https://dev.to/bakerjd99/let-chatgpt-index-your-latex-documents-3jei</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog: &lt;a href="https://analyzethedatanotthedrivel.org/2023/09/18/let-chatgpt-index-your-latex-documents/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Indexing documents is an ongoing chore. You would think picking out "good" book or article index words would be a trivial pseudo-sampling-sorting task, but that is not the case. Building a useful index requires something like "comprehending" the text. You must read the document, understand what's in it, and then judiciously select words that elucidate, navigate, and summarize the contents. Until recently, coding "reading comprehension" has been beyond my meager means. Then the other day, I had a brain fart; maybe ChatGPT can index LaTeX documents.&lt;/p&gt;

&lt;p&gt;As a test, I decided to index my blog post &lt;em&gt;&lt;a href="https://analyzethedatanotthedrivel.org/2023/04/07/sympathy-for-ptolemaic-epicyclers/"&gt;Sympathy for Ptolemaic Epicyclers&lt;/a&gt;.&lt;/em&gt; For years I have been converting my WordPress blog posts to LaTeX with my J-based hack &lt;em&gt;&lt;a href="https://analyzethedatanotthedrivel.org/2012/02/11/wordpress-to-latex-with-pandoc-and-j-prerequisites-part-1/"&gt;WordPress to LaTeX with Pandoc and J&lt;/a&gt;&lt;/em&gt; and storing the generated LaTeX code on GitHub. Here is the &lt;a href="https://github.com/bakerjd99/Analyze-the-Data-not-the-Drivel/blob/master/wp2latex/sympathyforptolemaic7823.tex"&gt;LaTeX for Ptolemaic Epicyclers&lt;/a&gt;. With code in hand, I logged into ChatGPT 3.5 and had this chat.&lt;/p&gt;

&lt;p&gt;I first asked ChatGPT to summarize &lt;em&gt;Sympathy for Ptolemaic Epicyclers&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Me: Please summarize the following text.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;... upload of LaTeX code for Ptolemaic Epicyclers ...&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;ChatGPT: The provided text discusses the historical perspective of astronomy and the Ptolemaic system. It highlights that often, older theories are unfairly portrayed as ignorant and backward, whereas in reality, they were based on empirical observations and provided reasonable approximations. The text points out that the transition to new theories typically happens when the new model demonstrates its superiority in making predictions.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;The author also mentions their recent work on a J script called "riseset," used for calculating when International Astronomical Union (IAU) stars culminate in their backyard. They explain that astronomical calculations, both in the past and today, are complex and require constant adjustments.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;In conclusion, the text offers sympathy and respect for ancient astronomers who worked with limited tools and still made valuable contributions to understanding the cosmos. It emphasizes that these astronomers were not morons but rather practical philosophers trying to make sense of the observable universe.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you read my blog post, you'll see that ChatGPT's summary is dead on. Also, ChatGPT's concluding remarks are better than mine. No wonder ChatGPT is scaring credentialed pseudo-intellectuals everywhere. I then asked.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Me: Please pick out good words to index from the text that you just summarized.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ChatGPT: Here are some good words or phrases to index from the summarized text:&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Ptolemaic system, Old theory holdouts, Copernicus, Heliocentric model, Aristarchus, Sun-centered models, International Astronomical Union (IAU), GitHub, Astronomical calculations, Empirical observations, Modern calculations, N-body celestial mechanics, Gravitational theories, Backyard amateur astronomer, Ancient epicyclers&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;You can use these words or phrases as entries in the index of your LaTeX document to help readers easily find relevant information.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You'll notice that ChatGPT didn't just pick out words, it synthesized useful phrases that eerily sum up the contents of my post. I have no objections to the generated phrases even though I doubt I would have come up with them myself. I was impressed. This thing is a great&lt;br&gt;
sounding board for writers.&lt;/p&gt;

&lt;p&gt;Feeling lucky I decided to go for broke and ask ChatGPT to insert LaTeX indexing codes into my document.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Me: Please insert the latex index codes for the words you selected in the text I provided.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ChatGPT: Certainly! Here are the LaTeX index codes inserted into the provided text for the selected words:&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;... download of modified LaTeX code with \index{} commands inserted&lt;br&gt;
...&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;ChatGPT effortlessly inserted the index codes into my document, and we concluded our chat with:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Me: Thank you&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ChatGPT: You’re welcome! If you have any more questions or need further assistance, feel free to ask. Good luck with your LaTeX document!&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
    </item>
    <item>
      <title>Ethanol is Excrement in the Gas Tank</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Sat, 21 Jan 2023 18:40:18 +0000</pubDate>
      <link>https://dev.to/bakerjd99/ethanol-is-excrement-in-the-gas-tank-1m7a</link>
      <guid>https://dev.to/bakerjd99/ethanol-is-excrement-in-the-gas-tank-1m7a</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog: &lt;a href="https://analyzethedatanotthedrivel.org/2023/01/17/ethanol-is-excrement-in-the-gas-tank/" rel="noopener noreferrer"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Like many U.S. government programs, the blended corn ethanol mandate has done little to address its avowed goals, i.e., strengthen energy security, &lt;a href="https://www.factcheck.org/2015/11/ethanol-higher-emissions-or-lower/" rel="noopener noreferrer"&gt;reduce carbon emissions&lt;/a&gt;, and lower prices for consumers. By &lt;a href="http://large.stanford.edu/courses/2011/ph240/sojka2/" rel="noopener noreferrer"&gt;some&lt;br&gt;
accounts&lt;/a&gt; it, consumes more energy than it produces, increases consumer prices, &lt;a href="https://www.ucsusa.org/resources/land-use-changes-and-biofuels" rel="noopener noreferrer"&gt;distorts food production&lt;/a&gt;, wastes vast tracts of arable land, and &lt;a href="https://news.climate.columbia.edu/2011/03/21/ethanol%E2%80%99s-impacts-on-our-water-resources/" rel="noopener noreferrer"&gt;contributes to groundwater and aquifer depletion&lt;/a&gt;. &lt;em&gt;What's not to like?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The corn ethanol program has been &lt;a href="https://the-pipeline.org/the-great-ethanol-boondoggle/" rel="noopener noreferrer"&gt;called a boondoggle&lt;/a&gt;, by just about anyone that isn't &lt;em&gt;directly cashing ethanol checks,&lt;/em&gt; but instead of shutting this abomination down, the idiots in Washington are being pushed by their donors (yes there is a &lt;a href="https://www.bloomberg.com/news/articles/2021-11-16/ethanol-lobby-to-white-house-blend-rule-rollback-would-backfire#xj4y7vzkg" rel="noopener noreferrer"&gt;big ethanol lobby&lt;/a&gt;) to expand corn ethanol production, &lt;a href="https://www.popularmechanics.com/cars/hybrid-electric/a11687/four-things-to-know-about-e15-15096134/" rel="noopener noreferrer"&gt;increase the amount of ethanol&lt;/a&gt; blended into gasoline, and keep on &lt;a href="https://www.forbes.com/sites/dianafurchtgott-roth/2022/12/02/higher-ethanol-mandates-are-a-lose-lose-for-americans/?sh=331fc2e05c8d" rel="noopener noreferrer"&gt;screwing the public&lt;/a&gt; in perpetuity. As it's currently &lt;em&gt;constipated&lt;/em&gt;,&lt;sup id="fnref1"&gt;1&lt;/sup&gt; the ethanol mandate mainly buys votes in the Midwest and funnels billions of government dollars to ethanol refiners and their investors. &lt;em&gt;The ethanol program is pure squealing pork.&lt;/em&gt; You almost have to admire the rapacious cabal that's sold this stupid program to an even dumber public.&lt;/p&gt;

&lt;p&gt;Like many beleaguered citizens, I do my best to ignore government&lt;br&gt;
stupidity. I simply don't have the time, energy, or patience, to engage with the mental, moral, and cosmetic pygmies that infest government. Unfortunately for me, and you too, the ethanol mandate is hard to ignore because of the "mandate" bit. If you doubt this go out and try and buy some ethanol-free gasoline. You will quickly discover that ethanol-free gas is not always available, and even when it is; it's more expensive and comes in fewer octane ratings than ethanol-polluted gas. Some gas stations sell ethanol-free gas, but hardly any sell premium (octane 91 and above) ethanol-free gas.&lt;/p&gt;

&lt;p&gt;You may ask, "How about letting gas stations sell whatever gasoline blends their customers prefer? If the public prefers pure to ethanol-contaminated gas, then so be it."&lt;/p&gt;

&lt;p&gt;Oh, you poor naïve dolt. If you think we're living in a "free market" just pour gas, pure or ethanol, over yourself and strike a match! Virtually everything on sale is regulated in one way or another. The "free market" is as real as LGBTQ werewolves. In tiny regulator minds, consumer choice must be constrained because &lt;em&gt;unfettered consumers may not buy your crap!&lt;/em&gt; Big ethanol knows consumers prefer comparably priced pure gas to their poisoned alternative because:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Ethanol-free gas gets better mileage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It burns at lower temperatures in engines reducing wear and tear.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Is less corrosive and damaging to engine parts.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If corn ethanol met its promises, you could make a case that it's worth damaging engines, draining aquifers, and restricting consumer choice but it doesn't. As I said the program is pure pork. It's not designed to benefit the public and it doesn't. And, barring an &lt;em&gt;electoral miracle&lt;/em&gt; that tosses all the corrupt congress critters that voted for this affront out of office, we're going to be saddled with ethanol in our tanks forever.&lt;/p&gt;

&lt;p&gt;So, what can you do to limit the ethanol excrement in your gas tank?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If money isn't an issue buy an electric car or always fill up with ethanol-free gas.&lt;/em&gt; If money is an issue remember; &lt;em&gt;dilution is the solution to pollution.&lt;/em&gt; I alternate between pure and ethanol gas when I fill up. This is cheaper than pure gas and it keeps the ethanol concentration down.&lt;/p&gt;

&lt;p&gt;The Python function &lt;code&gt;ethanol_concentration&lt;/code&gt; (see below) estimates&lt;br&gt;
my average ethanol gas tank concentration assuming:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;I always fill up when the tank is half-empty (I'm a pessimist).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I start with a full tank of ethanol E10 (10%) gas.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I alternate between pure and ethanol gas.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If spreadsheets are more your bag &lt;a href="https://bakerjd99.files.wordpress.com/2023/01/eigt.xlsx" rel="noopener noreferrer"&gt;this sheet&lt;/a&gt; carries out the same calculations.&lt;/p&gt;

&lt;p&gt;When you run this function for the dilution schemes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;t = 1&lt;/code&gt; always use pure gas&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;t = 2&lt;/code&gt; alternate between ethanol and pure&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;t = 3&lt;/code&gt; one pure for every two ethanol&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;t = 4&lt;/code&gt; one pure for every three ethanol&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You get:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(1, 10, 0.2, 10, 1.5777218104420236e-29)
(2, 10, 5.066666666666666, 10, 3.3333333333333335)
(3, 10, 6.728571428571428, 10, 4.285714285714286)
(4, 10, 7.553333333333333, 10, 4.666666666666667)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Observe that alternating, as expected, cuts the ethanol concentration in half and that the concentration creeps up as you put less and less pure gas in your tank. There's still excrement in your tank --- just less of it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ethanol_concentration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Compute average ethanol concentration over:
      F = number of fill ups
      E = ethanol concentration
      T = tank filling 

    example:
      ethanol_concentration(100, 10, 2)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# initial tank concentrations - start with ethanol gas
&lt;/span&gt;    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;

    &lt;span class="c1"&gt;# gas ups - 1 ethanol, 0 ethanol free
&lt;/span&gt;    &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; 
    &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="c1"&gt;# calculate concentrations
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;And I mean &lt;em&gt;constipated&lt;/em&gt;. Government only serves the interests of&lt;br&gt;
the donor class. If you aren't punting &lt;em&gt;serious moola&lt;/em&gt; to your&lt;br&gt;
"representative," he, she, or &lt;em&gt;trans-it&lt;/em&gt;, doesn't give a ding-dong&lt;br&gt;
damn about your tiresome little problems. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>productivity</category>
      <category>career</category>
      <category>discuss</category>
    </item>
    <item>
      <title>"Managing" a SQLite Database with J (Part 2)</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Tue, 09 Aug 2022 16:07:00 +0000</pubDate>
      <link>https://dev.to/bakerjd99/managing-a-sqlite-database-with-j-part-2-23pk</link>
      <guid>https://dev.to/bakerjd99/managing-a-sqlite-database-with-j-part-2-23pk</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog: &lt;a href="https://analyzethedatanotthedrivel.org/2022/08/08/managing-a-sqlite-database-with-j-part-2/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When faced with unfamiliar program code &lt;a href="https://rogerhui.rip/"&gt;Roger Hui&lt;/a&gt;, the cocreator of the &lt;a href="https://code2.jsoftware.com/wiki/Main_Page"&gt;J Programming Language&lt;/a&gt; would sometimes wipe out all the code comments. He told me this forced him to concentrate on the actual code and not the &lt;em&gt;opinions&lt;/em&gt; of previous developers. Roger was an exceptional programmer; he knew that program comments are often dated, misleading and wrong. And, the only way to "correct the comments" is to read the code; so why not just read the code and skip the philosophy? Roger's comment purges underscore an important point. However you document your code, &lt;em&gt;it must be better than the null or empty document.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;With this in mind, I wrote a few versions of this post, but I hated them all. Then I realized that &lt;a href="https://github.com/bakerjd99/jacks/blob/master/jodliterate/UsingJodliterate.pdf"&gt;&lt;code&gt;jodliterate&lt;/code&gt;&lt;/a&gt; PDF documents &lt;em&gt;mostly&lt;/em&gt; do what I want. So, instead of rewriting &lt;a href="https://github.com/bakerjd99/jacks/blob/master/mirrorxref/MirrorXref.pdf"&gt;&lt;code&gt;MirrorXref.pdf&lt;/code&gt;&lt;/a&gt;, I will make a few comments about jodliterate group documents in general. If you're interested in using &lt;a href="https://sqlite.org/index.html"&gt;SQLite&lt;/a&gt; with J, download the self-contained &lt;a href="https://github.com/bakerjd99/jacks/tree/master/mirrorxref"&gt;GitHub files &lt;code&gt;MirrorXref.ijs&lt;/code&gt; and &lt;code&gt;MirrorXref.pdf&lt;/code&gt;&lt;/a&gt; and have a look.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;jodliterate Group Documents&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;jodliterate&lt;/code&gt; group documents are directly generated from &lt;a href="https://analyzethedatanotthedrivel.org/the-jod-page/"&gt;JOD&lt;br&gt;
dictionaries&lt;/a&gt; using J, &lt;a href="https://pandoc.org/"&gt;pandoc&lt;/a&gt;, and LaTeX. Group documents have three major parts.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;An overview.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Complete typeset source code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A global word index.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The source code and index sections are self-explanatory, but a few&lt;br&gt;
remarks about overviews are warranted.&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;overview&lt;/em&gt; might have several subsections. There's usually a&lt;br&gt;
hyperlinked &lt;strong&gt;Interface&lt;/strong&gt; subsection. J doesn't have formal interfaces like many programming languages. The words listed in the &lt;strong&gt;Interface&lt;/strong&gt; section are words you will use when running the script. The &lt;strong&gt;Interface&lt;/strong&gt; section highlights where to get started.&lt;/p&gt;

&lt;p&gt;For complex scripts, there is usually a &lt;strong&gt;Using&lt;/strong&gt; subsection &lt;a href="https://github.com/bakerjd99/jacks/blob/master/mirrorxref/MirrorXref.pdf"&gt;&lt;code&gt;MirrorXref.pdf&lt;/code&gt;&lt;/a&gt;'s &lt;strong&gt;Using&lt;/strong&gt; subsection is typical.&lt;/p&gt;

&lt;p&gt;Finally, there might be a &lt;strong&gt;Code Hints&lt;/strong&gt; subsection. Code Hints are lists of hyperlinked words that you should heed.&lt;/p&gt;

&lt;p&gt;For more, take Roger's advice and read the code.&lt;/p&gt;

</description>
      <category>sqlite</category>
      <category>jsoftware</category>
      <category>etl</category>
      <category>addon</category>
    </item>
    <item>
      <title>"Managing" a SQLite Database with J (Part 1)</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Tue, 09 Aug 2022 15:55:39 +0000</pubDate>
      <link>https://dev.to/bakerjd99/managing-a-sqlite-database-with-j-part-1-3pb7</link>
      <guid>https://dev.to/bakerjd99/managing-a-sqlite-database-with-j-part-1-3pb7</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog: &lt;a href="https://analyzethedatanotthedrivel.org/2022/08/03/managing-a-sqlite-database-with-j-part-1/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sqlite.org/index.html"&gt;SQLite&lt;/a&gt;, by many estimates, is the most widely deployed SQL database system on Earth. It's everywhere. It's in your phone, your laptop, your cameras, your car, your cloud, and your breakfast cereal. SQLite's global triumph is a gratifying testament to the virtues of technical excellence and the philosophy of "less is more."&lt;/p&gt;

&lt;p&gt;Readers of this blog will know that I admire the &lt;em&gt;set-theoretic&lt;/em&gt; basis of SQL while despising the &lt;em&gt;IT administrative state&lt;/em&gt; that has grown up, like septic tank scum, around it. You'll know what I'm talking about if you have ever:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Requested access to a particular &lt;em&gt;corporate&lt;/em&gt; database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Waited for DBAs to sign off on a query.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Asked for a new index.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Had the gall to install "unauthorized" software tools in&lt;br&gt;
"production."&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is so much administrative shit covering big Oracle, SQLServer, or DB2 systems that it's almost impossible to get anything done without constantly fellating &lt;em&gt;administrative&lt;/em&gt; dick!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQLite saves SQL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The creators of SQLite had another idea. Let's drop the administrative crap and provide a simple stand-alone implementation of SQL. Let's also structure our system as a single highly portable C source file that can be compiled anywhere. Furthermore, let's test the shit out of our code with comprehensive test suites that hit every line. Finally, let's fix&lt;br&gt;
our damn bugs. The SQLite developers did all this so well that SQLite slowly, then all at once, conquered the whole damn world. The SQLiters earned their dominance.&lt;/p&gt;

&lt;p&gt;So, what does this have to do with J?&lt;/p&gt;

&lt;p&gt;J provides a useful &lt;a href="https://code.jsoftware.com/wiki/Addons/data/sqlite/Overview"&gt;SQLite addon&lt;/a&gt; that lets J programmers navigate the vast cosmos of SQLite database files. I mainly use J's SQLite addon to access &lt;a href="http://www.thumbsplus.com/"&gt;ThumbsPlus&lt;/a&gt; and &lt;a href="https://www.darktable.org/"&gt;Darktable&lt;/a&gt; image databases, but it's surprisingly handy in many contexts simply because SQLite database files are everywhere. SQLite database files are so pervasive that they've become a &lt;em&gt;de facto&lt;/em&gt; binary data exchange format. If you have data in dire need of distribution, stuff them in a SQLite database file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It's still SQL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;SQLite is a superb tool, but it's still SQL, and SQL has an ancient deficiency. Loading data into SQL systems has always been a pain. So much so that a plethora of grisly, kludge-infested hacks, charitably labeled &lt;em&gt;ETL tools&lt;/em&gt; (&lt;strong&gt;E&lt;/strong&gt;xtract &lt;strong&gt;T&lt;/strong&gt;ransform and &lt;strong&gt;L&lt;/strong&gt;oad), have taken root in the &lt;em&gt;SQLverse&lt;/em&gt;. ETL tools, no matter how polished or optimized, all seem like afterthoughts.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Hey, this set-theoretic SQL shit &lt;br&gt;
would be so much better if there was&lt;br&gt;
some data in our database!"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you "manage" SQL systems, you must slay ETL dragons. SQLite hasn't fixed this, but its almost universal adoption means you can slay ETL dragons with your favorite programming spears. One of my favorite spearsis the &lt;a href="https://code2.jsoftware.com/wiki/Main_Page"&gt;J programming language&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Toy or tool&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When I started writing this post, I considered composing a short&lt;br&gt;
tutorial on how to create, load, and query a simple SQLite database with the J addon. Then I realized the Internet drips with simple SQLite &lt;em&gt;toytutorials.&lt;/em&gt; Do we really need another one? Simple SQL tutorials help when getting started, but they invariably gloss over the ETL problem. The ETL problem never goes away and it's always more complex than expected. A realistic introduction to using SQLite, or any SQL system, would keep this nasty fact front and center. So, in a following post, I will outline a small but nontrivial SQLite-based &lt;em&gt;tool&lt;/em&gt; I use to manage messy data I care about.&lt;/p&gt;

</description>
      <category>sqlite</category>
      <category>jsoftware</category>
      <category>addon</category>
      <category>etl</category>
    </item>
    <item>
      <title>Stupid J Jupyter Tricks</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Mon, 25 Jul 2022 17:48:00 +0000</pubDate>
      <link>https://dev.to/bakerjd99/stupid-j-jupyter-tricks-3o3f</link>
      <guid>https://dev.to/bakerjd99/stupid-j-jupyter-tricks-3o3f</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog &lt;a href="https://analyzethedatanotthedrivel.org/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Continuing with my software irritant dispatching theme: today's&lt;br&gt;
irritant - including &lt;em&gt;syntax colored code&lt;/em&gt; in blogs, LaTeX, Word, and other documents. This one is a persistent pain in the ass; especially for users of idiosyncratic programming languages. We all have our favorite hacks and workarounds; today I am sharing one of mine: using &lt;a href="https://jupyter.org/"&gt;jupyter&lt;/a&gt; to syntax color J code.&lt;/p&gt;

&lt;p&gt;A few years ago &lt;a href="https://github.com/martin-saurer"&gt;Martin Saurer&lt;/a&gt; made an outstanding contribution to the &lt;a href="https://code2.jsoftware.com/wiki/Main_Page"&gt;J programming language&lt;/a&gt; community. He created a &lt;a href="https://github.com/martin-saurer/jkernel"&gt;jupyter kernel for J.&lt;/a&gt; If you don't know what jupyter is please crawl out from under your rock and join the rest of us in the 21&lt;sup&gt;st&lt;/sup&gt; century.&lt;/p&gt;

&lt;p&gt;As I have said before jupyter notebooks are simply the &lt;em&gt;correct&lt;/em&gt; way to present complex "executable" technical documents that contain heavy doses of mathematics, program code, and graphics. I'm not the only one that holds this opinion: many concur. It would be an understatement to say that jupyter has been a success. It's now a dominant tool in many disciplines. If you're not using it you are missing out on one of the most useful open-source programming tools available.&lt;/p&gt;

&lt;p&gt;I came to jupyter from the &lt;em&gt;pythonverse&lt;/em&gt;. The JSON based &lt;code&gt;ipynb&lt;/code&gt;&lt;br&gt;
notebook format originated there and initially stood for &lt;strong&gt;I&lt;/strong&gt;nteractive &lt;strong&gt;PY&lt;/strong&gt;thon &lt;strong&gt;N&lt;/strong&gt;ote*&lt;em&gt;B&lt;/em&gt;&lt;em&gt;ook, but in one of the best software design decisions ever made, the creators of jupyter decided to make it *programming language agnostic.&lt;/em&gt; This means any programming language can plug into jupyter if it implements a well-defined kernel interface. This is what Martin implemented. Now, many programming languages support jupyter. It's become an expected facility, and if it's missing, well that's a good reason to use another programming language.&lt;/p&gt;

&lt;p&gt;Martin's J kernel works very well and I use it all the time, but there are a few ways the J kernel, and jupyter kernels in general, could be improved.&lt;/p&gt;
&lt;h3&gt;
  
  
  Debugging notebook code
&lt;/h3&gt;

&lt;p&gt;Debugging notebook code is not as polished as it could be. Kernels are typically "invisible" executables that respond to requests sent from the notebook server. Most kernels stay in the background and do not provide standard language-specific programming environments. Usually, this isn't much of a problem, but when debugging complex programs it would be handy to push a button and pop up a standard debugger. In J's case, a JQt or JHS session would appear containing all the code loaded or created by&lt;br&gt;
the notebook. You could then debug and modify your code, then pop over to the notebook session to try it out.&lt;/p&gt;

&lt;p&gt;Right now the J kernel does not "pop up" visible J sessions but you can achieve something similar by extracting J notebook code as a J script and executing it in another JQt or JHS session. jupyter makes this easy!&lt;/p&gt;
&lt;h3&gt;
  
  
  Stupid J Jupyter Trick #1 - export notebook code as a J script
&lt;/h3&gt;

&lt;p&gt;jupyter can download notebooks in a variety of formats. One of the&lt;br&gt;
formats provided by the J kernel is &lt;code&gt;ijs&lt;/code&gt; or J script. Go to the&lt;br&gt;
&lt;strong&gt;File&lt;/strong&gt; menu, select &lt;strong&gt;Download as&lt;/strong&gt;, then pick &lt;strong&gt;J (.ijs)&lt;/strong&gt;. jupyter will extract all the &lt;em&gt;J code cells&lt;/em&gt; in the notebook &lt;em&gt;in cell order&lt;/em&gt; and write a script file to wherever you want it. I usually save such scripts in the J temp directory &lt;code&gt;jpath ~temp&lt;/code&gt; where they can be conveniently loaded and run.&lt;/p&gt;

&lt;p&gt;When you run a downloaded J notebook script in another J session it &lt;em&gt;essentially&lt;/em&gt;&lt;sup id="fnref1"&gt;1&lt;/sup&gt; recreates the notebook kernel state.&lt;/p&gt;
&lt;h3&gt;
  
  
  Inserting J code into notebooks
&lt;/h3&gt;

&lt;p&gt;Extracting J code from notebooks for debugging is useful but oddly the inverse, inserting J code into notebooks, is even more useful. As previously noted, jupyter can export notebooks in many formats. In particular, it can generate HTML, LaTeX, Markdown, and PDF versions of notebooks. I've found the generated LaTeX, HTML, and Markdown formats extremely handy.&lt;/p&gt;
&lt;h3&gt;
  
  
  Stupid J Jupyter Trick #2 - J code notebook insertion
&lt;/h3&gt;

&lt;p&gt;If you're looking for &lt;em&gt;a quick and dirty way to syntax color J code&lt;/em&gt; do the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Create a blank jupyter notebook.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open a J script in your favorite editor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cut and Paste the script into a jupyter code cell.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Download the notebook as HTML or PDF.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;jupyter will nicely syntax color your code. It even does a decent job of breaking lines. &lt;a href="https://github.com/bakerjd99/jacks/tree/master/brandxmp/jupyter"&gt;The GitHub files&lt;/a&gt; &lt;code&gt;brandxmp.ijs.pdf&lt;/code&gt; and &lt;code&gt;brandxmp.ijs.html&lt;/code&gt; are examples of what you get right out of the box.&lt;/p&gt;
&lt;h3&gt;
  
  
  Stupid J Jupyter Trick #3 - smarter J code notebook insertion
&lt;/h3&gt;

&lt;p&gt;The quick and dirty method is simple and handy but it's of limited use when preparing complex LaTeX documents. With LaTeX, you want more than just code. You want cross-references, web links, footnotes, mathematical expressions, graphics, indexes, appendices, and other &lt;a href="https://www-cs-faculty.stanford.edu/~knuth/lp.html"&gt;&lt;em&gt;literate programming&lt;/em&gt;&lt;/a&gt; goodies. It's much easier to achieve such ends if your J code is split over many cells. The following verb does just that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;ipynbfrjod&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt;&lt;span class="kp"&gt;3&lt;/span&gt; &lt;span class="kp"&gt;:&lt;/span&gt; &lt;span class="kp"&gt;0&lt;/span&gt;

&lt;span class="c1"&gt;NB.*ipynbfrjod v-- extract J  words from JOD  and insert in blank&lt;/span&gt;
&lt;span class="c1"&gt;NB. jupyter notebook.&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB. monad:  clIpynb =. ipynbfrjod blclNames&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB.   NB. examples use docs and utils&lt;/span&gt;
&lt;span class="c1"&gt;NB.   require 'general/jod'&lt;/span&gt;
&lt;span class="c1"&gt;NB.   od ;:'docs utils'&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB.   nbj=: ipynbfrjod ;:'sha1 sha1dir'&lt;/span&gt;
&lt;span class="c1"&gt;NB.   nbj write 'C:\Users\baker\jupyter_notebooks\test0.ipynb'&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB.   nbj=: ipynbfrjod }. grp 'ipynb'&lt;/span&gt;
&lt;span class="c1"&gt;NB.   nbj write 'C:\Users\baker\jupyter_notebooks\ipynb_onself.ipynb'&lt;/span&gt;

&lt;span class="c1"&gt;NB. require 'general/jod' !(*)=. disp&lt;/span&gt;
&lt;span class="n"&gt;jc&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;disp&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;y&lt;/span&gt;

&lt;span class="c1"&gt;NB. markdown sections with word name&lt;/span&gt;
&lt;span class="n"&gt;sec&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;dblquote&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;MDSECTION&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;JWORDMARK&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;y&lt;/span&gt;
&lt;span class="n"&gt;sec&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;NBJCELLBEGst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;sec&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;NBJCELLBEGen&lt;/span&gt;

&lt;span class="c1"&gt;NB. j code to quoted list of python strings notebook format&lt;/span&gt;
&lt;span class="n"&gt;nbj&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="o"&gt;;.&lt;/span&gt;&lt;span class="m"&gt;_2&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;REVPYESCAPECHRS&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;changestr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;tlf&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;jc&lt;/span&gt;
&lt;span class="n"&gt;nbj&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt; &lt;span class="o"&gt;L:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="s1"&gt;'\n",'&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;LF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt; &lt;span class="o"&gt;L:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="n"&gt;nbj&lt;/span&gt;
&lt;span class="n"&gt;nbj&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'\n",'&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;beforelaststr&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;nbj&lt;/span&gt;
&lt;span class="n"&gt;nbj&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;sec&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;nbj&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;.&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;NBJCELLEND&lt;/span&gt;
&lt;span class="n"&gt;toJ&lt;/span&gt; &lt;span class="n"&gt;NBHEADER&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LF&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt; &lt;span class="s1"&gt;','&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;beforelaststr&lt;/span&gt; &lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="n"&gt;nbj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NBTRAILER&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After loading J words into cells all the jupyter export facilities are immediately available. Examples of jupyter outputs exploiting this hack are &lt;a href="https://github.com/bakerjd99/jacks/tree/master/ipynbfrjod"&gt;available on GitHub:&lt;/a&gt; see &lt;code&gt;ipynb_onself.ipynb&lt;/code&gt; and &lt;code&gt;ipynb_onself.pdf&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To use &lt;code&gt;ipynbfrjod&lt;/code&gt; download &lt;code&gt;ipynb.ijs&lt;/code&gt; and the corresponding document &lt;code&gt;ipynb.pdf&lt;/code&gt;. All these files &lt;a href="https://github.com/bakerjd99/jacks/tree/master/ipynbfrjod"&gt;are on GitHub&lt;/a&gt;. Help yourself!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Postscript:&lt;/strong&gt; If you already have a recent version of J installed you can directly install this script as an &lt;code&gt;addon&lt;/code&gt; by typing the following command in a JQt or JHS session.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="s1"&gt;'github:bakerjd99/jackshacks'&lt;/span&gt;

&lt;span class="c1"&gt;NB. load script&lt;/span&gt;
&lt;span class="n"&gt;load&lt;/span&gt; &lt;span class="s1"&gt;'~addons/jacks/ipynb.ijs'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;Some care must be taken when developing J jupyter notebooks to guarantee exported scripts accurately recreate kernel states. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>jsoftware</category>
      <category>jupyter</category>
      <category>j</category>
      <category>literateprogramming</category>
    </item>
    <item>
      <title>Branding XMP Sidecar Files with J</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Tue, 19 Jul 2022 21:18:00 +0000</pubDate>
      <link>https://dev.to/bakerjd99/branding-xmp-sidecar-files-with-j-33pf</link>
      <guid>https://dev.to/bakerjd99/branding-xmp-sidecar-files-with-j-33pf</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog &lt;a href="https://analyzethedatanotthedrivel.org/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;During my long and not-so-storied IT career I covertly dispatched&lt;br&gt;
"irritants" with the &lt;a href="https://code.jsoftware.com/wiki/Main_Page"&gt;J programming language&lt;/a&gt;. An irritant is an annoying software problem that does not merit a full-blown project-style approach. The last thing you want when dealing with irritants is to drag in IT bureaucrats. You know of whom I speak: the niggling little project-managing creatures that always turn little problems into bigger problems. Do you remember "Agile" when it was actually Agile? Now that I am &lt;em&gt;tentatively retired&lt;/em&gt;&lt;sup id="fnref1"&gt;1&lt;/sup&gt; I no longer deal with IT bureaucrats but irritants still abound and I still use J to dispatch them. In my next three posts, I will show J solutions to some irritants that may be annoying you.&lt;/p&gt;

&lt;p&gt;Today's irritant comes from my photography hobby. I have been taking pictures since I was eight years old. To give you a sense of how &lt;em&gt;anal-retentive&lt;/em&gt; I am, I still have film negatives from childhood and I'm nearly seventy! Since the year 2000, I have been editing and uploading digital images. You can see many of them on &lt;a href="https://conceptcontrol.smugmug.com/"&gt;my SmugMug site&lt;/a&gt;. My editing process uses a veritable software zoo. As I work on an image, I flip from one program to another. Some programs respect and maintain image metadata and others do not. At the end of my pipeline, I rename the image file to something meaningful&lt;sup id="fnref2"&gt;2&lt;/sup&gt; even though renaming makes it difficult to go back to the original file later.&lt;/p&gt;

&lt;p&gt;If you edit images, you know the job is never done. You can always&lt;br&gt;
"improve" a picture or cast it in a new light. I resist the urge to revamp old renderings but sometimes, maybe years later, I will go back to the original to "fix" things. Unfortunately, by renaming files it's not always easy to find the original image.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is a common DAM (Digital Asset Management) problem!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There are scores of image database programs, and you would think that image databases would generate stable image keys. Most of them do, but their keys are typically useful only in the context of the database. You need a key tied to the unique bit pattern of the original image file. You need a proper hash.&lt;/p&gt;

&lt;p&gt;Once you recognize the need for proper hashes the question becomes, "Where do you store them?" You might think a file's &lt;a href="https://en.wikipedia.org/wiki/Exif"&gt;EXIF&lt;/a&gt; or &lt;a href="https://www.iptc.org/standards/photo-metadata/"&gt;ITPC&lt;/a&gt; metadata would be the ideal location. I tried this approach. Using the superb &lt;a href="https://exiftool.org/"&gt;exiftool&lt;/a&gt;, I inserted hashes in images only to watch various editors strip or mangle them later on. The &lt;a href="https://www.adobe.com/products/xmp.html"&gt;XMP&lt;/a&gt; (e*&lt;em&gt;X&lt;/em&gt;&lt;em&gt;tensible **M&lt;/em&gt;&lt;em&gt;etadata **P&lt;/em&gt;*latform) sidecar&lt;sup id="fnref3"&gt;3&lt;/sup&gt; file was created to address this and other irritants. The XMP sidecar file provides a safe home for image metadata with the added bonus that you never need to touch or alter the original image file. If you don't treat your original RAWs like sacred bytes there's something wrong with you.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.darktable.org/"&gt;Darktable&lt;/a&gt;, my current favorite RAW developer, creates and maintains XMPs when you add images to its library or apply &lt;em&gt;non-destructive&lt;/em&gt; edits. Darktable's edits are stored in sidecar XMPs leaving the originals unmolested. When you export a developed image Darktable creates a copy of the original, applies all the edits stored in the sidecar file, and then inserts &lt;a href="https://www.dublincore.org/"&gt;Dublin Core&lt;/a&gt; metadata elements in the copy's IPTC metadata. Most image processors leave Dublin Core elements alone so they make a nice home for hash hacks.&lt;/p&gt;

&lt;p&gt;This following J verb modifies the &lt;code&gt;title&lt;/code&gt; element of a sidecar XMP file. It inserts the original file name and the &lt;a href="https://www.texttool.com/sha256-online"&gt;SHA256&lt;/a&gt; hash of the associated image file. The &lt;code&gt;title&lt;/code&gt; element is typically left intact by image editors. So, at the end of your edits, you retain the name of the original file and its stable unique hash.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;titbrandxmp&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt;&lt;span class="kp"&gt;3&lt;/span&gt; &lt;span class="kp"&gt;:&lt;/span&gt; &lt;span class="kp"&gt;0&lt;/span&gt;

&lt;span class="c1"&gt;NB.*titbrandxmp v--  brand xmp  sidecar file  with file name  and&lt;/span&gt;
&lt;span class="c1"&gt;NB. hash of associated image.&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB. monad:  clXmp =. titbrandxmp blImageXmpFiles&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB.   xmp=. 'c:/pictures/2022/Idaho/07_jul/d7500/_DSC8496.NEF.xmp'&lt;/span&gt;
&lt;span class="c1"&gt;NB.   ps=. xmp ;~ (-#'.xmp') }. xmp&lt;/span&gt;
&lt;span class="c1"&gt;NB.   titbrandxmp ps&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB.   ds=. sidecars 'c:/pictures/2022/North Rim Monument Valley/06_jun/d7500'&lt;/span&gt;
&lt;span class="c1"&gt;NB.   xmps=. titbrandxmp&amp;amp;.&amp;gt; &amp;lt;"1 ds&lt;/span&gt;

&lt;span class="n"&gt;xmp&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;read&lt;/span&gt; &lt;span class="n"&gt;xmp&lt;/span&gt; &lt;span class="nf"&gt;[&lt;/span&gt; &lt;span class="s1"&gt;'raw xmp'&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="bp"&gt;y&lt;/span&gt;

&lt;span class="c1"&gt;NB. single Dubin Core publisher and creator &lt;/span&gt;
&lt;span class="c1"&gt;NB. elements must exist to safely brand&lt;/span&gt;
&lt;span class="n"&gt;dcp&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="s1"&gt;'&amp;lt;/dc:publisher&amp;gt;'&lt;/span&gt;&lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="s1"&gt;'&amp;lt;/dc:creator&amp;gt;'&lt;/span&gt;
&lt;span class="k"&gt;if.&lt;/span&gt; &lt;span class="nf"&gt;-.&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nf"&gt;-:&lt;/span&gt; &lt;span class="nf"&gt;+&lt;/span&gt;&lt;span class="o"&gt;/"&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="n"&gt;dcp&lt;/span&gt; &lt;span class="nf"&gt;E.&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;xmp&lt;/span&gt; &lt;span class="k"&gt;do.&lt;/span&gt; &lt;span class="n"&gt;xmp&lt;/span&gt; &lt;span class="k"&gt;return.&lt;/span&gt; &lt;span class="k"&gt;end.&lt;/span&gt;

&lt;span class="c1"&gt;NB. file name and sha256 brand&lt;/span&gt;
&lt;span class="n"&gt;tit&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;dlf&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/[~(fhash)~]/'&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;shabrand&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;changestr&lt;/span&gt; &lt;span class="n"&gt;XMPTITLEFRAG&lt;/span&gt;&lt;span class="nf"&gt;-.&lt;/span&gt;&lt;span class="n"&gt;CR&lt;/span&gt;

&lt;span class="c1"&gt;NB. replace or insert title element&lt;/span&gt;
&lt;span class="s1"&gt;'idx cxmp'&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="s1"&gt;'dc:title'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;cutnestidx&lt;/span&gt; &lt;span class="n"&gt;xmp&lt;/span&gt;
&lt;span class="k"&gt;if.&lt;/span&gt; &lt;span class="nf"&gt;#&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="k"&gt;do.&lt;/span&gt; &lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;tit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="n"&gt;cxmp&lt;/span&gt;
&lt;span class="k"&gt;else.&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pt&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt; &lt;span class="n"&gt;pt&lt;/span&gt; &lt;span class="n"&gt;beforestr&lt;/span&gt; &lt;span class="n"&gt;xmp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;LF&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;tit&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;pt&lt;/span&gt; &lt;span class="n"&gt;afterstr&lt;/span&gt; &lt;span class="n"&gt;xmp&lt;/span&gt; &lt;span class="nf"&gt;[&lt;/span&gt; &lt;span class="n"&gt;pt&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="nf"&gt;{&lt;/span&gt;&lt;span class="n"&gt;dcp&lt;/span&gt;
&lt;span class="k"&gt;end.&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For details on how to use &lt;code&gt;titbrandxmp&lt;/code&gt; refer to the J script &lt;a href="https://github.com/bakerjd99/jacks/blob/master/brandxmp/brandxmp.ijs"&gt;&lt;code&gt;brandxmp.ijs&lt;/code&gt;&lt;/a&gt; and the associated document &lt;a href="https://github.com/bakerjd99/jacks/blob/master/brandxmp/brandxmp.pdf"&gt;&lt;code&gt;brandxmp.pdf&lt;/code&gt;&lt;/a&gt;. Both of these files are &lt;a href="https://github.com/bakerjd99/jacks/tree/master/brandxmp"&gt;available on GitHub&lt;/a&gt;. Enjoy!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Postscript:&lt;/strong&gt; If you already have a recent version of J installed you can directly install this script as an &lt;code&gt;addon&lt;/code&gt; by typing the following command in a JQt or JHS session.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="s1"&gt;'github:bakerjd99/jackshacks'&lt;/span&gt;

&lt;span class="c1"&gt;NB. load script&lt;/span&gt;
&lt;span class="n"&gt;load&lt;/span&gt; &lt;span class="s1"&gt;'~addons/jacks/brandxmp.ijs'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;I am hoping that rampaging inflation (2022) is checked so I don't&lt;br&gt;
have to come out of retirement again! ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;Directories filled with hundreds of names like &lt;code&gt;_DSC8007.NEF&lt;/code&gt; and&lt;br&gt;
&lt;code&gt;IMG_6666.jpg&lt;/code&gt; are useless without thumbnail software. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;Sidecar files are generated and maintained by many imaging&lt;br&gt;
processing programs. The XMP format is an XML text file. It can be&lt;br&gt;
easily parsed and changed. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>j</category>
      <category>jsoftware</category>
      <category>xmp</category>
      <category>darktable</category>
    </item>
    <item>
      <title>JETL - J Extract Transform and Load</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Thu, 12 Aug 2021 18:28:37 +0000</pubDate>
      <link>https://dev.to/bakerjd99/jetl-j-extract-transform-and-load-9cm</link>
      <guid>https://dev.to/bakerjd99/jetl-j-extract-transform-and-load-9cm</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog &lt;a href="https://analyzethedatanotthedrivel.org/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I have been surreptitiously using the &lt;a href="https://code.jsoftware.com/wiki/NuVoc"&gt;J programming language&lt;/a&gt; on the job for many years. With rare exceptions, I haven't advertised my clandestine use of J simply because it's strange and scary and nonstandard, and IT departments are filled with cowardly little bedwetters that run away like &lt;a href="https://www.youtube.com/watch?v=BZwuTo7zKM8"&gt;Brave Sir Robin&lt;/a&gt; when confronted with something really different! At first, I tried to convince the bedwetters that J is a fabulous tool: a spear in a world of bent spoons! But, if you've ever tried convincing an &lt;em&gt;academic Marxist&lt;/em&gt; that, yes &lt;a href="https://www.quora.com/Why-do-communists-insist-that-real-communism-has-never-been-tried?share=1"&gt;"communism has been tried"&lt;/a&gt;, or a &lt;em&gt;ufologist&lt;/em&gt; that funny lights in the sky are almost certainly not aliens, or a 9-11 &lt;em&gt;troofer&lt;/em&gt; that the twin towers were &lt;em&gt;absolutely not brought down by controlled demolition&lt;/em&gt;, then you have a pretty good idea what it's like getting IT departments to buy into unsanctioned technologies.&lt;/p&gt;

&lt;p&gt;It's sad, but some of the most pig-headed and closed-minded people I have ever run into are fellow programmers and IT professionals. As a &lt;em&gt;self-anointed&lt;/em&gt; elite, ITites (IT people), can be insufferably sure of themselves. Hubris is still very much a thing! &lt;em&gt;As a general rule, always consider the possibility that you are full of it!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;With that said, what's so great about J and why should anyone use it for database &lt;a href="https://www.ibm.com/cloud/learn/etl"&gt;ETL&lt;/a&gt; (Extract Transform and Load) jobs?&lt;/p&gt;

&lt;h4&gt;
  
  
  ETL unavoidable grunt work and an IT Achilles heel
&lt;/h4&gt;

&lt;p&gt;ETL jobs are never glamorous! You're not going to win any &lt;a href="https://amturing.acm.org/"&gt;Turing Awards&lt;/a&gt; for converting dumpster-fire data into well-structured database tables. Nor will you garner accolades and &lt;a href="https://www.youtube.com/watch?v=jjaqrPpdQYc"&gt;"We are not worthies"&lt;/a&gt; from peers. Most ITites simply assume the ETL will be done while they dream of all the shiny things the polished data will enable. &lt;em&gt;Precisely because ETL is dull, any moron can do it, grunt work, it's frequently discounted.&lt;/em&gt; I've watched many IT projects sink after being torpedoed by incomplete and crappy ETL. ETL is an IT Achilles heel; it can and has brought the mighty down.&lt;/p&gt;

&lt;p&gt;Standard ETL jobs always consist of odd combinations of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Gather data in all its unstructured messy glory.&lt;/li&gt;
&lt;li&gt;Pick and choose the &lt;a href="https://www.youtube.com/watch?v=wPiHQ37gXnE"&gt;good bits&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Hack and standardize your selection into rows and tables.&lt;/li&gt;
&lt;li&gt;Load the tables into mostly SQL databases.&lt;/li&gt;
&lt;li&gt;Finish the job with SQL transformations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every single item on this list can drain your &lt;a href="https://www.youtube.com/watch?v=N1KvgtEnABY"&gt;precious bodily fluids&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is where J can help!&lt;/p&gt;

&lt;h4&gt;
  
  
  Slay the Annoyances and Sack the Irritants with J
&lt;/h4&gt;

&lt;p&gt;It's not an exaggeration to say that ETL is a long tiresome sequence of annoyances and irritants. It's easy to see where you need to go, and it's just &lt;em&gt;N&lt;/em&gt;, where &lt;em&gt;N&lt;/em&gt; is an arbitrarily large number of picayune and irritating steps between you and the pure database of your dreams.&lt;/p&gt;

&lt;p&gt;Anything you can do to quickly slay the annoyances and sack the irritants will move you closer to your dream. The last thing you want is to write a lot of code to dispatch trivialities, but sadly, this is often the fate of ETL programmers. Like Sisyphus, they laboriously push one damn rock up the hill only to watch another roll down and take its place. When it comes to moving rocks, you want a bulldozer, not a garden spade. J is a bulldozer!&lt;/p&gt;

&lt;p&gt;You can install Windows, MacOS or Linux versions of J from &lt;a href="https://www.jsoftware.com/#/"&gt;jsoftware.com&lt;/a&gt;. Install J with all the &lt;a href="https://code.jsoftware.com/wiki/Pacman"&gt;standard addons&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let's take the J bulldozer for a spin.&lt;/p&gt;

&lt;h4&gt;
  
  
  Pimping your J bulldozer with JOD
&lt;/h4&gt;

&lt;p&gt;J scripts are simple ASCII text files with an &lt;code&gt;.ijs&lt;/code&gt; file extension. Edit them with any standard editor. Two of J's standard development environments &lt;a href="https://code.jsoftware.com/wiki/Guides/Qt_IDE"&gt;JQT&lt;/a&gt; and &lt;a href="https://code.jsoftware.com/wiki/Guides/JHS"&gt;JHS&lt;/a&gt; have built-in editors that provide expected goodies like syntax highlighting. Many editor extensions provide J syntax highlighting. One of my favorites is the &lt;em&gt;J Language Support&lt;/em&gt; Visual Studio Code extension. It can be found in &lt;a href="https://github.com/tikkanz/j-vscode"&gt;GitHub here&lt;/a&gt; and installed from the standard Visual Studio Code Marketplace.&lt;/p&gt;

&lt;p&gt;Despite the ubiquity of J oriented text editors and script handling tools I store all my J code, test cases, and documentation in a coding tool I developed called JOD. &lt;a href="https://bakerjd99.files.wordpress.com/2020/11/jod.pdf"&gt;JOD is a refactoring tool&lt;/a&gt;. It lets you easily reuse J words in arbitrary contexts without error prone &lt;em&gt;cut-and-pasting&lt;/em&gt; or rampant &lt;em&gt;over-inclusion&lt;/em&gt;. I use JOD from three J environments, jconsole, JHS, and JQT. &lt;a href="https://code.jsoftware.com/wiki/Addons/general/jod"&gt;JOD is one of many J addons&lt;/a&gt;. It can be installed with the &lt;a href="https://code.jsoftware.com/wiki/Pacman"&gt;J pacman utility&lt;/a&gt;. JOD has been entirely programmed in J and is a good example of using J to build larger systems. &lt;a href="https://github.com/bakerjd99/jod"&gt;JOD source code is on Github&lt;/a&gt;. See &lt;a href="https://analyzethedatanotthedrivel.org/the-jod-page/"&gt;The JOD Page&lt;/a&gt; for more about JOD.&lt;/p&gt;

&lt;p&gt;In the following examples I am going to assume JOD is installed and the utilities &lt;em&gt;dictionary&lt;/em&gt; &lt;code&gt;utils&lt;/code&gt; is available. The latest version of &lt;code&gt;utils&lt;/code&gt; is on &lt;a href="https://github.com/bakerjd99/joddumps"&gt;GitHub here&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;load&lt;/span&gt; &lt;span class="s1"&gt;'general/jod'&lt;/span&gt;
&lt;span class="n"&gt;get&lt;/span&gt; &lt;span class="s1"&gt;'portchars'&lt;/span&gt; &lt;span class="nf"&gt;[&lt;/span&gt; &lt;span class="n"&gt;od&lt;/span&gt; &lt;span class="s1"&gt;'utils'&lt;/span&gt; &lt;span class="nf"&gt;[&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="n"&gt;od&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;
&lt;span class="c1"&gt;NB. use ASCII box characters - simplifies blog formatting&lt;/span&gt;
&lt;span class="n"&gt;portchars&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Start your JETL engine
&lt;/h4&gt;

&lt;p&gt;Let's slay some annoyances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Annoyance #1:&lt;/strong&gt; For reasons best not discussed in civil society, many websites, point-of-sale systems, spreadsheets, and so forth encourage users to compose not-so-great novels in text entry fields. When these "novels" land in data feeds you see crap like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;textfield&lt;/span&gt; &lt;span class="p"&gt;=:&lt;/span&gt; &lt;span class="kp"&gt;0&lt;/span&gt; &lt;span class="kp"&gt;:&lt;/span&gt; &lt;span class="kp"&gt;0&lt;/span&gt;
&lt;span class="sh"&gt;  Leading and trailing
       white     space, tabs,
 line    feeds,
and bytes @*!!
      of crap.
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's easy to fix crap once its loaded into a SQL table but it's crap like this that get's in the way of loading the table in the first place. The following J expression cleans &lt;code&gt;textfield&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;]&lt;/span&gt; &lt;span class="nf"&gt;#&lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt; &lt;span class="nf"&gt;[:&lt;/span&gt; &lt;span class="nf"&gt;-.&lt;/span&gt; &lt;span class="s1"&gt;'  '&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nf"&gt;E.&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;textfield&lt;/span&gt; &lt;span class="nf"&gt;-.&lt;/span&gt; &lt;span class="n"&gt;CRLF&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;TAB&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'@#*!'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Leading and trailing white space, tabs, line feeds, and bytes of crap.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;utils&lt;/code&gt; dictionary contains many character list verbs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;get&lt;/span&gt; &lt;span class="nf"&gt;;:&lt;/span&gt;&lt;span class="s1"&gt;'reb allwhitetrim'&lt;/span&gt;
&lt;span class="n"&gt;allwhitetrim&lt;/span&gt; &lt;span class="n"&gt;reb&lt;/span&gt; &lt;span class="n"&gt;textfield&lt;/span&gt; &lt;span class="nf"&gt;-.&lt;/span&gt; &lt;span class="n"&gt;CRLF&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;TAB&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'@#*!'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Leading and trailing white space, tabs, line feeds, and bytes of crap.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Annoyance #2:&lt;/strong&gt; Here's another way too common &lt;a href="https://www.spreadsheetsmadeeasy.com/understanding-csv-files-in-excel/"&gt;&lt;em&gt;Excel&lt;/em&gt;&lt;/a&gt; enabled irritant.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;quotecomma&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt; &lt;span class="kp"&gt;0&lt;/span&gt; &lt;span class="kp"&gt;:&lt;/span&gt; &lt;span class="kp"&gt;0&lt;/span&gt;
&lt;span class="sh"&gt;"how often",1,2,"have quoted , commas","screwed ,,, line parsing?"
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;getrx&lt;/span&gt; &lt;span class="s1"&gt;'parsecsv'&lt;/span&gt;  &lt;span class="c1"&gt;NB. load parsecsv and words it calls&lt;/span&gt;
&lt;span class="s1"&gt;','&lt;/span&gt; &lt;span class="n"&gt;parsecsv&lt;/span&gt; &lt;span class="n"&gt;quotecomma&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+-----------+-+-+----------------------+---------------------------+
|"how often"|1|2|"have quoted , commas"|"screwed ,,, line parsing?"|
+-----------+-+-+----------------------+---------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example this 63MB &lt;code&gt;csv&lt;/code&gt; file contains dreaded commas in quotes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="n"&gt;dir&lt;/span&gt; &lt;span class="s1"&gt;'\\jfsdev04\Shares\SwiftIQPreprocess\SwiftZipCsv\Item*.csv'&lt;/span&gt;
&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="n"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt; &lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="n"&gt;dir&lt;/span&gt; &lt;span class="s1"&gt;'\\jfsdev04\Shares\SwiftIQPreprocess\SwiftZipCsv\Item*.csv'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ItemSales-0810202163523.csv    63958716 10-Aug-21 18:37:42
//jfsdev04/Shares/SwiftIQPreprocess/SwiftZipCsv/ItemSales-0810202163523.csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A simple J &lt;em&gt;verb&lt;/em&gt; can flip the offenders to semi-colons.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;getrx&lt;/span&gt; &lt;span class="nf"&gt;;:&lt;/span&gt;&lt;span class="s1"&gt;'repdqchars read'&lt;/span&gt;
&lt;span class="nf"&gt;#&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt; &lt;span class="s1"&gt;',;'&lt;/span&gt; &lt;span class="n"&gt;repdqchars&lt;/span&gt; &lt;span class="n"&gt;read&lt;/span&gt; &lt;span class="n"&gt;fi&lt;/span&gt; &lt;span class="c1"&gt;NB. flip commas - return byte count&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;63958716
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here's the definition of &lt;code&gt;repdqchars&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;repdqchars&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt;&lt;span class="kp"&gt;4&lt;/span&gt; &lt;span class="kp"&gt;:&lt;/span&gt; &lt;span class="kp"&gt;0&lt;/span&gt;

&lt;span class="c1"&gt;NB.*repdqchars v-- replace double quoted (0{x) characters with (1{x).&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB. dyad:  cl =. clPair repdqchars cl&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB.   s=. '"go ahead, replace","quoted commas,,,,"'&lt;/span&gt;
&lt;span class="c1"&gt;NB.   ',;' repdqchars s&lt;/span&gt;
&lt;span class="c1"&gt;NB.   s -: ',,' repdqchars s&lt;/span&gt;

&lt;span class="k"&gt;if.&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="nf"&gt;&amp;lt;:&lt;/span&gt; &lt;span class="nf"&gt;#&lt;/span&gt;&lt;span class="bp"&gt;y&lt;/span&gt; &lt;span class="k"&gt;do.&lt;/span&gt;
  &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="s1"&gt;'unbalanced quotes'&lt;/span&gt;
  &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="n"&gt;assert&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="nf"&gt;=&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="nf"&gt;|&lt;/span&gt; &lt;span class="nf"&gt;+&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="nf"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;y&lt;/span&gt;

  &lt;span class="c1"&gt;NB. mask of quoted characters&lt;/span&gt;
  &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="nf"&gt;+&lt;/span&gt;&lt;span class="o"&gt;/\&lt;/span&gt; &lt;span class="m"&gt;_1&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nf"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;|:&lt;/span&gt; &lt;span class="m"&gt;_2&lt;/span&gt; &lt;span class="nf"&gt;]&lt;/span&gt;&lt;span class="o"&gt;\&lt;/span&gt; &lt;span class="nf"&gt;I.&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
  &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="n"&gt;assert&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nf"&gt;-:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;&amp;lt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;&amp;gt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;

  &lt;span class="c1"&gt;NB. replace quoted (0{x) with (1{x)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="nf"&gt;{&lt;/span&gt;&lt;span class="bp"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;I.&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="nf"&gt;#&lt;/span&gt;&lt;span class="o"&gt;^:&lt;/span&gt;&lt;span class="m"&gt;_1&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="nf"&gt;{&lt;/span&gt;&lt;span class="bp"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;=&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="nf"&gt;#&lt;/span&gt;&lt;span class="bp"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="bp"&gt;y&lt;/span&gt;
&lt;span class="k"&gt;else.&lt;/span&gt;
  &lt;span class="bp"&gt;y&lt;/span&gt;
&lt;span class="k"&gt;end.&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Much of &lt;code&gt;repdqchars&lt;/code&gt; consists of comments and assertions. Not having to write a lot of code is a big plus when slaying annoyances and sacking irritants.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Annoyance #3:&lt;/strong&gt; When defining database tables, it helps to understand how data is distributed --- &lt;em&gt;duh!&lt;/em&gt; Before creating tables, it's helpful to run quick-and-dirty analyses. One useful summary is &lt;em&gt;column cardinality&lt;/em&gt;, the number of unique column items.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. read and parse TAB delimited text - first row column names&lt;/span&gt;
&lt;span class="n"&gt;get&lt;/span&gt; &lt;span class="s1"&gt;'readtd2'&lt;/span&gt;
&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt; &lt;span class="s1"&gt;'\\jfsdev04\Shares\DailyXMLPreprocess\DailyTsv\raw_RetailItemGroup_RMI.txt'&lt;/span&gt;
&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt; &lt;span class="n"&gt;readtd2&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;
&lt;span class="c1"&gt;NB. comput column cardinalties&lt;/span&gt;
&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="s1"&gt;'row count: '&lt;/span&gt;&lt;span class="nf"&gt;,":#&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="nf"&gt;{&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt; &lt;span class="nf"&gt;,.&lt;/span&gt; &lt;span class="nf"&gt;#&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nf"&gt;~.&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="o"&gt;"&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nf"&gt;|:&lt;/span&gt; &lt;span class="nf"&gt;}.&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;row count: 68604
+-----+------------------------+
| 2975|retailItemGrpID         |
| 2784|retailItemGrpName       |
|  316|retailItemGrpDescription|
|  283|orgUnitOwnerID          |
|22260|rmiID                   |
|21930|rmiName                 |
|  255|rmiExtID                |
|    1|ZipFileName             |
|    1|ZipRunNumber            |
|    1|XMLFileName             |
|    1|XMLFileDate             |
|    1|CreateDatetime          |
+-----+------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;J is hardly unique in its ability to subdue ETL irritants. I've personally used Python, Powershell, and SQL scripts as well but for me J administers the biggest bang per byte! &lt;strong&gt;Adding J to your coding arsenal will make you a more versatile programmer and a better person. Do it!&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  The Joys of JOD and other code databases
&lt;/h4&gt;

&lt;p&gt;One of the many advantages of storing code in tools like JOD is that it's easy to run ad hoc code analysis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;load&lt;/span&gt; &lt;span class="s1"&gt;'general/jod'&lt;/span&gt;
&lt;span class="n"&gt;od&lt;/span&gt; &lt;span class="nf"&gt;;:&lt;/span&gt;&lt;span class="s1"&gt;'jacksons utils'&lt;/span&gt; &lt;span class="nf"&gt;[&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="n"&gt;od&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt; &lt;span class="c1"&gt;NB. open JOD dictionaries&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+-+-----------------+--------+-----+
|1|opened (rw/ro) -&amp;gt;|jacksons|utils|
+-+-----------------+--------+-----+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here's some basic information about the JOD group &lt;code&gt;swiftprep&lt;/code&gt;. J scripts are easily generated from groups with JOD's &lt;code&gt;mls&lt;/code&gt;, (make load script) verb.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;":#}.&lt;/span&gt; &lt;span class="n"&gt;grp&lt;/span&gt; &lt;span class="s1"&gt;'swiftprep'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;': words in script'&lt;/span&gt;
&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;":#;{:&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="n"&gt;mls&lt;/span&gt; &lt;span class="s1"&gt;'swiftprep'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;': size of commented script'&lt;/span&gt;
&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;":#;{:&lt;/span&gt;&lt;span class="n"&gt;compj&lt;/span&gt; &lt;span class="nf"&gt;}.&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="n"&gt;grp&lt;/span&gt; &lt;span class="s1"&gt;'swiftprep'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;': size of minimized script'&lt;/span&gt;
&lt;span class="c1"&gt;NB. a single static "noun" of tables and column names&lt;/span&gt;
&lt;span class="n"&gt;tcl&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;}.&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="n"&gt;grp&lt;/span&gt; &lt;span class="s1"&gt;'swiftprep'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;-.&lt;/span&gt; &lt;span class="nf"&gt;&amp;lt;&lt;/span&gt;&lt;span class="s1"&gt;'SwiftLandTableHeaders'&lt;/span&gt;
&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;":#;{:&lt;/span&gt;&lt;span class="n"&gt;compj&lt;/span&gt; &lt;span class="n"&gt;tcl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;': minimized script without largest noun'&lt;/span&gt;
&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;":&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;+&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;%&lt;/span&gt; &lt;span class="nf"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="nf"&gt;{&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt; &lt;span class="n"&gt;get&lt;/span&gt; &lt;span class="nf"&gt;}.&lt;/span&gt; &lt;span class="n"&gt;grp&lt;/span&gt; &lt;span class="s1"&gt;'swiftprep'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;': mean bytes per word'&lt;/span&gt;
&lt;span class="n"&gt;smoutput&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;":#;:&lt;/span&gt; &lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="nf"&gt;{&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="n"&gt;mls&lt;/span&gt; &lt;span class="s1"&gt;'swiftprep'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;': total commented J token count'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;106: words in script
42221: size of commented script
23403: size of minimized script
16492: minimized script without largest noun
529.91509: mean bytes per word
5945: total commented J token count
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As you can see there isn't a lot of code, roughly 17K.&lt;/p&gt;

&lt;p&gt;As nice as off-the-cuff code statistics are, the biggest advantage of code databases is how they simplify code recycling and document&lt;br&gt;
generation. The ETL &lt;code&gt;swiftprep&lt;/code&gt; group currently contains 106 J words. 69 of those words are specific to &lt;code&gt;swiftprep&lt;/code&gt;. The rest are common utilities and recycled words. When &lt;code&gt;swiftprep.ijs&lt;/code&gt; is generated with JOD's &lt;code&gt;mls&lt;/code&gt; verb all of &lt;code&gt;swiftprep&lt;/code&gt;\'s words are collected into a single stand-alone script. For such stand-alone scripts, I include an &lt;em&gt;interface noun&lt;/em&gt; that lists group words you should be aware of.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;get&lt;/span&gt; &lt;span class="s1"&gt;'IFACEWORDSswiftprep'&lt;/span&gt;
&lt;span class="n"&gt;hlpnl&lt;/span&gt; &lt;span class="n"&gt;IFACEWORDSswiftprep&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+-------------------+-------------------------------------------------------+
|dailyswift         |daily SwiftIQ zip loading and maintenance tasks        |
|fullheaderscan     |scan SwiftIQ zips and build complete table headers     |
|loadswiftcsvs      |copies zip to working directory and extracts csvs      |
|loadswifttabs      |convert SwiftIQ csvs to TAB delimited table files      |
|procqueuezips      |processes queued zips from file                        |
|procswiftzip       |process a SwiftIQ zip                                  |
|queueswiftSizes    |file sizes: queueswiftSizes SwiftAuxDir,'swift2.txt'   |
|queueswiftZips     |write (y) line file of queued zips                     |
|runswiftssis       |runs SwiftIQ SSIS batch scripts                        |
|showswiftUnload    |list never loaded zips in load order                   |
|showswiftlog       |show SwiftIQ log entries                               |
|swiftLoaded        |loaded swift zips with return code: swiftLoaded '&amp;lt;1000'|
|swiftMissingColumns|list missing SwiftIQ landing table columns             |
|swiftZips          |sorted SwiftIQ zip files                               |
+-------------------+-------------------------------------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Finally, code databases typically contain far more than code! Documentation is also stored. Stored documentation can be used to generate &lt;a href="https://en.wikipedia.org/wiki/Literate_programming"&gt;&lt;em&gt;literate programming&lt;/em&gt;&lt;/a&gt; documents. In JOD's case &lt;a href="https://analyzethedatanotthedrivel.org/2020/05/25/using-jodliterate/"&gt;jodliterate&lt;/a&gt; applied to &lt;code&gt;swiftprep&lt;/code&gt; creates this &lt;a href="https://bakerjd99.files.wordpress.com/2021/08/swiftprep.pdf"&gt;indexed and annoted PDF&lt;/a&gt;. Enjoy!&lt;/p&gt;

&lt;h4&gt;
  
  
  Let J tell SSIS where to get off
&lt;/h4&gt;

&lt;p&gt;I hope I've convinced you that J can help with ETL irritants. It can help, but it cannot do the entire job. At some point you will have to bring in tools like &lt;a href="https://docs.microsoft.com/en-us/sql/tools/bcp-utility?view=sql-server-ver15"&gt;bcp&lt;/a&gt;, &lt;a href="https://docs.microsoft.com/en-us/sql/integration-services/sql-server-integration-services?view=sql-server-ver15"&gt;SSIS&lt;/a&gt; or &lt;a href="https://www.talend.com/"&gt;Talend&lt;/a&gt;. SSIS and Talend are typically the drivers of ETL processes; they call other specialized tools as needed, but I've found that it's easier to use scripting tools like J as drivers. In this blog post I outline how easy it is to &lt;a href="https://analyzethedatanotthedrivel.org/2021/06/05/running-ssis-packages-with-python/"&gt;control SSIS with Python&lt;/a&gt;. You can use the same trick with J.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;swiftprep&lt;/code&gt; ETL script does:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Finds a large daily FTP'ed zip file containing CSVs.&lt;/li&gt;
&lt;li&gt; Unzips the CSVs to working directories.&lt;/li&gt;
&lt;li&gt; Converts unruly CSVs to standard TAB delimited text files.&lt;/li&gt;
&lt;li&gt; Runs an SSIS package to load TAB files to SQLServer staging tables.&lt;/li&gt;
&lt;li&gt; Runs another SSIS package to transform staging tables to a reporting
tables.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This reads worse than it is. The main jumping off word in &lt;code&gt;swiftprep&lt;/code&gt; is &lt;code&gt;dailyswift&lt;/code&gt;. Here's &lt;code&gt;dailyswift&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="n"&gt;dailyswift&lt;/span&gt;&lt;span class="p"&gt;=:&lt;/span&gt;&lt;span class="kp"&gt;3&lt;/span&gt; &lt;span class="kp"&gt;:&lt;/span&gt; &lt;span class="kp"&gt;0&lt;/span&gt;

&lt;span class="c1"&gt;NB.*dailyswift v-- daily SwiftIQ zip loading and maintenance tasks.&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB. monad:  (ilRc ; clGuid) =. dailyswift uuIgnore&lt;/span&gt;
&lt;span class="c1"&gt;NB. dyad:  (ilRc ; clGuid) =. iaDayCode dailyswift uuIgnore&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB.   _1 dailyswift 0  NB. update stage only&lt;/span&gt;
&lt;span class="c1"&gt;NB.   dailyswift~ _1&lt;/span&gt;
&lt;span class="c1"&gt;NB.&lt;/span&gt;
&lt;span class="c1"&gt;NB.   NB. full table reload&lt;/span&gt;
&lt;span class="c1"&gt;NB.   42 dailyswift 0&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;weekday&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="nf"&gt;{.&lt;/span&gt; &lt;span class="m"&gt;6&lt;/span&gt;&lt;span class="o"&gt;!:&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;dailyswift&lt;/span&gt; &lt;span class="bp"&gt;y&lt;/span&gt;
&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="c1"&gt;NB. stage latest zip&lt;/span&gt;
&lt;span class="s1"&gt;'rc tg'&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;procswiftzip&lt;/span&gt; &lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="nf"&gt;{&lt;/span&gt;&lt;span class="n"&gt;swiftZips&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;if.&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="nf"&gt;~:&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="nf"&gt;{&lt;/span&gt;&lt;span class="n"&gt;rc&lt;/span&gt; &lt;span class="k"&gt;do.&lt;/span&gt; &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="n"&gt;tg&lt;/span&gt; &lt;span class="k"&gt;return.&lt;/span&gt; &lt;span class="k"&gt;end.&lt;/span&gt;

&lt;span class="k"&gt;if.&lt;/span&gt; &lt;span class="m"&gt;_1&lt;/span&gt;&lt;span class="nf"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;x&lt;/span&gt; &lt;span class="k"&gt;do.&lt;/span&gt;
  &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;SwiftLog&lt;/span&gt; &lt;span class="s1"&gt;'only updating staging'&lt;/span&gt;
&lt;span class="k"&gt;else.&lt;/span&gt;
  &lt;span class="n"&gt;rcnts&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;cntswiftrows&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;  &lt;span class="c1"&gt;NB. report/stage item counts&lt;/span&gt;

  &lt;span class="c1"&gt;NB. update reporting&lt;/span&gt;
  &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;logswiftReportBegin&lt;/span&gt; &lt;span class="n"&gt;tg&lt;/span&gt;&lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="n"&gt;rcnts&lt;/span&gt;
  &lt;span class="k"&gt;if.&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="nf"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;x&lt;/span&gt; &lt;span class="k"&gt;do.&lt;/span&gt;
    &lt;span class="n"&gt;SwiftLog&lt;/span&gt; &lt;span class="s1"&gt;'rowcnts::pidx: '&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;rcnts&lt;/span&gt;
    &lt;span class="n"&gt;SwiftLog&lt;/span&gt; &lt;span class="s1"&gt;'reindexing - partial table update'&lt;/span&gt;
    &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;runswiftssis&lt;/span&gt; &lt;span class="s1"&gt;'swiftreindex.bat'&lt;/span&gt;
    &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;runswiftssis&lt;/span&gt; &lt;span class="s1"&gt;'swiftreport.bat'&lt;/span&gt;

  &lt;span class="k"&gt;elseif.&lt;/span&gt; &lt;span class="m"&gt;42&lt;/span&gt;&lt;span class="nf"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;x&lt;/span&gt; &lt;span class="k"&gt;do.&lt;/span&gt;
    &lt;span class="c1"&gt;NB. full updates are best babysat - trigger with&lt;/span&gt;
    &lt;span class="c1"&gt;NB. code 42 - life the universe and everything&lt;/span&gt;
    &lt;span class="n"&gt;SwiftLog&lt;/span&gt; &lt;span class="s1"&gt;'rowcnts::full: '&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;rcnts&lt;/span&gt;
    &lt;span class="n"&gt;SwiftLog&lt;/span&gt; &lt;span class="s1"&gt;'full table refresh'&lt;/span&gt;
    &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;runswiftssis&lt;/span&gt; &lt;span class="s1"&gt;'swiftreportfull.bat'&lt;/span&gt;

  &lt;span class="k"&gt;elseif.do.&lt;/span&gt;
    &lt;span class="n"&gt;SwiftLog&lt;/span&gt; &lt;span class="s1"&gt;'rowcnts:: '&lt;/span&gt;&lt;span class="nf"&gt;,&lt;/span&gt;&lt;span class="n"&gt;rcnts&lt;/span&gt;
    &lt;span class="n"&gt;SwiftLog&lt;/span&gt; &lt;span class="s1"&gt;'partial table update'&lt;/span&gt;
    &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;runswiftssis&lt;/span&gt; &lt;span class="s1"&gt;'swiftreport.bat'&lt;/span&gt;
  &lt;span class="k"&gt;end.&lt;/span&gt;
  &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;logswiftReportEnd&lt;/span&gt; &lt;span class="n"&gt;rc&lt;/span&gt;&lt;span class="nf"&gt;;&lt;/span&gt;&lt;span class="n"&gt;tg&lt;/span&gt;
&lt;span class="k"&gt;end.&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Maybe I'm an old jaded programmer, but &lt;code&gt;dailyswift&lt;/code&gt; doesn't look very scary. Obfuscated J can be phenomenally impenetrable. Look at the definition of &lt;code&gt;readtd2&lt;/code&gt;. Sadly, it's easy to create unreadable code in any programming language. If program code was edited like novel manuscripts, we'd seldom see incomprehensible crap but it isn't, so we must exercise personal discretion. &lt;strong&gt;&lt;em&gt;My goal when coding is to make things so clear that that they're dismissed as trivial!&lt;/em&gt;&lt;/strong&gt; And, let's face it, ETL, as tedious and time-consuming as it typically is, is trivial!&lt;/p&gt;

</description>
      <category>etl</category>
      <category>ssis</category>
      <category>jsofware</category>
      <category>j</category>
    </item>
    <item>
      <title>Running SSIS Packages with Python</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Thu, 08 Jul 2021 14:54:22 +0000</pubDate>
      <link>https://dev.to/bakerjd99/running-ssis-packages-with-python-11l8</link>
      <guid>https://dev.to/bakerjd99/running-ssis-packages-with-python-11l8</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog &lt;a href="https://analyzethedatanotthedrivel.org/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Microsoft's &lt;a href="https://docs.microsoft.com/en-us/sql/integration-services/sql-server-integration-services?view=sql-server-ver15"&gt;SSIS (SQL Server Integration Services)&lt;/a&gt;&lt;br&gt;
is a ubiquitous ETL (Extract Transform and Load) tool. Despite its&lt;br&gt;
widespread use, SSIS is not loved! At best, it's tolerated for its&lt;br&gt;
undeniable utility, but SSIS's "utility" comes with a host of hideous&lt;br&gt;
warts with the ugliest being its &lt;em&gt;file handling&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;You would think that an ETL tool would excel at file manipulation; how&lt;br&gt;
else would you "extract" data? But, for reasons lost to posterity, SSIS&lt;br&gt;
totally screws the file handling pooch. Now there are "sanctioned" ways&lt;br&gt;
to unscrew the pooch. Use "script tasks" is a common retort. With script&lt;br&gt;
tasks, you code your file handling bits in something like C# and then&lt;br&gt;
use SSIS to compile and incorporate the C# parts into the larger ETL&lt;br&gt;
process. Script tasks work but they're a pain to debug and maintain.&lt;br&gt;
There are other &lt;em&gt;pooch unscrewers&lt;/em&gt;. Many code everything in SQL stored procedures, others build &lt;a href="https://stackoverflow.com/questions/1268342/what-is-the-gac-in-net"&gt;GAC dlls&lt;/a&gt;, some use &lt;a href="https://docs.microsoft.com/en-us/sql/integration-services/control-flow/execute-process-task?view=sql-server-ver15"&gt;command tasks&lt;/a&gt;&lt;br&gt;
to shell out to file utilities. Sadly, the abundance of SSIS file&lt;br&gt;
handling work-a-arounds just affirms that SSIS blows hard here!&lt;/p&gt;

&lt;p&gt;I could bitch about SSIS file handling for fortnights but, bitching does not solve problems. Luckily, there is an easy way to use SSIS's &lt;a href="https://www.youtube.com/watch?v=wPiHQ37gXnE"&gt;good bits&lt;/a&gt; without getting sucked into the script task swamp.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;SSIS typically calls helper processes like script tasks, but if you&lt;br&gt;
flip things around and call SSIS from helper processes you can easily&lt;br&gt;
leverage the strengths of both SSIS and helpers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The following Python snippet, taken from a recent project, illustrates&lt;br&gt;
this technique.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;all_cube_zips&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;etl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;daily_zip_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;working_dirs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rtp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;working_dirs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                                    &lt;span class="n"&gt;runtime_parms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rtp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runtime_parms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;zipfile&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;all_cube_zips&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

   &lt;span class="n"&gt;etl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clear_working_dirs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;working_dirs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rtp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;working_dirs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                       &lt;span class="n"&gt;runtime_parms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rtp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runtime_parms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                       &lt;span class="n"&gt;name_prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"DailyRetails"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;dailyretail_xml_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;etl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unzip_prefix_xml&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"DailyRetails"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                       &lt;span class="n"&gt;daily_zip_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;zipfile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                       &lt;span class="n"&gt;working_dirs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rtp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;working_dirs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="n"&gt;runtime_parms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rtp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runtime_parms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;tsv_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dcx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write_dailyretail_tsvs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xml_files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dailyretail_xml_files&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                       &lt;span class="n"&gt;daily_zip_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;basename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;zipfile&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
                       &lt;span class="n"&gt;column_defaults&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rcd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DailyRetails_column_defaults&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                       &lt;span class="n"&gt;working_dirs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rtp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;working_dirs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="n"&gt;runtime_parms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rtp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runtime_parms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="n"&gt;append_only&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s"&gt;"\\Shares\DailyCode\bats\SelectReload.bat"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This Python program unzips hundreds of XML files to a working directory and then "flattens" them to a simple TAB delimited text file. This is typical ETL stuff. After writing the TAB delimited file Python runs a simple batch script to execute an SSIS package. The batch script is called with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s"&gt;"\\Shares\DailyCode\bats\SelectReload.bat"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The following script lines from &lt;code&gt;SelectReload.bat&lt;/code&gt; show how to set up and run a package with &lt;a href="https://docs.microsoft.com/en-us/sql/integration-services/packages/dtexec-utility?view=sql-server-ver15"&gt;&lt;code&gt;dtexec.exe&lt;/code&gt;&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight batchfile"&gt;&lt;code&gt;&lt;span class="c"&gt;rem change to your dtexec directory&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="s2"&gt;"c:\Program Files\Microsoft SQL Server\150\DTS\Binn"&lt;/span&gt;
&lt;span class="c"&gt;rem execute an SSIS package - use fully qualified paths&lt;/span&gt;
&lt;span class="kd"&gt;dtexec&lt;/span&gt; &lt;span class="na"&gt;/Project &lt;/span&gt;&lt;span class="s2"&gt;"\\Shares\DailyCode\ispacs\flat.ispac"&lt;/span&gt; &lt;span class="na"&gt;/Package &lt;/span&gt;&lt;span class="kd"&gt;Reloads&lt;/span&gt;.dtsx
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nv"&gt;%ERRORLEVEL%&lt;/span&gt; &lt;span class="kd"&gt;NEQ&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="kd"&gt;Error18&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, all the irritating file hacking is done with Python leaving&lt;br&gt;
streamlined ETL jobs for SSIS. Both processes are simplified. Both are easy to test and debug. With software, it's always wise to try simple first.&lt;/p&gt;

&lt;p&gt;As a final warning, before you adopt this approach be aware that you need full &lt;a href="https://www.makeuseof.com/tag/windows-admin-rights/"&gt;"Administrative Rights"&lt;/a&gt; to run &lt;code&gt;dtexec.com&lt;/code&gt; like this. Many anally retentive IT outfits will simply outlaw such renegade scripting. If you're incarcerated in &lt;em&gt;Administrative Rights&lt;/em&gt; jail I'd suggest taking advantage of the &lt;a href="https://inthesetimes.com/article/labor-shortage-jobs-report-low-wages-biden-covid"&gt;current labor shortage&lt;/a&gt; and find another job. Squabbling over administrative rights is a demeaning waste of IT talent that can no longer be tolerated.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ssis</category>
      <category>sql</category>
      <category>etl</category>
    </item>
    <item>
      <title>NumPy another Iverson Ghost</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Tue, 14 Jul 2020 19:10:24 +0000</pubDate>
      <link>https://dev.to/bakerjd99/numpy-another-iverson-ghost-9mc</link>
      <guid>https://dev.to/bakerjd99/numpy-another-iverson-ghost-9mc</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog &lt;a href="https://analyzethedatanotthedrivel.org/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;During my recent &lt;a href="https://github.com/bakerjd99/smugpyter"&gt;SmugMug API and Python&lt;/a&gt; adventures I was haunted&lt;br&gt;
by an Iverson ghost: &lt;code&gt;NumPy&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;An Iverson ghost is an embedding of &lt;a href="https://en.wikipedia.org/wiki/APL_(programming_language)"&gt;APL&lt;/a&gt; like array programming&lt;br&gt;
features in nonAPL languages and tools.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You would be surprised at how often Iverson ghosts appear. Whenever&lt;br&gt;
programmers are challenged with processing large numeric arrays they rediscover bits of APL. Often they’re unaware of the rich heritage of array processing languages but in &lt;code&gt;NumPy&lt;/code&gt;'s case, they &lt;em&gt;indirectly&lt;/em&gt; acknowledged the debt. In &lt;a href="https://numpy.sourceforge.net/numdoc/numdoc.pdf"&gt;&lt;em&gt;Numerical Python&lt;/em&gt;&lt;/a&gt; the authors wrote:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The languages which were used to guide the development of NumPy&lt;br&gt;
include the infamous APL family of languages, Basis, MATLAB, FORTRAN, S and S+, and others.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I consider “infamous” an upgrade from &lt;a href="https://www.cs.virginia.edu/~evans/cs655/readings/ewd498.html"&gt;“a mistake carried through to perfection.”&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not only do developers frequently conjure up Iverson ghosts they also invariably turn into little apostles of array programming that won’t shut up about how cutting down on all those goddamn loops clarifies and simplifies algorithms. How learning to think about operating on entire arrays, versus one dinky number at a time, frees the mind. Why it’s almost as if array programming &lt;a href="https://www.jsoftware.com/papers/tot.htm"&gt;is a tool of thought.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Where have I heard this before?&lt;/p&gt;

&lt;p&gt;Ahh, I’ve got it, when I first encountered APL almost fifty years ago.&lt;/p&gt;

&lt;p&gt;Yes, I am an old programmer, a fossil, a living relic. My brain is a putrid pool of punky programming languages. Python is just the latest in a longish line of languages. Some people collect stamps. I collect programming languages. And, just like stamp collectors have favorite stamps, I find some programming languages more attractive than others. For example, I recognize the undeniable utility of &lt;code&gt;C/C++&lt;/code&gt;, for many tasks they are the only serious options, yet as useful and pervasive as &lt;code&gt;C/C++&lt;/code&gt; are they have never tickled my fancy. The notation is ugly! Yeah, I said it; suck on it C people. Similarly, the world’s most commonly used programming language &lt;code&gt;JavaScript&lt;/code&gt; is equally ugly. Again, &lt;code&gt;JavaScript&lt;/code&gt; is so damn useful that programmers put up with its many warts. Some have even made a few bucks writing books about its &lt;a href="https://shop.oreilly.com/product/9780596517748.do"&gt;meager good parts&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I have similar inflammatory opinions about other widely used languages. The one that is making me miserable now is &lt;code&gt;SQL&lt;/code&gt;, particularly Microsoft’s variant &lt;code&gt;T-SQL&lt;/code&gt;. On purely aesthetic grounds I find well-formed &lt;code&gt;SQL&lt;/code&gt; queries less appalling than your average &lt;code&gt;C&lt;/code&gt; pointer fest. Core &lt;code&gt;SQL&lt;/code&gt; is fairly elegant but the macro programming features that have grown up around it are depraved. I feel dirty when forced to use them which is just about every other day.&lt;/p&gt;

&lt;p&gt;At the end of my programming day, I want to look on something that is beautiful. I don’t particularly care about how useful a chunk of code is or how much money it might make, or what silly little business problem it solves. If the damn code is ugly I don’t want to see it.&lt;/p&gt;

&lt;p&gt;People keep rediscovering array programming, best described in Ken&lt;br&gt;
Iverson’s 1962 book &lt;a href="https://www.jsoftware.com/papers/APL.htm"&gt;&lt;em&gt;A Programming Language&lt;/em&gt;&lt;/a&gt;, for two basic&lt;br&gt;
reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;It’s an efficient way to handle an important class of problems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It’s a step away from the ugly and back towards the beautiful.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Both of these reasons manifest in &lt;code&gt;NumPy&lt;/code&gt;’s resounding success in the Python world.&lt;/p&gt;

&lt;p&gt;As usual, efficiency led the way. The authors of &lt;em&gt;Numerical Python&lt;/em&gt;&lt;br&gt;
note:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Why are these extensions needed? The core reason is a very prosaic&lt;br&gt;
one, and that is that manipulating a set of a million numbers in&lt;br&gt;
Python with the standard data structures such as lists, tuples or&lt;br&gt;
classes is much too slow and uses too much space.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Faced with a &lt;a href="https://www.youtube.com/watch?v=ZBAijg5Betw"&gt;“does not compute”&lt;/a&gt; situation you can either try something else or fix what you have. The Python people fixed Python with &lt;code&gt;NumPy&lt;/code&gt;.&lt;br&gt;
&lt;a href="https://www.quora.com/Whats-the-exact-difference-between-a-Pythoneer-and-a-Pythonista"&gt;Pythonistas&lt;/a&gt; reluctantly embraced &lt;code&gt;NumPy&lt;/code&gt; but quickly went &lt;em&gt;apostolic!&lt;/em&gt; Now books like &lt;a href="https://shop.oreilly.com/product/0636920038481.do"&gt;&lt;em&gt;Elegant SciPy&lt;/em&gt;&lt;/a&gt; and the entire &lt;code&gt;SciPy&lt;/code&gt; toolset that been built on &lt;code&gt;NumPy&lt;/code&gt; take it for granted.&lt;/p&gt;

&lt;p&gt;Is there anything in &lt;code&gt;NumPy&lt;/code&gt; for programmers that have been drinking the array processing cool aid for decades? The answer is yes! &lt;a href="https://code.jsoftware.com/wiki/NuVoc"&gt;J programmers&lt;/a&gt;, in particular, are in for a treat with the new Python3 addon that’s been released with the latest J 8.07 beta. This addon directly supports &lt;code&gt;NumPy&lt;/code&gt; arrays making it easy to swap data in and out of the J/Python environments. It’s one of those best of both worlds things.&lt;/p&gt;

&lt;p&gt;The following &lt;code&gt;NumPy&lt;/code&gt; examples are from the &lt;code&gt;SciPy.org&lt;/code&gt; &lt;a href="https://numpy.org/doc/1.19/user/quickstart.html"&gt;NumPy quick start tutorial&lt;/a&gt;. For each &lt;code&gt;NumPy&lt;/code&gt; statement, I have provided a J equivalent. J is a descendant of APL. It was largely designed by the same man: Ken Iverson. A scumbag lawyer or greedy patent troll might consider suing &lt;code&gt;NumPy&lt;/code&gt;’s creators after looking at these examples. APL’s influence is obvious. Fortunately, Ken Iverson was more interested in promoting good ideas than profiting from them. I suspect he would be flattered that APL has mutated and colonized strange new worlds and I think even zealous Pythonistas will agree that Python is a delightfully strange world.&lt;/p&gt;

&lt;h4&gt;
  
  
  Some Numpy and J examples
&lt;/h4&gt;

&lt;p&gt;Selected Examples from &lt;a href="https://numpy.org/doc/1.19/user/quickstart.html"&gt;https://numpy.org/doc/1.19/user/quickstart.html&lt;/a&gt; Output has been suppressed here. For a more detailed look at these examples browse the Jupyter notebook:  &lt;a href="https://github.com/bakerjd99/jacks/blob/master/numpyjlove/NumPy%20and%20J%20make%20Sweet%20Array%20Love.ipynb"&gt;NumPy and J Make Sweet Array Love&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Creating simple arrays
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt; &lt;span class="nf"&gt;i.&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy 
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy 
&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="mf"&gt;1.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="m"&gt;1.5&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="nf"&gt;,:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt; &lt;span class="m"&gt;6&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy 
&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;complex&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="nf"&gt;j.&lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="nf"&gt;,:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy - make complex numbers with nonzero real and imaginary parts
&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mf"&gt;4.7j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J - also for J&lt;/span&gt;
&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="nf"&gt;+&lt;/span&gt; &lt;span class="m"&gt;0j4.7&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy 
&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy - allocates array with whatever is in memory 
&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J - uses fill - safer but slower than numpy's trust memory method &lt;/span&gt;
&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt; &lt;span class="m"&gt;0.0001&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Basic operations
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy 
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt; &lt;span class="m"&gt;40&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt; 
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="nf"&gt;i.&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt; 
&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="nf"&gt;-&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy - uses previously defined (b) 
&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="nf"&gt;^&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy - uses previously defined (a) 
&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="m"&gt;10&lt;/span&gt; &lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nf"&gt;o.&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy - booleans are True and False 
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;35&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J - booleans are 1 and 0 &lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="nf"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;35&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Array processing
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy 
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="c1"&gt;# elementwise product 
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nf"&gt;,:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; 
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="nf"&gt;,:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt; 
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy - matrix product 
&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J - matrix product &lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="nf"&gt;+&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;   
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy - uniform pseudo random 
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J - uniform pseudo random &lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="nf"&gt;?&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy - sum all array elements - implicit ravel 
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J - sum all array elements - explicit ravel &lt;/span&gt;
&lt;span class="nf"&gt;+&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy 
&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="c1"&gt;# sum of each column 
&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="c1"&gt;# min of each row 
&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="c1"&gt;# cumulative sum along each row 
&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cumsum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="c1"&gt;# transpose 
&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J  &lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt; &lt;span class="nf"&gt;i.&lt;/span&gt; &lt;span class="m"&gt;12&lt;/span&gt; 
&lt;span class="c1"&gt;NB. sum of each column &lt;/span&gt;
&lt;span class="nf"&gt;+&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
&lt;span class="c1"&gt;NB. min of each row &lt;/span&gt;
&lt;span class="nf"&gt;&amp;lt;.&lt;/span&gt;&lt;span class="o"&gt;/"&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
&lt;span class="c1"&gt;NB. cumulative sum along each row &lt;/span&gt;
&lt;span class="nf"&gt;+&lt;/span&gt;&lt;span class="o"&gt;/\"&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
&lt;span class="c1"&gt;NB. transpose &lt;/span&gt;
&lt;span class="nf"&gt;|:&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Indexing and slicing
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# numpy  
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;  
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;   &lt;span class="c1"&gt;# reversal
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight j"&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J &lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;=.&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;i.&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;^&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; 
&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="nf"&gt;{&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; 
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="nf"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;i.&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;{&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;span class="nf"&gt;|.&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>numpy</category>
      <category>j</category>
      <category>python</category>
      <category>apl</category>
    </item>
    <item>
      <title>Extracting SQL code from SSIS dtsx packages with Python lxml</title>
      <dc:creator>John Baker</dc:creator>
      <pubDate>Tue, 14 Jul 2020 15:59:49 +0000</pubDate>
      <link>https://dev.to/bakerjd99/extracting-sql-code-from-ssis-dtsx-packages-with-python-lxml-16nj</link>
      <guid>https://dev.to/bakerjd99/extracting-sql-code-from-ssis-dtsx-packages-with-python-lxml-16nj</guid>
      <description>&lt;p&gt;&lt;em&gt;Cross posted from my blog &lt;a href="https://analyzethedatanotthedrivel.org/"&gt;Analyze the Data not the Drivel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Lately, I’ve been refactoring a sprawling SSIS &lt;a href="https://docs.microsoft.com/en-us/sql/integration-services/sql-server-integration-services?view=sql-server-ver15"&gt;(SQL Server Integration Services)&lt;/a&gt; package that ineffectually wrestles with large XML files. In this programmer’s opinion using SSIS for heavy-duty XML parsing is geeky self-abuse so I’ve opted to replace an eye-ball straining&lt;sup id="fnref1"&gt;1&lt;/sup&gt; SSIS package with half a dozen, &lt;a href="https://quoteinvestigator.com/2011/05/13/einstein-simple/"&gt;“as simple as possible but no simpler”&lt;/a&gt;, Python scripts. If the Python is fast enough for production great! If not the scripts will serve as a clear model&lt;sup id="fnref2"&gt;2&lt;/sup&gt; for something faster.&lt;/p&gt;

&lt;p&gt;I’m only refactoring&lt;sup id="fnref3"&gt;3&lt;/sup&gt; part of a larger &lt;a href="https://www.webopedia.com/TERM/E/ETL.html"&gt;ETL&lt;/a&gt; process so whatever I do &lt;em&gt;it must mesh with the rest of the mess.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;So where is the rest of the SSIS mess?&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;SSIS’s visual editor does a wonderful job of hiding the damn code!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;This is a problem!&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If only there was a simple way to troll through large sprawling SSIS &lt;em&gt;spider-webby&lt;/em&gt; packages and extract &lt;a href="https://www.youtube.com/watch?v=wPiHQ37gXnE"&gt;the good bits&lt;/a&gt;. Fortunately, Python’s XML parsing tools can be easily applied to SSIS &lt;code&gt;dtsx&lt;/code&gt; files. &lt;em&gt;SSIS &lt;code&gt;dtsx&lt;/code&gt; files are XML files.&lt;/em&gt; The following code snippets illustrate how to hack these files.&lt;/p&gt;

&lt;p&gt;First import the required Python modules. &lt;code&gt;lxml&lt;/code&gt; is not always included in Python distributions. Use the &lt;a href="https://remotedevdaily.com/how-to-install-lxml-in-python-using-pip/"&gt;pip&lt;/a&gt; or &lt;a href="https://anaconda.org/anaconda/lxml"&gt;conda&lt;/a&gt; tools to install this module.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# imports
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lxml&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;etree&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set an output directory. I’m running on a Windows machine. If you’re on a Mac or Linux machine adjust the path.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# set sql output directory
&lt;/span&gt;&lt;span class="n"&gt;sql_out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s"&gt;"C:\temp\dtsxsql"&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sql_out&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;makedirs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sql_out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Point to the &lt;code&gt;dtsx&lt;/code&gt; package you want to extract code from.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# dtsx files
&lt;/span&gt;&lt;span class="n"&gt;dtsx_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s"&gt;'C:\Users\john\AnacondaProjects\testfolder\bixml'&lt;/span&gt;
&lt;span class="n"&gt;ssis_dtsx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dtsx_path&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s"&gt;'\ParseXML.dtsx'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read and parse the SSIS package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;etree&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ssis_dtsx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getroot&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;lxml&lt;/code&gt; renders XML namespace tags like &lt;code&gt;&amp;lt;DTS:Executable&lt;/code&gt; as&lt;br&gt;
&lt;code&gt;www.microsoft.com/SqlServer/Dts\}Executable&lt;/code&gt;. The following gathers all the transformed element tags in the &lt;code&gt;dtsx&lt;/code&gt; package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# collect unique element tags in dtsx
&lt;/span&gt;&lt;span class="n"&gt;ele_set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ele&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xpath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".//*"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ele_set&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ele&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ele_set&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ele_set&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using transformed element tags of interest blast over the &lt;code&gt;dtsx&lt;/code&gt; and suck out the bits of interest.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# extract sql code in source statements and write to *.sql files 
&lt;/span&gt;&lt;span class="n"&gt;total_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;package_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'{www.microsoft.com/SqlServer/Dts}ObjectName'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;" "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ele&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xpath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".//*"&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ele&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"{www.microsoft.com/SqlServer/Dts}Executable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;attr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ele&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;attrib&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;child0&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ele&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;child0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"{www.microsoft.com/SqlServer/Dts}ObjectData"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;child1&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;child0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="n"&gt;sql_comment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;attr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"{www.microsoft.com/SqlServer/Dts}ObjectName"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;child1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"{www.microsoft.com/sqlserver/dts/tasks/sqltask}SqlTaskData"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;dtsx_sql&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;child1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;attrib&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"{www.microsoft.com/sqlserver/dts/tasks/sqltask}SqlStatementSource"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;dtsx_sql&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"-- "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;sql_comment&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;dtsx_sql&lt;/span&gt;
            &lt;span class="n"&gt;sql_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sql_out&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;package_name&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;".sql"&lt;/span&gt;
            &lt;span class="n"&gt;total_bytes&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dtsx_sql&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dtsx_sql&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;sql_comment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sql_file&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sql_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
              &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dtsx_sql&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="s"&gt;'total sql code bytes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;total_bytes&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The code snippets in this post are available in this Jupyter notebook: &lt;a href="https://github.com/bakerjd99/jacks/blob/master/notebooks/Extracting%20SQL%20code%20from%20SSIS%20dtsx%20packages%20with%20Python%20lxml.ipynb"&gt;Extracting SQL code from SSIS dtsx packages with Python lxml&lt;/a&gt;. &lt;em&gt;Download and tweak for your &lt;code&gt;dtsx&lt;/code&gt; nightmare!&lt;/em&gt;&lt;/p&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;I frequently run into SSIS packages that cannot be viewed on &lt;code&gt;4K&lt;/code&gt;&lt;br&gt;
monitors when fully zoomed out. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;Python’s readability is a major asset when disentangling&lt;br&gt;
&lt;em&gt;mess-ware&lt;/em&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;Yes, I’ve railed about the word “refactoring” in the past but I’ve moved on and so should you. &lt;a href="https://www.dictionary.com/browse/a-foolish-consistency-is-the-hobgoblin-of-little-minds"&gt;“A foolish consistency is the hobgoblin of little minds.”&lt;/a&gt; ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>python</category>
      <category>ssis</category>
      <category>sql</category>
      <category>lxml</category>
    </item>
  </channel>
</rss>
