<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Gardner Bickford</title>
    <description>The latest articles on DEV Community by Gardner Bickford (@gardner).</description>
    <link>https://dev.to/gardner</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F90861%2F7297dcc2-87f4-4029-84c8-bacc15fd18b1.jpeg</url>
      <title>DEV Community: Gardner Bickford</title>
      <link>https://dev.to/gardner</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gardner"/>
    <language>en</language>
    <item>
      <title>Self-hosted LanguageTool private instance is an offline alternative to Grammarly</title>
      <dc:creator>Gardner Bickford</dc:creator>
      <pubDate>Sun, 26 Nov 2023 22:32:00 +0000</pubDate>
      <link>https://dev.to/gardner/self-hosted-languagetool-private-instance-is-an-offline-alternative-to-grammarly-56nh</link>
      <guid>https://dev.to/gardner/self-hosted-languagetool-private-instance-is-an-offline-alternative-to-grammarly-56nh</guid>
      <description>&lt;p&gt;Setting up LanguageTool is easy. If you have DNS ad blocking enabled then you will want to disable abTesting as described in this post.&lt;/p&gt;

&lt;h2&gt;
  
  
  docker-compose.yml
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  languagetool:
    image: silviof/docker-languagetool:latest
    restart: unless-stopped
    command: java -cp languagetool-server.jar org.languagetool.server.HTTPServer --port 8010 --public --allow-origin '*' --config /etc/http_server.cfg
    ports:
      - 127.0.0.1:8010:8010
    volumes:
      - ./ngrams:/ngrams
      - ./http_server.cfg:/etc/http_server.cfg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Download ngrams
&lt;/h2&gt;

&lt;p&gt;The English ngram data weighs about 8.3 GB:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mkdir ngrams
wget https://languagetool.org/download/ngram-data/ngrams-en-20150817.zip
(cd ngrams &amp;amp;&amp;amp; unzip ../ngrams-en-20150817.zip)
rm -f ngrams-en-20150817.zip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Disable AB Testing
&lt;/h2&gt;

&lt;p&gt;To prevent outgoing network requests that fetch these A/B tests, create the &lt;code&gt;http_server.cfg&lt;/code&gt; referenced in the compose file:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cat &amp;lt;&amp;lt;EOF &amp;gt; http_server.cfg
languageModel=/ngrams
abTest=null
abTestClients=null
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Start Server
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up -d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Install Browser Extension
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;From Chrome &lt;a href="https://chromewebstore.google.com/detail/grammar-checker-paraphras/oldceeleldhonbafppcapldpdifcinji" rel="noopener noreferrer"&gt;Web Store&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;From &lt;a href="https://addons.mozilla.org/en-US/firefox/addon/languagetool/" rel="noopener noreferrer"&gt;Mozilla&lt;/a&gt; &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Configure Browser Extension
&lt;/h2&gt;

&lt;p&gt;In the extension's advanced configuration, set the API server URL to &lt;code&gt;http://127.0.0.1:8010/v2&lt;/code&gt;.&lt;/p&gt;
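&lt;p&gt;Before pointing the extension at the instance, you can confirm the server is answering by querying the &lt;code&gt;/v2/check&lt;/code&gt; endpoint directly (a quick sanity check; assumes the container above is running on port 8010):&lt;/p&gt;

```shell
# Send a short sentence with a deliberate grammar error to the local
# LanguageTool server; the JSON response should contain a "matches" entry.
# Assumes the docker-compose service above is up on 127.0.0.1:8010.
curl --data-urlencode "language=en-US" \
     --data-urlencode "text=This are a test." \
     http://127.0.0.1:8010/v2/check
```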

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm4rouvvqex4eq0fsqfmq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm4rouvvqex4eq0fsqfmq.png" alt="LanguageTool extension advanced settings with the local API server URL configured"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>grammar</category>
      <category>privacy</category>
      <category>languagetool</category>
      <category>docker</category>
    </item>
    <item>
      <title>OpenAI Whisper Inference on Apple Silicon METAL GPU</title>
      <dc:creator>Gardner Bickford</dc:creator>
      <pubDate>Thu, 13 Jul 2023 22:59:00 +0000</pubDate>
      <link>https://dev.to/gardner/openai-whisper-inference-on-apple-silicon-metal-gpu-1g9m</link>
      <guid>https://dev.to/gardner/openai-whisper-inference-on-apple-silicon-metal-gpu-1g9m</guid>
      <description>&lt;h2&gt;
  
  
  OpenAI Whisper Inference on Apple Silicon METAL GPU
&lt;/h2&gt;

&lt;p&gt;This code example shows Whisper inference running on the Apple Silicon GPU. It did not work with the Homebrew Python build; creating a Miniconda environment resolved this. To monitor GPU usage, open &lt;em&gt;Activity Monitor&lt;/em&gt;, click the &lt;em&gt;Window&lt;/em&gt; menu, and select &lt;em&gt;GPU History&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prepare the environment
&lt;/h2&gt;

&lt;p&gt;The tensorflow-metal plugin is &lt;a href="https://developer.apple.com/metal/tensorflow-plugin/"&gt;compatible&lt;/a&gt; with TensorFlow 2.13 or later. This may also work with later versions of Python; please let me know the results of your experiments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;conda create &lt;span class="nt"&gt;-n&lt;/span&gt; tfmetal &lt;span class="nv"&gt;python&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3.10
conda activate tfmetal
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  tensorflow-metal &lt;span class="se"&gt;\&lt;/span&gt;
  tensorflow &lt;span class="se"&gt;\&lt;/span&gt;
  transformers &lt;span class="se"&gt;\&lt;/span&gt;
  datasets &lt;span class="se"&gt;\&lt;/span&gt;
  soundfile &lt;span class="se"&gt;\&lt;/span&gt;
  librosa
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
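&lt;p&gt;Before running the full model, it is worth confirming that TensorFlow actually registered the Metal device (a minimal sketch; assumes the &lt;code&gt;tfmetal&lt;/code&gt; environment above is active):&lt;/p&gt;

```python
# List the physical GPU devices TensorFlow can see. With tensorflow-metal
# installed on Apple Silicon, this should include one METAL PluggableDevice;
# an empty list means inference will fall back to the CPU.
import tensorflow as tf

print(tf.config.list_physical_devices("GPU"))
```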



&lt;h2&gt;
  
  
  &lt;code&gt;whisper.py&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;WhisperProcessor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TFWhisperForConditionalGeneration&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dataset&lt;/span&gt;

&lt;span class="c1"&gt;# load model and processor
&lt;/span&gt;&lt;span class="n"&gt;processor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WhisperProcessor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"openai/whisper-large-v2"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TFWhisperForConditionalGeneration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"openai/whisper-large-v2"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;forced_decoder_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="c1"&gt;# load dummy dataset and read audio files
&lt;/span&gt;&lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;load_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hf-internal-testing/librispeech_asr_dummy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"clean"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"validation"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;"audio"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;input_features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"array"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;sampling_rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"sampling_rate"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"tf"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;input_features&lt;/span&gt; 

&lt;span class="c1"&gt;# generate token ids
&lt;/span&gt;&lt;span class="n"&gt;predicted_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# decode token ids to text
&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predicted_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skip_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"---"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;transcription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;predicted_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skip_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Run the inference
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 whisper.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
2023-07-14 11:08:58.333012: I metal_plugin/src/device/metal_device.cc:1154] Metal device &lt;span class="nb"&gt;set &lt;/span&gt;to: Apple M2
2023-07-14 11:08:58.333059: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 24.00 GB
2023-07-14 11:08:58.333069: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 8.00 GB
2023-07-14 11:08:58.333338: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-07-14 11:08:58.333375: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device &lt;span class="o"&gt;(&lt;/span&gt;/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory&lt;span class="o"&gt;)&lt;/span&gt; -&amp;gt; physical PluggableDevice &lt;span class="o"&gt;(&lt;/span&gt;device: 0, name: METAL, pci bus &lt;span class="nb"&gt;id&lt;/span&gt;: &amp;lt;undefined&amp;gt;&lt;span class="o"&gt;)&lt;/span&gt;
All model checkpoint layers were used when initializing TFWhisperForConditionalGeneration.

All the layers of TFWhisperForConditionalGeneration were initialized from the model checkpoint at openai/whisper-large-v2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFWhisperForConditionalGeneration &lt;span class="k"&gt;for &lt;/span&gt;predictions without further training.
Found cached dataset librispeech_asr_dummy &lt;span class="o"&gt;(&lt;/span&gt;/Users/gardner/.cache/huggingface/datasets/hf-internal-testing___librispeech_asr_dummy/clean/2.1.0/d3bc4c2bc2078fcde3ad0f0f635862e4c0fef78ba94c4a34c4c250a097af240b&lt;span class="o"&gt;)&lt;/span&gt;
/Users/gardner/miniconda3/envs/tfmetal/lib/python3.10/site-packages/transformers/generation/tf_utils.py:854: UserWarning: Using &lt;span class="sb"&gt;`&lt;/span&gt;max_length&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="s1"&gt;'s default (448) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
['&lt;/span&gt;&amp;lt;|startoftranscript|&amp;gt;&amp;lt;|en|&amp;gt;&amp;lt;|transcribe|&amp;gt;&amp;lt;|notimestamps|&amp;gt; Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.&amp;lt;|endoftext|&amp;gt;&lt;span class="s1"&gt;']
---
['&lt;/span&gt; Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.&lt;span class="s1"&gt;']
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>tensorflow</category>
      <category>inference</category>
      <category>m1</category>
      <category>m2</category>
    </item>
  </channel>
</rss>
