DEV Community: Xavier Rey-Robert

Monitoring HP HBA H240 with telegraf and grafana

Xavier Rey-Robert — Wed, 28 Feb 2024 21:58:17 +0000

I recently got an HP SAS HBA H240 for my home lab to manage eight SAS SSD PM1633a drives for better IOPS (who doesn't need that to run OpenShift at home, right?). Given the H240's tendency to heat up, especially in a workstation setup, it's important to keep an eye on temperatures (controller and SSDs).

To tackle this, I wrote a simple Python script that parses SSA CLI output into JSON format. This makes it easy to feed the data into Telegraf, enabling straightforward monitoring with Grafana.
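The actual script is in the gist linked further down. As a rough illustration of the idea only, here is a minimal sketch that turns `Key: Value` lines into JSON. The sample text and its field names are assumptions made for the example, not captured SSA CLI output, and `parse_key_values` is a hypothetical helper name.

```python
import json

# Assumed sample for illustration: the real SSA CLI output layout may differ.
SAMPLE = """\
Smart HBA H240 in Slot 1
   Controller Temperature (C): 52
   Serial Number: EXAMPLE123
"""

def parse_key_values(text):
    """Collect `Key: Value` lines into a dict, converting whole numbers to int."""
    data = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # skip headers and blank lines that carry no value
        value = value.strip()
        data[key] = int(value) if value.isdigit() else value
    return data

if __name__ == "__main__":
    # One JSON object on stdout, which a collector can then read
    print(json.dumps(parse_key_values(SAMPLE)))
```

Printing a single JSON object on stdout keeps the script easy to plug into anything that can run a command and parse JSON.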

Just a quick post to share this, for posterity...
I won't go into Telegraf and Grafana here; just comment if you want the Telegraf config or the Grafana panel.

https://gist.github.com/XReyRobert/f3d6177d2b50b4198ea9f8896437c5b8

Effortlessly Exporting and Importing Podman Volumes Across Hosts

Xavier Rey-Robert — Sun, 18 Feb 2024 12:18:35 +0000

Hey folks, let's tackle a common hiccup in managing Podman volumes remotely. If you've tried using podman volume export with a remote Podman client, you've likely noticed it's not directly supported. But I've put together a workaround that simplifies the process.

The Challenge

Working remotely and need to move a Podman volume from one server to another? You'll quickly find that podman volume export isn't designed for remote client operations.

The Workaround

The solution lies in two Bash scripts that utilize SSH, SCP, and Podman's capabilities to facilitate volume export and import across remote hosts.

Exporting Volumes Made Simple

The podman_remote_volume_export.sh script connects to your remote host via SSH, exports the specified Podman volume to a tarball, and then SCPs this tarball back to your local machine. It's a straightforward way to get your volume data where you need it.
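To make the three steps concrete, here is a minimal sketch in Python rather than Bash. It is not the script from the gist: `export_commands`, the `/tmp` staging directory and the tarball name are assumptions chosen for illustration. It only builds the command lines, so nothing is executed until you pass them to `subprocess.run`.

```python
import shlex

def export_commands(host, volume, remote_tmp="/tmp"):
    """Return the commands for: remote export, copy back, remote cleanup."""
    tarball = f"{remote_tmp}/{volume}.tar"  # assumed staging path on the remote host
    return [
        # 1. export the volume to a tarball on the remote host
        ["ssh", host,
         f"podman volume export --output {shlex.quote(tarball)} {shlex.quote(volume)}"],
        # 2. copy the tarball back to the local machine
        ["scp", f"{host}:{tarball}", f"./{volume}.tar"],
        # 3. remove the temporary tarball on the remote host
        ["ssh", host, f"rm -f {shlex.quote(tarball)}"],
    ]

if __name__ == "__main__":
    for cmd in export_commands("user@remotehost", "volume_name"):
        print(shlex.join(cmd))
        # To really run it: subprocess.run(cmd, check=True)
```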

Importing Is Just as Easy

The podman_remote_volume_import.sh script then takes over, uploading the exported tarball to a different remote host. It checks for existing volumes (offering an option to overwrite) and imports the volume data efficiently.

A Few Considerations

Safety Checks: To prevent accidental data loss, there's a prompt for confirmation before overwriting existing volumes during the import process.
Clean as We Go: Both scripts clean up after themselves, removing temporary tarballs to keep your hosts tidy.
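The overwrite check and the cleanup can be sketched as a small decision function. This is an illustration in Python, not the gist's Bash code: `import_plan` is a hypothetical name, and the caller is assumed to have already determined whether the volume exists (for example with `podman volume exists`) and asked the user for confirmation.

```python
import shlex

def import_plan(volume, remote_tar, exists, overwrite):
    """Return the remote shell commands to run, or [] to leave an existing volume alone."""
    vol, tar = shlex.quote(volume), shlex.quote(remote_tar)
    if exists and not overwrite:
        return []  # the user declined to overwrite: touch nothing
    steps = []
    if exists:
        steps.append(f"podman volume rm {vol}")  # confirmed overwrite
    steps += [
        f"podman volume create {vol}",
        f"podman volume import {vol} {tar}",
        f"rm -f {tar}",  # clean up the uploaded tarball
    ]
    return steps

if __name__ == "__main__":
    for step in import_plan("new_volume_name", "/tmp/archive.tar", exists=True, overwrite=True):
        print(step)
```

Keeping the decision separate from the SSH calls makes the destructive path (`podman volume rm`) easy to review before anything runs.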

Usage

To run these scripts:

./podman_remote_volume_export.sh user@remotehost volume_name
./podman_remote_volume_import.sh user@remotehost new_volume_name /path/to/archive.tar

The Bottom Line

The podman volume export limitation for remote operations can be circumvented with these scripts, streamlining the process of migrating volumes. Designed for developers familiar with container management, they offer a practical solution to a common problem.

CODE

For a closer look and potential customization, the scripts are available on Gist: Podman Volume Management Scripts Gist.

Happy containerizing!

Edit 02/19/24:
Just added podman_remote_volume_migrate.sh to do one or multiple volumes in one shot...

podman_remote_volume_migrate.sh
Usage: ./podman_remote_volume_migrate.sh <SOURCE_HOST> <DESTINATION_HOST> <VOLUME_NAME>...

Quick hack to use multiple instances of Newtek NDI Scan Converter on MacOS

Xavier Rey-Robert — Sat, 29 Aug 2020 18:32:54 +0000

I'm prepping for some upcoming education sessions and I ran into an issue: I need multiple NDI video streams out of my Mac applications, so I needed a way around the NewTek NDI Scan Converter app's limit of one single feed.

If you stream live from your PC, for Videoconferencing, teaching or gaming, you might already know the free NDI tools from NewTek and the great addition they can be to OBS.

Check the NewTek NDI Tools download page.

This quick post shows how to allow multiple instances of NewTek Scan Converter to run on a Mac. You can then have multiple apps broadcast as multiple NDI streams. In the screenshot above you can see 3 NDI feeds: one from an iPhone cam, one from a Terminal and one from the 3D Heavens benchmark, all displayed at once in OBS.

While it's easy to spawn more than one Scan Converter app (open -n on the command line), the NDI stream name is hard-coded to "Scan Converter", so the outputs of the two instances conflict (and only one shows up).

So I came up with the following procedure to make it work:

Duplicate the NewTek NDI Scan Converter app and rename it to whatever you like (Hacked Scan Converter in my case below).

You will need a binary editor; you can get Hex Fiend.

Open the app package and look for the app binary: ->Contents->MacOS->NewTek NDI Scan Converter
Open it with Hex Fiend and search for the hex sequence: "53 63 61 6E 20 43 6F 6E 76 65 72 74 65 72 00 61 70 70 6C 69 63 61 74 69 6F 6E 4E 61 6D 65" (which is Scan Converter, a 00 byte, then applicationName). This is the string that is used for the NDI stream name.
Change Scan Converter to something like Hacked Scan 01 (it has to be exactly the same length).
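If you prefer scripting the patch over a hex editor, the same edit can be sketched in Python. This is only an illustration: `patch_stream_name` is a hypothetical helper, and the check that the marker occurs exactly once is a safety assumption of mine, not something verified against the app.

```python
# The hex sequence quoted above: "Scan Converter", a 00 byte, "applicationName"
OLD = bytes.fromhex(
    "53 63 61 6E 20 43 6F 6E 76 65 72 74 65 72 00 "
    "61 70 70 6C 69 63 61 74 69 6F 6E 4E 61 6D 65"
)
SUFFIX = b"\x00applicationName"
NAME_LEN = len(OLD) - len(SUFFIX)  # 14 bytes, the length of "Scan Converter"

def patch_stream_name(blob, new_name):
    """Return a copy of blob with the stream name replaced by a same-length name."""
    new = new_name.encode("ascii")
    if len(new) != NAME_LEN:
        raise ValueError(f"new name must be exactly {NAME_LEN} bytes long")
    if blob.count(OLD) != 1:
        raise ValueError("expected exactly one occurrence of the marker")
    return blob.replace(OLD, new + SUFFIX)

# Usage sketch (work on the duplicated app, never on the original):
#   from pathlib import Path
#   binary = Path("/Applications/Hacked Scan Converter.app/Contents/MacOS/NewTek NDI Scan Converter")
#   binary.write_bytes(patch_stream_name(binary.read_bytes(), "Hacked Scan 01"))
```

Because the replacement has the same length, every other offset in the binary stays where it was, which is why the name length must not change.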

As we've modified the app, its signature is now invalid, so we'll just get rid of it with:

codesign --remove-signature '/Applications/Hacked Scan Converter.app'

Now we have to change the bundle info so that the two apps won't interfere with each other in the security settings.

Edit /Applications/Hacked Scan Converter.app/Contents/Info.plist and change the CFBundleName and CFBundleIdentifier values to reflect the new name:

<key>CFBundleName</key>
<string>Hacked NDI Scan Converter</string>
<key>CFBundleIdentifier</key>
<string>com.hacked.Application-Mac-NDI-ScanConverter</string>
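The same two keys can also be changed programmatically. Below is a minimal sketch using Python's standard plistlib; `rename_bundle` is a hypothetical helper name, and note that it always writes XML, even if the input plist was in binary format.

```python
import plistlib

def rename_bundle(plist_bytes, name, identifier):
    """Return Info.plist content with CFBundleName and CFBundleIdentifier replaced."""
    info = plistlib.loads(plist_bytes)   # accepts XML and binary plists
    info["CFBundleName"] = name
    info["CFBundleIdentifier"] = identifier
    return plistlib.dumps(info)          # serialised as XML by default

# Usage sketch:
#   from pathlib import Path
#   plist = Path("/Applications/Hacked Scan Converter.app/Contents/Info.plist")
#   plist.write_bytes(rename_bundle(plist.read_bytes(),
#                                   "Hacked NDI Scan Converter",
#                                   "com.hacked.Application-Mac-NDI-ScanConverter"))
```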

Start both apps and make sure they have the right permissions in Security->Privacy->Screen Recording settings

You should now see Scan Converter and Hacked Scan 01 NDI sources available in NDI Monitor or other NDI apps.

Enjoy!

You can repeat the steps above, changing the names each time, if you need more than two NDI Scan Converter streams simultaneously...

Gigabyte GA-X79-UP4 rev 1.1 with Xeon E5 2697v2 - 12 cores 24 threads

Xavier Rey-Robert — Tue, 04 Aug 2020 22:02:39 +0000

This is a short post for people with a Gigabyte X79-UP4 wondering if they can upgrade their CPU to a 12-core Xeon E5 2697v2. It's probably not interesting for the rest of the world! As I found absolutely no success story online for this motherboard/CPU combination, I'm dropping it here for the archives :)

In 2014, I built myself a decent setup with an X79-UP4 and a 6-core i7-4930K. Six years later, in 2020, it is still a very nice workhorse and doesn't pale in comparison to more modern setups. As an example, the 2018 6-core i7 MacBook Pro I'm using for work is far below it in terms of performance under load (mainly due to thermal throttling). The 4930k is still a well-regarded CPU, and overclocking it under (simple) water cooling I can easily get all 6 cores running at 4.3 GHz.

Just recently, after upgrading my GPU for a Radeon RX5700XT and memory +32GB, I started to wonder if I could get a better CPU for my setup. So I started looking towards the Xeon E5 line.

When I picked the i7 4930k in 2014 it was priced at $600, but at the top of the Ivy Bridge line stood the Xeon E5 2697-v2 - 12 cores, 24 threads - for a bit less than $3000! It was the best CPU you could fit in the $6000+ 2013 Mac Pro I was drooling over at the time.

The e5 2697v2 is still sold by Intel new at $2000, but you can find used ones for much less. I picked mine for $170 directly from China!

When I checked Gigabyte's specifications, I realised that, unfortunately, the Xeon E5-2697v2 was not on the list of supported CPUs. Strangely, all the CPUs of the Ivy Bridge-E family are there except this one (and a few others). I contacted Gigabyte support and got an answer like "If it's not on the list, it's not supported. We recommend using CPUs from the list". Fine, but not supported because not tested, or tested and not working? 3 weeks later, the request is still open and not properly answered... Congrats, support.

Well, 3 weeks was actually the time needed for the CPU to arrive from China. At this price I didn't wait and decided I'd take the risk and try it myself! I could see no reason why the whole family would work but not this one...

And I was right! It works perfectly! And as I'm on summer vacation, I'm taking some time to tell the world about it, or at least to drop the information here in case someone else is googling down the same path.

So is it really an upgrade from an overclocked 4930k? Hmmm, not an easy answer.

My Geekbench score for the overclocked 4930k was 975 single core and 5884 multicore (all 6 cores running at 4.3 GHz). The non-overclocked E5 2697v2 is a bit disappointing with scores of 678 single core and 6439 multicore. That's respectively -30% and +9.4%.

The E5 2697v2 - like most Xeons - is locked and therefore not easily overclockable. I started playing with the bus clock, which is the only way to squeeze a little more juice out of it, and I got honourable 759 single core and 8480 multicore scores. That's respectively -22% and +44% versus the overclocked 4930k (the multicore score is +32% over the stock Xeon). But to make things worse, the 113 MHz bus clock boost led to some instability with my GPU...
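For anyone who wants to redo the arithmetic, here is a small sketch that computes the relative changes from the Geekbench scores quoted above. The function and variable names are mine; only the scores come from this post.

```python
def pct_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100

# (single core, multicore) Geekbench scores quoted in this post
OC_4930K = (975, 5884)    # 4930k overclocked to 4.3 GHz
XEON_STOCK = (678, 6439)  # E5 2697v2, not overclocked
XEON_BCLK = (759, 8480)   # E5 2697v2 with the bus clock raised

def compare(candidate, baseline):
    """Single core and multicore change of candidate versus baseline, rounded to 0.1."""
    return tuple(round(pct_change(c, b), 1) for c, b in zip(candidate, baseline))

if __name__ == "__main__":
    print("stock Xeon vs oc 4930k: ", compare(XEON_STOCK, OC_4930K))
    print("bclk Xeon vs oc 4930k:  ", compare(XEON_BCLK, OC_4930K))
    print("bclk Xeon vs stock Xeon:", compare(XEON_BCLK, XEON_STOCK))
```

With these inputs, the multicore gain of the bus-clocked Xeon comes out at about +44% over the overclocked 4930k and about +32% over the stock Xeon, so the baseline matters when quoting a single figure.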

Of course with a non-overclocked 4930k that would be a different story, and to be fair I've been using my 4930k at stock speed for 6 years, totally satisfied, and only started overclocking it a few weeks ago because I was going to receive the new CPU. It's been running rock steady on overclock since then, though.

On the temperature side, under standard/idle use (browsing/video) I reach 40° with the Xeon where it was 60° with the overclocked 4930k. Under heavy load (Cinebench) I would reach 80° with the 4930k, and I top out at 60° with the Xeon...

Using my system, I definitely cannot feel the -30% penalty on single core performance, and for some workloads it might still be a nice improvement. Compiling TensorFlow used to take about 1h; I will try and see how long it takes now.

Oh but wait, I'm just reading that there is the Xeon E5 1680v2 - 8 cores, 16 threads - with one interesting particularity in the Xeon line... it's unlocked, and I have the feeling that with this one, one could beat the single core performance of the 4930k and probably get close to the multicore performance of the 12-core E5 2697v2 when overclocked! See how close they are non-overclocked.

So I guess I could sell my 4930k and order a E5 1680v2, just to try.... or just stop here and wait...

Machine learning on macOs using Keras -> Tensorflow (1.15.0) -> nGraph -> PlaidML -> AMD GPU

Xavier Rey-Robert — Thu, 23 Jul 2020 08:46:06 +0000

Since CUDA is no longer available on macOS, options for using GPUs for machine learning on Macs are scarce.

After failing to find a practical way to do it, I resorted to using a second Linux computer with an Nvidia GPU for training my networks.

The arrival of macOS Catalina with Apple support for Navi AMD GPUs prompted me to give it another try. This was quite tough, so I decided to write it down to share the experience.

The easy way: Keras with PlaidML - No tensorflow involved

This is quite straightforward and I'm not going to cover it again here. You can check this article: https://medium.com/@bamouh42/gpu-acceleration-on-amd-with-plaidml-for-training-and-using-keras-models-57a9fce883b9

In my case that was not satisfying. Here Keras is using PlaidML as a backend, and I want to be able to use Kapre, which requires a TensorFlow backend. Kapre is a neat library providing Keras layers to calculate melspectrograms on the fly.

Be aware that the Keras team is stepping away from multi-backends, so the Keras -> PlaidML approach might be a dead end anyway.

The journey to Tensorflow execution on mac GPUs / eGPUs

The key element here is nGraph. Without entering into details, nGraph is pursuing a neutral approach in supporting multiple frameworks (Tensorflow, ONNX, etc.) and multiple hardware targets (Intel CPU, NNPs, etc) and luckily for us (not so! just wait) nGraph was also integrated with PlaidML to offer support for GPUs (Intel, Nvidia and... AMD).

So on paper all is great, we have a way to go:
Keras -> Tensorflow -> nGraph -> nGraph-bridge -> PlaidML -> Metal -> AMD GPU.

In this domain like in others, things are moving fast. So fast that it's not always easy to keep pace, and it's the same for the teams behind those projects. There is a lot of software involved, and things are changing so fast that developers don't have time - or don't take time - to settle things down.

The nGraph-bridge team hasn't done a proper release since August 2019 (v0.18.1), and while they are still actively working on the project, they seem to have been focusing on big refactoring.

To make things worse, PlaidML support was (silently) dropped from nGraph in April without much explanation or warning, so forget about using the latest GitHub master to try to sort it out! I spent hours wondering why it wasn't working when it was simply not there anymore.

Why was the PlaidML bridge dropped?

It seems that the future path to happiness will be Keras -> Tensorflow -> MLIR -> PlaidML -> ... and all are prepping for the jump when MLIR as a TensorFlow backend is released... in 2021! But as of today, users are just left hanging in midair.

What are your options ?

At time of writing the latest release is ngraph-bridge v0.18.1 (dated 20 Aug 2019!). It's using tensorflow v1.14.0 - Argh! Kapre requirement is tensorflow v1.15 - Dead end again.

I should mention that you'd better not use prebuilt wheels: I realized that not all of them are compiled with PlaidML backend support. So your best chance is to build nGraph and nGraph-bridge from source, and you'd better have all the stars aligned for that to happen flawlessly. A lot of things can go wrong: Python versions, bazel versions, library incompatibilities, bugs to fix in the code, etc... all the joys of Python.

Picking a release candidate to build

v0.19.0-rc9 brings TensorFlow v1.15.0; nGraph 0.28.0-rc1 - the recommended last stable baseline - is on TensorFlow v1.14.0.

I need TF 1.15, so let's try with v0.19.0-rc10 then... Of course the standard build crashes miserably, which leads me to think that this rc was probably never compiled/tested with PlaidML support on Mac: clang fails because of an incomplete switch statement in plaidml_translate.cpp.

We will fix it by adding this line to the switch(dt) in the tile_converter function:

case PLAIDML_DATA_BFLOAT16: return "as_bfloat16(" + tensor_name + ", 16)";

See the complete build instructions below.

If everything goes right you should end up with something like this:

TensorFlow version:  1.15.0
C Compiler version used in building TensorFlow:  4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)
nGraph bridge version: b'0.19.0-rc10'
nGraph version used for this build: b'0.25.1-rc.10+90c70dd'
TensorFlow version used for this build: v1.15.0-rc3-22-g590d6eef7e
CXX11_ABI flag used for this build: 0
nGraph bridge built with Grappler: False
nGraph bridge built with Variables and Optimizers Enablement: False

Final thoughts - Use at your own risks

OK, we have a working environment, but there are so many interlocking (fresh) software bricks that we have no guarantee that all this will run properly in all circumstances.
Using Kapre for example, I'm able to use the _mel_spectrogram_ layer just fine, but ngraph-bridge will crash with a "Caught exception while executing nGraph computation: syntax error" when trying to use the STFT layer...

I will not abandon my Linux deep learning workhorse quite yet, but at least I have an environment to try out that will use my MacBook Pro GPU on the go and my Catalina / AMD RX 5700 XT setup at home.

The complete build instructions

I'm putting below what worked for me - I retested on a fresh Mac after days of messing around -

Make sure you have a proper python3 installation (I won't cover it). I'm using 3.7, installed with brew install python@3.7.

git clone https://github.com/tensorflow/ngraph-bridge.git
cd ngraph-bridge

git checkout v0.19.0-rc10

# Install bazel (bazelisk was a mess)
export BAZEL_VERSION=0.25.2 

curl -LO "https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-darwin-x86_64.sh"

chmod +x "bazel-${BAZEL_VERSION}-installer-darwin-x86_64.sh"
./bazel-${BAZEL_VERSION}-installer-darwin-x86_64.sh --user

source ~/.bazel/bin/bazel-complete.bash

# Add $HOME/bin to your PATH in .zshrc (or .bashrc) and source it

echo 'export PATH="$PATH:$HOME/bin"' >> ~/.zshrc
source ~/.zshrc

# check bazel 
bazel version

# I like to start with a fresh venv dedicated to the build

python3 -m venv build-venv
source build-venv/bin/activate

# Recommended virtualenv v16.0.0 didn't work, I ended up using latest version

python3 -m pip install virtualenv

#Install tensorflow from wheel (find the right one here: https://pypi.org/project/tensorflow/1.15.0/#files)

python3 -m pip install https://files.pythonhosted.org/packages/dc/65/a94519cd8b4fd61a7b002cb752bfc0c0e5faa25d1f43ec4f0a4705020126/tensorflow-1.15.0-cp37-cp37m-macosx_10_11_x86_64.whl

#start the build

python3 build_ngtf.py --use_prebuilt_tensorflow --build_plaidml_backend

# When the build fails edit plaidml_translate.cpp from ngraph to add the missing case 

vi build_cmake/ngraph/src/ngraph/runtime/plaidml/plaidml_translate.cpp

#re-start the build

python3 build_ngtf.py --use_prebuilt_tensorflow --build_plaidml_backend

Some hints for the records:

When installing Kapre you might run into:

AttributeError: module 'enum' has no attribute 'IntFlag'

This is solved by removing enum34 (version 1.1.10 in my case):

pip uninstall enum34

When importing Librosa, you might run into:

ModuleNotFoundError: No module named 'numba.decorators'

This is solved by using an older version of numba:

pip install numba==0.48