DEV Community

Sharath Hebbar
Sharath Hebbar

Posted on

Joblib

Joblib

Joblib is a set of tools to provide lightweight pipelining in Python. In particular: transparent disk-caching of functions and lazy re-evaluation (memoize pattern) easy simple parallel computing.

Why it is used?

  • Better performance
  • reproducibility
  • Avoid computing the same thing twice
  • Persist to disk transparently

Features

Transparent and fast disk-caching of output value
Embarrassingly parallel helper
Fast compressed Persistence

Importing libraries

from joblib import Memory,Parallel, delayed,dump,load
import pandas as pd
import numpy as np
import math
Enter fullscreen mode Exit fullscreen mode

Data Creation

my_dir = '/content/sample_data'
a = np.vander(np.arange(3))
print(a)
output: [[0 0 1]  [1 1 1]  [4 2 1]]
Enter fullscreen mode Exit fullscreen mode

Memory

mem = Memory(my_dir)
output: [[ 0  0  1]  [ 1  1  1]  [16  4  1]]
sqr = mem.cache(np.square)
b = sqr(a)
print(b)
output: [[ 0  0  1]  [ 1  1  1]  [16  4  1]]
Enter fullscreen mode Exit fullscreen mode

Parallel

%%time
Parallel(n_jobs=1)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 2.85 ms, sys: 0 ns, total: 2.85 ms
Wall time: 3 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
%%time
Parallel(n_jobs=2)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 42.7 ms, sys: 762 µs, total: 43.5 ms
Wall time: 75.9 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
%%time
Parallel(n_jobs=3)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 92.9 ms, sys: 8.93 ms, total: 102 ms
Wall time: 151 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Enter fullscreen mode Exit fullscreen mode

Dump

dump(a,'/content/sample_data/a.job')
output: ['/content/sample_data/a.job']
Load
aa = load('/content/sample_data/a.job')
print(aa)
output: array([[0, 0, 1],        [1, 1, 1],        [4, 2, 1]])
Enter fullscreen mode Exit fullscreen mode

References

Documentation: https://joblib.readthedocs.io
Download: https://pypi.python.org/pypi/joblib#downloads
Source code: https://github.com/joblib/joblib
Report issues: https://github.com/joblib/joblib/issues

Source:
https://medium.com/r/?url=https%3A%2F%2Fgithub.com%2FSharathHebbar%2FData-Science-and-ML%2Ftree%2Fmain%2Fcodes%2Fjoblib

Top comments (0)