You will often want to save Python objects for later use: for example, a dataset you constructed that could serve several projects, a transformer object with specific parameters that you want to apply to different data, or even a machine learning model you trained.
Here is how to do it with joblib. First, we create some dummy data using NumPy.
import numpy as np
a = np.random.chisquare(45, size=(10000, 50))
Let’s say you want to transform it using the quantile transformer and save the transformed data for later use. Here is how you would transform it.
from sklearn.preprocessing import QuantileTransformer
qt = QuantileTransformer(n_quantiles=1000, random_state=10)
a_qt = qt.fit_transform(a)
Save Python objects
To save a_qt with joblib, use the dump function.
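import os
import joblib
# joblib.dump does not create missing folders, so make sure 'out' exists first
os.makedirs('out', exist_ok=True)
joblib.dump(a_qt, 'out/a_qt.pckl')
You can even save the transformer object qt.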
joblib.dump(qt, 'out/qt.pckl')
Load Python objects
To load the objects, use joblib's load function.
# load array
b_qt = joblib.load('out/a_qt.pckl')
# load transformer
qt2 = joblib.load('out/qt.pckl')
You can check that the loaded objects match the saved ones by printing the shapes of the arrays and the classes of the transformer objects.
print('Shape of a_qt: ', a_qt.shape)
print('Shape of b_qt: ', b_qt.shape)
print('Class of qt', type(qt))
print('Class of qt2', type(qt2))
Shape of a_qt: (10000, 50)
Shape of b_qt: (10000, 50)
Class of qt <class 'sklearn.preprocessing._data.QuantileTransformer'>
Class of qt2 <class 'sklearn.preprocessing._data.QuantileTransformer'>
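Matching shapes and classes are only a coarse check. As a minimal sketch of a stricter one, the round trip can be repeated and the contents compared directly. The smaller array, the temporary folder (standing in for 'out'), and the variable names arrays_match and transforms_match are assumptions made for this illustration, not part of the steps above.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Smaller dummy data than above, only to keep the check fast (assumption).
rng = np.random.default_rng(10)
a = rng.chisquare(45, size=(1000, 5))

qt = QuantileTransformer(n_quantiles=100, random_state=10)
a_qt = qt.fit_transform(a)

# A temporary folder stands in for the 'out' folder used above.
with tempfile.TemporaryDirectory() as tmp_dir:
    array_path = os.path.join(tmp_dir, 'a_qt.pckl')
    transformer_path = os.path.join(tmp_dir, 'qt.pckl')

    # Save both objects, then load them back.
    joblib.dump(a_qt, array_path)
    joblib.dump(qt, transformer_path)
    b_qt = joblib.load(array_path)
    qt2 = joblib.load(transformer_path)

# Compare the array contents, not just the shapes.
arrays_match = np.array_equal(a_qt, b_qt)

# Compare what the original and reloaded transformers produce on the same data.
transforms_match = np.allclose(qt.transform(a), qt2.transform(a))

print('Arrays identical:', arrays_match)
print('Transformers agree:', transforms_match)
```

If either flag comes back False, the saved file is probably not the one you expected to load, for instance because an older file with the same name was left in the folder.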