TensorFlow: a solution to memory exhaustion caused by repeatedly loading models

I had a few saved deep learning models. From the test data, I wanted to build subsets in a specific way and evaluate them. Each subset was a combination of a few randomly chosen samples from the test data, and I wanted to evaluate thousands of such subsets on every model.

For this, the program needed to load the models for each subset. I could have loaded all the models into one workspace, but my system did not have enough memory for that.

As the evaluation proceeded, memory usage kept climbing even though I cleared memory after each evaluation as follows:

Python
import gc
import tensorflow as tf

for _ in range(n):

    evaluation_function(*args, **kwargs)

    # delete the model and flush memory
    del model
    tf.compat.v1.reset_default_graph()
    tf.keras.backend.clear_session()
    tf.config.experimental.reset_memory_stats(sel_gpu)  # sel_gpu e.g. 'GPU:0'
    gc.collect()

The following is the memory usage log:

log text
[After dataset 0] Memory usage: 605.10 MB
[After dataset 220] Memory usage: 19494.14 MB
[After dataset 230] Memory usage: 20301.51 MB
[After dataset 240] Memory usage: 21134.62 MB
[After dataset 250] Memory usage: 21952.45 MB
[After dataset 260] Memory usage: 22774.45 MB
[After dataset 270] Memory usage: 23605.70 MB
[After dataset 280] Memory usage: 24416.38 MB
[After dataset 290] Memory usage: 25248.49 MB
[After dataset 300] Memory usage: 26067.98 MB
[After dataset 310] Memory usage: 26876.53 MB
[After dataset 320] Memory usage: 27712.44 MB
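The figures above appear to be the resident memory of the running Python process. A minimal sketch of how such a log line can be produced, assuming psutil is used to read the resident set size (psutil and the exact message text are my assumptions, not part of the original script):

Python
import os
import psutil

def log_memory_usage(dataset_idx):
    # report the resident memory of the current process in MB
    rss_mb = psutil.Process(os.getpid()).memory_info().rss / (1024 ** 2)
    print(f"[After dataset {dataset_idx}] Memory usage: {rss_mb:.2f} MB")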

The only thing that worked was to run the evaluation function in a separate Python process. The following is the format I used:

Python
# imports

# helper functions

# helper statements

def evaluation_function(*args, **kwargs):
    # statements
    # call evaluation helper functions
    ...

# import Process
from multiprocessing import Process

desired_iters = 1000
c = 0
while c < desired_iters:
    # run prediction in a separate process
    p = Process(target=evaluation_function,
                args=args, kwargs=kwargs)
    p.start()
    p.join()  # wait for the subprocess to finish

    c += 1

In each Process iteration, I evaluated only as many subsets as my system memory could handle. Each process stored its result in a separate file, and the files were later combined and analyzed, as sketched below.
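A hypothetical sketch of that pattern follows; the file naming, the result structure, and the use of pandas are assumptions for illustration, not the original code:

Python
import glob
import pandas as pd

def evaluation_function(subset_ids, result_path):
    results = []
    for subset_id in subset_ids:
        # load the model, evaluate the subset, and record the metrics here
        results.append({"subset_id": subset_id, "score": 0.0})  # placeholder score
    pd.DataFrame(results).to_csv(result_path, index=False)

# after all subprocesses have finished, combine the per-process result files
combined = pd.concat(
    (pd.read_csv(f) for f in sorted(glob.glob("results_*.csv"))),
    ignore_index=True,
)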

The following memory usage log shows how the memory usage drops back after each Process finishes:

log text
[After dataset 430] Memory usage: 36763.20 MB
[After dataset 440] Memory usage: 37616.81 MB
[After dataset 450] Memory usage: 38419.14 MB
[After dataset 460] Memory usage: 39266.66 MB
[After dataset 470] Memory usage: 40063.92 MB
[After dataset 480] Memory usage: 40867.15 MB
[After dataset 490] Memory usage: 41704.53 MB
[After dataset 500] Memory usage: 42522.88 MB
[After dataset 0] Memory usage: 605.57 MB
[After dataset 10] Memory usage: 2210.55 MB
[After dataset 20] Memory usage: 3034.72 MB
[After dataset 30] Memory usage: 3849.43 MB
[After dataset 40] Memory usage: 4684.55 MB
[After dataset 50] Memory usage: 5503.93 MB
[After dataset 60] Memory usage: 6324.65 MB

The evaluation_function was a complex function that called several other helper functions. All of those helper functions were automatically available in the subprocess, and the global variables defined in the script could also be used.
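How the helper functions and globals become visible in the subprocess depends on the multiprocessing start method: with fork (the default on Linux) the child inherits the parent's memory, while with spawn (the default on Windows and macOS) the module is re-imported, so the driver loop must sit under a main guard. A minimal sketch of that guard, with a placeholder argument of my own:

Python
from multiprocessing import Process

def evaluation_function(*args, **kwargs):
    ...  # helper functions and module-level globals are reachable here as well

if __name__ == "__main__":  # required when the start method is spawn
    p = Process(target=evaluation_function, args=("subset_0",))
    p.start()
    p.join()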