
Ray: Memory Management When Calling Tune.run() Multiple Times Within Python Script

I have a Python script that trains a reinforcement learning model using, among others, the libraries Ray and RLlib. The script uses checkpointing to update an rllib.PPO model iter

Solution 1:

Ah, I solved the issue. It was not necessary to release any resources after calling tune.run(); the memory issue was caused by building a TensorFlow graph within each iteration. I realised that, quite annoyingly, the only way to release resources allocated by TensorFlow is to terminate the Python interpreter (closing the TensorFlow session does not release them). I therefore wrote a separate script that builds and trains the graph, and I call it using os.system() so that each run gets its own interpreter. Quite hacky, but I am not aware of any other solution.
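A minimal sketch of that workaround, under a few assumptions: the graph building and training live in their own script, and the helper names below (run_in_fresh_interpreter, train_repeatedly) are hypothetical, not taken from the original code. It swaps os.system() for subprocess.run(), which launches a child interpreter in the same way but makes the exit code easier to check.

```python
import subprocess
import sys


def run_in_fresh_interpreter(script_path, *args):
    """Run a script in a new Python process and return its exit code.

    Memory held by the child process is handed back to the operating
    system when the child exits, whatever the libraries inside it did.
    """
    # sys.executable is the interpreter running this script, so the child
    # uses the same Python installation and the same installed packages.
    command = [sys.executable, script_path, *map(str, args)]
    completed = subprocess.run(command)
    return completed.returncode


def train_repeatedly(script_path, n_iterations):
    """Call the (hypothetical) training script once per iteration."""
    for iteration in range(n_iterations):
        exit_code = run_in_fresh_interpreter(script_path, iteration)
        if exit_code != 0:
            raise RuntimeError(
                f"iteration {iteration} failed with exit code {exit_code}"
            )
```

Calling train_repeatedly("train_once.py", 10) would start the script ten times, passing the iteration number as the only command-line argument; train_once.py is a placeholder name. Since nothing survives in memory between runs, the training script has to save and restore state itself, for example through the checkpoints the question mentions.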

