Ray: Memory Management When Calling Tune.run() Multiple Times Within Python Script
I have a Python script that trains a reinforcement learning model using, among other libraries, ray and rllib. The script uses checkpointing to update an rllib.PPO model iter
Solution 1:
Ah, I solved the issue. It was not necessary to release any resources after calling tune.run(); the memory issue was caused by building a TensorFlow graph within each iteration. I realised that, quite annoyingly, the only way to release resources allocated by TensorFlow is to terminate the Python interpreter (closing the TensorFlow session does not release them). I therefore wrote a separate script that builds and trains the graph, and I call it from the main script using os.system(). Quite hacky, but I am not aware of any other solutions.
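For illustration, here is a minimal sketch of that pattern: each training iteration runs in a fresh Python process, so whatever memory the graph allocated is returned to the operating system when the process exits. It uses subprocess.run rather than os.system() (a substitution on my part, not what the answer used). The function names and the script path are hypothetical, and the training script itself is assumed to load and save its own checkpoint.

```python
import subprocess
import sys


def run_in_fresh_interpreter(script_path, *args):
    """Run a script in a new Python process and return its exit code.

    Memory allocated by the child process is freed when the child exits,
    so nothing accumulates in the parent between calls.
    """
    completed = subprocess.run([sys.executable, script_path, *args])
    return completed.returncode


def train_iterations(script_path, n_iterations):
    """Call the (hypothetical) training script once per iteration.

    The iteration index is passed as a command-line argument; the script is
    assumed to restore the latest checkpoint, train, and save a new one.
    """
    for i in range(n_iterations):
        code = run_in_fresh_interpreter(script_path, str(i))
        if code != 0:
            raise RuntimeError(f"iteration {i} exited with code {code}")
```

A call such as train_iterations("build_and_train.py", 10) would then replace the in-process loop, where build_and_train.py is a placeholder name for the script that builds the graph and calls tune.run(). Passing the arguments as a list avoids shell quoting problems, and the exit code makes a failed iteration visible to the parent.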