What is the easiest way to duplicate my model and run it on 10 cloud machines?


I have built a model on a cloud machine (Google Cloud). A single run takes from a few hours to a day. I need to sweep some hyperparameters, such as learning rate and batch size.

How do I duplicate my Compute Engine instance, run the model with different parameters on each copy, collect the results, and then turn the copies off?

Edit: I have a neural network model that takes 24 hours to train. Without the cloud I would normally do a grid search over learning rate in {0.001, 0.003, 0.1} and batch size in {32, 64, 128}. That is 9 runs, so run one after another it would take 9 days.
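To make the grid concrete, here is a minimal Python sketch that enumerates the 9 combinations and checks the timing arithmetic. The variable names and the dict layout are only illustrative assumptions; the values are the ones listed above.

```python
from itertools import product

# Hyperparameter values taken from the question.
learning_rates = [0.001, 0.003, 0.1]
batch_sizes = [32, 64, 128]

# One dict per run: 3 learning rates x 3 batch sizes = 9 configurations.
configs = [
    {"learning_rate": lr, "batch_size": bs}
    for lr, bs in product(learning_rates, batch_sizes)
]

hours_per_run = 24
sequential_days = len(configs) * hours_per_run / 24  # runs done one after another
parallel_days = hours_per_run / 24                   # one machine per configuration

print(f"{len(configs)} runs: {sequential_days} days sequential, "
      f"{parallel_days} day(s) with one machine per run")
```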

With cloud computing I can do this grid search in 24 hours, but currently I have to do the following manually: save my original Compute Engine instance to a snapshot, create 8 instances from the snapshot, start them all, run the model on each with a different parameter combination, copy the results back to the original instance, and shut each instance down after the copy.
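The manual steps above could be sketched as a script that builds one sequence of `gcloud` commands per configuration. This is only a dry-run sketch under stated assumptions, not a tested deployment: the zone, snapshot name, instance names, result paths, and the `train.py --lr ... --batch-size ...` entry point are all hypothetical placeholders, and the `gcloud` flags should be checked against the installed SDK version. For uniformity it plans 9 fresh workers rather than reusing the original instance for one of the runs.

```python
import shlex
from itertools import product

# Hypothetical placeholders; none of these values come from the question.
ZONE = "us-central1-a"
SNAPSHOT = "model-snapshot"      # snapshot of the original instance's boot disk
TRAIN_CMD = "python train.py"    # assumed training entry point on that disk
REMOTE_RESULTS = "~/results"     # assumed output directory on each worker
LOCAL_RESULTS = "results"        # where results are collected locally

def plan_for(index, lr, bs):
    """Return the gcloud commands (as argv lists) for one grid point."""
    name = f"grid-worker-{index}"
    train = f"{TRAIN_CMD} --lr {lr} --batch-size {bs}"  # hypothetical script flags
    return [
        # 1. Create a worker whose boot disk comes from the snapshot.
        ["gcloud", "compute", "instances", "create", name,
         f"--zone={ZONE}", f"--source-snapshot={SNAPSHOT}"],
        # 2. Run the training job on the worker.
        ["gcloud", "compute", "ssh", name, f"--zone={ZONE}", f"--command={train}"],
        # 3. Copy the results back.
        ["gcloud", "compute", "scp", "--recurse", f"--zone={ZONE}",
         f"{name}:{REMOTE_RESULTS}", f"{LOCAL_RESULTS}/{name}"],
        # 4. Delete the worker so it stops incurring cost.
        ["gcloud", "compute", "instances", "delete", name,
         f"--zone={ZONE}", "--quiet"],
    ]

grid = product([0.001, 0.003, 0.1], [32, 64, 128])
plans = [plan_for(i, lr, bs) for i, (lr, bs) in enumerate(grid)]

# Dry run: print the commands instead of executing them.
for plan in plans:
    for argv in plan:
        print(shlex.join(argv))
```

To actually execute a plan, each argv list could be passed to `subprocess.run(argv, check=True)`, with the per-worker sequences run concurrently (for example with `concurrent.futures.ThreadPoolExecutor`) so that all 9 jobs train at the same time. A long training run over an SSH session is fragile, so a startup script or a detached process on the worker may be a better fit in practice.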

The question is: how do I automate this? Answers for GCP or AWS are welcome.


Posted 2019-12-23T14:32:37.717

Reputation: 111

Question was closed 2019-12-25T08:41:59.733

Hi, I don't know Google Cloud, but it would probably be useful to clarify what you want to do: what do you mean by "scanning some parameters" and by "duplicating my compute engine"? You might want to ask Google Cloud support, because the question seems very specific to their platform. – Erwan – 2019-12-24T11:44:45.130

No answers