I come from a software development background, where we keep separate servers of the same database (dev, test, prod). We develop our apps against the dev DB, run tests against the test DB, and prod is prod. This gives us a clear separation, so we won't bring down prod while building our app.
Do you guys train your models the same way? That is, do you have 3 environments of the same database, and as your model goes from dev to test to prod, does it train against the corresponding environment? For example:
Data scientist plays around with 3 different algos for classification. Creates 3 models (A, B, C) using the dev env's database.
Data scientist evaluates the 3 models and selects Model A after testing/validation.
Data scientist deploys the code to TESTING/STAGING env (same hyperparameters). This time, however, the model trains using TESTING/STAGING env's databases. A version of Model A is created using data from the TESTING/STAGING databases.
Data scientist deploys the code to PROD env (same hyperparameters). This time, however, the model trains using PROD env's databases. A version of Model A is created using data from the PROD databases.
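To make the question concrete, here is a rough Python sketch of the workflow I have in mind. Everything in it is made up for illustration: the `DATABASES` dict stands in for the three real databases, `ThresholdModel` is a toy classifier standing in for Model A, and `HYPERPARAMS` and `train` are hypothetical names. The only point is that the same code and hyperparameters run in every environment, while the training data (and so the fitted model) differs.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical stand-ins for the dev/test/prod databases: (feature, label) rows.
DATABASES = {
    "dev": [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)],
    "test": [(0.0, 0), (2.0, 0), (6.0, 1), (8.0, 1)],
    "prod": [(1.0, 0), (3.0, 0), (9.0, 1), (11.0, 1)],
}

# Hyperparameters picked once in dev and reused unchanged in every environment.
HYPERPARAMS = {"margin": 0.0}


@dataclass
class ThresholdModel:
    """Toy 'Model A': predicts class 1 when the feature exceeds a threshold."""
    env: str
    threshold: float

    def predict(self, x: float) -> int:
        return int(x > self.threshold)


def train(env: str, hyperparams: dict) -> ThresholdModel:
    """Fit the toy model on the data of the given environment."""
    rows = DATABASES[env]
    mean0 = mean(x for x, y in rows if y == 0)
    mean1 = mean(x for x, y in rows if y == 1)
    # Threshold is the midpoint between class means, shifted by the margin.
    threshold = (mean0 + mean1) / 2 + hyperparams["margin"]
    return ThresholdModel(env=env, threshold=threshold)


# Same code, same hyperparameters, one model version per environment.
models = {env: train(env, HYPERPARAMS) for env in ("dev", "test", "prod")}

for env, model in models.items():
    print(env, model.threshold)
```

In this sketch each environment ends up with its own version of Model A, so the same input can be classified differently in dev and prod even though the code is identical.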