I was looking at this repository (https://github.com/jeffheaton/jh-kaggle-util) and was wondering whether there's a better approach to building a scalable machine learning pipeline for data science projects.
I feel inspired to build a pipeline that follows the structure outlined in the repo above, but I'm having trouble following the author's code. I've also struggled to find other examples of scalable machine learning pipelines online.
I'd love a second opinion on this, along with recommendations for any relevant resources. Thanks!