I can't tell from your question how comfortable you are with mathematics or how far your studies went. Since you are a software engineer, I'll assume you're familiar with algebra, geometry, and perhaps some calculus.

I'd recommend you start by reading up on statistics and getting comfortable with concepts like descriptive statistics, exploratory data analysis, correlation, distributions, and so on. I see that you prefer books to videos, so I'll meet you halfway and suggest a few books that are available online, as well as a book or two that you can buy in print.
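To give a feel for what those first concepts look like in practice, here is a minimal sketch in Python using only the standard library. The data (`hours`, `scores`) and the helper `pearson` are made up for illustration and don't come from any of the books below; treat it as a starting point, not a reference implementation.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Hypothetical toy data: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]

# Descriptive statistics for one variable.
summary = {
    "mean": statistics.fmean(scores),
    "median": statistics.median(scores),
    "stdev": statistics.stdev(scores),  # sample standard deviation
}

# Strength of the linear relationship between the two variables.
r = pearson(hours, scores)

print(summary)
print(f"correlation between hours and scores: {r:.3f}")
```

A value of `r` close to 1 means the two variables rise together almost linearly. The books below explain when that summary is trustworthy and when it is misleading.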

First, I'd recommend Penn State's online graduate course curriculum in statistics. You can explore each of their courses using the menu on the left. Once you select a course, scroll down on the course's webpage and click on the link that reads "online course notes". These are much more than notes: they read like full books and are very instructive. Also check out Penn State's online undergraduate course curriculum in statistics, in case you find something in the graduate coursework too advanced and want a "simpler" explanation.

Second, review the Handbook of Biological Statistics by John H. McDonald. Don't let the title fool you; this book is an excellent primer on statistics and data analysis that is applicable to any domain.

Third, review The Little Handbook of Statistics by Gerard Dallal. Again, don't let the title fool you; this book is another gem that walks you through some important statistics fundamentals.

Fourth, check out the book Think Stats by Allen Downey. An earlier edition is free online; the most recent edition you'll have to buy. It's worth it, though, especially if you work in Python. In this book, the author teaches you statistics and data analysis by using Python to analyze real-world (toy) datasets. This is a really great book to work through.

Lastly, check out Data Science from Scratch by Joel Grus. This book focuses more on data analysis than on statistics fundamentals, and places a greater emphasis on machine learning and modeling. It uses Python (and the Python data science stack) to walk you through analyzing real-world (toy) datasets and running predictive analytics on them. Another great book to work through.

I have a BSc (Honours) in statistics and am currently taking an online master's programme in Data Science with Simplilearn. To be a data scientist, one needs a strong background in statistics, because most machine learning models build on maths and stats that are taught at degree level or higher. My advice would be to read Data science handbook with Python. Send me an email at pchiita@gmail.com so I can share my material on my Google Drive; I have a lot of good books. Happy learning. – Paul Chiita – 2018-05-26T12:37:50.640