337

391

Lots of people use a main tool like Excel or another spreadsheet, SPSS, Stata, or R for their statistics needs. They might turn to some specific package for very special needs, but a lot of things can be done with a simple spreadsheet or a general stats package or stats programming environment.

I've always liked Python as a programming language, and for simple needs, it's easy to write a short program that calculates what I need. Matplotlib allows me to plot it.

Has anyone switched completely from, say R, to Python? R (or any other statistics package) has a lot of functionality specific to statistics, and it has data structures that allow you to think about the statistics you want to perform and less about the internal representation of your data. Python (or some other dynamic language) has the benefit of allowing me to program in a familiar, high-level language, and it lets me programmatically interact with real-world systems in which the data resides or from which I can take measurements. But I haven't found any Python package that would allow me to express things with "statistical terminology" – from simple descriptive statistics to more complicated multivariate methods.

What can you recommend if I wanted to use Python as a "statistics workbench" to replace R, SPSS, etc.?

What would I gain and lose, based on your experience?

5

FYI, there is a new python stats subreddit that is going off: http://www.reddit.com/r/pystats

– naught101 – 2013-12-04T04:12:52.553