Is a switch from R to Python worth it?



I just finished a 1-year Data Science master's program where we were taught R. I found that Python is more popular and has a larger community in AI.

Is it worth it for someone in my position to switch to Python and, if yes, why? Does Python have any game-changing features not available in R, or is it just a matter of community?


Posted 2019-08-04T22:24:19.287

Reputation: 483

Question was closed 2019-11-19T04:26:27.647

You can't switch yourself to Python. You are not talking about a project you already wrote in R and want to port to Python; you are simply asking about learning Python (not forgetting R). Is it worth learning Python? Nowadays it is almost impossible not to learn Python if you work with anything related to data handling with a computer... – lvella – 2019-08-05T14:07:01.553

It might also be worth mentioning that if you are writing papers or preparing presentations, you would have to compare the tools there too. For example, in the case of R you would have Sweave or knitr for static papers or (beamer class) presentations, and in the case of Python you would have Jupyter Notebook (formerly IPython Notebook) for dynamic notebooks (which you can also export to TeX). – phk – 2019-08-06T06:59:53.263



I want to reframe your question.

Don't think about switching, think about adding.

In data science you'll be able to go very far with either Python or R, but you'll go farthest with both.

Python and R integrate very well, thanks to the reticulate package. I often tidy data in R because it is easier for me, train a model in Python to benefit from superior speed, and visualize the outcomes in R with beautiful ggplot charts, all in one notebook!

If you already know R, there is no sense in abandoning it; use it where it is sensible and easy for you. But it is 100% a good idea to add Python for many uses.

Once you feel comfortable in both, you'll have a workflow that fits you best, dominated by your favorite language.


Posted 2019-08-04T22:24:19.287

Reputation: 855

Comments are not for extended discussion; this conversation has been moved to chat.

– nbro – 2020-03-06T01:13:02.433


Of course, this type of question will lead to primarily opinion-based answers. Nonetheless, it is possible to enumerate the strengths and weaknesses of each language with respect to machine learning, statistics, and data analysis tasks, which I will try to list below.



  • R was designed and developed for statisticians and data analysts, so it provides out of the box (that is, as part of the language itself) features and facilities for statisticians that are not available in Python unless you install a related package. For example, R has a built-in data frame, which Python does not provide unless you install the well-known pandas package. Other examples include matrices and vectors. Python has similar data structures, but they are more general and not specifically targeted at statisticians.

  • There are a lot of statistical libraries.




  • A lot of people and companies, including Google and Facebook, invest a lot in Python. For example, the main programming language of TensorFlow and PyTorch (two widely used machine learning frameworks) is Python. So, it is very unlikely that Python won't continue to be widely used in machine learning for at least 5-10 more years.

  • The Python community is likely a lot bigger than the R community. For example, if you look at the TIOBE index, Python is placed 3rd, while R is placed 20th.

  • Python is also widely used outside of the statistics or machine learning communities. For example, it is used for web development (see e.g. the Python frameworks Django or Flask).

  • There are a lot of machine learning libraries (e.g. TensorFlow and PyTorch).


  • Python does not provide, out of the box, the statistical and data analysis functionality that R provides, unless you install an appropriate package. This might be a weakness or a strength, depending on your philosophical point of view.
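
As a rough illustration of the out-of-the-box point above, here is a minimal sketch that uses only Python's standard library (no pandas). The rows, column names, and the `grouped_mean` helper are made up for the example. A grouped mean, which base R handles directly on a data frame, has to be assembled by hand from general-purpose containers:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: each row is a plain dict, not a data frame.
rows = [
    {"group": "a", "value": 1.0},
    {"group": "a", "value": 3.0},
    {"group": "b", "value": 10.0},
]


def grouped_mean(rows, key, column):
    """Return {key value: mean of column} using only built-in containers."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(row[column])
    return {k: mean(v) for k, v in buckets.items()}


result = grouped_mean(rows, "group", "value")
print(result)  # {'a': 2.0, 'b': 10.0}
```

With a package such as pandas installed, the same operation becomes a short method chain, which is the trade-off this point describes.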

There are other possible advantages and disadvantages of these languages. For example, both languages are dynamic. However, this feature can be both an advantage and a disadvantage (and it is not strictly related to machine learning or statistics), so I did not list it above. I avoided mentioning opinionated language features, such as code readability and learning curve, for obvious reasons (e.g. not all people have the same programming experience).


Python is definitely worth learning if you are studying machine learning or statistics. However, it does not mean that you will not use R anymore. R might still be handier for certain tasks.


Posted 2019-08-04T22:24:19.287

Reputation: 19 783

It seems like the "out of the box" feature set is irrelevant. The relevant thing is the availability of packages that do what you want, no? – Dean MacGregor – 2019-08-05T14:34:19.617

@DeanMacGregor If you do not have access to the internet, this feature is relevant! Furthermore, if a programming language already provides a feature out of the box, you do not have to lose time looking for it. – nbro – 2019-08-05T17:44:14.330

Considering Python is heavily invested in being 'batteries included', this weakness is not one you encounter often, especially since there are Python distributions in use that do have statistical packages included. For data science in particular, Anaconda is quite popular and solves your immediate concern.

– Mast – 2019-08-07T07:06:36.613


I didn't have this choice because I was forced to move from R to Python:

It depends on your environment: when you are embedded in an engineering department, a technical working group, or something similar, then Python is more feasible.

When you are surrounded by scientists and especially statisticians, stay with R.

PS: R offers keras and tensorflow as well, though they are implemented with Python under the hood. Only very advanced stuff will make you need Python. Though I'm getting more and more used to Python, the syntax in R is easier. And though each R package has its own syntax, it is somehow consistent, while Python is not. And ggplot is so strong. Python has a clone (plotnine), but it lacks several (important) features. In principle you can do nearly as much in Python as in R, but visualization and data wrangling especially are much easier in R. It is telling that the most famous Python library, pandas, is a clone of R's data frames.

PPS: Advanced statistics definitely points to R. Python offers a lot of everyday tools and methods for a data scientist, but it will never reach the >13,000 packages R provides. For example, I had to do an inverse regression and Python doesn't offer this. In R you can choose between several confidence tests and whether it is linear or nonlinear. The same goes for mixed models: they are implemented in Python, but so basically that I can't see how this can be sufficient for anyone.


Posted 2019-08-04T22:24:19.287

Reputation: 173


I would say yes. Python is better than R for most tasks, but R has its niche and you would still want to use it in many circumstances.

Additionally, learning a second language will improve your programming skills.

My own perspective on the strengths of R vs Python is that I would prefer R for a small, single-purpose program involving tables or charts, or exploratory work in the same vein. I would prefer Python for everything else.

  • R is really good for table mashing. If most of what a particular program is going to do is smoosh some tables into different shapes, then R is the thing to pick. Python has tools for this, but R is designed for it and does it better.
  • It's worth switching to R whenever you need to make a chart, because ggplot2 is a masterpiece of API usability and matplotlib is a crawling horror.
  • Python is well designed for general purpose programming. It has a very well designed set of standard data structures, standard libraries, and control flow statements.
  • R is poorly suited for general purpose programming. It doesn't handle tree-structured or graph-structured data well. It has some rules (like being able to look into and modify your parent scope) which are immediately convenient, but when used lead to programs that are hard to grow, modify, or compose.
  • R also has some straightforwardly bad things in it. These are mostly just historical leftovers like the three different object systems.

To elaborate more on the last point: computer programming done well is lego where you make your own bricks (functions and modules).

Programs are usually modified and repurposed past their original design. As you build them it is useful to think about which parts might be reused, and to build those parts in a general way that will let them plug into the other bricks.

R encourages you to melt all the bricks together.
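
To make the points about standard data structures and reusable bricks a bit more concrete, here is a minimal sketch in Python. The graph contents and the `bfs_order` name are made up for illustration. Graph-structured data is just a dict of lists, and the traversal is one small function that can be reused on any such mapping:

```python
from collections import deque

# Hypothetical directed graph as an adjacency mapping: node -> list of neighbours.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}


def bfs_order(graph, start):
    """Return the nodes reachable from start in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


order = bfs_order(graph, "a")
print(order)  # ['a', 'b', 'c', 'd']
```

The function touches nothing outside its arguments, which is what lets it plug into other code unchanged.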


Posted 2019-08-04T22:24:19.287

Reputation: 141


As others have said, it's not a "switch". But is it worth adding Python to your arsenal? I would say certainly. In data science, Python is popular and becoming ever more popular, while R is receding somewhat. And in the fields of machine learning and neural networks, I'd say that Python is the main language now -- I don't think R really comes close here in terms of usage. The reason for all of this is generality. Python is intended as a general programming language, and allows you to easily script all kinds of tasks. If you're staying strictly within a neatly structured statistical world, R is great, but with AI you often end up having to do novel, miscellaneous things, and I don't think R can beat Python at that. And because of this, I think Python and its packages will be receiving more support and development when it comes to the more cutting-edge tech.


Posted 2019-08-04T22:24:19.287

Reputation: 111


It sounds like you have invested one year in data science with R and are embedded in the R environment, but want to explore Python for data science.

First learn the basics of Python, like how lists and tuples work and how classes and objects work.

Then get your hands dirty with some libraries like NumPy, Matplotlib, and pandas. Learn TensorFlow or Keras and then go for data science.
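
As a rough idea of what those basics look like, here is a small sketch; the `Experiment` class and the numbers are invented for the example:

```python
# Lists are mutable sequences; tuples are immutable ones.
scores = [0.71, 0.84]
scores.append(0.90)  # a list can grow in place
shape = (3, 2)       # a tuple is fixed once created


class Experiment:
    """A hypothetical class bundling data with behaviour."""

    def __init__(self, name, scores):
        self.name = name
        self.scores = list(scores)  # keep our own copy of the list

    def best(self):
        """Return the highest score recorded for this experiment."""
        return max(self.scores)


run = Experiment("baseline", scores)  # an object is an instance of a class
print(run.name, run.best(), shape)

try:
    shape[0] = 4  # assigning into a tuple raises TypeError
except TypeError as err:
    print("tuples cannot be modified:", err)
```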

Nitish Kumar

Posted 2019-08-04T22:24:19.287

Reputation: 1


This is totally my personal opinion.

I read in my office (at a construction site) that "There is a right tool for every task."

As a programmer, I expect to face a variety of tasks. I want as many tools as I can "buy or invest in". One day one tool will help me solve a problem, some other day some other tool. R (for statistics) and Python (for general-purpose work) are two tools I definitely want with me, and I think they are worth the investment for me.

As far as switching is concerned, I will use the most efficient tool I know (where efficiency is measured by the client's requirements, time and cost investment, and ease of coding). The more tools I know, the merrier! Of course, there is a practical limit to it.

All this is my personal opinion and not necessarily correct.


Posted 2019-08-04T22:24:19.287

Reputation: 101


Person who chases two rabbits catches neither

And yes, Python is more popular. I work in both but, from a business perspective, it's easier to find a job in Python than in R.

So, you could:

  • Pick Python because it is more popular. However, you must start from scratch.


  • Stay with R; after all, you have one year's worth of training with R. But it is not as popular.


Posted 2019-08-04T22:24:19.287

Reputation: 117

The suggestion here that learning an additional programming language will somehow leave you worse off is nonsense. Learning additional programming languages, especially those that are unfamiliar, will always improve your skills as a programmer in any language. – Will Da Silva – 2019-08-07T15:52:16.820