Learning R: Rant 1

Rohit Farmer ・1 min read

I learned machine learning and data science last year with Python in my previous job, and this year I am doing more or less the same work in R for my current job. Learning R is a pain right now because, coming from Python, R looks like a giant mess. There are any number of packages, many of them doing more or less the same thing. For example, right now I am learning about data frames and tables in R. You have the traditional R data frames, which are not very easy to work with. So there is an enhanced version in the data.table package. The former gives you a data frame and the latter gives you a data table. On top of the data frame and the data table, there is something else called a tibble, which is produced by the packages in the tidyverse. Why is there not just one package, like pandas for data frames and NumPy for n-dimensional arrays?

Posted by:

Rohit Farmer

@rohitfarmer

Interdisciplinary researcher, applying computational methods at the intersection of biology, chemistry and medicine.

Discussion

Might be a bit late, but my advice:

Use tidyverse.

If you need to, you can learn the differences later: under the hood, a tibble is a data.frame with bells and whistles. A data.table is just a data.frame too, but with different bells and whistles. I hope that's helpful rather than confusing :)

Another way of looking at it: data.frame is baked into the language at a fundamental level, even more so than numpy and pandas. It's native to base R. These other packages extend it in specific, opinionated, and usually pretty useful ways.
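
If it helps to see that relationship rather than take it on faith, here is a minimal sketch in R. It assumes the tibble and data.table packages are installed, and the small example data frame is made up purely for illustration:

```r
# Assumes the tibble and data.table packages are installed.
library(tibble)
library(data.table)

# A plain base R data frame (toy data, for illustration only).
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Convert the same data to a tibble and to a data.table.
tb <- as_tibble(df)
dt <- as.data.table(df)

# Each object's class vector still ends in "data.frame".
class(df)  # "data.frame"
class(tb)  # "tbl_df" "tbl" "data.frame"
class(dt)  # "data.table" "data.frame"

# So base R treats all three as data frames.
is.data.frame(tb)  # TRUE
is.data.frame(dt)  # TRUE
```

Because both classes inherit from data.frame, code that expects a data frame will generally accept either one, although printing and subsetting behave a little differently in each.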

 

I love and hate it for all the reasons you mentioned. Haha