About fadzli.fuzi

My passion is data analytics. My main research interests are Bayesian statistics, actuarial and risk modelling. Avid user of R. Love landscape photography.

Why R and why use R? 11 facts about R.

What is R? There are two categories of people: those who have heard about R and those who haven’t. For a start, R is statistical analysis software, more or less similar to SAS, SPSS, etc. (my R compatriots will cringe here). The main difference between R and other statistical software is that it is completely FREE to download! Yes, free! It is open source software, which means that you don’t need a password or product key from the IT admin at your workplace or university. The money saved on expensive licences can go to scholarships and research grants instead. Welcome to the open source world.

Below are 11 facts that you need to know about R:

1. R is FREE

Yes, R is free to download and doesn’t need any licence key to install (you may need to ask permission from your IT admin if you are installing on a company desktop). All you need to do is download R from the R website. R runs on Windows, macOS and Linux. If you want a better user interface with nice add-ins, download RStudio, which is also freely available.

2. Packages in R are developed by experts in their field

R has thousands of packages that are developed and maintained by academics, researchers and statisticians who are experts in their fields. For instance, for Bayesian modelling, packages such as rjags, MCMCpack and bayesglmm were developed by researchers who are experts in Bayesian statistics. Bioconductor, a collection of packages popular in the bioinformatics community, was developed by a team that included cancer researchers.

3. Graphics in R are not your everyday graphs.

Bored with Excel-based graphics that offer little beyond bar plots, histograms, line plots and pie charts? R lets you create more striking graphics for your publications or reports. R users have full control over the type of graphic, font size, colours, etc. This makes your graphics more appealing and your research results easier to understand.
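To give a feel for that control, here is a minimal sketch using only base R graphics. The data are simulated and the group names, colours and sizes are my own arbitrary choices, so treat it as an illustration rather than a recipe.

```r
# Made-up data for illustration only (not from any real study).
set.seed(42)
scores <- data.frame(
  group = rep(c("A", "B", "C"), each = 30),
  value = c(rnorm(30, mean = 50, sd = 5),
            rnorm(30, mean = 55, sd = 8),
            rnorm(30, mean = 60, sd = 4))
)

# Box plot with custom colours, title size and axis label size.
boxplot(value ~ group, data = scores,
        col = c("steelblue", "tomato", "darkseagreen"),
        main = "Scores by group (simulated data)",
        xlab = "Group", ylab = "Score",
        cex.main = 1.4,   # title font size
        cex.lab = 1.2,    # axis label font size
        las = 1)          # horizontal axis tick labels
```

Changing a colour or a font size is a one-word edit, and the whole figure can be regenerated by rerunning the script.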

The New York Times, for example, has used R for many of its news graphics. One interesting example is “Mapping Migration in the US”. Below are some beautiful graphics created with R. A picture is worth a thousand words.

[Images: a ggplot2 horizon plot, a bar chart and two other example graphics created with R]

4. R is a programming language.

Unlike other statistical software, which mainly relies on “point and click”, R is also a programming language. Users define their own data analysis by typing commands into the console. This may seem a hassle, but hold on, it gives you freedom and flexibility (we all need these in life) over how and what you want to do with your data. Some of us deal with small samples and others work with big data, and R is flexible enough for both.

Because R is a programming language, users with good programming knowledge can develop their own R packages. R may seem hard to understand at first, but in my experience you just need a quick introduction to the language and you are ready to fly.
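As a small taste of what typing your own analysis looks like, here is a sketch in base R. The function name describe and the claim amounts are made up for this example; mtcars is a data set that ships with R.

```r
# A small, hypothetical helper: summarise a numeric vector.
describe <- function(x) {
  x <- x[!is.na(x)]              # drop missing values first
  c(n = length(x),
    mean = mean(x),
    sd = sd(x),
    min = min(x),
    max = max(x))
}

# Made-up claim amounts, one of them missing.
claims <- c(120, 340, NA, 95, 410, 230)
describe(claims)

# The same function works unchanged on several columns of a data frame.
sapply(mtcars[, c("mpg", "hp", "wt")], describe)
```

Once a function like this exists, you can reuse it on any data set, which is the kind of flexibility a point-and-click menu cannot easily give you.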

5. R is recognised outside academia

Even though R is yet to be widely recognised in Malaysia, its popularity is increasing around the world. Tech companies with data analytics teams usually value anyone who can use R. Companies like Google, Twitter, Facebook and Booking.com are known to use R. ANZ Bank (the 4th largest bank in Australia), for instance, uses R for its credit risk modelling. Lloyd’s of London (UK insurance market) and Ford Motor Company use R for their data analytics work. Recently, Heartland Bank, a bank in New Zealand, replaced its SAS system with R. Two years ago, Microsoft bought a company called Revolution Analytics, which specialised in developing a version of R for big data. This shows how valuable R is to organisations outside academia.

The table below shows average annual salaries in data analytics in the US for 2013. We have a growing community of R users in Malaysia, and I am sure the situation here will change in a few years, given what is happening in the US and EU.

R salary

6. R users are growing by the day

Robert Muenchen, on his blog, compares how often R and other statistical software and programming languages are used in academic journals. R has the second-fastest growth after Python (a programming language popular for data science and web applications), which is also free.

R popularity in scholarly articles

7. Build other applications using R

R is very flexible: it can be used to build applications associated with data analysis. For example, Shiny, a package developed by RStudio, lets you build interactive web applications in R. Shiny apps help people from outside your domain of knowledge to understand data more effectively.

8. “Customer Support” for R is 24 hours and it’s free!

R has “customer support”? The customer support department is actually the R community on the internet. Because R has a large community of users worldwide, you can just Google your problem and there is a good chance someone has written a blog post about it or published an online manual.

If your question is very specific and you still can’t find the answer elsewhere, you can try the two platforms that I normally use: Stack Overflow and Stack Exchange. My questions on R are usually answered by the community within 24 hours.

9. R for Big Data

Everyone is talking about Big Data, but not everyone is doing it. Rest assured, R has it covered. R can be used on high-performance computing clusters, which are normally used for simulations and complex data analysis. Computations that might take days to complete can be reduced to just a few hours. R can also take advantage of multicore processors. Most computers now have several processor cores, but most statistical software uses only one.
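For the multicore case, here is a minimal sketch using the parallel package, which ships with R. The bootstrap task, the function name boot_mean and the choice of two workers are assumptions I made for illustration; a real speed-up depends on your machine and on how heavy each task is.

```r
library(parallel)  # ships with R, no extra installation needed

# A made-up task: one bootstrap replicate of the sample mean.
boot_mean <- function(i, data) {
  mean(sample(data, length(data), replace = TRUE))
}

set.seed(1)
x <- rnorm(1000)

# Use at most 2 worker processes here; detectCores() reports how many
# cores the machine has (it can return NA on some systems).
n_workers <- min(2, detectCores(), na.rm = TRUE)
cl <- makeCluster(n_workers)

# Give the workers reproducible random number streams.
clusterSetRNGStream(cl, iseed = 123)

# Run 2000 bootstrap replicates spread across the workers.
boot_means <- unlist(parLapply(cl, 1:2000, boot_mean, data = x))
stopCluster(cl)

# Bootstrap estimate of the standard error of the mean.
sd(boot_means)
```

The same pattern scales from a laptop to a cluster: the task stays the same and only the way the workers are created changes.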

In 2015, Microsoft bought Revolution Analytics, a company that developed a Big Data platform for R. The technology has been adopted and developed into what is now called Microsoft R Server. R can also be connected to big data engines such as Apache Spark and Hadoop.

10. useR! Conference

The useR! conference gathers R practitioners from academia and industry every year, and this year (2017) it will be held in Brussels, Belgium. The useR! conference is the best place to learn about the latest applications of R and (of course) to build your network with people who share your interest in R. There are also R User Group Meetups, which are small meetings of R users held around the world. In Malaysia, we have R User Group Malaysia, which meets bi-monthly. Follow the group’s Facebook page.

11. R-bloggers as a news portal for R users.

Follow the R-bloggers website for your daily R-juice. I started following this website when I began to learn R in 2010. The site gathers interesting blog posts on R, and if you subscribe to it (I suggest you do), it will send emails about R to your inbox every day. I find this helpful in my research work, as I can learn new things every day. The blog posts can also give you ideas for your next research project.

Those are 11 things that you may or may not have known about R. Some may ask, “Why do I need to learn R? The tool that I am using now is enough.” There is always an advantage in learning new things. Maybe you can collaborate with established researchers in your field who use R. In my case, R has led me into computer programming (I am now learning Python). If you want your research to be at the forefront, you must have the right “tools” to take you there.

Interested in learning R? I offer face-to-face and group sessions for people inside and outside academia. Contact me at fadzlifuzi@gmail.com.