Category Archive: R Programming

Classifying Bank Customer Data Using R? Use K-means Clustering

Before delving deeper into the analysis of bank data using R, let’s have a quick brush-up of R skills.

 


As you know, R is an integrated suite of software for data manipulation, calculation and graphical display.
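As a quick refresher on the clustering step named in the title, here is a minimal sketch of k-means in base R. The customer data is synthetic and the column names (age, balance) are hypothetical; they are assumptions for illustration and are not taken from the bank dataset discussed in the post.

```r
# Synthetic stand-in for bank customer data (hypothetical columns).
set.seed(123)
customers <- data.frame(
  age     = c(rnorm(50, mean = 30, sd = 4), rnorm(50, mean = 55, sd = 5)),
  balance = c(rnorm(50, mean = 2000, sd = 500), rnorm(50, mean = 15000, sd = 2000))
)

# Standardise the columns so that balance does not dominate the distance.
scaled <- scale(customers)

# Fit k-means with two clusters; nstart tries several random starts.
fit <- kmeans(scaled, centers = 2, nstart = 20)

# Attach the cluster label to each customer and inspect the cluster sizes.
customers$cluster <- fit$cluster
print(table(customers$cluster))
```

The number of clusters (two) is chosen here only because the synthetic data was generated from two groups; with real customer data it would have to be chosen by inspection or a criterion such as the within-cluster sum of squares.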

Dexlab

How to Create Repeat Loop in R Programming

In this tutorial, we will learn how to write a repeat loop in R.

A repeat loop executes a block of code over and over until it is explicitly stopped with a break statement.
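A minimal sketch of such a loop is given here; the variable names count and total are only illustrative. The loop adds the numbers 1 to 5 and then leaves the loop with break, since repeat has no stopping condition of its own.

```r
count <- 0
total <- 0

repeat {
  count <- count + 1        # advance the counter
  total <- total + count    # accumulate the running sum

  # repeat never stops by itself, so the exit must be explicit
  if (count >= 5) {
    break
  }
}

print(total)  # 15
```

Forgetting the break condition leaves the loop running forever, which is the main thing to watch for when choosing repeat over for or while.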


Debugging Magrittr Pipelines in R with Bizarro Pipe and Eager Assignment


Pipes in R

The pipe, written as “%>%”, is an operator supplied by the magrittr R package. It is best known for its extensive use in dplyr and among experienced dplyr users. The pipe operator lets one write “sin(5)” as “5 %>% sin”, a notation inspired by F#’s pipe-forward operator “|>”.
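The equivalence between sin(5) and 5 %>% sin can be checked with a short sketch. It assumes the magrittr package is installed. The last part shows what is usually meant by the "Bizarro pipe" in the title, as we understand the term: plain base R right-assignment into a variable named `.`, written `->.;`, so that every intermediate value can be inspected while debugging.

```r
library(magrittr)  # assumed to be installed

direct <- sin(5)      # ordinary function call
piped  <- 5 %>% sin   # the same call written with the magrittr pipe

# Bizarro pipe: base R right-assignment into `.`, one statement per step.
5 ->.;
sin(.) ->.;
bizarro <- .

print(c(direct, piped, bizarro))
```

Because each `->.;` step is an ordinary statement, one can stop after any line and print `.` to see the value at that stage, which is harder to do inside a single piped expression.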


How To Visualize Multivariate Relationships in Large Datasets in R Programming:


In this post, we will discuss how to use the package nlme in R, which includes the dataset MathAchieve. To install the package and load it into your R environment, use the code mentioned below:
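The original code is not reproduced in this excerpt, so the following is only a minimal sketch of the install-and-load step, assuming the package is nlme and the dataset is MathAchieve as they are named on CRAN.

```r
# Install nlme only if it is missing (it ships with most R distributions).
if (!requireNamespace("nlme", quietly = TRUE)) {
  install.packages("nlme")
}

library(nlme)

# Load the MathAchieve dataset and take a first look at its structure.
data(MathAchieve)
str(MathAchieve)
```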


ANZ uses R programming for Credit Risk Analysis

Last month’s R user group meeting in Melbourne had the theme “Experiences with using SAS and R in insurance and banking”. At the meeting, Hong Ooi from ANZ (Australia and New Zealand Banking Group) spoke on “experiences in credit risk analysis with R”. His presentation tells a great story, through slides, about implementing R for financial analyses at a few major banks.

 

His slides cover the following:

How R is used to fit models for mortgage loss at ANZ

A customized model assesses the probability of default for individual loans, using a heavy-tailed t distribution for volatility.

One slide goes on to display how the standard lm function for regression is adapted for a non-Gaussian error distribution — one of the many benefits of having the source code available in R.
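The slides' own code is not shown here, so the following is not Hong Ooi's adaptation of lm. It is only a simplified sketch of the underlying idea, regression with heavy-tailed t errors, fitted by maximum likelihood with optim on synthetic data. All names and values in it are assumptions for illustration.

```r
# Synthetic data: a linear relationship with heavy-tailed (t, 3 df) noise.
set.seed(42)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rt(n, df = 3)

# Negative log-likelihood of y = a + b * x + s * e, with e ~ t(df).
# The scale s and the degrees of freedom are optimised on the log scale
# so that both stay positive.
negloglik <- function(par) {
  mu <- par[1] + par[2] * x
  s  <- exp(par[3])
  df <- exp(par[4])
  -sum(dt((y - mu) / s, df = df, log = TRUE) - log(s))
}

# Start from the ordinary least-squares fit returned by lm.
start <- c(coef(lm(y ~ x)), 0, log(5))
fit <- optim(start, negloglik, method = "BFGS")

# Estimated intercept and slope under the t error model.
print(unname(fit$par[1:2]))
```

Starting the optimiser from the lm coefficients keeps the search close to a sensible solution; the t likelihood then down-weights extreme observations that would pull a Gaussian fit around.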


Introducing The New R Tools For Visual Studio


It is a great new development that the new Visual Studio now speaks the R Language!


We are Proud to Host Corporate Training for WHO Reps!

We are happy to announce our month-long corporate training session for the representatives of WHO, who will be joining us all the way from Bhutan to discuss data analytics. The delegates who have come to train with our expert in-house trainers are from the Centre for Disease Control, Ministry of Health, Royal Government of Bhutan.

 

The training is on the concepts of R Programming, Data Science using R and Statistical Modelling using R, and will go on from the 8th of February 2017 to the 8th of March 2017. We are hosting this training session at our headquarters in Gurgaon, Delhi NCR. It is a matter of great pride and honour for the team of seasoned industry expert trainers at DexLab Analytics to be hosting the representatives from WHO.


How is data science helping NFL players win Super bowl?!

Recently, Newswise hosted the Science of Super Bowl discussion panel, inviting data scientists and analysts from all over the world to take part.


We found one notable discussion topic, which answered three very important questions related to data science that the sports industry could use:


The Choice Between SAS Vs. R Vs. Python: Which to Learn First?

Python, R and SAS are widely regarded as the three most important languages to learn for data analysis.

If you are new to the data science community and have no experience in any of these languages, it makes a lot of sense to get acquainted with R, SAS or Python.

 

Does that sound too difficult? Do not fret: by the time you are done reading this post, you will know without a doubt which language is the right one for you to take up first.

 

Introduction to the languages:

  • R programming: R is the lingua franca of statistics. It is an open source programming language, free to access and open to all for performing data analysis tasks.
  • Python: a multi-purpose, open source programming language that has become very popular in data science thanks to its vibrant community and extensive data mining libraries.
  • SAS: SAS is currently the undisputed market leader in the enterprise analytics space. It provides a huge array of statistical functions, has a good GUI that helps people pick it up faster, and offers well-backed technical support.

 

If you aspire to start a career in data science, or to transition into the field in the near future, some research is necessary to understand what you should learn first in order to excel in this complex field. It will improve your chances of landing the right job. The question remains: should you take up R, make SAS a priority, or learn Python instead?

 

Here are the factors that one should consider, before deciding what to learn:

Industries where the tool is used the most:

Burtch Works, an international HR firm, asked about 1000 quantitative professionals which language they prefer: SAS, R or Python. The survey results came out as follows:

 

[Figure: SAS vs R vs Python. Image source: kdnuggets.com]

 

 

Big companies mostly prefer SAS because it offers better customer service. This is also why SAS has an advantage in the financial services sector and among marketing companies, where budget is not a concern when selecting a tool.

 

 

[Figure: Tools used in data science industry. Image source: kdnuggets.com]

On the other hand, start-ups and mid-sized firms use R and Python. Tech and telecom companies also need to analyze large amounts of unstructured data, so many data scientists in these sectors use machine learning techniques, for which Python and R are more suitable.

 

[Figure: Data Scientist vs Predictive Analytics. Image source: KDnuggets]

You can take up an R programming course in Gurgaon with DexLab Analytics.

The ease of learning, and pocket pinch:

SAS is expensive commercial software, used mostly by large corporations with massive budgets.

R and Python, however, are free software that anyone keen on learning can download and use.

 

No prior programming knowledge is required to learn SAS, as it has a simple, easy-to-use GUI. It can also parse SQL code and combine it with macros and other native packages, which makes learning SAS a cakewalk for those with basic knowledge of SQL.

 

To analyze data in Python, one needs data mining libraries such as pandas, SciPy and NumPy. In other words, one does not write data analysis code in plain Python alone. Code written with these libraries is somewhat similar to R code, so Python is easier to learn for people who already use R in data science. Those who already know R are advised to learn the basics of the Python language before starting on the Python data mining ecosystem.

Capabilities in data science:

SAS is very efficient at sequential data access, and its well-integrated SQL support makes database access easy. Its drag-and-drop interface helps people create better statistical models faster.

 

R is best known for in-memory analytics and is mainly used when the data analysis tasks need standalone servers. It is a great tool for exploring data.

 

Libraries like NumPy, pandas, SciPy and scikit-learn make Python the second most popular programming language in data science, right behind R. One can also create attractive graphs and charts with libraries like Seaborn and Matplotlib.

 

Get R programming certification to pave your way to data science success with our courses from DexLab Analytics.

Community support:

R and Python have huge online community support through mailing lists, Stack Overflow and other user-contributed documentation and code.

SAS also has an active online community, which is moderated by community managers.

 

So advance your career with a course on Machine Learning using Python or R programming.

 

 



Understanding the Difference Between ‘Sub-Setting IF’ and ‘IF-Then-Else-IF Statement’ in SAS Programming:

Winter is knocking at our doorstep and we are hoping to give our brains a workout with some rigorous learning.

Whatever the weather, as data analysts using SAS programming we can certainly use weather forecasts as data to explain the concepts of IF and IF-THEN-ELSE statements to readers interested in learning SAS predictive modeling.
