Unit 1
Introduction to R
1 Introduction
Statistical computing and large-scale data analysis called for a new category of computer language,
alongside the existing procedural and object-oriented programming languages, that would support these
tasks directly instead of requiring new software to be developed for each of them. A great deal of data
is available today, and it can be analysed in different ways to provide useful insights for operations in
many industries. The lack of support, tools and techniques for varied data analysis has been addressed
by one such language: R.

1.1 History of R
Ross Ihaka and Robert Gentleman developed R as a free software environment for their teaching classes
when they were colleagues at the University of Auckland in New Zealand. Because they were both
familiar with S, a commercial programming language for statistics, it seemed natural to use similar
syntax in their own work. After Ihaka and Gentleman announced their software on the S-news mailing
list, several people became interested and started to collaborate with them, notably Martin Mächler.
Currently, a group of 18 people has rights to modify the central archive of source code. This group is
referred to as the R Development Core Team. In addition, many other people have contributed new
code and bug fixes to the project.

Here are some milestone dates in the development of R:

Early 1990s: The development of R began.

August 1993: The software was announced on the S-news mailing list. Since then, a set of active R
mailing lists has been created. The web page at www.r-project.org/mail.html provides descriptions of
these lists and instructions for subscribing.

June 1995: After some persuasive arguments by Martin Mächler (among others) to make the code
available as “free software,” the code was made available under the Free Software Foundation’s GNU
General Public License (GPL), Version 2.

Mid-1997: The initial R Development Core Team was formed (although, at the time, it was simply known
as the core group).

February 2000: The first version of R, version 1.0.0, was released.

Ross Ihaka wrote a comprehensive overview of the development of R. The web page
http://cran.r-project.org/doc/html/interface98-paper/paper.html provides a fascinating history.

1.2 What is R?
R is a programming language and environment for statistical computing, data science and graphics. It
was inspired by, and is mostly compatible with, the statistical language S developed at Bell
Laboratories (formerly AT&T, now Lucent Technologies). Although there are some very important
differences between R and S, much of the code written for S runs unaltered on R. R has become so
popular that it is widely regarded as one of the most important tools for computational statistics,
visualisation and data science.

1.3 Why R?
R has opened tremendous scope for statistical computing and data analysis. It provides techniques for
various statistical analyses such as classical tests, classification, time-series analysis, clustering, linear
and non-linear modelling, and graphical operations. The techniques supported by R are highly extensible.

S pioneered statistical computing; however, it is a proprietary solution and is not readily available
to developers. In contrast, R is freely available under the GNU General Public License. Hence, it
supports the developer community in research and development.

Another reason behind the popularity and widespread use of R is its strong support for graphics. It can
produce well-designed, high-quality plots from data analysis. The plots can contain mathematical
formulae and symbols where necessary, and users have full control over the selection and use of
symbols in the graphics. Hence, besides robustness, user experience and user-friendliness are two key
aspects of R.
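
As a minimal sketch of the plotting support described above, the following base R code draws a curve
and places a mathematical formula in the title and axis label. The data (a standard normal density)
and the labels are chosen purely for illustration, and pdf(NULL) is an assumption made only so that
the script runs without opening a window or writing a file.

```r
# Illustrative data: the standard normal density on [-3, 3]
x <- seq(-3, 3, length.out = 200)
y <- dnorm(x)

# Use a null PDF device so the sketch runs non-interactively;
# remove the pdf() and dev.off() lines to see the plot on screen.
pdf(NULL)

# Axis labels and title are plotmath expressions, so they render as formulae
plot(x, y, type = "l", lwd = 2,
     xlab = expression(x),
     ylab = expression(phi(x)),
     main = expression(phi(x) == frac(1, sqrt(2 * pi)) * e^{-x^2 / 2}))

# Annotate the peak with a Greek symbol, placed just below the maximum
text(0, max(y), labels = expression(mu == 0), pos = 1)

invisible(dev.off())
```

The expression() calls are what give control over symbols: anything inside them is typeset as
mathematics rather than printed as plain text.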

Why Learn R?
The following points describe why the R language should be used (see the figure below):

If you need to run statistical calculations in your application, learn and deploy R. It easily integrates with
programming languages such as Java, C++, Python and Ruby.

If you wish to perform a quick analysis for making sense of data.

If you are working on an optimisation problem.

If you need to use re-usable libraries to solve a complex problem, leverage the 2000+ free libraries
provided by R.

If you wish to create compelling charts.

If you aspire to be a Data Scientist.

If you want to have fun with statistics.

R is free. It is available under the terms of the Free Software Foundation’s GNU General Public License in
source code form.

It is available for Windows, Mac and a wide variety of Unix platforms (including FreeBSD, Linux, etc.).

In addition to enabling statistical operations, it is a general programming language so that you can
automate your analyses and create new functions.

R has excellent tools for creating graphics such as bar charts, scatter plots, multipanel lattice charts, etc.

It supports both object-oriented and functional programming, and it is backed by a robust and
vibrant community.

R has a flexible analysis toolkit, which makes it easy to access data in various formats, manipulate it
(transform, merge, aggregate, etc.), and subject it to traditional and modern statistical models (such as
regression, ANOVA, tree models, etc.).

R can be extended easily via packages. It interfaces easily with other programming languages. Existing
software as well as emerging software can be integrated with R packages to make them more
productive.

R can easily import data from MS Excel, MS Access, MySQL, SQLite, Oracle, etc. It can connect to
databases using Open Database Connectivity (ODBC) and packages such as ROracle.




Figure: Advantages of learning R language
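
The data manipulation steps named in the list above (transform, merge, aggregate) and the ability to
write new functions can be sketched with base R alone. The tables, column names and the helper
revenue_by below are hypothetical and exist only for illustration.

```r
# Hypothetical data: two small tables sharing an "id" column
sales <- data.frame(id = c(1, 2, 3, 4),
                    region = c("north", "south", "north", "south"),
                    units = c(10, 20, 30, 40))
prices <- data.frame(id = c(1, 2, 3, 4),
                     price = c(2.5, 1.0, 2.0, 3.0))

# Merge: join the two tables on the shared "id" column
combined <- merge(sales, prices, by = "id")

# Transform: add a derived revenue column
combined <- transform(combined, revenue = units * price)

# Aggregate: total revenue per region
totals <- aggregate(revenue ~ region, data = combined, FUN = sum)
print(totals)

# A small user-defined function (hypothetical helper) that wraps the
# aggregation step so it can be reused with any grouping column
revenue_by <- function(data, group) {
  aggregate(data$revenue, by = list(group = data[[group]]), FUN = sum)
}
print(revenue_by(combined, "region"))
```

Wrapping a step in a function like this is how an analysis that was first done interactively can later
be automated.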

1.4 Advantages of R over Other Programming Languages
Advanced programming languages like Python also support statistical computing and data visualisation
along with traditional computer programming. However, R wins the race over Python and similar
languages because of the following two advantages:

• Python needs third-party extensions for data visualisation and statistical computing, whereas R
provides most of this out of the box. For example, the lm function for linear regression
analysis is built into R. Data can be passed directly to the function, and it returns an object
with detailed information about the regression, including the coefficients, standard errors,
residual values and so on. Python has no built-in lm function; the same functionality has to be
reproduced with third-party libraries such as SciPy, NumPy and so on. Hence, R can do with a
single line of code what Python does with support from third-party libraries.
Note: SciPy provides routines for data analysis tasks, and NumPy provides the array type used
to represent the data or objects.
• R has one fundamental data type, the vector, which can be organised and aggregated in different
ways even though the core is the same. The vector data type imposes some limitations on the
language because it is a rigid type, but it also gives R a strong logical base. Building on
vectors, R provides data frames: table-like structures, similar to spreadsheets or relational
database tables, in which each column is a vector and carries attributes such as its name. Hence,
R follows a column-wise data structure based on the aggregation of vectors.
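
The two points above can be sketched together: vectors are combined into a data frame, and the data
frame is passed to lm in a single call. The height and weight values are made up for illustration and
are not taken from any real data set.

```r
# Vectors: the basic building block of R
height <- c(150, 160, 170, 180, 190)   # hypothetical heights in cm
weight <- c(52, 58, 67, 74, 81)        # hypothetical weights in kg

# A data frame aggregates equal-length vectors as named columns
people <- data.frame(height = height, weight = weight)
str(people)

# Fit a linear regression of weight on height in a single call
fit <- lm(weight ~ height, data = people)

# The fitted object carries the details of the regression
coef(fit)                   # intercept and slope
residuals(fit)              # residual values
summary(fit)$coefficients   # estimates with standard errors, t and p values
```

Because the data frame stores each column as a vector, the formula weight ~ height can refer to the
columns by name without any further conversion.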

Note: There are also some disadvantages of R. For example, R does not scale efficiently to very
large data sets, so it is often used for prototyping and sandboxing rather than for
enterprise-level solutions. By default, R runs in a single thread and works on data held in
RAM, which also leads to scalability issues. Developers from open-source communities are
working on these issues to make R capable of multi-threaded execution and parallelisation,
which will help R to use more than one processor core. Big data extensions are also available
from companies such as Revolution Analytics (Revolution R). Other languages, such as S-PLUS,
can store objects permanently on disk and therefore offer better memory management and
analysis of massive data sets.
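
As a rough sketch of what parallelisation can look like in practice, the parallel package that ships
with R can spread work over several worker processes. This is process-level parallelism, not
multi-threading inside a single R session, and the task, the input sizes and the choice of two
workers below are assumptions made for illustration.

```r
library(parallel)  # part of the standard R distribution

# Hypothetical CPU-bound task: sum of squares of 1..n
slow_task <- function(n) sum(as.numeric(seq_len(n))^2)
inputs <- c(1e5, 2e5, 3e5, 4e5)

# Sequential baseline: one R process, one core
seq_result <- lapply(inputs, slow_task)

# Parallel version: start two worker processes and split the inputs between them
cl <- makeCluster(2)
par_result <- parLapply(cl, inputs, slow_task)
stopCluster(cl)

# Compare the two sets of results
all.equal(seq_result, par_result)
```

For tasks this small, the cost of starting the workers is likely to outweigh any gain; the pattern
pays off only when each task is expensive.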

1.5 Benefits of Using R

Of the many attractive benefits of R, a few stand out: It’s actively maintained, it has good connectivity
to various types of data and other systems, and it’s versatile enough to solve problems in many
domains. Possibly best of all, it’s available for free, in more than one sense of the word.
1. It comes as free, open-source code: R is available under an open-source license, which means that
anyone can download and modify the code. This freedom is often referred to as “free as in
speech.” R is also available free of charge — a second kind of freedom, sometimes referred to as
“free as in beer.” In practical terms, this means that you can download and use R free of charge.
Another benefit, albeit slightly more indirect, is that anybody can access the source code, modify it,
and improve it. As a result, many excellent programmers have contributed improvements and fixes to
the R code. For this reason, R is very stable and reliable.
Any freedom also has associated obligations. In the case of R, these obligations are described in the
conditions of the license under which it is released: GNU General Public License (GPL), Version 2. The
full text of the license is available at www.r-project.org/COPYING. It’s important to stress that the GPL
does not pertain to your usage of R. There are no obligations for using the software — the obligations
