R for Business Guide

Quick start

Accessing data files from R is called reading.

  • To read delimited files, try read.csv and related functions.
  • To read Excel files, try the package readxl.
  • To read proprietary data files from common statistical software packages, try the package haven.

R does not automatically store the data it reads, so you will usually want to assign the result to an object:

object_name <- read.table('file')
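As a minimal sketch of why assignment matters, the example below writes a small throwaway file to a temporary location and reads it twice, once without and once with assignment. The file contents and the object name sales are made up for illustration.

```r
# Create a small example file in a temporary location (made-up data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("region units", "East 10", "West 12"), tmp)

# Without assignment, R prints the data but keeps nothing
read.table(tmp, header = TRUE)

# With assignment, the data frame is stored and can be reused
sales <- read.table(tmp, header = TRUE)
nrow(sales)    # number of rows read
sales$units    # access a column by name
```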

Read data by file type

Comma-separated-value (CSV) files

If your file ends in the .csv extension, commas probably separate the values in each row. Try the function read.csv:

read.csv('file_path/file_name.csv')
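Here is a short, self-contained sketch of the same call. It assumes a made-up CSV written to a temporary file, so the path and the object name prices are illustrative only.

```r
# Write a small CSV to a temporary file (made-up data for illustration)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("ticker,price", "AAA,10.5", "BBB,20.25"), csv_path)

# read.csv treats the first row as column names by default
prices <- read.csv(csv_path)

# Inspect the column names and types that R inferred
str(prices)
```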

Tab-separated-value (TSV) files

If your file ends in .tsv (tab-separated values) or if it has another delimiter, try the function read.delim.

By default, read.delim assumes tabs separate the values:

read.delim('file_path/file_name.tsv')
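A small sketch, assuming a made-up tab-separated file written to a temporary location. It also spells out a roughly equivalent read.table call, since read.delim is a wrapper around read.table that defaults to a tab separator and a header row.

```r
# Write a small tab-separated file (made-up data); "\t" is the tab character
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("store\tsales", "A\t100", "B\t250"), tsv_path)

# read.delim assumes tabs and a header row
stores <- read.delim(tsv_path)

# Roughly the same call written out with read.table
stores_again <- read.table(tsv_path, header = TRUE, sep = "\t")

stores
```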

Other delimited files

The sep argument can make read.csv or read.delim separate values by space, vertical bar, or any other character:

  • Space-delimited
    read.delim('file_path/file_name.file_extension', sep = ' ')
  • Vertical-bar-delimited
    read.delim('file_path/file_name.file_extension', sep = '|')

Run ?read.table to see options for changing the decimal character, skipping non-data rows at the top of the file, and more.
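The sketch below combines three of these options on a made-up file: sep for the delimiter, skip for non-data rows at the top, and dec for a comma decimal character. The file contents and the object name report are assumptions for illustration.

```r
# A vertical-bar-delimited file with two note lines on top
# and commas as decimal marks (made-up data)
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Exported by an example system",
  "Values are illustrative only",
  "product|revenue",
  "widget|1,5",
  "gadget|2,75"
), path)

# skip = 2 ignores the two note lines; dec = "," reads 1,5 as 1.5
report <- read.delim(path, sep = "|", skip = 2, dec = ",")

report$revenue   # a numeric column
```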

You can read Microsoft Excel files, which end in .xls or .xlsx, using the package readxl.

Install readxl on its own or as part of the package tidyverse:

install.packages('readxl')
# or
install.packages('tidyverse')

Load readxl separately, because library(tidyverse) does not attach it:

library(readxl)

Now use the function read_excel:

read_excel('file_path/file_name.xls')
# or
read_excel('file_path/file_name.xlsx')

Run ?read_excel for information about reading specific worksheets and other options.
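As one possible illustration of reading a specific worksheet, the sketch below uses a sample workbook that ships with readxl, located with readxl_example. It only runs if readxl is installed, and the object names are made up.

```r
# Run the example only if readxl is installed
if (requireNamespace("readxl", quietly = TRUE)) {
  library(readxl)

  # Path to a sample workbook included with the readxl package
  xlsx_path <- readxl_example("datasets.xlsx")

  # List the worksheet names in the workbook
  sheets <- excel_sheets(xlsx_path)
  print(sheets)

  # Read the second worksheet by position; sheet also accepts a name
  second_sheet <- read_excel(xlsx_path, sheet = 2)
  print(head(second_sheet))
}
```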

Working with Stata, SAS, or SPSS data? You can read files that end in .dta, .sas7bdat, or .sav using the package haven.

Install haven directly or as part of the package tidyverse:

install.packages('haven')
# or
install.packages('tidyverse')

Load haven:

library(haven)

Choose a function based on the type of file you have:

Stata (DTA) files

read_dta('file_path/file_name.dta')

SAS (SAS7BDAT) files

read_sas('file_path/file_name.sas7bdat')

SPSS (SAV) files

read_sav('file_path/file_name.sav')
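If you do not have one of these files on hand, a round trip is one way to try the functions. The sketch below assumes haven is installed, writes a tiny made-up data frame to a temporary Stata file with write_dta, and reads it back with read_dta.

```r
# Run the example only if haven is installed
if (requireNamespace("haven", quietly = TRUE)) {
  library(haven)

  # A tiny made-up data set
  firms <- data.frame(firm_id = c(1, 2, 3), revenue = c(10.5, 20.1, 7.3))

  # Write a Stata file to a temporary location, then read it back
  dta_path <- tempfile(fileext = ".dta")
  write_dta(firms, dta_path)
  firms_back <- read_dta(dta_path)

  # read_dta returns a tibble, which is a kind of data frame
  print(class(firms_back))
  print(firms_back$revenue)
}
```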

If spacing and the number of characters determine where the columns of your data begin and end, you probably have fixed-width data.

You can use the function read.fwf. Give the function:

  1. Your file's location and name.
  2. A vector that lists how many characters wide each column is.

read.fwf('file_path/file_name.file_extension', widths = c(col1_width, col2_width, ..., coln_width))

Run ?read.fwf for details about including column names as well as other options.
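To make the widths vector concrete, here is a sketch on a made-up file where each row has a 3-character ID, a 5-character year field, and a 5-character units field. The column names passed to col.names are assumptions for illustration.

```r
# Each line is 13 characters: 3 for id, 5 for year, 5 for units (made-up data)
fwf_path <- tempfile(fileext = ".txt")
writeLines(c("A01 2019  125",
             "B02 2020   98"), fwf_path)

# widths lists the number of characters in each column, left to right
fwf_data <- read.fwf(fwf_path,
                     widths = c(3, 5, 5),
                     col.names = c("id", "year", "units"))

fwf_data
```

The padding spaces inside the numeric fields are dropped when R converts those columns to numbers.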

R can save data in its own .rdata or .rda file format. RDATA and RDA files store data as R objects.

To load all of the objects in an RDATA or RDA file at once, use the function load:

load('file_path/file_name.rda')

The load function restores objects under their original names, so you do not need to assign its result to a new object name. Any objects already in your environment with the same names are overwritten.
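A short sketch of this behavior, using two made-up objects saved to a temporary file. The object names and values are illustrative only.

```r
# Two made-up objects to save
budget <- data.frame(dept = c("Ops", "HR"), amount = c(100, 50))
rates <- c(usd = 1, eur = 0.9)

# Save both objects into one RDA file in a temporary location
rda_path <- tempfile(fileext = ".rda")
save(budget, rates, file = rda_path)

# Remove them to mimic starting a fresh session
rm(budget, rates)

# load() restores the objects and returns their names
loaded_names <- load(rda_path)
loaded_names
exists("budget")   # the object is back under its original name
```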

Find your file

Having trouble finding your file? Getting errors from your file path?

You can use RStudio's Import Dataset tool to help develop your file-reading code.

  1. Look in the Environment pane.
  2. Select Import Dataset and the type of data file you have.
  3. Follow the steps RStudio provides.
  4. Select Import.

The steps above run the code to load your file only once. To make it a lasting part of your script:

  1. Copy the read code from the Console pane.
  2. Paste the code into your R script.
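If you would rather diagnose a path error from the console, base R has a few helpers. This is only a sketch: the path below reuses the placeholder from this guide and is not a real file, so file.exists is expected to report that it is missing on most machines.

```r
# Relative paths are resolved from the working directory
getwd()

# Build a path from its parts (placeholder names, not a real file)
example_path <- file.path("file_path", "file_name.csv")

# Check whether R can see a file at that path before trying to read it
file.exists(example_path)

# List the CSV files in the working directory
list.files(pattern = "\\.csv$")
```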

Business & Data Analysis Librarian

Kevin Thomas
He/Him/His
Subjects: Statistics
