
Importing Data into R: TXT, CSV, JSON, Excel, Databases & More

Importing data into R is often the very first step in any data analysis journey. But if you’ve worked with R for a while, you’ll know that it has a different function for nearly every file format. At first, this can feel confusing—even frustrating—because you might mix up the functions or their arguments.

The good news is that once you know which functions and packages to use for which file types, the process becomes smooth and straightforward. This tutorial provides a comprehensive reference guide for importing data into R, covering the most common file formats along with practical examples and code.

So the next time you find yourself Googling “how to load [file type] into R”, you’ll have everything you need right here.

We’ll explore:

Reading TXT, CSV, JSON, Excel files

Working with SAS, SPSS, Stata, MATLAB, and Octave datasets

Importing data from HTML/XML files

Connecting directly to relational databases with ODBC

And even a handy hack for quick, ad-hoc data loading.

Let’s dive in!

Preparing Your R Workspace

Before importing data, it’s a good idea to set up your environment properly.

Setting the Working Directory

Most datasets are stored in a dedicated folder for each project. You can tell R to treat that folder as its working directory.

getwd() # check current working directory
setwd("") # set a new working directory

By doing this, you can use relative file paths instead of typing long absolute paths each time.
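For example, once the working directory points at your project folder, relative paths are all you need. The folder and file names below are just placeholders for illustration:

setwd("~/projects/sales-analysis") # set the project folder once per session
df <- read.csv("data/sales.csv") # relative path, resolved from the working directory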

Cleaning the Environment

Your R environment often contains leftover objects from previous sessions, which can cause errors. To start fresh, run:

rm(list = ls())

This clears all objects, functions, and variables. Alternatively, you can choose not to save the workspace when closing R.

💡 Pro Tip: Always start with a clean environment for smoother imports and fewer debugging headaches.

Importing TXT, CSV, and Delimited Files
Reading Text Files

Text files usually contain data separated by tabs, commas, or semicolons. Here’s a simple example of a tab-delimited file:

Category V1 V2
A 3 2
B 5 6
B 2 3
A 4 8

Use read.table() to import such files:

df <- read.table("", header = TRUE, sep = "\t")

You can adjust the sep argument for other delimiters.
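For instance, a semicolon-delimited file that uses a decimal comma (common in European exports) could be read like this; the file name is a placeholder:

# Semicolon delimiter, with dec = "," so decimal commas are parsed as numbers
df <- read.table("data.txt", header = TRUE, sep = ";", dec = ",")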

Reading CSV Files

CSV files are usually comma-separated (,), but in many European locales they are semicolon-separated (;) because the comma serves as the decimal mark. R provides wrapper functions around read.table() for both:

df <- read.csv("") # for comma-separated
df <- read.csv2("") # for semicolon-separated

Both functions work like read.table() but come with defaults tailored for each format.
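To see what those defaults are, the calls below are roughly equivalent to read.csv() and read.csv2() on the same (placeholder) file:

# Roughly what read.csv() does: comma delimiter, "." as the decimal mark
df <- read.table("", header = TRUE, sep = ",", dec = ".")

# Roughly what read.csv2() does: semicolon delimiter, "," as the decimal mark
df <- read.table("", header = TRUE, sep = ";", dec = ",")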

Quick Copy-Paste Hack

Need a fast way to test or analyze some data? Copy the data to your clipboard and run:

df <- read.table("clipboard", header = TRUE)

It may not always format perfectly, but it’s great for quick ad-hoc analysis.
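Note that the "clipboard" connection is a Windows feature; on macOS, a common equivalent is to pipe the clipboard contents through pbpaste:

# macOS version of the clipboard trick
df <- read.table(pipe("pbpaste"), header = TRUE)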

Using Packages for Data Import

For more complex file formats, you’ll need specialized packages.

Install and load packages with:

install.packages("")
library("")

Reading JSON Files

Use the rjson package:

install.packages("rjson")
library(rjson)

# From a file
jsonData <- fromJSON(file = "")

# From a URL
jsonData <- fromJSON(file = "")

JSON data is loaded as a list. To convert it into a data frame:

jsonDF <- as.data.frame(jsonData)
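Keep in mind that as.data.frame() only works cleanly when every element of the list has the same length. For a list of equally structured records, one common pattern is to convert each record and bind the rows together, as in this sketch:

# Sketch: turn each record into a one-row data frame, then stack them
jsonDF <- do.call(rbind, lapply(jsonData, as.data.frame))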

Importing XML and HTML Tables

For XML and HTML data, the XML package works best:

library(XML)
library(RCurl)

# Parse an XML file
xmlData <- xmlTreeParse("")

# Convert to a data frame
xmlDF <- xmlToDataFrame("")

For HTML tables:

htmlData <- readHTMLTable(getURL(""))
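readHTMLTable() returns a list with one data frame per table found on the page, so you typically pick the one you need by position or name; the index below is illustrative:

# Grab the first table from the list of tables on the page
firstTable <- htmlData[[1]]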

Reading Excel Workbooks

There are several options (XLConnect, xlsx, gdata), but readxl is the simplest and fastest.

install.packages("readxl")
library(readxl)

# Read the first sheet
df <- read_excel("")

# Read by sheet name or index
df <- read_excel("", sheet = "Sheet3")
df <- read_excel("", sheet = 3)
Importing Data from Statistical Software

R can directly import data from SAS, SPSS, and Stata using the haven package:

install.packages("haven")
library(haven)

# SAS
df_sas <- read_sas("data.sas7bdat")

# SPSS
df_spss <- read_sav("data.sav")

# Stata
df_stata <- read_dta("data.dta")
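haven imports labelled SPSS and Stata variables with their value labels attached; if you prefer ordinary R factors, you can convert them with as_factor(). The column name below is hypothetical:

# Convert a labelled column (hypothetical name) to a regular factor
df_spss$gender <- as_factor(df_spss$gender)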

For MATLAB:

install.packages("R.matlab")
library(R.matlab)

matData <- readMat("")
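readMat() returns a named list with one entry per variable stored in the .mat file, so you access individual objects by name; the variable name below is illustrative:

# Each MATLAB variable becomes a named list element
myMatrix <- matData$myMatrix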

For Octave:

library(foreign)
octData <- read.octave("")

Importing Data from Relational Databases

Use the RODBC package to connect with databases like Microsoft SQL Server or Access:

install.packages("RODBC")
library(RODBC)

# Connect to the database
con <- odbcConnect("dsn", uid = "username", pwd = "password")

# Fetch data
df1 <- sqlFetch(con, "Table1")
df2 <- sqlQuery(con, "SELECT * FROM Table2")

# Close the connection
odbcClose(con)
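If you do not know the table names in advance, RODBC can list what is available through the connection before you fetch anything:

# List the tables visible through this connection
sqlTables(con)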

Tips for Smooth Data Imports

Use the first row for column headers

Ensure column names are unique, and remember that R treats them as case-sensitive

Stick to simple naming conventions (e.g., var_name, varName)

Represent missing values as NA (see the example after this list)

Remove comments or extra symbols from files

Keep code style consistent for readability
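On the missing-values tip above, most import functions let you declare which strings should be treated as NA right at import time; the specific values below are illustrative:

# Treat empty strings, "N/A", and "NULL" as missing values on import
df <- read.csv("", na.strings = c("", "N/A", "NULL"))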

Conclusion

Importing data into R is just the beginning of your analysis journey. In this guide, we walked through methods to bring in TXT, CSV, JSON, Excel, and XML/HTML files, as well as data from SAS, SPSS, Stata, MATLAB, Octave, and relational databases.

With these tools and tricks, you’ll be able to handle almost any dataset you encounter. And remember, R often has multiple ways to achieve the same goal—so explore and find the method that best suits your workflow.

This article was originally published on Perceptive Analytics.
In the United States, our mission is simple: to enable businesses to unlock value in data. For over 20 years, we've partnered with more than 100 clients, from Fortune 500 companies to mid-sized firms, helping them solve complex data analytics challenges. As leading Excel consultants, we turn raw data into strategic insights that drive better decisions.
