R Course: Beginner to Expert
install.packages(c("tidyverse", "haven", "readxl", "rmarkdown"))
1 Welcome to R for SAS Users
1.1 Course Overview
This comprehensive course is designed specifically for SAS programmers who want to learn R programming. We’ll leverage your existing knowledge of data manipulation, statistical analysis, and programming concepts to help you become proficient in R.
1.2 Learning Objectives
By the end of this course, you will be able to:
- Understand R fundamentals: Master R syntax, data types, and programming concepts
- Data manipulation: Perform complex data transformations using dplyr and base R
- Statistical analysis: Apply statistical methods and create models in R
- Data visualization: Create compelling visualizations using ggplot2
- Reporting: Generate dynamic reports with R Markdown and Quarto
- Bridge knowledge: Map your SAS skills to equivalent R functions and workflows
- Best practices: Write clean, efficient, and reproducible R code
1.3 Course Structure
1.3.1 Module 1: R Fundamentals for SAS Users
- Getting Started
  - R and RStudio installation and setup
  - Understanding the R environment vs SAS environment
  - Package management
- Basic R Syntax
  - R syntax compared to SAS syntax
  - Variables and assignment operators
  - Data types and structures
  - Functions and help system
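As a rough illustration of the Basic R Syntax topics, the sketch below shows assignment, a few core data types, and a small data frame, with the nearest SAS idea noted in comments. All variable names and values are made up for this example.

```r
# Assignment: `<-` is the conventional operator in R
age       <- 42L      # integer
weight_kg <- 71.5     # double (numeric)
sex       <- "F"      # character
enrolled  <- TRUE     # logical

# A vector holds values of a single type, much like one SAS column
visits <- c(1, 2, 3)

# A data frame is the closest analogue of a SAS dataset
dm <- data.frame(
  usubjid = c("S001", "S002", "S003"),
  age     = c(34L, 58L, 47L),
  sex     = c("F", "M", "F"),
  stringsAsFactors = FALSE
)

str(dm)                      # structure overview, loosely comparable to PROC CONTENTS
mean_age <- mean(dm$age)     # function call: name, then arguments in parentheses
# ?mean                      # opens the help page for mean() in an interactive session
```

Unlike a DATA step, nothing here loops over rows implicitly: `mean(dm$age)` operates on the whole column at once.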
1.3.2 Module 2: Data Import and Export
- CSV, Excel and text files (readr, readxl packages)
- SAS dataset import (haven package) and XPT file export
- Other formats (JSON, XML)
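A minimal sketch of an import/export round trip, assuming a small synthetic dataset invented for this example. It uses base R for CSV so it runs without extra packages, and only touches haven if that package is installed; the file paths are temporary files, not paths from the course.

```r
# Synthetic data, invented purely for this example
dm <- data.frame(
  USUBJID = c("S001", "S002", "S003"),
  AGE     = c(34, 58, 47),
  SEX     = c("F", "M", "F"),
  stringsAsFactors = FALSE
)

# CSV round trip with base R
# (readr::read_csv() and readr::write_csv() are the tidyverse counterparts)
csv_path <- tempfile(fileext = ".csv")
write.csv(dm, csv_path, row.names = FALSE)
dm_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# XPT round trip with haven, run only if the package is available
if (requireNamespace("haven", quietly = TRUE)) {
  xpt_path <- tempfile(fileext = ".xpt")
  haven::write_xpt(dm, xpt_path, version = 5)   # SAS transport file, version 5
  dm_xpt <- haven::read_xpt(xpt_path)
  # haven::read_sas("path/to/file.sas7bdat") would read a native SAS dataset
}
```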
1.3.3 Module 3: Data Manipulation - dplyr
- Core dplyr Functions
  - select() vs KEEP/DROP statements
  - filter() vs WHERE clause
  - mutate() vs assignment statements
  - summarise() vs PROC MEANS
  - group_by() vs BY statement
- Advanced Data Manipulation
  - Joins equivalent to PROC SQL joins
  - Reshaping data (tidyr vs PROC TRANSPOSE)
  - String manipulation (stringr vs SAS string functions)
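The core dplyr verbs can be sketched in a single pipeline, with the rough SAS counterpart noted beside each step. The data frame and its column names are synthetic and chosen only for illustration; the SAS comments are approximate analogies rather than exact translations.

```r
library(dplyr)

# Synthetic vital-signs-like data, invented for illustration
vs <- data.frame(
  usubjid = c("S001", "S001", "S002", "S002", "S003"),
  arm     = c("A", "A", "B", "B", "A"),
  sysbp   = c(120, 130, 140, 150, 110),
  diabp   = c(80, 85, 90, 95, 70)
)

result <- vs |>
  select(usubjid, arm, sysbp) |>       # KEEP usubjid arm sysbp;
  filter(sysbp >= 120) |>              # WHERE sysbp >= 120;
  mutate(high_bp = sysbp >= 140) |>    # high_bp = (sysbp >= 140);
  group_by(arm) |>                     # BY arm; (or CLASS arm; in PROC MEANS)
  summarise(                           # PROC MEANS
    n          = n(),
    mean_sysbp = mean(sysbp),
    .groups    = "drop"
  )

# A join, comparable to a LEFT JOIN in PROC SQL
arms <- data.frame(arm = c("A", "B"), arm_label = c("Placebo", "Active"))
result_labelled <- left_join(result, arms, by = "arm")
```

Note that `group_by()` does not require the data to be sorted first, unlike a BY statement.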
1.3.4 Module 4: Creation of ADSL dataset
- ADSL and ADVS Dataset Creation
  - Creating analysis variables
  - Handling missing data and derivations
  - Creating ADSL XPT dataset
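To give a feel for what deriving analysis variables looks like, here is a hedged sketch on a tiny synthetic DM-like table. The derivation rules (the age cut point of 65 and the definition of the randomized flag) are assumptions made for this example, not the rules used in the course chapters or in any particular study.

```r
library(dplyr)

# Synthetic DM-like input; real SDTM data carries many more variables
dm <- data.frame(
  USUBJID = c("S001", "S002", "S003", "S004"),
  AGE     = c(34, 67, NA, 52),
  ARM     = c("Placebo", "Active", "Active", "Screen Failure"),
  stringsAsFactors = FALSE
)

adsl <- dm |>
  mutate(
    # Hypothetical age grouping; missing AGE stays missing
    AGEGR1 = case_when(
      is.na(AGE) ~ NA_character_,
      AGE < 65   ~ "<65",
      TRUE       ~ ">=65"
    ),
    # Hypothetical flag: "Y" when the subject was assigned to a treatment arm
    RANDFL = if_else(ARM %in% c("Placebo", "Active"), "Y", "N"),
    # Planned treatment only for randomized subjects
    TRT01P = if_else(RANDFL == "Y", ARM, NA_character_)
  )

# haven::write_xpt(adsl, "adsl.xpt") would then export the result as XPT
```

R represents missing values as `NA` for every type, so the explicit `is.na()` branch plays the role of a SAS `if missing(AGE)` check.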
1.3.5 Module 5: Statistical Analysis
- Descriptive Statistics
  - Summary statistics (equivalent to PROC UNIVARIATE)
  - Frequency tables (equivalent to PROC FREQ)
  - Cross-tabulations and chi-square tests
  - Creation of tables like DM and AE outputs
- Statistical Modeling
  - Linear regression (equivalent to PROC REG)
  - Logistic regression (equivalent to PROC LOGISTIC)
  - ANOVA (equivalent to PROC ANOVA)
  - Mixed models and advanced techniques
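The statistical topics in this module map onto functions that ship with R. The sketch below uses the built-in `mtcars` dataset as a stand-in for study data, so the variable choices are illustrative only and the SAS procedure names in the comments are loose analogies.

```r
# mtcars ships with R, so nothing needs to be installed

# Summary statistics, roughly what PROC UNIVARIATE or PROC MEANS report
mpg_summary <- summary(mtcars$mpg)

# One-way and two-way frequency tables, similar to PROC FREQ
freq_cyl  <- table(mtcars$cyl)
cross_tab <- table(cyl = mtcars$cyl, am = mtcars$am)

# Chi-square test; some expected counts are small here, so R would warn
# that the approximation may be poor
chi <- suppressWarnings(chisq.test(cross_tab))

# Linear regression, similar to PROC REG
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)

# Logistic regression, similar to PROC LOGISTIC
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)

# One-way ANOVA, similar to PROC ANOVA
fit_aov <- aov(mpg ~ factor(cyl), data = mtcars)

summary(fit_lm)   # coefficient table, R-squared, and F statistic
```

Model objects are ordinary R values: `coef(fit_lm)`, `residuals(fit_lm)`, and `predict(fit_lm)` extract pieces that SAS would route to output datasets.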
1.3.6 Module 6: Package Development and Testing
- Creating R Packages
  - Package structure and essential files
  - Documenting functions with roxygen2
  - Building and installing packages
- Testing with testthat
  - Writing unit tests for R functions
  - Running tests and interpreting results
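A small sketch of how a documented function and its unit tests might look. The function `calc_bmi()` is hypothetical and exists only for this example; in a real package the function would live under `R/` and the tests under `tests/testthat/`.

```r
library(testthat)

#' Compute body mass index
#'
#' @param weight_kg Numeric vector of weights in kilograms.
#' @param height_cm Numeric vector of heights in centimetres.
#' @return Numeric vector of BMI values in kg/m^2.
calc_bmi <- function(weight_kg, height_cm) {
  stopifnot(is.numeric(weight_kg), is.numeric(height_cm))
  weight_kg / (height_cm / 100)^2
}

test_that("calc_bmi returns expected values", {
  expect_equal(calc_bmi(70, 175), 70 / 1.75^2)
  expect_true(is.na(calc_bmi(NA_real_, 175)))   # missing input propagates
})

test_that("calc_bmi rejects non-numeric input", {
  expect_error(calc_bmi("70", 175))
})
```

The `#'` comments follow roxygen2 conventions, so they can be turned into help pages when the package documentation is built.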
1.3.7 Module 7: Data Visualization
- Base R Graphics
  - Basic plots and customization
  - Comparison with SAS/GRAPH
- ggplot2 - Grammar of Graphics
  - Understanding the layered approach
  - Creating publication-ready plots
  - Advanced visualization techniques
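The layered approach can be sketched as follows, again using the built-in `mtcars` data as a placeholder. The choice of variables, labels, and theme is illustrative only.

```r
library(ggplot2)

# Start from data plus aesthetic mappings, then add one layer at a time
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +                                     # layer 1: raw points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # layer 2: linear fit per group
  labs(
    title  = "Fuel economy by weight",
    x      = "Weight (1000 lbs)",
    y      = "Miles per gallon",
    colour = "Cylinders"
  ) +
  theme_minimal()

# print(p) draws the plot; ggsave("mpg_by_weight.png", p) would write it to disk
```

Because the plot is an object, layers can be added or swapped later without rewriting the whole specification.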
1.3.8 Module 8: Reporting and Documentation
- R Markdown and Quarto
  - Creating dynamic reports (equivalent to ODS output)
  - Integrating code, results, and narrative
  - Output formats: HTML, PDF, Word
- Reproducible Research
  - Project organization
  - Version control with Git
  - Best practices for code documentation
1.4 SAS to R Translation Guide
| SAS Concept | R Equivalent | Package |
|---|---|---|
| DATA step | dplyr::mutate() | dplyr |
| PROC SQL | dplyr verbs | dplyr |
| PROC MEANS | dplyr::summarise() | dplyr |
| PROC FREQ | table(), xtabs() | base R |
| PROC REG | lm() | base R |
| PROC LOGISTIC | glm() | base R |
| PROC TRANSPOSE | tidyr::pivot_*() | tidyr |
| ODS OUTPUT | R Markdown/Quarto | rmarkdown/quarto |
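As one worked instance of the table, the PROC TRANSPOSE row can be sketched with the two `pivot_*()` functions from tidyr. The data are synthetic, and the SAS statements in the comments are an approximate analogy.

```r
library(tidyr)

# Synthetic long-format data: one row per subject and test
long <- data.frame(
  usubjid = c("S001", "S001", "S002", "S002"),
  test    = c("SYSBP", "DIABP", "SYSBP", "DIABP"),
  value   = c(120, 80, 140, 90)
)

# Long to wide, similar to PROC TRANSPOSE with BY usubjid; ID test; VAR value;
wide <- pivot_wider(
  long,
  id_cols     = usubjid,
  names_from  = test,
  values_from = value
)

# Wide back to long, one row per subject and test again
long_again <- pivot_longer(
  wide,
  cols      = c(SYSBP, DIABP),
  names_to  = "test",
  values_to = "value"
)
```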
1.5 Prerequisites
- SAS Experience: Familiarity with SAS programming, data steps, and procedures
- Statistical Knowledge: Basic understanding of statistical concepts
- Programming Basics: Understanding of programming logic and data structures
1.6 Course Format
- Interactive Learning: Hands-on exercises with real datasets
- Comparative Examples: Side-by-side SAS and R code comparisons
- Practical Projects: Real-world scenarios mimicking typical SAS workflows
- Reference Materials: Quick reference guides and cheat sheets
1.7 Getting Started
1.7.1 Required Software
- R (version 4.3+): Download from CRAN
- RStudio: Download from Posit
- Essential Packages: We’ll install these as needed
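One way to install packages only when they are missing is sketched below. The helper names `missing_packages()` and `install_if_missing()` are hypothetical, written for this example; the package vector repeats the four packages named in the course's install command.

```r
# Packages named in the course setup; extend the vector as chapters need more
course_pkgs <- c("tidyverse", "haven", "readxl", "rmarkdown")

# Return the subset of `pkgs` that is not yet installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install whatever is missing (needs internet access and a CRAN mirror)
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Uncomment to run the installation:
# install_if_missing(course_pkgs)
```

Installation is a one-time step per R library; each session still needs `library(pkg)` to attach a package before use.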
Tip: If you don’t have sample SDTM/ADaM data yet, the chapters generate small synthetic data as a fallback so everything runs end-to-end.
## Contact
For questions or feedback, reach out to r2sas2025@gmail.com.