STA 6207 -- Regression Analysis



            

Fall 2023 Syllabus      PDF     WORD 


Exam 1 Grade Distribution - Fall 2019       Solutions

Exam 2 Grade Distribution - Fall 2019 - 214 Total Points

Exam 1 Solutions (Fall 2018) 

Exam 2 Solutions  (Fall 2018)

Fall 2016     Exam 1    Solutions     

Fall 2016     Exam 2    Solutions    

Fall 2012       Exam 1       Exam 2          Exam 3

Fall 2013       Exam 1       Exam 2          Exam 3 (Version 1)    Exam 3 (Version 2)        

Fall 2014        Exam 1       Exam 2         Exam 3

Fall 2015        Exam 1       Exam 2         Exam 3    

Fall 2016        Exam 1       Exam 2         Exam 3

Fall 2017        Exam 1       Exam 2         Exam 3

Exam 1 Topics - Fall 2018   (Exam @ 7AM, Friday, September 28)     Through Chapter 4 of Course Notes                         

Exam 2 Topics (Exam @ 7AM, Friday, October 27)

Exam 3 Topics - 2019 (Exam @ 7AM, Monday, December 2, 2019)

Instructor: Larry Winner
Office Hours:  TBA

E-mail: winner@stat.ufl.edu
Phone: (352) 273-2995


TA: Michael Kim    Office Hours: TBA       E-mail: michaelkkim@ufl.edu     Office: 


Course Notes (Theory Portion) - Old Version

Regression Notes - New Version (9/14/2021)  (Do not print this out on the Department Copier!)         R Programs        

RPD SAS/R  Programs/Output and Datasets

Regression Examples

Homework Assignments

Very Old Exams

Statistical Tables

Running R on Windows and Macs (Source: Stanford University Social Science Data and Software)

Downloading RStudio (Very helpful platform for Running R and managing Plots)     http://www.rstudio.com/ide/download/


Chapter 1 - Math Stat Review/Introduction

Math Stats I Materials  (Probability)

Math Stats II Materials  (Inference)

Brief Introduction to Probability Distributions     PDF       PDF (B/W)

Examples of Common Families of Distributions

Michael Jordan Career Regular Season Stats       EXCEL

Brief Introduction to Likelihood Functions and Statistical Tests     PDF      PDF (B/W)

Distributions of Functions of Normal RVs         R Program        R Output         Graphics Output

Conditional Expectations - Television Sales (PPT)



Chapter 2 - Simple Linear Regression - Scalar Form (RPD Chapter 1)

Practice Problems      WORD       PDF

Regression/Correlation for Heights/Weights of NHL Players      R Program         Data

Bollywood Movies Box Office Grosses and Budgets      R Program           Data      Description      EXCEL


Carpet Aging            R Program (.r)         R Program (.pdf)         R Output           EXCEL Spreadsheet

Electric Train Supply and Demand       Data       Description

EXCEL   Spreadsheet      Combined EXCEL, R, SAS   Programs/Results

R    Program            SAS    Program

MPA Suspension Concentration and Peak Intensity  EXCEL            Data      Description

Antioxidant Levels and Activity in 40 Varieties of Lager Beer        Data (.csv)         Description      EXCEL Spreadsheet

WNBA Heights and Weights           Data (.csv)       R Program             R Text Output           R Graphics Output                 Regression through Origin (EXCEL)

Orlando Weather Data (EXCEL)

Minneapolis Annual Temperature 1900-2014 (EXCEL)

NBA Over/Under and Total Points 2014/2015 Correlation Analysis        Data.csv       Description        R Program     EXCEL Spreadsheet 

Classical Simple Linear Regression Model - Galton Height Data

Simple Linear Regression - Stochastic Predictor - NBA Over/Under and Total Points 2014/5 Regular Season      R Program


Chapters 3 & 4 - Matrix Approach to Simple Linear Regression and Distributional Properties  (RPD Parts of Chapter 2 - 4)

Practice Problems      WORD      PDF

Introduction to Matrix Algebra and Simple Linear Regression in Matrix Form (Chapter 2 and part of 3 in RPD)      PDF

Bollywood Movie Revenues and Budgets - Matrix Computations     R Program

Florida Lotto Results - October 24, 1999 - September 16, 2014        R Program

Michael Jordan Career Regular Season Stats (Linear Functions of RVs)       EXCEL

Gravity/Latitude Worksheet        EXCEL Spreadsheet             Data       Description

Regression Models with Stochastic Regressors - NBA Over/Under and Total Points 2014/5      R Program (Partial)           Data (.csv)      Description    Worksheet

Maya Moore Scoring in WNBA Playoffs - Projection

EXCEL Spreadsheet for Exercise 3.13

RPD Problem 4.6     R Program      R Output

Power Computation and Simulation      R Program

Noncentral Chi-Square Distribution    R Program


Chapter 5 - Problem Areas in Least Squares and Residual Diagnostics/Tests  (RPD Chapters 10-12)

Practice Problems            WORD         PDF  

Problem Areas in Least Squares (PPT)

R Program to Simulate Problem Areas in Least Squares

Maya Moore 2014 Points per Game and Minutes Played               Worksheet

Simple Linear Regression - Graphing and Testing Model Assumptions - NBA Players Weights and Heights                EXCEL          R Program

F-Test for Lack-of-Fit - Breaking Strength  of Fibers            EXCEL Spreadsheet             R Program

Residuals and Influence Measures  (WORD)

Residuals and Influence Measures (PPT)       Apparent R Guidelines for Identifying Influential Observations

Bollywood Movies Revenues and Budgets (Diagnostics)  R Program

US State Wine Consumption and Population     Data (.csv)        R Program        PDF

Math Score/LSD Concentration      EXCEL       R Program

Residual Analysis of Regression of Argentine Wheat Yields Rainfall and Temperature  (WORD)

Residual Analysis of Regression of Argentine Wheat Yields Rainfall and Temperature  (EXCEL)

Argentine Wheat Yields     Data           Description

Argentine Wheat Yields   SAS Program      SAS Text Output      SAS Graphics Output

Argentine Wheat Yields   R Program           R Text Output          R  Graphics Output

Muscle Regression Case Study (PPT)

Muscle Regression Matrix Example (Y=Heat Production (Calories), X1=Work Effort (Calories), X2=Body Mass (Kilograms))  (EXCEL Spreadsheet)

SAS Program       SAS Text Output         SAS Graphics Output

R Program            R Text Output             R Graphics Output

NFL 2007 Spread and Actual Scores - Regression/Residual Analysis and Tests  (PPT)

Variance Stabilizing Transformations
 
Box-Cox Transformation Description

Transformations on Y and X to Approximate Normality (Box-Cox) and Linearity (Box-Tidwell)      R Program       Data

Spanish Silver in New World 1720-1800 - Box-Cox Transformation SAS Program (Proc Transreg)       SAS Program Output        SAS Program Graphics Output (Proc Transreg)

Spanish Silver in New World 1720-1800 - Box-Cox Transformation R Program (boxcox function)     R Program Graphics Output (boxcox function)

Spanish Silver in New World 1720-1800 - Box-Cox Transformation SAS Program (Matrix Form)          SAS Program Output (Matrix Form)       SAS Program Graphics Output (Matrix Form)

Spanish Silver in New World 1720-1800 - Box-Cox Transformation R Program (Matrix Form)         R Program Output (Matrix Form)         R Program Graphics Output (Matrix Form)



Chapter 6 - Multiple Linear Regression

Practice Problems      Word          PDF

Sections 6.1-6.3 - Estimation and Testing  (RPD Chapters 3 and 4)

Analysis of Variance Description     WORD         PDF

Using EXCEL for Matrix Form of Multiple Regression Model - Hotel Energy Consumption   PPT      EXCEL      R Program      Data (.csv)      Description

Multiple Linear Regression - Texas January High Temperatures  (Complete/Reduced Models)

Assessed Winning Probabilities in Texas Hold 'Em           WORD              EXCEL           Data        Description

Estimating Demand Elasticity for Sugar 1896-1914            PPT                  EXCEL           Data        Description

Texas January High Temps (n=369 Locations)      EXCEL      R Program        SAS Program     SAS Output    EXCEL (2 Models)

Association Between Height and Foot and Hand Lengths in Females  (EXCEL)

Hand Length EXCEL


Section 6.4 - General Linear Tests (RPD Section 4.5)


NFL Point Spreads and Actual Scores (PPT)         Data (.csv)      R Program for General Linear Test      R Program for Confidence Ellipsoid

Height, Hand Length, and Foot Length for 80 Adult Males

Texas Mean January Temperature (EXCEL)          R Program      Program        Output

General Linear Test - Cobb-Douglas Production Function             EXCEL Spreadsheet      R Program

General Linear Test, CI, PI, Lack of Fit - WNBA Over/Under             EXCEL Spreadsheet

Performance Characteristics of an Air Conditioning System           Data (.csv)         Description      R Program      EXCEL


Sections 6.5-6.6 - Models with Qualitative Variables and Interactions (RPD Section 9.6)

Left Foot Lengths of Lahoul and Kulu Kanets     EXCEL

Bullet-Proof Fabric Layers and 3 Bullet Types       Data         Description         R Program with Bartlett's Test

R Program        R Text Output          R Graphics Output           SAS Program        SAS Output

Multiple Linear Regression - Dummy Variables in Accounting Example               EXCEL Spreadsheet

Cloud Seeding - Analysis of Covariance (EXCEL)



Sections 6.7-6.9 - Models w/Curvature, Response Surfaces, and Trigonometric Models (RPD Chapter 8)

Heat Capacity and Temperature for Solid Hydrogen Bromide         Data          Description      EXCEL

Ice Cream Sensory Evaluations                     EXCEL             Data           Description

Yarn Count and Output for early 20th Century New England Textile Mills      Data            Description

2013 WNBA Player Height and Weight  EXCEL       2013 NBA Player Height and Weight  EXCEL

Container Ship Speed and Fuel Consumption for ship_leg = 1        EXCEL     Data(All ship_legs)      Description

Sine and Cosine Plots

Trigonometric Regression - Tampa Bay Monthly Hotel Revenues          R Program

Trigonometric Regression - Shipping Container Throughput by Month

Response Surface Model - Top NASCAR Qualifying Speeds by Track

Response Surface Relating Sugarcane Wine Rating to 3 Factors      Data          Description

R Program       R Text Output    R Graphics Output      SAS Program      SAS Output

Response Surface Relating Mango Wine Ethanol Level to 3 Factors      Data        Description

EXCEL        SAS Program         SAS Output          R Program       R Output


Section 6.10 -  Model Building (RPD Chapter 7)


Mortgage Rates for 18 Cities - Worksheet

Cruise Ship Model Building             R Program        R Text Output       R Graphics Output          Data        Description     R Program for k-fold Cross Validation

Cruise Ship Model Building (Updated)             R Program (Updated)

AIC, AICc, BIC Example/Simulation    R Program

NBA Odds 2014/2015              Data (.csv)          Description           R Program


Section 6.11 -  Multicollinearity (RPD Chapter 13)

Shaq O'Neal Points/Rebounds - Eigenvalues, Eigenvectors, Principal Components      R Program

Multiple Linear Regression - Standing Heights and Other Stature Attributes for Female Police Officer Candidates (Multicollinearity/Principal Component Regression)

Multiple Linear Regression - China Carbon Emissions and Population Factors 1978-2008 (Multicollinearity/Ridge Regression)

Cruise Ship      Ridge Regression  R  Program         Principal Components R Program

Texas Weather Principal Components R Program


Sections 6.12-6.13 - Models with Heteroskedastic and Correlated Errors (RPD Section 12.5)

Estimated Weighted Least Squares   PPT    EWLS Algorithm  (WORD)              R Program  

Weighted Least Squares Case Study -- Cholesterol Reduction (PPT)

 Weighted Least Squares -- Cholesterol Reduction SAS Program        SAS Output         SAS Graph Output

 Weighted Least Squares -- Cholesterol Reduction R Program         R Output             R Graph Output

Estimated Weighted Least Squares - Profits and Market Structure for High Advertising Firms               Data       Description          

Estimated Weighted Least Squares - RKO Film Revenues and Costs

Estimated Weighted Least Squares Worksheet - Shotgun Pellet Spread            EXCEL          R Program        R Text Output

Second Experiment   R Program     Data (.csv)            Description         Summary of Results


Generalized Least Squares Case Study -- US Wine Sales vs Population 1934-2003 (PPT)       EXCEL

US Wine Sales and Population      Data    Description    SAS Program           R Program          R Output



Chapter 7 - Nonlinear Regression (RPD Chapter 14)

Practice Problems        Word         PDF

Intrinsically Linear Regression - Cobb-Douglas Production Function

Orlistat Case Study       Data       Description

R Program     R Text Output    R Graphics Output    SAS Program      SAS Text Output      SAS Graphics Output

Salmonella Weighted Nonlinear Least Squares    R Program         R Text Output          R Graphics Output

Solomon  Island Bird Species    Data      Description     R Program     Worksheet

Kentucky Derby Winning Times    Data (.csv)    R Program

Estimated Generalized Least Squares Matrix Algorithm for AR(q) Errors


Chapter 8 - Random Coefficient Regression/General Mixed Linear Models  (RPD Chapter 18)

Practice Problems        Word         PDF

Airline Revenues for 10 Markets 1996-2000 Case Study - PPT         Updated 1/29/2017

Airline Revenues for 10 Markets         Data     Description

R Program (lme procedure in nlme library)           R Output

SAS Program (proc mixed)                                      SAS Output

WNBA Example (EXCEL)        R Program    R Text Output    R Graphics Output      SAS Program    SAS Output

WNBA Extended Example        R Program    R Text Output        EXCEL Spreadsheet


Chapter 9 - Models Based on Non-Normal Distributions

Logistic Regression - NFL Field Goal Attempts (2003)

Logistic Regression - Pre-Challenger Field-Joint O-Ring Failures and Temperature

Logistic Regression with Grouped Data - Lobster Survival in Tether Experiment (PPT)      R Program

Poisson Regression - NASCAR Crash Data (1975-1979)

Poisson Regression with Rates - Traffic Accidents in Finland on Friday the 13th versus Other Fridays by Gender (1971-1997)

Negative Binomial Regression - NASCAR Lead Changes (1975-1979)

Gamma Regression - Napa Valley Marathon Speeds by Age and Gender  (2015)

Beta Regression - Proportion of Prize Money by Race for Ford - NASCAR Races (1992-2000)