Data tables are examples only and are limited to 20 rows. You can download the full data files directly from GitHub.
Chapter 1 Introduction to R
Chapter Objectives
- Learn about R as a programming language
- Define Integrated Development Environment
- Define objects
- Learn the assignment operator
- Define functions
- Execute a loop
- Learn logical operators
- Learn about R data types
- Learn about object classes
- Index data objects
- Extend R functionality with packages
- Write a custom function
- Create a scatter plot with sports data
- Create a heatmap with sports data
2019-2020 Boston Player Stats.csv
2019-2020 Dallas Player Stats.csv
Chapter 2 Data Visualization: Best Practices
Chapter Objectives
- Articulate best practices of convincing visualizations
- Understand the programmatic layering used in the most popular R plotting library, ggplot2
- Understand the difference between client-side and server-side data
- Create various plots with ggplot, including sports fields and courts
- Create client-side interactive visualizations with echarts4r
Dellavedova_18_19_season.csv
Chapter 3 Geospatial Data
Chapter Objectives
- Download baseball data from various sources
- Perform a “crosswalk” inner-join with data to append additional information
- Chart a player’s performance over time
- Web scrape player data from a GET request
- Tabulate pitch types by year
- Visualize the change by pitch type over time
- Create and interpret a box plot of pitch speed by type and time
- Create a 2D density plot of pitch type
- Make both interactive JavaScript and static plots for each visual in this chapter
Miguel_Castro_pitchStats_backup.csv
Chapter 5 Logistic Regression
Chapter Objectives
- Follow the SEMMA approach to modeling
- Construct various visuals within the data exploration phase of the modeling exercise
- Build a logistic regression to model winning team characteristics
- Calculate multiple model key performance indicators and compare them across training and validation partitions
- Construct a waterfall chart using the model coefficients to understand what proportion of the model’s output is explained by each feature
- Organize the modeling coefficients and winning team data to construct a scatter plot
- Interpret the scatter plot quadrants to understand team behavior and which aspects of women’s collegiate basketball players and coaches should focus on or deprioritize
- Identify top-performing teams according to statistic(s) that may be overlooked by other teams
imputed_DefensiveStats.csv
imputed_OffensiveStats.csv
Chapter 6 Gauging Fan Sentiment in Cricket
Chapter Objectives
- Learn what NLP is and a basic approach to analyzing text
- Learn the basic NLP terms and object classes
- Define the six-step NLP workflow
- Apply various string manipulation functions to a collection of forum posts as documents
- Identify two-word lexicons for sentiment analysis and adjust one for the forum’s context
- Programmatically change the tokenization of the text from unigram to bigrams
- Learn about full, inner, and left joins
- Visualize the overall forum community comment velocity
- Build a word cloud of frequent two-word phrases
- Classify forum comments by emotional category, then plot as a radar chart for the entire forum conversation
- Focusing on individual users, calculate and visualize the network graph of comments to identify the most central author
- Individually review the most and least negative authors, creating a bar chart for review
Chapter 7 Gambling Optimization
Chapter Objectives
- Understand the basic premise of sports lineups
- Contextualize the impact of fantasy sports and gambling
- Learn to set a football lineup
- Define a simulation of outcomes
- Solve a linear programming football lineup problem
- Identify the single lineup that maximizes a week’s predicted points using player point predictions
copy_scrape.rds
- A copy of the scrape is offered here so you can see how the data is structured in case the webpages change after publication.
Chapter 8 Exploratory Data Analysis
Chapter Objectives
- Download basic sports data
- Apply various functions to understand summary, tabular and statistical information of the data
- Build bar, timeline event, and line charts to explore patterns in the data
- Construct a Markov chain to identify the next most likely event and better understand the overall scenario represented in the data
©Copyright 2022
rstatsbook.com
commentsReddit_Feb_15_2021.csv