
Is the salary of a player dependent on the position he plays in the game?

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

1. Introduction

With millions of fans worldwide, the National Basketball Association (NBA) is one of the most popular and successful professional sports leagues. Huge profits have led to its superstars being paid lavishly: in the 2018-19 season, Steph Curry earned over US$37.4 million and LeBron James over US$35.6 million on the court ……

In our project, we use a dataset containing the salaries of NBA players from the 2016-17 season, together with other variables such as career statistics, draft picks and physical measurements. Based on this dataset, we seek to answer the following popular questions about the NBA:

1. Is the salary of a player dependent on the position he plays in the game?

2. Does the salary depend on the height of the player?

3. Does the salary depend on the draft-pick and draft-year of the player?

4. Is there a single performance measure that is more important in affecting the salary than the others?

5. ……

This report covers the data description and the analysis, both carried out in the R language. For each research objective, we perform the statistical analysis we consider most appropriate and draw conclusions, with accompanying explanations.

2. Data Description

The dataset, titled “NBA Salaries”, was obtained from the online data science community data.world. The original data consists of two CSV files, “players.csv” and “salaries_1985to2018.csv”. The data was originally posted on basketball-reference.com, the official database partner of the National Basketball Association (NBA), and is open to the public for study and research.

Before proceeding to data analysis, we first performed a preliminary data cleaning to ensure that:

- Irrelevant columns are eliminated, e.g. “birthplace” and “highschool”;

- Players with fewer than 10 games played are treated as unrepresentative anomalies and excluded;

- Redundant information is removed, e.g. the word “overall” in the “draft_pick” column, as we only need the number for analysis;

- Only data from the 2016-17 season are included in our dataset;

- ……
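The cleaning rules listed above can be sketched in a few lines. The report itself uses R; the version below is a Python sketch using only the standard library. The records are made up, and the field names "player", "season" and "games_played", as well as the "3rd overall" string format, are assumptions for illustration (only "birthplace", "highschool" and "draft_pick" are named in the report).

```python
import re

# Made-up records for illustration; NOT taken from the real dataset.
# "player", "season" and "games_played" are hypothetical field names.
raw_rows = [
    {"player": "A", "season": "2016-17", "games_played": 72,
     "draft_pick": "3rd overall", "birthplace": "X", "highschool": "Y"},
    {"player": "B", "season": "2016-17", "games_played": 4,
     "draft_pick": "41st overall", "birthplace": "X", "highschool": "Y"},
    {"player": "C", "season": "2015-16", "games_played": 80,
     "draft_pick": "1st overall", "birthplace": "X", "highschool": "Y"},
    {"player": "D", "season": "2016-17", "games_played": 10,
     "draft_pick": "", "birthplace": "X", "highschool": "Y"},
]

IRRELEVANT_COLUMNS = {"birthplace", "highschool"}


def parse_draft_pick(text):
    """Return the leading integer of a string such as '3rd overall', else None."""
    match = re.match(r"\s*(\d+)", text or "")
    return int(match.group(1)) if match else None


def clean(rows, season="2016-17", min_games=10):
    """Apply the cleaning rules: season filter, games filter, column pruning."""
    cleaned = []
    for row in rows:
        if row["season"] != season:
            continue  # keep only the season of interest
        if row["games_played"] < min_games:
            continue  # fewer than 10 games: treated as unrepresentative
        kept = {k: v for k, v in row.items() if k not in IRRELEVANT_COLUMNS}
        kept["draft_pick"] = parse_draft_pick(row["draft_pick"])
        cleaned.append(kept)
    return cleaned


cleaned_rows = clean(raw_rows)
for row in cleaned_rows:
    print(row)
```

In this sketch a player with exactly 10 games is kept, matching the "fewer than 10" wording, and a missing draft pick becomes `None` rather than being guessed.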

 

After all the preparation, 513 observations (players) with 11 variables are retained for analysis:

1. Sno: serial number

2. X_id: player identity, abbreviated player name

3. AST: career average assists per game

4. FGpct: career field goal percentage

5. PTS: career average points scored per game

6. TRB: career average rebounds per game

7. height: height of the player

8. draft_pick: draft-pick position of the player, indicating the perceived value of the player before joining the NBA

9. draft_year: draft-pick year of the player, indicating the seniority of the player in the NBA

10. position: primary position played by the player (PG, SG, SF, PF or C)

11. salary: total salary in US$ of the player for the 2016-17 season

3. Description and Cleaning of Dataset

In this section, we shall look into the data in more detail. Each variable is investigated individually to look for possible outliers, and/or to perform a transformation to avoid highly skewed data.

3.1 Summary statistics for the main variable of interest, salary

The following plots show the overall distribution of the variable salary.

It appears that the variable salary is highly skewed, so we apply a log-transformation (base e) to it. The log-transformed data has some outlying values in the left tail. On further investigation, we found that some players were on short-term contracts (a few weeks) during the season. We therefore remove the players whose salaries are below US$100,000, approximately 4% of the data.
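The trimming and transformation described above amount to a threshold filter followed by a natural logarithm. The following is a minimal Python sketch (the report itself uses R); the salary figures are invented for illustration and the variable names are assumptions.

```python
import math

# Invented salaries in US$, for illustration only (not from the dataset).
salaries = [26_540_100, 5_000_000, 1_200_000, 543_471, 63_938, 31_969]

MIN_SALARY = 100_000  # threshold stated in the text

# Drop players paid below the threshold (short-term contracts).
kept = [s for s in salaries if s >= MIN_SALARY]

# Natural (base e) log-transformation of the remaining salaries.
log_salaries = [math.log(s) for s in kept]

# Share of observations removed by the threshold in this toy sample.
removed_share = 1 - len(kept) / len(salaries)

print(len(kept), [round(x, 2) for x in log_salaries])
```

Filtering on the raw scale before taking logs keeps the threshold in dollar terms, which is how the text states it.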

The histogram and boxplot of the log-transformed variable, with the outliers removed, are shown below together with summary statistics. The distribution is now more symmetric and no longer shows any outliers.
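Symmetry and the absence of outliers can also be checked numerically. The sketch below is in Python rather than the report's R, and it is not necessarily the check the authors ran: it computes the moment-based sample skewness and applies the usual 1.5 × IQR boxplot rule to invented salaries. Quartile conventions differ between tools, so the fences may not match a given plotting function exactly.

```python
import math
import statistics


def skewness(values):
    """Moment-based sample skewness: m3 / m2 ** 1.5."""
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m3 = sum((v - mean) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5


def boxplot_outliers(values):
    """Values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Invented salaries in US$ (each double the previous one), for illustration.
salaries = [500_000, 1_000_000, 2_000_000, 4_000_000,
            8_000_000, 16_000_000, 32_000_000]
log_salaries = [math.log(s) for s in salaries]

# The same sample plus one invented short-term contract.
log_with_short_contract = log_salaries + [math.log(3_000)]

print(skewness(salaries) > 0)                      # raw scale is right-skewed
print(boxplot_outliers(log_salaries))              # no points flagged
print(boxplot_outliers(log_with_short_contract))   # the short contract is flagged
```

In this toy sample the raw salaries are right-skewed, the log values are symmetric, and only the added short-contract value falls outside the lower fence.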

We shall proceed to the next section with this trimmed dataset.

 
