INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Use the Structured English Algorithm

Modify your Python script from Assignment 1 to replace all review scores with their z scores, and based on these z scores, to drop all rows containing outliers. Specifically, you will need the following functionalities.

  • Using any Python method/module/technique/data structure, save column 2 (containing reviewer names) and the header row of variable names from the .csv file of reviews, so that they can be brought back into a transformed matrix later.
  • As before, save columns 3 through 11 to an ndarray, replace all instances of '-1' (denoting missing values) with NaN, and save the cleaned values to a different ndarray.
  • Mask the array of cleaned values, replace each value with its column-wise z score, and save these z scores in a new array called 'z_array'.
  • Save 'z_array' to a .csv file called 'z_array_csv.csv', while bringing back in the saved column 2 and the saved header row. Use any Python method/module/technique/data structure to do so.
  • Copy 'z_array_csv.csv' to another csv file called 'no_outliers_csv.csv'. Using any Python method/module/technique/data structure, delete all rows of reviews in 'no_outliers_csv.csv' that contain two or more outlier values. An outlier value is any value greater than +3 or less than -3.
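
The outlier rule in the last step can be sketched with the standard csv module alone. This is a minimal illustration, not a required approach: the column layout (reviewer name first, z scores after it), the sample data, and the helper names `is_outlier` and `drop_outlier_rows` are all assumptions made for the example.

```python
import csv
import io

OUTLIER_LIMIT = 3.0  # a z score above +3 or below -3 is an outlier (from the assignment)
MAX_OUTLIERS = 1     # a row with two or more outliers is dropped


def is_outlier(cell):
    """Return True if the cell holds a number beyond +/-3.

    Blank cells, text, and NaN are never counted as outliers.
    """
    try:
        value = float(cell)
    except ValueError:
        return False
    return abs(value) > OUTLIER_LIMIT  # any comparison with NaN is False


def drop_outlier_rows(rows, first_score_col=1):
    """Keep the header plus every row with at most one outlier z score.

    first_score_col is an assumption: column 0 is taken to hold the name.
    """
    header, *body = rows
    kept = [
        row for row in body
        if sum(is_outlier(cell) for cell in row[first_score_col:]) <= MAX_OUTLIERS
    ]
    return [header] + kept


# Hypothetical stand-in for the contents of 'z_array_csv.csv'.
z_csv_text = """author,q1,q2,q3
ann,0.5,-0.1,1.2
bob,3.2,3.1,0.0
cat,-3.5,0.2,nan
dan,nan,4.0,-3.01
"""

rows = list(csv.reader(io.StringIO(z_csv_text)))
filtered = drop_outlier_rows(rows)

# Write the surviving rows back out as CSV text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(filtered)
print(out.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by `open('z_array_csv.csv', newline='')` for reading and `open('no_outliers_csv.csv', 'w', newline='')` for writing.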

Submit your Python script as a Word/Notepad file, as well as the two csv files 'z_array_csv.csv' and 'no_outliers_csv.csv'.

Notes from Instructor

# IMPORT VARIOUS MODULES SUCH AS csv, numpy, pandas, and zscore from scipy.stats
# SAVE THE HEADERS OF THE REVIEWS FILE (EXCEPT THE AUTHOR COLUMN HEADER) TO A LIST CALLED headers_list
# SAVE THE HEADER OF THE AUTHOR COLUMN TO A LIST CALLED author_header
# SAVE COLUMN OF REVIEWER NAMES (MINUS ITS HEADER) TO A LIST CALLED author_data
# READ COLUMNS C THROUGH K OF THE REVIEWS FILE (SKIPPING THE HEADER ROW) INTO AN NDARRAY CALLED r_array
# REPLACE ALL -1'S IN r_array WITH NaN VALUES, AND SAVE THESE CHANGES TO AN NDARRAY CALLED clean_r_array
# MASK THE NaN VALUES IN clean_r_array, AND SAVE THE MASKED ARRAY TO AN NDARRAY CALLED masked_clean_r_array
# COMPUTE Z SCORES ON VALUES IN masked_clean_r_array, AND SAVE THEM TO AN NDARRAY CALLED z_array.
# (THE ABOVE COMPUTATION WILL AUTOMATICALLY IGNORE MASKED NaN VALUES)
# SAVE z_array TO THE .CSV FILE temp_array_csv.csv USING THE np.savetxt METHOD. YOU MAY NEED TO APPLY THE ARGUMENT fmt='%f' IN THIS METHOD
# READ 'temp_array_csv.csv' INTO A PANDAS DATA FRAME CALLED z_frame
# CONVERT z_frame TO A LIST CALLED z_frame_list
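
The cleaning, masking, and z score steps in these notes might look like the sketch below. It uses a small made-up `r_array` in place of the real review columns. It also computes the z scores directly with NumPy masked-array arithmetic instead of calling `zscore` from scipy.stats, so it is a swapped-in technique rather than the instructor's exact method. The standard deviation is the population one (`ddof=0`), which matches the default of `scipy.stats.zscore`.

```python
import numpy as np

# Made-up review scores for illustration only; -1 marks a missing value.
r_array = np.array([
    [4.0, -1.0],
    [2.0, 3.0],
    [-1.0, 5.0],
    [6.0, 1.0],
])

# Replace every -1 with NaN in a new array, leaving r_array untouched.
clean_r_array = np.where(r_array == -1, np.nan, r_array)

# Mask the NaN entries so the column statistics skip them.
masked_clean_r_array = np.ma.masked_invalid(clean_r_array)

# Column-wise z scores: subtract each column's mean and divide by its
# population standard deviation. Masked entries stay masked in the result.
col_mean = masked_clean_r_array.mean(axis=0)
col_std = masked_clean_r_array.std(axis=0)
z_array = (masked_clean_r_array - col_mean) / col_std

# Turn masked entries back into NaN before printing or saving.
print(z_array.filled(np.nan))
```

Calling `z_array.filled(np.nan)` before `np.savetxt` is one way to keep missing scores as `nan` in the output file rather than writing a fill value.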
