Assessment description
This final assessment is a small individual project involving data processing, analysis and visualisation. The project requires students to demonstrate Python data science skills and techniques in data processing and exploration on real-world datasets, with consideration of data ethics.
In this project, you will be given a dataset for data processing, analysis and visualisation.
We will be analysing the BRFSS (brfss.csv) weight vs height data. The Behavioral Risk Factor Surveillance System (BRFSS) is the nation’s premier system of health-related telephone surveys that collect state data about US residents regarding their health-related risk behaviours, chronic health conditions, and use of preventive services. Weight and height were collected through telephone interviews.
The data for this project is in the file Assessment_3_data_bfss.csv; use pandas to view it.
The six columns in the data represent: age, current_weight (kg), weight_a_year_ago (kg), current_weight to two decimal places (kg), height (cm), and gender, where gender == 1 represents male and gender == 2 represents female.
In this project, you are required to make insightful discoveries about the data through initial exploration and visualisation, using the skills learned in this unit.
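As a starting point, the column layout described above could be loaded with pandas roughly as follows. This is a minimal sketch under stated assumptions: the column names (in particular `current_weight_2dp`), the helper `load_brfss`, and the absence of a header row are all hypothetical, and the four sample rows are made up for illustration rather than taken from the real file.

```python
import io

import pandas as pd

# Assumed column names, following the order given in the description.
# "current_weight_2dp" is a hypothetical name for the two-decimal weight column.
COLUMNS = [
    "age",
    "current_weight",
    "weight_a_year_ago",
    "current_weight_2dp",
    "height",
    "gender",
]


def load_brfss(source):
    """Read the six-column data and add a readable gender label.

    Assumes the file has no header row; if it does, drop header=None
    and names=COLUMNS and let pandas read the header instead.
    """
    df = pd.read_csv(source, header=None, names=COLUMNS)
    # gender == 1 is male and gender == 2 is female, per the description.
    df["gender_label"] = df["gender"].map({1: "male", 2: "female"})
    return df


# Made-up sample rows standing in for the real file (one row has missing values).
sample = io.StringIO(
    "82,70.9,72.7,70.91,157,2\n"
    "65,72.7,72.7,72.73,163,2\n"
    "48,,,,165,2\n"
    "61,73.6,73.6,73.64,170,1\n"
)
df = load_brfss(sample)

print(df.head())        # first rows, to check the columns line up
print(df.isna().sum())  # missing values per column
```

With the real data, the path to Assessment_3_data_bfss.csv would be passed to `load_brfss` in place of `sample`. Checking the missing-value counts early is useful because survey data of this kind often has unanswered weight or height questions.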
________________________________________
Assessment details
Attempt the tasks below with the given dataset. At the same time, reflect on the development and applications of data science while ensuring respect for human rights and for the values shaping open, pluralistic and tolerant information societies.
________________________________________
Prepare a Jupyter Notebook for Tasks 1-3 of this project
The structure of the Jupyter Notebook should alternate between text and Python code and cover the topics listed in the specific tasks below. You may use this template to complete and submit Tasks 1-3 of this assessment:
See file: Assessment 3.ipynb (format to follow)
Task 1 (10 marks)
Produce a summary statistics graph on current_weight, weight_a_year_ago, and height. [Hint: similar to figure 1 below]
Figure 1. An example of a summary statistics graph (n.d.)
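Figure 1 is not reproduced here, so the exact chart type is not known. One plausible reading of "summary statistics graph" is a box plot, which shows the median, quartiles and outliers for each variable. The sketch below takes that reading; the helper name `plot_summary` is hypothetical, and the data is synthetic, generated only so the example runs on its own.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_summary(df, columns):
    """Return describe() statistics and a box plot for the given columns."""
    summary = df[columns].describe()  # count, mean, std, min, quartiles, max

    fig, ax = plt.subplots(figsize=(8, 5))
    # Drop missing values per column so each box uses all available data.
    data = [df[col].dropna().to_numpy() for col in columns]
    ax.boxplot(data, showmeans=True)
    ax.set_xticks(range(1, len(columns) + 1))
    ax.set_xticklabels(columns)
    # Weights are in kg and height in cm, so the shared axis mixes units.
    ax.set_ylabel("value (kg for weights, cm for height)")
    ax.set_title("Summary statistics (box plot)")
    return summary, fig


# Synthetic stand-in data; NOT the real BRFSS values.
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame(
    {
        "current_weight": rng.normal(80, 15, n),
        "weight_a_year_ago": rng.normal(79, 15, n),
        "height": rng.normal(170, 10, n),
    }
)

cols = ["current_weight", "weight_a_year_ago", "height"]
summary, fig = plot_summary(demo, cols)
print(summary.round(2))
```

In a notebook the figure renders after the cell runs. Printing the `describe()` table alongside the plot lets the text cells quote exact values (mean, quartiles) that are hard to read off the chart.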
Task 2 (10 marks): Calculate correlation
Define weight_change = (current_weight - weight_a_year_ago). Calculate the correlation between weight_change and the following variables, and determine which one is most correlated with weight_change (regardless of the sign of the correlation). Use scatter plots to support your conclusion.
i. current_weight
ii. weight_a_year_ago
iii. age
[Hint: One scatter plot for each variable.]
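The calculation described in Task 2 can be sketched as below, assuming Pearson correlation (the task does not name a method) and the column names used in the task text. The helper `weight_change_correlations` is hypothetical. The data is synthetic and built so that the code runs on its own, which means the ranking it prints says nothing about the real dataset.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def weight_change_correlations(df, candidates):
    """Correlate weight_change with each candidate column (Pearson)."""
    work = df.copy()
    work["weight_change"] = work["current_weight"] - work["weight_a_year_ago"]
    # corrwith computes one correlation per column, ignoring missing pairs.
    corr = work[candidates].corrwith(work["weight_change"])
    # Compare absolute values so the sign of the correlation is ignored.
    strongest = corr.abs().idxmax()
    return work, corr, strongest


# Synthetic stand-in data; NOT the real BRFSS values.
rng = np.random.default_rng(1)
n = 2000
year_ago = rng.normal(80, 15, n)
change = rng.normal(0, 4, n)
demo = pd.DataFrame(
    {
        "age": rng.integers(18, 90, n),
        "current_weight": year_ago + change,
        "weight_a_year_ago": year_ago,
    }
)

candidates = ["current_weight", "weight_a_year_ago", "age"]
work, corr, strongest = weight_change_correlations(demo, candidates)
print(corr.round(3))
print("Most correlated (by absolute value):", strongest)

# One scatter plot per variable, as the hint suggests.
fig, axes = plt.subplots(1, 3, figsize=(15, 4), sharey=True)
for ax, col in zip(axes, candidates):
    ax.scatter(work[col], work["weight_change"], s=5, alpha=0.3)
    ax.set_xlabel(col)
    ax.set_title(f"r = {corr[col]:.3f}")
axes[0].set_ylabel("weight_change (kg)")
fig.tight_layout()
```

Putting the coefficient in each panel title ties the numeric result to the visual evidence, which is what the task asks the scatter plots to support.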