COMP47670 Assignment 1: Data Collection & Preparation
Overview:
The objective of this assignment is to collect a dataset from one or more open web APIs of your choice, and use Python to preprocess and analyse the collected data.
The assignment should be implemented as a single Jupyter Notebook (not a script). Your notebook should be clearly documented, using comments and Markdown cells to explain the code and results.
Tasks:
For this assignment you should complete the following tasks:
1. Data identification:
Choose at least one open web API as your data source (i.e. not a static or pre-collected dataset). If you decide to use more than one API, these APIs should be related in some way.
2. Data collection:
Collect data from your API(s) using Python. Depending on the API(s), you may need to repeat the collection process multiple times to download sufficient data.
Store the collected data in an appropriate file format for subsequent analysis (e.g. JSON, XML, CSV).
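As an illustration of the collection step, the following is a minimal sketch that requests pages from an API repeatedly and stores the results as JSON. The endpoint URL, the page and per_page parameter names, and the assumption that each response is a JSON list are all hypothetical; the real API you choose will have its own URL scheme, authentication, and rate limits.

```python
import json
import time
import urllib.parse
import urllib.request

# Hypothetical endpoint; replace with the URL of your chosen API.
BASE_URL = "https://api.example.org/v1/items"


def fetch_page(page, per_page=100):
    """Request one page of results (assumed to be a JSON list)."""
    # "page" and "per_page" are assumed parameter names, not a real API's.
    query = urllib.parse.urlencode({"page": page, "per_page": per_page})
    with urllib.request.urlopen(f"{BASE_URL}?{query}", timeout=30) as response:
        return json.load(response)


def collect(fetch, max_pages=10, delay=1.0):
    """Call fetch(page) until a page comes back empty or max_pages is reached."""
    records = []
    for page in range(1, max_pages + 1):
        batch = fetch(page)
        if not batch:
            break
        records.extend(batch)
        time.sleep(delay)  # pause between requests to respect rate limits
    return records


def save_json(records, path):
    """Write the collected records to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```

With a real endpoint in place, a call such as save_json(collect(fetch_page), "raw_data.json") would download up to ten pages and store them in one file. Passing the fetch function as an argument keeps the paging loop independent of any particular API.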
3. Data preparation and analysis:
Load and represent the data using an appropriate data structure (i.e. records/items as rows, described by features as columns).
Apply any preprocessing steps that might be required to clean or filter the data before analysis. Where more than one API is used, apply suitable data integration.
Analyse, characterise, and summarise the cleaned dataset, using tables and plots where appropriate. Clearly explain and interpret any analysis results.
Summarise any insights which you gained from your analysis of the data. Suggest ideas for further analysis which could be performed on the data.
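The preparation and analysis steps above could be sketched as follows using pandas, which is one possible choice of library and not a requirement stated here. The records, column names, and values are made up purely for illustration; they stand in for data collected from two related APIs.

```python
import pandas as pd

# Made-up records standing in for data collected from two related APIs.
readings = [
    {"station_id": "A", "date": "2020-03-01", "temp_c": "8.5"},
    {"station_id": "A", "date": "2020-03-01", "temp_c": "8.5"},  # exact duplicate
    {"station_id": "B", "date": "2020-03-01", "temp_c": "n/a"},  # unparseable value
    {"station_id": "B", "date": "2020-03-02", "temp_c": "11.0"},
]
stations = [
    {"station_id": "A", "county": "Dublin"},
    {"station_id": "B", "county": "Cork"},
]

# Represent the data with records as rows and features as columns.
df = pd.DataFrame(readings)

# Clean: drop exact duplicates, convert types, remove rows with no usable value.
df = df.drop_duplicates()
df["temp_c"] = pd.to_numeric(df["temp_c"], errors="coerce")  # "n/a" becomes NaN
df["date"] = pd.to_datetime(df["date"])
df = df.dropna(subset=["temp_c"])

# Integrate: attach station metadata from the second source via a shared key.
merged = df.merge(pd.DataFrame(stations), on="station_id", how="left")

# Summarise: number of readings and mean temperature per county.
summary = merged.groupby("county")["temp_c"].agg(["count", "mean"])
print(summary)
```

In a notebook, summary["mean"].plot(kind="bar") would draw a simple bar chart of the same table, provided matplotlib is installed. Each cleaning decision (for example, dropping rows rather than imputing missing values) should be explained in a Markdown cell.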
Guidelines:
The assignment should be completed individually. Any evidence of plagiarism will result in a mark of 0.
Submit your assignment via the COMP47670 Brightspace page. Your submission should be in the form of a single ZIP file containing the notebook (i.e. IPYNB file) and your data. If your data is too large to upload, please include a smaller sample of the data in the ZIP file.
In the notebook, please clearly state your full name and your student number. Also provide links to the home pages for the API(s) which you used.
Hard deadline: Submit by the end of Monday 23rd March 2020
1-5 days late: 10% deduction from overall mark
6-10 days late: 20% deduction from overall mark
No assignments accepted after 10 days without extenuating circumstances approval and/or a medical certificate.