Introduction
Do you know about Zipf’s Law? It relates to a phenomenon for how very frequent items can sometimes appear far more frequently than less frequent items. For example, the word “the” is something like 7% of words when an appropriate corpus of English-language text documents are examined.
If you were interested in examining Zipf’s Law, you might write a computer program to read a bunch of text documents and count how often each word appears. Depending on various factors, one data structure you might consider using is a balanced binary search tree. There’s another data structure you might want to consider instead, but we’re going to combine that with a balanced BST for something fun after exam two.
Getting Started
Before you begin work on this project, there are a couple of chores you'll need to complete on your ICS 46 VM to get it set up to proceed.
Refreshing your ICS 46 VM environment
Even if you previously downloaded your ICS 46 VM, you will probably need to refresh its environment before proceeding with this project. Log into your VM and issue the command ics46 version to see what version of the ICS 46 environment you currently have stored on your VM. Note, in particular, the timestamp; if you see a version with a timestamp older than the one listed below, you'll want to refresh your environment by running the command ics46 refresh to download the latest one before you proceed with this project.
If you're unable to get outgoing network access to work on the ICS 46 VM — something that afflicts a handful of students each quarter — then the ics46 refresh command won't work, but an alternative approach is to download the latest environment from the link below, then to upload the file on to your ICS 46 VM using SCP. (See the Project #0 write-up for more details on using SCP.) Once the file is on your VM, you can run the command ics46b refresh_local NAME_OF_ENVIRONMENT_FILE, replacing NAME_OF_ENVIRONMENT_FILE with the name of the file you uploaded; note that you'd need to be in the same directory where the file is when you run the command.
The file is linked from the “public” ICS 46 page; click this link and enjoy the amazing web design skill that put it together: https://www.ics.uci.edu/~mikes/ics46/
Creating your project directory on your ICS 46 VM
A project template has been created specifically for this project, containing a similar structure to the basic template you saw in Project #0.
Decide on a name for your project directory, then issue the command ics46b start YOUR_CHOSEN_PROJECT_NAME project3 to create your new project directory using the project3 template. (For example, if you wanted to call your project directory proj3, you would issue the command ics46 start proj3 project3 to create it.) Now you're ready to proceed
Choosing a project partner
You have the option to work with a second person for this assignment. If you do so, I expect you to work via pair programming. That is, you may not split the assignment, such as by having one person implement the AVL Tree while the other person implements the function that does the counting, and the two are stitched together later. I reserve the right to ask one or both project partners about the implementation and adjust the score accordingly.
Similarly, any academic dishonesty arising from a group will be treated as an offense by both partners.
Information about registering partnerships and submitting partnered assignments will be released mid-week. Keep an eye on Piazza if you are going this route. At time of this document’s publication, your professor isn’t 100% sure how to handle that aspect.
Reviewing related material
I encourage you to read your textbook; in the second edition, section 10.1 covers Binary Search Trees and 10.2 talks about AVL Trees. Your book is good at getting to the point, so this should not be a long read. Furthermore, you should look at your notes from the associated lectures.
Requirements
Fill in hpp
Your implementation must be implemented via linked nodes in the tree format from lecture. That is, you may not have a “vector-based ”
Your implementation must fit the interface given
Your implementation must be templated as
You do not need to write a copy constructor or an assignment operator, but knowing how to do so is generally a good
You do need to implement the
All functions that need to be implemented have been started in the provided .hpp
For comparing keys, use the “natural” comparison offered by <.
Any test cases provided will have something for the key that has this defined.
Do not hard code for assumptions for how the tree will be used in the second part of the
Write function void countWords(std::istream & in, MyAVLTree<std::string, unsigned> &counter) in cpp
This function will read words from the input stream. It will break them into given words. All words with the same letters, in the same order, are considered the same word: disregard capitalization. For example, “Bill” and “bill” are the same word, even if the context might imply a person named William in one case and an invoice in the
You can treat istream just like std::cin -- there is some sample code Please let me know if it is insufficient, I can explain more.
You may assume that the given tree is empty when the function is first This will certainly hold true for any test cases.
When this function returns, the tree parameter should now have a map of strings to non-negative integers, where the associated value with a given string key is the number of times that word appeared in the
It is strongly recommended that you produce the tree in the following fashion: for each word in the stream
if the word has already been seen
retrieve the count of how many times it has been seen. increment that count.
add this to the tree with a count of 1.
Your implementation does not have to be the most efficient thing ever, but it cannot be “too slow.” In general, any test case that takes over a minute on the grader’s computer may be deemed a wrong answer, even if it will later return a correct
You may not use parts of the C++ standard template library in this assignment except for std::vector. Furthermore, std::vector may only be used when implementing the three traversals (in-order, pre-order, post-order). For what it’s worth, you won’t miss it for this assignment.
DescriptionIn this final assignment, the students will demonstrate their ability to apply two ma
Path finding involves finding a path from A to B. Typically we want the path to have certain properties,such as being the shortest or to avoid going t
Develop a program to emulate a purchase transaction at a retail store. Thisprogram will have two classes, a LineItem class and a Transaction class. Th
1 Project 1 Introduction - the SeaPort Project series For this set of projects for the course, we wish to simulate some of the aspects of a number of
1 Project 2 Introduction - the SeaPort Project series For this set of projects for the course, we wish to simulate some of the aspects of a number of