

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

This assignment covers learning objective 1: An understanding of basic data structures, including stacks, queues, and trees; learning objective 3: An ability to apply appropriate sorting and search algorithms for a given application; learning objective 5: An ability to design and implement appropriate data structures and algorithms for engineering applications.

This assignment is to be completed on your own. The description is mainly taken from Professor Vijay Raghunathan. It deals with file compression and file decompression (similar to zip and unzip), both based on the widely used algorithmic technique of Huffman coding, which is used in JPEG compression as well as in MP3 audio compression. In particular, you will utilize your knowledge about lists/arrays, trees and/or other necessary data structures learned in ECE26400 to design a program to “decompress” (decode) a file that has been compressed (coded or encoded) using “Huffman coding.”

You may have learned about Huffman coding earlier in ECE26400. Your instructor may have provided you structures/functions that are relevant for this assignment. However, you are not allowed to use those structures/functions as your own work in this assignment. You should write your own structures/functions to replace those structures/functions for this assignment.

For this assignment, you will decode a file that has been coded using a coding tree, similar to how a file is coded (or compressed) using a Huffman coding tree. Your program will have to decide whether the coding tree is a Huffman coding tree, i.e., whether the coding has been constructed optimally. We shall first present the concept of Huffman coding, which is essential for your development of a method to determine whether a given file has been coded optimally.

Although the description is based on a Huffman coding tree, it can be applied to any coding tree that is a strictly binary tree (with ASCII characters as the leaf nodes). A strictly binary tree is a binary tree where every node has either 0 or 2 child nodes. A node with 0 child nodes is a leaf node and a node with 2 child nodes is an internal (non-leaf) node.
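The definition of a strictly binary tree can be checked recursively. The following is a minimal sketch only: the TreeNode layout and the function names is_leaf and is_strictly_binary are hypothetical, chosen for illustration, and are not structures required or supplied by the assignment. Treating an empty tree as not strictly binary is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node layout for illustration only. */
typedef struct TreeNode {
    int ch;                   /* ASCII value; meaningful only at a leaf */
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

/* A leaf node has 0 child nodes. */
bool is_leaf(const TreeNode *node)
{
    assert(node != NULL);
    return node->left == NULL && node->right == NULL;
}

/* Returns true when every node in the tree has either 0 or 2 children.
   Assumption: an empty tree (NULL) is reported as not strictly binary. */
bool is_strictly_binary(const TreeNode *node)
{
    if (node == NULL) {
        return false;
    }
    if (is_leaf(node)) {
        return true;
    }
    if (node->left == NULL || node->right == NULL) {
        return false;         /* exactly one child: not allowed */
    }
    return is_strictly_binary(node->left) && is_strictly_binary(node->right);
}
```

A root with two leaf children passes this check, while a root with only a left child fails it.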

ASCII coding (and extended ASCII coding)

In (extended) ASCII coding, every character is encoded (represented) with the same number of bits (8 bits) per character. Since there are 256 different values that can be represented with 8 bits, there are potentially 256 different characters in the ASCII character set, as shown in the ASCII character table (and extended ASCII character table) available at http://www.asciitable.com/.

Using ASCII encoding (8 bits per character), the 13-character string "go go gophers" requires 13 × 8 = 104 bits. The table to the right shows how the coding works. From left to right, the binary bits for each character are ordered from the most significant position to the least significant position.

The given string would be written in a file as 13 bytes, represented by the following stream of bits:

01100111 01101111 00100000 01100111 01101111 00100000 01100111 01101111 01110000 01101000 01100101 01110010 01110011
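The bit stream above can be reproduced one byte at a time. The sketch below is illustrative only; the function names byte_to_bits and ascii_bit_count are hypothetical and not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes the 8 bits of c into out as '0'/'1' characters, most significant
   bit first, followed by a terminating '\0'. out must hold 9 chars. */
void byte_to_bits(unsigned char c, char out[9])
{
    for (int i = 0; i < 8; i++) {
        out[i] = ((c >> (7 - i)) & 1u) ? '1' : '0';
    }
    out[8] = '\0';
}

/* Number of bits needed to store text with fixed 8-bit codes. */
size_t ascii_bit_count(const char *text)
{
    assert(text != NULL);
    return strlen(text) * 8;
}
```

Calling byte_to_bits on 'g' fills the buffer with "01100111", the first byte of the stream, and ascii_bit_count("go go gophers") returns 104.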

A more efficient coding

There is a more efficient coding scheme that uses fewer bits. As there are only 8 different characters in "go go gophers", we can use 3 bits to encode each of the 8 different characters. We may, for example, use the coding shown in the following table (other 3-bit encodings are also possible). Again, we assume that from left to right, the 3 bits for each character are ordered from the most significant position to the least significant position.

Note that we assume that for each byte in this bit stream, we have the most significant bit on the left and the least significant bit on the right. In other words, the least significant bit in this bit stream is a 1-bit. Now the string "go go gophers" would be encoded using a total of 39 bits instead of 104 bits. We can store that as five 8-bit bytes in a file as follows (in each byte, left to right is most significant to least significant):

11001000 10010001 00100011 00011010 01101011.
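A minimal sketch of how these five bytes can be produced follows. The 3-bit codes in code3 are inferred from the five bytes shown above (for example, the low three bits of the first byte give 000 for 'g'); they are an assumption, not a copy of the assignment's table. The function names code3 and pack3 are hypothetical. Each code is written starting at the lowest free bit of the stream, and unused bits are left as 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 3-bit code per character, inferred from the bytes in the text
   (an assumption). Returns -1 for any other character. */
int code3(char c)
{
    switch (c) {
    case 'g': return 0; /* 000 */
    case 'o': return 1; /* 001 */
    case 'p': return 2; /* 010 */
    case 'h': return 3; /* 011 */
    case 'e': return 4; /* 100 */
    case 'r': return 5; /* 101 */
    case 's': return 6; /* 110 */
    case ' ': return 7; /* 111 */
    default:  return -1;
    }
}

/* Packs the 3-bit codes of text into out, filling each byte from its
   least significant bit upward; unused bits stay 0. Returns the number
   of bytes used, or 0 if a character has no code or out is too small. */
size_t pack3(const char *text, unsigned char *out, size_t out_cap)
{
    assert(text != NULL && out != NULL);
    size_t nbits = strlen(text) * 3;
    size_t nbytes = (nbits + 7) / 8;      /* round up to whole bytes */
    if (nbytes > out_cap) {
        return 0;
    }
    memset(out, 0, nbytes);

    size_t pos = 0;                       /* next free bit in the stream */
    for (const char *p = text; *p != '\0'; p++) {
        int code = code3(*p);
        if (code < 0) {
            return 0;
        }
        for (int b = 0; b < 3; b++, pos++) {
            if ((code >> b) & 1) {
                out[pos / 8] |= (unsigned char)(1u << (pos % 8));
            }
        }
    }
    return nbytes;
}
```

With these assumed codes, pack3("go go gophers", buf, 8) returns 5 and fills buf with 0xC8, 0x91, 0x23, 0x1A, 0x6B, which are the five bytes listed above.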

The first byte contains the 3 bits of 'g' at the least significant positions, the 3 bits of 'o' in the middle, and the less significant 2 bits of Space. The least significant bit of the second byte contains the most significant bit of Space. In positions of increasing significance, the second byte also contains 'g', 'o', and the least significant bit of Space. As five bytes contain 40 bits altogether, the most significant position of the last byte in the file is not used. In this assignment, all unused bits should be assigned the value of 0.

However, even in this improved coding scheme, we used the same number of bits to represent each character, regardless of how often the character appears in our string. Even more bits can be saved if we use fewer than three bits to encode characters like 'g', 'o', and Space that occur frequently and more than three bits to encode characters like 'e', 'h', 'p', 'r', and 's' that occur less frequently in "go go gophers". This is the basic idea behind Huffman coding: use fewer bits for characters that occur more frequently. We'll see how this is done using a strictly binary tree that stores the characters as its leaf nodes, and whose root-to-leaf paths provide the bit sequences used to encode the characters.
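Deciding which characters deserve shorter codes starts with counting how often each character occurs. The sketch below shows only that counting step, not the construction of the coding tree; the name count_frequencies and the constant ASCII_SIZE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ASCII_SIZE 256   /* one slot per (extended) ASCII value */

/* Fills freq with the number of occurrences of each character in text
   and returns the number of distinct characters seen. */
size_t count_frequencies(const char *text, size_t freq[ASCII_SIZE])
{
    assert(text != NULL && freq != NULL);
    for (size_t i = 0; i < ASCII_SIZE; i++) {
        freq[i] = 0;
    }
    size_t distinct = 0;
    for (const unsigned char *p = (const unsigned char *)text; *p != '\0'; p++) {
        if (freq[*p]++ == 0) {
            distinct++;       /* first time this character appears */
        }
    }
    return distinct;
}
```

For "go go gophers" this reports 8 distinct characters: 'g' and 'o' occur 3 times each, Space occurs twice, and 'e', 'h', 'p', 'r', and 's' occur once each.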

Again, although the description is based on a Huffman coding tree, it can be applied to any coding tree that is a strictly binary tree with the leaf nodes storing ASCII characters. The main difference is in the number of bits required to represent (code/encode/compress) a given file.
