(5/5)

I need to create a file system analysis tool that will analyze partitions. In this project, you will write a file system analysis tool called fsa, which can analyze partitions formatted with the ext2 file system.

INSTRUCTIONS TO CANDIDATES

ANSWER ALL QUESTIONS

Hello, I am looking for help for my project for my operating systems course.

For the project, I need to create a file system analysis tool that will analyze partitions. Attached is a zipped file with the project assignment with all the instructions and other resources used for the project.

This program must be programmed in ubuntu and coded in C

Objectives:

• Learn internal organization of a file system

• Practice reading bits and bytes

• Practice reading/utilizing Linux kernel code, man pages, and file system libraries/data structures

1 Project Description

In this project, you will write a file system analysis tool called fsa, which can analyze partitions

formatted with the ext2 file system.

Since this program can cause harm to the file system of the storage device to be analyzed, you

will be provided a disk image file (in zipped format due to large size, so unzip before using) where

the content of this file will be the clone of an ext2 formatted disk’s binary content (content from the

operating system’s point of view - sequence of blocks). We will call this file a virtual disk. Your

program will read the given virtual disk and report some general file system statistics, individual

group statistics, and the root directory entries. The virtual disk file will be passed as a command line

argument to your program as follows, where ext2disk_100mb.img is the name of the virtual

disk file to be analyzed:

./fsa ext2disk_100mb.img

As a result, the following information should be printed ("..."s will be filled by your program):

--General File System Information--

Block Size in Bytes: ...

Total Number of Blocks: ...

Disk Size in Bytes: ...

Maximum Number of Blocks Per Group: ...

Inode Size in Bytes: ...

Number of Inodes Per Group: ...

Number of Inode Blocks Per Group: ...

Number of Groups: ...

--Individual Group Information--

-Group ...-

Block IDs: ...

Block Bitmap Block ID: ...

Inode Bitmap Block ID: ...

Inode Table Block ID: ...

Number of Free Blocks: ...

Number of Free Inodes: ...

Number of Directories: ...

Free Block IDs: ...

Free Inode IDs: ...

--Root Directory Entries--

Inode: ...

Entry Length: ...

Name Length: ...

File Type: ...

Name: ...

2 Project Details

In order to complete this project, you need to learn ext2 file system details. We have covered the basics

of ext2 in class. Some resources documenting the internals of ext2 can also be found in blackboard as

part of this assignment. If you need more information, there are plenty other resources available on

the Internet related to ext2’s design.

A big help will be installing the e2fslibs-dev package in your Ubuntu machine, if not already

installed, and using the structs defined by this library in your program for some in-memory structures

of the ext2 file system including superblock, inode, group descriptor and directory entry structures.

You can install the e2fslibs-dev package using the following command:

sudo apt-get install e2fslibs-dev

Another important issue is that you should be able to read the content of the provided virtual disk

in raw mode. Below is a sample program called sb.c (also provided in blackboard) showing possible

ways to access the bytes of the provided virtual disk:

1 #include <stdio.h>

2 #include <stdlib.h>

3 #include <fcntl.h>

4 #include <unistd.h>

5 #include <ext2fs/ext2fs.h>

7 int main(int argc, char *argv[]) {

9 int i, rv, fd;

10 unsigned char byte;

11 char *sb1 = NULL;

12 struct ext2_super_block *sb2 = NULL;

14 if (argc != 2) {

15 fprintf (stderr, "%s: Usage: ./sb disk_image_file\n", "sb");

16 exit(1);

17 }

19 fd = open(argv[1], O_RDONLY);

20 if (fd == -1) {

21 perror("disk_image_file open failed");

22 exit(1);

23 }

25 /*skip the boot info - the first 1024 bytes - using the lseek*/

26 if (lseek(fd, 1024, SEEK_CUR) != 1024) {

27 perror("File seek failed");

28 exit(1);

29 }

31 sb1 = malloc(1024);

32 sb2 = malloc(sizeof(struct ext2_super_block));

34 if (sb1 == NULL || sb2 == NULL) {

35 fprintf (stderr, "%s: Error in malloc\n", "sb");

36 exit(1);

37 }

39 /*read the superblock byte by byte, print each byte, and store in sb1*/

40 for (i = 0; i < 1024; i++) {

41 rv = read(fd, &byte, 1);

42 if (rv == -1) {

43 perror("File read failed");

44 exit(1);

45 }

46 if (rv == 1) {

47 printf("byte[%d]: 0x%02X\n", i+1024, byte);

48 sb1[i] = byte;

49 }

50 }

52 printf ("Total Number of Inodes: %u\n", *(unsigned int *)sb1);

printf ("Number of Free Inodes: %u\n", *(unsigned int *)(sb1+16));

56 /*set the file offset to byte 1024 again*/

57 if (lseek(fd, 1024, SEEK_SET) != 1024) {

58 perror("File seek failed");

59 exit(1);

60 }

62 /*read the whole superblock and load into the ext2_super_block struct*/

63 /*assumes the struct fields are laid out in the order they are defined*/

64 rv = read(fd, sb2, sizeof(struct ext2_super_block));

65 if (rv == -1) {

66 perror("File read failed");

67 exit(1);

68 }

69 if (rv == 1024) {

70 printf ("Total Number of Inodes: %u\n", sb2->s_inodes_count);

71 printf ("Number of Free Inodes: %u\n", sb2->s_free_inodes_count);

72 }

74 free(sb1);

75 free(sb2);

76 close(fd);

78 return 0;

79 }

The program above opens the virtual disk in read only mode in line 19, skips the first 1024 bytes

in line 26 (you should know why we skip the first 1024 bytes - hint: read comments), and reads the

superblock info byte by byte printing and storing the bytes being read into sb1 in lines 39-50. This is

one way to reach the superblock info. Another (much easier) way is using the ext2_super_block

struct defined in the ext2_fs.h file. To be able to use this struct, you should include the ext2fs

library as in line 5 (#include <ext2fs/ext2fs.h>) assuming you have already installed the

e2fslibs-dev package. Line 57 shows how to reposition the file offset back to byte 1024, where the

superblock info starts. Then line 64 reads the whole superblock info into the sb2 struct. This assumes

that in-memory layout of the struct is not reordered by the compiler and the struct you use does not

require any padding bytes to be inserted by the compiler. I verified that this approach works in my

64-bit Ubuntu machine using the gcc compiler, but remember that it might not work in different

configurations! You can assume that it will work in the machine that we use for grading.

As a result, "Total Number of Inodes" and "Number of Free Inodes" values are printed to the screen

using two different approaches explained above. Both approaches should print the same values. Also,

each byte of the superblock info is printed to the screen using the first approach. You can redirect the

output of this program into a file as follows:

./sb ext2disk_100mb.img > out.txt

and analyse the output file (out.txt) to see the hexadecimal values of each byte of the superblock.

2.1 Descriptions and Tips

-General File System Information- should only be printed once and you can utilize the

following tips to find out necessary information:

• "Block Size in Bytes" indicates the block size used by the file system in bytes, which is available

in the superblock but not directly. You need to make some calculations to achieve the actual

value in bytes. Check page 4 (1.1.7. s_log_block_size part) of the provided Resource1.pdf

document in blackboard for more information about this calculation.

• "Total Number of Blocks" indicates the total number of blocks available in this partition, which

is directly available in the superblock.

• "Disk Size in Bytes" can be found by multiplying the "Total Number of Blocks" and the "Block

Size in Bytes".

• "Maximum Number of Blocks Per Group" is the maximum amount of blocks that each group

has and it is directly available in the superblock. Note that the last group will most probably

not have this many blocks but all other groups except the last one will definitely have this many

blocks.

• "Inode Size in Bytes" indicates the amount of bytes each inode occupies and it is directly available in the superblock.

• "Number of Inodes Per Group" indicates the amount of inodes created for each group, which is

directly available in the superblock.

• "Number of Inode Blocks Per Group" indicates the number of blocks used to store all the inodes

for each group. You already know the "Number of Inodes Per Group" and the "Inode Size in

Bytes". Using these two values you can calculate the total amount of bytes that the inodes

occupy for each group. Then using this value and the "Block Size in Bytes" value, you can

easily calculate the amount of disk blocks used to store all the inodes for each group.

• "Number of Groups" can be calculated using the "Total Number of Blocks" and the "Maximum

Number of Blocks Per Group" values. However, it might get tricky if the last group does not

have enough blocks to fit all necessary data structures and some data blocks. For simplicity,

we assume that the virtual disk we provided has enough blocks in its last group to fit all the

necessary data structures and some data blocks.

-Individual Group Information- should print group-specific information for each group

and you can utilize the following tips to find out these values:

• "-Group ...-" should include the group ID starting with 0 for the first group.

• "Block IDs" indicates the IDs of the blocks that are assigned for this group. Therefore, you will

provide a range here starting with the first block ID of the group and ending with the last block

ID of the group. For this calculation, you can utilize the ID of the first data block given in the

superblock info, the "Maximum Number of Blocks Per Group" value, and the "Total Number of

Blocks" value since the last block might have less blocks than the others.

• "Block Bitmap Block ID" indicates the ID of the first block holding the block bitmap for this

group and it is directly available in the group descriptor of this group.

• "Inode Bitmap Block ID" indicates the ID of the first block holding the inode bitmap for this

group and it is directly available in the group descriptor of this group.

• "Inode Table Block ID" indicates the ID of the first block holding the inode table for this group

and it is directly available in the group descriptor of this group.

• "Number of Free Blocks" indicates the total number of free blocks for this group and it is directly

available in the group descriptor of this group.

• "Number of Free Inodes" indicates the total number of free inodes for this group and it is directly

available in the group descriptor of this group.

• "Number of Directories" indicates the total number of directories for this group and it is directly

available in the group descriptor of this group.

• "Free Block IDs" indicates the IDs of the blocks that are free for this group. In order to find the

free blocks, you need to check the block bitmap bit by bit for every block ID included in this

group. You will not print all the IDs one by one, you will print them using ranges. For example,

if the free block IDs are 1,2,3,4,7,9,10,11,12,13,14,15; then you will only print 1-4, 7, 9-15.

• "Free Inode IDs" indicates the IDs of the inodes that are free for this group. In order to find the

free inodes, you need to check the inode bitmap bit by bit for every inode ID included in this

group and print ranges as in the "Free Block IDs" case. One important issue here is that the

first inode of the filesystem has the ID 1, not 0.

• You can use unsigned char arrays to store a bitmaps.

-Root Directory Entries- should print the directory entries for the root directory only (not

for the sub-directories) and you can utilize the following tips to find out these values:

• Root directory is always stored in the inode ID 2 (remember, inode IDs start from 1 instead of

0). You need to retrieve this inode and check the content of its data blocks to reach the directory

entries of the root directory. Checking only the first data block of the root inode (i_block[0])

will be sufficient for this project since our virtual disk does not have so many files/subdirectories

in the root directory. In the first data block of the root inode, you can find all necessary directory

entries structured in the ext2_dir_entry2 format and considering their length (rec_len),

you can iterate through them.

• "Inode" indicates the inode number of the entry and it is directly available in the related directory

entry for every iteration.

• "Entry Length" indicates the length of this directory entry in bytes and it is directly available in

the related directory entry. This length will be used to determine the starting position of the next

directory entry.

• "Name Length" indicates the length of the file name in characters and it is directly available in

the related directory entry.

• "File Type" indicates the type of the file (1: Regular, 2: Directory etc.) and it is directly available

in the related directory entry. It is sufficient to print the integer value.

• "Name" indicates the name of the file/subdirectory and it is directly available in the related

directory entry.

2.2 Extensions

First, you can also extend your program so that it can print the directory entries for the entire filesystem, not only for the root directory. In addition, you can add the functionality of printing the actual

content of the file given a file name. Finally, you can extend your program to work with other file

systems as well, such as ext3, ext4, xfs, etc. Note that these extensions are not mandatory and will

not be tested.

Instead of using a virtual disk, you can also work on a real disk and directly open a partition and

analyse the file system sitting on that partition using the read system call. For instance, line 19 of the

above code can be replaced with:

fd = open("/dev/sda", O_RDONLY);

to analyse the partition "sda". However, you should make sure that the device file you opened is not

modified. Modifying the device file can cause file system corruption. Alternatively, you can create

the image of a partition into a binary file using the dd command and work on the virtual disk that you

created as we do in this project. Please read the man page of the dd command carefully if you want

to use it.

3 Development

You will develop your program in a Unix environment using the C programming language. gcc will

be used as the compiler. You will be provided a Makefile and your program should compile without

any errors/warnings using this Makefile. Black-box testing will be applied but we will not provide

a sample black-box testing script this time. You are free to write your own testing scripts if you

find them useful; however, you will not submit any testing code that you wrote. In order to check

the correctness of the values that your program generates, you can use dumpe2fs command of

Linux, which will dump the superblock and the block group information given the virtual disk file as

a parameter as follows:

dumpe2fs ext2disk_100mb.img

Besides the superblock and the group information, if you want to learn the content of the virtual disk

to be able to check the root directory entries, you can mount it to your file system using the mount

command as follows:"

sudo mount ext2disk_100mb.img /mnt/ext2disk_100mb -t ext2 -o loop,ro,noexec

Before mounting, make sure that you created a mount point:

sudo mkdir /mnt/ext2disk_100mb

In order to unmount, you can use the umount command as follows:

sudo umount /mnt/ext2disk_100mb

and after unmounting, you can delete the mount point that you previously created:

sudo rmdir /mnt/ext2disk_100mb

Note that for this project, it is very easy to hard-code the correct values to be printed into

your program and pass our tests without even analyzing the virtual disk. However, you should

understand that this will be considered as plagiarism, and you will receive 0 from this project.

You are also not allowed to use ext2 analyzing/parsing tools (such as dumpe2fs, etc.) and their

libraries inside your program. You will walk through the ext2 on-disk structures with your own

C code. You can only use such tools to compare your program output, test, and debug your

program. You are allowed and encouraged to use the ext2 memory data structures provided by

the e2fslibs library.

4 Submission

Work will be penalized 5 points out of 100 if the submission instructions are not followed. In

addition, memory leaks will be penalized with a total of 5 points without depending on the amount

of the leak. Similarly, compilation warnings will also be penalized with a total of 5 points without

depending onthe type or the number of warnings. You can check the compilation warnings using the

-Wall flag ofgcc.

4.1 What to Submit

1. fsa.c: including the source code of your program.

2. README.txt including the following information:

• Did you fully implement the project as described? If not, what parts are not implemented

at all or not implemented by following the specified implementation rules? Please note

that for this project, it is very easy to print out the required output to the screen without

analyzing the virtual disk yourself. This will be considered as plagiarism!

5 Grading

Grading of your program will be done by an automated process. Your programs will also be passed

through a copy-checker program which will examine the source codes if they are original or copied.

We will also examine your source file(s) manually to check if you followed the specified implementation rules, if any.

(5/5)

Use CA10RAM to get 10%* Discount.

I need to create a file system analysis tool that will analyze partitions. In this project, you will write a file system analysis tool called fsa, which can analyze partitions formatted with the ext2 file system.

ANSWER ALL QUESTIONS

Attachments:

Instructions Files

Related Questions

. Introgramming & Unix Fall 2018, CRN 44882, Oakland University Homework Assignment 6 - Using Arrays and Functions in C

. The standard path finding involves finding the (shortest) path from an origin to a destination, typically on a map. This is an

. Develop a program to emulate a purchase transaction at a retail store. This program will have two classes, a LineItem class and a Transaction class. The LineItem class will represent an individual

. SeaPort Project series For this set of projects for the course, we wish to simulate some of the aspects of a number of Sea Ports. Here are the classes and their instance variables we wish to define:

. Project 2 Introduction - the SeaPort Project series For this set of projects for the course, we wish to simulate some of the aspects of a number of Sea Ports. Here are the classes and their instance variables we wish to define:

Ask This Question To Be Solved By Our ExpertsGet A+ Grade Solution Guaranteed

Our Experts

Um e HaniScience

Muhammad Ali HaiderFinance

Husnain SaeedComputer science

Atharva PatilComputer science

Sun	Mon	Tue	Wed	Thu	Fri	Sat
30	31	1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	1	2	3

Other Services

I need to create a file system analysis tool that will analyze partitions. In this project, you will write a file system analysis tool called fsa, which can analyze partitions formatted with the ext2 file system.

ANSWER ALL QUESTIONS

Attachments:

Instructions Files

Related Questions

. Introgramming & Unix Fall 2018, CRN 44882, Oakland University Homework Assignment 6 - Using Arrays and Functions in C

. The standard path finding involves finding the (shortest) path from an origin to a destination, typically on a map. This is an

. Develop a program to emulate a purchase transaction at a retail store. This program will have two classes, a LineItem class and a Transaction class. The LineItem class will represent an individual

. SeaPort Project series For this set of projects for the course, we wish to simulate some of the aspects of a number of Sea Ports. Here are the classes and their instance variables we wish to define:

. Project 2 Introduction - the SeaPort Project series For this set of projects for the course, we wish to simulate some of the aspects of a number of Sea Ports. Here are the classes and their instance variables we wish to define:

Ask This Question To Be Solved By Our ExpertsGet A+ Grade Solution Guaranteed

Our Experts

Um e HaniScience

Muhammad Ali HaiderFinance

Husnain SaeedComputer science

Atharva PatilComputer science