Data Structures Hash Map Assignment, Assignments of Data Structures and Algorithms

Dokuz Eylül University Data Structures and Algorithms

Hash map implementation and exercises.

Typology: Assignments

2020/2021

Uploaded on 01/09/2021

volkan-ulker 🇹🇷


CME 2201 - Assignment 1

In this assignment, you are expected to index the words of a document named ‘story.txt’. You must read this file, split it word by word, and index each word in your hash table according to the rules given below.

Requirements

• The Java programming language and generic data types must be used.

• You must implement the basic functions of a classical hash table yourself (do not extend an existing Java hash map class directly).

• Object-oriented programming (OOP) principles must be applied.

• Exceptions must be handled wherever needed.

1. Main Functionalities

• put(Key k, Value v)

Read the given input file story.txt, count the occurrences of each word, and insert each word into the hash table with its count as the value.

• Value get(Key k)

Search for the given word (k) in the hash table. If the word is in the table, print its key, count, and index as shown below; otherwise, show a “not found” message to the user.

------ Output ------

Search: Ezgi

Key: 1243225

Count : 10

Index : 165

Search: Ali

Key: 68294842

Count : 3

Index : 132

Note: The results for “Ezgi” and “Ali” are only examples; you will not obtain the same values from the ‘story.txt’ file.

• resize(int capacity)

Make the hash table grow dynamically. The put method should double the current table size when the hash table reaches the maximum load factor. Use an initial table size of 997, and call the resize method according to two different maximum load factors (50% and 70%).
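The resize trigger described above can be sketched as follows. This is only an illustration under stated assumptions: the class `LoadFactorPolicy` and its methods are hypothetical names, the check runs after each insertion, and the actual bucket storage and rehashing are left out.

```java
/** Hypothetical helper that decides when a table should grow; not part of the assignment text. */
final class LoadFactorPolicy {
    private final double maxLoadFactor; // e.g. 0.5 or 0.7
    private int capacity;               // current table size
    private int size;                   // number of stored entries

    LoadFactorPolicy(int initialCapacity, double maxLoadFactor) {
        if (initialCapacity <= 0 || maxLoadFactor <= 0.0 || maxLoadFactor > 1.0) {
            throw new IllegalArgumentException("invalid capacity or load factor");
        }
        this.capacity = initialCapacity;
        this.maxLoadFactor = maxLoadFactor;
    }

    /** Records one new entry; returns true if the capacity was doubled as a result. */
    boolean recordInsert() {
        size++;
        if ((double) size / capacity >= maxLoadFactor) {
            capacity *= 2; // a real table would also rehash every stored entry here
            return true;
        }
        return false;
    }

    int capacity() { return capacity; }

    int size() { return size; }
}
```

With an initial capacity of 997, this sketch doubles the table on the 499th distinct word for a 50% load factor and on the 698th for 70%. Whether to resize before or after the insertion that crosses the threshold is a design choice the assignment leaves open.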


2. Hash Function

To compute the index for a given string key, first generate an integer hash code using a hash code function. The resulting hash code must then be mapped to the range 0 to N-1 using a compression function, such as the modulus operator (N is the size of the hash table).
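A minimal sketch of the compression step, assuming a hypothetical helper named `Compression.toIndex`. It uses `Math.floorMod` rather than the `%` operator because a Java `int` hash code can be negative (for example after overflow), and `%` would then produce a negative index.

```java
/** Illustrative compression step; the class and method names are assumptions. */
final class Compression {
    /** Maps any int hash code into the range 0..tableSize-1. */
    static int toIndex(int hashCode, int tableSize) {
        if (tableSize <= 0) {
            throw new IllegalArgumentException("tableSize must be positive");
        }
        // floorMod always returns a non-negative result for a positive divisor
        return Math.floorMod(hashCode, tableSize);
    }
}
```

For a table of size 997, a hash code of 1000 maps to index 3, and a hash code of -1 maps to index 996.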

You are expected to implement two different hash functions: the polynomial accumulation function and a hash function of your own.

Polynomial Accumulation Function (PAF)

The hash code of a string s can be generated by using the following polynomial:

$$h(s) = s_0 z^{n-1} + s_1 z^{n-2} + \cdots + s_{n-2} z + s_{n-1}$$

where $s_0$ is the leftmost character of the string, characters are represented as numbers from 1 to 26 (case insensitive), and n is the length of the string. The constant z is often chosen to be a prime number, although 33, 37, 39, and 41 are particularly good choices for English words. When z is chosen as 33, the string "car" has the following hash value:

$$h(\text{car}) = 3 \cdot 33^2 + 1 \cdot 33 + 18 = 3318$$

Note: Applying this calculation to long strings produces numbers that overflow. You may either ignore the overflow, or use Horner's rule to perform the calculation and apply the modulus operator after each step of the evaluation.
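The Horner's rule variant can be sketched as below. The class name `PolynomialHash`, the `modulus` parameter, and the decision to skip characters outside a-z are assumptions made for this illustration. Horner's rule rewrites the polynomial as repeated "multiply by z, add the next character", so for "car" with z = 33 the steps are 3, then 3·33 + 1 = 100, then 100·33 + 18 = 3318.

```java
/** Illustrative polynomial accumulation hash using Horner's rule; names are assumptions. */
final class PolynomialHash {
    /**
     * Computes the polynomial hash of word with constant z, reducing modulo
     * `modulus` after every step so that intermediate values never overflow.
     */
    static int hash(String word, int z, int modulus) {
        if (modulus <= 0) {
            throw new IllegalArgumentException("modulus must be positive");
        }
        long h = 0; // long keeps h * z exact before the reduction
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toLowerCase(word.charAt(i));
            if (c < 'a' || c > 'z') {
                continue; // assumption: characters outside a-z are ignored
            }
            int value = c - 'a' + 1;        // 'a' -> 1, ..., 'z' -> 26
            h = (h * z + value) % modulus;  // one Horner step plus reduction
        }
        return (int) h;
    }
}
```

Calling `PolynomialHash.hash("car", 33, 1_000_003)` gives 3318, matching the direct evaluation, while passing the table size 997 as the modulus gives 327 and so combines the hash code and compression steps.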

Your Own Hash Function (YHF)

You must implement your own hash code function for converting each word to an integer key: the input is the word, and the function returns the integer key. You must also implement the hash (compression) function yourself, which converts a key to a table index (address calculator).
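As one possible starting point, the sketch below uses the well-known 32-bit FNV-1a hash as the hash code and a modulus as the compression function. FNV-1a is a swapped-in published technique, not something the assignment prescribes, and the class and method names are hypothetical; a submission would be expected to design its own function.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative hash code + compression pair based on 32-bit FNV-1a; names are assumptions. */
final class Fnv1aHash {
    private static final int OFFSET_BASIS = 0x811C9DC5; // standard FNV-1a 32-bit offset basis
    private static final int PRIME = 0x01000193;        // standard FNV 32-bit prime

    /** Hash code: converts a word to an integer key (case sensitive in this sketch). */
    static int hashCodeOf(String word) {
        int h = OFFSET_BASIS;
        for (byte b : word.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF); // mix in the next byte
            h *= PRIME;      // int overflow wraps around, which FNV relies on
        }
        return h;
    }

    /** Compression: converts the integer key to an index in 0..tableSize-1. */
    static int indexOf(String word, int tableSize) {
        return Math.floorMod(hashCodeOf(word), tableSize);
    }
}
```

Unlike PAF, this sketch distinguishes upper and lower case; lowercasing the word first would make the two functions comparable on the same input.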

3. Collision Handling Approach

You are expected to implement a collision resolution technique based on open addressing. The insertion algorithm is as follows:

• Calculate the hash value and the initial index of the entry to be inserted.

• Search for a position linearly, starting from the initial index.

• While searching, keep track of the distance from the initial index, which is called the DIB (Distance from Initial Bucket).

• If an empty bucket is found, insert the new entry there together with its DIB value.

• If an entry with a smaller DIB than the candidate entry is encountered, swap them and continue the search with the displaced entry.
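The insertion steps above (a scheme commonly known as Robin Hood hashing) can be sketched as follows. The class `RobinHoodSketch`, its parallel arrays, and the `home` parameter (the initial index already computed by the hash function) are assumptions for illustration. The sketch also assumes the key is not already in the table; updating the count of an existing word is left out.

```java
/** Illustrative open-addressing table with DIB-based swapping; all names are assumptions. */
final class RobinHoodSketch {
    final String[] keys;  // null marks an empty bucket
    final int[] counts;   // value stored with each key
    final int[] dib;      // distance from initial bucket of each stored entry
    private int size;

    RobinHoodSketch(int capacity) {
        keys = new String[capacity];
        counts = new int[capacity];
        dib = new int[capacity];
    }

    /** Inserts a key that is assumed to be absent, starting at its initial index `home`. */
    void insert(String key, int count, int home) {
        if (size == keys.length) {
            throw new IllegalStateException("table is full");
        }
        int n = keys.length;
        int idx = Math.floorMod(home, n);
        int d = 0; // DIB of the entry currently being placed
        while (keys[idx] != null) {
            if (dib[idx] < d) {
                // The resident entry is closer to its home: swap and carry it onward.
                String tk = keys[idx]; int tc = counts[idx]; int td = dib[idx];
                keys[idx] = key; counts[idx] = count; dib[idx] = d;
                key = tk; count = tc; d = td;
            }
            idx = (idx + 1) % n; // linear probing with wrap-around
            d++;
        }
        keys[idx] = key; counts[idx] = count; dib[idx] = d;
        size++;
    }
}
```

For example, in a table of 8 buckets, inserting "a" and "b" with initial index 2, then "c" with initial index 3, then "d" with initial index 2 leaves "a", "b", "d", "c" in buckets 2 to 5 with DIB values 0, 1, 2, 2: "d" displaces "c" because "c" had a smaller DIB at bucket 4.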

[Figure: Step 3, final state after displacements and insertion]

For entry retrieval, an entry is located by linear probing from its initial index until it is encountered, or until an empty bucket is found, in which case the entry is not in the table. The search can also stop early if, during probing, a bucket is encountered whose entry has a DIB smaller than the distance probed so far: had the searched entry been in the table, it would have displaced that entry during insertion.
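A minimal sketch of this retrieval rule, assuming the table is stored as two hypothetical parallel arrays `keys` (with null for an empty bucket) and `dib`, and that `home` is the initial index already computed for the searched key. The early stop compares the DIB stored in the bucket with the number of probes made so far.

```java
/** Illustrative lookup for a table built with DIB-based swapping; names are assumptions. */
final class RobinHoodLookup {
    /** Returns the bucket index holding `key`, or -1 if the key is not in the table. */
    static int find(String[] keys, int[] dib, String key, int home) {
        int n = keys.length;
        int idx = Math.floorMod(home, n);
        for (int d = 0; d < n; d++) {      // d = distance probed so far
            if (keys[idx] == null) {
                return -1;                 // empty bucket: key cannot be further along
            }
            if (dib[idx] < d) {
                return -1;                 // early stop: the key would have displaced this entry
            }
            if (keys[idx].equals(key)) {
                return idx;
            }
            idx = (idx + 1) % n;           // linear probing with wrap-around
        }
        return -1;                         // probed every bucket without a match
    }
}
```

The early stop matters most for unsuccessful searches in a crowded table, where the next empty bucket may be far away.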

4. Performance Monitoring

You are expected to fill in the performance matrix (Table 1) by running your code under different conditions: two maximum load factors (50% and 70%) that trigger resizing of the hash table, and two hash functions (PAF and YHF).

For each condition, count the total number of collisions and measure the time taken to index the words in "story.txt". In addition, calculate the minimum, maximum, and average search times using the “search.txt” file, which contains 100 words to search for. Search time is the time taken to find a particular key in the hash table; it does not include the time spent printing output. To calculate the average search time, divide the total elapsed time by the total number of searched keys. You can use System.nanoTime() or System.currentTimeMillis() for timing.
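The search-time measurement can be sketched as below. The class `SearchTimer` and the `lookup` function parameter are assumptions for illustration; in practice `lookup` would wrap the table's own get method, and the word list would come from search.txt. Only the lookup call sits between the two `System.nanoTime()` readings, so printing is excluded from the measured time.

```java
import java.util.List;
import java.util.function.Function;

/** Illustrative min/max/average search timing; all names are assumptions. */
final class SearchTimer {
    final long minNanos;
    final long maxNanos;
    final long totalNanos;
    final int searches; // number of keys searched
    final int found;    // number of lookups that returned a non-null value

    private SearchTimer(long min, long max, long total, int searches, int found) {
        this.minNanos = min;
        this.maxNanos = max;
        this.totalNanos = total;
        this.searches = searches;
        this.found = found;
    }

    /** Times each lookup separately and collects the statistics. */
    static <K, V> SearchTimer measure(List<K> keys, Function<K, V> lookup) {
        if (keys.isEmpty()) {
            throw new IllegalArgumentException("no keys to search");
        }
        long min = Long.MAX_VALUE, max = 0, total = 0;
        int found = 0;
        for (K key : keys) {
            long start = System.nanoTime();
            V value = lookup.apply(key);            // only the search itself is timed
            long elapsed = System.nanoTime() - start;
            if (value != null) found++;
            min = Math.min(min, elapsed);
            max = Math.max(max, elapsed);
            total += elapsed;
        }
        return new SearchTimer(min, max, total, keys.size(), found);
    }

    /** Average search time: total elapsed time divided by the number of searched keys. */
    double averageNanos() {
        return (double) totalNanos / searches;
    }
}
```

Individual lookups can take well under a microsecond, so `System.nanoTime()` is the more useful of the two timers here; `System.currentTimeMillis()` is likely to report 0 for most single searches.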

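| Load Factor | Hash Function | Collision Count | Indexing Time | Avg. Search Time | Min. Search Time | Max. Search Time |
|-------------|---------------|-----------------|---------------|------------------|------------------|------------------|
| α=50%       | PAF           |                 |               |                  |                  |                  |
| α=50%       | YHF           |                 |               |                  |                  |                  |
| α=70%       | PAF           |                 |               |                  |                  |                  |
| α=70%       | YHF           |                 |               |                  |                  |                  |

Table 1. Performance matrix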

Provided Resources

• Document to index: story.txt

• Word list to use when calculating search times: search.txt

Due date

December 16, 2020, 23:

Submission

You must upload all of your ‘.java’ files as a single archive file (.zip or .rar) to the Sakai platform. The archive should be named ‘studentnumber_name_surname’ with the matching extension, e.g., 2007510011_Ali_Yılmaz.rar.

Prepare and upload a report with descriptions of your data structure, Java code, and performance matrix.

Plagiarism Control

Submissions will be checked for code similarity. Copied assignments will receive a grade of zero and will be announced on Sakai.

Grading Policy

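| Job                                                               | Percentage |
|-------------------------------------------------------------------|------------|
| Usage of Generic, OOP and Try-Catch                               | 20%        |
| Implementation of hash operations and collision handling approach |            |
| Performance monitoring                                            |            |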
