admin管理员组文章数量:1307444
I am running into a very interesting dilemma while learning Huffman compression. I am currently studying in a course that uses zyBooks and the examples are absolutely terrible in my opinion. In 2 examples of 2 separate steps of building the Huffman tree there is an issue regarding which order the characters should be in when the dictionary contains many characters which have a frequency of 1. In 1.8.10, using the string "APPLES AND BANANAS", the characters that share a frequency of 1 are ordered in "order of appearance", yet in 1.8.13, the characters are in reverse order of appearance. Can anyone help me understand which would be the correct method as, due to this obfuscation, I am getting questions wrong at the end of the section.
I have attempted the questions at the end of the section to build the tree on paper (notice the book is not written language-specific). The book uses very short strings so its "easy" enough to do, and I want to understand the logic before I let the computer do the work for me using the code.
I am running into a very interesting dilemma while learning Huffman compression. I am currently studying in a course that uses zyBooks and the examples are absolutely terrible in my opinion. In 2 examples of 2 separate steps of building the Huffman tree there is an issue regarding which order the characters should be in when the dictionary contains many characters which have a frequency of 1. In 1.8.10, using the string "APPLES AND BANANAS", the characters that share a frequency of 1 are ordered in "order of appearance", yet in 1.8.13, the characters are in reverse order of appearance. Can anyone help me understand which would be the correct method as, due to this obfuscation, I am getting questions wrong at the end of the section.
I have attempted the questions at the end of the section to build the tree on paper (notice the book is not written language-specific). The book uses very short strings so its "easy" enough to do, and I want to understand the logic before I let the computer do the work for me using the code.
Share Improve this question asked Feb 2 at 16:14 TAW_WarDriverTAW_WarDriver 111 silver badge2 bronze badges2 Answers
Reset to default 3Either choice results in an optimal code. There is no "correct" choice when there are two or more tied frequencies, or when there is more than one second-lowest frequency choice (e.g. 1 2 2). Sometimes with different choices you can get codes with topologically different trees. Those are all still valid and equally optimal. This answer shows many different trees from the same set of frequencies. There is also no "correct" choice in assigning 0's and 1's to each left or right branch. Either way for each branch is fine.
As for the questions, it appears that in both cases the trees are provided. You are not being asked to recreate them. Though the second question implicitly assumes that 0 is assigned to the left branches and 1 to the right branches.
Huffman coding uses a greedy approach and takes the smallest frequency elements to merge at every step. Also, it's an application of OMP (Optimal Merge Patterns). You can check it out if you want. The order of merging doesn't really matter. What matters is you merge the smallest frequency items at each step. Here is the first example
Different merge orders of 1.8.9
So, the set of frequencies is {A:3, B:1, S:1, N:2}. Here are the steps to build the tree:
Step1: Choose the nodes with the smallest frequencies (B:1 and S:1) and merge them. Now, if there were more than 2 minimum frequencies, you can combine either of them (if 3 frequencies of '1', then 3 choices of merging). Ofcourse the codes would be different for different merge orders. You can include some restrictions to limit the merge orders.
Step2 Remove the merged nodes from the set and add the new node. Then, repeat these steps until all the characters are merged. In the example, there are 2 merge orders with different codes.
After building the tree, label the left edges and right edges as 0 and 1 respectively. The path from the root to the respective characters will tell you the huffman code. Again, the order of 0 and 1 can be different for left and right subtree edges.
In example 1, codes will be {A->0, B->110, S->111, N->10}. In example 2, codes will be {A->1, B->000, S->001, N->01}
Both codes are correct.
After labelling the left and right subtree edges as '0' and '1' respectively in the example 1.8.10, the code of A would be 11 and so on. Say, if 11 wasn't in the options, then we'd assume that the right subtree's edge is assigned 0 and left edge as 1. Then the answer would have been 00
Hope this helps...
本文标签: algorithmOrder of Huffman List when there are many characters with the same frequencyStack Overflow
版权声明:本文标题:algorithm - Order of Huffman List when there are many characters with the same frequency - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1741841841a2400548.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论