Huffman in Kotlin | Compress and Decompress | Decode

Huffman tree, also known as Huffman coding, is a data compression algorithm used to encode data in a way that reduces the amount of space required to store it. The algorithm was developed by David A. Huffman in 1952.

The idea behind Huffman coding is to use variable-length codes for different characters, with shorter codes used for more frequently occurring characters and longer codes used for less frequently occurring characters. This way, the total number of bits required to store the data can be reduced.

To create a Huffman tree, the frequency of occurrence of each character in the data is first calculated. The characters are then sorted based on their frequency, with the least frequent characters at the bottom. A binary tree is then constructed, with each leaf node representing a character and its frequency, and the internal nodes representing the sum of the frequencies of its child nodes.

Starting from the bottom of the tree, each leaf node is assigned a binary code by traversing up the tree to its root, where 0 is assigned to the left child and 1 to the right child. The resulting binary codes for each character are then concatenated to create the compressed data.

When the compressed data is decompressed, the binary codes are read one at a time and traversed down the Huffman tree until a leaf node is reached, at which point the corresponding character is output.

Kotlin code implements Huffman coding :

Kotlin
import java.util.*

class HuffmanNode(val char: Char, val freq: Int, val left: HuffmanNode? = null, val right: HuffmanNode? = null) : Comparable<HuffmanNode> {
    override fun compareTo(other: HuffmanNode): Int {
        return this.freq - other.freq
    }
}

HuffmanNode class represents a node in the Huffman tree. It has three properties: char for the character represented by the node, freq for the frequency of the character, and left and right for the left and right children of the node.

compareTo function is overridden to enable comparison of HuffmanNode objects based on their frequency.

Kotlin
fun buildHuffmanTree(charFreq: IntArray): HuffmanNode? {
    val priorityQueue = PriorityQueue<HuffmanNode>()

    for (i in 0 until charFreq.size) {
        if (charFreq[i] > 0) {
            priorityQueue.add(HuffmanNode(i.toChar(), charFreq[i]))
        }
    }

    while (priorityQueue.size > 1) {
        val left = priorityQueue.poll()
        val right = priorityQueue.poll()

        priorityQueue.add(HuffmanNode('\u0000', left.freq + right.freq, left, right))
    }

    return priorityQueue.poll()
}

buildHuffmanTree function takes an array of character frequencies as input and returns the root node of the Huffman tree. It first creates a PriorityQueue to store the HuffmanNode objects. Then, for each non-zero frequency in the input, it creates a new HuffmanNode object and adds it to the priority queue. Next, it repeatedly extracts the two nodes with the lowest frequency from the priority queue, creates a new parent node with their sum frequency, and adds the parent node back to the priority queue. This continues until there is only one node remaining in the priority queue, which is the root node of the Huffman tree.

Kotlin
fun buildCodeTable(root: HuffmanNode?, code: String, codeTable: MutableMap<Char, String>) {
    if (root?.left == null && root?.right == null) {
        codeTable[root?.char?:'\u0000'] = code
    } else {
        buildCodeTable(root.left, code + "0", codeTable)
        buildCodeTable(root.right, code + "1", codeTable)
    }
}

buildCodeTable function takes the root node of a Huffman tree, a string code representing the binary code for the current node, and a mutable map codeTable to store the binary codes for each character. It recursively traverses the Huffman tree and stores the binary code for each leaf node in the codeTable map. The binary code for a leaf node is obtained by concatenating the binary code for its parent node with either “0” or “1”, depending on whether the leaf node is a left or right child.

Kotlin
fun compress(input: String): String {
    val charFreq = IntArray(256)

    for (c in input) {
        charFreq[c.toInt()]++
    }

    val root = buildHuffmanTree(charFreq)
    val codeTable = mutableMapOf<Char, String>()
    buildCodeTable(root, "", codeTable)

    var output = ""
    for (c in input) {
        output += codeTable[c] ?: ""
    }

    return output
}

compress function takes a string input and returns a compressed binary string using Huffman coding. It first creates an array charFreq to store the frequency of each character in the input string. It then calls buildHuffmanTree to build a Huffman tree from the character frequencies and buildCodeTable to generate a binary code for each character. Finally, it loops through the input string, looks up the binary code for each character in the codeTable map, and concatenates them to the output binary string.

Kotlin
fun decompress(input: String, root: HuffmanNode?): String {
    var output = ""
    var node = root

    for (c in input) {
        node = if (c == '0') node?.left else node?.right

        if (node?.left == null && node?.right == null) {
            output += node?.char
            node = root
        }
    }

    return output
}

The decompress function takes a compressed binary string input and the root node of a Huffman tree, and returns the original uncompressed string. It loops through the input binary string and traverses the Huffman tree by following the path of “0” or “1” bits. When it reaches a leaf node, it adds the character represented by the node to the output string and resets the current node to the root node.

Kotlin
import java.util.*

class HuffmanNode(val char: Char, val freq: Int, val left: HuffmanNode? = null, val right: HuffmanNode? = null) : Comparable<HuffmanNode> {
    override fun compareTo(other: HuffmanNode): Int {
        return this.freq - other.freq
    }
}

fun buildHuffmanTree(charFreq: IntArray): HuffmanNode? {
    val priorityQueue = PriorityQueue<HuffmanNode>()

    for (i in 0 until charFreq.size) {
        if (charFreq[i] > 0) {
            priorityQueue.add(HuffmanNode(i.toChar(), charFreq[i]))
        }
    }

    while (priorityQueue.size > 1) {
        val left = priorityQueue.poll()
        val right = priorityQueue.poll()

        priorityQueue.add(HuffmanNode('\u0000', left.freq + right.freq, left, right))
    }

    return priorityQueue.poll()
}

fun buildCodeTable(root: HuffmanNode?, code: String, codeTable: MutableMap<Char, String>) {
    if (root?.left == null && root?.right == null) {
        codeTable[root?.char?:'\u0000'] = code
    } else {
        buildCodeTable(root.left, code + "0", codeTable)
        buildCodeTable(root.right, code + "1", codeTable)
    }
}

fun compress(input: String): String {
    val charFreq = IntArray(256)

    for (c in input) {
        charFreq[c.toInt()]++
    }

    val root = buildHuffmanTree(charFreq)
    val codeTable = mutableMapOf<Char, String>()
    buildCodeTable(root, "", codeTable)

    var output = ""
    for (c in input) {
        output += codeTable[c] ?: ""
    }

    return output
}

fun decompress(input: String, root: HuffmanNode?): String {
    var output = ""
    var node = root

    for (c in input) {
        node = if (c == '0') node?.left else node?.right

        if (node?.left == null && node?.right == null) {
            output += node?.char
            node = root
        }
    }

    return output
}

fun main() {
    val input = "Hello, world!"
    val compressed = compress(input)
    val root = buildHuffmanTree(IntArray(256).apply {
        input.forEach { this[it.toInt()]++ }
    })
    val decompressed = decompress(compressed, root)
    println("Input: $input")
    println("Compressed: $compressed")
    println("Decompressed: $decompressed")
}

In the main function, a sample input string is compressed using Huffman coding, then the compressed binary string is decompressed back to the original string. The input, compressed, and decompressed strings are then printed to the console.

Leave a Comment