Introduction

  • Optimization Problems: Given a set of constraints and an optimization function. Solutions that satisfy the constraints are called feasible solutions. A feasible solution for which the optimization function has the best possible value is called an optimal solution.
  • The Greedy Method: Make the best decision at each stage, under some greedy criterion. A decision made in one stage is not changed in a later stage, so each decision should assure feasibility.

Note:

  • Greedy algorithm works only if the local optimum is equal to the global optimum.
  • Greedy algorithm does not guarantee optimal solutions. However, it generally produces solutions that are very close in value (heuristics) to the optimal, and hence is intuitively appealing when finding the optimal solution takes too much time.

Activity Selection Problem

  • Given a set of activities S=a1,a2,,anS = { a_1, a_2, \cdots, a_n } that wish to use a resource. Each aia_i takes place during a time interval [si,fi)[s_i, f_i).
  • Activities aia_i and aja_j are compatible if sifjs_i \geq f_j or sjfis_j \geq f_i (i.e. their time intervals do not overlap).
  • Target: Select a maximum-size subset of mutually compatible activities.

Greedy criterion

Greedy Rule 1

  • Select the interval which starts earliest (but not overlapping the already chosen intervals)

  • Counterexample:

    image-20210511195901250

Greedy Rule 2

  • Select the interval which is the shortest (but not overlapping the already chosen intervals)

  • Counterexample:

    image-20210511195945399

Greedy Rule 3

  • Select the interval with the fewest conflicts with other remaining intervals (but not overlapping the already chosen intervals)

  • Counterexample:

    image-20210511200017780

Greedy Rule 4

  • Select the interval which ends first or starts latest (but not overlapping the already chosen intervals)

  • Resource become free as soon as possible

  • Correctness:

    • Algorithm gives non-overlapping intervals
    • The result is optimal

[Theorem] Consider any nonempty subproblem SkS_k, and let ama_m be an activity in SkS_k with the earliest finish time. Then ama_m is included in some maximum-size subset of mutually compatible activities of SkS_k.

  • Proof:

    • Let AkA_k be the optimal solution set, and aefa_{ef} is the activity in AkA_k with the earliest finish time.
    • If ama_m and aefa_{ef} are the same, we are done, else replace aefa_{ef} by ama_m and get AkA_k’.
    • Since fmfeff_m \leq f_{ef}, AkA_k’ is another optimal solution.
  • Implementation:

    • Select the first activity; Recursively solve for the rest.
    • Remove tail recursion by iterations.
    • T(N)=O(NlogN)T(N)=O(N\log N) to sort the end time or start time.

Elements of the Greedy Strategy

  • Cast the optimization problem as one in which we make a choice and are left with one subproblem to solve.
  • Prove that there is always an optimal solution to the original problem that makes the greedy choice, so that the greedy choice is always safe.
  • Demonstrate optimal substructure by showing that, having made the greedy choice, what remains is a subproblem with the property that if we combine an optimal solution to the subproblem with the greedy choice we have made, we arrive at an optimal solution to the original problem.

Note: Beneath every greedy algorithm, there is almost always a more cumbersome(麻烦的) dynamic-programming solution

Huffman Codes

image-20210511205532015

Note: If all the characters occur with the same frequency, then there are not likely to be any savings.

  • If character CiC_i is at depth did_i and occurs fif_i times, then the cost of the code = difi\sum d_i f_i .
image-20210511211252906 image-20210511211240494
  • The trick is: No code is a prefix of another.

  • Any sequence of bits can always be decoded unambiguously if the characters are placed only at the leaves of a full tree(all nodes are leaves or have two children), such kind of code is called prefix code.

  • Huffman’s Algorithm

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    void Huffman ( PriorityQueue  heap[ ],  int  C )
    { consider the C characters as C single node binary trees,
    and initialize them into a min heap;
    for ( i = 1; i < C; i++ ) {
    create a new node;
    /* be greedy here */
    delete root from min heap and attach it to left_child of node;
    delete root from min heap and attach it to right_child of node;
    weight of node = sum of weights of its children;
    /* weight of a tree = sum of the frequencies of its leaves */
    insert node into min heap;
    }
    }

    T=O(ClogC)T=O(C\log C)

Correctness

The greedy-choice property

[Lemma] Let C be an alphabet in which each character c \in C has frequency c.freq. Let x and y be two characters in C having the lowest frequencies. Then there exists an optimal prefix code for C in which the codewords for x and y have the same length and differ only in the last bit.

image-20210707191447772

  • Cost(T)Cost(T)Cost(T')\leq Cost(T)

The optimal substructure property

[Lemma] Let C be a given alphabet with frequency c.freq defined for each character c \in C. Let x and y be two characters in C with minimum frequency. Let C’ be the alphabet C with a new character z replacing x and y, and z.freq = x.freq + y.freq. Let T’ be any tree representing an optimal prefix code for the alphabet C’. Then the tree T, obtained from T’ by replacing the leaf node for z with an internal node having x and y as children, represents an optimal prefix code for the alphabet C.

image-20210707192044771