DS C8 Randomized Algorithm

What to Randomize?

  • Average-case Analysis: The world behaves randomly, randomly generated input solved by traditional algorithm and analysed.

  • Randomized Algorithms: The algorithm makes random decisions as the algorithm processes the worst-case input.

Algorithm Answer it yields
Efficient deterministic algorithms Always the correct answer a special case
Efficient randomized algorithms With high probability be the correct answer
Randomized algorithms always correct and run efficiently in expectation

Exp: The Hiring Problem

Problem description

We want to hire an office assistant from headhunter, and we interview a different candidate per day for NN days.

  • Interviewing Cost: CiC_i
  • Hiring Cost: ChC_h

Assume Ci<<ChC_i << C_h.

Naïve Solution

1
2
3
4
5
6
7
8
9
10
11
12
13
14
int Hiring ( EventType C[ ], int N ){   
/* candidate 0 is a least-qualified dummy candidate */
int Best = 0;
int BestQ = the quality of candidate 0;
for ( i=1; i<=N; i++ ) {
Qi = interview( i ); /* Ci */
if ( Qi > BestQ ) {
BestQ = Qi;
Best = i;
hire( i ); /* Ch */
}
}
return Best;
}

Worst case

The candidates come in increasing quality order, all will be hired.

What if ramdom order?

Let

Xi={1if candidate i is hired0,if candidate i is NOT hiredX_i = \begin{cases}1, if\ {\rm candidate\ i\ is\ hired}\\0, if\ {\rm candidate\ i\ is\ NOT\ hired}\end{cases}

Candidate i will be hired if i is the best one of Candidate 1,2,...,i. As any of first ii candidates is equally likely to be best-qualified so far,

P[Xi=1]=1i,E[Xi]=1iP[X_i = 1] = \frac{1}{i},\quad E[X_i] = \frac{1}{i}

Then XX​, the number of hires,

X=i=1NXiX = \sum_{i = 1}^{N}X_i

E[X]=E[i=1NXi]=i=1NE[Xi]=i=1N1i=lnN+O(1)E[X] = E[\sum_{i = 1}^{N}X_i] = \sum_{i = 1}^{N}E[X_i] = \sum_{i = 1}^{N}\frac{1}{i} = \ln N + O(1)

The total cost will be

O(ChE[X]+NCi)=O(ChlnN+NCi)O(C_hE[X] + NC_i) = O(C_h\ln N + NC_i)

Randomized Algorithm

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
int RandomizedHiring ( EventType C[ ], int N ){   
/* candidate 0 is a least-qualified dummy candidate */
int Best = 0;
int BestQ = the quality of candidate 0;
/* The only modification: adding a random shuffle */
randomly permute the list of candidates;

for ( i=1; i<=N; i++ ) {
Qi = interview( i ); /* Ci */
if ( Qi > BestQ ) {
BestQ = Qi;
Best = i;
hire( i ); /* Ch */
}
}

  • The random shuffle guarantees the presentation order of the candidates is random.
  • But the shuffle takes time!

Randomized Permutation

  • Assign each element A[ i ] a random priority P[ i ]
  • Sort by P[i]
1
2
3
4
5
6
7
void PermuteBySorting ( ElemType A[ ], int N )
{
for ( i=1; i<=N; i++ )
A[i].P = 1 + rand()%(N3);
/* makes it more likely that all priorities are unique */
Sort A, using P as the sort keys;
}

Online Hiring Algorithm

  • interview the first kk candidates and mark the best value among then BESTK .
  • interview the remaining candidates and hire the first one that better than BESTK.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
int OnlineHiring ( EventType C[ ], int N, int k){
int Best = N;
int BestQ = - INT_MAX ;
for ( i=1; i<=k; i++ ) {
Qi = interview( i );
if ( Qi > BestQ ) BestQ = Qi;
}
for ( i=k+1; i<=N; i++ ) {
Qi = interview( i );
if ( Qi > BestQ ) {
Best = i;
break;
}
}
return Best;

Given k

Let SiS_i be subproblem such that the ithi^{th} candidate is the best. Then the algorithm works if

  • AA: the best one is at position i

    P[A]=1NP[A] = \frac{1}{N}

  • BB: no one at positions k+1 ~ i–1 are hired

    P[B]=ki1P[B] = \frac{k}{i - 1}

    Since the best of Candidate 1,2,...,i is in the fist kk candidates.

These two are independent.

P[Si]=P[AB]=P[A]P[B]=kN(i1)P[S_i] = P[AB] = P[A]P[B] = \frac{k}{N(i-1)}

Then

P[S]=i=k+1NP[Si]=i=k+1NkN(i1)=kNi=kN11iP[S] = \sum_{i = k+1}^{N}P[S_i] = \sum_{i = k+1}^{N}\frac{k}{N(i-1)} = \frac{k}{N}\sum_{i = k}^{N-1}\frac{1}{i}

And we get

kN1xdxi=kN11ik1N11xdx\int_k^N\frac{1}{x}{\rm d}x \le\sum_{i = k}^{N-1}\frac{1}{i} \le\int_{k-1}^{N-1}\frac{1}{x}{\rm d}x

Hence

kNkN1xdxkNi=kN11ikNk1N11xdx\frac{k}{N}\int_k^N\frac{1}{x}{\rm d}x \le\frac{k}{N}\sum_{i = k}^{N-1}\frac{1}{i} \le\frac{k}{N}\int_{k-1}^{N-1}\frac{1}{x}{\rm d}x

kNlnNkkNi=kN11ikNlnN1k1\frac{k}{N}\ln\frac{N}{k}\le\frac{k}{N}\sum_{i = k}^{N-1}\frac{1}{i} \le\frac{k}{N}\ln\frac{N-1}{k-1}

Let

f(k)=kNlnNkf(k)=1NlnNk+kNkN(Nk2)=1N(lnNk1)\begin{align*} f(k) &=\frac{k}{N}\ln\frac{N}{k}\\ f'(k) &= \frac{1}{N}\ln\frac{N}{k} + \frac{k}{N}\frac{k}{N}(-\frac{N}{k^2}) = \frac{1}{N}(\ln\frac{N}{k} - 1) \end{align*}

Then f(k)=0,k=Nef'(k) = 0,k = \frac{N}{e}

f(k)max=f(Ne)=1ef(k)_{max} = f(\frac{N}{e}) = \frac{1}{e}

The best kk is Ne\frac{N}{e}.

Exp: Quick Sort

RECALL

Deterministic Quicksort

  • worst-case running time: O(N2)O(N^2)
  • average case running time: O(NlogN)O(N\log N), assuming every input permutation is equally likely.

Central splitter

Central splitter is the pivot that divides the set so that each side contains at least n4\frac{n}{4} of nn items.

In Modified Quicksort, we always select a central splitter as pivot before recursions.

  • central splitter avoids the degeneration to the O(N2)O(N^2).

Time complexity Analysis

If a pivot is chosen randomly, then

P(find a central splitter)=12P({\rm find\ a\ central\ splitter}) = \frac{1}{2}

If we re-choose the pivot if it turns out that the last chosen pivot is not a central splitter, then

E(nfind a central splitter)=2E(n_{\rm find\ a\ central\ splitter}) = 2

It is easy to get that the whole tree makes it to the deepest depth if in every recursion, the pivot split nn items into n4\frac{n}{4} and 3n4\frac{3n}{4} items.

Let the subproblem SS be of Type j if

N(34)j+1SN(34)jN(\frac{3}{4})^{j+1} \le |S| \le N(\frac{3}{4})^j

and

ntype j×Size(j)Nn_{\rm type\ j}\times Size(j) \le N

where ntype jn_{\rm type\ j} is the number of subproblems of type j, and Size(j)Size(j) is the size of subproblem of type j. Then ntype j(43)j+1n_{\rm type\ j}\le(\frac{4}{3})^{j+1} .

Then

E[Ttype j]=O(N(34)j)×(43)j+1=O(N)E[T_{\rm type\ j}] = O(N(\frac{3}{4})^{j})\times(\frac{4}{3})^{j+1} = O(N)

N(34)j1N(\frac{3}{4})^j \ge 1

then the Number of different types is log43N\log_{\frac{4}{3}}N. The total time complexity is

T(N)=log43N×O(N)=O(NlogN)T(N) = \log_{\frac{4}{3}}N \times O(N) = O(N\log N)


DS C8 Randomized Algorithm
http://example.com/2023/05/30/DS-08/
Author
Tekhne Chen
Posted on
May 30, 2023
Licensed under