I found this interesting article Range Minimum Query and Lowest Common Ancestor after I read about BIT on the same site (it's the best explanation of how BIT works comparing to other articles imo, so I looked for some other interesting stuff).

After I read through it, I decided to implement it. I posted it here.

It’s a nice article, except that the last section “AN <O(N), O(1)> ALGORITHM FOR THE RESTRICTED RMQ” is very confusing. The last two sentences of the first paragraph say that

"It’s obvious that elements in A can be just +1 or -1. Notice that the old value of A[i] is now the sum of A[1], A[2] … A[i] plus the old A[0]. However, we won’t need the old values from now on."

But then in the next paragraph, it says

"Let A’[i] be the minimum value for the i-th block in A and B[i] be the position of this minimum value in A."

It doesn't make much sense, since A is just a +1 -1 array, what's the point of finding the minimum in this array? I thought this must be a mistake, and I searched for some clarification.

There're not many discussions on this topic, probably because it's not quite practical. But from the few pages that I found, I confirmed my suspicion and figured out how it works, and it worked as expected.

Is it an overkill? Most likely. Having an O(log N) speedup - and it's only on the preprocessing time - is not a very big impact, and it's much more complicated than a simple ST.

Nonetheless, it's already surprising that it can be done at all! And I learned something new.

Someone mentioned that the algorithm can be simplified. Well, it already took me a weekend, so I won't spend more time on it. But I'll take a look of that paper when I have some spare time...