M1091 description:
  Given an n x n binary matrix grid, return the length of the shortest clear path in the matrix. If there is no clear path, return -1.
  A clear path in a binary matrix is a path from the top-left cell (i.e., (0, 0)) to the bottom-right cell (i.e., (n - 1, n - 1)) such that:
  All the visited cells of the path are 0.
  All the adjacent cells of the path are 8-directionally connected (i.e., they are different and they share an edge or a corner).
  The length of a clear path is the number of visited cells of this path.
Constraints:
  n == grid.length
  n == grid[i].length
  1 <= n <= 100
  grid[i][j] is 0 or 1

Apparently this is a bfs problem. The concept is simple, but there’re a few places we can optimize. This code breaks both the speed and memory record at the same time:

#pragma GCC optimize("Ofast","inline","-ffast-math")
#pragma GCC target("avx,mmx,sse2,sse3,sse4")
static const int SZMAX=10000;
int sz;
struct Co{
    int8_t x;
    int8_t y;
    Co operator+(const Co& other){
        return Co{int8_t(x+other.x),int8_t(y+other.y)};
    }
};
Co q[SZMAX];
Co* q2;
const Co dir[8]={ {0,1},{0,-1},{1,0},{-1,0},{1,1},{1,-1},{-1,1},{-1,-1} };
class Solution {
    int checkCo(Co co,const vector<vector<int>>& grid){
        if(co.x<0||co.x>=sz||co.y<0||co.y>=sz) return 1;
        else return grid[co.x][co.y];
    }
    int q1l,q1r,q2l,q2r;
    int l1,l2;
    int processQ(vector<vector<int>>& grid,int which){
        if(which==1){
            int qsz=q1r-q1l;
            while(qsz--){
                auto& cur=q[q1l++];
                for(int i=0;i<8;++i){
                    auto nxt=cur+dir[i];
                    auto v=checkCo(nxt,grid);
                    if(v>1) return -l1+v-2;
                    if(v==0){q[q1r++]=nxt;grid[nxt.x][nxt.y]=l1;}
                }
            }
            l1--;
        }
        else{
            int qsz=q2r-q2l;
            while(qsz--){
                auto& cur=q2[q2r--];
                for(int i=0;i<8;++i){
                    auto nxt=cur+dir[i];
                    auto v=checkCo(nxt,grid);
                    if(v<0) return l2-2-v;
                    if(v==0){q2[q2l--]=nxt;grid[nxt.x][nxt.y]=l2;}
                }
            }
            l2++;
        }
        return 0;
    }
public:
    int shortestPathBinaryMatrix(vector<vector<int>>& grid) {
        static auto _=[](){ios_base::sync_with_stdio(false);cin.tie(0);cout.tie(0);return 0;}();
        sz=grid.size();
        if(grid[0][0]||grid[sz-1][sz-1]) return -1;
        if(sz==1) return 1;
        grid[0][0]=-1;grid[sz-1][sz-1]=2;
        l1=-2;l2=3;
        q1l=0;q1r=1;q2l=-1;q2r=0;
        q2=&q[0]+SZMAX-1;
        q[0]=Co{0,0};*q2=Co{int8_t(sz-1),int8_t(sz-1)};
        while(q1l<q1r&&q2l<q2r){
            int r=(q1r-q1l)<=(q2r-q2l)?processQ(grid,1):processQ(grid,2);
            if(r) return r;
        }
        return -1;
    }
};

It combines a few tricks to increase speed. One of them is the return value for out of range inputs to checkCo() matches the value of blocked cells, so it only takes one condition check instead of two, as I mentioned before. Another is, we can do bfs on either side, but if we start from both sides and proceed with the one that has less elements in the queue, it may save space and time. The extreme case would be a tree, starting from a leaf would be much more efficient than starting from the root. We can use two queues for that, but considering that the maximum total elements in the two queues is known, we can fit both of them in one array, where one queue moves in increasing address direction while the other moves in the opposite direction.


M2101 description:
  You are given a list of bombs. The range of a bomb is defined as the area where its effect can be felt. This area is in the shape of a circle with the center as the location of the bomb.
  The bombs are represented by a 0-indexed 2D integer array bombs where bombs[i] = [xi, yi, ri]. xi and yi denote the X-coordinate and Y-coordinate of the location of the ith bomb, whereas ri denotes the radius of its range.
  You may choose to detonate a single bomb. When a bomb is detonated, it will detonate all bombs that lie in its range. These bombs will further detonate the bombs that lie in their ranges.
  Given the list of bombs, return the maximum number of bombs that can be detonated if you are allowed to detonate only one bomb.
Constraints:
  1 <= bombs.length <= 100
  bombs[i].length == 3
  $1 <= xi, yi, ri <= 10^5$

So we are given a directed graph, and our task is to find the largest number of vertices that can be reached from a single vertex. At first I thought there must be an $O(|V|+|E|)$ algorithm, because I thought the difficulty is dealing with cycles, and there are algorithms that can find strongly connected components in $\Theta(|V|+|E|)$. After that we have a DAG, which should be easier to deal with, right? But the answer is no, which is pointed out in this paper which is posted in the discussion. A DAG doesn’t make it any easier. The only possible direction one may take to lower the complexity is probably to utilize the property of circles on 2D Euclidean plane, which is not an arbitrary directed graph. But I haven’t found any exploit of that. So we may just simply do a dfs on every vertex to find the maximum.

#pragma GCC optimize("Ofast","inline","-ffast-math")
#pragma GCC target("avx,mmx,sse2,sse3,sse4")
class Solution {
    int8_t adj[100][100];
    int8_t by[100];
    int8_t dfs(int8_t i,int8_t boom){
        int8_t r=1;
        by[i]=boom;
        for(int j=0;j<adj[i][99];j++){
            if(by[adj[i][j]]!=boom) r+=dfs(adj[i][j],boom);
        }
        return r;
    }
public:
    int maximumDetonation(const vector<vector<int>>& bombs) {
        static auto _=[](){ios_base::sync_with_stdio(false);cin.tie(0);cout.tie(0);return 0;}();
        const int8_t sz=bombs.size();
        for(int8_t i=1;i<sz;++i){
            long long x1=bombs[i][0],y1=bombs[i][1],r1=bombs[i][2];
            r1*=r1;
            for(int8_t j=0;j<i;++j){
                long long dx=x1-bombs[j][0],dy=y1-bombs[j][1];
                long long dis=dx*dx+dy*dy;
                long long r2=bombs[j][2];
                if(dis<=r1) {auto& _=adj[i][99];adj[i][_++]=j;}
                if(dis<=r2*r2) {auto& _=adj[j][99];adj[j][_++]=i;}
            }
        }
        int8_t m=0;
        memset(by,-1,sz);
        for(int8_t i=0;i<sz;++i){
            if(by[i]==-1) m=max(m,dfs(i,i));
        }
        return m;
    }
};

Here I used a 2d array instead of a 2d vector to store the adjacent list, just to make it a little faster. The last element in each row stores the size. The by[] array stores which bomb the current one is detonated by.


E1232 description:
  You are given an array coordinates, coordinates[i] = [x, y], where [x, y] represents the coordinate of a point. Check if these points make a straight line in the XY plane.
Constraints:
  2 <= coordinates.length <= 1000
  coordinates[i].length == 2
  $-10^4 <= coordinates[i][0], coordinates[i][1] <= 10^4$
  coordinates contains no duplicate point.

The problem is indeed very easy, but I found quite a few bad codes on the submission page. Let’s take a look:

Sample 0 ms submission
class Solution {
public:
    bool checkStraightLine(vector<vector<int>>& coordinates) {
        float m,c;
        int v=0;
        for(int i =0;i<coordinates.size();i++){  
            int k=coordinates[0][0];
            int t=coordinates[i][0];
            if(k==t)
              v++;
        }
        if(v==coordinates.size())
        return true;
        m=float(coordinates[0][1]-coordinates[1][1])/float(coordinates[0][0]-coordinates[1][0]);      
        for(int i=1;i<coordinates.size()-1;i++){                
            c=float(coordinates[i][1]-coordinates[i+1][1])/float(coordinates[i][0]-coordinates[i+1][0]);
            cout<<c<<" ";
            if(m==c){                    
             continue;
            }
            else{                
             return false;
            }
        }    
        return true;
     
    }
};

First, multiplication is better than division, and addition or subtraction is better than multiplication in most scenarios, as long as it doesn’t take too much more operations. So if it can be done with multiplications, we should do that, because 1. it avoids converting integers to floating point numbers, 2. it avoids dividing zero problems, 3. it’s faster, even though we may need to convert the integer type to a larger one.

Second, using == to check equality is a terrible idea due to roundoff errors. It may not present in this example, since it passed all tests, probably because the computation is just a single division. But if there’s one more step, e.g. multiplication or addition/subtraction, it may result in errors. Here’s an example, try this line cout<<((double)38/100+(double)65/200);, what do you get? 0.705. Now try this line cout<<((double)38/100+(double)65/200==0.705);. What do you get? 0! It’s false! Just how bad is it? Let’s test it with a simple program:

  int cnt=0;
  for(int i=0;i<100;++i){
      for(int j=0;j<200;++j){
          if((double)i/100+(double)j/200!=double(10*i+5*j)/1000) cnt++;
      }
  }
  cout<<cnt;

I got 4239 on multiple platforms. So out of 20000 operations, 4239 of the answers are wrong, more than 20%. So how do you check if two floating point numbers are equal? You may either 1. Express them as simplest form of fractions, but it’s only applicable to rational numbers and could be complicated, or 2. Use a criterion $\epsilon$, if the difference between two floating point numbers is less than $\epsilon$, they are considered equal. But numeric errors can still cause false positive or false negative in certain cases. So this is another reason of the first rule of thumb: use integer multiplications instead of floating point divisions if possible.

The third one is a small thing, instead of checking each pair of the last two points, it would be faster to use the first point and the last point to find the slope, because the first point doesn’t change, and it’s one less addition operation because when we read the $i$th index of an array we add the address and $i$, it would be faster to just read a number that has a fixed address.

Putting all these together, this is the code that I submitted:

class Solution {
public:
    bool checkStraightLine(const vector<vector<int>>& coordinates) {
        ios::sync_with_stdio(false);cin.tie(0);cout.tie(0);
        const int sz=coordinates.size();
        const long long x0=coordinates[0][0],y0=coordinates[0][1],dy01=coordinates[1][1]-y0,dx01=coordinates[1][0]-x0;
        for(int i=2;i<sz;++i){
            if((coordinates[i][1]-y0)*dx01!=dy01*(coordinates[i][0]-x0)) return false;
        }
        return true;
    }
};

It’s probably the most reliable and optimized way to solve this problem. If the last constraint, “coordinates contains no duplicate point.”, is loosened, the only extra step we need to take is to find a point that has different coordinate than the first one, then we start from there.