Big O notation drops the constant coefficient, and sometimes that coefficient is large enough that the O(N) algorithm only overtakes the O(N^2) one at input sizes in the billions.
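To put hypothetical numbers on it (these coefficients are mine, purely for illustration): if the O(N) version costs about c1·N operations and the O(N^2) version costs c2·N^2, they cross over at N = c1/c2. With c1 = 10,000 (hashing, allocation, pointer chasing per element) and c2 = 1, the "better" algorithm only starts winning past N = 10,000, and an even bigger ratio pushes the crossover out towards the billions.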
Premature optimisation is a massive trap: spending days hunting for a better algorithm is often not worth the time spent, since the worse algorithm was plenty good enough.
The real world beats algorithmic complexity surprisingly often: you spend ages building a complex data structure, with allocations scattered all over the heap, to get O(N), when it's significantly faster to just do the stupid thing over linear memory.
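A minimal sketch of that trade-off (my own construction, sizes made up): the same O(N) sum over a node-based std::map versus a flat std::vector holding the same pairs. The work per element is identical, only the memory layout differs.

```cpp
#include <chrono>
#include <cstdio>
#include <map>
#include <vector>

int main() {
    constexpr int N = 1'000'000;

    // "Clever": node-based tree, one heap allocation per element,
    // nodes scattered wherever the allocator put them.
    std::map<int, int> tree;
    for (int i = 0; i < N; ++i) tree[i] = i;

    // "Stupid": the same pairs in one contiguous block of memory.
    std::vector<std::pair<int, int>> flat;
    flat.reserve(N);
    for (int i = 0; i < N; ++i) flat.emplace_back(i, i);

    // Time an O(N) traversal of each container.
    auto time_sum = [](const char* label, auto&& sum) {
        auto t0 = std::chrono::steady_clock::now();
        long long s = sum();
        auto t1 = std::chrono::steady_clock::now();
        long long us = std::chrono::duration_cast<
            std::chrono::microseconds>(t1 - t0).count();
        std::printf("%s: sum=%lld in %lld us\n", label, s, us);
    };

    time_sum("map   ", [&] { long long s = 0; for (const auto& kv : tree) s += kv.second; return s; });
    time_sum("vector", [&] { long long s = 0; for (const auto& kv : flat) s += kv.second; return s; });
}
```

On typical hardware the flat loop wins by a large factor, simply because the vector streams through the cache while the map takes a pointer chase (and a likely cache miss) per node.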
There are many cases where O(N^2) will beat O(N) in practice.
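A classic instance, sketched under my own assumptions about input size: duplicate detection in a tiny array, the O(N^2) double loop versus the O(N) hash set. For small N the double loop over contiguous memory routinely wins, because the hash set pays for hashing and heap allocation before it does any useful work.

```cpp
#include <cstddef>
#include <cstdio>
#include <unordered_set>
#include <vector>

// O(N^2): nested loops over a contiguous array. No allocation,
// no hashing, and the whole array sits in L1 cache for small N.
bool has_duplicate_quadratic(const std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = i + 1; j < v.size(); ++j)
            if (v[i] == v[j]) return true;
    return false;
}

// O(N): asymptotically better, but it hashes every key and bounces
// through heap-allocated buckets on the way.
bool has_duplicate_linear(const std::vector<int>& v) {
    std::unordered_set<int> seen;
    seen.reserve(v.size());
    for (int x : v)
        if (!seen.insert(x).second) return true;
    return false;
}

int main() {
    std::vector<int> v{3, 1, 4, 1, 5};  // tiny input: the quadratic scan typically wins here
    std::printf("%d %d\n", has_duplicate_quadratic(v), has_duplicate_linear(v));
}
```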
Utilising the hardware (caches, prefetching, SIMD) can make a bigger difference than algorithmic complexity in many cases: vectorised code running over linear memory versus unvectorised code chasing data scattered around the heap.
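To make that concrete, a sketch of my own: the same reduction written over a contiguous array, which compilers can typically turn into SIMD instructions, versus chasing pointers to individually heap-allocated values, which defeats both vectorisation and the prefetcher.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Contiguous data: one linear pass. Compilers will typically
// auto-vectorise this into SIMD adds at -O2/-O3, and the hardware
// prefetcher streams the array in ahead of the loop.
std::int64_t sum_linear(const std::vector<int>& data) {
    std::int64_t s = 0;
    for (int x : data) s += x;
    return s;
}

// Same values, each behind its own heap allocation. Every iteration
// is a dependent pointer load: no vectorisation, and likely a cache
// miss per element.
std::int64_t sum_scattered(const std::vector<std::unique_ptr<int>>& data) {
    std::int64_t s = 0;
    for (const auto& p : data) s += *p;
    return s;
}

int main() {
    std::vector<int> flat(1 << 20, 1);
    std::vector<std::unique_ptr<int>> scattered;
    scattered.reserve(flat.size());
    for (int x : flat) scattered.push_back(std::make_unique<int>(x));
    return sum_linear(flat) == sum_scattered(scattered) ? 0 : 1;
}
```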