
Doesn’t it also help with branch prediction since the unrolled loop can use different statistics with each copy?
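To make the question concrete, here is a minimal sketch in C of what "each copy gets its own statistics" would mean. The function names and the alternating input pattern are made up for illustration. It assumes a simple predictor that indexes its history by branch address; many modern predictors also use global history and may handle the rolled loop just as well, and a compiler may turn either version into branchless or vectorized code, so treat this as an illustration rather than a benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled loop: a single branch site sees every element.
 * With input 1,0,1,0,... that one branch alternates taken/not-taken. */
size_t count_rolled(const int *flags, size_t n) {
    assert(flags != NULL || n == 0);
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (flags[i]) hits++;          /* the only data-dependent branch */
    }
    return hits;
}

/* Unrolled by 2: two branch sites at different addresses.
 * With the same input, site A always sees 1 and site B always sees 0,
 * so a per-address predictor could track each one separately. */
size_t count_unrolled2(const int *flags, size_t n) {
    assert(flags != NULL || n == 0);
    size_t hits = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        if (flags[i])     hits++;      /* site A: even indices */
        if (flags[i + 1]) hits++;      /* site B: odd indices  */
    }
    if (i < n && flags[i]) hits++;     /* leftover element when n is odd */
    return hits;
}
```

Both functions return the same count; the only difference is how many distinct branch instructions the hardware gets to observe.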


Non-overlapping sub-problems can safely be parallelized and executed out of order.

On some architectures, both sides of a branch are executed in parallel, and one result is simply discarded once the operations it depends on have finished. We can't be sure exactly how branch predictors and prefetchers are implemented, since those details fall under manufacturer NDAs. =3
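As a rough software analogy for "run both sides, throw one away", here is a sketch in C. It is not how any particular CPU implements it (that part is not public), and the function names and the two arithmetic paths are invented for the example. It just computes both paths unconditionally and picks one with a mask.

```c
#include <stdint.h>

/* Ordinary version: only one path runs, chosen by a branch. */
int32_t pick_branchy(int cond, int32_t a, int32_t b) {
    if (cond) {
        return a + 1;
    }
    return b * 2;
}

/* "Both paths" version: compute each result, then discard one.
 * mask is all ones when cond is nonzero and all zeros otherwise. */
int32_t pick_both(int cond, int32_t a, int32_t b) {
    int32_t then_val = a + 1;                 /* path 1, always computed */
    int32_t else_val = b * 2;                 /* path 2, always computed */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    uint32_t kept = ((uint32_t)then_val & mask) | ((uint32_t)else_val & ~mask);
    return (int32_t)kept;                     /* the other value is dropped */
}
```

This only works when both paths are free of side effects and cheap enough that the wasted work costs less than a misprediction would.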



