
Out of curiosity, what types of applications are you running where HT hurts performance?


Sparse matrix kernels and finite element/volume integration. For bandwidth-limited operations, it is sometimes possible to get better performance by using fewer threads than physical cores because the memory bus is already saturated (for examples, see the STREAM benchmark results). For dense kernels, I'm usually shooting for around 70 percent of peak flop/s, and any performance shortcomings come from required horizontal vector operations, data dependence, and multiply-add imbalance. These are not things that HT helps with.
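To make the bandwidth-limited case concrete, here is a minimal sketch of a STREAM-style triad in C. It is not the official STREAM code; the function names (triad, triad_bytes, bandwidth_gbs, wall_seconds, measure_triad_gbs) are made up for illustration, and the byte count follows the usual STREAM convention of two loads and one store per element, which ignores write-allocate traffic.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* STREAM-style triad: a[i] = b[i] + scalar * c[i].
 * The pragma only takes effect when compiled with OpenMP enabled;
 * otherwise the loop runs serially. */
void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double scalar, size_t n)
{
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}

/* Bytes counted for one triad pass: two loads and one store per element
 * (assumption: STREAM convention, write-allocate traffic not counted). */
double triad_bytes(size_t n)
{
    return 3.0 * sizeof(double) * (double)n;
}

/* Convert bytes moved in a given time to GB/s (1 GB = 1e9 bytes). */
double bandwidth_gbs(double bytes, double seconds)
{
    return bytes / seconds / 1e9;
}

/* Wall-clock time in seconds (C11 timespec_get). */
double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Hypothetical helper: run the triad `reps` times on arrays of length n
 * and return the best observed bandwidth in GB/s (0.0 on failure). */
double measure_triad_gbs(size_t n, int reps)
{
    if (n == 0)
        return 0.0;

    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    volatile double sink = 0.0; /* keeps the stores to a[] observable */
    double best = 0.0;

    if (a && b && c) {
        for (size_t i = 0; i < n; i++) {
            a[i] = 0.0;
            b[i] = 1.0;
            c[i] = 2.0;
        }
        for (int r = 0; r < reps; r++) {
            double t0 = wall_seconds();
            triad(a, b, c, 3.0, n);
            double dt = wall_seconds() - t0;
            sink = a[n - 1];
            if (dt > 0.0) {
                double gbs = bandwidth_gbs(triad_bytes(n), dt);
                if (gbs > best)
                    best = gbs;
            }
        }
    }
    (void)sink;
    free(a);
    free(b);
    free(c);
    return best;
}
```

Assuming an OpenMP-capable compiler, you would build this with something like cc -O2 -fopenmp, call measure_triad_gbs with arrays much larger than the last-level cache, and repeat the run with different OMP_NUM_THREADS values. On a bandwidth-bound machine the reported GB/s should stop improving before the thread count reaches the number of physical cores, but where exactly it flattens depends on the hardware.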

Additionally, HT affects benchmark reproducibility, which is already bad enough on multicore x86 with NUMA, virtual memory, and funky networks. (Compare to Blue Gene, which is also multicore but uses no TLB (virtual addresses are offset-mapped to physical addresses), has almost independent memory bandwidth per core, and has a better network.)



