July 15th 20, 07:42 PM, posted to alt.windows7.general,alt.comp.os.windows-10,comp.sys.intel,comp.sys.ibm.pc.hardware.chips
VanguardLH[_2_]
Linux founder tells Intel to stop inventing 'magic instructions' and 'start fixing real problems'

Yousuf Khan wrote:

Linus Torvalds' comments came from this article: https://is.gd/6zpZRL


Full URL:
https://www.pcgamer.com/linux-founde...al-problems%2F

Linus is known for publishing his tirades on Windows, and even on Linux
variants. He lambasts everyone.

Tweaking hardware to look good in benchmarks is news to you? Video chip
makers have been doing this forever, tuning their hardware or firmware
to score better in particular benchmarks (sometimes their own benchmarks,
tweaked for their hardware), even when those benchmarks say nothing
about actual performance in real use.

AVX wasn't just about improving FP instructions. The number of cores
available back then was maybe up to 4, allowing that many threads to run
concurrently. With more cores to parallelize the computing, AVX becomes
less necessary. The latest CPUs (although far outside the consumer
price range) have 64 cores, maybe more. Sorry, but bitching in
hindsight is the easy way to look superior. I don't see Linus bitching
back *then* when AVX showed up. His forward-looking crystal ball was
just as cloudy as everyone else's. So, how many cores were in your home
computer back in 2011 when AVX came out, or in 2013 when AVX-512 was
announced? AVX isn't just about upping the bit-width of FP calculations,
but also about parallelization. How many desktops nowadays have any apps
on them that can use all 4 cores?
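
To make the "more cores" side of that argument concrete, here is a minimal sketch of thread-level parallelism in C++. The helper name scale_on_all_cores and its parameters are made up for illustration, not taken from any library. Each thread still does one scalar multiply per loop iteration; the speedup comes only from running several such loops on different cores at the same time, which is a different kind of parallelism than AVX packing 8 values into one instruction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: multiplies every element of `data` by `factor`,
// giving each worker thread one contiguous chunk of the index range.
void scale_on_all_cores(std::vector<float>& data, float factor,
                        unsigned workers = std::thread::hardware_concurrency())
{
    if (workers == 0) workers = 1;  // hardware_concurrency() may return 0
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end   = std::min(data.size(), begin + chunk);
        assert(begin <= end);
        if (begin == end) break;    // nothing left to hand out
        pool.emplace_back([&data, factor, begin, end] {
            for (std::size_t k = begin; k < end; ++k)
                data[k] *= factor;  // one scalar multiply per iteration
        });
    }
    for (std::thread& t : pool) t.join();
}
```

The chunks do not overlap, so no locking is needed. On older toolchains this may need -pthread at link time.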

Not all CPUs are waiting to do something for end users. Some are
involved in highly complex computing, like animated computer graphics.
You think Zootopia was composed on a home computer? So, you think Intel
(or AMD) are going to tool up for a completely separate production line
for consumer vs high-graphics design platforms? There is an economy in
production by reusing existing manufacturing processes. Do consumer
platforms utilize AVX? Rarely. Why didn't Linus bitch when Intel added
Streaming SIMD Extensions (SSE)? How about all those non-gaming users
that don't care even about the old SSE extensions? Oh my God, the CPU
has something they don't need.

I suppose next Linus will bitch about increased parallelization in
Mozilla's Firefox. The next engine, Servo, takes advantage of the
memory safety and concurrency features of the Rust programming language.
Servo will use parallelism by using more cores for the rendering engine,
layout, HTML parsing, image processing, decoding, and other tasks that
can be isolated (into separate processes or threads to run on more
cores). Servo also makes further use of GPU-assisted acceleration, so
some code runs on a different processor entirely. Would the GPU be needed
if there were more CPUs (real or multi-core) to parallelize the FP instructions?

I think GPU-assisted acceleration in web browsers started back in 2010,
but was just for web browsers. I remember some other apps used the GPU
for faster FP processing, but they seemed few and far between. More
video games are using AVX (AVX2 more than AVX-512) since it is part of
the DirectX 12 API. LOTS of users play video games on their home
computers, so AVX is really not that rare for use on low-end computing
platforms. AVX used to be shunned by game devs due to complexity in
coding.

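Scalar, non-AVX:

#include <vector>
using std::vector;

// Fill each even row i with the midpoint of rows i-1 and i+1.
void interpolate(vector<vector<int>>& mat)
{
    for(int i=2; i+1<(int)mat.size(); i=i+2)
        for(int j=0; j<(int)mat[0].size(); j++)
        {
            mat[i][j] = mat[i-1][j] + 0.5f * (mat[i+1][j] - mat[i-1][j]);
        }
}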


Using AVX:

#include <immintrin.h>

// Same computation, 8 ints per iteration.
// Assumes the row width is a multiple of 8.
void interpolate_avx(vector<vector<int>>& mat)
{
    for(int i=2; i+1<(int)mat.size(); i=i+2)
        for(int j=0; j<(int)mat[0].size(); j=j+8)
        {
            _mm256_storeu_si256((__m256i *)&mat[i][j],
                _mm256_cvtps_epi32(_mm256_add_ps(_mm256_mul_ps(_mm256_sub_ps(
                    _mm256_cvtepi32_ps(_mm256_loadu_si256((__m256i *)&mat[i+1][j])),
                    _mm256_cvtepi32_ps(_mm256_loadu_si256((__m256i *)&mat[i-1][j]))),
                    _mm256_set1_ps(0.5f)),
                    _mm256_cvtepi32_ps(_mm256_loadu_si256((__m256i *)&mat[i-1][j])))));
        }
}

However, when programmers are told to code a game for maximum
performance, it matters that the AVX code runs 6.5 times faster! It's
simple coding with slower performance, or complicated coding with faster
performance. The tradeoff is more cost in coding work, debugging, and
optimizing, hence more time, to achieve faster performance. Considering
how video games have upped the number of moving objects, physics modeling,
and moving texture changes, some video games have insane requirements
compared to games from over a decade ago.
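
To show where that extra coding and debugging work goes, here is a sketch of the same AVX interpolation written with named steps instead of one nested expression. It is an illustration under stated assumptions, not the code any particular game uses. The names Matrix, interpolate_scalar, interpolate_avx_rows, interpolate and HAVE_AVX_PATH are mine. It assumes GCC or Clang on x86 for the target attribute and __builtin_cpu_supports, and falls back to the scalar loop elsewhere. It also deviates from the AVX listing above in two ways: it uses _mm256_cvttps_epi32 (truncate) rather than _mm256_cvtps_epi32 (round) so results match the scalar float-to-int conversion, and it handles row widths that are not a multiple of 8 with a scalar tail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_AVX_PATH 1   // assumption: GCC/Clang targeting x86
#endif

using Matrix = std::vector<std::vector<int>>;

// Scalar reference: same arithmetic as the scalar listing above.
void interpolate_scalar(Matrix& mat)
{
    for (std::size_t i = 2; i + 1 < mat.size(); i += 2)
        for (std::size_t j = 0; j < mat[i].size(); ++j)
            mat[i][j] = static_cast<int>(
                mat[i-1][j] + 0.5f * (mat[i+1][j] - mat[i-1][j]));
}

#ifdef HAVE_AVX_PATH
// Compiled for AVX even if the rest of the file is not (GCC/Clang attribute).
__attribute__((target("avx")))
void interpolate_avx_rows(Matrix& mat)
{
    const __m256 half = _mm256_set1_ps(0.5f);
    for (std::size_t i = 2; i + 1 < mat.size(); i += 2) {
        const std::size_t cols = mat[i].size();
        assert(mat[i-1].size() == cols && mat[i+1].size() == cols);
        std::size_t j = 0;
        for (; j + 8 <= cols; j += 8) {
            // Load 8 ints from the rows above and below, convert to float.
            const __m256 prev = _mm256_cvtepi32_ps(_mm256_loadu_si256(
                reinterpret_cast<const __m256i*>(&mat[i-1][j])));
            const __m256 next = _mm256_cvtepi32_ps(_mm256_loadu_si256(
                reinterpret_cast<const __m256i*>(&mat[i+1][j])));
            // mid = prev + 0.5 * (next - prev), for 8 lanes at once.
            const __m256 mid = _mm256_add_ps(
                prev, _mm256_mul_ps(half, _mm256_sub_ps(next, prev)));
            // Truncate toward zero like the scalar conversion, then store.
            _mm256_storeu_si256(reinterpret_cast<__m256i*>(&mat[i][j]),
                                _mm256_cvttps_epi32(mid));
        }
        for (; j < cols; ++j)  // scalar tail for the last cols % 8 elements
            mat[i][j] = static_cast<int>(
                mat[i-1][j] + 0.5f * (mat[i+1][j] - mat[i-1][j]));
    }
}
#endif

// Picks the AVX path only when the running CPU reports AVX support.
void interpolate(Matrix& mat)
{
#ifdef HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx")) {
        interpolate_avx_rows(mat);
        return;
    }
#endif
    interpolate_scalar(mat);
}
```

Even in this tidier form there are three things to get right that the scalar loop never had to think about: the rounding mode, the leftover columns, and whether the CPU has the instructions at all.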

Video games are real use of AVX. It's not just making benchmarks look
better. Guess Linus doesn't have bleeding edge hosts (in technology and
to his pocket) on which to run the most demanding video games. Is Linus
even a gamer? Oh wait, yeah, not that big a selection for Linux.