To achieve extremely high memory bandwidth, processors such as Nvidia Tesla-100, Intel Xeon Phi "KNL", and recent versions of Fujitsu SPARC64 use specialized High Bandwidth Memory (HBM) that is closely coupled to the processor die within the same package (see the image).
A disadvantage of this solution is that HBM capacity is relatively small, for example 16 GB or 32 GB for the older and newer Tesla-100 variants, respectively, and 16 GB for KNL.
If the HBM capacity is not sufficient for a particular computation, external DRAM must be used.
The external memory offers much lower bandwidth, e.g., 300 GB/s over NVLink instead of 900 GB/s from HBM for Tesla-100. Consequently, computations with low arithmetic intensity, such as the HPCG benchmark or most simulations, which are already limited by memory bandwidth rather than by compute throughput, run even more slowly.
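
The effect of the lower bandwidth can be estimated with a simple roofline-style calculation. The sketch below is illustrative only: the two bandwidth figures come from the text above, while the compute peak (`PEAK_GFLOPS`) and the arithmetic intensity (`INTENSITY`) are assumed placeholder values, not measurements, and the function name `attainable_gflops` is hypothetical.

```python
def attainable_gflops(intensity_flop_per_byte, bandwidth_gb_s, peak_gflops):
    """Roofline estimate: throughput is capped either by the compute peak
    or by how fast operands can be streamed from memory."""
    return min(peak_gflops, intensity_flop_per_byte * bandwidth_gb_s)


HBM_GB_S = 900.0       # HBM bandwidth quoted in the text
EXTERNAL_GB_S = 300.0  # NVLink bandwidth to external memory quoted in the text
PEAK_GFLOPS = 7000.0   # ASSUMPTION: placeholder compute peak
INTENSITY = 0.25       # ASSUMPTION: placeholder low arithmetic intensity (flop/byte)

hbm_rate = attainable_gflops(INTENSITY, HBM_GB_S, PEAK_GFLOPS)
external_rate = attainable_gflops(INTENSITY, EXTERNAL_GB_S, PEAK_GFLOPS)
slowdown = hbm_rate / external_rate

print(f"HBM-resident data:      {hbm_rate:.1f} GFLOP/s")
print(f"External-memory data:   {external_rate:.1f} GFLOP/s")
print(f"Slowdown outside HBM:   {slowdown:.1f}x")
```

Under these assumptions both cases are bandwidth-bound, so the estimated throughput falls in proportion to the bandwidth, by a factor of three. A computation with high enough arithmetic intensity to reach the compute peak in both cases would not see this drop.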