Discussing SPR, Genoa, TSMC vs Intel Foundries w/ Anonymous Server Engineer
Added 2023-02-04 23:11:37 +0000 UTC
Broken Silicon's next episode will see the return of the Anonymous Server Engineer. We plan to heavily discuss AMD's place in the market, Intel Foundries, TSMC's future, Professional RADEON Cards (or lack thereof), and Intel's Evolving place in the market.
The point of this episode is to get at whether AMD is about to take even more market share from Intel, and what Intel can do to keep up with TSMC and AMD.
You have ~24 hours to submit questions/comments below.
P.S. Unfortunately the Daniel Nenni episode will need to be postponed due to a personal family issue - we talked, it's unfortunate...but these things happen.
Last Episode with Server Engineer: https://youtu.be/sVN9LrWREBs
Latest MLID Server Leak: https://youtu.be/h20inMLeDnE
https://d1io3yog0oux5.cloudfront.net/_85c13da17096eb11e6fba92dfe5d3a5f/intel/db/887/8894/earnings_presentation/Q4%272022+Earnings+Deck_Final+PDF.pdf
Comments
Intel seems somewhat hopeful to be back in the race by 2025. Regarding server, can 512c Sierra Forest hold its own against Turin-X (with or without cache) or what comes after? I would like to believe Intel, but their track record on execution isn't great. Is all hope lost if AMD gets SMT-4 functioning for Turin? If we see AMD win in 2025, will anyone have any faith in Intel, and will we see their server market share collapse?
Dig Wiggler
2023-02-06 05:04:16 +0000 UTC
Yes. But then you moved to an example that I indirectly quoted.
KarbinCry
2023-02-05 23:12:34 +0000 UTC
We didn't think that. He directly said that there was likely far more space taken up than just those literal "accelerator spots" on the die annotation.
Moore's Law Is Dead
2023-02-05 22:37:32 +0000 UTC
In many places, and for many people, it doesn't feel like we have really entered a recession. It feels like a limbo between the previous bull run and an actual recession. How does it look from the server space? Are there major shifts in demand or spending?
qhfreddy
2023-02-05 16:59:45 +0000 UTC
Epyc continues to sell like hot cakes. Where do you see AMD going with this? How long until they have to lay off the gas because breaking into new market segments doesn't yield as much margin? Are there any big markets you still see them wanting to break into? Anything where they have to take a slow approach and will grow their numbers in the long term?
qhfreddy
2023-02-05 16:58:18 +0000 UTC
Currently, new server platforms are pushing heavily into higher power and higher density configurations. Does this help with power efficiency to any degree, or is it mainly about cutting out other areas of the TCO? Do you see power efficiency moving forward in the foreseeable future, and if so, how? It doesn't feel like the foundries have much left in the tank when it comes to power per transistor.
qhfreddy
2023-02-05 16:58:01 +0000 UTC
What are your thoughts on DPC++ from Intel's oneAPI defaulting to -ffast-math? Intel presented slides comparing PVC to A100, showing A100 with CUDA, A100 with SYCL, and PVC with SYCL. What was interesting was that A100 with SYCL was faster than A100 with CUDA. Intel was even asked why that was, and said something vague about optimization and compile quality. Well, it turns out DPC++ (what Intel uses to compile in oneAPI) enables the -ffast-math flag by default during compilation. What does this flag do?
- it reorders operations without keeping IEEE standard compliance; you should get almost the same end result, but may not get the exact same result from the same inputs
- it stops setting the error number (errno) after math functions
- it assumes all operations are finite, meaning it does not account for "NaN" or infinite results; these just get plugged into the next calculation, which can cascade down through your math
- it enables approximation for square roots and divisions
- it disables signed zero (this can be mathematically important)
- it assumes there will be no floating-point traps (hardware exceptions)
For most stuff, this is fine. For a long, complex calculation with billions or trillions of operations (what HPC GPUs are designed to do), this can be a bit problematic.
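To make the reordering, NaN, and signed-zero points concrete, here is a minimal Python sketch. Python does not apply -ffast-math itself, so the groupings are written out by hand to mimic regroupings such a compiler would be allowed to make. The function names and input values are illustrative assumptions, not anything taken from Intel's slides or from DPC++.

```python
import math


def grouping_results(a, b, c, d):
    """Evaluate a + b + c + d under three groupings that are equal in real arithmetic."""
    strict = ((a + b) + c) + d     # source order, which IEEE-compliant code must keep
    regrouped = (a + c) + (b + d)  # one reordering a fast-math compiler may choose
    pairwise = (a + b) + (c + d)   # another reordering that is just as permitted
    return strict, regrouped, pairwise


def nan_cascade():
    """inf - inf produces NaN, and every later operation keeps returning NaN."""
    inf = float("inf")
    return (inf - inf) * 0.0 + 1.0


def signed_zero_angles():
    """+0.0 and -0.0 compare equal, yet atan2 gives different angles for them."""
    return math.atan2(0.0, 0.0), math.atan2(0.0, -0.0)


if __name__ == "__main__":
    # The exact sum of these inputs is 2.0; only one grouping recovers it.
    print(grouping_results(1e16, 1.0, -1e16, 1.0))  # (1.0, 2.0, 0.0)
    print(math.isnan(nan_cascade()))                # True
    print(signed_zero_angles())                     # 0.0 and pi
```

With only four inputs the three groupings already disagree, which is the kind of drift the list above describes once a kernel chains billions of such operations.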
KarbinCry
2023-02-05 14:29:16 +0000 UTC
How do you view Ampere and the other competitors to x86 server, be it Graviton (via AWS), Ampere itself, or Nvidia with Grace?
KarbinCry
2023-02-05 14:26:59 +0000 UTC
Tom, I have to correct the discussion you had with Wendell about SPR accelerators. Most of them are truly small. Those are the locked down accelerators. But then there is AMX. AMX is responsible for the AI performance, for the OpenVINO results. AMX, afaik, is *not* locked behind specific SKUs or unlocks (the On Demand thing where you unlock features like DLC). And AMX might have benefits outside AI - there are other uses for low precision tensor math, and you might be able to use the big AMX registers to manually "cache" data for other units (*maybe*). But... based on the old die shots, I've calculated that the AMX units, their registers etc. use up the space of between 18 and 26 cores (big error bars since die shots can be tricky). And that's a much higher cost than the two or four cores the other accelerators take up. In the discussion with Wendell, the two of you I think got a bit crossed, and operated under the assumption that the accelerators *including* AMX eat up only a couple of cores, greatly affecting your assessment. Specifically, Wendell says he might sacrifice 8 Genoa cores to get SPR accelerators in exchange, and cites OpenVINO results as impressive enough to make that worth it. Well - Intel paid, based on my die analysis, a *far* greater price than 8 cores.
KarbinCry
2023-02-05 13:01:38 +0000 UTC
Do you think there is still a reluctance from some server operators to shift their software optimisation towards AMD hardware, so those are still just going with Intel "because it's what works"? Or is Intel keeping hold of market share simply because AMD doesn't have the bandwidth to do deployments for everyone who wants an Epyc system? How big do you think the incentives are for operators to really make use of the additional features Intel has?
qhfreddy
2023-02-05 12:56:38 +0000 UTC
So far it feels like both Intel and AMD are struggling to get further into the general server GPU markets than some pretty specialised HPC deployments. What do you think they need to do better to really make headway on this front?
qhfreddy
2023-02-05 12:54:02 +0000 UTC
With their financial results and layoffs, Intel is getting a lot of negative press currently. From a buyer's perspective, how does this affect you? Do you see this as a short term blip or do you start to worry about Intel's long term future? If many customers lose confidence in Intel and start switching away, could that lead to a slow death-spiral or does Intel have a solid bedrock of support that will enable them to eventually bounce back without major changes?
Chris Rijk
2023-02-05 12:32:44 +0000 UTC
The only version of MI300 that AMD has talked about is one with 24 Zen 4 cores, seemingly taking up a quarter of the space. How interested would you be in a version that only had Zen 4 cores and no CDNA compute? Basically a processor with 96 Zen 4 cores, an unknown amount of cache, 128GB of HBM3, no DDR5 but a lot of CXL.
Chris Rijk
2023-02-05 01:06:33 +0000 UTC
When discussing AMD’s forays into AI or professional applications on GPUs in general, it’s often mentioned that they’re somewhat lacking on the software side. Do you agree, and how do you see the software side for AMD’s general purpose x86 servers (e.g. drivers, OS/application optimisation and feature support, stability in general)?
Chris Rijk
2023-02-05 01:05:38 +0000 UTC
When server processors are discussed, it often feels like prices are ignored as if no buyer actually cares about the purchase price. What's the reality? Relative to Genoa, how do you rate the pricing of Sapphire Rapids and what are your thoughts on Intel's "On Demand" pricing for it? Bergamo will probably be launching next quarter - what kind of pricing would you expect to see for it, relative to Genoa? Looking ahead to next year, let’s say for the sake of argument that Turin and Granite Rapids have similar performance and power consumption on general workloads but Granite Rapids costs 50-100% more - would that really be a win for Intel?
Chris Rijk
2023-02-05 01:05:09 +0000 UTC
Welcome back Server Engineer. Since you were last on, from your perspective, what has changed? Was there anything that particularly surprised you? Was there something that surprised others that you had expected? Have there been any particular changes in the types of hardware you are buying or the vendors you are buying them from?
Chris Rijk
2023-02-05 01:04:43 +0000 UTC
Any workloads/programs you can think of that might profit from 3D V-Cache in the server world? I know AVX-512 has shown niche usage, but very impressive gains where applicable. I wonder if 3D V-Cache may also apply somewhere and bring serious gains.
KingHarkinian
2023-02-05 00:04:40 +0000 UTC
What's the word in the server engineer world about Intel's Xe HPG based products? Supposedly they exist, but as far as I know, no one talks about them, so it's hard to tell how good/bad they are versus AMD and Nvidia in the server space.
Cleansweep
2023-02-05 00:04:06 +0000 UTC
What do you think will have to be cut from Intel's product portfolio next? The public and official ones were Optane, network switches, and now RISC-V development.
Benjamin Cannon
2023-02-04 23:51:56 +0000 UTC
Realistically how many generations will it take for customers to regain confidence in Intel server? AMD has been consistent in their roadmap, and there's barely anything that suggests Intel can stay competitive, in terms of technology, financials, and general consistency.
Trogdor
2023-02-04 23:35:10 +0000 UTC
What are your thoughts on ROCm, HIP and the usability of these frameworks (what hardware is actually compatible, state of documentation, adoption)?
Friedrich Günther
2023-02-04 23:18:15 +0000 UTC
Cheers on all your work. Put this on download since it’s close to 2330 here in Portugal and I’m tired (still, I gather, not as much as you?). Regardless, I enjoy all your informative insights, on podcast or video. Cheers
Manuel Nascimento
2023-02-04 23:17:40 +0000 UTC
Hello! Why are Intel's margins so bad compared to AMD's if they are vertically integrated and control their foundries? AMD has to pay for TSMC's own margins on their nodes. Is Intel 7 that much more expensive?
Matheus Duque
2023-02-04 23:17:22 +0000 UTC
Does AMD see the future of their GPUs tied to ever increasing power demands? Is there any hope for those of us not willing to buy a 1000W PSU to run these damn things?
Thalo215
2023-02-04 23:14:54 +0000 UTC
Could you compare the TSMC 7/6 and Intel 7 nodes in terms of density? And what about TSMC 5/4 against Intel 4: which would be denser?
Falto
2023-02-04 23:14:34 +0000 UTC