If it runs PyTorch at speed without hand-holding I would probably get one.
If it runs tinygrad at speed (a lower bar, developmentally) I might get one.
Is there a model benchmarking site where you can select models of varying sizes by source code and see how they perform on different hardware? It would help people evaluate whether a specific piece of hardware is good for the jobs they want it to do.
> Is there a model benchmarking site where you can select models of varying sizes by source code and see how they perform on different hardware? It would help people evaluate whether a specific piece of hardware is good for the jobs they want it to do.
Not that I'm aware of (at least not one based on real benchmarks), but it's something I've been noodling about building, together with some other associated data that can be helpful when selecting a model. Glad to hear I'm not the only one wanting it :)
AMD's comparison with the M4 Pro:
https://www.digitaltrends.com/wp-content/uploads/2025/01/3d-...
Not shown: Wattage
100% chance it's chewing through at least 50% more power to achieve the result.
In fact, based on their TDP guidance, it goes up to 120W, which is more than double the M4's. But we don't know what the configuration was for this benchmark, and we don't have great numbers for the M4's power consumption either.
Then you throw in the fact that AMD's 120W TDP is not actually a power consumption figure... and it's all made up.
The M4 Max is the most comparable to Strix Halo, and while Apple doesn't appear to publish an official power consumption figure, there are plenty of anecdotal reports of it drawing over 100W under load. For example:
https://www.reddit.com/r/macbookpro/comments/1hj3m0p/m4_max_...
https://forums.macrumors.com/threads/m4-max-eats-battery.244...
An M2 Max discharges in 2-3h when running ML models, even while plugged into a 140W brick.
M4 is likely more power efficient, but not 2x.
Not to mention: what's the performance like on battery vs. plugged in? If I have to stay tethered to the wall to achieve the rated performance, then it's not really an apples-to-apples comparison, unless you only ever use your laptop at a desk (which is probably most people, honestly).
Source article for that (it's also linked in TFA): https://www.digitaltrends.com/computing/amd-mobile-processor...
Non-thumbnail version of the chart: https://www.digitaltrends.com/wp-content/uploads/2025/01/3d-...
What is the graph supposed to measure, actually? Renders are usually measured in seconds, so higher = worse, but they clearly highlight their bars as better, so maybe it's the difference expressed as a percentage or something?
Why can't companies just include absolute numbers in their comparisons...
It's first-party marketing, so the scale is always oriented toward "higher = better = ours" and the measurement is whatever gave the best numbers to present. They could give all the information in the world and I'd still wait to see what 3rd-party reviews say the performance actually is rather than trust the 1st-party numbers.
Betterness.
They're benchmark scores.
Shouldn't they compare Max+ to M4 Max?
They should, but it's not favourable. In their presentation they specifically said it outperforms the binned M4 Pro and is on par with the unbinned M4 Pro.
It would be behind the M4 Max. It also draws over double the wattage of the M4 Pro to achieve these numbers.
It probably won't cost nearly as much as an M4 Max, so I'd say no.
Even the M4 Pro is a big step up. The M4 Max is pretty expensive, and I suspect AMD is targeting a lower price point, not that any prices were mentioned today.
256 bits * DDR5-8533 is a pretty big step up from any other x86-64 laptop or SFF and should be a pretty huge help for anything graphics- or bandwidth-intensive, like LLMs.
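Back-of-the-envelope on the LLM point (my numbers, not AMD's: this assumes decode is memory-bandwidth-bound, streaming the full weights once per token, and ~35GB for a 70B Q4 model):

    # Rough upper bound on LLM decode speed for a memory-bound workload.
    bus_bytes = 256 / 8                  # 256-bit bus = 32 bytes/transfer
    transfers_per_sec = 8.533e9          # DDR5-8533
    bandwidth = bus_bytes * transfers_per_sec    # ~273 GB/s
    model_bytes = 35e9                   # ~70B params at 4 bits/weight
    print(f"~{bandwidth / model_bytes:.1f} tokens/sec ceiling")  # ~7.8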
I'd expect laptops with this thing will be available at closer to the Pro (~$2k) than the Max (~$3k). I see a laptop with the 375 for ~$1700 right now, which is more comparable to the 10-core M4. Or in the minipc space, the 370 is ~$1k, which would again be comparable to a 10-core M4 mac mini.
> AMD 'Strix Halo' Ryzen AI Max+ 395 PRO
Gosh that name is a mouthful
Strix Halo is the codename, not marketing name, so you can remove that part.
"AI" seems to have replaced the segment number.
The + is because it's the top-end model of the lineup.
Not sure what's Max or Pro about it though.
I think the "AI" appears on any SKU with enough TOPS for Copilot+ (they released some 200-series SKUs that are just Ryzen 5/7, no AI).
Max is the segment number, i.e. it's "Ryzen 11" (the other 300-series SKUs they announced are Ryzen AI [5|7] 3xx). Weirdly, though, there are no Ryzen 9s, so maybe it's really just a rebrand of 9.
The Pro just means it has management and security features for enterprise customers.
The 375 is labeled Ryzen 9: https://www.amd.com/de/products/processors/laptop/ryzen/300-... but of course it's one of the previously available parts of the AI lineup, not a new one.
Thank you. It's nice to be able to decode that.
They managed to put Max, plus, and Pro into the name, which is kinda impressive in a way. Now we just need Ultra to complete the set.
AMD better watch out: Altera, a subsidiary of Intel, claims a trademark on MAX+PLUS.
there's still room for MAX(+, +) or MAX(+, PRO)
> Strix Halo is the codename, not marketing name, so you can remove that part.
That was the good part, lol.
Isn’t Strix a brand of RAM or something?
Strix is a brand name for the ASUS ROG line of PC components, but AMD is using it as a code name.
BTW, here's the meaning of "Strix": https://en.wikipedia.org/wiki/Strix_(mythology)
> Not sure what's Max or Pro about it though.
Secure processor, shadow stacks, secure boot, hardware asset trackability. Enterprise stuff.
Sounds like ChatGPT had a breakdown when asked to name it.
I hate it. Whatever does "strix" even mean?? The rest of it is stupid too.
>Whatever does "strix" even mean?
https://en.wikipedia.org/wiki/Strix_(mythology)#Greek_origin...
Isn't Strix Halo just a code name?
That was my understanding.
This is the first x86 SoC to follow the lead set by Apple Silicon's huge unified memory bus.
Bandwidth is more or less on par with the M4 Pro, and it supports up to 128GB.
Remember that AMD has been making x86 SoCs with unified memory for quite some time.
Yeah, but they had pretty meagre memory bandwidth until now, aside from the parts they made exclusively for Xbox and PlayStation. AMD didn't seem interested in bringing fast unified memory to real computers until Apple did it. Now they need to double the bus again to make an M4 Max-alike...
In part because it's an odd compromise. With the exception of LLMs, which are a recent development, there wasn't a lot of need for high-memory but only moderate-GPU-compute parts. You'd either pair a lot of memory with just a CPU, or a modest amount of fast memory with a beefy GPU.
It's possible they were restricted by agreements with Microsoft or Sony not to release anything before this year.
Those agreements would be >15 years old at this point, I doubt AMD would agree to sandbag their entire APU lineup going forward just to make the base model PS4 and Xbone look good, and I especially doubt that they would do that again with the PS5 and Xbox Series when they actually had money and room to negotiate. Likewise, it doesn't make sense as a restriction Microsoft or Sony would impose: the console business is one of convenience, not power. They aren't trying to beat PCs and they don't care if PCs are a better deal. They care if they can get you to buy a box that locks you into their DRM scheme.
Furthermore, during the chip shortages of the last few years, AMD was actually selling broken PS5 silicon for use as a normal Windows PC[0]. If there were restrictions on selling APUs above a certain performance level, then this PC wouldn't exist.
Historically PC OEMs cared about price more than graphics performance so that's what they got. If you look at mobile SoCs or the eDRAM-equipped SKUs Intel made for Apple you see more emphasis on memory & graphics performance similar to consoles.
It isn't what the market wanted, and by market I mean OEMs, because I'm sure consumers would have loved it.
The OEMs buying APUs to use in laptops and SFF desktops were more interested in cutting costs than boosting graphics performance. Users who want better 3D performance can buy a higher end laptop with a discrete GPU and juicier profit margin.
True, but Apple's the benchmark in this space and has managed thin laptops with good battery life and decent (but not class-leading) GPU performance.
Doubling the memory width (and tripling the bandwidth) helped Apple's GPU performance substantially and should do the same for AMD. That means a larger fraction of the laptop market should consider it "good enough" while keeping a reasonable TDP, avoiding the 2"-thick laptop that lasts less than an hour on battery while sounding like a hair dryer.
Right, but none wider than 128 bits, unless you count PS5 and XboxX.
Intel's Knights series of chips (a.k.a. Xeon Phi, a.k.a. Larrabee) for servers shipped 8GB of 320GB/s on-package memory in 2012: https://en.wikipedia.org/wiki/Xeon_Phi
It "just" doubled the memory bus width from 128 to 256 bit and cranked up the interface clock speed. I wonder what it means for the infinity fabric. Is it going to run at ~4GHz to keep up?
I can't find any information on the memory it uses to hit 256GB/s. It just said "new memory interface". Seems high even for 256-bit LPDDR5X.
Zen 5 CPU, RDNA 3.5 GPU, and XDNA 2 NPU. No word on process nodes.
It uses LPDDR5X-8000 with a 256-bit memory interface (double that of standard desktops):
8 GT/s x 32 bytes = 256 GB/s
This has been known for a long time.
What annoys me is that AMD does not say whether the Zen 5 cores of Strix Halo have full vector processing pipelines, like Granite Ridge and Fire Range, or the narrow pipelines of Strix Point and Krackan Point.
Various leaks have claimed that some products would ship with DDR5-8533, which is ~273GB/sec. I wouldn't be surprised if a range of frequencies ships with the 1st-gen devices.
Maybe even an SFF-sized motherboard that takes CUDIMMs, though each DDR5 DIMM is only 64 bits wide (two 32-bit subchannels), so filling the 256-bit bus would take four of them.
LTT showed a slide which said 4nm IIRC, or was that the 9950X3D?
> If AMD keeps with tradition, which we fully expect, we will see these monstrous APU chips come to desktop PCs in the future.
How far in the future? I don't need another laptop, but would be nice to have a box to run local llms on. If these things can run LLMs at a decent clip then this would be sort of a "shut up and take my money" situation.
I hope the LLM benchmarks announced were reasonable and not something gross like using a model that doesn't swap on the Strix Halo, but does on a 4090.
HP announced the HP Z2 Mini G1a, which is bigger than a NUC but I believe still considered an SFF:
https://www.pcworld.com/article/2567865/hp-z2-mini-g1a-packs...
It's gotta be swapping or something. The 4090 is faster (and far more expensive) than Strix Halo in every way.
After digging, it looks like 70B Q4 requires 35GB or so. So yes, the AMD comparison is a Strix Halo that isn't swapping vs. a 4090 that is. Seems kinda misleading, but it's a real advantage for larger models.
The problem with using these on an AM5 motherboard would be that these chips have quad-channel DDR5, while AM5 has two memory channels.
HP already announced a mini workstation with this chip.
I'm seeing a laptop [1]; do you have a link to the workstation?
EDIT: Oh, I see they're calling the laptop a workstation.
[1] https://liliputing.com/hp-zbook-ultra-14-g1a-mobile-workstat...
There's a laptop AND a desktop, called the HP Z2 Mini G1a.
https://www.pcworld.com/article/2567865/hp-z2-mini-g1a-packs...
"coming soon" and no pricing.
Sure, but the previous iterations (like Strix in its non-Halo form) and similar AMD laptop chips are popular, with numerous SFFs from the likes of ASUS, ASRock, Gigabyte, Beelink, Minisforum, GMKtec, Lenovo, and many others.
I've heard rumors of similar SFFs from Framework, System76, and similar VARs.
The Strix Halo does seem pretty compelling for those who want less volume, power, and cost than a discrete GPU. I'd love a small SFF with either 128GB RAM (some parts will have the RAM in the package) or two CUDIMM slots.
Generally SFFs use laptop parts, but tend to be out a few months later than the equivalent laptops.
I expect similar with the Strix Halo/AI Max.
If they can deliver the drivers and out-of-the-box PyTorch support, then I know what my next laptop will have in it.
Yep. Though I'd prefer just to have a small form factor box on my desk. Something the size of a Mac Mini.
The Strix Halo announcement was pretty much exactly what was leaked.
However, one big surprise was that the Halo 395 chip runs Llama 3.1 70B-Q4 2.2x faster than an RTX 4090 24GB. Anyone have any details? The slide mentions seeing AMD endnote SHO-14 for details.
Maybe 70B-Q4 doesn't fit in 24GB?
Last time I played with them, 70B models were much larger than 24GB without a lot of quantization.
It mentioned Q4, but after searching around a bit it looks like 70B-Q4 needs 35GB or so. So Strix Halo is 2.2x faster than a 4090 when the 4090 is paging to system RAM.
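That 35GB figure is just the quantization arithmetic (a sketch; nominal 4 bits/weight, real Q4 GGUF files add a few GB of scales and metadata):

    # Approximate weight size of a 4-bit-quantized 70B model.
    params = 70e9
    bits_per_weight = 4
    size_gb = params * bits_per_weight / 8 / 1e9
    print(f"~{size_gb:.0f} GB")  # ~35 GB, well over a 4090's 24 GB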
Not so impressive 8-(.
About the 4090: it's a bad metric because that model won't fit on a 4090. Grrr, Nvidia.
Sure, but that's kind of the point. A 4090 with 24GB is going to cost more than one of these Strix Halo mini PCs with 64GB to 128GB, and you'll be able to run larger models on it without thrashing between VRAM and DRAM. Will it run as fast as a model that fits entirely on that 4090? No way, but it will run larger models at a fairly decent speed for home use.
Wasn't Intel floating an iGPU with 128GB VRAM?
Mmmh... I wonder how much CPU RAM usage will slow down the GPU under real-life loads.
I'm optimistic that it won't be a big issue. After all, if 256-bit-wide memory helped a wide range of applications, it would have been added earlier, and the Xbox and PlayStation have had similar setups for two generations now.
The bandwidth should mostly help the GPU performance for games or LLMs, but not random desktop apps.
Usually this is all about memory latency. As long as game devs are aware of the latency issue, games should be OK... but it means you need more control over CPU machine code execution; in other words, you need to get closer to the bare metal.
100fps is 10ms per frame (we now know 60fps is not enough; fps should never drop below 75-80, but I would really target 100+).
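For reference, the frame-time budgets in question (plain arithmetic, nothing assumed):

    # Per-frame time budget at common FPS targets: budget = 1000 ms / fps.
    for fps in (60, 75, 100, 144):
        print(f"{fps:>3} fps -> {1000 / fps:5.1f} ms per frame")
    # 60 -> 16.7 ms, 75 -> 13.3 ms, 100 -> 10.0 ms, 144 -> 6.9 ms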
> AMD says this delivers groundbreaking capabilities for thin-and-light laptops
Laptops with 120W CPUs aren't thin or light.
AMD in general has allowed their mobile CPUs to power-scale. I don't see why this would be any different: 15W on the go, 120W when plugged in.
Edit: per the article, these are positioned from 45W up.
AMD's slides say 45-120W.
Base TDP is 45. I think you could go below base, if you really wanted, but AMD wants you to design for 45.
FTA (emphasis mine):
"All of the AI Max chips have a 55W base TDP, but also a configurable TDP that ranges from 45 to 120W to unleash more horsepower in designs that can handle the thermal output."
Like a SFF desktop with a big 140mm fan on top. Here's hoping.
Yeah. I'd love to have one of these in mini-ITX or micro-ATX form.
I wonder if at this point, NVMe further becomes a bottlenecking factor?
Seems unlikely, at least for any normal use case. PCIe 5.0 (the current standard that is widely shipping) has 16GB/sec in each direction (32GB/sec total) with a x4 connection.
Doubly so given that memory sizes haven't increased for the last decade or so. The vast majority of machines these days seem to have 8-16GB of RAM and max out at 64GB unless you're buying workstation parts.
How many times a second do you need to load 100% of RAM from storage?
At least Strix Halo supports 128GB of RAM.
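Quick sanity check on why NVMe shouldn't bottleneck (a sketch; uses the theoretical PCIe 5.0 x4 rate from above, real drives come in lower):

    # Time to stream the entire 128 GB of RAM in from storage once.
    ram_gb = 128            # max Strix Halo configuration
    link_gb_s = 16          # PCIe 5.0 x4, one direction, theoretical
    print(f"~{ram_gb / link_gb_s:.0f} s to fill RAM")  # ~8 s, a rare event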
The quoted TDPs seem to be for the CPU+GPU combined, and these SKUs have huge increases in GPU size (40 CUs, up from a max of 16 last gen). The M4 Pro and M4 Max fall into the same TDP range.
That's for the CPU, the supposedly no-longer-puny iGPU, and the NPU combined. You'll get ≥80% of real-world throughput for about half the power budget, which can easily be cooled in a 14" laptop that's thin by gaming-laptop standards, without shattering your eardrums. I'll grant that you'll have to pick between acceptable acoustics, thermal throttling, or form factor once you go beyond 60W of continuous heat output.
> That's for the CPU, the supposedly no-longer-puny iGPU, and the NPU combined.
Right, the CPU.