Discussion Optane Client product current and future


cbn

Lifer
Mar 27, 2009
12,968
221
106
With respect to using Optane as RAM Extender for AMD systems on the lower end for desktops:

https://forums.anandtech.com/thread...g-ssd-and-a-sata6g-hdd.2552820/#post-39733537

Virtual Larry, I think using a SATA SSD with HDD is a good idea for StoreMI systems with plenty of RAM.

But I wonder about Systems with maybe 8GB RAM. Does 32GB (or 58GB) Optane + 3D TLC (or even 3D QLC) SATA SSD make a better storage system compared to the same 8GB RAM + 3D TLC NVMe SSD?

I am thinking mainly page-out for browsing, but also overall system usage.

P.S. Currently AMD StoreMI is only available for desktop, but I am also interested in how that kind of match-up would work for APU laptops.
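
One rough way to see how much page-out a browsing session on an 8GB machine actually generates is to log RAM and pagefile usage over time. A minimal sketch, assuming Python with the psutil package installed (duration and interval are arbitrary):

```python
# Rough sketch: log RAM and pagefile/swap usage once per second while browsing,
# to see how much an 8GB system actually pages out. Requires: pip install psutil
import time
import psutil

def log_memory(duration_s=300, interval_s=1.0):
    print("elapsed_s, ram_used_pct, swap_used_MB, swap_used_pct")
    start = time.time()
    while time.time() - start < duration_s:
        ram = psutil.virtual_memory()   # physical RAM usage
        swap = psutil.swap_memory()     # pagefile usage on Windows
        print(f"{time.time() - start:8.1f}, {ram.percent:5.1f}, "
              f"{swap.used / 2**20:10.1f}, {swap.percent:5.1f}")
        time.sleep(interval_s)

if __name__ == "__main__":
    log_memory()
```

If the pagefile barely moves during normal browsing, the Optane-as-RAM-extender question matters a lot less than the raw drive comparison.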
 
Last edited:

nosirrahx

Senior member
Mar 24, 2018
304
75
101
P.S. Currently AMD StoreMI is only available for desktop, but I am also interested in how that kind of match-up would work for laptops.

On laptops I have been disappointed with the lack of Optane support. Many vendors lock you out of what should be supported features through their BIOS modifications.

The Lenovo laptops in particular really piss me off. They only support the 16GB module; all other modules are blocked at the BIOS level.

If StoreMI requires BIOS support, I would be willing to bet that AMD laptops suffer the same problem.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
I think 16GB Optane could work (for browsing page-out) provided the page file was pinned by StoreMI......but wow, having some laptop makers block out 32GB Optane (or larger) is really surprising.
 

nosirrahx

Senior member
Mar 24, 2018
304
75
101
I think 16GB Optane could work (for browsing page-out) provided the page file was pinned by StoreMI......but wow, having some laptop makers block out 32GB Optane (or larger) is really surprising.

Yep, crazy since they don't block the Optane CP which literally tells you to upgrade to the 32GB module if you want manual pinning.

A few people literally did this and went to Lenovo for support when it didn't work. Lenovo just points to the support PDF which mentions the 16GB module and nothing about blocking the others.
 
  • Like
Reactions: cbn

cbn

Lifer
Mar 27, 2009
12,968
221
106
With the Optane H10 needing PCIe bifurcation support, I do wonder if Intel will release an all-in-one Optane + NAND SSD based on PCIe x2 and SATA?

The NAND part would be slower, but it would be compatible with any motherboard's first M.2 slot (I believe).
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Besides what I mentioned here, could it be that for the Optane H10 Intel allows "SRT" to be used alongside the Intel Optane software application?

Then 64GB or less (actually I would hope for more than 64GB) of the NAND part of the H10 SSD could be used to cache a hard drive?

So taking my previous post and this post together, there are two possibilities for additional caching with the H10:

1. H10 caching NVMe NAND primary volume + SATA secondary volume. (So both volumes cached with Optane via Intel Optane software application)

2. H10 caching NVMe NAND primary volume + NAND on H10 caching SATA secondary volume via "SRT".

P.S. One very nice thing that Intel SRT would allow a user to do (that Intel Optane software application will not currently let a user do) is cache a RAID volume--> https://www.intel.com/content/dam/s...el_smart_response_technology_user_guide_3.pdf
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Looking at the lack of NVMe NAND scaling for consumer workloads mentioned in this post and this post (with this post showing an exception), I got to wondering about the future of PCIe 4.0 x4 or 5.0 x4 NVMe SSDs.

Could it be that PCIe 4.0 x2 or even PCIe 5.0 x1 (or PCIe 5.0 x2) becomes more popular than expected? At least until the consoles get a hardware overhaul that promotes a greater need for high sequential speeds?

Maybe the next hardware overhaul that boosts the need for high sequential speeds will involve an advanced generation of Optane stacked on top of a Radeon or Intel GPU die?

Or maybe a new low-latency Optane that boosts CPU single-thread performance and efficiency?

Or maybe both?

We'll see. I wouldn't mind going for PCIe 4.0/5.0 x1 or x2 drives. Most lower end NVMe drives can barely saturate a PCIe 3.0 x2 link, so cutting down on link width seems a good idea. It'll reduce latency, and possibly power consumption for mobile use. Desktop can retain a x4 link, since power doesn't matter too much there. Even PCIe 4.0 x4 offers a whopping 7.88GB/s, with double that for 5.0. So it should be enough for the foreseeable future.
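
For reference, the link bandwidth arithmetic behind those figures works out as below; a back-of-envelope sketch that only accounts for 128b/130b line coding, not the rest of the protocol overhead:

```python
# Back-of-envelope PCIe payload bandwidth per generation and link width.
# Gen 3/4/5 use 128b/130b line coding; real throughput is a bit lower once
# TLP/DLLP protocol overhead is included.
GT_PER_S = {"3.0": 8, "4.0": 16, "5.0": 32}

def link_gb_per_s(gen: str, lanes: int) -> float:
    """Approximate payload bandwidth in GB/s for a PCIe generation and lane count."""
    return GT_PER_S[gen] * lanes * (128 / 130) / 8  # GT/s -> GB/s after line coding

for gen in ("3.0", "4.0", "5.0"):
    for lanes in (1, 2, 4):
        print(f"PCIe {gen} x{lanes}: {link_gb_per_s(gen, lanes):5.2f} GB/s")
# PCIe 4.0 x4 works out to ~7.88 GB/s, matching the figure quoted above.
```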

Here is the AnandTech comparison of 4K QD1 read between the PCIe x2 Optanes (orange and dark blue) and the PCIe x4 Optane (light blue):

https://www.anandtech.com/show/12512/the-intel-optane-ssd-800p-review/5

[Chart: AnandTech burst 4K random read (QD1)]

(The PCIe x2 Optanes are a good deal faster than the PCIe x4 Optanes.)

With that noted, I wonder if restricting the 800p and Optane memory to PCIe x 1 would boost 4K QD1 even more?

P.S. Below is the 4K QD1 write for the PCIe x2 and PCIe x 4 Optanes:

[Chart: AnandTech burst 4K random write (QD1)]


I wonder how much reducing 800p and Optane memory to PCIe x1 boosts the 4K QD1 write? Is it possible the 32GB Optane could do 270 MB/s (~matching the 4K QD1 write of the 512GB 660p)? Or is that too optimistic?
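
Whether dropping to x1 helps or hurts QD1 mostly comes down to how the per-IO link transfer time compares with the device's own latency. A toy model (the device latencies below are invented for illustration, not measured 800p or Optane Memory figures):

```python
# Toy QD1 model: each 4K IO costs (device latency + time to move 4KB over the link).
# The device latencies below are placeholders, purely for illustration.
BLOCK = 4096  # bytes per IO

def qd1_mb_per_s(device_latency_us: float, link_gb_per_s: float) -> float:
    transfer_us = BLOCK / (link_gb_per_s * 1e3)        # GB/s -> bytes per microsecond
    return BLOCK / (device_latency_us + transfer_us)   # bytes per microsecond ~= MB/s

for lanes, gbps in ((1, 0.985), (2, 1.970)):           # PCIe 3.0 x1 / x2 payload bandwidth
    for latency in (8.0, 15.0):                        # hypothetical device latencies (us)
        print(f"x{lanes}, {latency:4.1f} us device latency -> "
              f"~{qd1_mb_per_s(latency, gbps):3.0f} MB/s 4K QD1")
```

In this simple picture a narrower link only adds a fixed couple of microseconds per 4K transfer, so whether that is visible depends on how large the device latency already is; it would not by itself boost QD1.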
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
There's no point going to extremes just hoping to boost QD1 and latency, because real workloads are more complex than that.

Also, from the chart above you can clearly see that the 800p does far better, meaning that for writes, whatever allows higher sequentials also allows random writes to be better.

There's probably a minor "efficiency loss" when you increase links, just as when you increase the number of memory channels. Going from x1 to x2 is probably a negligible loss. If you double that again the loss is higher and maybe that's why they stuck to x2. But future Optane Memory devices are moving to x4 as well.

If you stick to an x1 link, reads will be handicapped, and the resulting loss in sequential and multi queue depth bandwidth might result in overall lower performance. It's not entirely right to say QD1 is dominant. It may be right to say QD1/2/4 is dominant. But dominant doesn't mean everything, and you want to have really good sequentials and higher queue depth performance too.

Further, the burst random read figures AnandTech gets are actually quite unrealistic. Windows tends to get lower results, and there's a significant penalty due to Meltdown and Spectre. If you get certain settings wrong you could easily lose half of your performance too.

Overall the Optane SSDs should really be a temporary solution paving the way to DIMM-based ones, and caching is there to save on cost.
 
  • Like
Reactions: VirtualLarry

cbn

Lifer
Mar 27, 2009
12,968
221
106
There's no point going to extremes just hoping to boost QD1 and latency, because real workloads are more complex than that.

When I wrote that I was mainly thinking of RAM extension, particularly browsing.

You are right, of course, that going from PCIe x2 to PCIe x1 will lower sequential read and high queue depth IOPS. However, if you look at the first quote in post #432 (which was originally post #338), you will see that the SATA SSD performs almost identically (if not identically) to NVMe NAND in most games* (4K video editing with an intermediate codec should be the same with either a SATA SSD or an NVMe NAND SSD).

*Only one game showed a strong difference in performance.

Further, the burst random read figures AnandTech gets are actually quite unrealistic. Windows tends to get lower results, and there's a significant penalty due to Meltdown and Spectre.

Yep, Windows and Spectre would definitely affect latency reduction scaling.

But I wonder if there is possibly still a gain in QD1 and QD2 4K read that outweighs what the loss of bandwidth takes away in typical consumer workloads? Maybe there is a way I can test this via a PCIe 3.0 x1 slot? Would a riser cable add enough latency to be a problem? (Too bad the motherboard I am thinking about using doesn't have an open-ended PCIe 3.0 x1 slot....then this would be a lot easier: I would just use an M.2 to PCIe x4 adapter card and be done with the physical set-up.)
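
While waiting to test in a real x1 slot, a very crude relative check is possible from plain Python. This is a sketch only, with big caveats: it cannot bypass the OS cache (so the target file needs to be several times larger than RAM), the path below is hypothetical, and CrystalDiskMark or fio with direct IO remains the proper tool. It is only meaningful for comparing the same drive in the x2 and x1 configurations on the same machine.

```python
# Crude 4K QD1 random-read comparison. Point TARGET at a large existing file on
# the drive under test; this does NOT bypass the OS cache, so use a file much
# larger than RAM and treat the numbers as relative, not absolute.
import os
import random
import time

TARGET = r"D:\testfile.bin"   # hypothetical path on the drive being tested
BLOCK = 4096
IOS = 20000

def random_read_qd1(path: str, ios: int = IOS) -> float:
    size = os.path.getsize(path)
    offsets = [random.randrange(0, size - BLOCK) // BLOCK * BLOCK for _ in range(ios)]
    with open(path, "rb", buffering=0) as f:   # unbuffered at the Python level only
        start = time.perf_counter()
        for off in offsets:
            f.seek(off)
            f.read(BLOCK)
        elapsed = time.perf_counter() - start
    return ios * BLOCK / elapsed / 2**20       # MiB/s

if __name__ == "__main__":
    print(f"~{random_read_qd1(TARGET):.0f} MiB/s 4K QD1 read (OS cache not bypassed)")
```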

P.S. I wonder how Linux would do in comparison? (I would think better, but how much better?)

(Too bad there is no software from Intel or Enmotus allowing Optane to be used with another drive as a single volume in Linux)

EDIT: Here are two M.2 to PCIe x 1 adapters I found:

[Photo: first M.2 to PCIe x1 adapter]



It installs like this in a motherboard:

[Photo: the adapter installed in a motherboard]


And here is another one:

[Photo: second M.2 to PCIe x1 adapter]


Still deciding which one to get.

I'm leaning towards the first one because I imagine it would be lower latency.

EDIT 2: Decided to get the first adapter. (It is being sent from China so probably 4 weeks till it gets here).
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Regarding the Optane H10 and AMD, below is a chart as well as a link describing AM4 motherboards whose M.2 connectivity is configurable either as PCIe x4 or PCIe x2 + 2x SATA.

[Chart: AM4 chipset feature comparison]

This is most obvious when looking at the X300/A300:

[Screenshot: X300/A300 storage configuration]


This link has more of the same info:

https://www.amd.com/en/products/chipsets-am4

[Screenshots: AM4 chipset storage specifications from the linked AMD page]


Doesn't this imply (because PCIe 3.0 x2 can run as 2x SATA 6Gbps) that PCIe x2/PCIe x2 bifurcation is present in these boards* with both PCIe x4 and SATA modes for M.2? (I know most, if not all, Intel boards support PCIe x4 and SATA mode in at least one M.2 port.)

If so, then I would assume H10 is already compatible with AMD boards and maybe even Intel boards that support PCIe x4 and SATA modes in M.2.

If true, that opens up the idea of using StoreMI (or FuzeDrive) with H10 Optane.

*This board here actually disables two SATA ports (SATA 6Gbps 5/6) when the M.2 in question (M.2_1) is used either for SATA or PCIe. (This implies that M.2_1 is SATA Express compatible)
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
However, if you look at the first quote in post #432 (which was originally post #338), you will see that the SATA SSD performs almost identically (if not identically) to NVMe NAND in most games* (4K video editing with an intermediate codec should be the same with either a SATA SSD or an NVMe NAND SSD).

Perhaps you want to revisit your links, because while it doesn't show as big of a gain as it does in synthetic tests, the differences are still significant.

So what we tend to do is try to see problems as a single-ended thing, when in truth they are multi-faceted and much more complicated. Software developers have vastly varying skills and resources to put in, and the types of software out there are numerous, to say the least. And in cases like file transfer, sequential speed matters a lot. This is why no single metric (like QD1 random reads) can represent everyone.

We have serious problems in SSD benchmarking, because most benchmarks are synthetic or seem to focus too much on file transfers, which you don't do that often. Then again, maybe current SSD benchmarks are not as unrealistic as they seem. Storage is "cold" storage where data is used very infrequently.

With RAM, it's used for compute, not just retrieving files once in a while, so throughput matters. Slow RAM actually slows down compute, while slow storage doesn't. As long as it's used as storage, SSD benchmarks will suck and real-world benefits will be marginal. We could go to PCIe version 10, with an x1 link having 128GB/s, but nothing would change, because it's cold storage.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
However, if you look at the first quote in post #432 (which was originally post #338), you will see that the SATA SSD performs almost identically (if not identically) to NVMe NAND in most games* (4K video editing with an intermediate codec should be the same with either a SATA SSD or an NVMe NAND SSD).

*Only one game showed a strong difference in performance.

Perhaps you want to revisit your links, because while it doesn't show as big of a gain as it does in synthetic tests, the differences are still significant.

Here are the links (originally posted in #338):

https://forums.anandtech.com/threads/nvme-adapter.2556316/#post-39641478

https://forums.anandtech.com/threads/nvme-adapter.2556316/#post-39644028

https://forums.anandtech.com/threads/nvme-adapter.2556316/#post-39643985

If you look at the gaming results found in those, there is only one strong exception, and that is the one from the last link where Call of Duty: Infinite Warfare loads more than twice as fast with the 960 EVO (11 seconds) as it does with the MX300 (25 seconds). After that, the next best result for NVMe NAND (also found in the last link) has the load time for the SATA SSD 23% slower.

From the first link (which is the link with charts I can post) notice 850 EVO vs. 970 EVO:

[Charts: game load times, 850 EVO vs. 970 EVO, games 1-3]


In two of the games the 850 EVO is identical to the 970 EVO (actually 0.1 second faster in one), and in the third game (Shadow of Mordor) the 850 EVO is only 6% slower.

Here are some more games from the first link:

[Chart: Battlefield 1 load times]


850 EVO is 8.3% slower than 970 EVO in Battlefield 1.

[Chart: Watch Dogs 2 load times]


In the game above (Watch Dogs 2) the 850 EVO is only 0.7% slower than the 970 EVO.


[Chart: load time comparison including the MX500 and 970 EVO]


And in the last comparison (from link one) we don't have 850 EVO but we do have the MX500 SATA SSD which is a hair faster than 970 EVO.

(So for link #1, four games have SATA basically identical to PCIe 3.0 x4 NVMe NAND, plus two games where the SATA SSD is only 6% and 8.3% slower. This is pretty much what I found with my YouTube video research*, which is found in the second link at the beginning of this post.)

*In three of the YouTube videos NVMe NAND was no faster (or not much faster) than the SATA SSD for all games tested. In the fourth YouTube video there were games where NVMe NAND was ~identical to the SATA SSD, but it also had three results where the SATA SSD was slower by a larger margin than in the first three videos; even then the margin was only 18%, 17% and 9% for the three games that were slower.

The following videos showed NAND-based NVMe and SATA SSDs either essentially the same or without much difference:



https://www.youtube.com/watch?v=EdF_aerWcW8

The video below had some titles loading in around the same time and a few titles showing a larger difference (e.g., Battlefield 1 was ~46 seconds on NVMe and ~50 seconds on the SATA SSD, Hitman was ~12 seconds on NVMe and ~14 seconds on the SATA SSD, and Rainbow Six Siege was 6.11 seconds on NVMe and 7.10 seconds on the SATA SSD).

https://www.youtube.com/watch?v=GKv8cAaJgqs

Final tally on the games in the three provided links:

1 Game where SATA is less than half the speed of PCIe 3.0 x4 NVMe (found in third link)
1 Game where SATA is 23% slower than PCIe 3.0 x4 NVMe (found in third link)
1 Game where SATA is 18% slower than PCIe 3.0 x4 NVMe (found in second link)
1 Game where SATA is 17% slower than PCIe 3.0 x4 NVMe (found in second link)
1 Game where SATA is 16% slower than PCIe 3.0 x4 NVMe (found in second link)
1 Game where SATA is 9% slower than PCIe 3.0 x4 NVMe (found in second link)
1 Game where SATA is 8.3% slower than PCIe 3.0 x4 NVMe (found in first link)
1 Game where SATA is 6% slower than PCIe 3.0 x4 NVMe (found in first link)
1 Game where SATA is 3.6% slower than PCIe 3.0 x4 NVMe (found in second link)
1 Game where SATA is 3.4% slower than PCIe 3.0 x4 NVMe (found in second link)
1 Game where SATA is 3.2% slower than PCIe 3.0 x4 NVMe (found in second link)
1 Game where SATA is 1.6% slower than PCIe 3.0 x4 NVMe (found in second link)

And......

16 Games where SATA is essentially identical in speed to PCIe 3.0 x 4 NVMe!

So most games really are almost identical (if not identical) in load times when comparing PCIe 3.0 x 4 NVMe NAND to SATA SSD.

P.S. Originally I missed one game in which SATA was 16% slower than NVMe (but this is now included in the above tally)
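
For what it's worth, a quick summary of that tally (counting the Infinite Warfare outlier as 127% slower, from the 25 s vs. 11 s figures above, and the 16 essentially identical games as 0%):

```python
# Summarize the load-time tally above: percent by which SATA was slower than
# PCIe 3.0 x4 NVMe per game. 127% is the Infinite Warfare outlier (25 s vs 11 s);
# the 16 essentially identical games are counted as 0%.
from statistics import mean, median

slowdowns = [127, 23, 18, 17, 16, 9, 8.3, 6, 3.6, 3.4, 3.2, 1.6] + [0] * 16

print(f"games:              {len(slowdowns)}")
print(f"median slowdown:    {median(slowdowns):.1f}%")
print(f"mean slowdown:      {mean(slowdowns):.1f}%")
print(f"within 10% of NVMe: {sum(s <= 10 for s in slowdowns)} of {len(slowdowns)}")
```

That works out to a median of 0%, a mean of roughly 8%, and 23 of the 28 games within 10% of the NVMe result.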
 
Last edited:

kurosaki

Senior member
Feb 7, 2019
258
250
86
It's almost as if it's not just loading stuff from the hard drive/SSD. Watch Dogs, 46 secs! It should load 8 GB in 2-4 seconds with the NVMe; the rest must be compile time? Compiling the scene/world before entering it? There must be loads of calculations beforehand that aren't storage-bound, as you pointed out in most cases. What about games without loading screens? Maybe we could see less glitching in games that load new environments under the radar?
 

nosirrahx

Senior member
Mar 24, 2018
304
75
101
It's almost as if it's not just loading stuff from the hard drive/SSD. Watch Dogs, 46 secs! It should load 8 GB in 2-4 seconds with the NVMe; the rest must be compile time? Compiling the scene/world before entering it? There must be loads of calculations beforehand that aren't storage-bound, as you pointed out in most cases. What about games without loading screens? Maybe we could see less glitching in games that load new environments under the radar?

You are forgetting that an SSD's ideal speed requires specifically one large file. As the number of files increases and the size of the files decreases, SSDs move away from their maximum sustained transfer speed and towards the limitations imposed by latency. If you follow SSD reviews you will see that while the gap in maximum sequential speed between top NVMe and top SATA drives is pretty huge, the gap in 4KQ1T1 speed is far smaller.
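
A quick illustration of that point, with made-up round numbers (3500 MB/s vs. 550 MB/s sequential and per-file overheads in the tens of microseconds are assumptions, not measurements of any particular drive):

```python
# Why many small files shrink the NVMe/SATA gap: the job becomes latency-bound
# rather than bandwidth-bound. All figures are illustrative, not measured.
def transfer_time_s(total_mb: float, file_count: int,
                    seq_mb_per_s: float, per_file_overhead_us: float) -> float:
    bandwidth_time = total_mb / seq_mb_per_s
    latency_time = file_count * per_file_overhead_us / 1e6
    return bandwidth_time + latency_time

for label, seq, overhead in (("NVMe", 3500, 80), ("SATA", 550, 120)):
    one_big = transfer_time_s(8000, 1, seq, overhead)
    many_small = transfer_time_s(8000, 200_000, seq, overhead)  # ~200k files of ~40KB
    print(f"{label}: 1 x 8GB file ~{one_big:4.1f}s, 200k small files ~{many_small:4.1f}s")
```

With these assumptions the one-big-file gap is roughly 6x, while the many-small-files gap collapses to about 2x, which is the same shape the 4KQ1T1 numbers show.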

The setup routines that rely on the CPU also matter quite a bit. If hardware reviewers were a bit more savvy they would take this into account and do game load times on systems with CPUs clocked well into the 5 GHz range and with 32GB of RAM hand-tweaked for maximum speed and minimum latency. Doing this would let the SSD be more of a bottleneck and show more of the true delta.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Good point, looks like it's compute- and/or IO-bound in all cases.

Don't forget common-denominator-oriented programming. It has to work on systems with platter HDDs. Some programs artificially inject delays to accommodate this. Some will try to cache using system RAM as much as possible.

But really it'll be a combination of many factors:
-Programming meant to work for all drives
-Workload is not just sequential throughput bound, or just QD1 random read/write bound
-SSDs can't show full speed in most scenarios
-Certain parts are bound by the speed of the rest of the system, like the CPU
-Some loading time is needed just to show information to the user

That's why Optane, with 5-10x the speed in QD1 random reads, is only minimally faster. It should be even better, given that Optane doesn't slow down when the drive is full or the state of the data gets dirty.
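
Put another way, if most of a load screen is CPU-side setup plus deliberate pacing, even a much faster drive only shaves the IO slice. A toy model with invented numbers:

```python
# Toy load-time model: total = CPU/setup time + IO time + any fixed delay.
# The numbers are invented to illustrate the shape of the problem, not measured.
def load_time_s(cpu_s: float, io_s: float, fixed_delay_s: float, io_speedup: float) -> float:
    return cpu_s + io_s / io_speedup + fixed_delay_s

baseline = load_time_s(cpu_s=20.0, io_s=8.0, fixed_delay_s=2.0, io_speedup=1.0)
for speedup in (2, 5, 10):
    t = load_time_s(20.0, 8.0, 2.0, speedup)
    print(f"{speedup:2d}x faster storage: {t:4.1f}s load vs {baseline:4.1f}s baseline")
```

With these invented numbers a 10x faster drive turns a 30-second load into roughly a 23-second one, which lines up with the "only minimally faster" observation.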
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Earlier in the thread here I posted some results Linus Tech Tips got using Optane as a RAM extender.

But I am wondering how browsing is affected.

Maybe a comparison of Optane H10 vs. Optane caching a SATA SSD vs. Optane caching an HDD vs. more RAM? This using Firefox and/or Chrome (with and without an ad blocker (e.g. uBlock Origin)).

Maybe even throw Enmotus FuzeDrive into the mix as well? (This is the software AMD uses for StoreMI.) Speaking of that, I am actually wondering if the Optane H10 can work in certain AMD AM4 motherboards? (see post #435)
 

nosirrahx

Senior member
Mar 24, 2018
304
75
101
Maybe even throw Enmotus FuzeDrive into the mix as well? (This is the software AMD uses for StoreMI.) Speaking of that, I am actually wondering if the Optane H10 can work in certain AMD AM4 motherboards? (see post #435)

I wonder what the cost of one of these drives would be if they put the Optane controller and software on the actual drive to remove all compatibility issues?
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I wonder what the cost of one of these drives would be if they put the Optane controller and software on the actual drive to remove all compatibility issues?

I think being able to turn two controllers into one might save Intel a few dollars in production costs, but the overwhelming majority of the cost is due to the storage chips themselves. Whatever they save will probably go to profit.

Even if it were cheaper manufacturing-wise, additional resources would be required to fully integrate the two chips to work harmoniously, and they'd have to rely on volume to offset that cost.

The greatest benefit of fully unifying them is that they won't have to rely on software for management, which will greatly improve the user experience and reduce data loss. There might be performance improvements as well.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,343
10,046
126
I wonder what the cost of one of these drives would be if they put the Optane controller and software on the actual drive to remove all compatibility issues?
But you see, that would remove the dependence on Intel CPUs, and... they don't want that.

Imagine if Intel made HDDs... ECC would be non-existent except for "Enterprise" versions that cost 10X as much, and they would find a way to segment them somehow, like drives above 1TB requiring a "Core" CPU rather than a Pentium or Celeron to operate, etc.
 

nosirrahx

Senior member
Mar 24, 2018
304
75
101
But you see, that would remove the dependence on Intel CPUs, and... they don't want that.

Imagine if Intel made HDDs... ECC would be non-existent except for "Enterprise" versions that cost 10X as much, and they would find a way to segment them somehow, like drives above 1TB requiring a "Core" CPU rather than a Pentium or Celeron to operate, etc.

Well...with the way Optane sells, it's not exactly like the Intel exclusivity is a huge help. They could split the difference and put some Intel-platform-exclusive features into the hardware side of things so AMD users could still get the performance but not the software interface.
 
  • Like
Reactions: cbn

cbn

Lifer
Mar 27, 2009
12,968
221
106
But you see, that would remove the dependence on Intel CPUs, and... they don't want that.

Imagine if Intel made HDDs... ECC would be non-existent except for "Enterprise" versions that cost 10X as much, and they would find a way to segment them somehow, like drives above 1TB requiring a "Core" CPU rather than a Pentium or Celeron to operate, etc.

Well...with the way Optane sells, it's not exactly like the Intel exclusivity is a huge help. They could split the difference and put some Intel-platform-exclusive features into the hardware side of things so AMD users could still get the performance but not the software interface.

The following is from 2016, but I can't imagine it's not still the case today:

https://www.pcworld.com/article/310...mory-to-work-on-amd-pcs.html#tk.rss_computers

Intel's lightning-fast Optane SSDs and memory won't be limited to PCs featuring the company's own chips, but could work with PCs based on AMD processors as well.

Intel wants to make adoption of Optane easy for makers of PCs and servers regardless of the chips they use, said Rob Crooke, senior vice president and general manager of the company's Non-Volatile Memory Solutions Group, in an interview.

Still I have to say it is disappointing we can't use Optane on Pentium or Celeron.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Thinking more about Optane as a RAM extender, I wonder if Gen 2 Optane takes a page or two out of IMFT Gen 2 3D NAND's playbook.

Remember how IMFT Gen 1 3D NAND only came in one size, a big 168mm2 32L 256Gb MLC/384Gb TLC die....but Gen 2 started off with a small 58mm2 64L 256Gb TLC die:

https://www.techinsights.com/techno...st-reports/intel-micron-64l-3d-nand-analysis/

[Image: TechInsights die photo]


A 64L 512Gb TLC die and a 64L 768Gb TLC/1024Gb QLC die followed.

Maybe Gen 2 3D XPoint does the same thing? (i.e., start off with a small die, follow up with a medium die and finish with a large die)

If true, then we would have a much different landscape for Optane (on the consumer front).

For one thing, writes would be a whole lot better without needing high capacities. That alone could put a 16GB Optane made with two Gen 2 64Gb 3D XPoint dies into the same write ballpark we saw with the 58GB 800p (reason: parallelism per GB increases by four because an (estimated) 51.5mm2 Gen 2 64Gb die has the same number of layers as two Gen 1 128Gb dies). That would be excellent! Likewise, a 32GB Optane made up of four 64Gb Gen 2 3D XPoint dies could potentially double that in writes (controller permitting).

P.S. In the past I was confused on why the Optane M15 would use 16GB Gen 1 Optane with PCIe 3.0 x4, but maybe Intel wants the extra links because small die Gen 2 will boost Sequential Read even at such a low capacity? (With this noted I am still concerned about PCIe 3.0 x4 increasing latency over PCIe 3.0 x2)
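
The die-count half of that argument can be put as a back-of-envelope sketch. The 64Gb Gen 2 die is the hypothetical one from the speculation above, not a confirmed 3D XPoint part, and real modules also carry spare capacity, so actual die counts round up:

```python
# Back-of-envelope: dies needed per module capacity for different 3D XPoint die
# densities, and dies per GB as a crude proxy for write parallelism. The 64Gb
# Gen 2 die is hypothetical; doubled decks per die would add further parallelism.
def die_count(capacity_gb: float, die_density_gb: int) -> float:
    return capacity_gb * 8 / die_density_gb

configs = [
    ("16GB module, Gen 1 128Gb dies", 16, 128),
    ("16GB module, Gen 2 64Gb dies (hypothetical)", 16, 64),
    ("32GB module, Gen 2 64Gb dies (hypothetical)", 32, 64),
    ("58GB 800p-class drive, Gen 1 128Gb dies", 58, 128),
]
for label, cap, density in configs:
    dies = die_count(cap, density)
    print(f"{label}: {dies:4.1f} dies, {dies / cap:5.3f} dies per GB")
```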
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
The M.2 to PCIe x1 adapter I mentioned in post #434 came today (surprised it got here so quickly).....

......and I did run CrystalDiskMark 6.0.2 using my 16GB Optane at both PCIe 3.0 x2 and PCIe 3.0 x1.

The upshot is that there was essentially no difference in 4K QD1 read or write. (This was using the 16GB Optane in expansion bay #6 of my HP Z420 workstation.)
 
  • Like
Reactions: kurosaki

cbn

Lifer
Mar 27, 2009
12,968
221
106
The setup routines that rely on the CPU also matter quite a bit. If hardware reviewers were a bit more savvy they would take this into account and do game load times on systems with CPUs clocked well into the 5 GHz range and with 32GB of RAM hand-tweaked for maximum speed and minimum latency. Doing this would let the SSD be more of a bottleneck and show more of the true delta.

Good point about the clockspeed and IPC.

Beyond that I wonder how much CPU core count matters?

Is loading something that could be maximized with an overclocked (5+ GHz) Core i3 in most (or all) cases?