Discussion Optane Client product current and future


cbn

Lifer
Mar 27, 2009
12,968
221
106
Here is an article by Anand on the Apple Fusion drive:

https://www.anandtech.com/show/6406/understanding-apples-fusion-drive

Interestingly, in addition to auto-tiering, it also has a write buffer of about 4GB. (I can imagine that helping a lot, since the two drives used for Fusion are additive in capacity.)

With Fusion Drive enabled, Apple creates a 4GB write buffer on the NAND itself. Any writes that come in to the array hit this 4GB buffer first, which acts as sort of a write cache. Any additional writes cause the buffer to spill over to the hard disk. The idea here is that hopefully 4GB will be enough to accommodate any small file random writes which could otherwise significantly bog down performance. Having those writes buffer in NAND helps deliver SSD-like performance for light use workloads.
That 4GB write buffer is the only cache-like component to Apple's Fusion Drive. Everything else works as an OS directed pinning algorithm instead of an SSD cache. In other words, Mountain Lion will physically move frequently used files, data and entire applications to the 128GB of NAND Flash storage and move less frequently used items to the hard disk. The moves aren't committed until the copy is complete (meaning if you pull the plug on your machine while Fusion Drive is moving files around you shouldn't lose any data). After the copy is complete, the original is deleted and free space recovered.
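To make the quoted write-buffer behavior concrete, here is a minimal Python sketch. It is a toy model, not Apple's implementation: the class name `FusionWriteBuffer` and its fields are made up for illustration, 4GB is treated as 4 GiB, and it only captures the one rule stated above (writes fill the buffer on the NAND first, and anything beyond that spills to the hard disk). How and when the buffer gets drained is not described in the article, so that part is left out.

```python
GB = 1024 ** 3  # assumption: treat the article's "4GB" as 4 GiB


class FusionWriteBuffer:
    """Toy model: writes land in a fixed NAND buffer, overflow goes to the hard disk."""

    def __init__(self, buffer_bytes=4 * GB):
        self.buffer_bytes = buffer_bytes
        self.in_buffer = 0  # bytes currently held in the NAND write buffer
        self.on_hdd = 0     # bytes that spilled over to the hard disk

    def write(self, nbytes):
        """Buffer as much as fits and return the number of bytes that spilled."""
        room = self.buffer_bytes - self.in_buffer
        buffered = min(nbytes, room)
        spilled = nbytes - buffered
        self.in_buffer += buffered
        self.on_hdd += spilled
        return spilled
```

For example, on a fresh buffer a 3 GiB write spills nothing, and a following 2 GiB write spills 1 GiB to the hard disk because only 1 GiB of buffer space is left.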
With that noted, I wonder how the Optane compatible AMD StoreMI (i.e., Enmotus FuzeDrive) handles writes. A write buffer would make sense, but how big? Also, since this thread is about Optane (very useful IMO for paging out), I do wonder how Apple Fusion Drive and AMD StoreMI would handle the page file.

P.S. Enmotus FuzeDrive does have file pinning which could be used for the page file.......


https://www.enmotus.com/hubfs/PDFs/User Guides/Enmotus FuzeDrive v1.2.1 Windows User Guide.pdf



.....but right now file pinning is not enabled on the AMD for Ryzen version of FuzeDrive.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
At least two things I am wondering about the Intel Optane Software:

1.) For the 16GB (block level caching), how large of a write buffer is reserved by the software? Does the write buffer also handle writes for page-outs?

2.) For the 32GB and greater (block and file level caching), how large of a write buffer is reserved? Also, where do writes for page-outs and virtual memory (not included in the page file) go? Are they handled as whitelisted files, or by the block level write buffer?
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
I know that the Optane caching software is aware of file types and intentionally balances file level and block level caching to greatly improve performance while avoiding the caching of data that does not need accelerating.

Would Apple and AMD be able to create their own software that caches to Optane in the same way or does Optane have some sort of "secret sauce" that only Intel has access to?
That is a good question.

There is now an Intel Version of the Optane compatible FuzeDrive software....so hypothetically a side by side test of the software used by AMD could be performed.
 

nosirrahx

Senior member
Mar 24, 2018
294
69
101
I did some Optane and VROC testing. The top pic is intentionally pre-patch OS and BIOS, the bottom has the OS and BIOS fully updated. The left is the 905P, the right is a 4-way VROC RAID 0 of 900Ps.



I have been doing some more testing with the current OS and BIOS but with Spectre and Meltdown protection disabled. It seems that the performance hit on VROC is far worse than these original tests indicate. The following benchmarks are from the same system:



You are reading that correctly. 4KQ1T1 on VROC 0 drops by 40% if Spectre and Meltdown protection is enabled. I also have access to a NUC8i7HVK and set it up with two 800Ps in RAID 0. One bench is with Spectre and Meltdown protection enabled, the other disabled; pretty sure I don't need to tell you which one is which:

 

cbn

Lifer
Mar 27, 2009
12,968
221
106
A good Laptop for browsing that is lightweight with a big screen (using Intel Low Power display technology) and has long battery life.

Google has made high end Browser laptops for years, but I really do hope we see the Linux distros get involved as well. (Linux distro = more privacy than Google).

So with that noted, I do wonder what kind of power consumption there would be for various combinations of DDR4 vs. LPDDR4 vs. DDR Optane....at idle and with a bunch of (mostly) static webpages open? (My speculation is that the Optane persistent memory would be far better on power consumption than the DRAM for this type of work)

EDIT: It could even be some kind of wearable device (eg, augmented reality glasses for browsing also using Intel Low Power display technology). In fact, Mozilla has a new browser for this.
Thinking back to the idea I am mentioning above (specifically the laptop)....perhaps the best way to implement it is to use Intel's Compute Card?

Then a person could have one SoC with the expensive DDR Optane able to be used in more than one device. (eg, small lapdock*, larger lapdock*, fanless deskdock, etc).

*These using the new 1 watt Sharp or Innolux panels compatible with the new Intel Low Power Display Technology.

Also would be interesting for a privacy oriented machine like the ones sold by Purism. (I like the features on those laptops...would be nice to have the kill switches for a Windows machine too).

P.S. Glad to see Lapdocks using Intel Compute card are already on the way:

http://nexdock.com/

 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Some results using my HP Z420 Workstation (with latest patch for Spectre v2 installed)....

Windows ReadyBoost with 16GB Optane caching 3 x WD5000AZLX in RAID-0:



Primocache with 16GB Optane caching (read only) 3 x WD5000AZLX in RAID-0:



16GB Optane by itself:



3 x WD5000AZLX in RAID-0 (RUN #1):



3 x WD5000AZLX in RAID-0 (RUN #2):

 

cbn

Lifer
Mar 27, 2009
12,968
221
106

Pretty surprising results on Blender and Gaming. (Less impressive results with Premiere)







P.S. The storage being cached by the 32GB Optane is a 1 TB SATA 6 Gbps SSD.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,079
2,883
136
Those results are quite interesting. I speculate Adobe Premiere doesn't gain as much because its data stream is more sequential than the others and leans write-heavy, which would hit the weak points of Optane.

They should have also thrown in two more results,
-NVMe drive with 3GB memory
-NVMe drive with 16GB memory

Just for completeness. It would also tell us if the results are bound by latency or bandwidth.
 

nosirrahx

Senior member
Mar 24, 2018
294
69
101
Now you can pin individual applications and folders with the latest Optane Memory driver: https://www.intel.sg/content/www/xa/en/support/articles/000028779/memory-and-storage.html
Needs 32GB or larger Optane to work.
I have been checking that out on my new travel laptop. I am getting pretty impressive performance combining a SATA SSD with an 800P (not supported but 100% functional).

The 800P is pretty awesome as a cache drive as it fits in a laptop and the space available for caching and the performance blow the 32GB model away.

 

nosirrahx

Senior member
Mar 24, 2018
294
69
101
The price point of the 800P 58GB actually makes a lot of sense in a laptop. Compare the following 2 configurations:

2TB 970 EVO

VS.

2TB 860 EVO + 58GB 800P

Since 4KQ1T1 is the performance metric that you "feel" as a typical Windows user, the Optane setup is faster in the vast majority of use cases.

The price for the 2 combined drives is about $100 less than the 970 on its own. I don't understand why Intel only officially supports a configuration that really does not demonstrate their potential.

3 out of the last 4 Optane enhanced systems I have built use an unsupported Optane configuration:

4TB 860 EVO + 240GB 900P (son's gaming system)
1TB 860 PRO + 118GB 800P (personal travel laptop)
1TB HDD + 58GB 800P (system for friend's daughter)
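
For anyone unfamiliar with the 4KQ1T1 figure, it means 4 KiB random transfers at queue depth 1 from a single thread, so each read has to finish before the next one is issued. Below is a rough Python sketch of that access pattern. The function names are made up for illustration, and this is not how CrystalDiskMark is implemented. Note that the OS file cache will serve most of these reads from RAM, so the numbers from this sketch say little about the drive itself; dedicated benchmark tools typically bypass that cache.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # 4 KiB, the "4K" in 4KQ1T1


def make_test_file(size_bytes):
    """Create a temporary file filled with random data (hypothetical demo helper)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(size_bytes))
        return f.name


def random_read_q1t1(path, reads=2000, seed=0):
    """Issue 4 KiB random reads one at a time from one thread.

    Returns (mean latency in seconds, throughput in MB/s).
    The file must be at least 4 KiB in size.
    """
    blocks = os.path.getsize(path) // BLOCK
    rng = random.Random(seed)
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        for _ in range(reads):
            # Queue depth 1: seek and read, waiting for each read to complete.
            f.seek(rng.randrange(blocks) * BLOCK)
            f.read(BLOCK)
        elapsed = time.perf_counter() - start
    return elapsed / reads, reads * BLOCK / elapsed / 1e6
```

Since only one request is ever in flight, the throughput here is just the block size divided by the per-read latency, which is why low-latency media like Optane do so well on this metric.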
 

IntelUser2000

Elite Member
Oct 14, 2003
8,079
2,883
136
From the sequential read info, it seems the application also uses system memory for caching? You can't get 4GB/s from any NVMe drive.

Or do you have a RAM drive?
 

nosirrahx

Senior member
Mar 24, 2018
294
69
101
I believe that it has something to do with the screenshot where it mentions total memory as the total of both the RAM and Optane.

I have not looked into the notes yet but I would be willing to bet that Optane cache software is now "spare RAM" aware and makes use of some kind of hybrid Optane + RAM cache.

You are correct though, under normal circumstances those sequential speeds are impossible even for the 970 Pro.
 

nosirrahx

Senior member
Mar 24, 2018
294
69
101
These caching drives are proving to be quite popular.
This is surprising since Intel seems to be doing just about everything wrong in terms of Optane.

The upcoming 380GB M.2 drives that won't work with the vast majority of systems are another example of this.

Why not a 236GB 905P (essentially a double 118GB 800P with a 4X interface) in a 2280 form factor? This 22110 length SSD seems to be a bad move.

I think those M.3 (wider M.2) form factor SSDs are going to take over in the server sector so I can't see why Intel is using a form factor that isn't for the present or future.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,079
2,883
136
I can think of a few reasons why they would do so.

1. Price, and appropriate positioning: It's expensive and uses lots of power so it'll be avoided in SFF and budget systems. Most desktop boards support the 22110 size.

2. Technical reasons. 900/905P uses a 7-channel controller to achieve its sequential throughput. The 22110 allows for rapid TTM since it can fit 7 chips. You'd need a significant controller change to get it working with an even number of chips.

3. Data center SSDs already use the M.2 22110 size. 905P M.2 is a server P4800X M.2 derivative.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106

nosirrahx

Senior member
Mar 24, 2018
294
69
101
That is awesome. I'm very glad they did that.

P.S. I wonder if pinning the page file shows a difference. (Did the Optane system acceleration software already pin the page file by default?)
The default caching is labeled as 2 things, "System files" and "Top accessed content". These 2 are not delineated further and you cannot "unpin" them.

Pagefile.sys and swapfile.sys cannot be pinned, as the Optane software says that they are in use and need to be closed.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,079
2,883
136
There's good enough reason to believe the P4800X M.2 will launch during Intel's Datacenter talk this week. It's then not hard to imagine they'll also showcase the M.2 version of the 905P.
 

nosirrahx

Senior member
Mar 24, 2018
294
69
101
I have been checking out laptops with 22110 support, not much to choose from. Once the M.2 905P drops I might see about creating an 860 EVO + 905P laptop to get the ultimate in performance and space.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
From the sequential read info, it seems the application also uses system memory for caching? You can't get 4GB/s from any NVMe drive.

Or do you have a RAM drive?
I believe that it has something to do with the screenshot where it mentions total memory as the total of both the RAM and Optane.

I have not looked into the notes yet but I would be willing to bet that Optane cache software is now "spare RAM" aware and makes use of some kind of hybrid Optane + RAM cache.

You are correct though, under normal circumstances those sequential speeds are impossible even for the 970 Pro.
Yes, that looks like it would be RAM cache (though the 4K QD1 Read is low for RAM).

Hmmm.....I wonder if this RAM cache is dynamic or static in size*? Perhaps one way of testing this is to page out and see if the big sequential numbers come back when using a larger sample size in CrystalDiskMark.

*The Enmotus FuzeDrive and Romex Primocache RAM caches are static in size.
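
A rough way to script the sample-size idea, as a sketch only: time a sequential read over test files of increasing size and look for the point where throughput falls off, which would hint at how much data the RAM cache can hold. The function names and the `directory` parameter (meant to point at the cached volume) are hypothetical, and the OS file cache will get in the way here unless it is bypassed, so treat the numbers as indicative at best.

```python
import os
import tempfile
import time

MIB = 1 << 20


def sequential_read_mbps(path, chunk=MIB):
    """Read the whole file front to back in 1 MiB chunks and return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e6


def sweep(sizes_mib, directory=None):
    """Write a test file of each size, time a sequential read, return {size_mib: MB/s}."""
    results = {}
    for size in sizes_mib:
        with tempfile.NamedTemporaryFile(dir=directory, delete=False) as f:
            for _ in range(size):
                f.write(os.urandom(MIB))
            path = f.name
        try:
            results[size] = sequential_read_mbps(path)
        finally:
            os.remove(path)  # clean up the test file after each measurement
    return results
```

Something like `sweep([1024, 4096, 16384], directory=...)` on the accelerated volume would then show whether the big sequential numbers hold up as the sample grows past the amount of spare RAM.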

P.S. Would be pretty exciting if this RAM cache was dynamic (edit: for clarification, I mean dynamically allocated) and could be made (or is) "persistent" (i.e., the last RAM cache is loaded back up from disk upon boot, like the Primocache Level-1 "RAM cache" can be set up to do). NOTE: Not sure if the Enmotus FuzeDrive RAM cache (called "FuzeRAM") is persistent or, if not persistent by default, has a persistent setting like the Primocache RAM cache does.
 
