Two technological innovations that could reduce the need for NAND?


cbn

Lifer
Mar 27, 2009
12,968
221
106
Assuming "Internal RAID" hard drives became available what impact do you think that would have on caching strategy?

For example, I noticed this program allows a maximum size for files to be cached.

Smallfilecache.Maxsize - This will be the maximum allowed size of any file to be cached. This is currently set to 3MB and is the recommended size.

Your turn. Why do you think caching strategy would need to change?

If the throughput is high enough on platters, then there would be less need to cache the larger files on solid state storage.
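
A minimal Python sketch of what that kind of size-threshold policy looks like (the function name and structure are hypothetical; only the 3MB Smallfilecache.Maxsize value comes from the quote above):

```python
# Hypothetical sketch of a size-threshold SSD caching policy, loosely modeled
# on the Smallfilecache.Maxsize setting quoted above. Names and structure are
# illustrative, not the actual program's implementation.

SMALLFILECACHE_MAXSIZE = 3 * 1024 * 1024  # 3 MB, the quoted default


def should_cache_on_ssd(file_size_bytes, max_size=SMALLFILECACHE_MAXSIZE):
    """Cache a file on solid state only if it is at or below the size limit.

    Larger files stay on the platters, which handle long sequential reads
    well; if platter throughput rises (e.g. internal RAID), there is even
    less reason to spend SSD space on them.
    """
    return file_size_bytes <= max_size


print(should_cache_on_ssd(2 * 1024 * 1024))    # True: a 2 MB file gets cached
print(should_cache_on_ssd(700 * 1024 * 1024))  # False: a 700 MB file stays on platters
```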
 
Last edited:

whm1974

Diamond Member
Jul 24, 2016
9,436
1,569
126
Why not just make the SSDs out of 3D XPoint? I'm sure with further research both Intel and Micron can produce cheaper, faster, and larger SSDs to replace both HDDs and NAND for storage.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Why not just make the SSDs out of 3D XPoint? I'm sure with further research both Intel and Micron can produce cheaper, faster, and larger SSDs to replace both HDDs and NAND for storage.

If 3D XPoint based devices become something that replaces HDDs and NAND, it'll only be because capacity has gone up and cost per capacity has come down enough that the majority of the market no longer needs the slower technologies. Full replacement will probably not happen for 30-50 years. And even then I bet HDDs will be used the way tape drives are used today.

We haven't reached that point for NAND yet, and HDDs continue to have a massive advantage in $/capacity.

It just has to do with physics and laws of nature. It's always a balance. They talk of scaling issues with NAND, and scaling issues with HDDs.
 
Feb 25, 2011
16,991
1,620
126
If the throughput is high enough on platters, then there would be less need to cache the larger files on solid state storage.

Overall sequential throughput is already high enough on platters. It's latency that's the problem. Internal "RAID" doesn't actually improve that.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Overall sequential throughput is already high enough on platters. It's latency that's the problem. Internal "RAID" doesn't actually improve that.

Increasing sequential throughput (without decreasing latency) does help:


(The RAID-0 hard drive sequential read is somewhere in between the Samsung 850 SSD and the single hard drive)

Samsung SSD 850 Pro 256GB: 30 seconds
2 x WD Black 4TB 7200 rpm RAID-0: 40 seconds
WD Black 4TB 7200 rpm: 48 seconds
Seagate Momentus 500GB 5400 rpm: 1 minute 42 seconds
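
As a rough illustration of where those numbers come from, here is a simple load-time model in Python. Every figure in it is an assumption for illustration, not a measurement from the video:

```python
# Back-of-the-envelope load-time model: total time is roughly
# (number of seeks x seek latency) + (bytes read / sequential throughput).
# Every number below is an assumption for illustration only.

def load_time_s(total_mb, seeks, seek_ms, seq_mb_s):
    return seeks * seek_ms / 1000.0 + total_mb / seq_mb_s

game_mb, seeks, seek_ms = 4000, 2000, 12  # ~4 GB of assets, ~12 ms per seek (assumed)

print(load_time_s(game_mb, seeks, seek_ms, 150))   # single 7200 rpm HDD  -> ~50.7 s
print(load_time_s(game_mb, seeks, seek_ms, 300))   # 2 x HDD RAID-0       -> ~37.3 s
print(load_time_s(game_mb, seeks, 0.05, 500))      # SATA SSD             -> ~8.1 s
```

Doubling sequential throughput shrinks the transfer term, but the seek term is untouched, which is the latency point made above.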
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Increasing sequential throughput (without decreasing latency) does help:


(The RAID-0 hard drive sequential read is somewhere in between the Samsung 850 SSD and the single hard drive)

Samsung SSD 850 Pro 256GB: 30 seconds
2 x WD Black 4TB 7200 rpm RAID-0: 40 seconds
WD Black 4TB 7200 rpm: 48 seconds
Seagate Momentus 500GB 5400 rpm: 1 minute 42 seconds

That's likely not due to RAID, but to the increase in RPM from a power-sipping 5400 RPM notebook drive to a more modern 7200 RPM desktop drive with bigger DRAM buffers.

Update: I made the mistake of not noticing the standalone WD Black drive. I stand corrected.

In most cases, RAID increases latency, so there are no benefits. I guess it's more beneficial for an HDD, where it's so slow in the first place. You still have increased boot times with RAID though, regardless of whether it's an HDD or an SSD.
 
Last edited:
Feb 25, 2011
16,991
1,620
126
Increasing sequential throughput (without decreasing latency) does help...

Loading big games into memory and backing up or cloning an HDD are just about the only things most people do that are actually bottlenecked by sequential read/write.

Run PCMark or something on a single HDD, then a RAID-0 array, then a midrange SSD, and see what the difference is in terms of the overall score. Or try a few of the multitasking benchmarks that AnandTech used to run before they standardized on the AnandTech Storage Bench.

You'll see.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Very happy to see the following being developed (this was originally posted by Elixir):

https://blog.seagate.com/technology/multi-actuator-technology-a-new-performance-breakthrough/

Seagate unveiled today its new Multi Actuator technology, a breakthrough that can double the data performance of its future-generation hard drives in hyperscale data centers. As higher areal densities on future hard drives put downward pressure on performance, Seagate’s Multi Actuator technology will more than offset these pressures. That means customers with data-intensive applications will continue to enjoy the highest levels of hard drive performance, while they simultaneously keep up with the need to manage vast, ever-increasing quantities of data. Seagate’s Multi Actuator technology is in development to be deployed on products in the near future.


[Image: Seagate Multi Actuator technology conceptual illustration]


In its first generation, Seagate’s Multi Actuator technology will equip hard drives with dual actuators (two actuators). With two actuators operating on a single pivot point, each actuator will control half of the drive’s arms. Half the drive’s recording heads will operate together as a unit, while the other half will operate independently as a separate unit. This enables a hard drive to double its performance while maintaining the same capacity as that of a single actuator drive.

Seagate’s new Multi Actuation technology is a way to put the performance of parallelism within a single hard drive unit. The host computer can treat a single Dual Actuator drive as if it were two separate drives. This means the host computer can ask a single high-capacity drive to retrieve two different data requests simultaneously — delivering data up to twice as fast compared with a single-actuator drive.

With two actuators and 1.8 TB PMR platters (280 MB/s each), this should saturate SATA 6 Gbps.
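
A quick sanity check on that math (the ~20% SATA encoding overhead below is the usual 8b/10b rule of thumb, not a figure from the article):

```python
# Quick check: would two actuators at ~280 MB/s each saturate SATA 6 Gb/s?
# The ~20% encoding overhead is the usual 8b/10b rule of thumb, not a Seagate figure.

platter_mb_s = 280                        # per-actuator sequential rate (from the post)
actuators = 2
combined_mb_s = platter_mb_s * actuators  # ~560 MB/s for the whole drive

sata_usable_mb_s = 6_000 / 8 * 0.8        # ~600 MB/s usable out of 6 Gb/s

print(combined_mb_s, sata_usable_mb_s)    # 560 vs 600.0 -> right at the interface limit
```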

With that mentioned, I wonder how far they will take this?
 
Last edited:

XavierMace

Diamond Member
Apr 20, 2013
4,307
450
126
With two actuators and 1.8 TB PMR platters (280 MB/s each), this should saturate SATA 6 Gbps.

With that mentioned, I wonder how far they will take this?

You have noticed this is being positioned as a purely enterprise item, right? This is about IOPS, not straight throughput. I don't see this having much of a market on the consumer side, especially since you know there's going to be a cost associated with it.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
You have noticed this is being positioned as a purely enterprise item, right? This is about IOPS, not straight throughput.

They also mentioned sequential in the link:

Nonetheless, the Hyperscale community is unanimous: they want hard drives to continue to achieve optimal IOPS/TB, at the lowest cost per TB. Technically speaking, this means the industry must improve Random IOPS at low latencies, and improve Sequential transfer rate.
 
May 11, 2008
22,184
1,403
126
Very happy to see the following being developed: (This originally posted by Elixir)

https://blog.seagate.com/technology/multi-actuator-technology-a-new-performance-breakthrough/




[Image: Seagate Multi Actuator technology conceptual illustration]






With two actuators and 1.8 TB PMR platters (280 MB/s each), this should saturate SATA 6 Gbps.

With that mentioned, I wonder how far they will take this?

Is there not also another technology coming that has two actuators per arm (dual stage)? One is the main pivot, which is electromagnetically controlled, and near the heads is another pivot where the head with its GMR sensors is moved piezoelectrically.

http://toshiba.semicon-storage.com/product/storage/pdf/ToshibaReview_vol66n11.pdf
https://www.hgst.com/sites/default/files/resources/WP_DSA.pdf
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
You still have increased boot times with RAID though, regardless of whether it's an HDD or an SSD.

When I searched for "RAID-0 increasing boot time" I did see comments that if the RAID controller adds to the POST time, then booting in RAID-0 can be slower than booting from a single drive. However, in the following video the RAID-0 hard drives boot faster than a single hard drive:


Samsung SSD 850 Pro 256GB: 13 seconds
2 x WD Black 4TB 7200 rpm RAID-0: 38 seconds
WD Black 4TB 7200 rpm: 46 seconds
Seagate Momentus 500GB 5400 rpm: 54 seconds
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Loading big games into memory and backing up or cloning an HDD are just about the only things most people do that are actually bottlenecked by sequential read/write.

I couldn't find a video, but loading non-gaming applications from RAID-0 hard drives should also be faster than from a single hard drive.

Also (besides the cloning and backup you mention), software install times should be reduced by RAID-0 hard drives.

Here is a video where Sims 3 install time is decreased by RAID-0 hard drives compared to a single hard drive:


Samsung SSD 850 Pro 256GB: 52 seconds
2 x WD Black 4TB 7200 rpm RAID-0: 1 minute 53 seconds
WD Black 4TB 7200 rpm: 4 minutes 38 seconds
Seagate Momentus 500GB 5400 rpm: 7 minutes 4 seconds

P.S. I know you mentioned "most people," but another thing to consider is 4K video editing, which needs RAID-0 with three hard drives (though I figure two of the fastest hard drives in RAID-0 would also work).
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
When I searched for "RAID-0 increasing boot time" I did see comments that if the RAID controller adds to the POST time, then booting in RAID-0 can be slower than booting from a single drive. However, in the following video the RAID-0 hard drives boot faster than a single hard drive:

Interesting. Generally the rule is that RAID 0 increases latency. The increase in latency is bad for really fast drives like NAND SSDs. For Optane it's not worth it at all, because you lose the biggest advantage of getting them.

I did have experience with RAID 0 a while ago. The system seems to add an additional verification stage. On an HDD it isn't entirely surprising that it reduces boot times. But on an SSD? Hardly. Even in the link you gave, the decrease in boot times isn't significant. In the days before good SSDs like the X25-M, I used WD Raptor drives. They ran at 10K RPM and were aimed at enthusiasts. They were preferred over RAID.

Even compared against RAID, you can do far better by going with an SSD, because the sequential speed is much higher, especially on the NVMe ones.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
I don't think that would happen any time soon (re: I don't know how many layers and bits per cell Intel/Micron 3D XPoint can scale to, or how fast; also, hard drives have platter density increases coming in the future through MAMR), but one interesting thing that could happen is the integration of the hard drive and 3D XPoint through the controller (e.g., an SSHD controller using internal platter RAID + 3D XPoint for read cache).

Platters (using internal RAID) = for larger file sizes, optimized for sequential read and write.
3D XPoint (read cache) = for small file sizes (held in 3D XPoint by an algorithm), optimized for random read.

Some info below on how the solid-state read cache works on current NAND-based SSHDs:

https://www.anandtech.com/show/3734/seagates-momentus-xt-review-finally-a-good-hybrid-hdd



P.S. I found out, using the link from this post, that 3D XPoint can actually be designed in at least two different ways:

http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=/netahtml/PTO/srchnum.html&r=1&f=G&l=50&s1="20160276022".PGNR.&OS=DN/20160276022&RS=DN/20160276022



So: a high-endurance layer design with slower access times vs. a low-endurance layer design with faster access times.

So maybe Intel/Micron could deploy chips with low-endurance layers but fast access times as a read cache for hard drives (and other consumer electronics/PCs)?

After writing this post, and in reference to the information I found here, I'm starting to believe Intel and Micron may very well have additional reasons to use the low-endurance but fast-access-time 3D XPoint variant (mentioned in the quote above) if it could be deployed as a small die (in a highly parallel design). This assumes 3D XPoint follows rules similar to NAND as far as performance, endurance, and retention go.

So with the current 3D XPoint die being quite large at 206.5mm2 (128Gb/16GB), perhaps we would see a smaller die at 51.6mm2 (32Gb/4GB) to 103.2mm2 (64Gb/8GB) that, when stacked, could do double duty for DIMMs (a highly parallel small-die application) and secondarily for lesser applications like SSHD cache, IoT, etc.
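
The smaller die sizes above are just linear scaling from the known 206.5mm2 / 128Gb die; a quick check in Python (this ignores fixed periphery area, so real small dies would likely come out a bit larger):

```python
# Linear scaling of the known 206.5 mm^2, 128 Gb (16 GB) 3D XPoint die down to
# smaller capacities. Assumes area scales proportionally with capacity and
# ignores fixed periphery overhead, so real small dies would come out larger.

known_area_mm2, known_gbit = 206.5, 128

for gbit in (32, 64):
    area = known_area_mm2 * gbit / known_gbit
    print(f"{gbit} Gb ({gbit // 8} GB): ~{area:.1f} mm^2")

# 32 Gb (4 GB): ~51.6 mm^2
# 64 Gb (8 GB): ~103.2 mm^2
```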

Package size of Crucial 128GB DDR4 module:


[Image: Crucial 128GB DDR4 LRDIMM module]


vs.

Package size of current Intel Optane Memory:

[Image: Intel Optane Memory M.2 module]


Obviously the current Optane packages (housing the 206.5mm2 128Gb/16GB dies) are much larger than the DDR4 packages. (Do keep in mind, though, when eyeballing the size difference, that the DDR4 module has larger dimensions (31.25mm x 133.35mm) than the M.2 2280 card (22mm x 80mm) used by the Optane Memory.)
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
So assuming 3D XPoint controllers are developed to use many packages of small-die (low endurance but fast access time) 3D XPoint in parallel for datacenter server DIMMs*, I would assume the same controller (with some modifications) could also be used with the large 3D XPoint dies for PCIe SSDs.

* These small dies in the server DIMMs would trade retention time for additional endurance.

In summary:

1.) Server DIMM: Uses a controller with many channels along with small-die (low endurance but fast access time) 3D XPoint, tuned for greater endurance at the cost of reduced retention.

2.) PCIe x8 or PCIe x16 SSD: Uses a controller with many channels along with large-die (high endurance but slow access time) 3D XPoint, and increases the number of bits per cell to gain capacity at the cost of some endurance.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
They can also replace DRAM and SLC caching by using 3D XPoint caching instead.

The limit to that is really whether IMFT is willing to make 3D XPoint dies at 8Gbit sizes. They are at 128Gbit for the smallest one now. It seems manufacturers use older deprecated DDR2 or DDR3 chips for the buffers, so maybe in about a decade we'll see hard drive and SSD vendors use 8GB first generation 3D XPoint chips for buffers, with their 80TB hard drive and 15TB SSDs.

Considering the current 2-layer 128Gb (16GB) die "only" yields 900 MB/s sequential read when used in a 16GB Optane Memory module, I think there would be a case for smaller die sizes in other applications that are more bandwidth intensive.

Example (using the current 2-layer design): 16 x 8Gb (1GB) dies should yield 14.4 GB/s (re: 16 x 900 MB/s), or maybe for more capacity 16 x 16Gb (2GB) dies, which should also yield 14.4 GB/s. 32 x 8Gb (1GB) dies should yield 28.8 GB/s (re: 32 x 900 MB/s).
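
A quick sketch of that arithmetic (it assumes per-die bandwidth stays at the ~900 MB/s the single-die module shows today, and that the controller scales perfectly across dies, neither of which is guaranteed):

```python
# Sketch of the aggregate-bandwidth arithmetic above, assuming each 3D XPoint
# die sustains the ~900 MB/s the single-die 16 GB Optane Memory module shows
# today, and that the controller scales perfectly across dies.

per_die_mb_s = 900

for dies, die_gb in [(16, 1), (16, 2), (32, 1)]:
    agg_gb_s = dies * per_die_mb_s / 1000
    print(f"{dies} x {die_gb} GB dies: ~{agg_gb_s:.1f} GB/s aggregate, {dies * die_gb} GB total")

# 16 x 1 GB dies: ~14.4 GB/s aggregate, 16 GB total
# 16 x 2 GB dies: ~14.4 GB/s aggregate, 32 GB total
# 32 x 1 GB dies: ~28.8 GB/s aggregate, 32 GB total
```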

(Would be very interesting IMO for a wearable small-form-factor gadget needing decent bandwidth (graphics, etc.) and battery life*.)

*I reckon the storage also serving as the memory should help with this (all things being equal) and allow the designs to use smaller batteries.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
What they can do is rather than having regular hard drives with 256MB DRAM-based buffers, they can move to a 3D XPoint based buffer at 1-2GB.

Also consider Seagate uses around 1GB of SLC NAND (in addition to DRAM) for multi-tier caching on the Barracuda 2.5" drives:

https://www.anandtech.com/show/1075...5-mobile-hard-drives-with-up-to-5-tb-capacity

All the new BarraCuda 2.5” HDDs feature 128 MB of DRAM cache as well as multi-tier caching (MTC) technology, which is designed to hide peculiarities of SMR. Hard drives featuring shingled recording write new magnetic tracks that overlap part of the previously written tracks. This may slow down the writing process since the architecture requires HDDs to rewrite adjacent tracks after any writing operation. To “conceal” such peculiarities, Seagate does a number of tricks. Firstly, it organizes SMR tracks into bands in a bid to limit the amount of overwriting. Secondly, the MTC technology uses several bands of PMR tracks on the platters, around 1 GB of NAND flash cache as well as DRAM cache. When workloads generate relatively small amount of writes, the HDD writes data to NAND and/or to the PMR tracks at a predictable data rate. Then, during light workloads or idle time, the HDD transfers written data from the caches to SMR tracks, as described by Mark Re (CTO of Seagate) earlier this year.
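
A rough, purely illustrative sketch of that multi-tier write path (this is not Seagate's actual firmware logic, just my reading of the description above):

```python
# Purely illustrative sketch of the multi-tier caching flow described above:
# small write bursts land in the ~1 GB NAND cache (and/or PMR cache bands),
# then get destaged to the SMR region during idle time. Not Seagate's actual
# firmware logic, just a reading of the description.

class MultiTierCacheSketch:
    def __init__(self, nand_cache_mb=1024):
        self.nand_free_mb = nand_cache_mb  # ~1 GB of NAND cache per the article
        self.pending_mb = []               # writes staged for later SMR rewrite

    def write(self, size_mb):
        if size_mb <= self.nand_free_mb:
            # Small burst: absorb it in cache at a predictable data rate.
            self.nand_free_mb -= size_mb
            self.pending_mb.append(size_mb)
        else:
            # Large stream: no room to stage it, write straight to SMR bands.
            self._write_to_smr(size_mb)

    def idle(self):
        # During light workloads or idle time, move cached data to SMR tracks.
        for size_mb in self.pending_mb:
            self._write_to_smr(size_mb)
            self.nand_free_mb += size_mb
        self.pending_mb.clear()

    def _write_to_smr(self, size_mb):
        pass  # rewriting the overlapped shingled tracks would happen here


drive = MultiTierCacheSketch()
drive.write(200)   # small write: cached in NAND
drive.write(4000)  # large write: goes straight to SMR
drive.idle()       # destage the cached 200 MB to SMR tracks
```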

P.S. Seagate desktop drives are now starting to use SMR as well ---> https://forums.anandtech.com/threads/what-consumer-hard-drives-have-smr-platters.2525313/
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Considering the current 2-layer 128Gb (16GB) die "only" yields 900 MB/s sequential read when used in a 16GB Optane Memory module, I think there would be a case for smaller die sizes in other applications that are more bandwidth intensive.

Huh? 16GB Optane Memory is a 1-layer, 1-channel device.

They don't need to use multiple dies to achieve higher bandwidth. Flash has ONFI standards that increase bandwidth per die. Future 3D XPoint dies can have more bandwidth per chip, if IMFT decides to.


You know what is interesting? QLC SSDs are supposed to be dirt cheap. They are talking about $100 for the 512GB 660p. It could be a nice pairing with Optane Memory. Of course, you can't use NVMe drives with current Optane Memory, but maybe the newer ones coming in two weeks, or updated drivers, will allow that.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106

They don't need to use multiple dies to achieve higher bandwidth. Flash has ONFI standards that increase bandwidth per die. Future 3D XPoint dies can have more bandwidth per chip, if IMFT decides to.

Yep, that is true if IMFT decides to increase the number of planes per layer... or they could keep the number of planes per layer the same and make the dies smaller.

With that mentioned, I believe IMFT's primary goal in the beginning is high-capacity DIMMs for in-memory computing... and getting to 512GB is going to take 32 x 16GB dies, which (at the current number of planes per layer) should be plenty to hit the necessary bandwidth for DDR4 (re: 32 x 900 MB/s = 28.8 GB/s).

But what about 256GB or 128GB 3D XPoint DIMMs? Do they make smaller dies for these? Or do they increase the planes per layer so they can build these smaller-capacity DIMMs with fewer dies and still hit the necessary bandwidth?
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
But what about 256GB or 128GB 3D XPoint DIMMs? Do they make smaller dies for these? Or do they increase the planes per layer so they can build these smaller-capacity DIMMs with fewer dies and still hit the necessary bandwidth?

It won't need a lot. Early projections had bandwidth at less than half that of DDR4 on Purley.

Optane DIMM setups require the system to have some DDR4 modules for endurance and write caching. I think it'll be 2 slots for DRAM and 6 for Optane DIMMs per CPU, but that's my guess based on system pictures.
 
Last edited: