Architecturally, the SM2256 shares the same core design as its predecessor, the SM2246EN. The design is modular, which allows Silicon Motion to change parts of the controller without redoing the rest. It features the same single 32-bit Argonaut RISC processor core as the SM2246EN, which is unusual because many SSD controller vendors are moving towards multi-core ARM architectures. A single custom core brings efficiency gains, and we've witnessed those in the SM2246EN, but the downside of such limited CPU power is weaker sustained performance when the controller has to perform garbage collection at the same time as processing host IOs.
The only dramatic change is in the error correction circuitry, as the SM2256 supports Low Density Parity Check (LDPC) error correction codes instead of the more common and less powerful BCH ECC. Silicon Motion calls its ECC technology NANDXtend, and it's a combination of LDPC hard decode, LDPC soft decode and RAID5-like data recovery. The benefit of having three levels of ECC is performance: LDPC soft decode and recovery from parity both carry a noticeable performance cost, but they are typically only needed when the drive approaches its end of life (i.e. when the NAND has been cycled a lot). Uncycled NAND has much higher reliability because the tunnel oxide hasn't yet been worn out by P/E cycles, so very little error correction is needed and LDPC hard decode is sufficient without a dramatic impact on performance.
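To make the three-level idea concrete, here is a minimal Python sketch of an escalating read path. It is not Silicon Motion's implementation: the correction limits, the function names and the idea of modeling a decoder as a simple error-count threshold are all assumptions made for illustration. The only part that reflects a well-known technique directly is the XOR rebuild, which is how RAID5-style parity recovers a lost chunk.

```python
# Hypothetical correction limits in raw bit errors per codeword.
# These numbers are illustrative only, not SM2256 specifications.
HARD_LIMIT = 40
SOFT_LIMIT = 120


def read_codeword(raw_errors: int, parity_available: bool = True) -> str:
    """Return which ECC level ends up serving the read.

    Each level is tried only if the cheaper one before it fails,
    so fresh NAND with few errors never pays for the slow paths.
    """
    if raw_errors <= HARD_LIMIT:      # level 1: LDPC hard decode (fast)
        return "hard"
    if raw_errors <= SOFT_LIMIT:      # level 2: LDPC soft decode (extra reads)
        return "soft"
    if parity_available:              # level 3: rebuild from parity (slowest)
        return "parity"
    return "uncorrectable"


def xor_rebuild(surviving_chunks: list[bytes], parity: bytes) -> bytes:
    """Rebuild one lost chunk from the surviving chunks and XOR parity."""
    out = bytearray(parity)
    for chunk in surviving_chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


if __name__ == "__main__":
    for errors in (5, 80, 300):
        print(errors, "raw errors ->", read_codeword(errors))
```

With these assumed limits, a lightly worn codeword is served by hard decode alone, and the slower levels only come into play as the raw error count climbs.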
The reason hard decode is faster than soft decode lies in how the voltage of a cell is sensed. Hard sensing is binary: for an SLC cell like the one in the graph above, the cell is read as either 1 or 0. However, as you can see, the threshold voltage distributions overlap slightly, and the overlap is far worse with MLC and TLC since there are more voltage states. In soft sensing the voltage distributions are divided into several segments, which requires more precision and more read iterations. For example, in segment 4 the bit value can be either 1 or 0 as the distributions overlap, so probability algorithms are used to figure out the correct value. To be honest, error correction codes and the way they work are way over my head, but in case you are familiar with ECC and want to learn more, I suggest you simply search for LDPC, as there are numerous publicly available academic papers that go into more depth on this topic.
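The difference between hard and soft sensing can be sketched in a few lines of Python. This is a toy model, not how the SM2256 works internally: it assumes an SLC cell whose two states are Gaussian distributions with made-up means and spread, and the reference voltages are hypothetical. Hard sensing compares against a single reference, while soft sensing places the cell in a segment and attaches a log-likelihood ratio (LLR), the kind of confidence value an LDPC soft decoder consumes.

```python
import math
from bisect import bisect_right

# Assumed SLC model: bit 1 (erased) centred at 1.0 V, bit 0 (programmed)
# at 3.0 V, both with the same spread. Values are illustrative only.
MU_ONE, MU_ZERO, SIGMA = 1.0, 3.0, 0.6


def cdf(x: float, mu: float) -> float:
    """Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (SIGMA * math.sqrt(2.0))))


def hard_sense(v: float, ref: float = 2.0) -> int:
    """One read reference: the cell is simply 1 or 0."""
    return 1 if v < ref else 0


def segment_of(v: float, refs: list[float]) -> int:
    """Several read references split the voltage axis into segments."""
    return bisect_right(refs, v)


def segment_llr(seg: int, refs: list[float]) -> float:
    """log(P(segment | bit 1) / P(segment | bit 0)).

    Large positive: almost certainly 1. Large negative: almost certainly 0.
    Near zero: the distributions overlap and the decoder must rely on parity.
    """
    lo = refs[seg - 1] if seg > 0 else -math.inf
    hi = refs[seg] if seg < len(refs) else math.inf
    p_one = cdf(hi, MU_ONE) - cdf(lo, MU_ONE)
    p_zero = cdf(hi, MU_ZERO) - cdf(lo, MU_ZERO)
    return math.log(p_one / p_zero)


if __name__ == "__main__":
    refs = [1.4, 1.8, 2.2, 2.6]  # hypothetical soft-read references
    for v in (0.9, 1.9, 2.1, 3.2):
        seg = segment_of(v, refs)
        print(f"{v:.1f} V  hard={hard_sense(v)}  segment={seg}  "
              f"LLR={segment_llr(seg, refs):+.2f}")
```

In this model, cells at 1.9 V and 2.1 V get opposite hard decisions but land in the same middle segment with an LLR close to zero, which is exactly the ambiguous case where the extra reads of soft sensing pay off.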
Silicon Motion claims that its NANDXtend technology can extend the endurance of TLC NAND by up to three times, making TLC more robust for heavier workloads and also allowing the use of lower quality NAND that some OEMs may use anyway due to the lack of in-house binning equipment. Unfortunately, I haven't yet had time to do extended endurance testing with the SM2256 to validate Silicon Motion's claims, but I will be sure to test that once we have a retail drive in our hands.