Monday, August 23, 2010

What is flash cache?

A Flash cache works much like the SRAM caches that are designed to speed up DRAM access times; a Flash cache speeds access to HDDs in an analogous manner. Data is read from the HDD as needed, and the retrieved data is cached in NAND Flash. The next time that data is needed, it's read directly from the cache instead of the slower HDD. Flash caches do not require as much NAND Flash memory as SSDs, and therefore cost less, but they can deliver significant performance improvements when paired with HDDs. In fact, the effective performance of a Flash cache paired with an HDD can actually exceed that of an SSD.

(Note: It's also possible to use DRAM to cache HDD data, but DRAM is more expensive than NAND Flash for equivalent capacity, and DRAM provides only volatile storage unless you add a backup battery. For these reasons, NAND Flash is the better choice for an HDD cache.)
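The read path described above (check the cache first, fall back to the HDD on a miss, then keep a copy) can be sketched in a few lines of Python. This is only a toy model: the class name, the dictionary standing in for the HDD, and the least-recently-used eviction policy are all assumptions made for illustration, not details of any real Flash cache.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Toy model of a Flash cache in front of an HDD (all names hypothetical)."""

    def __init__(self, backing_store, capacity_blocks):
        self.backing_store = backing_store      # stands in for the HDD
        self.capacity_blocks = capacity_blocks  # stands in for the Flash size
        self.blocks = OrderedDict()             # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            # Fast path: the block is already in the cache.
            self.hits += 1
            self.blocks.move_to_end(block_no)   # mark as most recently used
            return self.blocks[block_no]
        # Slow path: fetch from the "HDD" and keep a copy in the cache.
        self.misses += 1
        data = self.backing_store[block_no]
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity_blocks:
            # Assumed policy: evict the least recently used block.
            self.blocks.popitem(last=False)
        return data


# A 1000-block "disk" with a small working set of four blocks.
hdd = {n: f"block-{n}" for n in range(1000)}
cache = ReadThroughCache(hdd, capacity_blocks=8)
for _ in range(10):
    for n in (1, 2, 3, 4):
        cache.read(n)
print(cache.hits, cache.misses)  # prints "36 4"
```

Only the first pass over the four blocks touches the slow store; every later read is served from the cache, which is the whole point of the technique.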

Using a faster memory technology as a cache for a slower but cheaper one is a common technique in computer design. Designers have always faced memory access-time problems, and caching is a standard solution. If the typical working set is a small fraction of the total HDD capacity, then a cache that holds that working set will make the HDD appear to be as fast (or almost as fast) as NAND Flash memory, resulting in a dramatic improvement in application performance.
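To see why a cache that holds the working set makes the HDD look almost as fast as Flash, it helps to work out the average access time as a weighted mix of hits and misses. The sketch below uses the simplest possible model, and the latency figures (0.1 ms for Flash, 10 ms for an HDD) are round illustrative assumptions, not measurements from any particular device.

```python
def effective_access_time(hit_rate, cache_time, disk_time):
    """Average access time for a simple hit/miss model.

    Hits cost cache_time; misses cost disk_time. The cost of checking
    the cache on a miss is ignored to keep the model minimal.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_time + (1.0 - hit_rate) * disk_time


FLASH_MS = 0.1   # assumed Flash read latency, for illustration only
HDD_MS = 10.0    # assumed HDD access latency, for illustration only

for hit_rate in (0.0, 0.5, 0.9, 0.95, 0.99):
    t = effective_access_time(hit_rate, FLASH_MS, HDD_MS)
    print(f"hit rate {hit_rate:.2f}: {t:.3f} ms average")
```

Under these assumed numbers, the average only approaches the Flash latency once nearly all accesses hit the cache, which is why the size of the working set relative to the cache matters so much.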

Adding a cache can deliver significant performance gains for I/O-intensive workloads, but it's critical to make the cache invisible to the application to avoid rewriting the application code. You make a Flash cache invisible by integrating it tightly into the operating system and the file system. This matters because the application then never has to decide what goes where. By contrast, when a system employs a mix of HDDs and faster, Flash-based SSDs, application code must explicitly manage code and data placement in storage; that is not the case when the Flash memory is configured as a cache. If you can write or rewrite an application so that it explicitly controls where data is stored, then a mix of SSDs and HDDs can be used effectively. A NAND Flash cache used to accelerate HDD performance solves a more common problem: the one ingrained in all existing application programs that were not written for an explicit SSD/HDD storage hierarchy.

The question is: Is there a practical working set that's a small subset of a computer's total disk capacity? Intel's Amber Huffman presented some very interesting data in 2008. Intel tracked five employee power users and observed how they used data over successive time periods. Four out of five of these power users used no more than 6 Gbytes of data for a working set in a typical 10-hour work period. A 6-Gbyte NAND Flash cache is easily and economically achievable today; it's not an expensive amount of NAND Flash memory. With the right parallelism designed into the cache, you can get the required access time, throughput, and capacity to make a huge improvement in application performance by masking the HDD's access time with a relatively small Flash cache.

Here's a different example from the Enterprise world that demonstrates the advantages of using Flash memory to cache HDD storage. Pliant Technology, a vendor of high-speed Flash Enterprise SSDs, studied a typical data warehouse. The company compared high-end disk arrays composed of fast, enterprise-class, short-stroked HDDs against a hybrid array of four SSDs and many low-cost HDDs (not short-stroked). Pliant's hybrid drive array dramatically increased available disk capacity and performance versus the conventional short-stroked HDD array. The disk capacity per rack shelf increased by almost an order of magnitude, while the IOPS performance increased 6.5x.

Note that the cost per rack shelf also increased significantly, but this increase was compensated by a corresponding decrease in the number of shelves required for storage. The key figures of merit for this example:

  • total storage-system cost decreased by 50%
  • cost per IOPS decreased 50%
  • the cost per gigabyte of storage improved, and
  • the hybrid disk array required one eighth the power to operate and cool compared to the array of fast, short-stroked HDDs, nearly an order of magnitude improvement in power consumption.
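The trade-off behind these figures (each shelf costs more, but far fewer shelves are needed) is simple multiplication. The sketch below shows the arithmetic with made-up factors; the 4x and 1/8 values are hypothetical placeholders chosen only to reproduce a 50% reduction, not figures from Pliant's study.

```python
def total_cost_ratio(shelf_cost_factor, shelf_count_factor):
    """Ratio of new total cost to old, assuming total = shelves * cost per shelf."""
    return shelf_cost_factor * shelf_count_factor


# Hypothetical inputs, NOT taken from the study:
# each shelf costs 4x as much, but only 1/8 as many shelves are needed.
ratio = total_cost_ratio(shelf_cost_factor=4.0, shelf_count_factor=1 / 8)
print(f"new total cost is {ratio:.0%} of the old total")
```

Any pair of factors whose product is 0.5 gives the same halving of total cost, so a large rise in per-shelf cost is easily absorbed when capacity per shelf grows enough.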

So NAND Flash memory used as a disk cache, whether for low-end applications or enterprise installations, shows great promise.
