Exploring Memory Access Alternatives
The application software products we’ve mentioned so far are usually ROM-based. They do not need to boot from a hard disk, because the host device does not usually have one. This makes the software hard (actually, impossible) to reconfigure remotely over the air, but it helps to protect the OS from virus infection and means that the software starts more or less instantaneously. The ROM-based OS needs to talk to localized and distributed memory within the device. The time taken to fetch data from memory and then act on that data in part determines the delay and delay variability budget. Table 6.2 shows a typical 16-bit microcontroller (from Hitachi) with on-board RAM. You can reduce the clock speed, but only at the cost of increasing the instruction cycle time. Similarly, you could reduce the operating voltage (which is higher than you would want in a digital cellular handset), but this will again lengthen the cycle time.
The problem with memory access is that access speed is improving by only about 7 percent per year, whereas raw processor speed is rising at about 60 percent per year. It’s not the memory or the processor that’s the problem; the problem is the interface between the two devices. The answer is to embed the memory in a system-on-chip solution. However, you then need to decide where to put the memory: with the DSP, with the microcontroller, or, as is usually the case, with both, in which case you need to optimize the intercommunication between the microcontroller and DSP.
In terms of organizing memory for maximum performance (minimum delay and delay variability), the general rule of thumb is to have the fast-access storage cells on chip and the relatively slow cells in DRAM, and then to work out what should be where and when. This is known in the industry as algorithms of probability and locality. It also becomes important to throw things away when they are no longer needed, a bit like good housekeeping. This is referred to in the industry as garbage management.
The difficulty is that the performance problem that has always existed for off-chip memory access is beginning to reappear for on-chip memory. The solution is to have processors that hide memory access delays by multithreading, that is, handling several tasks at once and switching between them each cycle. Essentially this means that we have a memory real-time operating system that needs to coexist with the microcontroller real-time operating system that, in turn, needs to coexist with the DSP real-time operating system.
Infineon Technologies has tried to bridge the divide that is beginning to open up in terms of design tools and design rules in each of these separate areas. Table 6.3 illustrates an example of a product sampled to the 3G handset design community in the late 1990s that combined a DSP and microcontroller core with Flash, RAM, and ferroelectric random access memory (FRAM). This was a 500-MIPS device.
In practice, it has become necessary to have at least 1000 MIPS available. The selling proposition for TriCore is that DSP, memory, and microcontroller functions are defined by a common software development environment, which in turn can take advantage of new technologies like FRAM. Table 6.3 also shows the gate density and performance benefits realizable from decreasing device geometry from 0.35 micron to 0.18 micron.
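The two improvement rates quoted earlier (about 7 percent per year for memory access, about 60 percent per year for raw processor speed) compound, which is why the interface between processor and memory becomes the bottleneck so quickly. The following is a minimal sketch of that arithmetic; the function name and the year horizons are illustrative assumptions, and only the two growth rates come from the text.

```python
# Compound the two annual improvement rates quoted in the text.
MEMORY_GROWTH = 0.07      # memory access speed: about 7 percent per year
PROCESSOR_GROWTH = 0.60   # raw processor speed: about 60 percent per year


def speed_gap(years: int) -> float:
    """Return the factor by which the processor has pulled ahead of memory.

    Assumes both rates stay constant and compound annually, which is a
    simplification of the trend described in the text.
    """
    return ((1 + PROCESSOR_GROWTH) / (1 + MEMORY_GROWTH)) ** years


if __name__ == "__main__":
    # Illustrative horizons only; they are not taken from the text.
    for years in (1, 3, 5, 10):
        print(f"after {years:2d} years the gap has grown {speed_gap(years):6.1f}x")
```

Under these assumptions the relative gap grows roughly sevenfold in five years, which is the pressure that pushes memory on chip in the first place.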
FRAM is a really useful memory product. It is not as dense as DRAM or Flash but is low power, and it will survive about 10 trillion read/write cycles. In addition, these devices are sometimes described as persistent, or nonvolatile, storage devices, which means they do not lose their memory when the handset’s battery goes flat, and they have about a 10-year data retention. They also provide fast read, write, and bit-level erase capability. Essentially, you can think of such devices as solid-state hard disks, since both exploit ferroelectric and magnetic effects to provide storage. About 20 times faster than EEPROM, FRAM is beginning to appear both as a standalone product and on smart cards.
Hitachi offers a range of products optimized for the storage and redelivery of multimedia files. These devices come in 16-, 32-, 64-, and 128-Mbyte packages and use interleaving (the simultaneous writing of two or more Flash memories) to deliver write speeds of 2 Mbytes per second and read speeds of 1.7 Mbytes per second. The write time for 500 kbytes of image data from a 3-megapixel digital camera is about 0.25 seconds. This highlights the importance of memory bandwidth performance and, specifically, memory delivery bandwidth performance.
Most of the focus for portable products has been solid-state memory, but it is also worth considering parallel developments in miniature disk device storage. The pervasiveness of laptop PCs has greatly improved the mechanical robustness of hard disk drives. Micro-miniaturization techniques have also made possible miniature disk drives that are both space- and power-efficient and that offer huge amounts of storage bandwidth. Miniature disk drives (fitting within a Type III 10.5-mm form factor PC card) have been available since 1992 and have increased over the past 10 years from providing a few Mbytes of storage to a few Gbytes. Type III card devices today are capable of storing 15 Gbytes.
In 1999, Type II PC card devices (5 mm thick) became available using magnetoresistive heads with a read density of 8500 tracks per inch and offering about a 10 times reduction in storage cost compared to solid state. The example shown in Figure 6.3 is an IBM 1-Gbyte Microdrive, a hard disk drive in a CompactFlash Type II PC card format. In terms of storage bandwidth, this is sufficient to store 1000 high-resolution photos, 12 music CDs, or 1000 novels. The device delivers a 4.2 Mbytes per second transfer rate, which is over twice as fast as solid-state Flash, and a 1 in 10^13 bit error rate. It weighs 16 grams, so it is not too implausible as an add-in product to a digital cellular handset, which typically weighs 80 grams. (The hamster is not included.)
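The storage figures above can be cross-checked with some simple arithmetic. The sketch below reads the quoted Flash and Microdrive rates as Mbytes per second, since that is the reading that reproduces the 0.25-second write time for a 500-kbyte image. The function names, the decimal unit convention, and the assumption that the rates are sustained are all illustrative choices rather than anything the text specifies.

```python
# Transfer-time arithmetic for the storage figures quoted in the text.
# Assumptions: decimal units (1 Mbyte = 1000 kbytes, 1 Gbyte = 1000 Mbytes)
# and quoted rates treated as sustained throughput in Mbytes per second.

FLASH_WRITE_MBYTES_PER_S = 2.0   # interleaved Flash write speed
MICRODRIVE_MBYTES_PER_S = 4.2    # Microdrive transfer rate
IMAGE_KBYTES = 500               # image data from a 3-megapixel camera


def transfer_seconds(size_kbytes: float, rate_mbytes_per_s: float) -> float:
    """Time to move size_kbytes at the given sustained rate."""
    return (size_kbytes / 1000) / rate_mbytes_per_s


def mbytes_per_item(capacity_gbytes: float, items: int) -> float:
    """Average storage per item when `items` fill the whole device."""
    return capacity_gbytes * 1000 / items


if __name__ == "__main__":
    flash = transfer_seconds(IMAGE_KBYTES, FLASH_WRITE_MBYTES_PER_S)
    disk = transfer_seconds(IMAGE_KBYTES, MICRODRIVE_MBYTES_PER_S)
    print(f"Flash write of one image:      {flash:.3f} s")
    print(f"Microdrive transfer of one:    {disk:.3f} s")
    print(f"Implied size per photo/novel:  {mbytes_per_item(1, 1000):.1f} Mbytes")
    print(f"Implied size per music CD:     {mbytes_per_item(1, 12):.1f} Mbytes")
```

The per-item sizes are simply the 1-Gbyte capacity divided by the item counts given in the text, so they indicate the average size the text implies for each item and nothing more.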