New Trade-Offs in Advanced Chip Design
Device design begins with the anticipated workload. What is it actually supposed to do? What resources — computational units, memory, sensors — are available?
Answering these questions and developing the functional architecture are the first steps in a new design — well before committing it to silicon, said Tim Kogel, senior director of technical product management at Synopsys. Yet even these early decisions begin to constrain the physical architecture.
With a model of the proposed functionality, planners can begin to ask ‘what if’ questions. Does increasing on-chip memory improve performance enough to justify the increased cost and silicon area? What type of GPU is the best match for the anticipated workload? Tools like Synopsys’s Platform Architect can incorporate estimated performance metrics, even if detailed benchmarks are not yet available for the expected technology. At a given device density, what is the likely power consumption of a memory or logic unit of a given size? What’s the expected connection distance between adjacent circuit blocks? How much memory bandwidth does each of the potential interconnection options provide?
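The kind of "what if" comparison described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how Platform Architect or any other tool works. The `Candidate` fields, the area budget, and every number below are hypothetical placeholders standing in for early estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One architecture option with rough, pre-silicon estimates."""
    name: str
    sram_mb: int      # on-chip memory size (hypothetical)
    area_mm2: float   # estimated silicon area (hypothetical)
    rel_perf: float   # estimated performance relative to the baseline
    rel_cost: float   # estimated cost relative to the baseline


def rank_candidates(candidates, max_area_mm2):
    """Drop options that exceed the area budget, then sort the rest
    by estimated performance per unit cost, best first."""
    feasible = [c for c in candidates if c.area_mm2 <= max_area_mm2]
    return sorted(feasible, key=lambda c: c.rel_perf / c.rel_cost, reverse=True)


# Hypothetical estimates, invented purely for illustration.
candidates = [
    Candidate("baseline", 8, 100.0, 1.00, 1.00),
    Candidate("more_sram", 16, 112.0, 1.18, 1.10),
    Candidate("max_sram", 32, 135.0, 1.25, 1.35),
]

ranked = rank_candidates(candidates, max_area_mm2=120.0)
for c in ranked:
    print(f"{c.name}: perf/cost = {c.rel_perf / c.rel_cost:.3f}")
```

With these made-up inputs, the largest memory option is excluded by the area budget, and the mid-sized option ranks ahead of the baseline because its estimated performance gain outweighs its estimated cost increase. Real estimates could easily reverse that ordering.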
High-level integration decisions also need to be made early in the process. Will a design be realized in planar silicon, by stacking layers at the chip level, or through package integration? “Different connection technologies are knobs we have to play with,” said Keith Lanier, technical product management director at Synopsys. “Placing a memory module on top of a logic module will produce a shorter interconnection distance, but a higher power density than two separate packages.”
3D packages, 3D chips
Broadly speaking, the major division in 3D integration is between hybrid and monolithic approaches. As Macronix International’s F. M. Lee defined it, hybrid integration is a packaging technology. Complete functional chiplets are assembled into a single package, usually with the help of through-silicon vias (TSVs), interposers, and redistribution layers.[1]
In theory, at least, the individual chiplets can come from different vendors and they can be combined in different ways. Hybrid integration does not constrain the individual chiplets, but TSVs consume a lot of silicon area, limiting the available interconnection bandwidth.
Monolithic integration, in contrast, builds two or more active device layers on a single substrate, using conventional integrated circuit metallization to connect them. Depending on the materials involved, the component layers might be grown in situ on the destination wafer or transferred from a growth substrate. For example, the Macronix work, in collaboration with TSMC, incorporated an indium-gallium-zinc oxide (IGZO) memory array into a standard BEOL interconnect structure, then placed a MoS2-based image sensor on top. Monolithic integration assumes that further processing will take place. That is, after the second (or third) active layer is grown or transferred, the resulting wafer still needs to be compatible with further lithography and interconnect steps. Layer growth and transfer processes therefore must meet the same flatness and cleanliness specifications as other wafer process steps.
Many early monolithic integration demonstrations have relied on layer transfer techniques. (The layer transfer approach is sometimes described as sequential integration.) When the individual layers are fabricated separately, their individual processes have more flexibility. Even in silicon, some processes are simply easier on separate wafers. For instance, etching through the top transistor and interlayer dielectric (ILD) to make contact with the bottom transistor in a CFET is very difficult. Layer transfer can separate the top and bottom devices and their related contact etch steps, simplifying both.
It’s important to remember, though, that each such layer was originally grown on a prime silicon wafer. Using two, three, or more silicon wafers to create a single finished device wafer imposes a very large cost burden. If layer transfer schemes are to be financially viable, processes that release the transferred layer without grinding and allow growth wafer reuse will be essential.
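The cost argument can be made concrete with a small amortization sketch. The function name, the cost units, and the reuse counts are all hypothetical. The model counts only starting-substrate cost and ignores processing, yield, and any reclaim or refurbishment cost for reused growth wafers.

```python
def substrate_cost_per_device_wafer(destination_wafer_cost,
                                    growth_wafer_cost,
                                    transferred_layers,
                                    reuses_per_growth_wafer=1):
    """Starting-substrate cost for one finished device wafer.

    reuses_per_growth_wafer=1 models a grind-back flow, in which each
    growth wafer is consumed by a single transfer. Larger values model
    a release process that lets the growth wafer be used again.
    """
    if reuses_per_growth_wafer < 1:
        raise ValueError("reuses_per_growth_wafer must be at least 1")
    amortized_growth_cost = growth_wafer_cost / reuses_per_growth_wafer
    return destination_wafer_cost + transferred_layers * amortized_growth_cost


# Arbitrary cost units: every prime wafer costs 100, two layers are transferred.
no_reuse = substrate_cost_per_device_wafer(100.0, 100.0, 2)
with_reuse = substrate_cost_per_device_wafer(100.0, 100.0, 2,
                                             reuses_per_growth_wafer=10)
print(no_reuse, with_reuse)
```

Under these assumed numbers, consuming each growth wafer triples the substrate cost of the finished wafer, while ten reuses per growth wafer bring the overhead down to 20%.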
In other cases, it may not be possible to grow the desired material directly on silicon at all. Growth of 2D semiconductors typically requires very high temperatures and uses substrates like sapphire and quartz. The layer usually must be transferred to an intermediate carrier wafer first, then to the destination substrate. Luke Prenger, senior research associate for temporary bonding materials at Brewer Science, emphasized that the design of the temporary adhesive used for such a transfer is critical. The adhesive needs to be stronger than the bond between the transferred layer and its growth substrate, but it still must be easy to remove without leaving residue. Adhesive design also needs to account for the cleaning steps required if residue does remain.
“Depending on what the 2D material is, there can be smaller residues that remain like a blanket over the 2D material,” Prenger said. “Plasma cleaning could be used, but might require a tight processing window. Wet cleaning requires an understanding of what chemicals will affect the specific 2D material used.”
Power, heat, and dielectrics
Though individual component designs may not be finalized until much later, the underlying process technology still informs both performance and cost planning for the overall architecture. For example, in work presented at the 2024 VLSI Technology Symposium, S. Mishra and colleagues at imec compared performance and heat dissipation characteristics between an A10 nanosheet transistor process and an A5 CFET process.
The A5 process roughly doubled the circuit density and dramatically reduced connection distance. However, it also made heat dissipation more difficult. At a constant operating voltage, the A5 devices saw a 9°C temperature rise and a 12% to 15% increase in power density. Reducing the operating voltage to maintain a constant temperature cut the operating frequency by 10%, as expected. Yet thanks to the reduced connection distance and ability to fit more computing elements in the same space, net system throughput still increased by 40%. This kind of information supports PPA (power-performance-area) analysis at the architecture level.[2]
However, designers can’t use information they don’t have. Synopsys’s Kogel emphasized that the process specification needs to expose any tunable parameters.
For instance, Seong Kwang Kim and colleagues at KAIST found that reducing the ILD thickness between the top and bottom layers of a CFET from 1.4µm to 70nm allowed heat from the top device layer to dissipate through the wafer back side.[3] If details like this are to be useful to designers, they must be incorporated into process models at the individual block level.
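A rough way to see why ILD thickness matters is the textbook one-dimensional conduction formula, R = t / (k * A), in which thermal resistance scales linearly with layer thickness. The sketch below is not taken from the KAIST work. The conductivity (roughly that of bulk SiO2), the footprint, and the micrometer-scale starting thickness are assumed values, and a real stack would need a full 3D thermal model.

```python
def conduction_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """1D steady-state conduction resistance R = t / (k * A), in K/W."""
    return thickness_m / (conductivity_w_per_mk * area_m2)


K_ILD = 1.4    # W/(m*K); assumed, roughly bulk SiO2 (thin films are often lower)
AREA = 1e-12   # m^2; hypothetical 1 um x 1 um footprint

r_thick = conduction_resistance(1.4e-6, K_ILD, AREA)  # assumed thick ILD
r_thin = conduction_resistance(70e-9, K_ILD, AREA)    # 70 nm ILD
ratio = r_thick / r_thin

print(f"thick ILD: {r_thick:.3g} K/W")
print(f"thin ILD:  {r_thin:.3g} K/W")
print(f"resistance ratio: {ratio:.1f}x")
```

Because the conductivity and area cancel in the ratio, the relative improvement in this simple model depends only on the two thicknesses. The absolute values depend entirely on the assumed material and geometry.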
Architecture planning considers both standby and dynamic power requirements. Mishra observed that even as overall transistor counts increase, more and more transistors are being turned off to conserve power. In fact, leakage current is increasing faster than dynamic power consumption in modern designs. Increasing leakage means increasing heat, as well. While copper is a good thermal conductor, using “thermal vias” to facilitate heat dissipation makes the already challenging problem of circuit routing even more difficult. Space allotted to thermal vias is not available for device circuitry, and the vias themselves introduce additional capacitance.
Dielectrics are generally poor thermal conductors, but research at TSMC suggested that wide-bandgap materials might be worth considering. Integrating such materials is not easy, though. Bulk AlN, for instance, is a good thermal conductor, but AlN thin films conduct heat less well, and their thermal properties depend strongly on crystal structure. Diamond is commonly used as a heat spreader and has the highest thermal conductivity of any known bulk material. Unfortunately, it also requires very high deposition temperatures and is nearly impossible to pattern.[4]
Designing for future performance
Further process evolution is unlikely to make design challenges easier. As Kogel said, the functional architecture, physical architecture, and system architecture are designed by three different groups that historically haven’t communicated with each other. While machine learning tools are still relatively primitive, early results suggest they can help. They increase the amount of data that analytical tools can consider, and therefore the size of problems where they can assist. As they become more powerful, they may be able to help designers consider several layers of the design hierarchy at once.
References
[1] F. M. Lee et al., “3D Monolithically Integrated Device of Si CMOS Logic, IGZO DRAM-like, and 2D MoS2 Phototransistor for Smart Image Sensing,” 2023 International Electron Devices Meeting (IEDM), San Francisco, CA, USA, 2023, pp. 1-4, doi: 10.1109/IEDM45741.2023.10413766.
[2] S. Mishra et al., “Thermal Considerations for Block-Level PPA Assessment in Angstrom Era: A Comparison Study of Nanosheet FETs (A10) & Complementary FETs (A5),” 2024 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), Honolulu, HI, USA, 2024, pp. 1-2, doi: 10.1109/VLSITechnologyandCir46783.2024.10631358.
[3] S. K. Kim et al., “Role of Inter-Layer Dielectric on the Electrical and Heat Dissipation Characteristics in the Heterogeneous 3D Sequential CFETs with Ge p-FETs on Si n-FETs,” 2023 International Electron Devices Meeting (IEDM), San Francisco, CA, USA, 2023, pp. 1-4, doi: 10.1109/IEDM45741.2023.10413845.
[4] W. Y. Woon et al., “Thermal dissipation in stacked devices,” 2023 International Electron Devices Meeting (IEDM), San Francisco, CA, USA, 2023, pp. 1-4, doi: 10.1109/IEDM45741.2023.10413721.