So far in our 3D IC blog series, we've discussed front-end design approaches for developing 3D IC-based devices, the importance of early planning for interconnect verification, and assembly-level layout vs. schematic (LVS) in 3D IC design verification. Today, we focus on the impact 3D IC has on semiconductor device reliability, ranging from understanding thermal interactions within and between semiconductor devices to the associated thermomechanical stress that can lead to warpage, cracking of microbumps and hybrid bonds, and beyond. We will also look at electrical reliability from the aspects of latch-up prevention and ESD protection.
Challenges 3D IC heterogeneous assemblies need to overcome to ensure reliability
Reliability has always been challenging in the integrated circuit (IC) space, but it was manageable with design rule checking (DRC). Now it's harder, and you'll see situations where a die is DRC-clean or LVS-clean but still fails on the manufacturing floor. What's worse is when it works fine on the test bench but fails unexpectedly and prematurely once sold and used in the field. This failure risk leaves IC designers uncertain, particularly at the more advanced nodes.
Issues seen in the IC space are now starting to appear at a higher level in heterogeneous 3D IC environments. Unsurprisingly, these are driven by how devices and wires interact. That doesn't change whether you're connecting things within a chip or across multiple chips or chiplets. The problems are still there, particularly with larger electrostatic discharge (ESD) applications. Other cases include electrical overstress, especially if you're targeting automotive or military-aerospace applications where reliability is critical. You don't want to be on the road when the car suddenly dies in the middle of traffic. Latch-up is a way to get two things inadvertently shorted together, so that they toggle together. It usually comes with a high voltage that enables it, something larger than expected, but these things can and do happen.
When we think of ESD, we think of a human finger: walking around scuffing your feet (especially as the weather gets colder), picking up a charge, and touching a chip somewhere, which dissipates the charge into the chip and blows up the transistors. That's not a good thing and, unfortunately, a human discharge is not the only thing that can cause this. From chip manufacturing to its use in an application, there are many opportunities for an unexpected charge to appear and cause catastrophic events.
For system-on-chip (SoC) designs, we can find these situations. Even in the more complex scenarios, with tools like Calibre PERC, we can find them and apply known solutions to address them. In the case of latch-up, you need a good guard ring and a sufficient number of taps and ties. On the ESD side, it's about ensuring you have enough protection at the pads. With a sudden, unexpected charge, you can give it a fast path to ground instead of letting it go through transistors.
The challenge in 3D IC design is that it differs from an SoC, where everything is in one process and characterized up front. Designers connect things from multiple sources with slightly different behaviors, potentially from one die to the next. Design teams must make decisions to handle some of the differences. For an ESD issue that spans multiple dies, the designer must decide which die to place the fixes in or how to split them across multiple dies. Another example is that the size of the protection devices may change depending on the process of the individual chiplet. Many users already do this for SoC design. The challenge comes with scenarios we don't have to spend much time thinking about today but will need to: thermal and stress.
3D heterogeneous integration and thermal-related reliability issues
Design rules generally address thermal concerns for a single-process SoC. By following design rules, the design stays within relatively well-known performance and should behave. But as we introduce heat, we know the transistor's behavior depends on the operating temperature.
As the heat increases, the electrons and holes you're trying to push through get heated, and that heat dissipates to the neighboring environment. To help reduce the impact, you move the heat around a little bit and, in the process, toggle a transistor. The faster you toggle that transistor, or the more current you try to push to make that happen, the hotter it will get.
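The link between toggle rate and heat can be made concrete with the standard first-order estimate of CMOS switching power, P = a * C * V^2 * f. The sketch below is illustrative only: the function name and every number in it are assumptions chosen for the example, not values from any particular process or from this series.

```python
def dynamic_power_watts(activity, capacitance_f, vdd_v, freq_hz):
    """First-order CMOS switching power: P = activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical block: 1 nF of switched capacitance at 0.8 V with 20% activity.
slow = dynamic_power_watts(0.2, 1e-9, 0.8, 1e9)  # toggling at 1 GHz
fast = dynamic_power_watts(0.2, 1e-9, 0.8, 2e9)  # toggling at 2 GHz
print(f"1 GHz: {slow:.3f} W, 2 GHz: {fast:.3f} W")
```

In this simplified model, doubling the toggle frequency doubles the power that has to be removed as heat, and raising the supply voltage costs quadratically.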
- Transistors that are close to each other often start to heat up together. With an SoC, design rules prevent these situations by not allowing transistors to be too close together, which helps designers avoid the dual-heating problem. With multi-die chiplets and heterogeneous processes, things are different, especially with floor planning.
- For an SoC, floor planning involves laying out the plane and deciding where to put the different components to get the desired behaviors with the optimal settings. For 3D IC, it's not floor planning in the traditional sense because not everything is on the same floor, but the process uses many of the same concepts.
- For example, you want to avoid having things that toggle fast and are frequently used next to similar objects. Designers avoid putting a GPU and CPU close together because they get hot. With 3D designs, designers also avoid putting a GPU on top of a CPU, because heat dissipates not only across the horizontal plane but also up and down through the stack.
On a traditional SoC, the transistors are on the silicon plane, and there's a clear path for how heat is dissipated: it goes through the substrate. When multiple thin dies are stacked up, there could be a high temperature in one that has no place to go. Specifically, the heat will move downward for face-to-face, copper-bonded dies with a top tier that includes a silicon component. The heat has no other place to go but to the transistors on the die below it. With a longer path to an actual thermal escape, such as a heat sink, the temperatures are hot and stay hot longer.
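To see why a longer path to the heat sink keeps a stacked die hotter, here is a minimal one-dimensional sketch that treats the stack as thermal resistances in series. It is a deliberately simplified steady-state model, not a substitute for a real thermal solver; the function name, the resistances, and the power numbers are all hypothetical.

```python
def stack_temperatures(powers_w, resistances_k_per_w, ambient_c=25.0):
    """Steady-state tier temperatures for a 1D series thermal path.

    powers_w[i] is the heat generated in tier i, where tier 0 sits closest
    to the heat sink. resistances_k_per_w[i] is the thermal resistance
    between tier i and the next element toward the sink (tier i - 1, or
    the sink itself for tier 0).
    """
    if len(powers_w) != len(resistances_k_per_w):
        raise ValueError("need exactly one resistance per tier")
    temps = []
    temp = ambient_c
    for i, resistance in enumerate(resistances_k_per_w):
        # Heat from this tier and every tier farther from the sink
        # must all flow through this resistance.
        heat_through = sum(powers_w[i:])
        temp += resistance * heat_through
        temps.append(temp)
    return temps


# Hypothetical two-tier stack: a 5 W die next to the sink, a 2 W die beyond it.
print(stack_temperatures([5.0, 2.0], [1.0, 3.0]))  # [32.0, 38.0]
```

Even though the far die dissipates less power in this example, it ends up hotter because its heat has to cross both resistances, and it also raises the temperature of the die it passes through.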
With all of these new variables, there are many new considerations for 3D IC designs. Mitigating elements, such as through-silicon vias (TSVs) to dissipate heat faster or a through-insulator via that serves as a chimney to centralize heat, take up space and introduce stresses. Temperature is going to affect how electrical behaviors present. The big challenge is that figuring this out can only be done with collaboration.
3D IC designers need to understand the whole system from a multiphysics perspective to know its behavior as the design evolves. It's generally not known up front what the best way to design a 3D IC is; there are many, many more options for how to build a design with all the new dimensions. On the flip side, you now have many more opportunities for your particular design needs, and it's more work to evaluate the suitable options than it used to be.
Thermally induced mechanical stress and 3D IC design reliability
Consider mechanical and physical stress: when you put a heavy piece of concrete on what you thought would be a straight board, the board bends or maybe breaks. Stress and temperature are interrelated, but that is often easy to overlook.
Considering the basics, look at the old ideal gas law: PV = nRT. Pressure, volume, and temperature are all interrelated. We're not necessarily talking about a gas, but as a basic concept, stress increases as temperature increases. As temperatures come down, stresses go down, and vice versa, assuming volume is held constant. With an increase in stress on the transistor, it will behave differently.
For a traditional SoC in the LVS space, we look to post-layout simulation to identify each transistor in context and evaluate the stressors. We capture the stressors as properties on the transistors that we feed into the transistor model for electrical analysis. Temperature creates stress, and stress creates temperature, so we want to identify the stress on a device from thermal impact. The most significant thermal impact comes when the temperature is the hottest, typically in the manufacturing stage for an IC.
In typical advanced-node processes, we conduct a rapid thermal anneal (RTA), which involves heating the wafer to high temperatures for a short time. RTA should improve the crystallinity of the polysilicon and other design elements. But it does induce stress at that point because of the high temperature, and that stress remains. Think of it as little fissures or cracks that you didn't want to be there. These weak points are going to worsen until they eventually fail. For chiplets, if you understand the manufacturing process and get information about the maximum RTA temperature, that's a good place to start.
Material properties are another challenge for both thermal and stress. How much stress something can take depends on what it is: a piece of steel is stronger than a popsicle stick. For 3D IC design, understanding the material and manufacturing process is essential to understanding the historical stress.
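One common way material properties turn a temperature change into stress is a mismatch in thermal expansion between two bonded materials. The sketch below uses the standard elastic thin-film approximation, stress = E / (1 - v) * (CTE_substrate - CTE_film) * dT, which is a textbook simplification and not a method taken from this series. The function name is hypothetical, and the material values are approximate handbook-style figures for copper on silicon, used only for illustration.

```python
def biaxial_thermal_stress_pa(youngs_modulus_pa, poisson_ratio,
                              cte_film_per_k, cte_substrate_per_k, delta_t_k):
    """Elastic biaxial stress in a thin film bonded to a thick substrate.

    Positive results are tensile, negative results are compressive.
    Assumes purely elastic behavior and a substrate much thicker than the film.
    """
    mismatch_strain = (cte_substrate_per_k - cte_film_per_k) * delta_t_k
    return youngs_modulus_pa / (1.0 - poisson_ratio) * mismatch_strain


# Approximate, illustrative values: copper film (E ~ 120 GPa, v ~ 0.34,
# CTE ~ 17e-6 / K) on silicon (CTE ~ 2.6e-6 / K).
heating = biaxial_thermal_stress_pa(120e9, 0.34, 17e-6, 2.6e-6, +100.0)
cooling = biaxial_thermal_stress_pa(120e9, 0.34, 17e-6, 2.6e-6, -100.0)
print(f"heating by 100 K: {heating / 1e6:.0f} MPa")
print(f"cooling by 100 K: {cooling / 1e6:.0f} MPa")
```

Under these assumptions, the copper wants to expand more than the silicon lets it, so heating puts it in compression and cooling puts it in tension. Real structures yield, creep, and have complex geometry, so this estimate only indicates the sign and rough scale of the effect.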
The other part comes from being in a heterogeneous environment, where the stresses come from what's around you. In silicon, you create transistors and some metallization and oxides on top that are well characterized. With a three-dimensional design, there are different processes and connections with metal through a package, TSVs, or microbumps. One of the biggest is ball grid arrays (BGAs): if you're connecting that package onto a board somewhere, the BGAs are large and create a lot of stress. As you can see, we're adding stress in ways that aren't relevant in an SoC, and design teams need to account for these new stress points and handle them appropriately. These affect system behavior or performance, and the system behavior or performance in turn affects the stresses and temperature depending on the frequency. High frequencies will be hotter than low frequencies. Using that information helps predict the thermal impacts on the system and, by combining thermal information with physical information, ultimately determine the stresses. Both change the electrical behavior, potentially resulting in multiple cycles until closure.
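The "multiple cycles until closure" idea can be sketched as a fixed-point iteration: temperature changes the electrical behavior, which changes the power, which changes the temperature again. The toy model below couples only leakage power and temperature and leaves stress out entirely. The function name, the lumped thermal resistance, the rule that leakage doubles every 20 degrees, and all numbers are assumptions made for illustration, not a description of any tool's flow.

```python
def solve_electrothermal(p_dynamic_w, leak_ref_w, theta_k_per_w,
                         ambient_c=25.0, leak_doubling_c=20.0,
                         limit_c=150.0, tol_c=0.01, max_iters=100):
    """Iterate power and temperature until they stop changing.

    Returns (temperature_c, total_power_w, iterations).
    Raises RuntimeError if the temperature exceeds limit_c or the
    loop does not settle within max_iters.
    """
    temp = ambient_c
    for iteration in range(1, max_iters + 1):
        # Electrical step: leakage grows exponentially with temperature.
        leakage = leak_ref_w * 2.0 ** ((temp - ambient_c) / leak_doubling_c)
        power = p_dynamic_w + leakage
        # Thermal step: lumped resistance from the die to ambient.
        new_temp = ambient_c + theta_k_per_w * power
        if new_temp > limit_c:
            raise RuntimeError("temperature limit exceeded: possible thermal runaway")
        if abs(new_temp - temp) < tol_c:
            return new_temp, power, iteration
        temp = new_temp
    raise RuntimeError("no closure within the iteration budget")


# Hypothetical die: 2 W dynamic power, 0.2 W leakage at 25 C, 5 K/W to ambient.
temp, power, iters = solve_electrothermal(2.0, 0.2, 5.0)
print(f"settled at {temp:.2f} C and {power:.3f} W after {iters} iterations")
```

A design with a poor thermal path or high leakage may never settle in this model, which mirrors why these interactions need to be analyzed together and not one domain at a time.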
When semiconductor design teams should start to ensure reliability in 3D IC design
To ensure reliability, it's important to start early. But the earlier you start, the less you know, so it's an improve-as-you-go process. Knowing the type of heterogeneous 3D IC to build is a helpful place to start. If it's unknown, you can start with individual dies as if they're homogeneous structures and start putting them in place with their TSVs or other connections to assess potential thermal and stress problems. From there, you can place components on top of or next to others. You can explore trade-offs at an early stage, but you're missing some of the details.
For details, you need to know everything about the chiplets and the connections. As you start to add more, you make decisions. Then continue through the cycle and make multiple rounds of analysis. Of course, you can find these things at the end as you put the design together. But the further along you are in the design, the more expensive it is to go back.
Designing a reliable package requires the design team to focus on analysis as early as possible, such as at the prototyping and initial stages, and to understand how design choices could drive mechanical stress and electrical failures. It's more work than in the past, but that comes with the benefit of having so many more options. Semiconductor design teams have more room to be clever and find ways to do things not done before.