Understanding the Big Spend on Advanced Packaging Facilities



Leading chipmakers have in recent years spent tens of billions of dollars on advanced chip-packaging facilities to prepare for building processors in multi-chiplet packages that can offer consistent performance increases and ensure the continuity of Moore’s law.

Analysts interviewed by EE Times took pains to explain the spend. But before we go there, let’s first take a close look at the numbers.

Advanced chip-packaging revenue is expected to grow from $44.3 billion last year to $78.6 billion by 2028, according to Yole Intelligence. Meanwhile, the traditional chip-packaging market was valued at around $47.5 billion last year and is projected to grow to $57.5 billion by 2028. The total chip-packaging market is expected to reach $136 billion by 2028, Yole estimates.

Advanced packaging revenues by platform (Source: Yole Intelligence)
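To put the Yole figures in perspective, the growth can be expressed as a compound annual growth rate. The following is a minimal sketch, assuming that "last year" means 2022 (the article does not state the base year explicitly); the function name `cagr` is illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


# Yole figures quoted above, in billions of dollars.
# Assumption: "last year" is 2022, giving a six-year span to 2028.
YEARS = 2028 - 2022

advanced_cagr = cagr(44.3, 78.6, YEARS)
traditional_cagr = cagr(47.5, 57.5, YEARS)
total_2028 = 78.6 + 57.5  # should land close to the quoted $136 billion

print(f"Advanced packaging CAGR:    {advanced_cagr:.1%}")
print(f"Traditional packaging CAGR: {traditional_cagr:.1%}")
print(f"Combined 2028 market:       ${total_2028:.1f} billion")
```

Under that assumption, the advanced segment grows at roughly 10% a year against roughly 3% for traditional packaging.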

Given rapidly expanding demand for advanced packaging, it is not surprising that leading chipmakers and outsourced semiconductor assembly and test (OSAT) companies spent around $14.5 billion on advanced packaging fabs and tools in 2022, Yole Développement estimated. Intel was projected to lead the market with an investment of around $4 billion, followed by TSMC with about $3.6 billion, Samsung Foundry with circa $2 billion, and ASE Group with roughly $1.7 billion. Because of a chip market downturn this year, the companies are expected to reduce their advanced packaging CapEx budgets to $11.9 billion, which is still a sizable sum.

Advanced packaging CapEx by leading chipmakers and OSATs (Source: Yole Intelligence)

There are many reasons why the advanced packaging market and multi-chiplet designs are set to thrive in the coming years.

Firstly, chip manufacturing costs are rising: foundries raise their quotes with every new production node as fab equipment gets more expensive.

That makes production of large monolithic chips particularly expensive when you consider defect density and yields. So, making two smaller chips and then stitching them together may be a lot cheaper than making one big monolithic die.

“If you shrink the die, you get higher yield,” G. Dan Hutchinson, vice chair at TechInsights, told EE Times.
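The link between die size and yield can be illustrated with the simple Poisson yield model, Y = exp(-A × D0). This is a sketch only: the article does not say which model the analysts use, and the defect density and die areas below are hypothetical values chosen for illustration.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under the Poisson model Y = exp(-A * D0)."""
    area_cm2 = die_area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


D0 = 0.1  # hypothetical defect density, defects per cm^2

big = poisson_yield(800.0, D0)    # one large monolithic die
small = poisson_yield(400.0, D0)  # each of two half-size dies

# Silicon area consumed per good product. Dies are tested before assembly,
# so a defective small die wastes only 400 mm^2, not the full 800 mm^2.
area_per_good_monolithic = 800.0 / big
area_per_good_pair = 2 * 400.0 / small

print(f"Yield, 800 mm^2 die: {big:.1%}")
print(f"Yield, 400 mm^2 die: {small:.1%}")
print(f"Silicon per good monolithic product: {area_per_good_monolithic:.0f} mm^2")
print(f"Silicon per good two-die product:    {area_per_good_pair:.0f} mm^2")
```

The sketch ignores packaging cost, wafer-edge losses and defect clustering, all of which matter in practice.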

Secondly, it makes perfect sense to disaggregate designs.

While logic continues to scale in terms of power, performance and area with every new process technology, scaling of analog and SRAM circuits essentially stopped at the 5-nm to 3-nm nodes, which makes chip designers less inclined to adopt the latest nodes for those circuits. Instead, they disaggregate designs and then produce different chiplets using the most suitable process technology for each.

Furthermore, once you have enough different chiplets, you can mix and match them in a product to build a solution tailored for a particular workload.

“That’s not a question of, if everybody’s going this [multi-chiplet] way,” Hutchinson said. “It’s not even a question of when; when has already happened. The more they look at this, the more they find, there are advantages in the performance of the chips, there are advantages in your control of IP, there are so many market advantages.”

The third reason is perhaps less obvious. The maximum exposure field size of a contemporary EUV scanner is 26 mm (slit) by 33 mm (scan), or 858 mm² (often called the maximum reticle size). Next-generation High-NA EUV scanners (0.55 numerical aperture) will retain the 26-mm slit but will halve the scan to 16.5 mm, so the exposure field size will be 429 mm².
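The field-size arithmetic above is easy to check. The sketch below also estimates how many exposure fields a die of a given area would need; that estimate compares areas only and ignores the die's actual width and height, and the 800 mm² die is a hypothetical example.

```python
import math

SLIT_MM = 26.0
SCAN_EUV_MM = 33.0       # scan length of a 0.33 NA EUV scanner
SCAN_HIGH_NA_MM = 16.5   # scan length of a 0.55 NA (High-NA) scanner

field_euv = SLIT_MM * SCAN_EUV_MM          # 858 mm^2
field_high_na = SLIT_MM * SCAN_HIGH_NA_MM  # 429 mm^2


def min_fields_needed(die_area_mm2: float, field_mm2: float) -> int:
    """Lower bound on exposure fields (or tiles) for a die, by area only."""
    return math.ceil(die_area_mm2 / field_mm2)


hypothetical_die = 800.0  # mm^2, illustrative large compute die
print(field_euv, field_high_na)
print(min_fields_needed(hypothetical_die, field_euv))      # fits in one field
print(min_fields_needed(hypothetical_die, field_high_na))  # needs at least two
```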

As a result, somewhere beyond 1.8 nm, chip developers will have to use multi-chiplet or multi-tile designs for high-performance applications. In fact, Intel once planned to deploy High-NA at its 1.8-nm fabrication technology. Instead, the company opted to use conventional EUV scanners with 0.33 NA optics, together with EUV double patterning and/or pattern shaping, so High-NA is coming to fabs beyond 1.8 nm.

“When you go High-NA, the exposure field size gets cut in half,” Hutchinson said. “So, it’s becoming really a big imperative to make [multi-chiplet designs with advanced packaging] happen. It’s in terms of what are the strategic drivers driving them to do this.”

Intel bets big on EMIB

Intel has in recent years made big bets on its Embedded Multi-Die Interconnect Bridge (EMIB) for 2.5D integration and Foveros for 3D integration. Many of its important products rely on these advanced packaging technologies. The list includes the Ponte Vecchio compute GPU; Sapphire Rapids and Sapphire Rapids HBM processors for datacenters and supercomputers; and the upcoming Meteor Lake, Arrow Lake and Lunar Lake client CPUs.

The number of Intel’s multi-chiplet products already on the market and in the pipeline shows that it makes good financial sense for Intel to use such designs, Jon Peddie, president of Jon Peddie Research, told EE Times. “Multi-chip, chiplet, stacked chips all exploit some of Intel’s intrinsic strengths: manufacturing at sub-atomic levels, materials science, and signaling technology. Chiplets is not just about trying to hook up a bunch of processors like a Lego kit. It’s about correctly designing, managing, and manufacturing in volume.”

Intel’s general and advanced packaging technologies (Source: Intel)

Intel is among the leaders when it comes to packaging technologies, and it is poised to use them extensively. Yet, the company remains relatively conservative, sources told EE Times.

“If you talk to any analyst, they will tell you that Intel is the world’s leader in the technology,” Hutchinson said. “And yet, they will also tell you the sad part is, Intel has not used it as much as it should.”

Because Intel bets so much on its advanced packaging technologies, it should not come as a surprise that it also spends billions on advanced packaging facilities. The company’s packaging facility near Rio Rancho, N.M., which is set to come online this year, cost the company about $3.5 billion.

This plant is not the only Intel advanced packaging facility in the U.S. The company’s CH4, near Chandler, Arizona, is also capable of using Intel’s EMIB and Foveros technologies. It has also been used to build samples of the company’s Meteor Lake CPU.

“Arizona is where the [packaging] research is done,” Hutchinson said. “Arizona has always been the packaging R&D center, and New Mexico is more of a manufacturing site. Then [Intel] also has packaging in other parts of the world, but it is not the most advanced.”

Intel’s general and advanced packaging roadmap (Source: Intel)

In mid-June, Intel said it was also set to build a $4.6 billion advanced packaging facility in Poland by 2027.

The size of the investment gives an idea of the scale and capabilities of the upcoming plant, as well as how important this facility will be for Intel. In fact, it is entirely logical for the chip giant to invest heavily in advanced packaging, since the company has historically controlled production of its chips from a silicon wafer to a complete device. Intel wants to keep it that way, not only for its own products but also for its foundry services. So, the money it is spending on such facilities is set to be higher than the sums spent by TSMC.

“One of Intel’s real advantages for their foundry offerings will be the ability to do full-service, full flows from the wafer in to package out,” Hutchinson said.

TSMC: Billions on packaging facilities

In late June, TSMC said it planned to build a $2.87 billion facility for advanced chip packaging in Taiwan. The plant is expected to come online several years down the road.

“To meet market needs, TSMC is planning to establish an advanced packaging fab in the Tongluo Science Park,” TSMC said in prepared remarks, noting that it expects to create 1,500 job opportunities.

TSMC’s advanced packaging facilities in Taiwan. The image is from mid-2021; AP6 is now online. (Source: Cadence, TSMC)

The upcoming Tongluo facility will probably be similar to TSMC’s recently launched Advanced Backend Fab 6, which supports all of the company’s 3DFabric packaging technologies, including frontend 3D stacking techniques such as chip-on-wafer (CoW) and wafer-on-wafer (WoW), as well as backend packaging technologies like integrated fan-out (InFO, chip first) and chip-on-wafer-on-substrate (CoWoS, chip last).

TSMC’s advanced packaging technologies (Source: TSMC)

The $2.87 billion investment in the Tongluo advanced packaging facility is not the only big investment TSMC is making in packaging.

In the past few months, several media outlets have published reports claiming that TSMC can barely meet demand for its CoWoS packaging because of overwhelming demand for Nvidia’s compute GPUs used for artificial intelligence (AI) and high-performance computing (HPC) applications. To meet demand for its CoWoS technology in 2025, the company is buying new tools to install in existing facilities and double its CoWoS capacity by the end of next year.

“But for the back end, the advanced packaging side, especially for the CoWoS, we do have some very tight capacity to, very hard to fulfill 100% of what customers needed,” said C.C. Wei, chief executive of TSMC, during the company’s earnings call in July. “So, we’re working with customers for the short term to help them to fulfill the demand, but we’re increasing our capacity as quickly as possible.”

He said the company expects the tightness to ease somewhat toward the end of next year, noting that CoWoS capacity will be doubled compared with this year.

As the world’s largest contract maker of chips, TSMC earns big profits by making some of the world’s most complex processors, such as Nvidia’s H100 compute GPU.

Yet, the company understands that, in the future, many of its customers will rely on multi-chiplet designs and will need not just pieces of silicon but silicon dies mounted on a silicon interposer and working in concert. That requires a big investment in advanced packaging technologies and appropriate production facilities.

“One of the reasons why TSMC has done really well with [advanced packaging] is because they got into it very early and they understood it very early,” Hutchinson said.

TSMC’s advanced packaging technologies (Source: Cadence, TSMC)

While TSMC can and will offer advanced packaging services to its clients, it wants traditional OSAT providers to catch up and offer similar services.

In fact, to popularize its 3DFabric packaging technologies in the industry, the company formed its 3DFabric Alliance, which encompasses developers of electronic design automation (EDA) tools, IP designers, contract chip design companies, memory makers, advanced substrate producers, fab tool makers, and OSATs.

In the best-case scenario, TSMC would like companies like ASE Technology, Amkor Technology, and JCET to offer advanced packaging to customers, which is why it is willing to license its methods to chip assemblers.

Furthermore, ASE already has a number of its own advanced packaging technologies akin to those used by TSMC (e.g., FoCoS, fan-out chip on substrate). But OSATs may not be too inclined to offer such services just now, since they require big investments and pose big risks: a failure in multi-chiplet packaging renders several chiplets useless.

“We think [chipmakers] need to offer OSATs more technical support/profit incentives if they want OSATs to be more engaged in the CoWoS (Chip on Wafer on Substrate) market,” Szeho Ng, an analyst with China Renaissance Securities, wrote in a note to clients. “OSATs’ foray into the ‘WoS’ process tallies with our long-held view: it’s generic die attach and doable under the conventional disaggregated foundry-OSAT model where both parties manage their own processes. We’re skeptical about OSATs’ penetration in ‘CoW,’ as more front/backend process crossover raises execution risks. Any CoW rework costs are costly for OSATs, which unlike front-end fabs lack the attractive front-end wafer profit avenue to fund their backend forays.”

The issue could be mitigated if the OSATs’ reimbursement for any yield deficiencies is limited to a cap that all participating parties have agreed upon, and clear responsibilities are established, the analyst notes. But since this has not been done yet, one of the reasons why TSMC has to invest billions in advanced packaging is that its OSAT partners are less inclined to offer such “margin dilutive” CoWoS services.

Another reason why TSMC has to pour billions into advanced packaging is that companies like ASE Technology, which earned $21.831 billion in 2022, have considerably lower CapEx budgets. ASE’s capital expenditures totaled $440 million in the first half of 2023, and the company has said that it will spend another $580 million to $600 million on production tools in the second half of this year.

Samsung: Progressing Quickly

Samsung Foundry is the third contract maker of chips that has both leading-edge lithography fabrication processes and advanced packaging technologies.

While the company does not invest in advanced packaging capacity as aggressively as Intel and TSMC, it has a number of sophisticated packaging technologies, including 2.5D I-Cube (interposer-based), H-Cube (hybrid interposer, or hybrid FCPBGA) and 3D X-Cube.

Samsung’s advanced packaging technologies (Source: Cadence)

When it comes to production of logic in general and high-profile products in particular, Samsung Foundry is considerably smaller than Intel and TSMC, so its advanced packaging technologies may not be as well known as theirs.

Yet, Samsung is ramping up its advanced packaging services very quickly. The company earned $3.1 billion packaging chips using its I-Cube, H-Cube, and X-Cube technologies in 2021, and its packaging division increased its revenue to $4 billion in 2022.

Justified and useful

Multi-chiplet designs are an inherent part of the semiconductor industry’s future: It is easier to maximize yields of smaller dies, it does not make sense to make analog and interface circuitry on leading-edge processes, and High-NA EUV scanners will cut maximum die size in half, making stitching mandatory for any high-performance design.

Furthermore, many designs are going to use 2x, 4x, 6x reticle size packages to pack all the logic they will need for AI and HPC applications in the near future.

“Chiplets are at the forefront of our industry and offer the most efficient and cost-effective solution for the world’s data infrastructure through their highly customizable nature,” said Sudhir Mallya, a marketing executive at Alphawave Semi, a contract chip designer and IP provider. “Given their smaller dies, they have higher yields, which lowers manufacturing costs and power consumption. Additionally, they provide a ‘more than Moore’s’ ability to address the compute needs of AI apps compared with traditional GPUs that have been used to train AI models, while also providing a more-flexible product configuration.”

Leading chip designers understood years ago that smaller dies take less time to reach high yields, so we have seen various forms of multi-chip module (MCM) processors for servers (e.g., IBM Power5), clients (Intel Pentium D), and even game consoles (ATI Xenos with eDRAM) for decades.

In recent history, AMD led the way with its disaggregated Ryzen and Epyc designs comprising several CCDs (core complex dies) and a single IOD (input/output die) connected using Infinity Fabric, a technology that AMD has refined for its Instinct MI300-series datacenter APUs and compute GPUs.

Then Intel announced a slew of multi-chiplet CPUs and GPUs in 2020-2021, after which Apple came up with its M1 Ultra and M2 Ultra dual-die processors.

Different types of AI datacenter silicon: large monolithic dies vs multi-chiplet solutions (Source: Alphawave Semi)

Disaggregating compute logic and IO (e.g., Ryzen and Epyc) or producing two smaller dies instead of one big die (Instinct MI250, M1/M2 Ultra) are perhaps the most obvious disaggregation scenarios. In fact, design disaggregation can be very rewarding, since in some cases chip designers can save 30% to 40% of costs by splitting up their designs, according to TechInsights’ Hutchinson, who a few years ago wrote a TechInsights Chip Insider paper on the subject. “If you take a design and you fraction it up to critical and non-critical [chiplets], and then you factor in the differences in mask layers, you can actually save 30% to 40% of your cost,” he recalled.

In addition to disaggregating their designs, AMD and Apple solved two critical problems with their processors: implementing high-bandwidth interconnects between two or more dies with performance comparable to that of internal chip connections, and presenting these processors as one to software (i.e., unified resources, memory pool, etc.).

While AI and HPC software is tailored to use all the compute resources it can get, graphics applications are hard to scale across different GPUs. So, Apple did quite a job with its M2 Ultra: While many companies have tried, only Apple has clearly succeeded in making such a multi-GPU design work properly with its latest M2 Ultra system-in-package.

“Others already have, before Apple,” Peddie said. “AMD was the first with their heterogeneous software program HSA in 2013 that became the Radeon Open eCosystem (ROCm) umbrella in 2020. Intel initiated their oneAPI in 2019, and in SoC land Qualcomm has been doing it since 2007 with the introduction of Snapdragon and its multi-processor architecture.”

There are numerous high-end chips with big die sizes on the market.

For example, Nvidia’s GH100 has a die size of 814 mm², and based on unofficial information, the company’s H100 compute GPUs are in short supply not because of a lack of silicon (i.e., low yields), but because TSMC’s advanced packaging capacities are fully booked.

“Companies like TSMC, Intel, and to some extent Samsung have gotten so good [that you can get] reasonable yields on a full field die,” Hutchinson said. “So, your integration limit becomes not just the limit of the resolution, but also the limit of the exposure field size and litho tool [reticle].”

Since companies like Nvidia can get reasonable yields with their big compute GPUs or FPGAs, it may seem that they are less inclined to disaggregate their designs. But stitching two or more similar dies together is something that chip designers will have to do in the High-NA era anyway, Hutchinson asserted. Therefore, it makes a lot of sense to learn how to use such designs now, which is what AMD and Apple are doing.

“You’re at a point now, where once you get to a full exposure field die, you are going to have to do [multi-chiplet] anyway with High-NA, you better start trying to figure out how it works today so you don’t hit this technology and [it] kills you.”

Disaggregation of chip designs can be done in several ways. For example, Intel disaggregated not only compute tiles, but also Rambo cache and HBM memory PHY in Ponte Vecchio. AMD followed suit with its Navi 31 GPU, which has its Infinity Cache and GDDR6 memory PHY spread over six separate chiplets.

Enabling die-to-die interconnects with bandwidth and latencies comparable to those of internal interconnections is difficult and expensive. But as long as chip designers get higher yields and better manufacturability, and lower silicon cost offsets the cost of advanced packaging, companies will use it.

“When I did that [multi-chiplet cost] modeling a few years ago, the big advantage was, you could offset the additional package costs by the fact that you were getting higher yield among all the dies,” Hutchinson recalled.
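The trade-off Hutchinson describes can be framed as a break-even question: how much extra can advanced packaging cost before it cancels the silicon savings? The sketch below is not his model; the function names, die costs and yields are hypothetical, and it assumes that dies are tested before assembly and that assembly itself never fails.

```python
def silicon_cost_per_good_product(die_cost: float, die_yield: float,
                                  dies_per_product: int = 1) -> float:
    """Silicon spent per good product when only known-good dies are assembled."""
    return dies_per_product * die_cost / die_yield


def max_packaging_premium(mono_cost: float, mono_yield: float,
                          chiplet_cost: float, chiplet_yield: float,
                          n_chiplets: int) -> float:
    """Largest extra packaging cost at which the chiplet design still breaks even."""
    mono = silicon_cost_per_good_product(mono_cost, mono_yield)
    split = silicon_cost_per_good_product(chiplet_cost, chiplet_yield, n_chiplets)
    return mono - split


# Hypothetical numbers: one $200 die at 45% yield versus
# two $100 chiplets at 67% yield each.
premium = max_packaging_premium(200.0, 0.45, 100.0, 0.67, 2)
print(f"Break-even packaging premium: ${premium:.2f} per product")
```

If advanced packaging adds less than this premium over conventional packaging, the multi-chiplet version is the cheaper one under these assumptions.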

General advanced packaging roadmap (Source: Yole Group)

All three leading chipmakers and OSATs are advancing their 2.5D and 3D packaging technologies by making pitches smaller to enable denser interconnects.

These next-generation advanced packaging technologies will be more expensive than existing versions of CoWoS, EMIB, Foveros, or X-Cube.

Still, as long as the costs of packaging are offset by lower costs of silicon, chip developers are going to use them.

Furthermore, over time everything gets cheaper.

“Once the process is running smoothly, and if the volume is high enough, engineering (of almost anything) will find better, more efficient, and therefore cheaper ways of doing things,” Peddie said. “What keeps a process from being improved is the ROI for the improvement. If you are only making 10 of something a year, that doesn’t support very much investigation in how to make it better or faster. But when you are punching out a million a day of something you have budget (and necessity) to make it more efficient.”


Advanced packaging technologies bring a plethora of opportunities to chip designers. But they also present challenges. Among them: signal management, impedance characteristics of chips made on different nodes, power consumption, packaging yields and packaging costs.

One of the promises of the multi-chiplet design approach is that it allows developers to mix and match chiplets made on different nodes to get the desired performance and features to meet the demands of existing and emerging applications. But chips made on different nodes use different voltages, have different impedances, and can even have different Z-heights, making their integration a nightmare.

As Peddie puts it, Intel’s Meteor Lake and Ponte Vecchio system-in-packages (SiPs) are modern miracles built by different fabs on different production nodes in different parts of the world.

“[Multi-chiplet designs are all] about signal management engineering to get the correct propagation and impedance characteristics across a tiny piece of composite materials at GHz speeds and pico-second rise times,” he said. “We blithely talk about these modern miracles and take them for granted. But that’s because very few people who do just talk about it have never struggled to even attach an oscilloscope or logic analyzer to a pin without influencing the behavior of the device they’re trying to measure.”

To ensure that chiplets developed by different companies and made on different nodes are compatible with each other and can be combined in a single product package, leading chip designers and manufacturers developed Universal Chiplet Interconnect Express (UCIe), an open specification that defines chiplet-to-chiplet interconnection with the aim of building a ubiquitous ecosystem.

But UCIe is in its infancy, and there are experts in the industry who doubt that the UCIe 1.0 specification will be enough to build a robust ecosystem of chiplets.

Yet, the specification will continue to evolve, and only time will tell whether UCIe will enable an ecosystem that is even remotely as rich as that of PCIe.

Power consumption of multi-chiplet designs compared with integrated designs is also something to consider. While it may be cheaper to build a multi-tile solution, a monolithic one may offer higher performance and lower power consumption and thus be more efficient.

“If you go out to a chiplet, it costs you a lot in power and speed,” Hutchinson said. “It’s just not as much as if you go from the chip to another chip on the board, […] it depends on the design and the distance and all that, but you get these orders of magnitude [higher power consumption for interconnections when compared to] a fully integrated device. […] So there is always an inherent advantage to integrating everything into a single die for that reason. But at some point, it breaks because you try to pack too much into it.”

Meanwhile, higher per-chiplet yields may enable developers to put more transistors into their multi-chiplet designs than they would have used for a monolithic design, which will ensure higher performance.

Since advanced packaging requires clean rooms, sophisticated equipment and tens of complex steps, the concept of yields is fully applicable to multi-chiplet SiPs, too.

While assembling a processor from five or more chiplets made on different nodes (e.g., Intel’s Meteor Lake consists of five chiplets) sounds plausible from many points of view, a high yield of the packaging process is crucial: if assembly fails, all the chiplets go to the bin, which means losses for the chipmaker and/or OSAT.

Both Intel and TSMC argue that their advanced packaging yields are very high, but they do not disclose any numbers.
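Because no vendor publishes packaging yields, the following sketch uses made-up numbers purely to show why assembly yield matters so much: every failed assembly scraps all the known-good chiplets placed in it. The chiplet costs, assembly cost, per-step yield and step count are all hypothetical, and the model assumes each assembly step fails independently.

```python
def assembly_yield_from_steps(step_yield: float, steps: int) -> float:
    """Overall assembly yield if each of `steps` steps succeeds independently."""
    return step_yield ** steps


def cost_per_good_package(chiplet_costs: list[float], assembly_cost: float,
                          assembly_yield: float) -> float:
    """Expected spend per working package when a failed assembly scraps every chiplet."""
    if not 0.0 < assembly_yield <= 1.0:
        raise ValueError("assembly_yield must be in (0, 1]")
    return (sum(chiplet_costs) + assembly_cost) / assembly_yield


# Hypothetical five-chiplet package: costs in dollars, 20 assembly steps
# at 99.5% yield each.
chiplets = [40.0, 30.0, 20.0, 20.0, 10.0]
overall_yield = assembly_yield_from_steps(0.995, 20)
good_cost = cost_per_good_package(chiplets, 30.0, overall_yield)
scrap_overhead = good_cost - (sum(chiplets) + 30.0)

print(f"Overall assembly yield: {overall_yield:.1%}")
print(f"Cost per good package:  ${good_cost:.2f}")
print(f"Scrap overhead:         ${scrap_overhead:.2f}")
```

Even a per-step yield that looks high compounds into a noticeable loss over tens of steps, which is consistent with the reluctance of OSATs described earlier.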

Intel acknowledged this year that large substrates, such as those used for huge system-in-packages like Ponte Vecchio, tend to warp, which poses yield risks and makes it difficult to assemble them onto a motherboard. For now, Intel seems to be satisfied with what it has. But, to ensure that its future SiPs do not bend and perform better (as they integrate things like optical interconnects), the company plans to adopt glass substrates instead of organic substrates in the second half of this decade. Such a move requires a lot of changes and investment, and to a substantial degree it is necessitated by the ongoing transition to multi-chiplet designs.

While advanced packaging methods in general, and multi-chiplet designs in particular, present ample opportunities for chip developers, these are very complex technologies that present numerous challenges to makers of microelectronics and OSATs.

Yet, every new technology pushes the boundaries of what is possible, and companies like Intel have learned how to solve problems and make breakthroughs over the last 60-plus years in microelectronics.

“TSMC, Intel, Samsung [and other chipmakers] have thousands of incredibly smart scientists and engineers who don’t sit around all day sipping tea and playing Wordle,” Peddie said, noting that advanced packaging technologies did not just come out of the blue. “They came from hundreds, maybe thousands of experiments. They were devised by thousands of hours of simulation runs, chalk talk, and sleepless nights.

“We’re impressed with what they’re doing today,” he added. “But in the labs, they’re trying to solve the problems, and overcome the barriers of what is going to be manufactured three to five years from now, and thinking about what and how they will build transistors 10 years from now.”

