News Posts matching #Ascend


DeepSeek R2 Leak Reveals 512 PetaFLOPS Push on Domestic AI Accelerator Infrastructure

DeepSeek, a company that took the AI world by storm with its R1 model, is preparing a new and reportedly much improved DeepSeek R2 model release, according to a well-known AI insider @iruletheworldmo on X. Powered by Huawei's Ascend 910B chip clusters, a possible Huawei Atlas 900, and DeepSeek's in-house distributed training framework, R2 pushes these accelerators to an impressive 82% utilization, translating to 512 PetaFLOPS of FP16 performance—half an exaFLOP in computing power. According to Huawei lab data, that's roughly 91% of what NVIDIA's older A100 clusters deliver, yet DeepSeek claims it cuts per-unit training costs by a remarkable 97.3%. Behind DeepSeek R2 is a carefully cultivated ecosystem of partners. Tuowei Information, a leading OEM in the Ascend family, manages over half of DeepSeek's supercomputing hardware orders, while Sugon provides liquid-cooled server racks capable of handling up to 40 kW per unit. To keep power consumption in check, Innolight's silicon-photonics transceivers shave off another 35% compared to traditional solutions.

Geographically, operations are split across major hubs: Runjian Shares runs the South China supercomputing center under contracts exceeding ¥5 billion annually, and Zhongbei Communications maintains a 1,500-PetaFLOP reserve in the Northwest for peak demands. On the software side, DeepSeek R2 already supports private deployment and fine-tuning, powering smart-city initiatives in 15 provinces through the Yun Sai Zhilian platform. North China's node, overseen by Hongbo Shares' Yingbo Digital, adds another 3,000 PetaFLOPS to the mix. If computing power runs short, Huawei is ready to deploy its CloudMatrix 384 system, which is positioned as a domestic alternative to NVIDIA's GB200 NVL72. It features 384 Ascend 910C accelerators to achieve 1.7× the overall petaFLOPS and 3.6× the total HBM capacity of the NVL72 cluster, yet it lags significantly in per-chip performance and consumes nearly four times more power. Nonetheless, the R2 launch is expected to go smoothly, and we are waiting for the official release and benchmarks to see how the model performs.

TSMC Can't Track Where Its Chips End Up, Annual Report Admits

TSMC has acknowledged fundamental visibility limitations in its semiconductor supply chain, stating in its latest annual report that it "inherently lacks visibility regarding the downstream use or user of final products." This disclosure relates to an incident where 7 nm chips manufactured for Sophgo were later identified in Huawei's Ascend 910B/C AI accelerators, whose hardware is subject to US export restrictions. The contract foundry outlined its standard process: receiving GDS files through intermediaries, validating technical specifications, creating photomasks, and fabricating wafers without insight into end applications. Subsequent analysis revealed that those very chips matched Huawei's specifications, providing components for approximately one million dual‑chiplet AI accelerator units, with two million dies shipped to Huawei.

The report warns that compliance violations by supply‑chain partners, such as failing to secure proper import, export or re‑export permits, could trigger regulatory investigations and penalties, even when TSMC adheres to its established protocols. The US has already proposed a $1 billion fine for TSMC. This visibility gap illustrates a challenge that is not easily overcome in semiconductor manufacturing, where complex distribution networks obscure the path between fabrication and deployment. Foundries are facing increasing pressure to enhance tracking capabilities despite the inherent limitations of the contract manufacturing model. US sanctions on Chinese companies keep tightening, and this could mean that sanction-abiding companies avoid doing business with Chinese entities altogether rather than risk a fine.

Huawei Prepares 6 nm Ascend 920C Accelerator: 900 TeraFLOPS, 4000 GB/s HBM3

Huawei recently revealed that its CloudMatrix 384 AI super node cluster can outperform NVIDIA's GB200 NVL72 in standard benchmarks, even though it consumes more power per performance unit. That system relies on Huawei's current Ascend 910C accelerators, which deliver strong raw compute performance but lag behind in efficiency metrics. To address this gap, Huawei is preparing the Ascend 920 family, with the training‑focused Ascend 920C built on SMIC's 6 nm process. According to DigiTimes, each 920C card will deliver more than 900 TeraFLOPS of BF16 half-precision performance. It also upgrades memory to HBM3 modules, providing 4,000 GB/s of bandwidth, which is up from the Ascend 910C's eight HBM2E stacks and 3,200 GB/s.

The existing Ascend 910C peaks at 780 TeraFLOPS in BF16 operations and uses a chip‑to‑chip interconnect bandwidth of 400 GB/s. Its packaging limits that bandwidth, but it still supports high‑speed communication between nodes in ultra‑dense AI clusters. Huawei will retain the chiplet‑based design in the 920C and refine the tensor acceleration engines for Transformer and Mixture‑of‑Experts models. Internal projections estimate that overall training efficiency on the 920C will improve by 30-40 percent compared to the 910C. This should narrow the performance‑per‑watt difference against competitors' solutions. In terms of system integration, the Ascend 920C will support PCIe 5.0 and next‑generation high‑throughput interconnect protocols. These features aim to improve resource scheduling and reduce latency in super node deployments, where tight node‑to‑node synchronization is critical. Huawei has not announced a firm release date for the Ascend 920C, but DigiTimes sources claim that it will enter mass production in the second half of 2025, which could mean just a few months from now.
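As a rough sanity check on the generational figures quoted above, the following Python sketch computes the percentage uplift from the 910C to the 920C. The dictionary and function names are illustrative, and the 920C compute figure is a lower bound because the report says "more than 900 TeraFLOPS".

```python
# Figures quoted in the article; the 920C compute number is a lower bound.
ASCEND_910C = {"bf16_tflops": 780, "hbm_gb_per_s": 3200}
ASCEND_920C = {"bf16_tflops": 900, "hbm_gb_per_s": 4000}


def uplift_percent(new: float, old: float) -> float:
    """Return the percentage increase from old to new."""
    return (new / old - 1) * 100


compute_uplift = uplift_percent(ASCEND_920C["bf16_tflops"], ASCEND_910C["bf16_tflops"])
bandwidth_uplift = uplift_percent(ASCEND_920C["hbm_gb_per_s"], ASCEND_910C["hbm_gb_per_s"])

print(f"BF16 compute uplift:  at least {compute_uplift:.1f}%")
print(f"HBM bandwidth uplift: {bandwidth_uplift:.1f}%")
```

On these numbers, peak compute rises by about 15% and memory bandwidth by 25%. Both sit below the projected 30-40 percent training efficiency gain, so the remainder would presumably have to come from the refined tensor engines and interconnect changes described above rather than from raw throughput alone.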

Report Suggests Huawei Ascend 910C AI Accelerator's Utilization of Foreign Parts; Investigators Find 7 nm TSMC Dies

Earlier today, TechPowerUp covered the alleged performance prowess of Huawei's CloudMatrix 384 system super node. In SemiAnalysis's view, the system's Ascend 910C AI accelerators are a generation behind NVIDIA's GB200 "Blackwell" AI GPU design in terms of chip performance. SMIC seemed to be in the picture as Huawei's main fabrication partner, possibly with an in-progress 5 nm node process. Instead, SemiAnalysis has surmised that the Ascend 910C is based on plenty of non-native technologies. Huawei's (current and prior) "aggressive skirting of export controls" has likely enabled the new-gen AI chip's better-than-expected performance stats. SemiAnalysis documented the early sample's origins: "while the Ascend chip can be fabricated at SMIC, we note that this is a global chip that has HBM from Korea (Samsung), primary wafer production from TSMC (Taiwan), and is fabricated by 10s of billions of wafer fabrication equipment from the US, Netherlands, and Japan...One common misconception is that Huawei's 910C is made in China. It is entirely designed there, but China still relies heavily on foreign production."

Despite China's premier foundry business making pleasing inroads with a theorized "7 nm N+2" manufacturing test line, Huawei has seemingly grown impatient with immature native production options. Today's SemiAnalysis article presents a decent dose of inside knowledge: "while SMIC does have 7 nm, the vast majority of Ascend 910B and 910C are made with TSMC's 7 nm. In fact, the US Government, TechInsights, and others have acquired Ascend 910B and 910C and every single one used TSMC dies. Huawei was able to circumvent the sanctions on them against TSMC by purchasing ~$500 million of 7 nm wafers through another company, Sophgo...It is rumored Huawei continues to receive wafers from TSMC via another 3rd party firm, but we cannot verify this rumor." Another (fabless) Chinese chip design firm, Xiaomi, appears to still have direct/unrestricted access to TSMC manufacturing lines, albeit not for enterprise-grade AI products.

Huawei CloudMatrix 384 System Outperforms NVIDIA GB200 NVL72

Huawei announced its CloudMatrix 384 system super node, which the company touts as its own domestic alternative to NVIDIA's GB200 NVL72 system, with more overall system performance but worse per-chip performance and higher power consumption. While NVIDIA's GB200 NVL72 uses 36 Grace CPUs paired with 72 "Blackwell" GB200 GPUs, the Huawei CloudMatrix 384 system employs 384 Huawei Ascend 910C accelerators to beat NVIDIA's GB200 NVL72 system. It takes roughly five times more Ascend 910C accelerators to deliver nearly twice the GB200 NVL72 system performance, which is not good on a per-accelerator basis, but excellent at the per-system level of deployment. SemiAnalysis argues that Huawei is a generation behind in chip performance but ahead of NVIDIA in scale-up system design and deployment.

When you look at individual chips, NVIDIA's GB200 GPU clearly outshines Huawei's Ascend 910C, delivering over three times the BF16 performance (2,500 TeraFLOPS vs. 780 TeraFLOPS), more HBM capacity (192 GB vs. 128 GB), and faster memory bandwidth (8 TB/s vs. 3.2 TB/s). In other words, NVIDIA has the raw power and efficiency advantage at the chip level. But switch to the system level, and Huawei's CloudMatrix 384 takes the lead. It cranks out 1.7× the overall PetaFLOPS, packs in 3.6× more total HBM capacity, and supports over five times the number of accelerators and the associated bandwidth of NVIDIA's NVL72 cluster. However, that scalability does come with a trade‑off, as Huawei's setup draws nearly four times more total power. A single GB200 NVL72 draws 145 kW of power, while a single Huawei CloudMatrix 384 draws ~560 kW. So, NVIDIA is your go-to if you need peak efficiency in a single GPU. If you're building a massive AI supercluster where total throughput and interconnect speed matter most, Huawei's solution actually makes a lot of sense. Thanks to its all-to-all topology, Huawei has delivered an AI training and inference system worth purchasing. When SMIC, the maker of Huawei's chips, gets to a more advanced manufacturing node, the efficiency of these systems should also increase.
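The system-level ratios quoted above follow directly from the per-chip numbers. The short Python sketch below reproduces them by multiplying the per-chip figures by the accelerator count. It assumes simple linear scaling with no interconnect or utilization losses, and the dictionary layout and function name are purely illustrative.

```python
# Per-chip and per-system figures as quoted in the article.
SYSTEMS = {
    "GB200 NVL72": {"chips": 72, "bf16_tflops": 2500, "hbm_gb": 192, "system_kw": 145},
    "CloudMatrix 384": {"chips": 384, "bf16_tflops": 780, "hbm_gb": 128, "system_kw": 560},
}


def system_totals(spec: dict) -> dict:
    """Scale per-chip figures to the whole system (assumes ideal linear scaling)."""
    pflops = spec["chips"] * spec["bf16_tflops"] / 1000
    hbm_tb = spec["chips"] * spec["hbm_gb"] / 1000
    return {
        "pflops": pflops,
        "hbm_tb": hbm_tb,
        "pflops_per_kw": pflops / spec["system_kw"],
    }


nvl72 = system_totals(SYSTEMS["GB200 NVL72"])
cm384 = system_totals(SYSTEMS["CloudMatrix 384"])

compute_ratio = cm384["pflops"] / nvl72["pflops"]
hbm_ratio = cm384["hbm_tb"] / nvl72["hbm_tb"]
power_ratio = SYSTEMS["CloudMatrix 384"]["system_kw"] / SYSTEMS["GB200 NVL72"]["system_kw"]
efficiency_ratio = nvl72["pflops_per_kw"] / cm384["pflops_per_kw"]

print(f"CloudMatrix 384 vs NVL72 compute: {compute_ratio:.2f}x")
print(f"CloudMatrix 384 vs NVL72 HBM:     {hbm_ratio:.2f}x")
print(f"CloudMatrix 384 vs NVL72 power:   {power_ratio:.2f}x")
print(f"NVL72 advantage in PFLOPS per kW: {efficiency_ratio:.2f}x")
```

Under these assumptions the totals come to roughly 300 PetaFLOPS against 180 PetaFLOPS, which matches the 1.7× compute and 3.6× HBM figures. The same arithmetic puts NVIDIA's system at a little over twice the BF16 throughput per kilowatt.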

SMIC Reportedly On Track to Finalize 5 nm Process in 2025, Projected to Cost 40-50% More Than TSMC Equivalent

According to a report produced by semiconductor industry analysts at Kiwoom Securities, a South Korean financial services firm, Semiconductor Manufacturing International Corporation (SMIC) is expected to complete the development of a 5 nm process at some point in 2025. Jukanlosreve summarized this projection in a recent social media post. SMIC is often considered to be China's flagship foundry business; the partially state-owned organization seems to be heavily involved in the production of (rumored) next-gen Huawei Ascend 910 AI accelerators. SMIC foundry employees have reportedly struggled to break beyond a 7 nm manufacturing barrier, due to lack of readily accessible cutting-edge EUV equipment. As covered on TechPowerUp last month, leading lights within China's semiconductor industry are (allegedly) developing lithography solutions for cutting-edge 5 nm and 3 nm wafer production.

Huawei is reportedly evaluating an in-house developed laser-induced discharge plasma (LDP)-based machine, but finalized equipment will not be ready until 2026, at least for mass production purposes. Jukanlosreve's short interpretation of Kiwoom's report reads as follows: "(SMIC) achieved mass production of the 7 nm (N+2) process without EUV and completed the development of the 5 nm process to support the mass production of the Huawei Ascend 910C. The cost of SMIC's 5 nm process is 40-50% higher than TSMC's, and its yield is roughly one-third." The nation's foundries are reliant on older ASML equipment, and are thus unable to produce products that can compete with the advanced (volume and quality) output of "global" TSMC and Samsung chip manufacturing facilities. The fresh unveiling of SiCarrier's Color Mountain series has signalled a promising new era for China's foundry industry.

Huawei Obtained Two Million Ascend 910B Dies from TSMC via Shell Companies to Circumvent US Sanctions

According to a recent Center for Strategic and International Studies report, Huawei got its hands on approximately two million Ascend 910B logic dies through shell companies that misled TSMC. This acquisition violates US export controls designed to restrict China's access to advanced semiconductor technology. The report details how Huawei leveraged intermediaries to procure chiplets for its AI accelerators before TSMC discovered the deception and halted shipments. These components are critical for Huawei's AI hardware roadmap, which progressed from the original Ascend 910 (manufactured by TSMC on N7+ until 2020) to the domestically produced Ascend 910B and 910C chips fabricated at SMIC using first and second-generation 7 nm-class technologies, respectively. Huawei reportedly wanted TSMC-made dies because of manufacturing challenges in domestic chip production. The Ascend 910B and 910C reportedly suffer from poor yields, with approximately 25% of units failing during the advanced packaging process that combines compute dies with HBM memory.

Despite these challenges, the performance gap with market-leading solutions still remains but has narrowed considerably, with the Ascend 910C reportedly delivering 60% of NVIDIA H100's performance. Huawei has executed a strategic stockpiling initiative, particularly for high-bandwidth memory components. The company likely acquired substantial HBM inventory between August and December 2024, when restrictions on advanced memory sales to China were announced but not yet implemented. The semiconductor supply chain breach shows that enforcing technology export controls is challenging, and third parties can still purchase silicon for restricted companies. While Huawei continues building AI infrastructure for both internal projects and external customers, manufacturing constraints may limit its ability to scale deployments against competitors with access to more advanced manufacturing processes. Perhaps a future domestic EUV-based silicon manufacturing flow will allow Huawei to gain access to more advanced domestic production, completely circumventing US-imposed restrictions.

Huawei Ascend AI Accelerator Production Yields Reportedly "Doubled" in Early 2025

Huawei is likely celebrating milestones on multiple fronts. As reported earlier this month, the Chinese technology manufacturer has pulled in record revenues and experienced consistent growth. Additionally, industry insiders believe that things are going well within the company's production pipeline. According to a Financial Times report, Huawei's next-generation AI accelerator model is on the way: the unannounced "Ascend 910C" is touted to directly compete with NVIDIA's H100 AI GPU. Industry moles believe that Huawei has partnered with SMIC for the manufacture of in-house accelerator designs. Whispers suggest a selection of the foundry's 7 nm N+2 process.

The alleged doubling of production yields within a year, from 20% to 40%, signals a significant achievement. As reported by FT, this milestone indicates Huawei's Ascend chip production line becoming profitable for the very first time. Two inside sources propose that Huawei and SMIC are targeting a 60% yield goal in the near future. In 2025, leaked plans suggest production tallies of roughly 100,000 Ascend 910C processors, and 300,000 of the current-gen Ascend 910B chip.
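To see why a yield improvement of this size can decide profitability, consider a minimal Python sketch of cost per good chip. It assumes the cost of each chip started is fixed, so cost per working chip scales inversely with yield. This is a simplification for illustration only: the function name is hypothetical and no actual Huawei or SMIC cost data is used.

```python
def relative_cost_per_good_chip(yield_fraction: float, baseline_yield: float = 0.20) -> float:
    """Cost per working chip relative to the baseline yield.

    Assumes a fixed cost per chip started, so cost per good chip is
    proportional to 1 / yield. Real cost structures are more complex.
    """
    if not 0 < yield_fraction <= 1:
        raise ValueError("yield_fraction must be in (0, 1]")
    return baseline_yield / yield_fraction


# Yield levels mentioned in the article: 20% (before), 40% (now), 60% (target).
for y in (0.20, 0.40, 0.60):
    print(f"yield {y:.0%}: {relative_cost_per_good_chip(y):.2f}x baseline cost per good chip")
```

Under that assumption, moving from 20% to 40% halves the cost of each working chip, and reaching the 60% target would cut it to about a third of the original level.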

Huawei Delivers Record $118 Billion Revenue with 22% Yearly Growth Despite US Sanctions

Huawei Technologies reported a robust 22% year-over-year revenue increase for 2024, reaching 860 billion yuan ($118.27 billion), demonstrating remarkable resilience amid continued US-imposed trade restrictions. The Chinese tech giant's resurgence was primarily driven by its revitalized smartphone division, which captured 16% of China's domestic market share, overtaking Apple in regional sales. This achievement was notably accomplished by deploying domestically produced chipsets, marking a significant milestone for the company. In collaboration with Chinese SMIC, Huawei delivers in-house silicon solutions to integrate with HarmonyOS for complete vertical integration. The company's strategic diversification into automotive technology has emerged as a crucial growth vector, with its smart car solutions unit delivering autonomous driving software and specialized chips to Chinese EV manufacturers.

In parallel, Huawei's Ascend AI 910B/C platform recently gained compatibility with DeepSeek's R1 large language model and became available on Chinese AI cloud providers like SiliconFlow. Through a strategic partnership with AI infrastructure startup SiliconFlow, Huawei is enhancing its Ascend cloud service capabilities, further strengthening its competitive position in the global AI hardware market despite ongoing international trade challenges. Even if the company can't compete on performance versus the latest solutions from NVIDIA and AMD due to the lack of advanced manufacturing required for AI accelerators, it can compete on costs and deliver solutions that are typically much more competitive on price/performance ratio. Huawei's Ascend AI solutions deliver modest performance. Still, the pricing makes AI model inference very cheap, with API costs of around one Yuan per million input tokens and four Yuan per million output tokens on DeepSeek R1.

Artificial Intelligence (AI) Chips Market to Grow by USD 902.6 Billion by 2029: Technavio

Report with market evolution powered by AI: The global artificial intelligence (AI) chips market size is estimated to grow by USD 902.6 billion from 2025-2029, according to Technavio. The market is estimated to grow at a CAGR of over 81.2% during the forecast period. Increased focus on developing AI chips for smartphones is driving market growth, with a trend towards convergence of AI and IoT. However, a dearth of technically skilled workers for AI chip development poses a challenge. Key market players include Advanced Micro Devices Inc., Baidu Inc., Broadcom Inc., Cerebras, Fujitsu Ltd., Google LLC, Graphcore Ltd., Huawei Technologies Co. Ltd., Intel Corp., International Business Machines Corp., MediaTek Inc., Microchip Technology Inc., NVIDIA Corp., NXP Semiconductors NV, Qualcomm Inc., SambaNova Systems Inc., Samsung Electronics Co. Ltd., SenseTime Group Inc., Taiwan Semiconductor Manufacturing Co. Ltd., and Tesla Inc.

Huawei Ascend 910B Accelerators Power Cloud Infrastructure for DeepSeek R1 Inference

When High-Flyer, the hedge fund behind DeepSeek, debuted its flagship model, DeepSeek R1, the tech world was shaken. No one expected that Chinese AI companies could produce a high-quality AI model that rivals the best from OpenAI and Anthropic. While there are rumors that DeepSeek has access to 50,000 NVIDIA "Hopper" GPUs, including H100, H800, and H20, it seems like Huawei is ready to power Chinese AI infrastructure with its AI accelerators. According to the South China Morning Post, Chinese cloud providers like SiliconFlow.cn are offering DeepSeek AI models for inference on Huawei Ascend 910B accelerators. At a price of only one Yuan per million input tokens and four Yuan per million output tokens, this economic model of AI hosting fundamentally undercuts competitors such as US-based cloud providers that offer DeepSeek R1 for $7 per million tokens.

Not only is running on the Huawei Ascend 910B cheaper for cloud providers, but we also reported that it is cheaper for DeepSeek itself, which serves its chat app on the Huawei Ascend 910C. Using domestic accelerators lowers the total cost of ownership, with savings passed down to users. If Western clients prefer AI inference to be served by Western companies, they will have to pay a heftier price tag, often backed by the high prices of GPUs like NVIDIA H100, B100, and AMD Instinct MI300X.
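To put the pricing gap in concrete terms, the Python sketch below compares the two price lists for a sample workload. Two assumptions are baked in and are not from this article: the exchange rate is the one implied by the 860 billion yuan and $118.27 billion revenue figures quoted elsewhere on this page, and the $7 per million tokens US price is treated as a flat rate for input and output alike. All names are illustrative.

```python
# Assumption: exchange rate implied by the revenue figures quoted on this page
# (860 billion yuan = $118.27 billion), roughly 7.27 yuan per US dollar.
YUAN_PER_USD = 860 / 118.27

ASCEND_YUAN_PER_M_INPUT = 1.0   # quoted price per million input tokens
ASCEND_YUAN_PER_M_OUTPUT = 4.0  # quoted price per million output tokens
US_USD_PER_M_TOKENS = 7.0       # assumption: flat rate for input and output


def ascend_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD on the Ascend-hosted service at the quoted Yuan prices."""
    yuan = (input_tokens / 1e6) * ASCEND_YUAN_PER_M_INPUT
    yuan += (output_tokens / 1e6) * ASCEND_YUAN_PER_M_OUTPUT
    return yuan / YUAN_PER_USD


def us_cost_usd(total_tokens: int) -> float:
    """Cost in USD at the quoted US cloud price, applied to all tokens."""
    return (total_tokens / 1e6) * US_USD_PER_M_TOKENS


# Sample workload: one million input tokens and one million output tokens.
ascend = ascend_cost_usd(1_000_000, 1_000_000)
us = us_cost_usd(2_000_000)
print(f"Ascend-hosted: ${ascend:.2f}, US-hosted: ${us:.2f}, ratio: {us / ascend:.1f}x")
```

For that sample workload, the Ascend-hosted price works out to well under one dollar against $14 at the US rate, a gap of roughly twenty times under these assumptions. The ratio shifts with the input/output mix and the exchange rate used.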

Reports Suggest DeepSeek Running Inference on Huawei Ascend 910C AI GPUs

Huawei's Ascend 910C AI chip was positioned as one of the better Chinese-developed alternatives to NVIDIA's H100 accelerator—reports from last autumn suggested that samples were being sent to highly important customers. The likes of Alibaba, Baidu, and Tencent have long relied on Team Green enterprise hardware for all manner of AI crunching, but trade sanctions have severely limited the supply and potency of Western-developed AI chips. NVIDIA's region-specific B20 "Blackwell" accelerator is due for release this year, but industry watchdogs reckon that the Ascend 910C AI GPU is a strong rival. The latest online rumblings have pointed to another major Huawei customer—DeepSeek—having Ascend silicon in their back pockets.

DeepSeek's recent unveiling of its R1 open-source large language model has disrupted international AI markets. A lot of press attention has focused on DeepSeek's CEO stating that his team can access up to 50,000 NVIDIA H100 GPUs, but many have not looked into the company's (alleged) pool of natively-made chips. Yesterday, Alexander Doria—an LLM enthusiast—shared an interesting insight: "I feel this should be a much bigger story—DeepSeek has trained on NVIDIA H800, but is running inference on the new home Chinese chips made by Huawei, the 910C." Experts believe that there will be a plentiful supply of Ascend 910C GPUs—estimates from last September posit that 70,000 chips (worth around $2 billion) were in the mass production pipeline. Additionally, industry whispers suggest that Huawei is already working on a—presumably, even more powerful—successor.

TSMC Cuts Off Chinese Firm For Reportedly Shipping to Sanctioned Huawei

According to a recent Reuters report, TSMC has decided to cut off Chinese firm Sophgo following the discovery of TSMC-manufactured components in Huawei's advanced AI processor. The suspension came after technology research firm TechInsights identified a TSMC-manufactured chip within Huawei's Ascend 910B processor during a detailed analysis. This discovery raised significant concerns, as Huawei has been restricted from accessing such technology under US export controls since 2020. TSMC promptly notified US authorities upon learning of the situation and launched an internal investigation. While being sanctioned by the US, Huawei needed to use a proxy firm to get access to high-end silicon manufacturing to produce its Ascend accelerators.

Sophgo, which has ties to cryptocurrency mining equipment manufacturer Bitmain, strongly denies any business relationship with Huawei. The company states it has provided TSMC with a detailed investigation report asserting its compliance with all applicable laws, saying: "SOPHGO has never been engaged in any direct or indirect business relationship with Huawei. SOPHGO has been conducting business in strict compliance with applicable laws and regulations, including but not limited to all the applicable US national export control laws and regulations, and has never been in violation of any of such laws and regulations. SOPHGO has provided detailed investigation report to TSMC to prove that SOPHGO is not related to the Huawei investigation."
Jun 12th, 2025 20:43 EEST