Feed aggregator

Vishay Intertechnology Automotive Grade IHDM Inductors Offer Stable Inductance and Saturation at Temps to +180 °C

ELE Times - 2 hours 11 min ago

Vishay Intertechnology, Inc. introduced two new IHDM Automotive Grade edge-wound, through-hole inductors in the 1107 case size with soft saturation current to 422 A. Featuring a powdered iron alloy core technology, the Vishay Inductors Division’s IHDM-1107BBEV-2A and IHDM-1107BBEV-3A provide stable inductance and saturation over a demanding operating temperature range from -40 °C to +180 °C with low power losses and excellent heat dissipation.

The edge-wound coil of the newly released devices provides low DCR, down to 0.22 mΩ, which minimizes losses and improves rated current performance for increased efficiency. Compared to competing ferrite-based solutions, the IHDM-1107BBEV-2A and IHDM-1107BBEV-3A offer 30 % higher rated current and 30 % higher saturation current levels at +125 °C. The inductors’ soft saturation provides a predictable inductance decrease with increasing current, independent of temperature.

With a high isolation voltage rating up to 350 V, the AEC-Q200 qualified devices are ideal for high current, high temperature power applications, including DC/DC converters, inverters, on-board chargers (OBC), domain control units (DCU), and filters for motor and switching noise suppression in internal combustion (ICE), hybrid (HEV), and full-electric (EV) vehicles. The inductors are available with a selection of two core materials for optimized performance depending on the application.

Standard terminals for the IHDM-1107BBEV-2A and IHDM-1107BBEV-3A are stripped and tinned for through-hole mounting. Vishay can customize the devices’ performance — including inductance, DCR, rated current, and voltage rating — upon request. Customizable mounting options include bare copper, surface-mount, and press fit. To reduce the risk of whisker growth, the inductors feature a hot-dipped tin plating. The devices are RoHS-compliant, halogen-free, and Vishay Green.

The post Vishay Intertechnology Automotive Grade IHDM Inductors Offer Stable Inductance and Saturation at Temps to +180 °C appeared first on ELE Times.

Wi-Fi 8 Is on the Horizon. Qualcomm Outlines Priorities and Capabilities

AAC - 8 hours 45 min ago
What does a wireless standard look like when shaped by edge cases? Qualcomm has a reliability-focused vision for Wi-Fi 8, slated for 2028.

Navitas cuts losses in Q2 despite revenue still being down year-on-year

Semiconductor today - Wed, 08/06/2025 - 21:03
For second-quarter 2025, gallium nitride (GaN) power IC and silicon carbide (SiC) technology firm Navitas Semiconductor Corp of Torrance, CA, USA has reported revenue of $14.49m, down on $20.47m a year ago but up slightly on $14m last quarter...

Coherent inaugurates $127m factory in Vietnam

Semiconductor today - Wed, 08/06/2025 - 20:51
Materials, networking and laser technology firm Coherent Corp of Saxonburg, PA, USA has inaugurated its new $127m manufacturing facility in Nhon Trach Industrial Park, Dong Nai province, southern Vietnam, which will produce precision-engineered materials and photonics components used in applications spanning smartphones and electric vehicles to advanced medical devices...

Why Smart Meter Accuracy Starts With Embedded Design

AAC - Wed, 08/06/2025 - 20:00
Unreliable data is a serious problem for smart meters. This industry article explains why, and presents a solution in the form of embedded software.

The second version of my A+E Key M.2 to Front Panel USB 2.0 Adapter Card

Reddit:Electronics - Wed, 08/06/2025 - 17:48
The second version of my A+E Key M.2 to Front Panel USB 2.0 Adapter Card

I posted V1.0 here a few months ago and a couple people pointed out some problems. I also found some of my own. I need to change the design, so I've made V1.1. I've made a lot of improvements to the board and my documentation. All of my progress can be tracked in the v1.1 branch on my github. I am planning on ordering new boards soon. Any feedback would be appreciated.

submitted by /u/SuperCookieGaming

Flip ON Flop OFF: high(ish) voltages from the positive supply rail

EDN Network - Wed, 08/06/2025 - 17:07

We’ve seen lots of interesting conversations and Design Idea (DI) collaboration devising circuits for power switching using inexpensive (and cute!) momentary-contact SPST pushbuttons. A recent and interesting extension of this theme by frequent contributor R Jayapal addresses control of relatively high DC voltages: 48 volts in his chosen case.

Wow the engineering world with your unique design: Design Ideas Submission Guide

In the course of implementing its high voltage feature, Jayapal’s design switches the negative (Vss a.k.a. “ground”) rail of the incoming supply instead of the (more conventional) positive (Vdd) rail. Of course, there’s absolutely nothing physically wrong with this choice (certainly the electrons don’t know the difference!). But because it’s a bit unconventional, I worry that it might create possibilities for the unwary to make accidental, and potentially destructive, misconnections.

Figure 1’s circuit takes a different tack to avoid that.

Figure 1 Flip ON/Flop OFF referenced to the V+ rail. If V+ < 15 V, then set R4 = 0 and omit C2 and Z1. Ensure that C2’s voltage rating is > (V+ – 15 V) and, if V+ > 80 V, that R4 > 4(V+)².

Figure 1 returns to an earlier theme of using a PFET to switch the positive rail for power control, and a pair of unbuffered CMOS inverters to create a toggling latch to control the FET. The basic circuit is described in “Flip ON Flop OFF without a Flip/Flop.”

What’s different here is that all circuit nodes are referenced to V+ instead of ground, and Zener Z1 is used to synthesize a local bias reference. Consequently, any V+ rail up to the limit of Q1’s Vds rating can be accommodated. Of course, if even that’s not good enough, higher rated FETs are available.

Be sure to tie the inputs of any unused U1 gates to V+.
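For readers sizing the bias network, here is a small helper that applies the Figure 1 caption rules. It takes the R4 > 4(V+)² guideline as giving a resistance in ohms and treats Z1 as a nominal 15 V Zener; both are assumptions, so read it as a sketch rather than the author’s procedure.

def figure1_bias_parts(v_plus):
    """Suggest Figure 1 bias-network values for a given V+ supply (volts)."""
    if v_plus < 15:
        # Below 15 V the caption says no bias network is needed.
        return {"R4_ohms": 0, "C2": "omit", "Z1": "omit"}
    parts = {
        "C2_min_voltage_rating_V": v_plus - 15,  # C2 sees V+ minus the Zener drop
        "Z1": "nominal 15 V Zener (assumed local bias reference)",
    }
    # Caption rule for higher rails, assumed to give ohms: R4 > 4 * (V+)^2
    parts["R4_ohms_min"] = 4 * v_plus ** 2 if v_plus > 80 else "per design"
    return parts

print(figure1_bias_parts(48))    # the 48 V case discussed above
print(figure1_bias_parts(100))   # R4 > 40 kΩ under the assumed rule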

Stephen Woodward’s relationship with EDN’s DI column goes back quite a long way. Over 100 submissions have been accepted since his first contribution back in 1974.


The post Flip ON Flop OFF: high(ish) voltages from the positive supply rail appeared first on EDN.

Hack Club Highway - My first two PCBs

Reddit:Electronics - Wed, 08/06/2025 - 17:06
Hack Club Highway - My first two PCBs
Project 1: µController - A Custom Game Controller for Unrailed

I designed this compact controller specifically for playing Unrailed. Here's what makes it special:

  • Custom PCB with USB-C connectivity
  • Battery-powered with a boost converter for stable 5V
  • Hall effect sensors for precise control

The journey wasn't without its challenges - I may have slightly overheated a Nano S3 during assembly 😅 but managed to salvage it with some creative bodge-wiring using a Xiao. Currently, it's fully functional except for one hall effect sensor!

Project 2: The Overkill Macro Pad

Ever thought "I need more buttons"? Well, how about 100 of them?

Features:
  • 100 mechanical switches
  • Individual RGB LEDs for EVERY key
  • OLED display
  • Powered by a Raspberry Pi Pico
  • Auto polarity-correcting power input (because who has time to plug in power the right way?)

Some fun challenges I ran into:
  • Had to redo the PCB multiple times (always double-check your footprints!)
  • Learned the hard way about thermal management during soldering
  • Discovered that 100 LEDs can create some interesting signal integrity challenges
  • Found some microscopic shorts that only showed up when the board heated up (freezer debugging FTW!)

Currently, it's working with some bodge wires, though a few keys are still being stubborn. The case needs some tweaking, but hey, that's part of the fun of DIY, right?

Lessons Learned
  1. Don't rush soldering - thermal management is crucial
  2. Always verify footprints BEFORE ordering PCBs
  3. When in doubt, add level shifters
  4. Hardware debugging requires equal parts patience and creativity

Both projects are open source, and I'll be happy to share more details if anyone's interested! Let me know if you have any questions!

submitted by /u/RunTheBot

TDK showcases at electronica India 2025 its latest technologies driving transformation in automotive, industrial, sustainable energy, and digital systems

ELE Times - Wed, 08/06/2025 - 15:00

Under the theme “Accelerating transformation for a sustainable future,” TDK will present its highlight solutions from September 17 to 19, 2025, at the Bangalore International Exhibition Centre (BIEC).

  • At hall 3, booth H3.D01, visitors can explore innovations in automotive solutions, EV charging, renewable energy, industrial automation, smart metering, and AI-powered sensing
  • TDK’s technologies support the region’s shift toward cleaner mobility, intelligent infrastructure, and energy-efficient living across automotive, industrial, and consumer sectors

TDK Corporation will showcase its latest component and solution portfolio at electronica India 2025, held from September 17 to 19, 2025, at the Bangalore International Exhibition Centre (BIEC).

With the theme “Accelerating transformation for a sustainable future,” TDK presents technologies that reflect the region’s priorities in mobility electrification, industrial modernization, renewable energy, and digital infrastructure. The exhibit at hall 3, booth H3.D01 features live demonstrations and expert-led insights across key applications — from electric vehicles and smart factories to energy-efficient homes and immersive digital experiences.

TDK’s solution highlights at electronica India 2025:

Automotive solutions: Explore TDK’s comprehensive portfolio for electric two-wheelers and passenger vehicles, including components for battery management, motor control, onboard charging, and ADAS. Highlights include haptic feedback modules, Hall-effect sensors, and a live demo of the Volkswagen ID.3 traction motor featuring precision sensing technologies.

EV charging: Experience innovations in DC fast charging, including components for bi-directional DC-DC conversion, varistors, inductors, and transformers. A live 11 kW reference board demonstrates scalable, efficient charging for India’s growing e-mobility infrastructure.

Industrial automation: Discover intelligent sensing and connectivity solutions that boost uptime and efficiency. Live demos include SmartSonic Mascotto (MEMS time-of-flight), USSM (ultrasonic module), and VIBO (industrial accelerometer) – all designed to support predictive maintenance and smart infrastructure.

Energy & home: TDK presents high-voltage contactors, film capacitors, and protection devices for solar, wind, hydrogen, and storage systems. Explore TDK’s India-made portfolio of advanced passive components and power quality solutions, developed at the company’s state-of-the-art Nashik and Kalyani facilities. These technologies support a wide range of applications, including mobility, industrial systems, energy infrastructure, and home appliances.

Smart metering: TDK showcases ultrasonic sensor disks, NTC sensors, inductors, and RF solutions that enable accurate and connected metering for electricity, water, and gas, supporting smarter utility management.

ICT & Connectivity: Explore AR/VR retinal projection modules, energy harvesting systems, and acoustic innovations. Highlights include PiezoListen floating speakers, BLE-powered CeraCharge demos, and immersive sound and navigation technologies for smart devices and wearables.

Accessibility & Innovation: TDK presents the WeWALK Smart Cane, powered by ultrasonic time-of-flight sensors, accelerometers, gyroscopes, and MEMS microphones — enhancing mobility and independence for visually impaired users.

The post TDK showcases at electronica India 2025 its latest technologies driving transformation in automotive, industrial, sustainable energy, and digital systems appeared first on ELE Times.

k-Space hires new sales director

Semiconductor today - Wed, 08/06/2025 - 13:18
k-Space Associates Inc of Dexter, MI, USA — which produces thin-film metrology instrumentation and software — has recruited Heidi Olson as its new sales director, supporting its global growth in the semiconductor and photovoltaics industries and continuing to build out its presence in the glass manufacturing, solar and industrial markets...

Top 10 Machine Learning Frameworks

ELE Times - Wed, 08/06/2025 - 12:43

Today’s world includes self-driving cars, voice assistants, recommendation engines, and even medical diagnoses, all powered at their core by robust machine learning frameworks. These frameworks are the engines that fuel such intelligent systems. This article will define what a machine learning framework is, mention some popular examples, and review the top 10 ML frameworks.

A machine learning framework is a set of tools, libraries, and interfaces that assists developers and data scientists in building, training, testing, and deploying machine learning models.

It functions as a ready-made software toolkit, handling the intricate code and math so that users may concentrate on creating and testing algorithms.

Here is how most ML frameworks work (a minimal end-to-end sketch follows the list):

  1. Data Input: You feed your data into the framework (structured/unstructured).
  2. Model Building: Pick or design an algorithm (e.g., neural networks).
  3. Training: The model is fed data so it learns by adjusting weights via optimization techniques.
  4. Evaluation: Check the model’s accuracy on new, unseen data.
  5. Deployment: Roll out the trained model to production environments (mobile applications, websites, etc.).
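Here is that sketch, using scikit-learn and a bundled toy dataset; the model choice and file name are illustrative assumptions, and any framework follows the same flow.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
import joblib

# 1. Data input
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 2. Model building
model = DecisionTreeClassifier(max_depth=3)

# 3. Training
model.fit(X_train, y_train)

# 4. Evaluation on held-out data
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 5. Deployment: serialize the trained model for an application to load
joblib.dump(model, "model.joblib")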

Examples of Machine Learning Frameworks:

  • TensorFlow
  • PyTorch
  • Scikit-learn
  • Keras
  • XGBoost

Top 10 Machine Learning Frameworks:

  1. TensorFlow

Google Brain created the open-source TensorFlow framework for artificial intelligence (AI) and machine learning (ML). It provides the tools needed to build, train, and deploy machine learning models, especially deep learning models, across several platforms.

Applications supported by TensorFlow are diverse and include time series forecasting, reinforcement learning, computer vision and natural language processing.
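As a quick illustration, here is a minimal TensorFlow sketch that defines, trains, and runs a small dense network on random data; the shapes and hyperparameters are arbitrary assumptions, and the same high-level Keras API appears again later in this list.

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

X = np.random.rand(256, 20).astype("float32")   # toy inputs
y = np.random.randint(0, 2, size=(256, 1))      # toy binary labels
model.fit(X, y, epochs=2, batch_size=32, verbose=0)

print(model.predict(X[:3]))  # inference on a few samples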

  2. PyTorch

Created by Facebook AI Research, PyTorch is a prominent yet beginner-friendly framework that is especially popular in academic research. PyTorch uses dynamic computation graphs, which make debugging and experimentation easy. Its flexibility has made it the primary framework behind many deep learning breakthroughs and research papers.
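A small sketch of that define-by-run behavior: the forward pass is ordinary Python, so data-dependent control flow and debugging work per iteration, and autograd records the graph on the fly (shapes are arbitrary; only standard torch calls are used).

import torch

x = torch.randn(4, 3, requires_grad=True)
w = torch.randn(3, 2, requires_grad=True)

y = x @ w                      # graph is built as these ops execute
loss = (y ** 2).mean()
if loss.item() > 0.5:          # data-dependent Python control flow
    loss = loss * 2

loss.backward()                # gradients via the recorded dynamic graph
print(w.grad.shape)            # torch.Size([3, 2])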

  3. Scikit-learn

Scikit-learn is a Python library built upon NumPy and SciPy. It’s the best choice for classical machine learning algorithms like linear regression, decision trees, and clustering. Its simple, well-documented API makes it a good fit for prototyping with small to medium-sized datasets.

  4. Keras

Keras is a high-level API tightly integrated into TensorFlow. Its interface promotes modern deep learning practices and makes ML problems easier to tackle. Keras covers every stage an ML engineer works through when building a solution: data processing, hyperparameter tuning, deployment, and more. It was designed to enable fast experimentation.

  5. XGBoost

XGBoost (Extreme Gradient Boosting) is an advanced machine learning technique geared toward efficiency, speed, and performance. It is a scalable, distributed library based on gradient-boosted decision trees (GBDT). It is among the best machine learning libraries for regression, classification, and ranking, offering parallel tree boosting.

Understanding the foundations XGBoost builds on is important: supervised machine learning, decision trees, ensemble learning, and gradient boosting.
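For illustration, a minimal XGBoost classification sketch using its scikit-learn-style API on a synthetic dataset (parameters are arbitrary assumptions):

from xgboost import XGBClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
clf.fit(X_train, y_train)                      # parallel tree boosting under the hood
print("test accuracy:", clf.score(X_test, y_test))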

  6. LightGBM

LightGBM is an open-source, high-performance framework created by Microsoft. It is a gradient boosting technique used within an ensemble learning framework.

LightGBM is a fast gradient boosting framework that uses tree-based learning algorithms. It was developed for production environments with speed and scalability in mind. Training times are much shorter and it consumes fewer compute resources; memory requirements are also lower, making it suitable for resource-constrained systems.

LightGBM will also, in many cases, provide better predictive accuracy because of its novel histogram-based algorithm and optimized decision-tree growth strategies. It supports parallel learning, distributed training across multiple machines, and GPU acceleration, allowing it to scale to massive datasets while maintaining performance.

  7. JAX

JAX is an open-source machine learning framework based on the functional programming paradigm, developed and maintained by Google. It builds on XLA (Accelerated Linear Algebra) and Autograd, and is known for high-performance numerical computation and automatic differentiation, which help in implementing many machine learning algorithms. Although a relatively new machine learning framework, JAX already provides features useful for building machine learning models.

  8. CNTK

Microsoft Cognitive Toolkit (CNTK) is an open-source deep learning framework developed by Microsoft for efficient training of deep neural networks. It scales training across multiple GPUs and multiple servers, which is especially useful for large datasets and complex architectures. CNTK is flexible, supporting nearly all classes of neural networks, including feedforward, convolutional, and recurrent networks, and is useful for many kinds of machine learning tasks.

  9. Apache Spark MLlib

Apache Spark MLlib is Apache Spark’s scalable machine learning library built to ease the development and deployment of machine learning apps for large datasets. It offers a rich set of tools and algorithms for various machine learning tasks. It is designed for simplicity, scalability and easy integration with other tools.

  10. Hugging Face Transformers

Hugging Face Transformers is an open-source deep learning framework developed by Hugging Face. It provides APIs and interfaces for downloading state-of-the-art pre-trained models, which users can then fine-tune for their own purposes. The models cover common tasks across modalities, including natural language processing, computer vision, audio, and multi-modal applications, and many are trained for specific tasks, making the library a standard toolkit for NLP.
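A minimal usage sketch with the library’s pipeline API; no model name is specified here, so the library fetches its default sentiment-analysis checkpoint (an assumption about the user’s task, not a recommendation).

from transformers import pipeline

classifier = pipeline("sentiment-analysis")   # downloads a pre-trained model
print(classifier("Machine learning frameworks make prototyping fast."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]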

Conclusion:

Machine learning frameworks represent the very backbone of modern AI applications. Whether you are a beginner or a seasoned professional building advanced AI solutions, the right framework makes all the difference.

From major players such as TensorFlow and PyTorch to more specialized ones such as Hugging Face and LightGBM, each framework has particular strengths that suit different kinds of tasks and industries.

The post Top 10 Machine Learning Frameworks appeared first on ELE Times.

Keysight Automated Test Solution Validates Fortinet’s SSL Deep Inspection Performance and Network Security Efficacy

ELE Times - Wed, 08/06/2025 - 09:32

Keysight BreakingPoint QuickTest simplifies application performance and security effectiveness assessments with predefined test configurations and self-stabilizing, goal-seeking algorithms

Keysight Technologies, Inc. announced that Fortinet chose the Keysight BreakingPoint QuickTest network application and security test tool to validate SSL deep packet inspection performance capabilities and security efficacy of its FortiGate 700G series next-generation firewall (NGFW). BreakingPoint QuickTest is Keysight’s turn-key performance and security validation solution with self-stabilizing, goal-seeking algorithms that quickly assess the performance and security efficacy of a variety of network infrastructures.

Enterprise networks and systems face a constant onslaught of cyber-attacks, including malware, vulnerabilities, and evasions. These attacks are taking a toll, as 67% of enterprises report suffering a breach in the past two years, while breach-related lawsuits have risen 500% in the last four years.

Fortinet developed the FortiGate 700G series NGFW to help protect enterprise edge and distributed enterprise networks from these ever-increasing cybersecurity threats, while continuing to process legitimate customer-driven traffic that is vital to their core business. The FortiGate 700G is powered by Fortinet’s proprietary Network Processor 7 (NP7), Security Processor 5 (SP5) ASIC, and FortiOS, Fortinet’s unified operating system. Requiring an application and security test solution that delivers real-world network traffic performance, relevant and reliable security assessment, repeatable results, and fast time-to-insight, Fortinet turned to Keysight’s BreakingPoint QuickTest network applications and security test tool.

Using BreakingPoint QuickTest, Fortinet validated the network performance and cybersecurity capabilities of the FortiGate 700G NGFW using:

  • Simplified Test Setup and Execution: Pre-defined performance and security assessment suites, along with easy, click-to-configure network configuration, allow users to set up complex tests in minutes.
  • Reduced Test Durations: Self-stabilizing, goal-seeking algorithms accelerate the test process and shorten the overall time-to-insight.
  • Scalable HTTP and HTTPS Traffic Generation: Supports all RFC 9411 tests used by NetSecOPEN, an industry consortium that develops open standards for network security testing. This includes the 7.7 HTTPS throughput test, allowing Fortinet to quickly assess that the FortiGate 700G NGFW’s SSL Deep Inspection engine can support up to 14 Gbps of inspected HTTPS traffic.
  • NetSecOPEN Security Efficacy Tests: BreakingPoint QuickTest supports the full suite of NetSecOPEN security efficacy tests, including malware, vulnerabilities, and evasions. This ensures the FortiGate 700G capabilities are validated with relevant, repeatable, and widely accepted industry standard test methodologies and content.
  • Robust Reporting and Real-time Metrics: Live test feedback and clear, actionable reports showed that the FortiGate 700G successfully blocked 3,838 of the 3,930 malware samples, 1,708 of the 1,711 CVE threats, and stopped 100% of evasions, earning a grade “A” across all security tests.
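The block rates implied by those figures work out as follows (simple arithmetic on the numbers quoted above):

blocked_counts = {
    "malware samples": (3838, 3930),
    "CVE threats": (1708, 1711),
}
for name, (blocked, total) in blocked_counts.items():
    print(f"{name}: {blocked}/{total} blocked = {blocked / total:.1%}")
# malware samples: 3838/3930 blocked = 97.7%
# CVE threats: 1708/1711 blocked = 99.8%
# Evasions were reported as 100% blocked.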

Nirav Shah, Senior Vice President, Products and Solutions, Fortinet, said: “The FortiGate 700G series next-generation firewall combines cutting-edge artificial intelligence and machine learning with the port density and application throughput enterprises need, delivering comprehensive threat protection at any scale. Keysight’s intuitive BreakingPoint QuickTest application and security test tool made our validation process easy. It provided clear and definitive results that the FortiGate 700G series NGFW equips organizations with the performance and advanced network security capabilities required to stay ahead of current and emerging cyberthreats.”

Ram Periakaruppan, Vice President and General Manager, Keysight Network Test and Security Solutions, said: “The landscape of cyber threats is constantly evolving, so enterprises must be vigilant in adapting their network defenses, while also continuing to meet their business objectives. Keysight’s network application and security test solutions help alleviate the pressure these demands place on network equipment manufacturers by providing an easy-to-use package with pre-defined performance and security tests, innovative goal-seeking algorithms, and continuously updated benchmarking content, ensuring solutions meet rigorous industry requirements.”

The post Keysight Automated Test Solution Validates Fortinet’s SSL Deep Inspection Performance and Network Security Efficacy appeared first on ELE Times.

TIL you can use the iPhone magnifier app to inspect PCB much better than the camera app

Reddit:Electronics - Wed, 08/06/2025 - 02:22
TIL you can use the iPhone magnifier app to inspect PCB much better than the camera app

One of the difficulties I had with the camera app is that you couldn't leave the LED on for close up pictures to read off resistor codes. The magnifier app will let you manually leave the iPhone flashlight on, and set a fixed zoom if needed and save the controls layout so you can jump back to PCB inspection. The first picture is with the magnifier and the second is with the iPhone camera app. It saves you from needing to take a PCB to a microscope to figure out what was up with it. Also saves some disassembly to get the PCB out of whatever it is installed in. I was able to figure out the board at some point had been hand soldered with the wrong resistor value and that was the source of all our issues.

submitted by /u/grahasbtye

First Ethernet-Based AI Memory Fabric System to Increase LLM Efficiency

AAC - Wed, 08/06/2025 - 02:00
Enfabrica's new Ethernet-based AI memory fabric system drops AI inference cost per user per token by up to 50%.

AXT’s Q2 revenue constrained by slower-than-expected China export permitting

Semiconductor today - Tue, 08/05/2025 - 21:54
For second-quarter 2025, AXT Inc of Fremont, CA, USA — which makes gallium arsenide (GaAs), indium phosphide (InP) and germanium (Ge) substrates and raw materials — has reported revenue of $18m, down 7.2% on $19.4m last quarter and 35.5% on $27.9m a year ago. This is below the guidance of $20–22m provided on 1 May, but at the top end of the revised guidance of $17.5–18m issued on 9 July...

Vijay Varada's Braille display modified so that the driver of the display is integrated into the cell.

Reddit:Electronics - Tue, 08/05/2025 - 20:49

https://hackaday.io/project/191181-electromechanical-refreshable-braille-module Based on this.

This board has a cheap ch32v003 microcontroller and communicates by i2c and can be chained together so you can have multiple on the same i2c bus. This is the smallest board I have ever made. Feedback appreciated, Thank you!

submitted by /u/WWFYMN1

The next AI frontier: AI inference for less than $0.002 per query

EDN Network - Tue, 08/05/2025 - 20:48

Inference is rapidly emerging as the next major frontier in artificial intelligence (AI). Historically, the focus of AI development and deployment has been overwhelmingly on training, with approximately 80% of compute resources dedicated to it and only 20% to inference.

That balance is shifting fast. Within the next two years, the ratio is expected to reverse to 80% of AI compute devoted to inference and just 20% to training. This transition is opening a massive market opportunity with staggering revenue potential.

Inference has a fundamentally different profile: it requires low latency, high energy efficiency, and predictable real-time responsiveness. Training-optimized hardware, by contrast, entails excessive power consumption, underutilized compute, and inflated costs when pressed into inference service.

When deployed for inference, training-optimized computing resources result in a cost per query one or even two orders of magnitude higher than the $0.002-per-query benchmark established by a 2023 McKinsey analysis, which was based on Google’s 2022 search activity, estimated at an average of 100,000 queries per second.
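To put the benchmark in perspective, the figures quoted above imply the following rough scale (simple arithmetic, not part of the McKinsey analysis):

queries_per_second = 100_000     # estimated average Google search volume, 2022
benchmark_cost = 0.002           # USD per query (2023 McKinsey benchmark)

cost_per_second = queries_per_second * benchmark_cost
cost_per_year = cost_per_second * 3600 * 24 * 365
print(f"${cost_per_second:,.0f} per second  ->  ${cost_per_year / 1e9:.1f}B per year")

# Training-optimized hardware at one to two orders of magnitude higher cost:
for factor in (10, 100):
    print(f"{factor}x the benchmark -> ${benchmark_cost * factor:.2f} per query")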

Today, the market is dominated by a single player whose quarterly results reflect its stronghold. While a competitor has made some inroads and is performing respectably, it has yet to gain meaningful market share.

One reason is architectural similarity; by taking a similar approach to the main player, rather than offering a differentiated, inference-optimized alternative, the competitor faces the same limitations. To lead in the inference era, a fundamentally new processor architecture is required. The most effective approach is to build dedicated, inference-optimized infrastructure, an architecture specifically tailored to the operational realities of processing generative AI models like large language models (LLMs).

This means rethinking everything from compute units and data movement to compiler design and LLM-driven architectures. By focusing on inference-first design, it’s possible to achieve significant gains in performance-per-watt, cost-per-query, time-to-first-token, output-token-per-second, and overall scalability, especially for edge and real-time applications where responsiveness is critical.

This is where the next wave of innovation lies—not in scaling training further, but in making inference practical, sustainable, and ubiquitous.

The inference trinity

AI inference hinges on three critical pillars: low latency, high throughput and constrained power consumption, each essential for scalable, real-world deployment.

First, low latency is paramount. Unlike training, where latency is relatively inconsequential—a job taking an extra day or costing an additional million dollars is still acceptable as long as the model is successfully trained—inference operates under entirely different constraints.

Inference must happen in real time or near real time, with extremely low latency per query. Whether it’s powering a voice assistant, an autonomous vehicle or a recommendation engine, the user experience and system effectiveness hinge on sub-millisecond response times. The lower the latency, the more responsive and viable the application.

Second, high throughput at low cost is essential. AI workloads involve processing massive volumes of data, often in parallel. To support real-world usage—especially for generative AI and LLMs—AI accelerators must deliver high throughput per query while maintaining cost-efficiency.

Vendor-specified throughput often falls short of peak targets in AI workload processing due to low-efficiency architectures like GPUs, especially when the economics of inference are under intense scrutiny. These are high-stakes battles, where cost per query is not just a technical metric; it is a competitive differentiator.

Third, power efficiency shapes everything. Inference performance cannot come at the expense of runaway power consumption. This is not only a sustainability concern but also a fundamental limitation in data center design. Lower-power devices reduce the energy required for compute, and they ease the burden on the supporting infrastructure—particularly cooling, which is a major operational cost.

The trade-off can be viewed from the following two perspectives:

  • A new inference device that delivers the same performance at half the energy consumption can dramatically reduce a data center’s total power draw.
  • Alternatively, maintaining the same power envelope while doubling compute efficiency effectively doubles the data center’s performance capacity.

Bringing inference to where users are

A defining trend in AI deployment today is the shift toward moving inference closer to the user. Unlike training, inference is inherently latency-sensitive and often needs to occur in real time. This makes routing inference workloads through distant cloud data centers increasingly impractical—from both a technical and economic perspective.

To address this, organizations are prioritizing edge-based inference, processing data locally or near the point of generation. Shortening the network path between the user and the inference engine significantly improves responsiveness, reduces bandwidth costs, enhances data privacy, and ensures greater reliability, particularly in environments with limited or unstable connectivity.

This decentralized model is gaining traction across industry. Even AI giants are embracing the edge, as seen in their development of high-performance AI workstations and compact data center solutions. These innovations reflect a clear strategic shift: enabling real-time AI capabilities at the edge without compromising on compute power.

Inference acceleration from the ground up

One high-tech company, for example, is setting the engineering pace with a novel architecture designed specifically to meet the stringent demands of AI inference in data centers and at the edge. The architecture breaks away from legacy designs optimized for training workloads, delivering near-theoretical performance in latency, throughput, and energy efficiency. More entrants are certain to follow.

Below are some of the highlights of this inference technology revolution in the making.

Breaking the memory wall

The “memory wall” has challenged chip designers since the late 1980s. Traditional architectures attempt to mitigate the impact on performance introduced by data movement between external memory and processing units by layering memory hierarchies, such as multi-layer caches, scratchpads and tightly coupled memory, each offering tradeoffs between speed and capacity.

In AI acceleration, this bottleneck becomes even more pronounced. Generative AI models, especially those based on incremental transformers, must constantly reprocess massive amounts of intermediate state data. Conventional architectures struggle here. Every cache miss—or any operation requiring access outside in-memory compute—can severely degrade performance.

One approach collapses the traditional memory hierarchy into a single, unified memory stage: a massive SRAM array that behaves like a flat register file. From the perspective of the processing units, any register can be accessed anywhere, at any time, within a single clock. This eliminates costly data transfers and removes the bottlenecks that hamper other designs.

Flexible computational tiles, each with 16 high-performance processing cores dynamically reconfigurable at run time, execute either AI operations, such as multi-dimensional matrix operations (ranging from 2D to N-dimensional), or advanced digital signal processing (DSP) functions.

Precision is also adjustable on-the-fly, supporting formats from 8 bits to 32 bits in both floating point and integer. Both dense and sparse computation modes are supported, and sparsity can be applied on the fly to either weights or data—offering fine-grained control for optimizing inference workloads.

Each core features 16 million registers. While a vast register file presents challenges for traditional compilers, two key innovations come to the rescue:

  1. Native tensor processing, which handles vectors, tensors, and matrices directly in hardware, eliminates the need to reduce them to scalar operations and manually implement nested loops, as required in GPU environments like CUDA.
  2. High-level abstraction lets developers interact with the system at a high level (PyTorch and ONNX for AI, Matlab-like functions for DSP) without writing low-level code or managing registers manually. This simplifies development and significantly boosts productivity and hardware utilization (a typical PyTorch-to-ONNX hand-off is sketched below).
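A generic sketch of that hand-off, using only standard PyTorch/ONNX export APIs and an arbitrary toy model; nothing here is specific to any vendor’s toolchain.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
example_input = torch.randn(1, 128)

torch.onnx.export(
    model, example_input, "model.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}},   # allow variable batch size
)
# The resulting model.onnx (or the PyTorch module itself) is the kind of
# high-level input an inference compiler ingests and maps onto its hardware.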

Chiplet-based scalability

A physical implementation leverages a chiplet architecture, with each chiplet comprising two computational cores. By combining chiplets with high-bandwidth memory (HBM) chiplet stacks, the architecture enables highly efficient scaling for both cloud and edge inference scenarios.

  • Data center-grade inference: the configuration pairs eight VSORA chiplets with eight HBM3e chiplet stacks, delivering 3,200 TFLOPS of compute performance in FP8 dense mode, optimized for large-scale inference workloads in data centers.
  • Edge AI configurations: compute resources and memory are tailored to suit edge constraints. Here, two chiplets + one HBM chiplet = 800 TFLOPS, and four chiplets + one HBM chiplet = 1,600 TFLOPS (a quick consistency check follows).
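Taken together, these figures imply roughly 400 TFLOPS (FP8, dense) per chiplet with near-linear scaling:

TFLOPS_PER_CHIPLET = 3200 / 8    # from the eight-chiplet data-center configuration

for chiplets, hbm in [(8, 8), (4, 1), (2, 1)]:
    tflops = chiplets * TFLOPS_PER_CHIPLET
    print(f"{chiplets} chiplets + {hbm} HBM chiplet(s): ~{tflops:.0f} TFLOPS FP8 dense")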

Power efficiency as a side effect

The performance gains are clear, as is the power efficiency. The architecture delivers twice the performance-per-watt of comparable solutions. In practical terms, the chip’s power draw tops out at just 500 watts, compared to over one kilowatt for many competitors.

When combined, these innovations provide multiple times the actual performance at less than half the power—offering an overall advantage of 8 to 10 times compared to conventional implementations.

CUDA-free compilation

One often-overlooked advantage of the architecture lies in its streamlined and flexible software stack. From a compilation perspective, the flow is simplified compared to traditional GPU environments like CUDA.

The process begins with a minimal configuration file—just a few lines—that defines the target hardware environment. This file enables the same codebase to execute across a wide range of hardware configurations, whether that means distributing workloads across multiple cores, chiplets, full chips, boards, or even across nodes in a local or remote cloud. The only variable is execution speed; the functional behavior remains unchanged. This makes on-premises and localized cloud deployments seamless and scalable.

A familiar flow without complexity

Unlike CUDA-based compilation processes, the flow avoids layers of manual tuning and complexity through a more automated and hardware-agnostic compilation approach.

The flow begins by ingesting standard AI inputs, such as models defined in PyTorch. These are processed by a proprietary graph compiler that automatically performs essential transformations such as layer reordering or slicing for optimal execution. It extracts weights and model structure and then outputs an intermediate C++ representation.

This C++ code is then fed into an LLVM-based backend, which identifies the compute-intensive portions of the code and maps them to the architecture. At this stage, the system becomes hardware-aware, assigning compute operations to the appropriate configuration—whether it’s a single A tile, an edge device, a full data center accelerator, a server, a rack or even multiple racks in different locations.

Invisible acceleration for developers

From a developer’s point of view, the accelerator is invisible. Code is written as if it targets the main processor. During compilation, the compilation flow identifies the code segments best suited for acceleration and transparently handles the transformation and mapping to hardware, lowering the barrier for adoption and requiring no low-level register manipulation or specialized programming knowledge.

The instruction set is high-level and intuitive, carrying over capabilities from its origins in digital signal processing. The architecture supports AI-specific formats such as FP8 and FP16, as well as traditional DSP operations like FP16 arithmetic, all handled automatically on a per-layer basis. Switching between modes is instantaneous and requires no manual intervention.

Pipeline-independent execution and intelligent data retention

A key architectural advantage is pipeline independence—the ability to dynamically insert or remove pipeline stages based on workload needs. This gives the system a unique capacity to “look ahead and behind” within a data stream, identifying which information must be retained for reuse. As a result, data traffic is minimized, and memory access patterns are optimized for maximum performance and efficiency, reaching levels unachievable in conventional AI or DSP systems.

Built-in functional safety

To support mission-critical applications such as autonomous driving, functional safety features are integrated at the architectural level. Cores can be configured to operate in lockstep mode or in redundant configurations, enabling compliance with strict safety and reliability requirements.

In the final analysis, a memory architecture that eliminates traditional bottlenecks, compute units tailored for tensor operations, and unmatched power efficiency sets a new standard for AI inference.

Lauro Rizzatti is a business advisor to VSORA, an innovative startup offering silicon IP solutions and silicon chips, and a noted verification consultant and industry expert on hardware emulation.

 


The post The next AI frontier: AI inference for less than $0.002 per query appeared first on EDN.

Bosch Propels Advanced ADAS Forward With Pair of Radar SoCs

AAC - Tue, 08/05/2025 - 20:00
Built on 22 nm RF-CMOS and optimized for AI-based sensing, the new radar SoCs take aim at next-generation ADAS performance.

Beijing IP Court denies Innoscience’s appeal against EPC’s compensated-gate patent

Semiconductor today - Tue, 08/05/2025 - 19:11
Efficient Power Conversion Corp (EPC) of El Segundo, CA, USA — which makes enhancement-mode gallium nitride on silicon (eGaN) power field-effect transistors (FETs) and integrated circuits for power management applications —says that the Beijing IP Court has denied the appeal filed by China-based gallium nitride-on-silicon (GaN-on-Si) power solutions firm Innoscience (Suzhou) Technology Co Ltd, reaffirming the validity of EPC’s Chinese Patent No. ZL201080015425.X, ‘Compensated gate MISFET and method for fabricating the same’...
