Feed aggregator

"Urban Mosaic" by Larysa Pukhanova at the KPI museum

News - 2 hours 5 min ago
kpi Thu, 05/21/2026 - 14:40

The State Polytechnic Museum at Igor Sikorsky KPI recently hosted an exhibition by the well-known Kyiv artist Larysa Pukhanova. The exhibition presented her paintings and graphic works: cityscapes from various corners of the world, old quarters, quaint courtyards, and morning Kyiv streets in fog.

How machine vision, intelligent sensing, and edge AI are powering smart factory

EDN Network - 5 hours 41 min ago

Manufacturing is at a pivotal moment. Global supply-chain volatility, increasing energy costs, workforce shortages, and growing expectations for quality and customization are forcing factories to rethink how they operate. Traditional automation, optimized for predictability and repetition, struggles to cope with today’s variability and speed of change.

The smart factory represents a decisive shift: production environments that can sense, interpret, and adapt in real time. Central to this shift are three tightly connected technology domains: machine vision, intelligent sensing, and edge AI. Together, they enable factories not just to collect data, but to turn it into insight and action where it matters most.

Figure 1 The notion of smart factory marks a decisive shift in modern manufacturing. Source: Renesas

The limits of conventional automation

Conventional automation systems excel at executing predefined logic. However, they are inherently reactive. When processes drift, materials vary, or equipment degrades, intervention is often manual, time‑consuming, and costly.

Key pressures accelerating the move toward smarter automation include:

  • Greater product diversity driven by mass customization
  • Higher quality expectations that allow little tolerance for defects
  • Skilled labor shortages across engineering and maintenance roles
  • Soaring downtime costs, particularly in highly automated lines

Addressing these challenges requires automation that is more perceptive and context-aware: systems capable of learning from data rather than simply enforcing rules.

Below is a quick recap of the smart factory’s three key design building blocks: machine vision, intelligent sensing, and edge AI.

Machine vision: From inspection to interpretation

Machine vision is one of the most visible pillars of the smart factory. Once limited to basic presence checks or rigid defect criteria, today’s vision systems can interpret complex scenes and adapt to variation.

Seeing beyond pass or fail

Traditional, rule-based vision systems perform well under tightly controlled conditions but tend to break down when lighting, materials, or product designs change. Modern vision approaches increasingly incorporate learning-based techniques that recognize patterns instead of relying on fixed thresholds.

Figure 2 Modern vision systems recognize patterns instead of relying on fixed thresholds. Source: Renesas

This evolution enables machines to distinguish acceptable variation from true defects, adapt to new product versions with minimal retraining, and provide richer information for downstream decision-making.

Broader roles on the factory floor

Machine vision now plays a central role in:

  • In-line quality assurance, detecting cosmetic, structural, and assembly issues
  • Robot guidance, enabling flexible pick-and-place and assembly operations
  • Traceability, supporting serialization and regulatory compliance
  • Safety monitoring, detecting unsafe conditions or human proximity

As processing moves closer to where images are captured, vision becomes more responsive and resilient, key traits for real-time factory environments.

Figure 3 Machine vision technology is quickly acquiring the key traits required in real-time factory environments. Source: Renesas

Intelligent sensing: Adding awareness to automation

While machine vision provides visual insight, intelligent sensing fills in the rest of the picture. Parameters such as vibration, temperature, current, torque, pressure, and acoustics reveal what is happening inside machines and processes.

From measurement to meaning

Intelligent sensors are no longer passive components. Increasingly, they embed local processing and diagnostics, enabling them to filter and contextualize raw signals, detect subtle behavioral changes, and reduce unnecessary data transmission.

Instead of reporting isolated values, sensors can now indicate conditions such as early wear, imbalance, or inefficiency.
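A minimal sketch of this idea, assuming a vibration sensor that computes a root-mean-square (RMS) level over each window of raw samples and reports a condition label only when that label changes. The function names, thresholds, and labels are hypothetical and are not taken from any particular sensor product.

```python
import math
from typing import Iterable, Iterator, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one window of raw samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def classify_vibration(samples: Sequence[float],
                       warn_level: float = 1.0,
                       alarm_level: float = 2.0) -> str:
    """Map a window of raw samples to a condition label.

    warn_level and alarm_level are illustrative thresholds in the same
    units as the samples; real values depend on the machine.
    """
    level = rms(samples)
    if level >= alarm_level:
        return "alarm"
    if level >= warn_level:
        return "early_wear"
    return "normal"


def report_changes(windows: Iterable[Sequence[float]]) -> Iterator[str]:
    """Yield a label only when the condition changes, reducing traffic."""
    last = None
    for window in windows:
        label = classify_vibration(window)
        if label != last:
            yield label
            last = label
```

Reporting only on a change of condition is one simple way to reduce unnecessary data transmission; a real device would likely add hysteresis so that a level hovering near a threshold does not toggle the label.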

The power of sensor fusion

True process understanding emerges when multiple sensor types are combined. By correlating visual data with physical and environmental measurements, factories gain a far more reliable and nuanced view of operations.

For example, a visual anomaly combined with abnormal vibration data may indicate tool degradation rather than a material flaw. This holistic view reduces false alarms and accelerates corrective action.
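The tool-degradation example can be sketched as a simple fusion rule. This is an illustrative rule table under assumed names and an assumed vibration limit, not a description of any deployed system; practical systems would more likely learn such correlations from data.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    visual_anomaly: bool   # vision system flagged the part
    vibration_rms: float   # machine vibration level, arbitrary units


def diagnose(obs: Observation, vibration_limit: float = 1.0) -> str:
    """Combine a visual flag with a vibration level into one diagnosis.

    vibration_limit is a hypothetical threshold chosen for illustration.
    """
    vibration_high = obs.vibration_rms > vibration_limit
    if obs.visual_anomaly and vibration_high:
        return "suspect tool degradation"
    if obs.visual_anomaly:
        return "suspect material flaw"
    if vibration_high:
        return "monitor machine"
    return "ok"
```

Neither signal alone separates the two causes; only the combination distinguishes a worn tool from a flawed workpiece, which is the point of fusing the sensors.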

Edge AI: Intelligence at the point of action

Edge AI ties machine vision and intelligent sensing together, enabling factories to interpret complex data locally, without relying on constant cloud connectivity.

Why the edge matters

Manufacturing environments demand capabilities that centralized systems struggle to provide:

  • Low-latency decision-making for time-critical control
  • Operational autonomy in environments with limited connectivity
  • Data sovereignty and IP protection
  • Scalable deployment across many machines and lines

Edge AI meets these needs by bringing inference and decision logic directly to machines.

Figure 4 Edge AI, the third key building block in smart factory designs, ties machine vision and intelligent sensing together. Source: Renesas

Practical impact on operations

With edge AI, factories become more intelligent and proactive in their operations. Instead of reacting to problems after they occur, systems can predict potential failures in advance and help avoid costly disruptions. Processes can also be adjusted in real time to account for changes in materials or environmental conditions, ensuring consistent quality and efficiency.

In addition, AI-driven systems can identify unusual patterns and anomalies that were not explicitly programmed, enabling earlier detection of issues. At the same time, more intuitive and responsive human–machine interactions improve safety and usability on the shop floor. Altogether, this represents a clear shift from reactive control toward adaptive, self-optimizing operations.

Convergence: Creating intelligence through integration

The greatest gains emerge when machine vision, intelligent sensing, and edge AI are designed as a unified system rather than isolated capabilities.

Consider a high-mix production line:

  • Machine vision identifies subtle quality deviations
  • Intelligent sensors monitor mechanical and electrical behavior
  • Edge AI correlates these inputs to identify emerging issues

Instead of scrapping products or stopping the line, the system can adjust in real time, maintaining quality while maximizing throughput. This distributed intelligence also simplifies factory architectures. Decisions are made close to the process, improving responsiveness and system robustness.

Designing for sustainable smart factories

Achieving this level of intelligence is not just a technical challenge; it is a system and ecosystem challenge. Manufacturers need platforms that simplify integration across sensing, processing, connectivity, and security, while supporting the long product lifecycles typical of industrial environments.

As adoption accelerates, successful smart factory strategies share several traits:

  • Scalability, allowing intelligence to be added incrementally
  • Interoperability, avoiding vendor lock-in
  • Lifecycle support, including long-term availability and maintenance
  • Energy-efficient design, balancing performance with sustainability

Smart factories built on these principles are better equipped to adapt, not just to current challenges, but to future uncertainty.

In the final analysis, the smart factory is not defined by a single technology, but by how technologies work together. Machine vision gives machines eyes. Intelligent sensing provides awareness. Edge AI delivers understanding.

With the right enablement and ecosystem support, manufacturers can move beyond reactive automation toward systems that continuously learn, adapt, and improve. In doing so, they transform data into decisions, and factories into resilient, future-ready operations.

Suad Jusuf is director of product marketing at Renesas Electronics. His work centers on defining distinctive value, empowering differentiation, and accelerating customer success through integrated MCU/MPU platforms, AI tools, and system‑level enablement and offerings.

Special Section: Smart Factory

The post How machine vision, intelligent sensing, and edge AI are powering smart factory appeared first on EDN.

Round table "Chornobyl Chronicles" at the State Polytechnic Museum

News - Wed, 05/20/2026 - 23:51
KP Information Wed, 05/20/2026 - 23:51

This round table was held at the Borys Paton State Polytechnic Museum at KPI on April 22. Its guests were people who took a direct part in dealing with the consequences of the largest man-made disaster in human history, and Ukrainian journalists who, despite obstruction by the authorities, were the first to tell and show the truth about the tragedy at the Chornobyl nuclear power plant. They were joined by future journalists who study at the Sail children's Media School in the town of Vasylkiv and are learning the basics of the profession at the "YUN-PRES" Information and Creative Agency of the Kyiv Palace of Children and Youth.

Conference "Use of AI in Public Administration: Challenges, Opportunities, Prospects"

News - Wed, 05/20/2026 - 23:14
kpi Wed, 05/20/2026 - 23:14

👥 More than 80 representatives of government bodies, science, education, and international organizations, plus more than 700 online listeners: that was the scale of the second scientific and practical conference with international participation, "Use of AI in Public Administration: Challenges, Opportunities, Prospects," held at Igor Sikorsky KPI.

DIY Raspberry Pi Oscilloscope

Reddit:Electronics - Wed, 05/20/2026 - 22:04

As a follow-up to the toy oscilloscope I designed here, I designed and built something that more closely resembles a real oscilloscope! I included some shots of the build process, all done at home by hand with a hot air station and a preheater.

It has 2 channels, each running an ADC3908 off of a shared clock at anywhere from 1MS/s to 62.5MS/s. I wanted to use the 125MS/s version of the part but since I'm still using the Pi for all of the data acquisition and processing, this is about as fast as you can possibly go.
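A rough back-of-the-envelope sketch of the raw data rate the Pi has to absorb. The post does not state the sample width, so one byte per sample per channel is an assumption here, and the function name is hypothetical.

```python
def raw_data_rate_bytes_per_s(sample_rate_hz: float,
                              channels: int = 2,
                              bytes_per_sample: int = 1) -> float:
    """Raw acquisition rate: samples/s x channels x bytes per sample.

    bytes_per_sample=1 is an assumption; the post does not give the
    ADC resolution or how samples are packed.
    """
    return sample_rate_hz * channels * bytes_per_sample


# Both channels at the top rate mentioned in the post
top_rate = raw_data_rate_bytes_per_s(62.5e6)
```

Under that assumption, both channels at 62.5MS/s produce 125 MB/s of raw samples, and the 125MS/s part would double that.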

The front-end was supposed to have ~30MHz of analog bandwidth but since I had to remove the filter caps after assembly, I think theoretically it has whatever the bandwidth is at the ADC inputs. All of the analog components before the ADC have higher bandwidth.

It supports input full-scale ranges from +/-33mV to +/-180V, though I'm hesitant to plug something I made into mains power. It should be isolated as all power comes from the Pi, either through a wall plug or USB powerbank, but I'm still wary. I'll probably try it one day though.

It wound up costing way more than I would have hoped, and I probably chose some components that were more expensive than necessary. For example: the two linear regulators I used for the analog supply rails are pricy because of their very low noise, but my actual noise levels aren't great in the end. I think the total BOM cost was ~$150 if you include the PCB and you can get a way faster real scope for that price. It was still a great learning project though.

submitted by /u/hapemask

Multiphase controllers optimize mobile Vcore power

EDN Network - Wed, 05/20/2026 - 20:06

Three digital multiphase controllers from AOS enable Intel IMVP9.3 Vcore power delivery in high-performance mobile systems. When paired with the company’s DrMOS and Smart Power Stage devices, the AOZ71049QI, AOZ71149QI, and AOZ71146QI form a complete power solution for Intel Panther Lake and Wildcat Lake mobile processor architectures.

The buck controllers use AOS’s advanced transient modulation (A2TM), a hybrid approach that combines digital tuning with analog efficiency. By integrating variable-frequency hysteretic peak current-mode control with advanced phase current sensing, they deliver fast transient response and balanced current sharing across both transient and DC loads. They also maintain low quiescent power across all Intel IMVP9.3 power states, helping maximize battery life in laptops and notebooks. Key features are summarized below:

  • Flexible configurations: Up to 4+2+1+2 phase outputs for Core (IA), Graphics (GT), Auxiliary (SA), and LPCORE domains
  • Low quiescent current: 5.9 mA at PS0 in 3+2+1+1 configurations
  • Power management: Autonomous phase shedding and auto-DCM to reduce power loss
  • Compatibility: Supports industry-standard DrMOS and driver + MOSFET power stages from multiple vendors
  • Acoustic noise suppression: Integrated features reduce audible noise under dynamic load conditions

The AOZ71049QI, AOZ71149QI, and AOZ71146QI are available in production volume, with a lead time of 12 to 16 weeks. Prices start at $2.66 in 1000-piece quantities.

Alpha & Omega Semiconductor 

The post Multiphase controllers optimize mobile Vcore power appeared first on EDN.

LED driver animates exterior vehicle lighting

EDN Network - Wed, 05/20/2026 - 20:05

Lumissil’s IS32FL3776 matrix LED driver brings expressive intelligent signal displays (ISDs) to software-defined exterior automotive lighting. With 36 constant-current channels providing 60 mA each, it drives dynamic LED light matrices up to 36×6 with as many as 216 individually addressable LEDs.
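The 36×6 arrangement can be illustrated with a small addressing sketch. The mapping below (LED index to a channel and a row) is a hypothetical layout chosen for illustration; the actual register map and scan order of the IS32FL3776 are not given in the text.

```python
CHANNELS = 36  # constant-current channels (from the text)
ROWS = 6       # second matrix dimension (from the text)


def led_position(index: int) -> tuple[int, int]:
    """Map a linear LED index to (channel, row).

    Assumes a row-major layout, which is an illustrative choice and
    not taken from the device documentation.
    """
    if not 0 <= index < CHANNELS * ROWS:
        raise ValueError("LED index out of range")
    row, channel = divmod(index, CHANNELS)
    return channel, row
```

With 36 channels and 6 rows, the index range covers exactly the 216 individually addressable LEDs quoted above.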

Automotive ISD systems use matrix LED patterns to communicate vehicle intent, safety status, driver-assistance cues, and brand identity. The IS32FL3776 enables compact LED designs used in RGB mini LED displays, full-width front light strips, grille lamps, automated driving system marker lamps, and other expressive vehicle lighting functions.

The driver features high-resolution, high-frequency dithered PWM for fine brightness adjustment and smooth animations without flicker or camera banding. For improved system efficiency and thermal performance, the IS32FL3776 uses DCFB adaptive control to optimize the LED supply rail while maintaining sufficient headroom for proper current regulation. A software-configurable architecture supports either internal operation or external PMOS drive for power sequencing in larger matrix configurations.

The IS32FL3776 is available for sampling and volume production, with evaluation hardware and reference designs provided to facilitate system development.

IS32FL3776 product page 

Lumissil Microsystems 

The post LED driver animates exterior vehicle lighting appeared first on EDN.

MCUs bridge I3C across voltage domains

EDN Network - Wed, 05/20/2026 - 20:04

Microchip’s PIC18-Q20 MCUs integrate up to two I3C peripherals and Multi-Voltage I/O (MVIO) in 14- and 20-pin packages as small as 3×3 mm. Well suited for sensor interfacing, real-time control, and connectivity applications, they simplify communication across multiple voltage domains with minimal external circuitry.

Compared to I2C, I3C provides higher data rates and lower power consumption while remaining backward compatible with legacy systems. The MCUs operate across three independent voltage domains, with MVIO-enabled pins supporting I3C communication down to 1.0 V. Additional integration includes a 10-bit ADC with computation, capacitive touch sensing, and an 8-bit signal routing port for flexible peripheral interconnect.

The PIC18-Q20 series can process sensor data, manage low-latency interrupts, and perform system status reporting, reducing the workload on a host MCU in larger systems. These devices are supported by Microchip’s hardware and software development ecosystem, including the PIC18F16Q20 Curiosity Nano Evaluation Kit for rapid prototyping.

Now in production, the PIC18-Q20 MCUs are available from Microchip and its authorized distributors.

PIC18-Q20 product page 

Microchip Technology 

The post MCUs bridge I3C across voltage domains appeared first on EDN.

Motor MCU integrates driver and control functions

EDN Network - Wed, 05/20/2026 - 20:03

Toshiba is sampling the TB9M040FTG motor control device, which integrates an MCU and motor driver for controlling small automotive motors. Part of the SmartMCD series, it supports single-channel motor drive currents up to 2 A, enabling direct drive of three-phase brushless DC motors used in electric valves, HVAC dampers, flaps, and grille shutters.

In addition to an Arm Cortex-M23 processor core running at up to 40 MHz and the motor driver, the TB9M040FTG incorporates flash memory, a 5-V high-side driver for power-supply functions, and a power supply that operates at automotive battery voltage levels. It also integrates a LIN transceiver for ECU communication.

The device features a hardware vector engine that offloads field-oriented control (FOC) processing, helping reduce CPU load and software size. Back-EMF detection enables sensorless square-wave control.
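For readers unfamiliar with FOC, much of the per-cycle arithmetic consists of coordinate transforms on the measured phase currents. The sketch below shows the textbook amplitude-invariant Clarke and Park transforms; it illustrates the kind of math a vector engine can offload and is not a description of Toshiba's hardware or its API.

```python
import math


def clarke(i_a: float, i_b: float, i_c: float) -> tuple[float, float]:
    """Amplitude-invariant Clarke transform: (a, b, c) -> (alpha, beta)."""
    i_alpha = (2.0 / 3.0) * (i_a - 0.5 * i_b - 0.5 * i_c)
    i_beta = (2.0 / 3.0) * (math.sqrt(3.0) / 2.0) * (i_b - i_c)
    return i_alpha, i_beta


def park(i_alpha: float, i_beta: float, theta: float) -> tuple[float, float]:
    """Park transform: rotate (alpha, beta) by rotor angle theta into (d, q)."""
    i_d = i_alpha * math.cos(theta) + i_beta * math.sin(theta)
    i_q = -i_alpha * math.sin(theta) + i_beta * math.cos(theta)
    return i_d, i_q
```

For balanced sinusoidal currents aligned with the rotor angle, the result is a constant d component and zero q component, which is what lets a controller regulate DC-like quantities instead of sinusoids.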

All functions are integrated into a compact VQFN36 package, reducing component count in automotive equipment. The TB9M040FTG is compliant with AEC-Q100 Grade 0 and ASIL-B requirements.

TB9M040FTG product page

Toshiba Electronic Devices & Storage 

The post Motor MCU integrates driver and control functions appeared first on EDN.

CPU IP processes mixed scalar and vector workloads

EDN Network - Wed, 05/20/2026 - 20:02

The SiFive Performance P570 Gen 3 is a RISC-V out-of-order superscalar vector processor IP designed for scalable performance. SiFive says it delivers a substantial performance improvement over the P550 Gen 1, along with a comprehensive set of mandatory and optional RVA23 profiles.

The IP can serve as the control processor in embedded IoT devices with full networking stacks or as the main applications processor in consumer devices running operating systems such as Android and enterprise-grade Linux. Its vector unit also supports AI model execution and inference on edge devices.

Multicore configurations scale to 16 cores across four clusters with shared L3 and optional L2 cache, a RISC-V-compliant interrupt architecture, and fine-grain power-management control. The P570 supports mandatory RVA23 requirements, including Hypervisor and Vector extensions, along with optional security and management extensions, RISC-V Vector Crypto, and FP16/BF16 capabilities for AI acceleration.

The Performance P570 Gen 3 IP is available now. Visit the product page for configuration and customization details.

P570 Gen 3 product page 

SiFive

The post CPU IP processes mixed scalar and vector workloads appeared first on EDN.

Designers guide: Sensors for medical devices

EDN Network - Wed, 05/20/2026 - 20:00

The healthcare industry is progressively moving from a centralized, clinical model to a more patient-centric approach, requiring monitoring solutions that are portable, wearable, and patient-focused. This process involves significant technical and hardware challenges. Designers must find a way to maximize diagnostic accuracy and reliability in a clinical setting while also keeping power consumption, size, and long-term reliability in mind.

The design of a medical device usually follows a modular approach. This means that each part, from the first signal capture to the last communication protocol, must be optimized for speed and efficiency. This article will provide insights into some of the most relevant sensors employed in medical devices as well as the associated technologies, including analog front ends (AFEs), power management devices, and wireless system-on-chips (SoCs) for connectivity.

Pressure sensors

With the wide range of medical devices, from wearable glucose monitors to computerized tomography (CT) scan equipment, there is also a variety of sensors incorporated into these devices. These include pressure and temperature sensors as well as biosensors and accelerometers.

Pressure sensors are used in a wide range of medical equipment, from non-invasive blood pressure monitors to specialized airflow sensors in ventilators. Besides common medical requirements, such as reliability and high sensitivity, these sensors must provide robustness and endurance. Medical-grade pressure sensors must exhibit high linearity and long-term stability, maintaining calibration over weeks or months of continuous operation.

For example, TDK Corporation offers a wide portfolio of piezoresistive pressure sensor dies well-suited for high-precision measurements in the medical sector. Based on advanced silicon MEMS technology, these sensors are grouped into three main categories: absolute, gauge, and differential pressure.

As for the piezoresistive pressure measurement methods, sensor dies are available with frontside and backside absolute measurement and gauge-differential measurement. The frontside configuration, where the electronics are directly exposed, is preferred for dry, non-aggressive gases. The backside design allows the sensor to handle wet media or non-aggressive fluids because the sensitive electronic components are shielded on the opposite side of the pressure-sensing diaphragm. Finally, the gauge configuration is well-suited for physiological measurements relative to ambient.

The C39 series are highly miniaturized dies (with an area of 0.65 × 0.65 mm) with frontside absolute pressure measurement up to 1.2 bar. Able to operate over a temperature range from −40°C to 150°C, these sensors are optimized for high burst pressure and feature narrow sensitivity tolerances and high signal stability. As such, they are suited for integration into high-density wearable medical devices.

Sensors for medical imaging

Imaging technology has made significant advances in the last few years. CT scans used for the diagnosis and monitoring of various conditions, including cancer and cardiovascular diseases, have evolved with the introduction of the photon-counting CT (PCCT).

The main difference between these two techniques lies in how sensors (“detectors”) process X-rays. Conventional CT uses indirect energy-integrating detectors: X-rays hit a scintillator, which converts them to light, and the light is then converted to electricity. In practice, these detectors measure the total energy accumulated, losing individual photon data.

PCCT instead uses direct-conversion sensors that convert X-rays directly into electrical pulses, counting every single photon and measuring its specific energy. This eliminates electronic noise, improves spatial resolution, and allows for precise tissue differentiation at a lower radiation dose.
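The difference between the two detector types can be sketched with a toy model. This is a conceptual illustration only: the photon energies and threshold values are hypothetical, and real detectors involve effects (pile-up, charge sharing) that are ignored here.

```python
from bisect import bisect_right
from typing import Sequence


def energy_integrating(photon_energies_kev: Sequence[float]) -> float:
    """Energy-integrating readout: one number, the total deposited energy."""
    return sum(photon_energies_kev)


def photon_counting(photon_energies_kev: Sequence[float],
                    thresholds_kev: Sequence[float] = (25.0, 50.0, 75.0)
                    ) -> list[int]:
    """Photon-counting readout: a count per energy bin.

    thresholds_kev are hypothetical, sorted lower bin edges. Pulses below
    the lowest threshold are not counted, which is how low-amplitude
    electronic noise is kept out of the result in this toy model.
    """
    counts = [0] * len(thresholds_kev)
    for energy in photon_energies_kev:
        bin_index = bisect_right(thresholds_kev, energy) - 1
        if bin_index >= 0:
            counts[bin_index] += 1
    return counts
```

The integrating readout collapses every event into a single sum, while the counting readout keeps both the number of photons and a coarse energy spectrum, which is the information used for tissue differentiation.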

Ams Osram, now part of Infineon Technologies AG, introduced a system-in-package (SiP) sensor module specifically designed for photon-counting detectors. This sensor, shown in Figure 1, enables a significant reduction in the radiation dose and diagnostic images with higher resolution.

As the company states, the AS5920M module features a 9× reduction of the module’s detector pixel size compared with traditional CT systems. Moreover, more modules can be combined in an array arrangement, increasing the detection area according to the desired CT application.

Figure 1: The AS5920M is a four-sided buttable SiP sensor module engineered for photon-counting detectors (Source: ams Osram)

At the 2025 annual meeting of the American Society for Radiation Oncology, Siemens Healthineers presented the Naeotom Alpha.Prime PCCT scanner (Figure 2) based on cadmium telluride crystal detectors that significantly improve image resolution and contrast. The company introduced the world’s first PCCT scanner in 2021.

Figure 2: Siemens Healthineers’ Naeotom Alpha.Prime PCCT scanner (Source: Siemens Healthineers AG)

Embedding AI in sensors

The integration of embedded AI cores directly into biosensors is changing the architecture of medical diagnostics. Previously, devices were limited to a traditional sensing process, wherein all raw data was transmitted to a central processor for analysis. With the direct integration of edge intelligence, sensors can now process data locally, exactly where it is sourced.

The main benefit of this architecture is efficiency, as the device transmits only processed results or alerts. This approach significantly reduces the system’s power consumption, latency, and required bandwidth.

STMicroelectronics has introduced a high-accuracy biosensor that integrates a vertical AFE (vAFE) for biopotential signals (typically cardio and neurological parameters) with a low-power, three-axis accelerometer with AI and anti-aliasing. The ST1VAFE3BX’s vAFE features programmable gain and input impedance and includes a 12-bit ADC.

Providing output data at a rate up to 3,200 Hz, the biosensor is well-suited for biopotential measurement of heart, brain, and muscular activities. The compact size (2 × 2 mm) and reduced power consumption (48.1 µA during normal operation, which can be cut to just 2.6 µA in power-saving mode) suit it for wearables designed for predictive healthcare.
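To see what the two current figures mean for a wearable, here is a rough runtime estimate for the sensor alone. The 48.1 µA and 2.6 µA values come from the text; the duty cycle and battery capacity are hypothetical inputs, and the rest of the system (MCU, radio, regulator losses) is deliberately ignored.

```python
def average_current_ua(active_fraction: float,
                       active_ua: float = 48.1,
                       sleep_ua: float = 2.6) -> float:
    """Time-weighted average current for a duty-cycled sensor, in µA."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_fraction * active_ua + (1.0 - active_fraction) * sleep_ua


def runtime_hours(capacity_mah: float, current_ua: float) -> float:
    """Ideal runtime in hours: capacity (mAh) divided by current (mA)."""
    return capacity_mah * 1000.0 / current_ua
```

For example, `runtime_hours(capacity_mah, average_current_ua(0.25))` gives the idealized runtime for a sensor that is active a quarter of the time; in a real design the host and radio usually dominate the budget, so this is an upper bound on what the sensor permits.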

The biosensor features ST’s proprietary machine-learning core (MLC) and finite-state machine (FSM), which allow designers to develop decision-making rules and algorithms to be deployed directly on the chip. The AI-assisted capabilities enable the sensor to autonomously manage motion and activity detection.

This AI feature decreases the interactions with the host controller, reducing the overall power consumption and latency while extending battery life. MLC and FSM can be implemented using ST’s software development tools such as MEMS Studio, which is part of the ST Edge AI Suite.

High-precision AFE

The integrity of a medical device is defined by the quality of its input data. AFEs are components required for interfacing with the human body. They are essential for all types of medical sensors that produce analog signals and therefore require further processing, such as conditioning, amplification, filtering, and digital conversion.

AFEs bridge the gap between physical measurements, typically available in analog form, and the compute device that processes them in digital form. In medical devices, AFEs are required for any sensor that measures physical parameters.

AFEs operate by extracting small-amplitude physiological signals from the environment, which are often noisy or subject to electromagnetic interference. As a result, to achieve medical-grade results, the AFE must provide a high signal-to-noise ratio and low leakage currents.

Among the sensors that require an AFE are biosensors, such as those used in continuous glucose monitoring (CGM) and electrocardiogram patches. Onsemi’s CEM102 is an AFE specifically designed for CGM and similar applications. Based on an amperometric measurement that senses very low currents, the device features a small form factor and low power consumption. These features suit the CEM102 for miniaturized and battery-operated medical devices.

The CEM102 can be operated with a supply voltage ranging from 1.3 to 3.6 V, typically a single 1.5-V silver oxide battery or a standard 3-V coin cell. It supports up to four electrodes, integrates a high-resolution ADC and several DACs for bias setting in a factory-trimmed system, and can be interfaced with a host controller, such as the onsemi RSL15, a secure Bluetooth 5.2 wireless microcontroller (MCU) for connecting to an external device or terminal.

Power management

In the design of compact wearables, such as hearing aids, power management represents one of the most challenging constraints. Designers must select power management integrated circuits (PMICs) with high efficiency, thus preserving the energy provided by small battery cells.

Onsemi’s HPM10 battery-charge controller is a high-performance PMIC engineered to recharge batteries in miniaturized medical devices, typically hearing aids and cochlear implant devices. The device supports different rechargeable battery technologies, including lithium-ion and silver-zinc, and can detect zinc-air and nickel-metal hydride disposable batteries.

The HPM10 also provides a charger communication interface to communicate the state of the charging process to the hearing-aid charger. Other information available on this interface includes the battery voltage levels, current levels, temperature, and battery failures.

Connectivity

A medical device is more effective if it can communicate data to clinicians or electronic health-record systems. Several connectivity protocols are available, and their selection is based on the application’s range and data throughput requirements.

Low-power Bluetooth SoCs are the industry standard for wearables, providing a reliable and efficient link to smartphones or home gateways. For high-bandwidth clinical environments, such as hospitals or clinics, integrating Wi-Fi 6 with Bluetooth Low Energy (LE) represents a suitable connectivity solution.

For example, Silicon Labs’ Series 2 BG29 family of wireless SoCs is designed to provide Bluetooth LE connectivity in an extremely small form factor. The BG29 device’s small size (2.6 × 2.8 mm) suits it for applications such as wearable health and medical devices and battery-operated sensors. The device integrates a DC/DC boost converter supporting a wide voltage range, a Coulomb counter for accurate battery monitoring, 1 MB of flash, 256 kB of RAM, and security features.

Figure 3: Silicon Labs’ BG29 is available in compact QFN and WLCSP packages. (Source: Silicon Laboratories)

NXP Semiconductors is collaborating with Silex Technology, a provider of wireless connectivity and smart edge solutions for the medical and industrial sectors. Silex focuses on wireless solutions for medical applications requiring high longevity, cybersecurity features, and high reliability. Patient monitors, medical wearables, and other connected devices often operate in hospitals where several Wi-Fi access points are available.

Silex integrates NXP’s Wi-Fi SoCs in its Wi-Fi 6 + Bluetooth 5.3 and 5.4 module solutions, including NXP’s IW611 Wi-Fi 6 SoC and RW610 Wi-Fi 6 wireless MCU.

The post Designers guide: Sensors for medical devices appeared first on EDN.

Triple-duty current loop calibrator

EDN Network - Wed, 05/20/2026 - 15:00

It’s always gratifying when a simple and successful design idea luckily turns out to have additional applications that you didn’t originally envision.  Here’s an example.


A while back, the design shown in Figure 1 was accepted for publication:


Figure 1 U1 plus R1 through R5 current steering networks convert a 0/20mA input into a 4/20mA output.

Later, the same circuit, when wired up differently as shown in Figure 2, turned out to be an equally good fit in a different job:


Figure 2 This 4/20mA current loop converter integrates an OFF/ON field contact.

A recent Design Idea by another frequent contributor, Jayapal Ramalingam, addressed the problem of convenient calibration of precision current loop receivers in industrial applications. His design comprises a linear control input that expedites calibration and testing.   He explains that it helps to:

…”calibrate the analog input modules of distributed control systems (DCSs) and programmable logic controllers (PLCs) by simulating process signals.”…

This inspired me to wonder if a different approach to the same calibration problem might also be useful.  I imagined a design in which the three standard analog test current loop levels: 0, 4mA, and 20mA, were accurately preset and quickly accessed by flipping a switch. I then proceeded to ponder whether that same friendly little converter circuit might work in such an application.

Figure 3 shows the result:


Figure 3 The three-position, center-off, DPDT switch S1 converts this current converter (verbiage redundancy pun-intentional) into a convenient current calibrator.

Not only did it fit, but the calibration procedure for the new role is just as quick, simple, and easy to accomplish in a single pass as it was before.

  1. Set S1 to the 4mA position.
  2. Tweak 4mA adj for 4mA output (as measured, for example, with a precision DMM).
  3. Set S1 to the 20mA position.
  4. Adjust 20mA adj for 20mA output (ditto).
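As a cross-check on the receiver being calibrated, the three preset currents can be mapped onto the conventional 4-20 mA span. The sketch below assumes the receiver scales 4 mA to 0% and 20 mA to 100%; the function name is hypothetical and not part of the design.

```python
def percent_of_span(current_ma, low_ma=4.0, high_ma=20.0):
    """Convert a loop current in mA to percent of the 4-20 mA span."""
    return (current_ma - low_ma) / (high_ma - low_ma) * 100.0


# The three switch positions of the calibrator: 0, 4 mA, and 20 mA.
for test_point_ma in (0.0, 4.0, 20.0):
    reading = percent_of_span(test_point_ma)
    print(f"{test_point_ma:4.1f} mA -> {reading:6.1f} % of span")
```

Under that scaling, the 0 position reads below the live zero, which is how a receiver can tell an open loop from a valid minimum signal.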

So, it turns out that the same circuit thriftily fits three related, yet different, applications – a triple-duty design trifecta.

Stephen Woodward’s relationship with EDN’s DI column goes back quite a long way. Over 200 submissions have been accepted since his first contribution back in 1974. They have included best Design Idea of the year in 1974 and 2001.


The post Triple-duty current loop calibrator appeared first on EDN.

How data movement defines performance for AI silicon

EDN Network - Wed, 05/20/2026 - 10:47

Regardless of the applications, most artificial intelligence (AI) chip designers face the same challenges. Whether it’s cloud data centers, edge devices, automotive platforms, or industrial robotics, optimal performance now depends on how efficiently data is moved.

When data movement is delayed, even the fastest compute engines are left waiting, reducing throughput, increasing latency, and wasting power.

As AI designs continue to grow in complexity, managing massive data flows through fixed, point-to-point connections no longer scales efficiently. Designers are now dealing with hundreds of compute engines and memory instances, each with different performance requirements, all of which must move data simultaneously.

A network-on-chip (NoC) brings order to chaos by providing a scalable, shared communication infrastructure that moves data where it needs to go with controlled latency and bandwidth. With built-in mechanisms for congestion management, traffic prioritization, and workload isolation, NoCs help teams deliver consistent, predictable performance while staying within tight power, area, and timing budgets.

Different markets, same bottleneck

Whether in hyperscale cloud infrastructure or inside an embedded vision processor, the core problem is data bottlenecks. The end markets differ, but the underlying architectural constraint remains the same. In the cloud, the goal is maximum throughput. Training clusters push bandwidth into the terabytes-per-second range. Massive GPUs and AI accelerators continuously ingest and process vast datasets. In large data center GPUs, more than 80% of dynamic energy is consumed by data transfers to and from DRAM. That energy is not spent on computing. It is spent moving bits.

At the edge, priorities flip. Systems such as autonomous vehicles, robotics, and smart cameras demand microsecond-level latency, strict determinism, and ultra-low power consumption. Edge AI devices may spend up to 90% of inference time waiting on memory I/O.

This is the invisible drain on AI performance.

Why NoC architecture matters

The NoC is the backbone that determines how efficiently data flows within a system-on-chip (SoC) or across multiple dies. However, the NoC must be optimized correctly. If not, the entire system slows down, regardless of how powerful the compute cores may be.

AI designs often rely on wide parallel interfaces between IP blocks. As system innovation increases, routing congestion, timing closure issues, and power overhead become more difficult to manage. An NoC addresses these challenges by packetizing traffic. Transactions are broken into packets and routed across a structured fabric, much like off-chip networking. This approach significantly reduces wiring complexity.

A wide AXI interface can require hundreds of signals; for example, a given AXI bus interface that requires 280 signals can be reduced to 150 by packetizing transactions. Fewer wires mean less congestion, simpler routing, easier timing closure, reduced silicon area, and lower dynamic power, as shown in the figure below.

Here is an outline of the advantages of packetized data with NoC IP. Source: Arteris
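The arithmetic behind the AXI example is simple enough to check directly. This small sketch uses only the 280 and 150 signal counts quoted in the text; the function name is illustrative.

```python
def wire_reduction(signals_before, signals_after):
    """Fraction of signals removed by packetizing an interface."""
    return (signals_before - signals_after) / signals_before


signals_before = 280  # wide AXI interface, from the example in the text
signals_after = 150   # same interface after packetization
saved = signals_before - signals_after
reduction = wire_reduction(signals_before, signals_after)
print(f"{saved} fewer signals, {reduction:.1%} reduction")
```

In this example, packetization removes a little under half of the wires on the interface.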

Equally important, an NoC decouples IP blocks from transport details. Designers integrate heterogeneous CPUs, GPUs, NPUs, memory controllers, and accelerators without manually wiring hundreds of signals between blocks. The network fabric handles transport abstraction. This level of decoupling does more than simplify integration within a single die. It also lays the groundwork for the next major shift in system design, where functionality is distributed across multiple dies and coordinated at the system level.

From monolithic dies to systems of systems

The separation of IP from transport becomes critical as designs transition to chiplet-based architectures. The shift enables teams to optimize each piece of silicon independently for its specific function and power trade-offs. It also improves yield, lowers costs, and makes it easier to increase compute capacity by adding or reusing chiplets as requirements change.

Within each die, a coherent NoC uses standard protocols such as AMBA CHI or ACE. Non-coherent fabrics connect peripherals and specialized engines into the broader system. Across dies, UCIe enables high-speed die-to-die communication. In advanced multi-package systems, coherent and non-coherent NoCs communicate seamlessly across chiplet boundaries.

The result is effectively a system of systems, with multiple specialized silicon components orchestrated into a unified compute engine. The NoC fabric spans the entire package, coordinating traffic between dies and subsystems.

In this environment, the interconnect is no longer just a supporting block. It shapes the entire system architecture. Every AI system, whether in the cloud or at the edge, has to strike the right balance among three things. Bandwidth must keep GPUs, XPUs, and AI engines fully utilized. Latency must remain low to support real-time inference and control. Efficiency must hold power and thermal budgets within limits as systems expand.

Designers also need a practical way to grow compute resources without redesigning the interconnect. Modular tiling approaches address that need. Each tile includes its own network interface unit and can be replicated across an NPU array. Need more compute? Add more tiles. The fabric scales without requiring a complete redesign.

Closing the architectural loop

In AI SoCs, designing the NoC requires more than defining the logical topology. Engineers should introduce physical awareness early in the design process. That means using floorplan information, estimated wire distances, and timing constraints. Physical awareness must be built directly into the design flow.

A modern NoC design flow includes:

  1. High-level architectural modeling and simulation
  2. Integration of physical constraints through virtual floor planning
  3. Automatic insertion of pipeline stages with built-in timing analysis
  4. Closed-loop export of constraints to physical synthesis tools

This approach bridges the gap between architectural intent and layout reality. In production designs, physically aware NoC automation has demonstrated the ability to reduce total wire length by roughly 26%, cut maximum latency by half, and improve overall productivity by an order of magnitude. Tasks that once required weeks of manual tuning can now be completed in less than a day.
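To make the pipeline-insertion step concrete, here is a deliberately simplified sketch. It assumes a link needs one register for every clock cycle's worth of wire it spans; real tools rely on full timing analysis rather than distance alone. The link names, lengths, and reach-per-cycle figure are all hypothetical.

```python
import math


def pipeline_stages(wire_length_mm, reach_per_cycle_mm):
    """Registers needed so that no wire segment exceeds one clock cycle of reach."""
    segments = math.ceil(wire_length_mm / reach_per_cycle_mm)
    return max(segments - 1, 0)


# Hypothetical link lengths taken from a virtual floorplan, in millimeters.
links_mm = {"cpu->mem_ctrl": 4.2, "npu->l3": 1.1, "gpu->d2d_phy": 7.5}
reach_mm = 1.5  # assumed distance a signal can travel in one clock cycle

for name, length in links_mm.items():
    print(f"{name}: {pipeline_stages(length, reach_mm)} pipeline stage(s)")
```

Longer links pick up more stages, which is why floorplan distances need to feed back into the latency budget early.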

Cache hierarchy and data locality

Interconnect optimization must be paired with effective cache architecture. Multi-level cache hierarchies, including L1, L2, and L3, store frequently used data close to the compute engines, reducing memory access latency. Without an effective cache hierarchy, CPU utilization can drop to single digits.
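The effect of a cache hierarchy on latency is commonly estimated with the average memory access time (AMAT) recurrence, in which each level contributes its hit latency plus its miss rate times the cost of the next level. The latencies and miss rates below are illustrative assumptions, not figures from any specific SoC.

```python
def amat(levels, memory_latency):
    """Average memory access time in cycles.

    levels: list of (hit_latency_cycles, miss_rate) ordered from L1 outward.
    memory_latency: cycles for an access that misses every cache level.
    """
    total = memory_latency
    for hit_latency, miss_rate in reversed(levels):
        total = hit_latency + miss_rate * total
    return total


# Hypothetical L1, L2, and L3 parameters and external DRAM latency.
hierarchy = [(4, 0.10), (14, 0.40), (40, 0.50)]
dram_cycles = 300

with_caches = amat(hierarchy, dram_cycles)
without_caches = amat([], dram_cycles)
print(f"with caches: {with_caches:.1f} cycles, without: {without_caches} cycles")
```

With these assumed numbers, the hierarchy cuts the average access from hundreds of cycles to roughly a dozen.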

In some AI SoC regions, last-level non-coherent caches improve data availability without participating in a full coherency protocol. Workloads that do not require tight synchronization, such as certain signal-processing or multimedia tasks, benefit from this approach, which simplifies the design while improving throughput. By increasing data locality, the cache structure reduces reliance on external memory and stabilizes interconnect traffic.

The reality of AI SoC design

The cost of developing leading-edge SoCs has risen from under $100 million a decade ago to more than $700 million today. So, each design iteration or silicon re-spin carries enormous financial risk.

Manual integration processes, fragile scripting, and misaligned hardware-software interfaces amplify that risk. Automated SoC integration flows that validate IP early, maintain consistent specifications across teams, and compile millions of registers in minutes can significantly reduce development time and errors.

Arteris addresses these architectural demands with interconnect IP purpose-built for complex AI platforms where efficient data transport determines overall system behavior. Its FlexNoC and Ncore solutions provide configurable non-coherent and coherent fabrics that support heterogeneous compute clusters and multi-die designs, reducing communication bottlenecks that limit utilization.

By aligning scalable interconnect architecture with disciplined implementation methodology, these interconnect solutions enable design teams to translate system intent into silicon more predictably in an era defined by rising complexity and cost sensitivity.

Automation and physically aware design are no longer optional optimizations. They are survival tools in the AI decade.

Andy Nightingale, VP of product management and marketing at Arteris, has over 39 years of experience in the high-tech industry, including 23 years in various engineering and product management roles at Arm.

 


The post How data movement defines performance for AI silicon appeared first on EDN.

Indian Navy awards ADITI 3.0 contract for High Power Microwave System to Tonbo Imaging

ELE Times - Wed, 05/20/2026 - 09:44

Defence technology company Tonbo Imaging receives an award and a contract from the Indian Navy under the ADITI 3.0 innovation framework to integrate and commission a High Power Microwave (HPM) system for naval platforms. The programme supports iDEX and the Defence Innovation Organisation (DIO) under the Ministry of Defence, Government of India. Within the scope of the engagement, Tonbo Imaging will undertake system integration and commissioning activities, followed by the supply of multiple production units upon successful development, validation, and acceptance.

High-power microwave systems represent a strategically significant directed-energy capability that only a limited number of countries possess today. Such systems provide a non-kinetic means of disabling or degrading adversary electronics, sensors, and unmanned systems, and are one of the few practical approaches to countering swarms of drones, making them increasingly relevant in modern maritime and asymmetric threat environments. The Indian Navy’s continued investment in this domain reflects a forward-looking approach to electromagnetic spectrum dominance and next-generation deterrence.

ADITI (Advanced Defence Technology Incubation) is a Government of India initiative to enable the maturation, integration, and validation of advanced defence technologies before induction. The selection of Tonbo Imaging under ADITI 3.0 reflects the emphasis on indigenously developing strategic capabilities aligned with the Navy’s evolving operational requirements and long-term force modernisation plans.

Commenting on the development, Arvind Lakshmikumar, Managing Director and Chief Executive Officer, Tonbo Imaging India Limited, said, “This programme represents a significant responsibility to execute complex capability integration with discipline, rigour, and clear alignment to end-user operational needs. Over the past several years, Tonbo Imaging has invested substantially in the indigenous development of core building blocks of High Power Microwave technology, including critical sub-systems and vacuum tube sources. We are among the very few private organisations to own core intellectual property in vacuum-tube technologies that are fundamental to HPM systems, and this deep technology foundation has been a key factor in our selection for this naval programme. For the class of effects required in High Power Microwave applications, vacuum tube–based sources remain the practical path forward, as they can generate the extremely high peak power and energy levels necessary for effective target coupling. Solid-state RF sources, while well suited for many RF applications, cannot today achieve the required peak power and pulse energy levels within feasible size, weight, and efficiency envelopes for operational HPM systems.”

With this engagement, Tonbo Imaging’s role extends well beyond that of an imaging and electro-optics company, reinforcing its position as a defence technology company focused on the development and integration of advanced defence systems. The programme underscores the company’s growing involvement in complex system-level integration, advanced electronics, embedded software, and emerging directed-energy and mission systems, in addition to its existing strengths in electro-optics. This evolution reflects Tonbo Imaging’s transition toward delivering integrated defence capabilities.

About Tonbo Imaging India Limited

Tonbo Imaging India Limited is a defence technology company that focuses on the design, development, and integration of advanced sensing, perception, and mission-critical systems for military and security applications. The company’s portfolio spans electro-optics, thermal imaging, situational awareness, advanced electronics, embedded software, and emerging directed-energy technologies, enabling the delivery of integrated defence solutions that support operational requirements across land, maritime, and air domains. Tonbo Imaging continues to evolve as a defence technology and systems company, investing in the development of next-generation directed-energy systems as well as advanced defence solutions such as loitering munitions and counter-unmanned aerial systems (C-UAS), reflecting its strategic focus on addressing emerging operational threats through indigenous capability development.

The post Indian Navy awards ADITI 3.0 contract for High Power Microwave System to Tonbo Imaging appeared first on ELE Times.

Sensors Converge 2026: Smarter and lower-power sensors

EDN Network - Tue, 05/19/2026 - 20:00

The Sensors Converge 2026 conference showcased some of the latest advances in sensor and sensing solutions for applications ranging from wearables and smartphones to industrial and automotive. The show, with over 160 exhibitors, also highlighted the industry’s shifting focus to edge AI and smart, connected systems with demos that showcased real-world applications in edge AI, robotics, and autonomous systems.

While sensor manufacturers continue to focus on shrinking solutions and package sizes, this year’s product introductions also indicate an increased need for lower power consumption. Here is a sampling of new sensors featured at this year’s show.

Vibration sensors across wearables and industrial

Upbeat Technology showcased its latest family of low-power MEMS vibration sensors and vibration processing units (VPUs), including the UPM01 and UPM02 series with a UP201/301 dual-core RISC-V AI microcontroller (MCU), aimed at high-quality voice clarity and predictive intelligence in a small footprint.

Suited for space-constrained wearables applications, the UPM01/UPM02 VPU, also called a bone-conduction microphone, measures 3.2 × 2.5 mm, and the UP201 dual-core RISC-V AI MCU measures 3.0 × 3.0 mm. Together, they create Upbeat’s Tiny AI Engine that provides on-device intelligence to wearables, industrial systems, drones, and consumer electronics. The solution enables “crystal-clear voice” in open wearable stereo (OWS) headsets, smart glasses, and intelligent voice recorders and delivers predictive maintenance for industrial automation.

The UPM01 series offers multiple interface variants: the UPM01A (analog), UPM01Ax (higher-sensitivity analog), UPM01D (digital), and UPM01Dx (higher-sensitivity digital). The UPM02 provides analog and digital options with a higher signal-to-noise ratio (SNR) for applications in which audio clarity is critical, the company said.

The UPM01 extends the frequency response beyond that of conventional MEMS vibration sensors, covering 5 Hz to 11.3 kHz, and delivers an SNR of 60 dB(A) for more accurate sound capture, while the UPM02 offers a frequency response range from 5 Hz to 5.4 kHz and an exceptionally high SNR of up to 68 dB(A).

Both series consume minimal power and can operate for extended periods on a single battery charge, making them suited for mobile devices, wearables, and other battery-powered applications.

The UP201/UP301 heterogeneous dual-core RISC-V edge AI platform targets energy-efficient deep-learning applications, enabling AI analysis closer to the data source for fast response and lower bandwidth usage. Delivering ultra-low-power, always-on intelligence, the platform enables continuous sensing with minimal power and instant wake-up for intensive AI tasks.

Mass-production shipments for the UPM01/UPM02 have started, with the UP201/UP301 scheduled to ship in October 2026.

Upbeat also unveiled its UP301 + UPM01 Falcon Demo Kit, described as a ready-to-run evaluation platform for machine-vibration analysis. Aimed at engineers who want to prototype and validate predictive maintenance solutions, the kit includes a UP201 dual-core RISC-V AI MCU EVB, variable-speed motor, two UPM01D FPCs, power adapter, and access to the Falcon graphical user interface (GUI), the Upbeat Vibration Analysis Suite GUI software. The demo kit is available for purchase at www.upbeattechtw.com/products/demo-kits.

Other demonstrations included OWS headsets, smart glasses with AI voice interaction, a smart AI voice recorder, a factory machine-vibration application, and smart AI toys with touch-gesture recognition.

Upbeat’s UPM01, UPM02, and UP201 devices create its Tiny AI Engine. (Source: Upbeat Technology)

Ahead of the show, STMicroelectronics announced its wide-bandwidth, three-axis vibration sensor, aimed at saving space and energy in industrial and automotive condition-monitoring applications. With an extended temperature range of −40°C to 125°C, the IIS3DWBG1 enables vibration monitoring in harsh environments.

The IIS3DWBG1 offers a selectable, full-scale acceleration range of ±2/±4/±8/±16 g and can measure accelerations with a bandwidth up to 6 kHz with an output data rate of 26.7 kHz. Housed in a 2.5 × 3-mm LGA-14L package, the MEMS sensor is suitable for industrial condition-monitoring systems, in which sensor placement and mounting are critical to measurement accuracy.

The small size and wide operating temperature range allow the flexibility to place small, externally attached sensors at optimal diagnostic locations while enabling integration inside smart motors and smart gearboxes, ST said.

In addition, the low power consumption delivers long-lasting operation in battery-powered applications. The sensor’s wide bandwidth and high resolution simplify capturing patterns associated with defects or wear, as well as equipment setup issues such as looseness and misalignment.

The IIS3DWBG1 can also detect electromechanical vibrations in coils, transformers, snubber capacitors, busbars, connectors, and general vibrations originating in the power electronics module, such as traction inverters. This enables automotive OEMs to extend remote diagnostics to cover power modules, as well as traction inverters in electric vehicles.

Thanks to a flat frequency response from DC to above 6 kHz (−3 dB point) and noise density of 75 µg/√Hz in three-axis mode, the sensor detects extremely small vibrations, providing enhanced early warning to prevent equipment failures. The sensor is highly resistant to mechanical shocks, according to ST, and integrates digital features including a configurable low-pass or high-pass filter with selectable cutoff frequency, an embedded FIFO, interrupts, a temperature sensor, and self-test capability.
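A rough feel for what the quoted noise density means in practice comes from integrating it over the measurement bandwidth. The sketch below assumes white noise and an ideal brick-wall 6-kHz bandwidth, which a real filter only approximates; it also checks that the 26.7-kHz output data rate leaves Nyquist headroom above that bandwidth.

```python
import math

noise_density_ug_per_rthz = 75.0  # from the text, three-axis mode
bandwidth_hz = 6000.0             # from the text, -3 dB point
output_data_rate_hz = 26700.0     # from the text

# RMS noise = density * sqrt(bandwidth), under the brick-wall assumption.
rms_noise_mg = noise_density_ug_per_rthz * math.sqrt(bandwidth_hz) / 1000.0

# The sampling theorem requires the Nyquist frequency to exceed the bandwidth.
nyquist_hz = output_data_rate_hz / 2.0

print(f"RMS noise over {bandwidth_hz:.0f} Hz: about {rms_noise_mg:.1f} mg")
print(f"Nyquist frequency: {nyquist_hz:.0f} Hz")
```

Under these assumptions the full-bandwidth noise floor is a few milli-g RMS; narrowing the band with the on-chip filter would lower it proportionally to the square root of the bandwidth.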

The IIS3DWBG1 is in production now. An evaluation kit is available.

The ST IIS3DWBG1 MEMS vibration sensor can operate in harsh automotive and industrial applications. (Source: STMicroelectronics)

AMR and TMR sensors

Murata Manufacturing Co. Ltd. introduced its ultra-low-power anisotropic magnetoresistance (AMR) sensors, the MRMS166R and MRMS168R. These sensors are designed to increase battery life in healthcare, wearable, and IoT devices. The MRMS166R is claimed to be the first AMR sensor to combine an average current consumption of 20 nA with operation from a 1.2-V supply, enabling extended battery life in coin-cell-powered systems.

These solid-state magnetic sensors detect the presence or absence of a magnetic field and generate an output signal that system logic uses to control functions such as transitions between active and sleep modes. This provides contactless switching without mechanical components, improved reliability, and support for sealed, miniaturized designs, Murata said.

This automatic switching between active and sleep modes is widely used in battery-powered devices to reduce standby power consumption and extend operating life, Murata said. Applications include healthcare, such as capsule endoscopes and medical patches; wearable devices, including AR glasses and wireless earbuds; and security-related IoT devices, such as door-open/close-detection systems and smart locks.

These devices commonly use silver oxide coin batteries (typically 1.55 V) that place constraints on available capacity and operating voltage. This means AMR sensors used as magnetic switches must minimize current consumption while maintaining stable operation at a low voltage, Murata said.

To address these challenges, Murata redesigned the AMR sensor’s internal circuitry, enabling ultra-low current consumption and operation down to 1.2 V. This significantly reduces battery consumption during standby operation, supporting device operation for more than two years in typical use.

The MRMS166R operates over a 1.2-V to 3.6-V supply range (1.5 V typ.) with an average current consumption of 20 nA and a maximum output current of 1 mA. The MRMS168R operates over a 2.0-V to 3.6-V supply range (3.0 V typ.), with an average current consumption of 80 nA and a maximum output current of 12 mA, providing higher output drive capability for devices requiring increased load current. Both devices are housed in a compact package measuring 1.0 × 1.0 × 0.4 mm (0.04 × 0.04 × 0.02 inches). The MRMS166R and MRMS168R sensors are now in mass production.
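To put the nanoamp figures in perspective, the charge each sensor draws over a year can be computed from its average current alone. This sketch uses only the 20-nA and 80-nA values quoted above and makes no assumption about battery capacity or the rest of the system's load.

```python
HOURS_PER_YEAR = 24 * 365


def annual_charge_uah(avg_current_na):
    """Charge drawn in one year, in microamp-hours, for an average current in nA."""
    return avg_current_na * HOURS_PER_YEAR / 1000.0  # nA*h -> uAh


mrms166r_uah = annual_charge_uah(20.0)
mrms168r_uah = annual_charge_uah(80.0)
print(f"MRMS166R: about {mrms166r_uah:.1f} uAh per year")
print(f"MRMS168R: about {mrms168r_uah:.1f} uAh per year")
```

Both results are well under 1 mAh per year, so in a coin-cell design the magnetic switch itself is unlikely to be the dominant drain.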

Murata’s MRMS166R/MRMS168R AMR sensor. (Source: Murata Manufacturing Co. Ltd.)

MultiDimension Technology Co. Ltd. debuted its tunneling magnetoresistance (TMR) TMR2531 (±1,000-Gauss linear range) and TMR2539 linear sensors (extended ±1,500-Gauss linear range) for smartphone cameras at Sensors Converge. Available in production quantities, these ultra-compact TMR linear sensors are designed for high-precision smartphone optical image stabilization (OIS) applications.

These sensors enable micron-level displacement measurement in voice coil motor (VCM) modules, allowing VCM driver ICs to precisely correct camera shake in real time during photo and video capture, MDT said. They measure the z-axis perpendicular magnetic field amplitude via a Wheatstone full-bridge configuration with four high-SNR TMR elements.
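The Wheatstone full-bridge readout can be illustrated with an idealized model. The sketch assumes four matched elements whose resistance shifts by the same amount in opposite directions on adjacent arms; the supply voltage and resistance values are hypothetical and are not TMR2531 or TMR2539 specifications.

```python
def full_bridge_output(v_supply, r_nominal, delta_r):
    """Differential output of an ideal four-element Wheatstone bridge.

    Each half-bridge is a divider whose two elements change by +delta_r and
    -delta_r, with the order swapped between the left and right halves.
    """
    r_up = r_nominal + delta_r
    r_down = r_nominal - delta_r
    v_left = v_supply * r_up / (r_up + r_down)
    v_right = v_supply * r_down / (r_up + r_down)
    return v_left - v_right


# Hypothetical operating point: 3.0-V supply, 10-kilohm elements, 1% change.
v_out = full_bridge_output(3.0, 10_000.0, 100.0)
print(f"bridge output: {v_out * 1000:.1f} mV")
```

In this ideal case the output reduces to the supply voltage times the fractional resistance change, so it is zero at zero field and changes sign with field direction.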

Periscope-style telephoto lenses have pushed OIS precision requirements into the micron scale to control prism positioning over extended motion ranges, MDT said. The new TMR sensor technology addresses these challenges with a high SNR, broad linear measurement ranges, and high immunity to magnetic interference, making it suited for advanced camera autofocus and OIS solutions in flagship smartphones.

Both series offer a 1.0-V to 5.5-V supply voltage and a shielding capability of ±3,000 Gauss for stable operation in interference-prone VCM environments. They are housed in a small DFN4L package (0.8 × 0.5 × 0.25 mm) for constrained VCM designs.

Faster sensor development

TDK Corp. introduced two development tools at Sensors Converge to simplify evaluation of TDK sensors. The InvenSense SensorStage software is an evaluation platform to simplify development and accelerate data analytics for TDK’s SmartMotion inertial measurement units (IMUs) and TMR magnetometers, while SensorGPT uses AI to generate simulated datasets to improve and accelerate development of edge AI IoT devices.

The all-in-one platform SensorStage bridges the gap between simple GUIs and custom test benches, offering advanced visual analytics and automated scripting to help engineers move from setup to insight without manual configuration, TDK said. SensorStage enables evaluation of complex, on-chip algorithms for applications in OIS, wearables, AR/smart glasses, and IoT with a future-proof architecture that supports existing and upcoming high-performance sensors.

The SensorStage platform is paired with the SmartMotion development board. Together, they visualize sophisticated on-chip features, including machine-learning algorithms, the APEX engine for Gyro Assisted Fusion, motion and event detection, and chip-level power consumption. This delivers precise calibration and faster time to market for complex designs.

SensorStage is currently available for InvenSense ICM-456xx and ICM-426xx SmartMotion IMUs and will soon be available for additional InvenSense MEMS sensor solutions.

TDK’s SensorStage platform pairs with the SmartMotion development board. (Source: TDK Corp.)

SensorGPT uses generative AI, signal processing, statistical methods, and simulations to create and manage sensor data at scale. Particularly aimed at smart IoT and ambient IoT applications, the AI tool streamlines model development and deployment, reducing time and cost, while enhancing the performance and efficiency of edge AI models and applications, TDK said.

SensorGPT sensor data synthesis trains generative models with limited real-world data to learn underlying patterns and generates synthetic data that mimics real-world data. It reduces the reliance on real-world data through intelligent sensor data synthesis, cutting data-collection efforts from 80% to nearly 10%, according to TDK, which enables faster, more scalable edge AI development.

The AI tool leverages physics-based and mathematical models to simulate and generate synthetic sensor data and uses mathematical and computational techniques to simulate data reflecting the dynamics and characteristics of real sensor outputs, TDK explained.

Other features include data-augmentation techniques that automatically transform existing sensor data into diverse datasets spanning a range of conditions and scenarios, while the assisted annotation streamlines the labeling of training data, which improves the quality for model training.

SensorGPT achieves 90% similarity between synthetic and real-world sensor data. This enables the use of the synthetically generated data for faster edge AI solution prototyping, testing, and deployment. It reduces edge AI model-building time from five-plus months down to a few weeks, according to TDK.

Generated dataset in a vibration sensor demo using TDK’s SensorGPT. (Source: TDK Corp.)

The post Sensors Converge 2026: Smarter and lower-power sensors appeared first on EDN.

🧐 Invitation to the methodological seminar “Checking Papers in the Age of AI”

News - Tue, 05/19/2026 - 16:33
kpi Tue, 05/19/2026 - 16:33

The KPI Library invites you to join the methodological seminar “Checking Papers in the Age of AI,” which the StrikePlagiarism.com team will hold specially for Igor Sikorsky Kyiv Polytechnic Institute.

KPI community members at the MHP Run4Victory charity marathon

News - Tue, 05/19/2026 - 16:24
kpi Tue, 05/19/2026 - 16:24

🏃‍➡️ The marathon from NewRun is a large-scale charity sporting event that brings together runners, amateur athletes, and everyone who favors an active lifestyle. This year, participants supported an important goal: raising 5 million hryvnias for prosthetics for veterans.

GPUs: A high-throughput architecture confronting a workload shift

EDN Network - Tue, 05/19/2026 - 16:05

There is a growing architectural tension at the heart of modern AI infrastructure. The processors that enabled the deep learning revolution—graphics processing units (GPUs)—remain the dominant engines of large-scale training and inference. Yet the computational profile of frontier language models is evolving in ways that increasingly expose the structural assumptions embedded in GPU design.

Memory wall undermining GPU efficiency in large language models

A profound bottleneck is the memory wall: the growing performance gap in which processors execute arithmetic operations far faster than memory systems can supply data, leaving increasingly powerful compute units idle while they wait on bandwidth- and latency-limited data movement.

Using the Nvidia H100 as a reference point, modern GPUs deliver multiple petaflops of FP8 tensor throughput and several terabytes per second of high-bandwidth memory access. On paper, arithmetic capacity is immense. In practice, trillion-parameter-class large language models (LLMs) are frequently memory-bound. Arithmetic intensity during inference can fall below 10 FLOPs per byte, which means that performance is limited less by compute units and more by how quickly parameters can be fetched and activations moved.
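The roofline model is one common way to see why low arithmetic intensity leaves compute idle: attainable throughput is the lesser of peak compute and intensity times memory bandwidth. The peak and bandwidth values below are round illustrative numbers in the "petaflops" and "terabytes per second" range, not official H100 specifications.

```python
def attainable_flops(intensity_flops_per_byte, peak_flops, mem_bw_bytes_per_s):
    """Roofline bound: limited by compute or by memory traffic, whichever is lower."""
    return min(peak_flops, intensity_flops_per_byte * mem_bw_bytes_per_s)


PEAK_FLOPS = 2.0e15  # assumed peak tensor throughput, FLOP/s
MEM_BW = 3.0e12      # assumed memory bandwidth, bytes/s

ridge_point = PEAK_FLOPS / MEM_BW  # intensity needed to become compute-bound
at_10 = attainable_flops(10, PEAK_FLOPS, MEM_BW)
utilization = at_10 / PEAK_FLOPS

print(f"ridge point: about {ridge_point:.0f} FLOPs per byte")
print(f"at 10 FLOPs per byte: {at_10:.1e} FLOP/s ({utilization:.1%} of peak)")
```

With these assumed figures, a workload at 10 FLOPs per byte sits far below the ridge point and can use only a small percentage of the arithmetic capacity.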

Energy considerations reinforce this imbalance. A floating-point multiply-accumulate is inexpensive relative to a high bandwidth memory (HBM) access, and cross-chip communication can cost orders of magnitude more energy than local arithmetic. See Table 1.

Table 1 Here is a comparison among capacity, energy consumption, bandwidth and latency in a typical memory hierarchy. Source: Author

As model size grows, an increasing share of system energy is spent moving data rather than computing on it. The arithmetic units stall while waiting for weight tensors to arrive, and effective throughput becomes a function of bandwidth and latency rather than raw FLOPS. The challenge compounds when models exceed single-device memory capacity and must be distributed across multiple accelerators.

Frontier LLMs challenging foundations of GPU architecture

The historical success of GPUs in machine learning emerged from an unusually strong alignment between hardware structure and model behavior. Modern GPUs from companies such as Nvidia and AMD are fundamentally throughput-oriented processors built around the single instruction multiple threads (SIMT) execution model.

Groups of threads—warps on Nvidia architectures, wavefronts on AMD architectures—execute instructions in lockstep. Maximum efficiency is achieved when threads follow identical execution paths, access memory in predictable patterns, and sustain dense arithmetic workloads with minimal synchronization overhead.

This design originated in graphics rendering, where millions of pixels or vertices undergo nearly identical operations in parallel. The same architectural assumptions proved highly effective for early deep learning systems, particularly convolutional neural networks and dense transformers. Large matrix multiplications, regular tensor shapes, and high arithmetic intensity mapped naturally onto GPU tensor cores and wide vectorized execution pipelines. Under sufficiently large batch sizes, GPUs can sustain exceptionally high utilization because computation dominates memory latency and control-flow overhead.

Frontier LLMs, however, are evolving away from the dense and homogeneous workloads that originally favored GPU architectures.

Modern LLM systems increasingly incorporate conditional computation: mixture-of-experts (MoE) layers, dynamic token routing, retrieval augmentation, speculative decoding, adaptive context management, variable sequence lengths, and sparsity-aware attention mechanisms. These techniques improve scaling efficiency at the model level by reducing the amount of computation performed per token while preserving or increasing representational capacity. They also introduce irregularity into execution patterns, which is precisely the condition under which SIMT architectures become less efficient.

The key issue is not simply “warp divergence” in the narrow classical GPU sense where threads within a warp follow different branches of a control-flow statement. In many MoE implementations, tokens routed to different experts are regrouped before execution specifically to minimize intra-warp divergence.
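The effect of regrouping can be illustrated with a toy model. The sketch below assumes a warp of 32 threads (the Nvidia warp size), one token per thread, and uniformly random expert assignments. It counts how many distinct experts each warp would have to serve before and after tokens are sorted by expert. The function name and the routing distribution are illustrative assumptions, not taken from any particular MoE implementation.

```python
import random

WARP_SIZE = 32  # threads per warp on Nvidia GPUs


def distinct_experts_per_warp(expert_ids, warp_size=WARP_SIZE):
    """For each warp-sized group of tokens, count the distinct experts in it."""
    return [
        len(set(expert_ids[i:i + warp_size]))
        for i in range(0, len(expert_ids), warp_size)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
num_experts = 16
assignments = [rng.randrange(num_experts) for _ in range(1024)]

arrival_order = distinct_experts_per_warp(assignments)
regrouped = distinct_experts_per_warp(sorted(assignments))

print("mean experts per warp, arrival order:", sum(arrival_order) / len(arrival_order))
print("mean experts per warp, regrouped:", sum(regrouped) / len(regrouped))
```

After sorting, almost every warp serves a single expert, which is why intra-warp divergence alone does not explain the utilization loss discussed in this section.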

The deeper architectural tension is broader: SIMT processors are optimized for spatially and temporally coherent workloads, while modern frontier inference increasingly behaves like sparse, dynamically scheduled computation with uneven work distribution and heavy communication dependencies.

In dense transformers, nearly every parameter participates in every token evaluation. Computational intensity remains high, tensor dimensions are regular, and work scheduling is relatively predictable. In sparse MoE systems, by contrast, only a small subset of experts may activate for a given token. A model with 16 experts and top-2 routing, for example, activates only a fraction of total parameters at each inference step. Although this dramatically improves parameter efficiency from a modeling perspective, it also fragments execution into uneven and dynamically changing workloads.
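As a rough arithmetic check on the 16-expert, top-2 example, the sketch below computes the fraction of parameters touched per token. The parameter counts are hypothetical, and the split between shared (always-active) parameters and per-expert parameters is an assumption made for illustration.

```python
def active_parameter_fraction(shared_params, params_per_expert, num_experts, top_k):
    """Fraction of all parameters that participate in evaluating one token."""
    total = shared_params + num_experts * params_per_expert
    active = shared_params + top_k * params_per_expert
    return active / total


# Expert parameters only: 2 of 16 experts are active.
expert_only = active_parameter_fraction(0, 1.0e9, num_experts=16, top_k=2)

# Hypothetical model with 2B shared parameters and 1B parameters per expert.
with_shared = active_parameter_fraction(2.0e9, 1.0e9, num_experts=16, top_k=2)

print(f"expert parameters active per token: {expert_only:.1%}")
print(f"all parameters active per token:    {with_shared:.1%}")
```

Counting expert weights alone, top-2 routing over 16 experts touches one eighth of them per token; shared layers raise the overall fraction.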

The consequence is reduced effective hardware utilization, not necessarily because every warp is internally diverging, but because the overall system struggles to maintain uniform occupancy, balanced scheduling, and continuous tensor-core saturation. Some experts become overloaded while others sit idle.

Token batches routed to a given expert may be too small to fully utilize matrix engines efficiently. Memory access patterns become less regular. Kernel launch granularity deteriorates. Synchronization overhead increases. The result is that the theoretical arithmetic throughput of the GPU becomes increasingly difficult to translate into sustained application-level throughput.
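The load-imbalance effect can be sketched with a small simulation. It rests on a simplifying assumption that is not from the article: all experts run in one synchronized step, so step time is set by the busiest expert and utilization is the mean load divided by the maximum load. The routing weights are hypothetical.

```python
import random
from collections import Counter


def route_tokens(num_tokens, weights, top_k, seed=0):
    """Assign each token to top_k distinct experts drawn with the given weights."""
    rng = random.Random(seed)
    experts = list(range(len(weights)))
    load = Counter()
    for _ in range(num_tokens):
        chosen = set()
        while len(chosen) < top_k:
            chosen.add(rng.choices(experts, weights=weights)[0])
        load.update(chosen)
    return [load[e] for e in experts]


def utilization(load):
    """Mean load over max load: 1.0 means perfectly balanced experts."""
    return (sum(load) / len(load)) / max(load)


uniform_load = route_tokens(4096, [1.0] * 16, top_k=2)
skewed_load = route_tokens(4096, [4.0, 4.0] + [1.0] * 14, top_k=2)

print("uniform routing utilization:", round(utilization(uniform_load), 2))
print("skewed routing utilization: ", round(utilization(skewed_load), 2))
```

Even this crude model shows how a few popular experts can leave most of the hardware waiting, regardless of how well each individual kernel is tuned.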

Furthermore, interactive AI workflows, especially AI agents that respond step by step, are difficult for GPUs to run efficiently. GPUs work best when they can process very large batches of data at once. In LLMs, this usually means combining many user requests together into large matrix operations. Large matrix operations are efficient because they involve much more computation than data movement, keeping the GPU fully occupied.

But interactive systems need low latency: the model must respond immediately instead of waiting to accumulate a large batch of requests. That means the batch size stays small. Small batches create smaller matrix operations that are less efficient on GPUs. The GPU spends more time moving data around and less time doing computation. As a result, GPU utilization drops. So, there is a trade-off. Large batches lead to high GPU efficiency but higher latency. Conversely, small batches cause low latency but worse GPU efficiency.
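The batch-size trade-off can be made concrete by estimating the arithmetic intensity of a single weight-matrix multiplication. This sketch assumes 2-byte elements, a square 4096 x 4096 weight matrix, and that each operand is moved through memory exactly once. All three are simplifying assumptions for illustration.

```python
def gemm_arithmetic_intensity(batch, d_in, d_out, bytes_per_elem=2):
    """FLOPs per byte moved for a (batch x d_in) @ (d_in x d_out) product."""
    flops = 2 * batch * d_in * d_out                      # one multiply + one add
    elems = d_in * d_out + batch * d_in + batch * d_out   # weights, inputs, outputs
    return flops / (bytes_per_elem * elems)


for batch in (1, 8, 64, 512):
    ai = gemm_arithmetic_intensity(batch, 4096, 4096)
    print(f"batch {batch:>4}: {ai:8.1f} FLOP/byte")
```

At batch size 1 the operation performs roughly one floating-point operation per byte moved, because the whole weight matrix is fetched to process a single token. Intensity rises almost linearly with batch size until the activations themselves become a significant share of the traffic.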

Agentic workflows usually prioritize responsiveness, which is why they are harder to run efficiently on GPUs.

The resulting inefficiencies are often obscured by headline FLOP metrics. Modern accelerators advertise enormous peak throughput numbers, but peak throughput reflects idealized dense execution under carefully tuned conditions. Real-world frontier inference frequently operates far from these conditions.

Effective utilization may decline substantially when workloads become routing-heavy, communication-bound, latency-sensitive, or dynamically imbalanced. In practice, the limiting resource increasingly shifts from raw arithmetic capability to orchestration efficiency across memory systems, interconnects, and distributed scheduling layers.

The hidden GPU complexity tax

Alongside these architectural mismatches lies another challenge: the growing software and optimization burden required to extract acceptable performance from GPU systems.

GPUs do not automatically deliver near-peak efficiency. High performance requires extensive manual optimization across multiple abstraction layers. Developers must orchestrate host-device memory transfers, optimize tensor layouts, tune kernel launch parameters, manage register pressure, balance shared-memory usage, fuse operations to reduce synchronization overhead, and carefully align workloads with hardware-specific execution characteristics. Small deviations in tensor dimensions, sequence lengths, routing distributions, or batch composition can materially reduce throughput.

As models become more dynamic, optimization itself becomes more fragile. Kernels tuned for one generation of hardware may perform poorly on another. Code paths optimized for dense transformers may degrade under sparse routing conditions. Performance engineering increasingly depends on vendor-specific toolchains such as CUDA, custom compiler stacks, graph schedulers, and specialized communication libraries tightly coupled to a particular hardware ecosystem.

The cumulative effect is a growing “complexity tax” surrounding GPU-centric AI infrastructure. The cost is not merely electrical power or silicon area, but engineering specialization, portability constraints, software maintenance overhead, and system fragility. As frontier models continue shifting toward sparse, distributed, and conditionally executed architectures, the tension between SIMT-oriented hardware assumptions and emerging AI workloads is becoming increasingly difficult to ignore.

Alternative AI processing architectures are mandatory

These pressures have catalyzed interest in alternative accelerator architectures designed explicitly around transformer workloads and data movement efficiency. Systems such as the TPUs developed by Google emphasize systolic arrays and compiler-driven dataflow scheduling to improve determinism and reduce divergence overhead.

Cerebras Systems has pursued wafer-scale integration, placing tens of gigabytes of SRAM directly on-chip in its wafer-scale engine to minimize off-chip memory traffic and reduce partitioning complexity. Graphcore designed its intelligence processing unit (IPU) around fine-grained parallelism and distributed local memory, explicitly targeting irregular and sparse workloads.

Drawing on more than two decades of architectural expertise and 14 silicon tape-outs, VSORA developed an approach that replaces the SIMT computational model with a dataflow architecture specifically engineered to overcome the memory wall. At its core is a massive flat register file spanning several megabytes, designed to supply data directly to large arrays of compute engines organized into wide, deeply pipelined execution paths.

Anticipating the evolving requirements of edge inference and future AI algorithms, such as those for autonomous driving (AD L3-L5) applications, the company also designed and embedded highly programmable processing cores capable of executing an extensive library of DSP operations with low latency and high efficiency.

While each approach involves trade-offs and varying degrees of ecosystem maturity, they share a common premise: future AI workloads are constrained less by arithmetic throughput and more by data orchestration, locality, and communication efficiency.

The next compute frontier

The broader trend in AI systems reflects a shift in the dominant bottleneck. During the convolutional era, compute capacity measured in TFLOPS was the primary metric. Early transformer models balanced compute and memory bandwidth. Frontier LLMs at trillion-parameter scale are now constrained primarily by memory movement and interconnect efficiency. As sparsity and conditional activation become central architectural features, the efficiency of routing and dataflow scheduling begins to outweigh peak arithmetic density.

GPUs remain foundational to AI infrastructure, particularly in training. Their ecosystem maturity, programmability, and unmatched dense training throughput ensure continued relevance, especially during large-scale pretraining, where arithmetic intensity remains high and workloads are relatively regular. However, as models grow more conditional, more distributed, and more memory-bound, the architectural friction becomes increasingly visible.

The future of AI acceleration will likely reward designs that privilege data locality, minimize cross-device communication, and execute sparse patterns natively rather than emulating them within a dense SIMT framework.

The decisive question for next-generation systems is no longer how many floating-point operations per second can be delivered in isolation. It is how efficiently data can be moved, routed, and scheduled across increasingly complex and sparsely activated models.

Lauro Rizzatti is a business development executive with VSORA, a technology company offering silicon semiconductor solutions that redefine performance. He is a noted chip design verification consultant and industry expert on hardware emulation.

The post GPUs: A high-throughput architecture confronting a workload shift appeared first on EDN.

4/20mA to 0/20mA loop current converter for grounded loads

EDN Network - Tue, 05/19/2026 - 15:00

Ground-referenced loads are common in industry. This circuit implements current loop conversions for them.

Recently, there have been several published Design Ideas on converting 0/20mA to 4/20mA current and 4/20mA to 0/20mA current as full-circle current loop conversions. However, these circuits have all focused on floating loads. It’s common to also come across loads that are ground-referenced. The circuit in Figure 1 addresses this alternative requirement, converting 4/20mA current to 0/20mA current for feeding grounded loads.


Figure 1 In this 4/20mA to 0/20mA converter for grounded loads, R4 and RB can be replaced by multi-turn potentiometers for tuning purposes.

How does it work? The 4-20mA input current flows through R1, which converts it to 0.4V to 2.0V; U2A buffers this voltage. U1 generates a 1mA reference current, which is fed into R4 and converted to 0.4V. This reference voltage is buffered by U2B. U2C then subtracts the reference voltage from the input voltage.

Next, let’s look at the positive input of U2D. Two currents meet at this node:

  • The current through Ra: $I_{Ra} = \frac{I_{input} R_1 - I_{ref} R_4}{R_a}$
  • The current through Rb: $I_{Rb} = -\frac{I_{out} R_c}{R_b + R_c}$

Since the negative input of U2D is grounded, these two currents must be equal in magnitude:

$$\frac{I_{input} R_1 - I_{ref} R_4}{R_a} = \frac{I_{out} R_c}{R_b + R_c}$$

where $I_{ref} R_4$ is 0.4V. Solving for $I_{out}$:

$$I_{out} = \frac{I_{input} R_1 - 0.4\,\mathrm{V}}{R_a} \left(1 + \frac{R_b}{R_c}\right)$$

Select the values of Rb and Rc such that Rb/Rc = 124. Substituting the values of Ra, Rb, Rc and R1 from Figure 1:

$$I_{out} = \frac{I_{input} R_1 - 0.4\,\mathrm{V}}{R_a} \times 125$$

Hence, Iout = (Iinput - 4mA) × 1.25. For example, if Iinput is 20mA, Iout = (20 - 4) × 1.25 = 20mA. And if Iinput is 4mA, Iout = (4 - 4) × 1.25 = 0mA.
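The ideal transfer function can be checked numerically with a short sketch. R1 = 100 ohms and Rb/Rc = 124 come from the text. The value Ra = 10 kilohms is an assumption inferred from the stated overall gain of 1.25 (since R1/Ra × 125 = 1.25), not read from Figure 1, and the function name is introduced here for illustration.

```python
def loop_output_ma(i_input_ma, r1=100.0, ra=10_000.0, rb_over_rc=124.0, v_ref=0.4):
    """Ideal 4/20mA to 0/20mA transfer function, ignoring op-amp errors.

    ra=10_000.0 is an assumed value implied by the 1.25 gain in the text.
    """
    v_in = i_input_ma * 1e-3 * r1                       # voltage across R1, in volts
    i_out = (v_in - v_ref) / ra * (1.0 + rb_over_rc)    # output current, in amps
    return i_out * 1e3                                  # back to mA


for i_in in (4.0, 12.0, 20.0):
    print(f"Iinput = {i_in:4.1f} mA -> Iout = {loop_output_ma(i_in):5.2f} mA")
```

Any Ra and R1 pair with the same ratio would give the same ideal result; only the ratio R1/Ra and the factor (1 + Rb/Rc) enter the gain.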

How do you tune the circuit? Implement R4 and Rb as multi-turn potentiometers; R1 should be a precision 100 ohm resistor. Adjust R4 so that the voltage across it is 0.4V. Feed 20mA from a precision current source as Iinput and adjust Rb to get 20mA at Iout. Repeat this exercise with 4mA and 12mA inputs.

Simulation test results follow:

Iinput (mA)   Iout (mA)   Calculated Iout (mA)   Error (%)
4.0           0.44        0                      2.2
6.0           2.56        2.5                    0.3
8.0           5.06        5.0                    0.3
10.0          7.56        7.5                    0.3
12.0          10.1        10.0                   0.5
14.0          12.6        12.5                   0.5
16.0          15.1        15.0                   0.5
18.0          17.6        17.5                   0.5
20.0          20.1        20.0                   0.5

These results suggest that the circuit delivers high accuracy, with error no higher than 0.5%, except at the 4mA input, where it reaches 2.2%. Accuracy can be improved by selecting high-end amplifiers, such as instrumentation amplifiers, with negligible offset and a high common-mode rejection ratio (CMRR).
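The article does not state how the error percentage is normalized. The listed figures are consistent with the absolute deviation expressed as a percentage of the 20mA full-scale output, which the sketch below reproduces from the simulated and calculated values. The full-scale normalization is an inference from the numbers, not something the source confirms.

```python
FULL_SCALE_MA = 20.0  # assumed normalization: percent of full-scale output

simulated_ma = [0.44, 2.56, 5.06, 7.56, 10.1, 12.6, 15.1, 17.6, 20.1]
calculated_ma = [0.0, 2.5, 5.0, 7.5, 10.0, 12.5, 15.0, 17.5, 20.0]
reported_error_pct = [2.2, 0.3, 0.3, 0.3, 0.5, 0.5, 0.5, 0.5, 0.5]

# Absolute deviation as a percentage of full scale, rounded to one decimal.
error_pct = [
    round(abs(sim - calc) / FULL_SCALE_MA * 100.0, 1)
    for sim, calc in zip(simulated_ma, calculated_ma)
]

print(error_pct)
print("matches reported values:", error_pct == reported_error_pct)
```

Read this way, the 2.2% figure at the 4mA input corresponds to a 0.44mA residual output where the ideal output is zero.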

As a safety measure, Q2 prevents the output current from rising more than a few mA above the 20mA input threshold. R5, R6, and R7 should be identical in value. R8 should also be implemented as a multi-turn potentiometer; the resulting tuning capability helps reduce the output to near zero for a 4mA input.

Jayapal Ramalingam has over three decades of experience in designing electronics systems for power & process industries and is presently a freelance automation consultant.

The post 4/20mA to 0/20mA loop current converter for grounded loads appeared first on EDN.
