PATENT

Patent Abstract

System and Method for Predictive Multi-Tier Weight Residency and Precision Orchestration for Neural-Network Inference

Patent application 202641038857 in the Manish KL patent portfolio, covering tiered weight memory and related technical systems.

202641038857 · 2026-03-28 · Tiered Weight Memory

Overview

Neural-network weights are treated as live runtime state whose placement and precision can change across HBM, lower volatile tiers, and storage-backed tiers according to workload behavior.
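The idea of weights as live runtime state can be sketched minimally as a shard record that tracks its current tier and precision and can be moved between tiers. The tier names, field names, and `promote` helper below are illustrative assumptions, not terms from the filing.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    # Illustrative three-level hierarchy (assumption): HBM, a lower
    # volatile tier, and a storage-backed tier.
    HBM = 0
    DRAM = 1
    STORAGE = 2

@dataclass
class WeightShard:
    """Runtime state for one weight shard: where it lives and at what precision."""
    name: str
    tier: Tier
    precision_bits: int

def promote(shard: WeightShard) -> WeightShard:
    """Move a shard one tier closer to HBM (no-op if already in HBM)."""
    if shard.tier is not Tier.HBM:
        shard.tier = Tier(shard.tier.value - 1)
    return shard

# A cold expert block staged on storage, promoted one tier on demand.
shard = WeightShard("expert_7.ffn", Tier.STORAGE, precision_bits=4)
promote(shard)  # shard.tier is now Tier.DRAM
```

Demotion would be the symmetric operation; under this sketch, placement changes are just mutations of runtime state rather than a fixed load-time layout.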

Technical Abstract

A policy engine evaluates reuse, routing likelihood, layer criticality, transfer cost, decompression cost, bandwidth pressure, and quality sensitivity to decide how each weight shard or expert block should be stored and staged. The controller schedules promotions, demotions, decompression, and predictive prefetch while enforcing precision floors for quality-sensitive blocks.
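One way such a policy engine could combine those signals is a scalar residency score (benefit of keeping a block hot minus staging cost) plus a precision clamp for quality-sensitive blocks. The signal names, weighting, and the 8-bit floor below are assumptions for illustration only; the filing does not specify a scoring formula.

```python
from dataclasses import dataclass

@dataclass
class ShardStats:
    # Illustrative per-shard signals, each normalized to [0, 1] (assumption).
    reuse: float               # recent access frequency
    routing_likelihood: float  # chance an expert block will be routed to
    criticality: float         # layer criticality to output quality
    transfer_cost: float       # cost to move the shard up a tier
    decompress_cost: float     # cost to decompress on promotion
    bandwidth_pressure: float  # current interconnect contention
    quality_sensitive: bool    # True => enforce a precision floor

def residency_score(s: ShardStats) -> float:
    """Higher => promote/prefetch candidate; lower => demotion candidate."""
    benefit = 0.4 * s.reuse + 0.3 * s.routing_likelihood + 0.3 * s.criticality
    # Moving data is more expensive when bandwidth is already under pressure.
    cost = (s.transfer_cost + s.decompress_cost) * (1.0 + s.bandwidth_pressure)
    return benefit - 0.5 * cost

def choose_precision(s: ShardStats, requested_bits: int, floor_bits: int = 8) -> int:
    """Clamp requested storage precision to the floor for quality-sensitive blocks."""
    return max(requested_bits, floor_bits) if s.quality_sensitive else requested_bits
```

A controller could rank shards by `residency_score` each scheduling interval, promote the top of the list, demote the bottom, and apply `choose_precision` whenever a block is re-quantized for a lower tier.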

SEO Keywords

weight residency patent, neural network weight orchestration patent, HBM patent, memory hierarchy patent, inference precision patent

Related Patents

More patents in memory residency, KV systems, and deterministic inference.

These filings sit nearby in the portfolio and strengthen internal linking across related patent topics.