HAMi GPU Isolation Validation - Lesson 1C Part B / Lesson 6 Part B¶
STATUS: ✅ VALIDATED - captured on a real NVIDIA RTX A6000 (Hyperstack), 2026-06-23. Evidence: real command output captured during the run (artifact files cited in the table below). This is the runtime-enforcement half the HAMi scheduling sim deliberately cannot prove. Lab:
hami-isolation-realgpu/.
Environment¶
| Item | Value |
|---|---|
| Date | 2026-06-23 |
| Machine | Hyperstack GPU VM (optimistic-fermi), Ubuntu 22.04 |
| GPU | NVIDIA RTX A6000, 48 GB (49140 MiB), UUID GPU-d3d0a942-4b02-e4bc-b07e-485c8d2c8552; Ampere, no MIG - software sharing is the only option (the HAMi premise) |
| Driver | 535.183.06 (CUDA 12.4) |
| Kubernetes | k3s, containerd config v3, default-runtime: nvidia (k3s --default-runtime flag, not a containerd template) |
| GPU stack | HAMi 2.9.0, scheduler image tag matched to the k8s server version. No GPU Operator (HAMi ships its own device plugin; the two must not coexist) |
| HAMi pods | hami-device-plugin 2/2 Running, hami-scheduler 2/2 Running (hami-pods.txt) |
| Registration | node allocatable nvidia.com/gpu = 10 (1 card × deviceSplitCount); hami.io/node-nvidia-register → devmem:49140, devcore:100, type:"NVIDIA RTX A6000", mode:"hami-core" (node-allocatable.txt) |
HAMi advertises only
nvidia.com/gpuin node allocatable;gpumem/gpucoresare accounted by the HAMi scheduler from the register annotation and enforced per-pod by HAMi-core - they are intentionally not node-allocatable resources.
Validation checklist (5 exercises)¶
| # | Exercise | Pass criteria | Result | Evidence |
|---|---|---|---|---|
| 1 | Co-residency | two pods on one physical GPU | ✅ hami-share-a + hami-share-b both Running on optimistic-fermi |
1-co-residency.txt |
| 2 | Virtualized device view | in-pod nvidia-smi shows the slice |
✅ both pods report 0MiB / 8000MiB, not the real 49140 MiB |
2-3-probe-memory-a.txt, -b.txt |
| 3 | Memory-cap enforcement | allocation refused at the slice, by HAMi-core | ✅ cudaMalloc refused after 7680 MiB; [HAMI-core ERROR] ... Device 0 OOM 8594128896 / 8388608000 (8388608000 B = exactly 8000 MiB), while the card had ~40 GB free |
2-3-probe-memory-a.txt, -b.txt |
| 4 | Per-device budget (scheduler) | a pod that fits an empty card stays Pending beside the slices | ✅ hami-oversubscribe (45000 MiB) Pending - FilteringFailed ... CardInsufficientMemory |
4-oversubscribe-status.txt, 4-oversubscribe-events.txt |
| 5 | The mechanism | the HAMi-core injection that enforces the cap | ✅ env CUDA_DEVICE_MEMORY_LIMIT_0=8000m, CUDA_DEVICE_SM_LIMIT=0; library /usr/local/vgpu/libvgpu.so; NVIDIA_VISIBLE_DEVICES=GPU-d3d0a942-… |
5-probe-mechanism-a.txt |
What this proves (that the simulation cannot)¶
- Two tenants share one physical GPU (Exercise 1) - stock Kubernetes treats a GPU as indivisible and cannot do this.
- The slice is real, two ways: the container is shown an 8 GB card (Exercise 2), and a CUDA allocation is refused at 8 GB while ~40 GB physically remained (Exercise 3). The refusal comes from HAMi-core, not the hardware - the contradiction that only a user-space CUDA-interception cap can produce.
- The card is one shared, accounted budget (Exercise 4): the HAMi scheduler refuses a 45
GB pod beside two 8 GB slices (
CardInsufficientMemory) even though 45 GB < the 48 GB card - the real-hardware counterpart of the simulation's per-device exhaustion test. - The mechanism is concrete (Exercise 5): HAMi-core is injected as
libvgpu.soand readsCUDA_DEVICE_MEMORY_LIMIT_0=8000m- the sameCUDA_DEVICE_MEMORY_LIMITmechanism that NVIDIA's KAI Scheduler adopted in June 2026 for its own fractional-GPU memory isolation.
Scope limits¶
This is software isolation (user-space CUDA interception), not MIG hardware fault
isolation - treat it as a scheduling-and-accounting guarantee with runtime enforcement, not
a security boundary. It proves the memory cap and the virtualized device view; it does
not measure compute-throttling accuracy or noisy-neighbour interference under sustained
load. Single node, so nothing about NCCL/NVLink/MIG/multi-node scale. Full ledger:
fake-vs-real-limitations.md.