LinxISA v0.4 Hardening Policy¶
This document defines the live public policy for selecting and validating
rendering- and AI-adjacent hardware hardening in canonical v0.4 systems.
It does not freeze a final accelerator set. Instead, it freezes the rules that
any hardening decision MUST satisfy so the design stays aligned with the
existing v0.4 rendering profile, command model, and fallback architecture.
Scope¶
- rendering and AI acceleration candidates,
- selection criteria for new hardened units,
- required fallback and observability rules,
- measurement requirements that must precede hardening decisions.
This page is normative for hardening policy. It does not replace the ISA manual for block semantics or the command contract for submission ownership.
Existing Contract Alignment¶
Any hardening work under v0.4 MUST preserve these already-frozen rules:
- command submission remains BCC-led and block-stream based,
- VEC remains the general programmable SIMT compute path for parallel-loop execution across workload classes,
- MPARVEC kernels remain the required functional fallback for rendering PTOs,
- TAU execution remains tile-to-tile only,
- small or rarely changing state remains descriptor-driven through B.ARG and B.IOR,
- larger or hot state remains explicit through referenced records or StateTiles,
- hidden execution state outside the architectural block stream and referenced state records is not part of the canonical model.
These rules come from:
- docs/architecture/v0.4-architecture-contract.md
- docs/architecture/v0.4-workload-engine-model.md
- docs/architecture/v0.4-rendering-command-contract.md
- docs/architecture/linxcore/microarchitecture.md
- isa/v0.4/state/rendering_profile.json
- docs/architecture/isa-manual/src/chapters/08_tile_blocks.adoc
Core Policy¶
Canonical v0.4 uses limited hardening:
- only high-value and well-measured units should become hardened engines,
- everything not hardened MUST remain executable through the existing software and VEC paths,
- hardening decisions MUST improve a measured bottleneck rather than follow a speculative full-pipeline rewrite,
- new hardened paths MUST integrate through the same explicit block-stream submission model already used by the rest of the stack.
The default posture is conservative. If the measurement case is weak or the fallback story is unclear, the work remains on software or VEC.
Required Fallback Rule¶
Every hardened rendering or AI path MUST have a functional fallback path that is already valid in the canonical model.
- Rendering-oriented acceleration MUST preserve an MPARVEC fallback or an earlier software-backed reference path when the accelerated unit is absent, disabled, or still under validation.
- TAU-facing hardening MUST remain tile-to-tile and MUST NOT bypass the explicit memory-domain rules by adding direct .brg behavior to TAU.
- Any global-memory interaction required by a hardened path MUST still occur through explicit TMA, DMA, or VEC-mediated mechanisms visible in the command stream.
This rule keeps bring-up, emulator validation, and staged deployment aligned.
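The fallback rule can be read as a simple selection function. The following is a minimal sketch, not part of the canonical model: the `UnitState` values mirror the three conditions named above (absent, disabled, under validation), and the names `UnitState`, `PathChoice`, and `select_path` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class UnitState(Enum):
    """Hypothetical lifecycle states for one hardened unit."""
    ABSENT = "absent"
    DISABLED = "disabled"
    UNDER_VALIDATION = "under_validation"
    VALIDATED = "validated"


@dataclass(frozen=True)
class PathChoice:
    path: str    # "hardened" or "fallback"
    reason: str  # human-readable explanation for traces and logs


def select_path(unit_state: UnitState) -> PathChoice:
    """Pick the execution path for one rendering or AI operation.

    Only a validated unit is preferred; every other state routes the
    work to the functional fallback, which must always exist.
    """
    if unit_state is UnitState.VALIDATED:
        return PathChoice("hardened", "unit present, enabled, and validated")
    return PathChoice("fallback", f"unit is {unit_state.value}")


if __name__ == "__main__":
    for state in UnitState:
        print(state.name, "->", select_path(state))
```

The point of the sketch is that the fallback is the default branch: a hardened path has to earn selection, and nothing in the function can leave an operation without a valid path.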
TAU As The First Rendering Hardening Substrate¶
Under the current v0.4 rendering profile, TAU is the canonical substrate
for limited rendering-oriented hardening.
This means:
- VEC remains the general programmable shader and fallback path,
- TAU is the preferred place to host selected tile-oriented hardened rendering work once the workload evidence justifies it,
- hardening is expressed in a way that preserves explicit tile inputs and tile outputs rather than inventing a hidden side-channel execution path.
The architectural handoff medium is the tile register file and referenced tile-state records:
- intermediate rendering state crossing between blocks or engines is carried in tiles,
- small parameters and descriptor-like state are carried through B.ARG and B.IOR,
- larger or hotter rendering state remains in explicit StateTiles,
- TAU-facing work MUST consume and produce tile-visible state so the fallback boundary remains explicit and observable.
This is the part of the old draft direction that is now frozen. It does not freeze the final catalog of hardened functions.
LinxCore Microarchitecture Alignment¶
TAU-facing hardening MUST align with the live LinxCore microarchitecture contract rather than bypassing it.
Required consequences:
- hardened rendering work must enter the machine through explicit block-stream commands, not a sideband command path,
- dispatch, completion, and cancellation must remain visible at the same block/BID boundary used by other block-engine work,
- younger speculative engine work must remain flushable under the normal bid > flush_bid rollback rule,
- completion must compose with the existing block-engine completion model described by the rendering command contract and LinxCore microarchitecture contract,
- a hardened TAU unit must not create hidden global-memory effects outside the explicit TMA, DMA, VEC, and committed block-stream rules.
This keeps rendering hardening compatible with precise retirement, redirect, and recovery behavior in LinxCore.
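The rollback rule above can be illustrated with a small model. This is a sketch under a simplifying assumption: BIDs are treated as plain, monotonically increasing integers with no wraparound, which may not match the real LinxCore encoding. `EngineWork` and `apply_flush` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EngineWork:
    bid: int     # block ID (assumed: plain integer, no wraparound)
    engine: str  # e.g. "VEC" or "TAU"


def apply_flush(
    in_flight: List[EngineWork], flush_bid: int
) -> Tuple[List[EngineWork], List[EngineWork]]:
    """Split in-flight work into (kept, flushed).

    Work is flushed when bid > flush_bid, regardless of which engine
    holds it, so hardened TAU work follows the same rule as VEC work.
    """
    kept = [w for w in in_flight if w.bid <= flush_bid]
    flushed = [w for w in in_flight if w.bid > flush_bid]
    return kept, flushed


if __name__ == "__main__":
    work = [EngineWork(10, "VEC"), EngineWork(11, "TAU"), EngineWork(12, "TAU")]
    kept, flushed = apply_flush(work, flush_bid=10)
    print("kept:", kept)
    print("flushed:", flushed)
```

The predicate never inspects the `engine` field, so a hardened unit gains no exemption from redirect or recovery in this model.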
Candidate Priority Classes¶
The current canonical priority order for evaluation is:
- DMA, BLT, clear, and related copy or format-conversion work.
- Texture and sampler work.
- Depth, stencil, and blend work.
- Raster, tiler, and binning work.
- Optional tensor or MMA-class acceleration for AI-focused workloads.
This list defines evaluation priority, not an implementation commitment. A later candidate may move forward earlier only if measurements show a stronger and more verifiable payoff.
Measurement Requirements¶
No hardening candidate may be promoted without workload-backed measurements.
At minimum, the evaluation set MUST capture:
- instruction-mix or IR-mix proxies for the relevant workloads,
- bytes read and written where measurable,
- texture sample counts, filter modes, and dominant formats when texture work is involved,
- depth, stencil, and blend frequency when fixed-function render-output work is involved,
- occupancy or working-set pressure signals that explain why the current path is bottlenecked,
- backend identity and configuration for each captured result.
Acceptable measurement sources include:
- Mesa driver counters,
- shader compiler or IR statistics,
- kernel submission statistics,
- emulator or RTL performance counters when available,
- machine-readable workload summaries produced by canonical bring-up runners.
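One way to keep these measurements machine-readable is a single record per captured result. The sketch below is illustrative only: the field names, the `MeasurementRecord` class, and the JSON layout are assumptions, not a format defined by this policy.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class MeasurementRecord:
    """Hypothetical per-run record covering the minimum evaluation set."""
    workload: str
    source: str                      # e.g. "mesa_counters", "emulator"
    backend: str                     # backend identity
    backend_config: Dict[str, str]   # backend configuration
    ir_mix: Dict[str, int]           # instruction-mix or IR-mix proxies
    pressure: Dict[str, float]       # occupancy / working-set signals
    bytes_read: Optional[int] = None     # optional: "where measurable"
    bytes_written: Optional[int] = None
    texture: Optional[Dict[str, object]] = None    # counts, filters, formats
    render_output: Optional[Dict[str, int]] = None  # depth/stencil/blend

    def missing(self, uses_texture: bool = False,
                uses_render_output: bool = False) -> List[str]:
        """Return the names of required fields that are empty."""
        gaps = []
        if not self.backend:
            gaps.append("backend")
        if not self.ir_mix:
            gaps.append("ir_mix")
        if not self.pressure:
            gaps.append("pressure")
        if uses_texture and not self.texture:
            gaps.append("texture")
        if uses_render_output and not self.render_output:
            gaps.append("render_output")
        return gaps

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)
```

A record with a non-empty `missing(...)` result would not count as evidence for promotion. Texture and render-output fields are only required when the candidate involves that kind of work, matching the conditional wording of the list above.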
Experimental Workflow¶
Hardening evaluation MUST proceed in this order:
- Establish the baseline path using software or existing MPARVEC execution.
- Add instrumentation at the command, driver, compiler, or emulator layer.
- Run representative workloads and collect machine-readable measurements.
- Rank candidate bottlenecks using those measurements.
- Select one candidate unit at a time.
- Define the explicit command/block interface and the matching fallback path.
- Validate correctness against the fallback path before treating the hardened unit as preferred.
Parallel speculative hardening of multiple large units is discouraged during bring-up because it weakens attribution and makes fallback validation harder.
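The ranking and single-candidate selection steps can be sketched as follows. Everything numeric here is an assumption for illustration: the "measured share" metric (fraction of total cost attributed to a class) and the `min_share` threshold are not defined by this policy. The class keys are hypothetical short names for the priority classes listed earlier, used only to break ties.

```python
from typing import Dict, List, Optional

# Hypothetical short names, in the canonical evaluation priority order.
PRIORITY_CLASSES = [
    "copy",                  # DMA, BLT, clear, copy / format conversion
    "texture",               # texture and sampler work
    "depth_stencil_blend",
    "raster_tiler_binning",
    "tensor_mma",            # optional AI-focused acceleration
]


def rank_candidates(measured_share: Dict[str, float]) -> List[str]:
    """Order candidates by measured share, highest first.

    Ties fall back to the canonical priority order, so a later class
    only moves ahead when its measurement is strictly stronger.
    """
    return sorted(
        measured_share,
        key=lambda name: (-measured_share[name], PRIORITY_CLASSES.index(name)),
    )


def select_one(measured_share: Dict[str, float],
               min_share: float = 0.10) -> Optional[str]:
    """Return at most one candidate, or None if the evidence is weak."""
    ranked = rank_candidates(measured_share)
    if ranked and measured_share[ranked[0]] >= min_share:
        return ranked[0]
    return None  # conservative default: stay on software or VEC
```

Returning a single name, or `None`, reflects the one-candidate-at-a-time rule: the function has no way to nominate several units in parallel.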
Command and State Rules For Hardened Units¶
Any hardened unit selected under this policy MUST obey the same explicit command
and state model used elsewhere in canonical v0.4:
- invocation MUST be visible in the lowered block stream,
- descriptor and state dependencies MUST be explicit,
- reusable state MUST appear as immutable referenced records or StateTiles,
- completion and synchronization MUST compose with the existing block/BID model,
- emulator and kernel traces MUST be able to report the unit's execution at the same semantic boundary as other block-stream work.
Opaque side channels that materially alter execution are not allowed.
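A trace-based check is one way to detect work that never appeared in the lowered block stream. The sketch below assumes, purely for illustration, that both the block stream and the execution trace can be reduced to `(bid, engine)` pairs. `BlockEntry`, `TraceEvent`, and `find_unsubmitted_work` are hypothetical names, not an existing tool.

```python
from typing import List, NamedTuple


class BlockEntry(NamedTuple):
    bid: int     # block ID as lowered into the block stream
    engine: str  # engine the block targets


class TraceEvent(NamedTuple):
    bid: int     # block ID reported by the emulator or kernel trace
    engine: str  # engine that reported executing the work


def find_unsubmitted_work(block_stream: List[BlockEntry],
                          trace: List[TraceEvent]) -> List[TraceEvent]:
    """Return trace events with no matching block-stream entry.

    A non-empty result means some unit executed work that was not
    visible in the lowered block stream at the block/BID boundary.
    """
    submitted = {(b.bid, b.engine) for b in block_stream}
    return [e for e in trace if (e.bid, e.engine) not in submitted]


if __name__ == "__main__":
    stream = [BlockEntry(1, "VEC"), BlockEntry(2, "TAU")]
    trace = [TraceEvent(1, "VEC"), TraceEvent(2, "TAU"), TraceEvent(3, "TAU")]
    print(find_unsubmitted_work(stream, trace))
```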
Acceptance Criteria¶
A candidate is worth hardening only when all of the following are true:
- the workload evidence shows a recurring bottleneck that the candidate directly addresses,
- the command and state interface can be expressed within the canonical block-stream model,
- the fallback path remains available and correctness-comparable,
- the validation burden is manageable in QEMU, AVS, and later RTL evidence,
- the candidate does not force contradiction with tile-to-tile TAU policy, descriptor-driven state rules, or the existing rendering command contract.
If any of those conditions fail, the candidate remains in software or VEC.
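The all-or-nothing nature of these criteria can be captured as a checklist. This is a minimal sketch: the `Acceptance` field names paraphrase the five conditions above and are hypothetical, and how each boolean is established is outside the sketch.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class Acceptance:
    """One boolean per acceptance criterion (names are illustrative)."""
    recurring_bottleneck_addressed: bool
    interface_fits_block_stream: bool
    fallback_available_and_comparable: bool
    validation_burden_manageable: bool
    no_contract_contradiction: bool


def decide(a: Acceptance) -> Tuple[str, List[str]]:
    """Return the decision and the list of failed criteria.

    A single failed criterion keeps the candidate in software or VEC.
    """
    failed = [f.name for f in fields(a) if not getattr(a, f.name)]
    decision = "harden" if not failed else "software_or_vec"
    return decision, failed


if __name__ == "__main__":
    candidate = Acceptance(True, True, False, True, True)
    print(decide(candidate))
```

Returning the failed criteria alongside the decision keeps the rejection reason reviewable.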
What This Policy Does Not Freeze¶
This page intentionally does not freeze:
- the final set of implemented hardened engines,
- whether a given function lands as TEPL, TAU micro-kernel, or another accelerator-specific form,
- whole-stage versus subgraph offload granularity,
- final benchmark-suite composition beyond the requirement that the workloads be representative and machine-readable.
Those details require additional canonical pages once the design is mature enough to lock.
Relationship To Bring-up¶
The rendering bring-up plan establishes the first software-backed baseline in docs/bringup/rendering_vulkan_bringup.md.
This hardening policy starts after that baseline exists:
- first prove the userspace and rendering path works,
- then measure it,
- then harden one justified bottleneck at a time without breaking the existing fallback path.