# LinxISA v0.4 Workload Engine Model

This document defines the live public workload-to-engine model for canonical
v0.4. It freezes how the architecture should be described across
general-purpose compute, AI workloads, and rendering workloads without
implying that these are three separate machines. The key rule is that `VEC`
is the general SIMT vector compute path for parallel work across workload
classes.
## Scope

- workload-class to engine mapping,
- role of `VEC`, `TMA`, `CUBE`, and `TAU` in the current v0.4 profile,
- canonical interpretation of `VEC` for parallel loops,
- boundaries between general programmable execution and hardened engines.

This page is normative for workload-model descriptions in live architecture
docs, tests, guides, and bring-up materials.
## Core Model

LinxISA v0.4 is one block-structured architecture serving multiple workload
classes:

- general-purpose computing,
- AI-oriented workloads,
- rendering-oriented workloads.

These workloads share one architectural state model, one block model, and one
set of execution engines. They are not separate, incompatible execution modes.
## VEC Rule

`VEC` is the canonical general SIMT vector compute engine in v0.4.

- `VEC` executes general parallel-loop work through the SIMT kernel model.
- Any workload that can be expressed as parallel loops may run in `VEC`.
- This includes, but is not limited to:
    - AI kernels such as softmax and other non-matmul parallel compute stages,
    - shader-style compute in rendering,
    - general data-parallel compute outside graphics or AI-specific pipelines.

In architectural terms, this means:

- programmable SIMT computation is expressed through canonical vector block forms such as `MPAR` and `MSEQ`,
- `VEC` is not a rendering-only engine and not an AI-only engine,
- hardened engines may accelerate specific cases, but they do not replace the role of `VEC` as the general parallel execution substrate.
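The parallel-loop reading of the VEC rule can be illustrated with a small
sketch. This is a toy Python model, not LinxISA code: `vec_parallel_for` is a
hypothetical stand-in for launching one SIMT kernel invocation per lane (the
role the canonical `MPAR`/`MSEQ` block forms play architecturally), applied
here to the per-element stage of softmax.

```python
import math

def vec_parallel_for(n_lanes, kernel):
    """Toy stand-in for a VEC SIMT launch: one kernel invocation per lane.

    Hypothetical helper for illustration only; the real block forms are
    MPAR/MSEQ as defined by the v0.4 profile, not a Python list comprehension.
    """
    return [kernel(lane) for lane in range(n_lanes)]

def softmax(xs):
    # Reduction stages (max and sum) kept scalar here for clarity.
    m = max(xs)
    denom = sum(math.exp(x - m) for x in xs)
    # The independent per-element loop is the SIMT-shaped part: every lane
    # reads xs[lane] and writes exactly one output element.
    return vec_parallel_for(len(xs), lambda lane: math.exp(xs[lane] - m) / denom)

probs = softmax([1.0, 2.0, 3.0])
```

The same shape covers any of the listed cases: shader-style per-pixel loops
and general data-parallel loops differ only in the kernel body, not in how
they map onto `VEC`.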
## Engine Roles By Workload Class

### General-Purpose Computing

General-purpose computing may use scalar, vector, and tile-capable parts of
the machine as needed.

- scalar and block-structured control remain first-class,
- `VEC` is the programmable path for general parallel loops,
- `TMA` may be used where explicit tile movement or layout staging is useful,
- specialized engines are optional accelerators rather than required entry points.
### AI-Oriented Workloads

AI-oriented workloads primarily compose:

`CUBE + VEC + TMA`

Role split:

- `CUBE` handles the current matrix and accumulator engine path,
- `VEC` handles general programmable SIMD/SIMT computation, including non-matmul kernels such as softmax and other loop-based transforms,
- `TMA` handles explicit tile-memory movement and staging.

This means AI is not “CUBE-only”. Matrix-heavy work may prefer `CUBE`, but the
broader AI pipeline still depends on `VEC` for general parallel stages.
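The `CUBE + VEC + TMA` role split can be sketched as a toy pipeline. All
names here (`tma_stage`, `cube_matmul`, `vec_softmax_rows`) are hypothetical
illustrations of the engine roles, not real v0.4 interfaces:

```python
import math

def tma_stage(tile):
    """TMA role: explicit tile movement/staging (modeled as a copy)."""
    return [row[:] for row in tile]

def cube_matmul(a, b):
    """CUBE role: matrix/accumulator engine path (modeled as plain matmul)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def vec_softmax_rows(m):
    """VEC role: non-matmul parallel stage (per-row softmax)."""
    out = []
    for row in m:
        mx = max(row)
        e = [math.exp(x - mx) for x in row]
        s = sum(e)
        out.append([x / s for x in e])
    return out

# Stage tiles in, run the matrix-heavy part on CUBE, then the general
# parallel stage on VEC -- the pipeline needs all three roles.
a = tma_stage([[1.0, 0.0], [0.0, 1.0]])
b = tma_stage([[2.0, 1.0], [1.0, 2.0]])
scores = vec_softmax_rows(cube_matmul(a, b))
```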
### Rendering-Oriented Workloads

Rendering-oriented workloads primarily compose:

`VEC + TMA + TAU`

Role split:

- `VEC` handles programmable shader-style computation and the required fallback path,
- `TMA` handles explicit tile transport and memory staging,
- `TAU` hosts tile-to-tile hardened rendering-oriented work under the current rendering profile.

Rendering is therefore not “TAU-only”. Shaders and other programmable parallel
stages remain `VEC` work in the canonical model.
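The `VEC + TMA + TAU` composition admits a similar toy sketch. The helpers
below (`vec_shade`, `tma_transport`, `tau_blend`) are hypothetical stand-ins
for the three roles, with a blend standing in for unspecified tile-to-tile
hardened work:

```python
def vec_shade(width, height, shader):
    """VEC role: programmable shader-style loop, one invocation per pixel."""
    return [[shader(x, y) for x in range(width)] for y in range(height)]

def tma_transport(tile):
    """TMA role: explicit tile transport (modeled as a copy)."""
    return [row[:] for row in tile]

def tau_blend(dst, src):
    """TAU role: hardened tile-to-tile work (modeled as a 50/50 blend)."""
    return [[(d + s) / 2 for d, s in zip(dr, sr)] for dr, sr in zip(dst, src)]

base = vec_shade(4, 4, lambda x, y: float(x + y))       # programmable stage on VEC
overlay = tma_transport(vec_shade(4, 4, lambda x, y: 1.0))
frame = tau_blend(base, overlay)                        # hardened tile stage on TAU
```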
## LinxCore Composition Rule

The workload-to-engine model is implemented through the LinxCore block-ordered
microarchitecture rather than through a separate, incompatible GPU machine.

Canonical v0.4 composition is:

- `BCC` and the block fabric orchestrate heterogeneous work submission,
- `VEC`, `TMA`, `CUBE`, and `TAU` provide the current engine set for programmable compute, data movement, matrix acceleration, and tile-oriented hardening,
- execution may overlap across engines, but architectural visibility and completion remain governed by the existing block/BID model,
- tiles remain the primary explicit handoff medium between heterogeneous blocks and engines.

This means:

- BCC-led software lowering and runtime orchestration remain the front-end control model,
- programmable shader-style and other parallel-loop kernels execute on `VEC` unless an explicitly selected specialized engine is used,
- hardened engine work must still enter through the architectural block stream and remain flush-safe under LinxCore recovery rules,
- the workload model does not imply a separate hidden packet machine or a second retirement model outside LinxCore.
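The block/BID visibility rule can be modeled as a small reorder buffer:
completions may arrive out of order across engines, but retirement becomes
architecturally visible strictly in BID order. This is an illustrative
sketch of the ordering property only, not the LinxCore mechanism itself
(assumed here: BIDs are dense integers starting at 0):

```python
import heapq

def retire_in_bid_order(completions):
    """Buffer out-of-order engine completions; release them in BID order.

    `completions` is an iterable of (bid, engine) pairs in arrival order.
    A BID becomes visible only once every earlier BID has completed.
    """
    pending, retired, next_bid = [], [], 0
    for bid, engine in completions:
        heapq.heappush(pending, (bid, engine))
        while pending and pending[0][0] == next_bid:
            retired.append(heapq.heappop(pending))
            next_bid += 1
    return retired

# CUBE work for BID 2 finishes before VEC work for BID 0; visibility is
# still 0, 1, 2.
order = retire_in_bid_order([(2, "CUBE"), (0, "VEC"), (1, "TMA")])
```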
## Fallback Rule

The workload-engine model preserves the existing fallback architecture:

- if a specialized engine is unavailable or a workload stage does not have a canonical hardened mapping, the stage must still be expressible through programmable `VEC` execution or an earlier software baseline,
- `CUBE` accelerates selected AI-heavy kernels but does not remove the need for `VEC`,
- `TAU` accelerates selected rendering-oriented tile work but does not remove the need for `VEC`.

This rule is required for bring-up, AVS evidence, and phased hardening.
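The fallback rule reduces to a simple dispatch decision, sketched below. The
stage descriptors and the `run_stage` helper are hypothetical; the point is
only that every stage retains a programmable `VEC` lowering even when its
preferred hardened engine is absent:

```python
def run_stage(stage, available_engines):
    """Prefer a hardened engine mapping when one exists and is available;
    otherwise lower the stage to programmable VEC execution."""
    preferred = stage.get("hardened_engine")
    if preferred in available_engines:
        return preferred
    return "VEC"  # the required programmable fallback path

# Hypothetical stage descriptors for illustration.
matmul = {"name": "matmul", "hardened_engine": "CUBE"}
softmax = {"name": "softmax", "hardened_engine": None}  # no hardened mapping

full = run_stage(matmul, {"CUBE", "TAU"})       # hardened path taken
degraded = run_stage(matmul, {"TAU"})           # CUBE absent -> VEC
general = run_stage(softmax, {"CUBE", "TAU"})   # no mapping -> VEC
```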
## TMA Rule

`TMA` is the explicit tile-movement and staging path shared across workload
classes.

- It is not specific to AI or rendering alone.
- It should be used when tile data movement, layout conversion plumbing, or staging is part of the workload.
- It does not replace programmable compute; it composes with `VEC`, `CUBE`, and `TAU`.
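The composes-rather-than-replaces point can be sketched as an explicit
staging step ahead of programmable compute. Both helpers are hypothetical
illustrations of the TMA and VEC roles, with a transpose standing in for
layout conversion plumbing:

```python
def tma_convert_layout(tile):
    """TMA role sketch: explicit layout conversion (row-major -> column-major
    transpose) staged before a compute engine consumes the tile."""
    rows, cols = len(tile), len(tile[0])
    return [[tile[r][c] for r in range(rows)] for c in range(cols)]

def vec_scale(tile, k):
    """Downstream programmable compute stage on VEC (elementwise scale)."""
    return [[k * v for v in row] for row in tile]

staged = tma_convert_layout([[1, 2, 3], [4, 5, 6]])  # 2x3 -> 3x2
out = vec_scale(staged, 10)                          # TMA composes with VEC
```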
## What This Model Does Not Freeze

This page intentionally does not freeze:

- exact microarchitectural issue-width or lane-count choices,
- final binning or pipeline-stage partitioning for rendering,
- exact AI-operator coverage for `CUBE`,
- future accelerator additions beyond the currently frozen engine roles.

Those remain separate architecture and implementation decisions.
## Relationship To Other Canonical Pages

- top-level workload targets are summarized in `docs/architecture/v0.4-architecture-contract.md`,
- LinxCore execution ordering and recovery are defined in `docs/architecture/linxcore/microarchitecture.md`,
- rendering PTO carrier legality is defined in `docs/architecture/v0.4-rendering-pto-contract.md`,
- rendering fallback and hardening rules are defined in `docs/architecture/v0.4-hardening-policy.md`,
- rendering command lowering is defined in `docs/architecture/v0.4-rendering-command-contract.md`.