
LinxISA v0.4 Workload Engine Model

This document defines the live public workload-to-engine model for canonical v0.4.

It freezes how the architecture should be described across general-purpose computing, AI workloads, and rendering workloads without implying that these are three separate machines. The key rule is that VEC is the general SIMT vector compute path for parallel work across all workload classes.

Scope

  • workload-class to engine mapping,
  • role of VEC, TMA, CUBE, and TAU in the current v0.4 profile,
  • canonical interpretation of VEC for parallel loops,
  • boundaries between general programmable execution and hardened engines.

This page is normative for workload-model descriptions in live architecture docs, tests, guides, and bring-up materials.

Core Model

LinxISA v0.4 is one block-structured architecture serving multiple workload classes:

  • general-purpose computing,
  • AI-oriented workloads,
  • rendering-oriented workloads.

These workloads share one architectural state model, one block model, and one set of execution engines. They are not separate incompatible execution modes.

VEC Rule

VEC is the canonical general SIMT vector compute engine in v0.4.

  • VEC executes general parallel-loop work through the SIMT kernel model.
  • Any workload that can be expressed as parallel loops may run in VEC.
  • This includes, but is not limited to:
      • AI kernels such as softmax and other non-matmul parallel compute stages,
      • shader-style compute in rendering,
      • general data-parallel compute outside graphics or AI-specific pipelines.

In architectural terms, this means:

  • programmable SIMT computation is expressed through canonical vector block forms such as MPAR and MSEQ,
  • VEC is not a rendering-only engine and not an AI-only engine,
  • hardened engines may accelerate specific cases, but they do not replace the role of VEC as the general parallel execution substrate.
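The VEC rule above can be sketched as a simple eligibility predicate: a stage is VEC-eligible exactly when it is expressible as parallel loops, regardless of workload class. This is an illustrative model only; the stage names and the function are hypothetical, not a defined operator set or software API.

```python
# Illustrative model of the VEC rule: any stage expressible as parallel
# loops may run on VEC, across workload classes. Stage names here are
# hypothetical examples, not part of the v0.4 contract.

PARALLEL_LOOP_EXPRESSIBLE = {
    "softmax",          # non-matmul AI kernel
    "shader_compute",   # shader-style compute in rendering
    "elementwise_map",  # general data-parallel work
}

def vec_eligible(stage: str) -> bool:
    """A stage may run on VEC iff it is expressible as parallel loops."""
    return stage in PARALLEL_LOOP_EXPRESSIBLE

# VEC eligibility cuts across AI, rendering, and general compute alike.
assert vec_eligible("softmax")
assert vec_eligible("shader_compute")
assert vec_eligible("elementwise_map")
```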

Engine Roles By Workload Class

General-Purpose Computing

General-purpose computing may use scalar, vector, and tile-capable parts of the machine as needed.

  • scalar and block-structured control remain first-class,
  • VEC is the programmable path for general parallel loops,
  • TMA may be used where explicit tile movement or layout staging is useful,
  • specialized engines are optional accelerators rather than required entry points.

AI-Oriented Workloads

AI-oriented workloads primarily compose:

  • CUBE + VEC + TMA

Role split:

  • CUBE handles the current matrix and accumulator engine path,
  • VEC handles general programmable SIMD/SIMT computation, including non-matmul kernels such as softmax and other loop-based transforms,
  • TMA handles explicit tile-memory movement and staging.

This means AI is not “CUBE-only”. Matrix-heavy work may prefer CUBE, but the broader AI pipeline still depends on VEC for general parallel stages.
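The role split can be illustrated as a stage-to-engine classifier for a toy AI pipeline. The stage names and the function are hypothetical examples used only to show that a matrix-heavy pipeline still routes its general parallel stages to VEC; they are not a frozen operator list.

```python
# Hypothetical illustration of the AI role split in v0.4:
# CUBE for matrix/accumulator work, TMA for explicit tile movement,
# VEC for all general programmable parallel stages.

def ai_stage_engine(stage: str) -> str:
    if stage in {"matmul", "accumulate"}:      # matrix/accumulator engine path
        return "CUBE"
    if stage in {"tile_load", "tile_store"}:   # explicit tile movement and staging
        return "TMA"
    return "VEC"  # softmax and other loop-based transforms stay on VEC

# A matmul-centred pipeline still depends on VEC for its non-matmul stage.
pipeline = ["tile_load", "matmul", "softmax", "tile_store"]
assert [ai_stage_engine(s) for s in pipeline] == ["TMA", "CUBE", "VEC", "TMA"]
```

The point of the sketch is the last assertion: even when CUBE carries the heavy matrix work, the pipeline is not "CUBE-only".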

Rendering-Oriented Workloads

Rendering-oriented workloads primarily compose:

  • VEC + TMA + TAU

Role split:

  • VEC handles programmable shader-style computation and the required fallback path,
  • TMA handles explicit tile transport and memory staging,
  • TAU hosts tile-to-tile hardened rendering-oriented work under the current rendering profile.

Rendering is therefore not “TAU-only”. Shaders and other programmable parallel stages remain VEC work in the canonical model.
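The three workload-class compositions above can be collected into one table. The dictionary below only encodes what the prose already states; the structure itself is illustrative, not a defined interface.

```python
# Primary engine composition per workload class, as described in this
# document. The dict is an illustrative restatement of the prose only.

PRIMARY_ENGINES = {
    "general_purpose": {"scalar", "VEC", "TMA"},  # specialized engines optional
    "ai":              {"CUBE", "VEC", "TMA"},
    "rendering":       {"VEC", "TMA", "TAU"},
}

# VEC is the shared general parallel compute substrate in every class.
assert all("VEC" in engines for engines in PRIMARY_ENGINES.values())
# TMA is the shared explicit tile-movement and staging path in every class.
assert all("TMA" in engines for engines in PRIMARY_ENGINES.values())
```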

LinxCore Composition Rule

The workload-to-engine model is implemented through the LinxCore block-ordered microarchitecture rather than through a separate incompatible GPU machine.

Canonical v0.4 composition is:

  • BCC and the block fabric orchestrate heterogeneous work submission,
  • VEC, TMA, CUBE, and TAU provide the current engine set for programmable compute, data movement, matrix acceleration, and tile-oriented hardening,
  • execution may overlap across engines, but architectural visibility and completion remain governed by the existing block/BID model,
  • tiles remain the primary explicit handoff medium between heterogeneous blocks and engines.

This means:

  • BCC-led software lowering and runtime orchestration remain the front-end control model,
  • programmable shader-style and other parallel-loop kernels execute on VEC unless an explicitly selected specialized engine is used,
  • hardened engine work must still enter through the architectural block stream and remain flush-safe under LinxCore recovery rules,
  • the workload model does not imply a separate hidden packet machine or a second retirement model outside LinxCore.
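The overlap-with-ordered-visibility point can be modeled with a small sketch: engines may finish out of order, but architectural completion follows the block (BID) order. The event format and function below are hypothetical; the real ordering and recovery rules are defined by the LinxCore block/BID model, not by this code.

```python
# Illustrative model only: engines finish in arbitrary order, but
# architectural retirement is reported strictly in BID order.
import heapq

def retire_in_bid_order(finish_events):
    """finish_events: (bid, engine) pairs in the order engines finish."""
    retired, pending, next_bid = [], [], 0
    for bid, engine in finish_events:
        heapq.heappush(pending, (bid, engine))
        # Drain every block that is now the oldest unretired BID.
        while pending and pending[0][0] == next_bid:
            retired.append(heapq.heappop(pending))
            next_bid += 1
    return retired

# CUBE finishes BID 1 before VEC finishes BID 0; retirement still
# becomes architecturally visible in BID order.
events = [(1, "CUBE"), (0, "VEC"), (2, "TMA")]
assert retire_in_bid_order(events) == [(0, "VEC"), (1, "CUBE"), (2, "TMA")]
```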

Fallback Rule

The workload-engine model preserves the existing fallback architecture:

  • if a specialized engine is unavailable or a workload stage does not have a canonical hardened mapping, the stage must still be expressible through programmable VEC execution or an earlier software baseline,
  • CUBE accelerates selected AI-heavy kernels but does not remove the need for VEC,
  • TAU accelerates selected rendering-oriented tile work but does not remove the need for VEC.

This rule is required for bring-up, AVS evidence, and phased hardening.
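The fallback rule reduces to a simple selection policy: prefer the hardened engine when one is available and the stage has a canonical hardened mapping, otherwise lower to programmable VEC execution. The mapping table, the stage name "tile_blend", and the function below are hypothetical illustrations; the document does not define a dispatch API.

```python
# Hypothetical sketch of the v0.4 fallback rule. A stage runs on its
# hardened engine only when that engine is present and the stage has a
# canonical hardened mapping; otherwise it must remain expressible on VEC.

HARDENED_MAPPING = {
    "matmul": "CUBE",      # selected AI-heavy kernel
    "tile_blend": "TAU",   # hypothetical rendering-oriented tile stage
}

def select_engine(stage: str, available: set) -> str:
    engine = HARDENED_MAPPING.get(stage)
    if engine in available:
        return engine
    return "VEC"  # programmable fallback path is always required to exist

assert select_engine("matmul", {"CUBE", "VEC"}) == "CUBE"
assert select_engine("matmul", {"VEC"}) == "VEC"            # CUBE unavailable
assert select_engine("softmax", {"CUBE", "VEC"}) == "VEC"   # no hardened mapping
```

This is why CUBE and TAU accelerate selected kernels without ever removing the need for VEC: the fallback branch must always be reachable for bring-up and phased hardening.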

TMA Rule

TMA is the explicit tile-movement and staging path shared across workload classes.

  • It is not specific to AI or rendering alone.
  • It should be used when tile data movement, layout conversion, or staging is part of the workload.
  • It does not replace programmable compute; it composes with VEC, CUBE, and TAU.

What This Model Does Not Freeze

This page intentionally does not freeze:

  • exact microarchitectural issue width or lane count choices,
  • final binning or pipeline-stage partitioning for rendering,
  • exact AI-operator coverage for CUBE,
  • future accelerator additions beyond the currently frozen engine roles.

Those remain separate architecture and implementation decisions.

Relationship To Other Canonical Pages

  • top-level workload targets are summarized in docs/architecture/v0.4-architecture-contract.md,
  • LinxCore execution ordering and recovery are defined in docs/architecture/linxcore/microarchitecture.md,
  • rendering PTO carrier legality is defined in docs/architecture/v0.4-rendering-pto-contract.md,
  • rendering fallback and hardening rules are defined in docs/architecture/v0.4-hardening-policy.md,
  • rendering command lowering is defined in docs/architecture/v0.4-rendering-command-contract.md.