// 02 Architecture
Decoder-only. Financial corpus. Domain-native.
LUALDI.AI is a decoder-only transformer purpose-built on a financial corpus: not a fine-tune, not an adapter layer. The architecture is clean, auditable, and follows the same structural principles as GPT-2.
// ARCHITECTURE
Decoder-only transformer
Causal self-attention, pre-norm transformer blocks, and tied input/output embeddings, following the GPT-2 design. Written in clean, modern PyTorch and auditable at every layer.
GPT-2 Style · Causal Attention · PyTorch
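To make "causal self-attention" concrete, here is a minimal single-head sketch in plain Python. It is illustrative only and not the LUALDI.AI implementation: the function name `causal_attention` is hypothetical, and a production model would use batched PyTorch tensors, multiple heads, and learned projections.

```python
import math

def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of equal-length float vectors, one per position.
    Returns (outputs, weights); weights[i][j] is how much position i
    attends to position j, and is 0.0 whenever j > i.
    """
    d = len(q[0])
    n = len(q)
    outputs, weights = [], []
    for i in range(n):
        # Causal mask: position i only sees positions 0..i, never the future.
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the visible positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w + [0.0] * (n - i - 1))
        # Weighted sum of the visible value vectors.
        outputs.append(
            [sum(w[j] * v[j][c] for j in range(i + 1)) for c in range(len(v[0]))]
        )
    return outputs, weights

# Toy input: three positions, two dimensions each.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(x, x, x)
```

Because the first position can only attend to itself, its output is simply its own value vector. That property is what lets a decoder-only model be trained to predict the next token at every position in one pass.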
// SCALE
Scalable architecture
One architecture, every size. The same design scales the way transformers do: more layers, wider embeddings, longer context. It runs from a single-GPU base model up to enterprise-grade configurations. One model family, built to grow.
Scalable · Base to Enterprise · Decoder-Only
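A rough way to see how size follows from depth, width, and context is to count parameters from a config. The sketch below assumes GPT-2 conventions (learned position embeddings, a 4x MLP expansion, biases on all linear layers, tied output head); `ModelConfig` and `count_parameters` are hypothetical names, and the numbers used are the public GPT-2 small settings, not LUALDI.AI model sizes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    vocab_size: int
    context_length: int
    n_layer: int
    d_model: int

def count_parameters(cfg: ModelConfig) -> int:
    """Parameter count for a GPT-2 style decoder with tied embeddings."""
    d = cfg.d_model
    # Token embedding plus learned position embedding.
    embeddings = cfg.vocab_size * d + cfg.context_length * d
    # Fused QKV projection and output projection, each with bias.
    attention = (3 * d * d + 3 * d) + (d * d + d)
    # Two-layer MLP with a 4x hidden expansion, each layer with bias.
    mlp = (4 * d * d + 4 * d) + (4 * d * d + d)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d)
    per_block = attention + mlp + layer_norms
    final_norm = 2 * d
    # The output head reuses the token embedding, so it adds nothing.
    return embeddings + cfg.n_layer * per_block + final_norm

# Illustrative only: public GPT-2 small settings.
base = ModelConfig(vocab_size=50257, context_length=1024, n_layer=12, d_model=768)
# Same design, deeper and wider.
larger = ModelConfig(vocab_size=50257, context_length=1024, n_layer=24, d_model=1024)
print(count_parameters(base), count_parameters(larger))
```

The per-block cost grows with the square of `d_model`, so widening the model adds parameters faster than deepening it, while a longer context only grows the position embedding table.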
// TOKENIZATION
Byte-level BPE tokenizer
Byte-level BPE tokenization in the GPT-2 lineage: any financial or regulatory term, however rare, is composed from subword units that fall back to raw bytes, so nothing falls out of vocabulary. Tied input/output embeddings share one matrix, which keeps the parameter count down.
Byte-Level BPE · Subword · Efficient
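The "nothing falls out of vocabulary" property comes from starting at the 256 possible byte values and only then learning merges. Here is a toy sketch of that idea, assuming a simplified scheme with no regex pre-tokenization (which GPT-2's tokenizer does use); the function names and the sample text are hypothetical and this is not the production tokenizer.

```python
from collections import Counter

def merge(ids, pair, new_id):
    """Replace every adjacent occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges, starting from raw UTF-8 bytes."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id, in learning order
    for new_id in range(256, 256 + num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # nothing left worth merging
            break
        merges[pair] = new_id
        ids = merge(ids, pair, new_id)
    return merges

def encode(text, merges):
    """Bytes first, then apply the learned merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids

def decode(ids, merges):
    """Expand every token back to its bytes and decode as UTF-8."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8")

sample = "net interest margin, net interest income, net interest spread"
merges = train_bpe(sample, num_merges=20)
# A term never seen in training still encodes and round-trips.
rare = "rehypothecation €STR"
print(decode(encode(rare, merges), merges) == rare)
```

Frequent strings compress into a few tokens, while an unseen term simply uses more, smaller pieces, down to single bytes if necessary.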
// TRAINING
Modern, reproducible training stack
Trained with AdamW, a cosine learning-rate schedule with linear warmup, mixed-precision compute, and gradient accumulation — a modern, well-understood stack. Stable, reproducible, and entirely under in-house control.
AdamW · Cosine LR · Mixed Precision
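The learning-rate schedule named above can be written in a few lines. This is a minimal sketch of linear warmup followed by cosine decay; the function name `lr_at` and all numeric values are hypothetical placeholders, not LUALDI.AI's training settings.

```python
import math

def lr_at(step, max_lr, min_lr, warmup_steps, total_steps):
    """Learning rate at `step`: linear warmup, then cosine decay to `min_lr`.

    Assumes 0 < warmup_steps < total_steps.
    """
    if step < warmup_steps:
        # Linear ramp that reaches max_lr on the last warmup step.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        # Past the schedule: hold at the floor.
        return min_lr
    # Fraction of the decay phase completed, in [0, 1).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    # Cosine coefficient goes smoothly from 1 down to 0.
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + coeff * (max_lr - min_lr)

# Illustrative values only.
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, max_lr=6e-4, min_lr=6e-5, warmup_steps=100, total_steps=1000))
```

In a training loop, the returned value would typically be written into the optimizer's parameter groups before each update; with gradient accumulation, `step` counts optimizer updates rather than micro-batches.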
// CORPUS
Curated finance-domain corpus
Trained on a curated finance-domain corpus rather than a general web crawl — producing a model that has internalised the vocabulary, cadence, and register of financial and corporate writing from the ground up.
Curated · Finance Domain · In-House
// VALIDATION
Held-out validation, checkpointed
Every training run is measured against a held-out validation split — train and validation loss tracked at regular intervals, with the strongest checkpoint kept by validation loss. Progress is measured, not assumed.
Held-Out Split · Loss Tracking · Best Checkpoint
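The "keep the strongest checkpoint by validation loss" rule can be sketched as a small tracker. The class name `BestCheckpoint` and the loss values below are hypothetical, made up purely to show the control flow; in a PyTorch training loop the point marked in the comment is where the model state would be saved.

```python
import math
from dataclasses import dataclass, field

@dataclass
class BestCheckpoint:
    """Tracks train/validation loss and remembers the best validation step."""
    best_val_loss: float = math.inf
    best_step: int = -1
    history: list = field(default_factory=list)

    def update(self, step, train_loss, val_loss):
        """Record one evaluation; return True if this is a new best."""
        self.history.append((step, train_loss, val_loss))
        improved = val_loss < self.best_val_loss
        if improved:
            self.best_val_loss = val_loss
            self.best_step = step
            # A real run would write the checkpoint to disk here.
        return improved

tracker = BestCheckpoint()
# Made-up losses: training loss keeps falling, validation loss does not always.
evaluations = [(100, 3.2, 3.4), (200, 2.9, 3.1), (300, 2.6, 3.3), (400, 2.4, 3.0)]
for step, train_loss, val_loss in evaluations:
    if tracker.update(step, train_loss, val_loss):
        print(f"step {step}: new best validation loss {val_loss}")
```

Selecting on validation loss rather than training loss means a step where the model starts to overfit (training loss down, validation loss up) never replaces the stored checkpoint.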