Neural Network Layers
160+ layer types: every major neural network architecture, from classic to cutting-edge, in pure C#
AiDotNet provides the most comprehensive collection of neural network layer types available in any .NET library. Build any architecture from individual layers or compose pre-built models. Every layer supports automatic differentiation, GPU acceleration, SIMD vectorization, and mixed-precision training.
Convolutional Layers
Standard and advanced convolutional layers for spatial feature extraction in images, video, and 1D signals.
Conv1D
One-dimensional convolution for sequence and time-series data processing.
Conv2D
Standard 2D convolution with configurable kernel size, stride, padding, and dilation.
Conv3D
Three-dimensional convolution for volumetric data and video processing.
DepthwiseSeparableConv
Factored convolution (MobileNet-style) that reduces parameters and compute.
DilatedConv
Atrous convolution for capturing larger receptive fields without pooling.
DeformableConv
Convolution with learnable offsets for geometric transformations (DCNv2).
GroupConv
Grouped convolution for channel-wise feature separation (ResNeXt-style).
TransposedConv
Learnable upsampling via transposed convolution for decoders and generators.
CausalConv
Causal convolution for autoregressive models like WaveNet.
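For all of the convolution variants above, the output spatial size follows the same arithmetic in kernel size, stride, padding, and dilation. A minimal plain-C# sketch of that formula (the `ConvOutSize` helper is illustrative, not part of the AiDotNet API):

```csharp
using System;

class ConvMath
{
    // Output size for one spatial dimension of a convolution:
    // out = floor((in + 2*pad - dilation*(kernel-1) - 1) / stride) + 1
    public static int ConvOutSize(int input, int kernel, int stride, int padding, int dilation)
        => (input + 2 * padding - dilation * (kernel - 1) - 1) / stride + 1;

    static void Main()
    {
        // A 3x3 kernel with stride 1 and padding 1 preserves spatial size.
        Console.WriteLine(ConvOutSize(224, 3, 1, 1, 1)); // 224
        // Stride 2 roughly halves the spatial dimension.
        Console.WriteLine(ConvOutSize(224, 3, 2, 1, 1)); // 112
        // Dilation 2 widens a 3x3 kernel's receptive field to 5x5;
        // padding 2 keeps the output size unchanged.
        Console.WriteLine(ConvOutSize(224, 3, 1, 2, 2)); // 224
    }
}
```

The same formula explains why DilatedConv grows the receptive field without shrinking the output, and why TransposedConv inverts it for upsampling.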
Recurrent Layers
Sequence modeling layers with memory for natural language, time series, and sequential data.
LSTM
Long Short-Term Memory with forget, input, and output gates for long-range dependencies.
GRU
Gated Recurrent Unit with reset and update gates for efficient sequence modeling.
BiLSTM
Bidirectional LSTM processing sequences in both forward and backward directions.
BiGRU
Bidirectional GRU for context-aware sequence encoding.
IndRNN
Independently Recurrent Neural Network for stable long-range learning.
QRNN
Quasi-Recurrent Neural Network combining CNN and RNN for parallelizable sequence modeling.
SRU
Simple Recurrent Unit with highway connections for fast training.
FastGRNN
Fast, Accurate, Stable and Tiny gated RNN optimized for edge devices.
LiGRU
Light GRU with fewer parameters and competitive performance.
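The gated recurrent layers above all share the same core mechanism: gates that decide how much of the previous hidden state to keep versus overwrite. A single-unit GRU step in plain C# makes the idea concrete (scalar weights for clarity; this is a didactic sketch, not the AiDotNet `GRU` implementation):

```csharp
using System;

class GruCell
{
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // One GRU time step for a single unit:
    //   z = sigmoid(Wz*x + Uz*h)        update gate
    //   r = sigmoid(Wr*x + Ur*h)        reset gate
    //   hTilde = tanh(Wh*x + Uh*(r*h))  candidate state
    //   h' = (1 - z)*h + z*hTilde       gated blend of old and new
    public static double Step(double x, double h,
        double wz, double uz, double wr, double ur, double wh, double uh)
    {
        double z = Sigmoid(wz * x + uz * h);
        double r = Sigmoid(wr * x + ur * h);
        double hTilde = Math.Tanh(wh * x + uh * (r * h));
        return (1 - z) * h + z * hTilde;
    }

    static void Main()
    {
        double h = 0.0;
        // Feed a short sequence through the cell; h carries memory forward.
        foreach (var x in new[] { 1.0, 0.5, -0.3 })
            h = Step(x, h, 0.8, 0.2, 0.5, 0.1, 1.0, 0.9);
        Console.WriteLine(h); // hidden state after three inputs, bounded in (-1, 1)
    }
}
```

LSTM adds a separate cell state and a third gate; the bidirectional variants run two such cells over the sequence in opposite directions and concatenate their states.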
Attention & Transformer Layers
Self-attention, cross-attention, and transformer building blocks for state-of-the-art architectures.
MultiHeadAttention
Standard multi-head scaled dot-product attention (Vaswani et al.).
SelfAttention
Single-head self-attention for intra-sequence relationships.
CrossAttention
Attention between two different sequences (encoder-decoder).
FlashAttention
IO-aware exact attention that tiles the computation to cut memory from quadratic to linear in sequence length, with higher throughput.
LinearAttention
Linearized attention with O(N) complexity via kernel feature maps.
SlidingWindowAttention
Local attention with fixed window size for long sequences (Longformer-style).
GroupedQueryAttention
GQA with shared key-value heads for efficient inference (LLaMA 2).
MultiQueryAttention
MQA with single key-value head for maximum inference speed.
RotaryEmbedding
Rotary Position Embeddings (RoPE) for relative position encoding.
ALiBi
Attention with Linear Biases for length extrapolation without position embeddings.
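Every attention variant in this family is built on the same scaled dot-product core: score each key against the query, softmax the scores, and take a weighted average of the values. A self-contained C# sketch for a single query (illustrative only; AiDotNet's layers operate on batched tensors):

```csharp
using System;
using System.Linq;

class Attention
{
    // Scaled dot-product attention for one query over a set of keys/values:
    //   weights = softmax(q . k_i / sqrt(d)),  output = sum_i weights_i * v_i
    public static double[] Attend(double[] q, double[][] keys, double[][] values)
    {
        double scale = Math.Sqrt(q.Length);
        // Raw scores: query-key dot products, scaled to stabilize the softmax.
        var scores = keys.Select(k => q.Zip(k, (a, b) => a * b).Sum() / scale).ToArray();
        // Numerically stable softmax over the scores.
        double max = scores.Max();
        var exp = scores.Select(s => Math.Exp(s - max)).ToArray();
        double sum = exp.Sum();
        var weights = exp.Select(e => e / sum).ToArray();
        // Output: attention-weighted average of the value vectors.
        var output = new double[values[0].Length];
        for (int i = 0; i < values.Length; i++)
            for (int j = 0; j < output.Length; j++)
                output[j] += weights[i] * values[i][j];
        return output;
    }

    static void Main()
    {
        var q = new[] { 1.0, 0.0 };
        var keys = new[] { new[] { 1.0, 0.0 }, new[] { 0.0, 1.0 } };
        var values = new[] { new[] { 10.0, 0.0 }, new[] { 0.0, 10.0 } };
        // The query aligns with the first key, so the output leans
        // toward the first value vector.
        Console.WriteLine(string.Join(", ", Attend(q, keys, values)));
    }
}
```

Multi-head attention runs this in several projected subspaces in parallel; GQA and MQA reduce cost by sharing key/value projections across query heads; FlashAttention computes the same result with a tiled, memory-efficient schedule.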
Normalization Layers
Normalization techniques for stable and faster training across architectures.
BatchNorm
Batch normalization for mini-batch statistics normalization.
LayerNorm
Layer normalization used in transformers and RNNs.
GroupNorm
Group normalization for small batch sizes and detection tasks.
InstanceNorm
Instance normalization for style transfer and image generation.
RMSNorm
Root Mean Square normalization used in LLaMA and modern LLMs.
AdaptiveNorm
Adaptive normalization with learned scale/shift for conditional generation.
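RMSNorm is the simplest of these to state: it rescales by the root mean square of the activations, skipping LayerNorm's mean subtraction. A plain-C# sketch of the math (the `Normalize` helper is illustrative, not the AiDotNet API):

```csharp
using System;
using System.Linq;

class RmsNorm
{
    // RMSNorm: y_i = x_i / sqrt(mean(x^2) + eps) * gain_i.
    // No mean subtraction, so it is cheaper than LayerNorm and
    // works well in deep transformer stacks (e.g. LLaMA).
    public static double[] Normalize(double[] x, double[] gain, double eps = 1e-6)
    {
        double rms = Math.Sqrt(x.Sum(v => v * v) / x.Length + eps);
        return x.Select((v, i) => v / rms * gain[i]).ToArray();
    }

    static void Main()
    {
        var x = new[] { 3.0, 4.0 };   // RMS = sqrt((9 + 16) / 2) ~ 3.536
        var g = new[] { 1.0, 1.0 };   // identity gain
        // After normalization the vector has RMS ~ 1.
        Console.WriteLine(string.Join(", ", Normalize(x, g)));
    }
}
```

BatchNorm, LayerNorm, GroupNorm, and InstanceNorm differ mainly in which axes the statistics are computed over (batch, feature, channel group, or per-instance channel).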
Graph Neural Network Layers
Layers for learning on graph-structured data: molecules, social networks, point clouds.
GCN
Graph Convolutional Network layer with spectral-based convolutions.
GAT
Graph Attention Network with attention-weighted neighbor aggregation.
GraphSAGE
Inductive representation learning on large graphs with neighbor sampling.
GIN
Graph Isomorphism Network with discriminative power matching the Weisfeiler-Lehman test.
EdgeConv
Edge convolution for dynamic graph construction (DGCNN).
PointNet
Direct point cloud processing without graph construction.
SchNet
Continuous-filter convolutional network for molecular modeling.
EGNN
E(n) Equivariant Graph Neural Network for 3D geometric data.
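All of these graph layers follow a message-passing pattern: each node aggregates features from its neighborhood, then transforms the result. A simplified sketch in plain C# using mean aggregation with self-loops (the original GCN uses symmetric normalization D^-1/2 A D^-1/2; this helper is didactic, not the AiDotNet implementation):

```csharp
using System;
using System.Linq;

class GcnLayer
{
    // One propagation step: h'_v = mean of features over {v} union N(v).
    // A learned weight matrix and nonlinearity would follow in a real layer.
    public static double[][] Propagate(double[][] features, int[][] neighbors)
    {
        int n = features.Length, d = features[0].Length;
        var output = new double[n][];
        for (int v = 0; v < n; v++)
        {
            // Aggregate the node itself plus its neighbors (self-loop).
            var group = neighbors[v].Append(v).ToArray();
            output[v] = new double[d];
            foreach (int u in group)
                for (int j = 0; j < d; j++)
                    output[v][j] += features[u][j] / group.Length;
        }
        return output;
    }

    static void Main()
    {
        // Triangle graph: every node neighbors the other two.
        var features = new[] { new[] { 1.0 }, new[] { 2.0 }, new[] { 3.0 } };
        var neighbors = new[] { new[] { 1, 2 }, new[] { 0, 2 }, new[] { 0, 1 } };
        foreach (var h in Propagate(features, neighbors))
            Console.WriteLine(h[0]); // all 2.0: each node averages {1, 2, 3}
    }
}
```

GAT replaces the uniform mean with learned attention weights over neighbors; GraphSAGE samples a fixed-size neighborhood so the step scales to large graphs.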
Generative Layers
Building blocks for VAEs, GANs, normalizing flows, and other generative models.
VAE Encoder/Decoder
Variational autoencoder with reparameterization trick for latent generation.
VQ-VAE
Vector Quantized VAE with discrete latent representations.
GAN Generator/Discriminator
Adversarial training framework building blocks.
WGAN Critic
Wasserstein GAN critic with gradient penalty for stable training.
StyleGAN Layers
Style modulation, noise injection, and progressive growing layers.
Normalizing Flow
Invertible transformations for exact density estimation.
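The VAE encoder's reparameterization trick deserves a concrete illustration: rather than sampling z ~ N(mu, sigma^2) directly, which would block gradients at the sampling step, it samples noise from N(0, 1) and shifts/scales it so gradients flow through mu and sigma. A standalone C# sketch (illustrative, independent of the AiDotNet layer):

```csharp
using System;

class Reparameterize
{
    // z = mu + sigma * eps, with eps ~ N(0, 1).
    // mu and sigma stay inside a differentiable expression, so
    // backpropagation can reach the encoder that produced them.
    public static double Sample(double mu, double logVar, Random rng)
    {
        double sigma = Math.Exp(0.5 * logVar);
        // Box-Muller transform: turn two uniforms into a standard normal draw.
        double u1 = 1.0 - rng.NextDouble();
        double u2 = rng.NextDouble();
        double eps = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
        return mu + sigma * eps;
    }

    static void Main()
    {
        var rng = new Random(42);
        double sum = 0;
        const int n = 100_000;
        for (int i = 0; i < n; i++)
            sum += Sample(3.0, 0.0, rng); // logVar = 0 means sigma = 1
        Console.WriteLine(sum / n);       // empirical mean close to mu = 3.0
    }
}
```

Normalizing flows take the complementary route: they make the whole sampling map invertible so the exact density, not just a bound, can be evaluated.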
Train a neural network in C#
using AiDotNet;
// Train a neural network with the AiModelBuilder facade
var result = await new AiModelBuilder<float, float[], float>()
.ConfigureModel(new NeuralNetwork<float>(
inputSize: 784, hiddenSize: 128, outputSize: 10))
.ConfigureOptimizer(new AdamWOptimizer<float>(lr: 1e-3))
.ConfigurePreprocessing()
.BuildAsync(features, labels);
// Inference
var prediction = result.Predict(newSample);

Start building with Neural Network Layers
All 160+ implementations are included free under Apache 2.0.