Neural Architectures

RTnn supports multiple neural network architectures for radiative transfer modeling.

Recurrent Neural Networks (RNN)

LSTM (Long Short-Term Memory)

Bidirectional LSTM with final Conv1d projection.

from rtnn import RNN_LSTM

model = RNN_LSTM(
    feature_channel=6,    # Input features
    output_channel=4,     # Output channels
    hidden_size=128,      # Hidden state size
    num_layers=3          # Number of LSTM layers
)

Architecture:

- Bidirectional LSTM: captures forward and backward dependencies
- Conv1d output: projects hidden states to output channels
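
A minimal forward-pass sketch; the (batch, channels, seq_length) input layout is an assumption borrowed from the FCN description below, and the model is assumed to be a standard PyTorch module:

import torch
from rtnn import RNN_LSTM

model = RNN_LSTM(feature_channel=6, output_channel=4, hidden_size=128, num_layers=3)

# Assumed input layout: (batch, feature_channel, seq_length)
x = torch.randn(32, 6, 10)
y = model(x)              # expected shape: (32, 4, 10)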

GRU (Gated Recurrent Unit)

Similar to the LSTM but with fewer parameters, since the GRU uses fewer gates and merges the cell and hidden states.

from rtnn import RNN_GRU

model = RNN_GRU(
    feature_channel=6,
    output_channel=4,
    hidden_size=128,
    num_layers=3
)
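
To make the parameter saving concrete, the two models can be compared directly. A quick sketch, assuming both are standard PyTorch modules; exact counts depend on the internal layer definitions:

from rtnn import RNN_LSTM, RNN_GRU

lstm = RNN_LSTM(feature_channel=6, output_channel=4, hidden_size=128, num_layers=3)
gru = RNN_GRU(feature_channel=6, output_channel=4, hidden_size=128, num_layers=3)

# A GRU cell has three gates versus the LSTM's four, so roughly a quarter
# fewer recurrent parameters are expected at the same hidden size.
print(sum(p.numel() for p in lstm.parameters()))
print(sum(p.numel() for p in gru.parameters()))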

Transformer

Self-attention-based encoder for sequence processing.

from rtnn import TransformerEncoder

model = TransformerEncoder(
    feature_channel=6,
    output_channel=4,
    embed_size=64,        # Embedding dimension
    num_layers=2,         # Number of transformer blocks
    heads=4,              # Attention heads
    forward_expansion=4,  # Feed-forward expansion factor
    seq_length=10,        # Input sequence length
    dropout=0.1           # Dropout rate
)

Features:

- Positional embeddings
- Multi-head self-attention
- Residual connections
- Layer normalization
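
Because the positional embeddings are sized at construction, inputs must match seq_length. A forward-pass sketch; the (batch, channels, seq_length) layout is an assumption, mirroring the FCN description below:

import torch
from rtnn import TransformerEncoder

model = TransformerEncoder(
    feature_channel=6, output_channel=4, embed_size=64,
    num_layers=2, heads=4, forward_expansion=4,
    seq_length=10, dropout=0.1
)

# The sequence dimension must equal seq_length=10 because the positional
# embeddings are fixed at construction.
x = torch.randn(32, 6, 10)
y = model(x)              # expected shape: (32, 4, 10)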

FCN (Fully Connected Network)

Deep fully connected network with batch normalization.

from rtnn import FCN

model = FCN(
    feature_channel=6,
    output_channel=4,
    num_layers=3,         # Number of hidden layers
    hidden_size=196,      # Hidden layer size
    seq_length=10,        # Input sequence length
    dim_expand=0          # Optional sequence expansion
)

Architecture:

- Flattens input: (batch, channels, seq) → (batch, channels * seq)
- FCBlock: Linear → BatchNorm → ReLU
- Optional sequence length expansion
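
Since the input is flattened, the sequence length is fixed at construction time. A usage sketch; the output shape is an assumption based on the flatten/expand description above:

import torch
from rtnn import FCN

model = FCN(feature_channel=6, output_channel=4, num_layers=3,
            hidden_size=196, seq_length=10, dim_expand=0)

# Internally (32, 6, 10) is flattened to (32, 60) before the FCBlock stack;
# with dim_expand=0 the output sequence length is assumed to stay at 10.
x = torch.randn(32, 6, 10)
y = model(x)              # expected shape: (32, 4, 10)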

Model Comparison

Architecture   Parameters   Best For                  Pros/Cons
LSTM/GRU       Moderate     Temporal dependencies     Good for sequences, can be slow
Transformer    Large        Long-range dependencies   Parallel processing, memory intensive
FCN            Moderate     Simple relationships      Fast, no temporal modeling

Choosing an Architecture

  1. LSTM/GRU: Default choice for temporal sequences

  2. Transformer: For long sequences or when attention is important

  3. FCN: For simple regression tasks without temporal structure
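
In practice the choice can be wrapped in a small factory so experiments only switch a single string. This is an illustrative sketch; build_model and the hyperparameter values below are examples, not part of RTnn:

from rtnn import RNN_LSTM, RNN_GRU, TransformerEncoder, FCN

def build_model(arch, feature_channel=6, output_channel=4, seq_length=10):
    # Hypothetical convenience wrapper; all hyperparameters are example values.
    if arch == "lstm":
        return RNN_LSTM(feature_channel=feature_channel, output_channel=output_channel,
                        hidden_size=128, num_layers=3)
    if arch == "gru":
        return RNN_GRU(feature_channel=feature_channel, output_channel=output_channel,
                       hidden_size=128, num_layers=3)
    if arch == "transformer":
        return TransformerEncoder(feature_channel=feature_channel, output_channel=output_channel,
                                  embed_size=64, num_layers=2, heads=4,
                                  forward_expansion=4, seq_length=seq_length, dropout=0.1)
    if arch == "fcn":
        return FCN(feature_channel=feature_channel, output_channel=output_channel,
                   num_layers=3, hidden_size=196, seq_length=seq_length, dim_expand=0)
    raise ValueError(f"unknown architecture: {arch}")

model = build_model("lstm")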

See Training Strategy for hyperparameter recommendations.