Pre-Push Workflow

This section outlines the standardized workflow to follow before pushing changes to the RTnn repository. Adhering to this workflow ensures that your contributions are clean, tested, and compatible with the latest codebase, facilitating smooth collaboration and maintaining code quality.

Why This Workflow Matters

A disciplined pre-push workflow ensures:

  • Reproducibility of radiative transfer experiments

  • Model versioning and experiment tracking

  • Code quality for both ML and data processing components

  • Smooth collaboration between researchers with different expertise

Prerequisites

Before starting, ensure you have:

  • A clean working directory (or stashed changes)

  • Access to the RTnn repository

  • Required tools installed:

    uv --version          # Fast Python package manager
    pre-commit --version  # Git hooks for code quality
    git --version         # Version control
    python --version      # Python 3.9+ required
    

1. Fetch Latest Changes From Remote

Always begin by updating your local knowledge of the remote repository without modifying your working files:

git fetch origin

This command: - Downloads new data from remote branches - Updates origin/* references - Does not merge or rebase your working files

Why this matters for RTnn: Multiple researchers may be working on different model architectures (LSTM, GRU, Transformer, FCN), data preprocessing pipelines, or evaluation metrics simultaneously.

2. Check Branch Status

Examine your current branch status:

git status

Interpretation Guide:

Status Message

Required Action

“Your branch is up to date”

Proceed to step 3

“Your branch is ahead”

Your changes are ready to push

“Your branch is behind”

MUST update before pushing (see step 3)

“Changes not staged”

Stage your changes with git add

“Unmerged paths”

You have unresolved conflicts

If you see:

Your branch is behind 'origin/master' by X commits

You must update your branch before pushing to avoid integration issues.

3. Rebase Onto Latest Remote Branch

Rebase your feature branch onto the latest version of the base branch:

git pull --rebase origin master

(Replace master with your base branch, e.g., develop or feature/*.)

Why rebase instead of merge?

Approach

Result

Use Case

Merge

Creates merge commits, more complex history

When preserving experiment history for papers

Rebase

Linear, clean history

For feature branches before PR

For RTnn development, rebase is preferred for feature branches to maintain a readable project history, especially when tracking model iterations.

Conflict Resolution Guide

Conflicts occur when Git cannot automatically reconcile changes. This is common in collaborative development, especially when multiple researchers modify:

  • Model architecture definitions (models/rnn.py, models/Transformer.py)

  • Training logic (main.py)

  • Data preprocessing pipelines (dataset.py)

  • Dependency specifications (pyproject.toml)

  • Evaluation metrics (evaluater.py)

When a conflict occurs, Git will pause and display:

CONFLICT (content): Merge conflict in src/rtnn/models/rnn.py
error: could not apply abc1234... feat: add bidirectional LSTM

Step 1 — Identify Conflicted Files

git status

Look for files under:

Unmerged paths:
  both modified:   src/rtnn/models/rnn.py
  both modified:   src/rtnn/dataset.py

Step 2 — Examine the Conflict

Open each conflicted file. You’ll see conflict markers:

<<<<<<< HEAD
# Your local changes - experimenting with larger hidden size
class RNN_LSTM(BaseRNN):
    def __init__(self, feature_channel, output_channel, hidden_size=256, num_layers=4):
        super().__init__(feature_channel, output_channel, hidden_size, num_layers, 'lstm')
=======
# Remote changes from origin/master - added dropout
class RNN_LSTM(BaseRNN):
    def __init__(self, feature_channel, output_channel, hidden_size=128, num_layers=3,
                 dropout=0.1):
        super().__init__(feature_channel, output_channel, hidden_size, num_layers, 'lstm')
>>>>>>> origin/master

Understanding the markers:

  • <<<<<<< HEAD → Your current branch’s version

  • ======= → Separator between conflicting versions

  • >>>>>>> origin/master → Remote branch’s version

Step 3 — Resolve the Conflict

Edit the file to create the correct version. For model code:

  1. Preserve both innovations if they’re compatible

  2. Test the combined architecture mentally or with quick local tests

  3. Check parameter compatibility with existing training configs

  4. Document architectural decisions in comments

Example resolution combining both approaches:

# Resolved: larger hidden size with dropout
class RNN_LSTM(BaseRNN):
    def __init__(self, feature_channel, output_channel, hidden_size=256, num_layers=4,
                 dropout=0.1):
        super().__init__(feature_channel, output_channel, hidden_size, num_layers, 'lstm')

Critical: Remove ALL conflict markers:

<<<<<<<
=======
>>>>>>>

Step 4 — Mark as Resolved

After fixing each file, and passing the pre-commit hooks (see next section), stage the resolved files:

git add src/rtnn/models/rnn.py
git add src/rtnn/dataset.py

Do not use git add . blindly - ensure only resolved files are staged.

Step 5 — Continue the Rebase

git rebase --continue

If more conflicts appear, repeat the process. Git will apply each commit one by one.

Abort Rebase (Emergency Option)

If the rebase becomes too complex or you need to start over:

git rebase --abort

This returns your branch to its state before starting the rebase.

When to abort: - You’re unsure about conflict resolutions - You need to discuss architectural changes with the team - You accidentally started rebase on wrong branch

4. Standardize Code with Pre-commit Hooks

RTnn uses pre-commit hooks to enforce code quality standards. After successful rebase, run all hooks:

pre-commit run --all-files

What these hooks check:

Hook

Purpose

ruff

Linting and code style (replaces flake8, isort, pydocstyle)

ruff-format

Automatic code formatting (replaces black)

end-of-file-fixer

Ensures files end with a newline

trailing-whitespace

Removes trailing whitespace

mixed-line-ending

Enforces LF line endings

forbid-tabs / remove-tabs

Ensures spaces instead of tabs

check-yaml

Validates YAML files (GitHub Actions configs)

check-json

Validates JSON files

check-added-large-files

Prevents committing large files

check-merge-conflict

Detects unresolved merge conflicts

Because the hooks modify files automatically:

git add .

Then run pre-commit again to confirm everything is clean:

pre-commit run --all-files

Expected output: “All files passed” or similar success message.

5. Run the Test Suite

Before pushing, verify your changes don’t break existing functionality:

# Run all tests
python -m unittest discover tests -v

# Run specific model tests
python -m unittest tests.test_rnn -v
python -m unittest tests.test_fcn -v
python -m unittest tests.test_transformer -v

# Run with test runner (rich output)
python tests/test_runner.py

Success criteria:

  • ✅ All tests pass (0 failures)

  • ✅ No new warnings

  • ✅ Tests complete in reasonable time

If tests fail:

  • Examine error messages carefully

  • Check if failures relate to your changes

  • Fix issues locally

  • Re-run tests until they pass

6. Commit Changes (If Needed)

If you made additional fixes (conflict resolution, formatting, test fixes):

git add .
git commit -m "fix: resolve merge conflicts and apply formatting"

Commit message guidelines for RTnn (Conventional Commits):

Type

Example

feat:

feat: add Transformer model for RT emulation

fix:

fix: correct normalization in data preprocessing

docs:

docs: update API documentation for RNN models

test:

test: add unit tests for calc_hr function

refactor:

refactor: simplify loss function computation

perf:

perf: optimize dataloading with multiprocessing

config:

config: update default hyperparameters for LSTM

If no changes were needed after rebase and hooks, you may not need a new commit.

7. Push Your Changes

Finally, push your branch to the remote repository:

git push

If push is rejected (due to history rewrite from rebase):

git push --force-with-lease

Critical: Always use --force-with-lease, never plain --force.

Option

Safety

--force

Overwrites remote branch blindly - DANGEROUS

--force-with-lease

Checks if remote branch has changed since your last fetch - SAFER

Quick Reference: Daily Workflow

For quick daily use, here’s the complete workflow in one block:

# Step 1-2: Update and check status
git fetch origin
git status

# Step 3: Rebase onto latest master
git pull --rebase origin master
# (Resolve conflicts if needed)

# Step 4: Run pre-commit hooks
pre-commit run --all-files

# Step 5: Run tests
python -m unittest discover tests -v

# Step 6: Commit if needed
git add .
git commit -m "type: your message here"

# Step 7: Push safely
git push origin your-branch

Common Pitfalls to Avoid

Pitfall

Solution

Committing large model checkpoints

Use .gitignore or model registry

Forgetting to update dependencies

Run uv pip list and update pyproject.toml

Hardcoded paths

Use pathlib and relative paths

Ignoring type hints

Add type hints for better code quality

Changing random seeds

Document or make configurable

Breaking existing APIs

Use deprecation warnings before removing

Important Rules Summary

DO:

  • Fetch before working

  • Rebase feature branches

  • Resolve conflicts carefully

  • Run pre-commit hooks

  • Test thoroughly

  • Use --force-with-lease

  • Document model changes

  • Version control configurations

DON’T:

  • Ignore conflicts

  • Leave conflict markers

  • Skip tests after changes

  • Push without rebasing

  • Use plain --force

  • Commit large data files

  • Break existing APIs without deprecation

  • Hardcode model paths or seeds

Following this workflow ensures that your contributions to RTnn integrate smoothly with the work of other researchers and maintain the high standards required for AI-based radiative transfer modeling.