Rex Framework Package Guide

This guide covers using Rex as a package installed from PyPI or a PyPI-compatible index.

Package intent:

- Primary: end-user inference workflows with remotely stored model weights.
- Secondary: validation-oriented use in CI or application test environments.

1. Installation

Minimal package:

pip install rex-framework

Recommended for real inference workloads:

pip install "rex-framework[pytorch]"

Install notes:

- numpy is installed automatically with the package.
- The pytorch extra supports Python 3.10 to 3.13.
- PyTorch wheels are not available on PyPI for Python 3.14.
- On some platforms (such as macOS x86_64), Python 3.13 may not have a compatible torch wheel yet. Use Python 3.11 if that happens.

Quick verification:

python -c "import torch, rex; print(torch.__version__); print(rex.__version__)"
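
When the one-liner above fails, it can help to see which piece is missing. The following is a minimal sketch, not part of Rex: `python_in_pytorch_range` and `check_environment` are hypothetical helpers that use only the standard library to report whether the interpreter falls in the 3.10 to 3.13 range stated in the install notes and whether `torch` and `rex` can be found.

```python
import importlib.util
import sys

# Range stated in the install notes for the pytorch extra.
PYTORCH_EXTRA_MIN = (3, 10)
PYTORCH_EXTRA_MAX = (3, 13)


def python_in_pytorch_range(version=None):
    """Return True if (major, minor) is within the documented range."""
    major, minor = (version or sys.version_info)[:2]
    return PYTORCH_EXTRA_MIN <= (major, minor) <= PYTORCH_EXTRA_MAX


def check_environment(modules=("torch", "rex")):
    """Report interpreter compatibility and which modules can be found."""
    report = {"python_ok": python_in_pytorch_range()}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing or unsupported'}")
```

Unlike a bare import, this does not execute the packages, so it still produces a report when one of them is absent.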

Optional extras:

pip install "rex-framework[google-drive]"
pip install "rex-framework[onedrive]"
pip install "rex-framework[bench]"
pip install "rex-framework[all]"

2. End-to-End CLI Flow

2.1 Convert a checkpoint to Rex format

rex-convert model.pt -o ./rex_output --framework pytorch --model-id my-model

2.2 Serve chunk files with HTTP range support

rex-serve --dir ./rex_output/weights --port 8080
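
"HTTP range support" means a client can ask for a byte slice of a file with a `Range` header and the server answers with `206 Partial Content` and a `Content-Range` header. The sketch below shows that exchange with the standard library only. It is not how Rex fetches chunks internally; `range_header`, `parse_content_range`, and `fetch_range` are hypothetical helpers for probing a server by hand.

```python
import re
import urllib.request


def range_header(offset, length):
    """Build a Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive on both ends.
    return f"bytes={offset}-{offset + length - 1}"


def parse_content_range(value):
    """Parse 'bytes first-last/total' into (first, last, total or None)."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized Content-Range: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)


def fetch_range(url, offset, length, timeout=30.0):
    """Fetch a byte slice; raise if the server ignores the Range header."""
    request = urllib.request.Request(
        url, headers={"Range": range_header(offset, length)}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        if response.status != 206:
            raise RuntimeError(
                f"expected 206 Partial Content, got {response.status}"
            )
        return response.read()
```

A server that replies `200` with the whole file is ignoring the `Range` header, which is the case `fetch_range` rejects.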

2.3 Validate and inspect manifest

rex-validate ./rex_output/manifest.json
rex-inspect ./rex_output/manifest.json --verbose

2.4 Run benchmark

rex-benchmark \
  --manifest ./rex_output/manifest.json \
  --base-url http://localhost:8080 \
  --inferences 10 \
  --warmup 2 \
  --output benchmark.json

3. Python API Usage

3.1 Basic load + inference

import numpy as np
from rex.api.load import load_model
from rex.api.generate import run_inference_sync

manifest_path = "./rex_output/manifest.json"
runtime = load_model(manifest_path)
input_data = np.random.randn(1, 768).astype(np.float32)

output, metrics = run_inference_sync(manifest_path, input_data)
print(metrics.total_time_ms)

3.2 Explicit configuration

from rex.api.config import RexConfig
from rex.api.load import load_model

config = RexConfig()
config.storage.base_url = "http://localhost:8080"
config.cache.max_memory_cache_bytes = 512 * 1024 * 1024
config.scheduler.prefetch_window = 4

runtime = load_model("./rex_output/manifest.json", config=config)

4. Feature Configuration (Enable/Disable)

The package exposes feature controls through RexConfig. This section maps directly to currently available fields in src/rex/api/config.py.

4.1 Cache and invariant controls

from rex.api.config import RexConfig

config = RexConfig()

# Keep the core Rex invariant on in production
config.cache.enforce_invariant = True
config.cache.max_local_fraction_of_model = 0.4

# Cache policy: lru | lfu | weighted_utility
config.cache.policy = "weighted_utility"
config.cache.weighted_alpha = 0.5
config.cache.weighted_beta = 0.3
config.cache.weighted_gamma = 0.2

# Disk cache toggle
config.cache.max_disk_cache_bytes = 0  # 0 disables disk cache
config.cache.disk_cache_path = ".rex_cache"
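
To get a feel for the numbers, the sketch below works out a byte budget under the assumption that max_local_fraction_of_model caps the bytes held locally at that fraction of the total model size. That reading is inferred from the field name and is not confirmed by this guide; `local_budget_bytes` is a hypothetical helper, not a Rex function.

```python
GIB = 1024 ** 3


def local_budget_bytes(model_bytes, max_local_fraction):
    """Bytes allowed locally if the fraction caps local residency (assumed)."""
    if model_bytes < 0:
        raise ValueError("model_bytes must be non-negative")
    if not 0.0 < max_local_fraction <= 1.0:
        raise ValueError("max_local_fraction must be in (0, 1]")
    return int(model_bytes * max_local_fraction)


# Example: a 10 GiB model with the 0.4 fraction used above.
budget = local_budget_bytes(10 * GIB, 0.4)
print(f"local budget: {budget / GIB:.1f} GiB")
```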

4.2 Scheduler and prefetch toggles

from rex.api.config import RexConfig

config = RexConfig()

# Prefetch toggle
config.scheduler.enable_prefetch = True

# Window behavior
config.scheduler.prefetch_window = 4
config.scheduler.adaptive_window = True

# Planner mode
config.scheduler.scheduler_mode = "graph"       # or "sequential"
config.scheduler.build_execution_dag_metadata = True

# Graph partitioning toggle
config.scheduler.enable_graph_partitioning = True
config.scheduler.graph_partition_budget_bytes = 0

# Advanced prefetch tuning
config.scheduler.prefetch_horizon = 4
config.scheduler.max_prefetch_queue_depth = 0
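
As an illustration of what a prefetch window means in general (not a description of Rex's scheduler), the sketch below assumes chunks are consumed in a fixed sequential order and that the window counts how many upcoming chunks to request ahead of the one currently executing. `prefetch_targets` is a hypothetical function.

```python
def prefetch_targets(order, current_index, window, cached=frozenset()):
    """Return up to `window` upcoming chunk IDs that are not cached yet."""
    if window <= 0:
        return []
    upcoming = order[current_index + 1 : current_index + 1 + window]
    return [chunk for chunk in upcoming if chunk not in cached]


layers = [f"layer_{i}" for i in range(8)]

# While layer_2 runs, look 4 chunks ahead; layer_4 is already local.
print(prefetch_targets(layers, current_index=2, window=4, cached={"layer_4"}))
# prints ['layer_3', 'layer_5', 'layer_6']
```

A larger window hides more fetch latency at the cost of holding more chunks in memory at once, which is the trade-off the window setting controls in this simplified model.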

4.3 Runtime priors and eviction strategy

from rex.api.config import RexConfig

config = RexConfig()

config.scheduler.enable_attention_priors = True
config.scheduler.enable_moe_priors = True

# aggressive | retain_n_layers | smart | none
config.scheduler.eviction_strategy = "retain_n_layers"
config.scheduler.retain_window = 2
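
One plausible reading of retain_n_layers with retain_window = 2 is "keep the weights of the two most recently executed layers and evict anything older". The sketch below implements that reading only. It is an assumption based on the option names, `RetainLastN` is a hypothetical class, and Rex's actual eviction logic may differ.

```python
from collections import OrderedDict


class RetainLastN:
    """Keep weights for the N most recently executed layers (assumed policy)."""

    def __init__(self, retain_window):
        if retain_window < 0:
            raise ValueError("retain_window must be non-negative")
        self.retain_window = retain_window
        self._resident = OrderedDict()

    def on_layer_executed(self, layer_id, weights):
        """Record a layer and return the IDs evicted to stay in the window."""
        self._resident[layer_id] = weights
        self._resident.move_to_end(layer_id)
        evicted = []
        while len(self._resident) > self.retain_window:
            # Oldest entry sits at the front of the OrderedDict.
            old_id, _ = self._resident.popitem(last=False)
            evicted.append(old_id)
        return evicted

    def resident(self):
        """Layer IDs currently held, oldest first."""
        return list(self._resident)


policy = RetainLastN(retain_window=2)
for layer in ("layer_0", "layer_1", "layer_2"):
    policy.on_layer_executed(layer, weights=b"")
print(policy.resident())  # prints ['layer_1', 'layer_2']
```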

4.4 Storage concurrency and rate behavior

from rex.api.config import RexConfig

config = RexConfig()

config.storage.max_concurrent_fetches = 4
config.storage.adaptive_concurrency = True
config.storage.respect_rate_limits = True
config.storage.request_timeout_seconds = 30.0
config.storage.retry_max_attempts = 3
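
This guide does not specify how Rex retries failed fetches. For readers unfamiliar with the pattern, the sketch below shows what a setting like retry_max_attempts usually bounds: the total number of tries before the error is raised. The exponential backoff schedule, the `base_delay` parameter, and the `fetch_with_retry` name are assumptions made for illustration.

```python
import time


def fetch_with_retry(fetch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() up to max_attempts times, doubling the delay each retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except OSError:
            # urllib's URLError and socket timeouts are OSError subclasses.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

With `max_attempts=3` and `base_delay=0.5`, a fetch that keeps failing waits 0.5 s, then 1.0 s, and re-raises the third error. The `sleep` parameter exists so the delay can be replaced in tests.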

4.5 Observability toggles

from rex.api.config import RexConfig

config = RexConfig()

config.observability.log_level = "INFO"
config.observability.log_format = "console"     # console | json | quiet
config.observability.enable_tracing = True
config.observability.trace_export_path = None

5. Preset Patterns

5.1 Minimal/debug profile (most advanced features off)

from rex.api.config import RexConfig

config = RexConfig()
config.cache.policy = "lru"
config.cache.max_disk_cache_bytes = 0
config.scheduler.enable_prefetch = False
config.scheduler.adaptive_window = False
config.scheduler.scheduler_mode = "sequential"
config.scheduler.build_execution_dag_metadata = False
config.scheduler.enable_graph_partitioning = False
config.scheduler.enable_attention_priors = False
config.scheduler.enable_moe_priors = False
config.storage.adaptive_concurrency = False

5.2 Throughput-oriented profile (features on)

from rex.api.config import RexConfig

config = RexConfig()
config.cache.policy = "weighted_utility"
config.cache.max_disk_cache_bytes = 2 * 1024 * 1024 * 1024
config.scheduler.enable_prefetch = True
config.scheduler.scheduler_mode = "graph"
config.scheduler.enable_graph_partitioning = True
config.scheduler.prefetch_horizon = 8
config.scheduler.max_prefetch_queue_depth = 32
config.scheduler.enable_attention_priors = True
config.scheduler.enable_moe_priors = True
config.storage.adaptive_concurrency = True
config.storage.max_concurrent_fetches = 8
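
If profiles like the two above are switched often, they can be kept as data and applied in one call. The following is a sketch under stated assumptions: `apply_overrides` and `DEBUG_PROFILE` are hypothetical, and a `SimpleNamespace` stands in for RexConfig so the example runs without Rex installed. It relies only on plain attribute access, which is how the snippets in this guide set fields.

```python
from types import SimpleNamespace

# A subset of the minimal/debug profile, keyed by dotted field path.
DEBUG_PROFILE = {
    "cache.policy": "lru",
    "cache.max_disk_cache_bytes": 0,
    "scheduler.enable_prefetch": False,
    "scheduler.scheduler_mode": "sequential",
}


def apply_overrides(config, overrides):
    """Set dotted-path attributes on config, refusing unknown fields."""
    for path, value in overrides.items():
        *parents, field = path.split(".")
        target = config
        for name in parents:
            target = getattr(target, name)
        if not hasattr(target, field):
            # Catch typos instead of silently creating a new attribute.
            raise AttributeError(f"unknown config field: {path}")
        setattr(target, field, value)
    return config


# Stand-in object; with Rex installed, a RexConfig() would be passed instead.
config = SimpleNamespace(
    cache=SimpleNamespace(policy="weighted_utility", max_disk_cache_bytes=1),
    scheduler=SimpleNamespace(enable_prefetch=True, scheduler_mode="graph"),
)
apply_overrides(config, DEBUG_PROFILE)
print(config.cache.policy, config.scheduler.scheduler_mode)
# prints: lru sequential
```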

6. Environment Variable Overrides

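RexConfig can be overridden through environment variables. The example below sets the storage URL, cache size, cache policy, prefetch window, and log level: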

REX_STORAGE_URL=http://localhost:8080 \
REX_CACHE_SIZE_MB=512 \
REX_CACHE_POLICY=lru \
REX_PREFETCH_WINDOW=4 \
REX_LOG_LEVEL=DEBUG \
python your_script.py
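
When debugging, it can be useful to print which of these variables a process actually sees. The sketch below is a hypothetical helper, not part of Rex: it reads the five variable names from the example and converts them to typed values. The result keys are illustrative and are not RexConfig field names, and the conversion of REX_CACHE_SIZE_MB assumes 1 MB means 1024 * 1024 bytes.

```python
import os


def read_rex_env(environ=None):
    """Collect the REX_* variables from the example into typed values."""
    env = os.environ if environ is None else environ
    settings = {}
    if "REX_STORAGE_URL" in env:
        settings["storage_url"] = env["REX_STORAGE_URL"]
    if "REX_CACHE_SIZE_MB" in env:
        # Assumed unit: 1 MB = 1024 * 1024 bytes.
        settings["cache_size_bytes"] = int(env["REX_CACHE_SIZE_MB"]) * 1024 * 1024
    if "REX_CACHE_POLICY" in env:
        settings["cache_policy"] = env["REX_CACHE_POLICY"]
    if "REX_PREFETCH_WINDOW" in env:
        settings["prefetch_window"] = int(env["REX_PREFETCH_WINDOW"])
    if "REX_LOG_LEVEL" in env:
        settings["log_level"] = env["REX_LOG_LEVEL"].upper()
    return settings


if __name__ == "__main__":
    for key, value in read_rex_env().items():
        print(f"{key} = {value!r}")
```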

7. Kaggle and Colab Usage

Rex works in notebook environments as long as:

- You install the package extras needed by your workload.
- Manifest and chunk files are reachable over HTTP range requests.

Notebook pattern:

!pip install "rex-framework[pytorch]"

from rex.api.load import load_model
runtime = load_model("https://your-host/path/to/manifest.json")

This keeps full model weights remote while fetching only required chunks at runtime.

8. Test Support (Secondary)

The package supports validation workflows, but it does not ship the repository's full test suite.

Typical uses in CI and application test environments:

rex-validate ./rex_output/manifest.json
rex-benchmark --manifest ./rex_output/manifest.json --base-url http://localhost:8080
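
In a Python-driven test setup, the same two commands can be run through `subprocess`. The sketch below assumes the CLIs signal failure with a nonzero exit status, which this guide does not confirm. `validation_commands` and `run_all` are hypothetical helpers, and only the flags shown above are used.

```python
import shutil
import subprocess
import sys


def validation_commands(manifest, base_url):
    """Return the CLI invocations from this section as argument lists."""
    return [
        ["rex-validate", manifest],
        ["rex-benchmark", "--manifest", manifest, "--base-url", base_url],
    ]


def run_all(commands):
    """Run each command in order; return the first nonzero exit code, else 0."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            print(f"{argv[0]} not found on PATH", file=sys.stderr)
            return 127
        result = subprocess.run(argv)
        if result.returncode != 0:
            return result.returncode
    return 0
```

A CI step could then end with `sys.exit(run_all(validation_commands("./rex_output/manifest.json", "http://localhost:8080")))` so the job fails when either command does.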

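9. Included CLI Commands

The commands used in this guide are rex-convert, rex-serve, rex-validate, rex-inspect, and rex-benchmark.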

10. Notes

This page is generated from rex_framework_package_guide.md on the main branch.