HaoZeke
diff --git a/docs/src/engines/torch-sim-architecture.rst b/docs/src/engines/torch-sim-architecture.rst
Lines changed: 83 additions & 0 deletions
diff --git a/docs/src/engines/torch-sim-batched.rst b/docs/src/engines/torch-sim-batched.rst
Lines changed: 79 additions & 0 deletions
diff --git a/docs/src/engines/torch-sim-getting-started.rst b/docs/src/engines/torch-sim-getting-started.rst
Lines changed: 94 additions & 0 deletions
diff --git a/docs/src/engines/torch-sim-model-loading.rst b/docs/src/engines/torch-sim-model-loading.rst
Lines changed: 98 additions & 0 deletions
diff --git a/docs/src/engines/torch-sim.rst b/docs/src/engines/torch-sim.rst
Lines changed: 9 additions & 3 deletions
diff --git a/python/metatomic_torch/metatomic/torch/model.py b/python/metatomic_torch/metatomic/torch/model.py
Lines changed: 1 addition & 3 deletions
@@ -0,0 +1,83 @@
+.. _torchsim-architecture:
+
+Architecture
+============
+
+This page explains how ``MetatomicModel`` bridges TorchSim and
+metatomic.
+
+SimState vs list of System
+--------------------------
+
+TorchSim represents a simulation as a single batched ``SimState``
+containing all atoms from all systems, with a ``system_idx`` tensor
+tracking ownership. Metatomic expects a ``list[System]`` where each
+``System`` holds one periodic structure.
+
+``MetatomicModel.forward`` converts between these representations:
+
+1. Split the batched positions and atomic numbers by ``system_idx``
+2. Create one ``System`` per sub-structure with its own cell
+3. Call the model on the list of systems
+4. Concatenate results back into batched tensors
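The split-and-merge steps above can be pictured with plain tensors. This is a minimal sketch, not the wrapper's actual code: ``split_by_system`` is a hypothetical helper, and tuples stand in for metatomic ``System`` objects.

```python
import torch


def split_by_system(positions, atomic_numbers, cell, system_idx):
    """Return one (positions, numbers, cell) tuple per system (hypothetical helper)."""
    n_systems = int(system_idx.max().item()) + 1
    per_system = []
    for i in range(n_systems):
        mask = system_idx == i  # atoms owned by system i
        per_system.append((positions[mask], atomic_numbers[mask], cell[i]))
    return per_system


# Two systems in one batch: 2 atoms, then 3 atoms.
positions = torch.arange(15, dtype=torch.float64).reshape(5, 3)
atomic_numbers = torch.tensor([29, 29, 28, 28, 28])
cell = torch.stack(
    [
        3.6 * torch.eye(3, dtype=torch.float64),
        3.52 * torch.eye(3, dtype=torch.float64),
    ]
)
system_idx = torch.tensor([0, 0, 1, 1, 1])

parts = split_by_system(positions, atomic_numbers, cell, system_idx)

# Concatenating per-system tensors restores the batched layout.
merged = torch.cat([p for p, _, _ in parts])
```

Because ``system_idx`` is sorted by system, concatenating the per-system pieces in order reproduces the original batched tensor.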
+
+Forces via autograd
+-------------------
+
+Metatomic models typically output only total energies. Forces are
+computed as the negative gradient of the energy with respect to atomic
+positions::
+
+   F_i = -dE/dr_i
+
+Before calling the model, each system's positions are detached and
+marked with ``requires_grad_(True)``. After the forward pass,
+``torch.autograd.grad`` computes ``dE/dr``, which is negated to give
+the forces.
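As an illustration of the autograd pattern, here is a small sketch with a toy harmonic energy standing in for a real model (``harmonic_energy`` is an assumption made for this example, not part of metatomic). For this energy the analytic force is ``-k * r``, which makes the result easy to check.

```python
import torch


def harmonic_energy(positions, k=1.0):
    # Toy energy E = 0.5 * k * sum_i |r_i|^2, so F_i = -k * r_i.
    return 0.5 * k * (positions**2).sum()


positions = torch.tensor(
    [[0.1, 0.0, -0.2], [0.3, 0.4, 0.0]], dtype=torch.float64
)

# Detach from any previous graph and track gradients on the positions.
positions = positions.detach().requires_grad_(True)

energy = harmonic_energy(positions)
(grad,) = torch.autograd.grad(energy, positions)
forces = -grad  # F = -dE/dr
```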
+
+Stress via the strain trick
+---------------------------
+
+Stress is computed using the Knuth strain trick. An identity strain
+tensor (3x3, ``requires_grad=True``) is applied to both positions and
+cell vectors::
+
+   r' = r @ strain
+   h' = h @ strain
+
+The stress per system is then::
+
+   sigma = (1/V) * dE/d(strain)
+
+where V is the cell volume. This gives the full 3x3 stress tensor
+without finite differences.
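The following sketch applies the same idea to a toy energy, under stated assumptions: a harmonic pair term plus a term proportional to the cell volume replace the real model, and ``toy_energy``, ``k`` and ``p0`` are illustrative names only. The analytic result is ``k * outer(d, d) / V + p0 * I``, where ``d`` is the pair separation.

```python
import torch


def toy_energy(positions, cell, k=2.0, p0=0.1):
    # Harmonic pair term plus a volume term; both respond to strain.
    d = positions[1] - positions[0]
    return 0.5 * k * (d * d).sum() + p0 * torch.linalg.det(cell)


positions = torch.tensor(
    [[0.0, 0.0, 0.0], [1.0, 0.5, 0.0]], dtype=torch.float64
)
cell = 4.0 * torch.eye(3, dtype=torch.float64)

# Identity strain: leaves the geometry unchanged but records gradients.
strain = torch.eye(3, dtype=torch.float64, requires_grad=True)
strained_positions = positions @ strain
strained_cell = cell @ strain

energy = toy_energy(strained_positions, strained_cell)
(dE_dstrain,) = torch.autograd.grad(energy, strain)

volume = torch.linalg.det(cell).abs()
stress = dE_dstrain / volume  # sigma = (1/V) * dE/d(strain)
```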
+
+Neighbor lists
+--------------
+
+Models specify what neighbor lists they need via
+``model.requested_neighbor_lists()``, which returns a list of
+``NeighborListOptions`` (cutoff radius, full vs half list).
+
+The wrapper computes these using:
+
+- **vesin**: Default backend for both CPU and GPU. Handles half and
+  full neighbor lists. Systems on non-CPU/CUDA devices are temporarily
+  moved to CPU for the computation.
+- **nvalchemiops**: Used automatically on CUDA for full neighbor lists
+  when installed. Keeps everything on GPU, avoiding host-device
+  transfers.
+
+The decision happens per-call in ``_compute_requested_neighbors``: if
+all systems are on CUDA and nvalchemiops is available, full-list
+requests go through nvalchemiops while half-list requests still use
+vesin.
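The selection rule described above can be summarized as a small decision function. This is only a sketch of the stated logic: ``choose_neighbor_backend`` and its arguments are hypothetical and do not mirror the signature of ``_compute_requested_neighbors``.

```python
def choose_neighbor_backend(devices, full_list, nvalchemiops_available):
    """Pick a backend name for one neighbor list request (hypothetical helper).

    devices: device strings, one per system, e.g. ["cuda:0", "cuda:0"]
    full_list: True for a full neighbor list, False for a half list
    """
    all_cuda = all(device.startswith("cuda") for device in devices)
    if full_list and all_cuda and nvalchemiops_available:
        return "nvalchemiops"
    # Half lists, CPU systems, and missing nvalchemiops all fall back to vesin.
    return "vesin"


print(choose_neighbor_backend(["cuda:0", "cuda:0"], True, True))
print(choose_neighbor_backend(["cuda:0", "cuda:0"], False, True))
```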
+
+Why a separate package
+----------------------
+
+metatomic-torchsim has its own versioning, release schedule, and
+dependency set (``torch-sim-atomistic``). Keeping it separate from
+metatomic-torch avoids forcing a torch-sim dependency on users who only
+need the ASE calculator or other integrations.
+
+The package is pure Python with no compiled extensions, making it
+lightweight to install.
@@ -0,0 +1,79 @@
+.. _torchsim-batched:
+
+Batched simulations
+===================
+
+TorchSim supports batching multiple systems into a single ``SimState``
+for efficient parallel evaluation on GPU. ``MetatomicModel`` handles
+this transparently.
+
+Creating a batched state
+------------------------
+
+Pass a list of ASE ``Atoms`` objects to ``atoms_to_state``:
+
+.. code-block:: python
+
+   import ase.build
+   import torch_sim as ts
+   from metatomic_torchsim import MetatomicModel
+
+   model = MetatomicModel("model.pt", device="cpu")
+
+   atoms_list = [
+       ase.build.bulk("Cu", "fcc", a=3.6, cubic=True),
+       ase.build.bulk("Ni", "fcc", a=3.52, cubic=True),
+       ase.build.bulk("Al", "fcc", a=4.05, cubic=True),
+   ]
+
+   sim_state = ts.io.atoms_to_state(atoms_list, model.device, model.dtype)
+
+Evaluating the batch
+--------------------
+
+A single forward call evaluates all systems:
+
+.. code-block:: python
+
+   results = model(sim_state)
+
+The output shapes reflect the batch:
+
+- ``results["energy"]`` has shape ``[3]`` (one energy per system)
+- ``results["forces"]`` has shape ``[n_total_atoms, 3]`` (all atoms
+  concatenated)
+- ``results["stress"]`` has shape ``[3, 3, 3]`` (one 3x3 tensor per
+  system)
+
+How system_idx works
+--------------------
+
+``SimState`` tracks which atom belongs to which system via the
+``system_idx`` tensor. For three 4-atom systems, ``system_idx`` looks
+like::
+
+   [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
+
+``MetatomicModel.forward`` uses this to split the batched positions and
+types into per-system ``System`` objects before calling the underlying
+model.
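To make the layout concrete, the sketch below builds the same ``system_idx`` with plain PyTorch and uses it to reduce per-atom values to per-system values. The per-atom energies are made-up numbers for illustration; this is not how TorchSim constructs the tensor internally.

```python
import torch

atoms_per_system = torch.tensor([4, 4, 4])

# Repeat each system index once per atom in that system.
system_idx = torch.repeat_interleave(
    torch.arange(len(atoms_per_system)), atoms_per_system
)

# Recover the number of atoms in each system.
counts = torch.bincount(system_idx)

# Sum illustrative per-atom energies into one value per system.
per_atom_energy = torch.full((12,), -1.5, dtype=torch.float64)
per_system_energy = torch.zeros(3, dtype=torch.float64).index_add_(
    0, system_idx, per_atom_energy
)
```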
+
+Batch consistency
+-----------------
+
+Energies computed in a batch should match those computed individually,
+because each system gets its own neighbor list and is evaluated
+independently. The test ``test_energy_consistency_single_vs_batch``
+checks this property.
+
+Performance considerations
+--------------------------
+
+Batching is most beneficial on GPU, where the neighbor list computation
+and model forward pass can run in parallel across systems. On CPU, the
+speedup comes from reduced Python overhead (one call instead of N).
+
+For very large systems or many small ones, adjust the batch size to fit
+in GPU memory. TorchSim does not impose a maximum batch size, but each
+system gets its own neighbor list, so memory scales with the sum of
+per-system sizes.
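One simple way to keep batches within a memory budget is to group systems greedily by total atom count. The helper below is a hypothetical sketch (``chunk_by_atom_budget`` and ``max_atoms`` are not part of TorchSim or metatomic-torchsim); a suitable budget has to be found empirically for a given model and GPU.

```python
def chunk_by_atom_budget(atom_counts, max_atoms):
    """Group system indices so each batch holds at most max_atoms atoms.

    A system larger than the budget is placed in a batch of its own.
    """
    batches, current, current_atoms = [], [], 0
    for index, n_atoms in enumerate(atom_counts):
        if current and current_atoms + n_atoms > max_atoms:
            batches.append(current)
            current, current_atoms = [], 0
        current.append(index)
        current_atoms += n_atoms
    if current:
        batches.append(current)
    return batches


batches = chunk_by_atom_budget([4, 4, 4, 32, 8], max_atoms=12)
print(batches)  # [[0, 1, 2], [3], [4]]
```

Each inner list can then be used to select the ``Atoms`` objects passed to ``atoms_to_state`` for one batch.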
@@ -0,0 +1,94 @@
+.. _torchsim-getting-started:
+
+Getting started
+===============
+
+This tutorial walks through running a short NVE molecular dynamics
+simulation with a metatomic model and TorchSim.
+
+Prerequisites
+-------------
+
+Install the package and its dependencies:
+
+.. code-block:: bash
+
+   pip install metatomic-torchsim
+
+You also need a saved metatomic model file (``.pt``). If you have a
+metatrain checkpoint (``.ckpt``), install metatrain as well:
+
+.. code-block:: bash
+
+   pip install metatrain
+
+Load the model
+--------------
+
+.. code-block:: python
+
+   from metatomic_torchsim import MetatomicModel
+
+   model = MetatomicModel("path/to/model.pt", device="cpu")
+
+The wrapper detects the model's dtype and supported devices
+automatically. Pass ``device="cuda"`` to run on GPU.
+
+Build a simulation state
+------------------------
+
+TorchSim works with ``SimState`` objects. Convert ASE ``Atoms`` using
+``torch_sim.io.atoms_to_state``:
+
+.. code-block:: python
+
+   import ase.build
+   import torch_sim as ts
+
+   atoms = ase.build.bulk("Si", "diamond", a=5.43, cubic=True)
+   sim_state = ts.io.atoms_to_state([atoms], model.device, model.dtype)
+
+Evaluate the model
+------------------
+
+Call the model on the simulation state to get energies, forces, and
+stresses:
+
+.. code-block:: python
+
+   results = model(sim_state)
+
+   print("Energy:", results["energy"])    # shape [1]
+   print("Forces:", results["forces"])    # shape [n_atoms, 3]
+   print("Stress:", results["stress"])    # shape [1, 3, 3]
+
+Run NVE dynamics
+----------------
+
+Use TorchSim's Velocity Verlet integrator:
+
+.. code-block:: python
+
+   from torch_sim.integrators import VelocityVerletIntegrator
+
+   integrator = VelocityVerletIntegrator(
+       model=model,
+       state=sim_state,
+       dt=1.0,  # femtoseconds
+   )
+
+   for step in range(100):
+       sim_state = integrator.step(sim_state)
+       if step % 10 == 0:
+           energy = model(sim_state)["energy"].item()
+           print(f"Step {step:3d}  E = {energy:.4f} eV")
+
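+The value printed above is the potential energy, which fluctuates as
+it exchanges with kinetic energy. The total energy (potential plus
+kinetic) should remain approximately constant in an NVE simulation,
+which serves as a basic sanity check for your model.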
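The energy-conservation check can be illustrated without any model at all. The sketch below runs velocity Verlet on a one-dimensional harmonic oscillator in plain Python (all names and parameters are illustrative, in arbitrary units) and measures the relative drift of the total energy.

```python
def run_velocity_verlet(x, v, k=1.0, mass=1.0, dt=0.01, n_steps=1000):
    """Integrate a 1D harmonic oscillator; return the total energy per step."""
    energies = []
    force = -k * x
    for _ in range(n_steps):
        v += 0.5 * dt * force / mass  # first half kick
        x += dt * v                   # drift
        force = -k * x                # force at the new position
        v += 0.5 * dt * force / mass  # second half kick
        kinetic = 0.5 * mass * v * v
        potential = 0.5 * k * x * x
        energies.append(kinetic + potential)
    return energies


initial_energy = 0.5  # 0.5 * k * x0**2 with x0 = 1, v0 = 0
energies = run_velocity_verlet(x=1.0, v=0.0)
max_drift = max(abs(e - initial_energy) for e in energies) / initial_energy
print(f"max relative energy drift: {max_drift:.2e}")
```

For a well-behaved potential and a small enough time step, the drift stays bounded rather than growing; a steadily growing drift usually points to a time step that is too large or to forces that are inconsistent with the energy.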
+
+Next steps
+----------
+
+- :ref:`torchsim-model-loading` covers all supported input formats
+- :ref:`torchsim-batched` explains running multiple systems at once
+- :ref:`torchsim-architecture` describes the internals
@@ -0,0 +1,98 @@
+.. _torchsim-model-loading:
+
+Loading models
+==============
+
+``MetatomicModel`` accepts several input formats. Each section below
+shows one loading pattern.
+
+From a saved ``.pt`` file
+-------------------------
+
+This is the most common case: pass the path to a TorchScript-exported
+metatomic model:
+
+.. code-block:: python
+
+   from metatomic_torchsim import MetatomicModel
+
+   model = MetatomicModel("path/to/model.pt", device="cpu")
+
+The file must exist and contain a valid ``AtomisticModel``. A
+``ValueError`` is raised if the path does not exist.
+
+From a metatrain checkpoint
+---------------------------
+
+Pass a ``.ckpt`` path to load a metatrain checkpoint directly. This
+requires the ``metatrain`` package:
+
+.. code-block:: python
+
+   model = MetatomicModel("path/to/checkpoint.ckpt")
+
+The checkpoint is exported to an ``AtomisticModel`` internally.
+
+PET-MAD shortcut
+----------------
+
+The string ``"pet-mad"`` downloads and loads the PET-MAD universal
+model:
+
+.. code-block:: python
+
+   model = MetatomicModel("pet-mad")
+
+This also requires ``metatrain`` to be installed. The model weights are
+fetched from HuggingFace on first use.
+
+From a Python AtomisticModel
+-----------------------------
+
+If you already have an ``AtomisticModel`` instance (for example, one
+built programmatically), pass it directly:
+
+.. code-block:: python
+
+   from metatomic.torch import AtomisticModel
+
+   atomistic_model = build_my_model()  # returns AtomisticModel
+   model = MetatomicModel(atomistic_model, device="cuda")
+
+From a TorchScript RecursiveScriptModule
+-----------------------------------------
+
+If you have a scripted model loaded via ``torch.jit.load``, pass it
+directly:
+
+.. code-block:: python
+
+   import torch
+
+   scripted = torch.jit.load("model.pt")
+   model = MetatomicModel(scripted, device="cpu")
+
+The script module must have ``original_name == "AtomisticModel"``.
+Otherwise a ``TypeError`` is raised.
+
+Selecting a device
+------------------
+
+By default, ``MetatomicModel`` picks the best device from the model's
+``supported_devices``. Override with the ``device`` parameter:
+
+.. code-block:: python
+
+   model = MetatomicModel("model.pt", device="cuda:0")
+
+Extensions directory
+--------------------
+
+Some models require compiled TorchScript extensions. Point to their
+location with ``extensions_directory``:
+
+.. code-block:: python
+
+   model = MetatomicModel(
+       "model.pt",
+       extensions_directory="path/to/extensions/",
+   )
@@ -38,7 +38,7 @@ How to use the code
 
    import ase.build
    import torch_sim as ts
-   from metatomic.torchsim import MetatomicModel
+   from metatomic_torchsim import MetatomicModel
 
    model = MetatomicModel("model.pt", device="cpu")
 
@@ -50,5 +50,11 @@ How to use the code
    print(results["forces"])   # shape [n_atoms, 3]
    print(results["stress"])   # shape [1, 3, 3]
 
-For more details, see the `metatomic-torchsim documentation
-<https://docs.metatensor.org/metatomic/latest/torchsim/>`_.
+.. toctree::
+   :maxdepth: 2
+   :caption: torch-sim integration
+
+   torch-sim-getting-started
+   torch-sim-model-loading
+   torch-sim-batched
+   torch-sim-architecture
@@ -549,9 +549,7 @@ def export(self, file: str, collect_extensions: Optional[str] = None):
         )
         return self.save(file, collect_extensions)
 
-    def save(
-        self, file: Union[str, Path], collect_extensions: Optional[str] = None, **kwargs
-    ):
+    def save(self, file: Union[str, Path], collect_extensions: Optional[str] = None):
         """Save this model to a file that can then be loaded by simulation engine.
 
         The model will be saved with `requires_grad=False` for all parameters.