Proposal
Support cpu, meta, and disk for the device_map parameter of boot_transformers.
Motivation
Please outline the motivation for the proposal.
Is your feature request related to a problem? e.g., "I'm always frustrated when [...]".
If this is related to another GitHub issue, please link here too.
Pitch
We want to allow cpu and other non-GPU devices to be used by torch for multi-device processing. Currently, we raise an error when any of these device types is requested in a multi-device setup.
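For reference, an accelerate-style device_map mixes these targets per submodule. A hypothetical example of what we would want to accept (module names are illustrative and depend on the model architecture; boot_transformers's exact handling is the subject of this proposal):

```python
# Illustrative device_map mixing the requested non-GPU targets.
# Keys are hypothetical submodule names; values follow accelerate's
# convention of GPU index, "cpu", or "disk", plus the proposed "meta".
device_map = {
    "model.embed_tokens": 0,   # keep on GPU 0
    "model.layers.0": "cpu",   # offload to CPU RAM
    "model.layers.1": "disk",  # offload to disk
    "lm_head": "meta",         # leave uninitialized on the meta device (proposed)
}
```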
Additional context
The dtype-cast loop in transformers.py:550-552 is the blocker for cpu support:
```python
for param in hf_model.parameters():
    if param.is_floating_point() and param.dtype != dtype:
        param.data = param.data.to(dtype=dtype)
```
For meta/offloaded params, .to(dtype) either no-ops or breaks accelerate's bookkeeping. Two small fixes should unlock CPU offload:
- use accelerate's align_module_device context manager when iterating offloaded params
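A minimal sketch of what the fixed loop could look like, assuming accelerate's align_module_device context manager is available (the function name cast_floating_point and the fallback are illustrative, not the project's actual code):

```python
import torch

try:
    # accelerate provides align_module_device in recent versions; it
    # temporarily materializes an offloaded module's parameters on
    # their execution device for the duration of the context.
    from accelerate.utils import align_module_device
except ImportError:
    # Fallback so this sketch runs without accelerate installed.
    from contextlib import nullcontext as align_module_device


def cast_floating_point(hf_model: torch.nn.Module, dtype: torch.dtype) -> None:
    """Cast floating-point params to `dtype`, skipping meta params."""
    for module in hf_model.modules():
        # Iterate module by module so offloaded params are visible
        # inside the context without disturbing accelerate's hooks.
        with align_module_device(module):
            for param in module.parameters(recurse=False):
                if param.device.type == "meta":
                    continue  # still on meta: nothing to cast yet
                if param.is_floating_point() and param.dtype != dtype:
                    param.data = param.data.to(dtype=dtype)
```

The meta-device guard handles the no-op case directly, while the context manager covers params that accelerate has offloaded to cpu or disk.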
Checklist