Proposal
Support cpu, meta, and disk for the device_map parameter of boot_transformers.
Motivation
Please outline the motivation for the proposal.
Is your feature request related to a problem? e.g., "I'm always frustrated when [...]".
If this is related to another GitHub issue, please link here too.
Pitch
We want to allow cpu and other non-GPU devices to be used by torch for multi-device processing. Currently, we raise an error when any of these device types is requested in a multi-device setup.
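For reference, an accelerate-style device_map mixes these targets per submodule. A hypothetical example of what we would want to accept (module names are illustrative and depend on the model architecture; boot_transformers's exact handling is the subject of this proposal):

```python
# Illustrative device_map mixing the requested non-GPU targets.
# Keys are hypothetical submodule names; values follow accelerate's
# convention of GPU index, "cpu", or "disk", plus the proposed "meta".
device_map = {
    "model.embed_tokens": 0,   # keep on GPU 0
    "model.layers.0": "cpu",   # offload to CPU RAM
    "model.layers.1": "disk",  # offload to disk
    "lm_head": "meta",         # leave uninitialized on the meta device (proposed)
}
```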
Additional context
The dtype-cast loop in transformers.py:550-552 is the blocker for cpu support:
```python
for param in hf_model.parameters():
    if param.is_floating_point() and param.dtype != dtype:
        param.data = param.data.to(dtype=dtype)
```
For meta/offloaded params, .to(dtype) either no-ops or breaks accelerate's bookkeeping. Two small fixes should unlock CPU offload:
- use accelerate's align_module_device context manager when iterating offloaded params
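A minimal sketch of what the fixed loop could look like, assuming accelerate's align_module_device context manager is available (the function name cast_floating_point and the fallback are illustrative, not the project's actual code):

```python
import torch

try:
    # accelerate provides align_module_device in recent versions; it
    # temporarily materializes an offloaded module's parameters on
    # their execution device for the duration of the context.
    from accelerate.utils import align_module_device
except ImportError:
    # Fallback so this sketch runs without accelerate installed.
    from contextlib import nullcontext as align_module_device


def cast_floating_point(hf_model: torch.nn.Module, dtype: torch.dtype) -> None:
    """Cast floating-point params to `dtype`, skipping meta params."""
    for module in hf_model.modules():
        # Iterate module by module so offloaded params are visible
        # inside the context without disturbing accelerate's hooks.
        with align_module_device(module):
            for param in module.parameters(recurse=False):
                if param.device.type == "meta":
                    continue  # still on meta: nothing to cast yet
                if param.is_floating_point() and param.dtype != dtype:
                    param.data = param.data.to(dtype=dtype)
```

The meta-device guard handles the no-op case directly, while the context manager covers params that accelerate has offloaded to cpu or disk.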
Checklist