This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Conversion from FP32 to Mixed Precision Models #14584

Closed
@anirudh2290

Description

API Addition

Users want to take an FP32 model and convert it to a mixed precision model to run inference with it. They want to use the model zoo to convert pretrained models in Python and other frontends. This can be done today with Gluon models by casting the inputs and the blocks (as sketched below), but the same cannot be done for symbolic models (json and params). Proposing to add an API to convert FP32 models to FP16.
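For reference, the Gluon-only workaround looks roughly like this (a minimal sketch; the model choice and input shape are illustrative):

import mxnet as mx
from mxnet.gluon.model_zoo import vision

# Load a pretrained Gluon model and cast the whole block to FP16
net = vision.resnet50_v1(pretrained=True)
net.cast('float16')

# Inputs must be cast to FP16 by hand as well
data = mx.nd.random.uniform(shape=(1, 3, 224, 224)).astype('float16')
out = net(data)  # inference now runs in FP16

Nothing comparable exists for a symbol/params checkpoint, which is what the proposed API addresses.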

Considering the recent AMP work in progress in #14173, I think we should add a conversion API for FP16 models under the AMP namespace:

amp.convert_model(sym, arg_params, aux_params, target_dtype="float16", 
                  target_precision_ops=None, original_precision_ops=None,
                  widest_precision_ops=None, excluded_sym_names=None)

With target_precision_ops, original_precision_ops, and widest_precision_ops, users should be able to override the defaults from the AMP op lists.
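A hypothetical usage sketch on a model-zoo symbolic checkpoint (the convert_model signature follows the proposal above; the checkpoint prefix, excluded op names, and input shape are placeholders):

import mxnet as mx
from mxnet.contrib import amp

# Load a symbolic checkpoint (json + params)
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-50', 0)

# Convert to mixed precision, keeping numerically sensitive ops in FP32
fp16_sym, fp16_args, fp16_aux = amp.convert_model(
    sym, arg_params, aux_params,
    target_dtype="float16",
    excluded_sym_names=["softmax"])

# Bind the converted symbol for inference as usual
mod = mx.mod.Module(fp16_sym, context=mx.gpu(0), label_names=None)
mod.bind(data_shapes=[('data', (1, 3, 224, 224))], for_training=False)
mod.set_params(fp16_args, fp16_aux)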

Backend Changes

Additionally, add an NNVM pass for the backend. By default, this pass would use the AMP lists for FP16 ops, FP32 ops, and widest-type casts to decide whether each op takes FP16 or FP32 inputs.
The pass will traverse the graph and insert amp_cast and amp_multicast layers around FP16 and FP32 ops, as sketched below.
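A rough Python sketch of the intended decision logic (illustrative only; the real implementation would be an NNVM graph pass in C++, and the list names below are placeholders for the AMP op lists):

# FP16_OPS / FP32_OPS / WIDEST_OPS stand in for the AMP op lists.
FP16_OPS = {"Convolution", "FullyConnected"}   # run in the target (FP16) dtype
FP32_OPS = {"softmax", "norm"}                 # keep in the original (FP32) dtype
WIDEST_OPS = {"elemwise_add", "concat"}        # cast all inputs to the widest dtype

def convert_node(op_name, input_dtypes, target_dtype="float16"):
    """Decide which casts to insert in front of a single node."""
    if op_name in FP16_OPS:
        # insert amp_cast(target_dtype) on every input that is not already FP16
        return [target_dtype] * len(input_dtypes)
    if op_name in FP32_OPS:
        # insert amp_cast(float32) on every input that is not already FP32
        return ["float32"] * len(input_dtypes)
    if op_name in WIDEST_OPS:
        # insert a single amp_multicast so all inputs share the widest dtype
        widest = "float32" if "float32" in input_dtypes else target_dtype
        return [widest] * len(input_dtypes)
    # ops not on any list keep their input dtypes unchanged
    return list(input_dtypes)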

Planning to start working on the POC unless someone is already working on this.

@ptrendx @DickJC123 @pengzhao-intel @ZhennanQin @eric-haibin-lin @Caenorst

EDIT: Proposal posted on dev list: https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32+to+Mixed+Precision+Models
