DeepSpeed: no operator matches operands error

When I use the DeepSpeed example to fine-tune an OPT 1.3b model on my local machine, I get an unexpected compile error related to the following code snippet:

template <typename T>
__global__ void moe_res_matmul(T* residual, T* coef, T* mlp_out, int seq_len, int hidden_dim)
{
    constexpr int granularity = 16;
    constexpr int vals_per_access = granularity / sizeof(T);

    T* residual_seq = residual + blockIdx.x * hidden_dim;
    T* mlp_out_seq = mlp_out + blockIdx.x * hidden_dim;

    for (unsigned tid = threadIdx.x * vals_per_access; tid < hidden_dim;
         tid += blockDim.x * vals_per_access) {
        T mlp[vals_per_access];
        T res[vals_per_access];
        T coef1[vals_per_access];
        T coef2[vals_per_access];

        mem_access::load_global<granularity>(mlp, mlp_out_seq + tid);
        mem_access::load_global<granularity>(res, residual_seq + tid);
        mem_access::load_global<granularity>(coef1, coef + tid);
        mem_access::load_global<granularity>(coef2, coef + tid + hidden_dim);

#pragma unroll
        for (int idx = 0; idx < vals_per_access; idx++) {
            mlp[idx] = mlp[idx] * coef2[idx] + res[idx] * coef1[idx];
        }

        mem_access::store_global<granularity>(mlp_out_seq + tid, mlp);
    }
}

The error message in the log looks like this:

/.../python3.10/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/gelu.cu(529): 
error: no operator "*" matches these operands
    operand types are: __half * __half
        mlp[idx] = mlp[idx] * coef2[idx] + res[idx] * coef1[idx];
                                  ^
    detected during:
        instantiation of "void moe_res_matmul(T *, T *, T *, int, int) [with T=__half]"
at line 547
        instantiation of "void launch_moe_res_matmul(T *, T *, T *, int, int, cudaStream_t) [with T=__half]"
at line 566
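
For reference, here is a possible workaround sketch, offered as an assumption rather than a confirmed diagnosis. The overloaded `__half` arithmetic operators in `cuda_fp16.h` are only available when `__CUDA_NO_HALF_OPERATORS__` is not defined and, in many CUDA versions, only when device code targets compute capability 5.3 or newer. If either condition is the cause here, the same arithmetic can be written with explicit `float` conversions, which do not rely on those operators. The names `scale_and_add_via_float` and `scale_and_add_kernel` below are hypothetical and not part of DeepSpeed.

```cuda
#include <cuda_fp16.h>

// Hypothetical helper (not part of DeepSpeed): computes a * b + c * d for
// __half inputs without using the overloaded __half operators.
__device__ __forceinline__ __half scale_and_add_via_float(__half a, __half b, __half c, __half d)
{
    // __half2float / __float2half are conversion intrinsics from cuda_fp16.h.
    const float result = __half2float(a) * __half2float(b) + __half2float(c) * __half2float(d);
    return __float2half(result);
}

// Minimal kernel using the helper in the same role as the failing line:
// out[i] = mlp[i] * coef2[i] + res[i] * coef1[i]
__global__ void scale_and_add_kernel(__half* out, const __half* mlp, const __half* coef2,
                                     const __half* res, const __half* coef1, int n)
{
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) { out[i] = scale_and_add_via_float(mlp[i], coef2[i], res[i], coef1[i]); }
}
```

Because the intermediate products are kept in `float` and rounded once at the end, results may differ slightly from the original all-`__half` expression. In the templated kernel this would also need a type-specific path, since `T` is a template parameter and is not always `__half`.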

The finetuning environment:

datasets>=2.8.0
sentencepiece>=0.1.97
protobuf==3.20.3
accelerate>=0.15.0
torch>=1.12.0
deepspeed>=0.9.0

Any idea how to solve this issue? Thanks!
