When I try to combine torch.compile with flash-attention, I get the following error:

```
torch._dynamo.variables.higher_order_ops: [WARNING] speculate_subgraph: while introspecting the user-defined autograd.Function, we were unable to trace function trampoline_autograd_fwd into a single graph. This means that Dynamo was unable to prove safety for this API and will fall back to eager-mode PyTorch, which could lead to a slowdown.
[rank2]:[2023-10-09 15:38:00,809] [4/0] torch._dynamo.variables.higher_order_ops: [ERROR] call_method UserDefinedObjectVariable(fwd) call [TensorVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(float), ConstantVariable(float), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(NoneType)] {}
```

Any ideas??

Each of them runs properly on its own; it is only the combination that fails.
