I was trying to train a convolutional network, but it is not improving: the loss does not decrease, and the train function also terminates much more quickly than usual. Below is minimal code reproducing the problem.
using Flux
data=rand(200, 100, 1, 50)
label=rand([0.0,1.0], 1, 50)
model=Chain(
    Conv((3,3), 1=>5, pad=(1,1)),
    MaxPool((2,2)),
    Conv((3,3), 5=>5, pad=(1,1)),
    MaxPool((2,2)),
    Conv((3,3), 5=>5, pad=(1,1)),
    MaxPool((2,2)),
    x->reshape(x, :, size(x, 4)),
    x->σ.(x),
    Dense(1500,100),
    Dense(100,1)
)
model(data)
loss=Flux.mse
opt=Descent(0.1)
param=params(model)
loss(model(data), label) #=>0.3492440767136241
Flux.train!(loss, param, zip(data, label), opt)
loss(model(data), label) #=>0.3492440767136241
The first argument to Flux.train! needs to be a function which accepts the data, runs the model, and returns the loss. Its loop looks something like this:
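(This is a rough sketch of what train! does internally, not its exact source; ps is the Params collection you passed, your param.)

for d in data                               # with your call, data is zip(data, label), so each d is a pair of two numbers
    gs = gradient(() -> loss(d...), ps)     # this evaluates loss(d...), which never touches the model's parameters
    Flux.Optimise.update!(opt, ps, gs)      # the gradient holds nothing for the weights, so nothing updates
end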
But the loss function you provide doesn't call the model at all; it just compares the data point to the label directly.
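A loss that actually runs the model might look like this (a sketch; loss_fn is just a name introduced here, keeping your existing implicit-parameters style):

loss_fn(x, y) = Flux.mse(model(x), y)                # run the model on the input, then compare to the label
Flux.train!(loss_fn, param, [(data, label)], opt)    # one batch containing all 50 images at once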
There is more to fix here, though. What's being iterated over is tuples of single numbers, starting with zip(data, label) |> first, which I don't think is what you want. Maybe you wanted Flux.DataLoader to iterate over batches of images?
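For instance, something like this (a sketch, assuming you want mini-batches of 10 observations taken along the last dimension, and reusing loss_fn from above):

loader = Flux.DataLoader((data, label), batchsize=10, shuffle=true)   # yields (x, y) pairs of size 200×100×1×10 and 1×10
Flux.train!(loss_fn, param, loader, opt)                              # now each step sees a real batch of images

Call that once per epoch, and loss_fn(data, label) should actually decrease between epochs.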