I'm not sure if this is a problem with ActiveRecord, PostgreSQL, Flynn or my app, but I recently added a new field, flynn_process_settings, to a table in my app called environments, and for some reason, while the Environments#update request returns a 200 status, with the content of the updated environment including the new value for flynn_process_settings, the UPDATE SQL statement that is sent to the database does not include flynn_process_settings.
I feel like I've ruled out all the usual suspects like "did the database get migrated" etc., because I can open a Rails console in production and update the field just fine, so it seems like most things are set up as intended.
And here's the real weird part. It works about 1 in 20-30 times if I just send the same update request over and over. Whether I wait a minute or 2 seconds in between requests doesn't seem to matter. It's always about a 5% chance of success.
For context: I am running this app in a Flynn container environment, with Postgres. I recently deployed the update to production, after having the same problem in staging, which I was able to fix by pushing to Flynn a couple more times. So it may be a Flynn issue of some kind, but I can't imagine what could cause this kind of problem...?
There are 2 instances of the Rails process running in the latest release. The failure/success doesn't seem to correspond to either specific one (it seems to be configured so that my client is tied to a specific instance).
UPDATE: It looks like the parameters hash includes the automagically wrapped parameter "environment" => { "flynn_process_settings" => "..." } on the requests where it actually works, so this might be a problem with parameter parsing/wrapping! Although I'm not sure why that nested parameter would be required, since my code accessing the parameters looks like this:
    def update
      if environment.update(environment_params)
        render ...
      else
        render ...
      end
    end

    def environment_params
      setup_step_keys = [An Array]
      params.permit(setup_step_keys + [:flynn_process_settings]) # This should be at the root of params, right?
    end
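To make the question about the nested parameter concrete, here is a minimal plain-Ruby sketch. It does not call ActionController; permit_at_root and wrap are hypothetical stand-ins that only imitate root-level permitting and parameter wrapping, and example_step is a made-up key standing in for the elided setup_step_keys. It assumes the wrapper copies only the keys that the process's model knows about. Under that assumption, the root-level permit returns the same result whether or not the wrapped copy contains flynn_process_settings, so the nested key would be a hint about which process answered rather than something the controller code needs.

```ruby
# Plain-Ruby stand-ins; these do NOT call ActionController itself.

# Imitates a root-level permit: keep only the allowed top-level keys.
def permit_at_root(params, allowed_keys)
  allowed = allowed_keys.map(&:to_s)
  params.select { |key, _| allowed.include?(key) }
end

# Imitates parameter wrapping: copy the keys listed in attribute_names
# into a nested hash stored under wrapper_key.
def wrap(params, wrapper_key, attribute_names)
  nested = params.select { |key, _| attribute_names.include?(key) }
  params.merge(wrapper_key => nested)
end

setup_step_keys = [:example_step] # hypothetical; the real array is elided
body = { "example_step" => "done", "flynn_process_settings" => "web=2" }

# A process that knows about the new column vs. one that does not (assumption).
fresh = wrap(body, "environment", ["example_step", "flynn_process_settings"])
stale = wrap(body, "environment", ["example_step"])

allowed = setup_step_keys + [:flynn_process_settings]
permitted_fresh = permit_at_root(fresh, allowed)
permitted_stale = permit_at_root(stale, allowed)

puts "fresh wrapper keys: #{fresh['environment'].keys.inspect}"
puts "stale wrapper keys: #{stale['environment'].keys.inspect}"
puts "root permit identical: #{permitted_fresh == permitted_stale}"
```

If this model is roughly right, a missing flynn_process_settings inside "environment" points at a process with an outdated view of the table, which fits what UPDATE 2 describes.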
UPDATE 2: It looks like Flynn has left an old app process running somehow (App 141), and that's the one that is having issues (which is not surprising, although I'm still confused as to how it's returning a 200 status). So now my main question is just why there is an old version of the app running after deploying the new version of the app to Flynn.
This may not fully answer the question, but it turns out there was a stray Passenger process left running that was causing the failed updates. Every working result came from a newer Passenger process. So our primary theory is that the old process was started before the migration ran, and somehow continued running without exceptions, but still not updating the database for some reason, even after the migration ran.
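One way to confirm a pattern like this is to record which process served each request and group the outcomes by PID. The sketch below is only an illustration: Observation and success_rate_by_pid are hypothetical names, and the sample rows are made up for the example (141 echoes the "App 141" mentioned in the question, 902 is an invented PID for a newer process), not taken from real logs.

```ruby
# Pairs the PID that served a request with whether the UPDATE
# included the new column. Sample values below are illustrative only.
Observation = Struct.new(:pid, :saved)

# Returns a hash of pid => fraction of requests whose update was saved.
def success_rate_by_pid(observations)
  observations.group_by(&:pid).transform_values do |rows|
    rows.count(&:saved).to_f / rows.size
  end
end

observations = [
  Observation.new(141, false),
  Observation.new(141, false),
  Observation.new(902, true), # made-up PID for a newer process
  Observation.new(141, false),
  Observation.new(902, true)
]

rates = success_rate_by_pid(observations)
rates.each do |pid, rate|
  puts "pid #{pid}: #{(rate * 100).round}% of updates saved"
end
```

A process whose rate sits at 0% while the others sit at 100% is a strong candidate for the stale one.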
We were using Passenger 5.1.5 which had "a refactoring error that lead to a memory corruption issue when running with the builtin engine" - so possibly it was related to that, although I don't know how likely that is.
In any case, the primary problem was that there was a rogue Passenger process causing the error behavior, and killing that process solved the problem. As to why/how this process was started and why it was not raising exceptions, I can't say yet, so I'll leave this open to further answers in case anybody has a more complete explanation.