How to delete old artifacts in self-hosted GitLab?


We have a self-hosted GitLab running on a single instance, but every now and then we run out of disk space because large artifacts fill it up. We then have to go in and delete the older artifact folders manually.

Is there a way to automate this? Maybe a script that runs overnight and deletes artifact folders older than, say, 7 days?

The default expiration is set to 5 days in the GitLab Admin area, but that does not mean the artifacts are actually deleted from the box.


There are 3 best solutions below

0
On

It turns out the expiration setting may be ignored when the "Keep artifacts from most recent successful jobs" setting is enabled, and apparently it is enabled by default.

If you go to a pipeline old enough that it should have had its artifacts removed, and see the message "These artifacts are the latest. They will not be deleted (even if expired) until newer artifacts are available."


…then you have likely stumbled upon this problem. To disable it, go to your project's Settings → CI/CD, expand Artifacts, and uncheck Keep artifacts from most recent successful jobs. You can reach that page directly at this URL (substitute the angle-bracket placeholders): https://<domain>/<user_or_group>/<project>/-/settings/ci_cd
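If you have many projects, clicking through the settings page for each one gets tedious. Below is a minimal Python sketch of doing the same thing over the REST API. It assumes the project-level `keep_latest_artifact` attribute on `PUT /projects/:id` is available on your GitLab version and that your token is allowed to edit the project. The function names, the example host and the project path are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_disable_keep_latest_request(base_url, project, token):
    """Build (but do not send) a PUT /projects/:id request that sets
    keep_latest_artifact to false.

    `project` may be a numeric ID or a "group/project" path; a path must be
    URL-encoded when used in the API URL.
    """
    project_id = urllib.parse.quote(str(project), safe="")
    url = f"{base_url.rstrip('/')}/api/v4/projects/{project_id}"
    body = json.dumps({"keep_latest_artifact": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
    )


def disable_keep_latest(base_url, project, token):
    """Send the request and return the HTTP status code."""
    request = build_disable_keep_latest_request(base_url, project, token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

For example, `disable_keep_latest("https://gitlab.example.com", "mygroup/myproject", "<your_access_token>")` would switch the setting off for that one project. Artifacts that are already locked as "latest" should only be released after a newer pipeline has run.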

0
On

GitLab 16.7 (December 2023) should improve this situation:

Improved ability to keep the latest job artifacts

In GitLab 13.0 we introduced the ability to keep the job artifacts from the most recent successful pipeline.
Unfortunately, the feature also marked all failed and blocked pipelines as the latest pipeline, regardless of whether they were the most recent or not. This led to a buildup of artifacts in storage which had to be deleted manually.

(see also the script in "Why is my artifact storage so high? How can I get rid of it?")

In GitLab 16.7 the bugs causing this unintended behavior are resolved.
Job artifacts from failed and blocked pipelines are only kept if they are from the most recent pipeline; otherwise they will follow the expire_in configuration.
Affected GitLab.com customers should see artifacts which were inadvertently kept now unlocked and removed after a new pipeline run.

The Keep artifacts from most recent successful jobs setting overrides the job’s artifacts: expire_in configuration and can result in a large number of artifacts stored without expiry.
If your pipelines create many large artifacts, they can fill up your project storage quota quickly. We recommend disabling this setting if this feature is not required.

See Documentation and Issue.

0
On

When artifacts expire, they should be deleted from disk. If your artifacts are not deleted from your physical storage, there is a configuration issue with your storage. Ensure you have write and delete permissions on your storage configuration.

Artifacts that were created before the default expiration setting was configured will still need to be deleted manually, but only once. All new artifacts will respect the artifact expiration.

However, you should do this through the API, not directly on the filesystem. Otherwise there will be a mismatch between what GitLab's database thinks exists and what actually exists on disk.

For an example script: see this answer.
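As a rough illustration of the API-based approach, here is a minimal Python sketch, not a tested production script. It assumes the Jobs API (`GET /projects/:id/jobs`) and the job artifacts endpoint (`DELETE /projects/:id/jobs/:job_id/artifacts`) behave on your GitLab version as described in the API documentation, and that the token has the `api` scope and sufficient permissions on the project. The helper names and the 7-day default (taken from the question) are illustrative only.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone


def parse_time(value):
    """Parse an ISO 8601 timestamp such as "2024-01-31T12:00:00.000Z"."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def jobs_with_old_artifacts(jobs, max_age_days, now=None):
    """Return the jobs that were created before the cutoff and still have
    artifacts other than the job log (file_type "trace")."""
    cutoff = (now or datetime.now(timezone.utc)) - timedelta(days=max_age_days)
    selected = []
    for job in jobs:
        files = [a for a in (job.get("artifacts") or [])
                 if a.get("file_type") != "trace"]
        if files and parse_time(job["created_at"]) < cutoff:
            selected.append(job)
    return selected


def api_call(base_url, token, method, path):
    """Perform one API request and return the decoded JSON body, if any."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v4{path}",
        method=method,
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(request) as response:
        body = response.read()
    return json.loads(body) if body else None


def delete_old_artifacts(base_url, token, project, max_age_days=7):
    """Walk all job pages of one project and delete artifacts of old jobs.

    Returns the number of jobs whose artifacts were deleted.
    """
    project_id = urllib.parse.quote(str(project), safe="")
    page, deleted = 1, 0
    while True:
        jobs = api_call(base_url, token, "GET",
                        f"/projects/{project_id}/jobs?per_page=100&page={page}")
        if not jobs:  # an empty page means we are past the last job
            break
        for job in jobs_with_old_artifacts(jobs, max_age_days):
            api_call(base_url, token, "DELETE",
                     f"/projects/{project_id}/jobs/{job['id']}/artifacts")
            deleted += 1
        page += 1
    return deleted
```

A nightly cron job could then call something like `delete_old_artifacts("https://gitlab.example.com", "<your_access_token>", "mygroup/myproject")`. Because the deletion goes through the API, GitLab's database and the files on disk stay consistent, which is the point made above.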

Also note there are several circumstances under which artifacts are kept, such as the latest artifacts. New pipelines must run for old artifacts to expire. See documentation for more information.