Is there a way for Chef to become aware of an archive file's contents during a run?


I have a Chef recipe that clones a specific branch of a git repository containing two .tgz files and an .sql file. The file names follow a convention but include a timestamp, so their exact names are not known ahead of each run. After cloning the repository, I'd like Chef to extract both .tgz files.

Everything works up to the point where Chef needs to extract the .tgz files. The client run always errors out with the .tgz file names as nil. I believe the problem is that, because of the way Chef works, it may not be able to "discover" a file that is added to a directory during its run.

During testing I found that if I clone the git repository before the Chef run, so that its contents are stored inside the recipe's files/ directory, those files are included in Chef's cache and are extracted as expected. I believe this works because the .tgz files are already known to Chef at that point rather than being made available during the run. I can consider this as a last resort, but it's not ideal, as I'd like to do as little work on the end user's local machine as possible.

I'd like to know if my understanding is correct and if there's a way to achieve what I've outlined. Here's my code:

# Clone the repository
execute "Cloning the #{backup_version} from the #{backup_repository_url} repository" do
    command "su #{user} -c 'git clone --single-branch --branch #{backup_version} #{backup_repository_url} #{backup_holding_area}'"
    cwd web_root
end

# I need all three files eventually, so find their paths in the directory 
# they were cloned to and store them in a hash
backup_files = Hash.new
["code", "media", "db"].each do |type|
    backup_files[type.to_sym] = Dir["#{backup_holding_area}/*"].find{ |file| file.include?(type) }
end

# I need to use all three files eventually, but only code and media are .tgz files
# This nil check is where chef fails
unless backup_files[:code].nil? || backup_files[:media].nil? || backup_files[:db].nil?
    backup_files.slice(:code, :media).each do |key, file|
        archive_file "Restore the backup from #{file}" do
            path file
            destination web_root
            owner user
            group group
            overwrite :auto
            only_if { ::File.exist?(file) }
        end
    end
end

Best answer

A chef-client run has several phases. The two that matter here are "compile" and "converge": compile comes first, then converge.

  • Compile phase: Ruby code that is not within a Chef resource is evaluated, and the resources are collected
  • Converge phase: the actions of the collected Chef resources are run, in order

For example, the variable assignment below runs during the compile phase:

backup_files = Hash.new

The command in an execute resource like the one below, on the other hand, runs during converge:

execute "Cloning the #{backup_version} from the #{backup_repository_url} repository" do
    command "su #{user} -c 'git clone --single-branch --branch #{backup_version} #{backup_repository_url} #{backup_holding_area}'"
    cwd web_root
end

Because all of the variable assignments sit outside resource blocks, they are evaluated during compile, long before convergence, when the repository has not yet been cloned and the holding area is still empty. That is why the lookups return nil instead of the file names you expect.
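The ordering can be reproduced without Chef. The sketch below is plain Ruby and only imitates the two phases; `TinyRun`, `resource`, `converge`, `simulate_run` and the file name `code-20200101.tgz` are made up for illustration and are not part of Chef's API.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a two-phase run; this is not Chef's real API.
class TinyRun
  def initialize
    @resources = []
  end

  # "Compile": only remember the block, do not run it yet.
  def resource(name, &block)
    @resources << [name, block]
  end

  # "Converge": run the remembered blocks in declaration order.
  def converge
    @resources.each { |_name, block| block.call }
  end
end

def simulate_run
  Dir.mktmpdir do |dir|
    run = TinyRun.new
    seen = {}

    # Stands in for the clone: the file only appears once this block runs.
    run.resource('clone') { File.write(File.join(dir, 'code-20200101.tgz'), '') }

    # Plain Ruby between resources runs immediately, before 'clone' has run.
    seen[:compile] = Dir.children(dir).find { |name| name.include?('code') }

    # The same lookup, deferred into a resource, runs after 'clone'.
    run.resource('lookup') do
      seen[:converge] = Dir.children(dir).find { |name| name.include?('code') }
    end

    run.converge
    seen
  end
end

result = simulate_run
puts result[:compile].inspect # nil
puts result[:converge]        # code-20200101.tgz
```

The first lookup sees an empty directory because the "clone" block has only been recorded, not executed, which mirrors what happens to the hash in the question.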

One way to get the file names is to do the lookup inside a Chef resource, so that it runs during converge. The ruby_block resource is meant for this kind of arbitrary Ruby code.

With it, the recipe can look like this:

# use execute to clone or use the git resource with properties as required
git backup_holding_area do
  repository backup_repository_url
  revision backup_version
  action :checkout
end

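# Listing the directory is fine here, as it only holds three files
ruby_block 'get and extract code and media tar files' do
  block do
    # Dir.children skips the "." and ".." entries
    Dir.children(backup_holding_area).each do |file|
      # the archives in the question are named *.tgz
      next unless file.end_with?('.tgz', '.tar.gz')
      # adjust the tar flags as required; passing a list of arguments
      # avoids shell quoting problems with unusual file names
      system('tar', 'xzf', "#{backup_holding_area}/#{file}", '-C', web_root) ||
        raise("Failed to extract #{file}")
    end
  end
end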
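If the three files need to be told apart by type, as in the question, the lookup itself can be moved into a small helper that is called from inside the ruby_block. This is a sketch in plain Ruby, assuming the file names contain "code", "media" and "db"; `find_backups`, `tar_command` and the sample file names are hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper: map each backup type to the first matching file name.
# Call it at converge time, once the clone has put the files in place.
def find_backups(dir, types = %w[code media db])
  names = Dir.children(dir).sort
  types.to_h { |type| [type.to_sym, names.find { |name| name.include?(type) }] }
end

# Hypothetical helper: argument list for extracting one archive with tar.
def tar_command(archive, destination)
  ['tar', 'xzf', archive, '-C', destination]
end

# Demo against a throwaway directory with made-up file names.
found = Dir.mktmpdir do |dir|
  %w[code-20200101.tgz media-20200101.tgz db-20200101.sql].each do |name|
    File.write(File.join(dir, name), '')
  end
  find_backups(dir)
end

puts found.inspect
puts tar_command('/holding/code-20200101.tgz', '/var/www').join(' ')
```

Inside the ruby_block, the result of `find_backups(backup_holding_area)` could then be checked for nil values before each argument list from `tar_command` is handed to `system(*args)`.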