How to repair ssh key pairs after recreating global ssh keys with Ansible


In a nutshell, after deleting and then recreating the global ssh keys on a managed host as part of an ansible play, the ssh key trust between the controller and the host breaks. I would like to know a better way to "fix" this issue and regain the original ssh key trust using ansible itself. Unfortunately, this will require some explanation.

As a starting point: right now, ansible is not set up when a new image is deployed. To remedy that, I have created a bash script, using expect, which neatly does two things on that new managed host:

  • Creates an ansible account with appropriate sudo permissions

  • Creates an ssh key pair between the controller and the managed host.

That's it. It does, however, currently require manually entering the IP of the host it is to be run against. After it runs, we have a desired state from which ansible works well via ssh. Still, at 328 lines of code to check for and perform this procedure, it feels cumbersome; more on this later.

The issue starts because the host/server is deployed from an image, so the global host keys need to be recreated on each one so that they do not all share the same set. The fix for this part of the issue is a simple two steps:

  • Find and delete all ssh_host_* files in the directory /etc/ssh/

  • Run the command: /usr/bin/ssh-keygen -A to generate new global ssh keys.
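On the managed host, those two steps boil down to something like this (a minimal sketch; I'm assuming they run with sudo and that sshd is restarted, or the box rebooted, afterwards):

    # Step 1: remove the global host keys that came with the image
    sudo rm -f /etc/ssh/ssh_host_*
    # Step 2: generate a fresh set of global host keys for this machine
    sudo /usr/bin/ssh-keygen -A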

However, we now have a problem: once the current ssh connection to the managed host is broken, we can no longer connect to our managed hosts, because the known_hosts file on the controller now holds keys that don't match. If you do nothing else, you get prompted again to verify the remote key because it has "changed", and you can't continue until you do (stopping all playbooks from functioning). OR, if you try to clear the IP out of the known_hosts file on the controller and put it back in, you get the lovely message below:

    "changed": false,
    "msg": "Failed to connect to the host via ssh: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\n@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!  ***SNIP***  You can use following command to remove the offending key:\r\nssh-keygen -R 10.200.5.4 -f /home/ansible/.ssh/known_hosts\r\nECDSA host key for 10.200.5.4 has changed and you have requested strict checking.\r\nHost key verification failed.",
    "unreachable": true

So now I have an issue. There must be a few commands I can use with ssh-keygen and/or ssh-keyscan to fix this mess cleanly, but for the life of me I can't figure them out. My only recourse right now is to re-run the bash script that initially sets this all up and replace everything ssh-key-wise on both the controller and the host. That seems like overkill; I can't believe it is necessary.
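For illustration, the kind of controller-side clean-up I have in mind (an untested sketch; the IP is just the example from the error above) is roughly:

    # Drop the stale entry for the managed host from the controller's known_hosts
    ssh-keygen -R 10.200.5.4 -f /home/ansible/.ssh/known_hosts
    # Re-learn the new host keys and append them to known_hosts
    ssh-keyscan -t rsa,ecdsa,ed25519 10.200.5.4 >> /home/ansible/.ssh/known_hosts

But doing that by hand on every re-image is exactly the manual step I'm trying to eliminate.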

My only hope now is that someone else has an idea how to solve this cleanly and permanently without manual intervention. Otherwise, the only thing I can do is set the ansible_ssh_common_args: "-o StrictHostKeyChecking=no" fact and run the commands my script does, but in playbook form. I can't believe there aren't any modules which can accomplish this. I tried the known_hosts module, but either I don't know how to use it properly, or it doesn't have this functionality. (It also has the annoying habit of changing my known_hosts file to root ownership, which I then have to change back.)
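For reference, the workaround I mean would look roughly like this as an ad-hoc call (the inventory path is hypothetical; a playbook would set the same variable):

    # Skip host key verification entirely for this run -- avoids the prompt,
    # but silently accepts the changed host key, which is what I'd rather not do
    ansible all -i ./inventory -m ping \
        -e 'ansible_ssh_common_args="-o StrictHostKeyChecking=no"'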

If anyone can help that would be fantastic! Thanks in advance!

The section below is not strictly needed, as it's extra text clogging up the works, but it illustrates how the bash script currently handles this and may give some insight into a better solution:

In short, it generates an ssh public and private key, attaches the managed host's IP to the filenames, creates an ssh config identity entry using a heredoc, puts everything in the proper spots, and then copies the public key over to the managed host in question.

The code snippets below show how this is accomplished. This is not the entire script, just the relevant parts:

#HOMEDIR is /home/ansible
#THISHOST is the IP of the managed host in question. Yes, we ONLY use IPs, there is no DNS.
cd "$HOMEDIR"
#Generate a fresh controller-side key pair and file a copy away under the managed host's IP
rm -f "$HOMEDIR"/.ssh/id_rsa
ssh-keygen -t rsa -f "$HOMEDIR"/.ssh/id_rsa -q -P ""
sudo mkdir -p "$HOMEDIR"/.ssh/rsa_inventory && sudo chown ansible:users "$HOMEDIR"/.ssh/rsa_inventory
cp -p "$HOMEDIR"/.ssh/id_rsa "$HOMEDIR"/.ssh/rsa_inventory/"$THISHOST"-id_rsa
cp -p "$HOMEDIR"/.ssh/id_rsa.pub "$HOMEDIR"/.ssh/rsa_inventory/"$THISHOST"-id_rsa.pub


#Heredoc implementation of the ssh config identity entry:
cat <<EOT >> /home/ansible/.ssh/config

Host $THISHOST
    HostName $THISHOST
    IdentityFile ~/.ssh/rsa_inventory/${THISHOST}-id_rsa
    User ansible
EOT


#Define the variable before the expect script is run so the next snippet makes sense:
ssh_key=$( cat "$HOMEDIR"/.ssh/id_rsa.pub )

#Snippet in the expect script where it echoes the public ssh key over to the managed host from the controller.
send "sudo echo '"$ssh_key"' >> /home/ansible/.ssh/authorized_keys\n"
expect -re {:~> *$}
send "sudo chmod 644 /home/ansible/.ssh/authorized_keys\n"
expect -re {:~> *$}
#etc etc, so on and so forth, properly setting attributes on this file.

Now things work with passwordless ssh as they should, until they are ruined again by the global ssh key replacement.