I administer a large-scale SSH server that regularly handles upwards of 300 concurrent sessions, which presents some challenges. After it was migrated to AWS, it began accumulating a large number of stale SSH sessions.
The fix came in two parts. The first was a bash script, run every hour, that culled all of the stale connections.
```bash
#!/bin/bash

tmpaccounts="/tmp/tmpaccounts.txt"
log="/var/log/ssh_clean.log"

echo "`date` - Starting SSH Clean script." >> $log

# Find accounts that appear more than once in the output of w,
# i.e. users with duplicate (likely stale) sessions.
echo "`date` - Finding affected accounts" >> $log
w | awk '{print $1}' | sort | uniq -c | awk '$1 > 1 {print $2}' > $tmpaccounts

echo "`date` - Starting for loop" >> $log
for i in `cat $tmpaccounts`
do
    echo "`date` - Account $i has duplicate stale connections" >> $log
    # Caution: grep $i matches the username anywhere in the ps output,
    # so this can also catch unrelated processes whose command line
    # happens to contain the username.
    for k in `ps waux | grep $i | grep -v grep | awk '{print $2}'`
    do
        echo "`date` - Killing $k stale connection" >> $log
        kill -9 $k
    done
done

echo "`date` - Loops complete" >> $log
echo "`date` - Deleting temp accounts file." >> $log
rm -f $tmpaccounts
```
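The duplicate-detection step is the `sort | uniq -c` pipeline: count the occurrences of each login name, then keep only the names seen more than once. A standalone illustration of that technique, using made-up usernames rather than real `w` output:

```shell
# Count occurrences of each name, then keep only names seen more than once
# (the same technique the script applies to the first column of `w`).
printf 'alice\nbob\nalice\ncarol\nbob\nalice\n' \
  | sort | uniq -c | awk '$1 > 1 {print $2}'
# prints:
# alice
# bob
```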
The second part was to have the server send keepalive probes at 5-second intervals, configured in /etc/ssh/sshd_config.
```
ClientAliveInterval 5
```
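ClientAliveInterval works together with ClientAliveCountMax (which defaults to 3): sshd sends a keepalive probe over the encrypted channel every interval, and disconnects the client after that many consecutive unanswered probes. With the interval above, a dead client is dropped after roughly 15 seconds. A sketch of the relevant sshd_config lines, with the count made explicit:

```
# /etc/ssh/sshd_config -- server-side keepalive settings
ClientAliveInterval 5    # probe the client every 5 seconds
ClientAliveCountMax 3    # OpenSSH default; disconnect after 3 unanswered probes
```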