I was reading an Ask HN the other day about not using Docker (or other) containers on servers. I’ve never run containers on my servers. When I used AWS, I was paying per compute unit, so I kept computation as low as possible. Since then, I have jumped from cheap host to cheap host.
These days I keep the cost of the server under $12 per year, with 128 MB of RAM. Running on resources this low means there is not really enough space for these containment systems, especially as I run more than a few things.
How do you maintain portability? Quite simply, just better testing. When you push some new code, be prepared to have to jump into the code and perform some change. Ideally you test enough that there is a hell of a good chance that it works. Containers should not be used as a method to avoid testing. If the code fails, find out why - don’t just keep making random changes until it works. Then, once you find the root cause, add it to your checklist. Eventually your pre-push testing will be so robust that you will not run into such issues.
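As a sketch of what that growing checklist might look like (the script name, the `run_checks` function, and the check names here are my own invention, not from any particular project), each diagnosed root cause becomes one entry in a list of checks run before every push:

```shell
#!/bin/bash
# check.sh - run a growing pre-push checklist. Each entry is a function or
# command added after a root cause was diagnosed; all names are illustrative.

run_checks() {
    local failed=0
    for check in "$@"; do
        if "$check"; then
            echo "PASS: $check"
        else
            echo "FAIL: $check"
            failed=1
        fi
    done
    return $failed
}

# Hypothetical checks accumulated over time:
# run_checks check_it_compiles check_config_parses check_binary_starts
```

Because `run_checks` returns non-zero on any failure, it can gate a push in a script or a git pre-push hook.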
How do you maintain security? Containers are not nearly as secure as you are led to believe. Encrypt everything (even RAM) and trust nothing. Another point is to avoid needing security in the first place: if you can avoid having to handle user input or authentication at all, do so.
One thing you’ll find you want is for the server to auto-fetch and pull new changes from a Git repository, build the code on the server (optimized for the local hardware), and then run your new code.
The following is an example I use for C/C++; it has evolved over the years from various projects as the needs have changed.
```shell
#!/bin/bash

# String for process
PROC="PROGRAM_NAME"
STATS="/tmp/SERVER_STATS"

# Variables
n_clients="0"
```
Here we have the variables. PROGRAM_NAME is a unique identifier for your program running on the server. SERVER_STATS is a file of interesting stats about the current server state, logged out to /tmp/SERVER_STATS. As you can see, in our example we are interested in n_clients (the number of connected clients). The SERVER_STATS file will contain something like:
```shell
n_clients="9"
n_authed="2"
```
The idea is that this information tells the script whether it should perform some action. In our case, we want to hold off restarting the server application whilst there are connected clients.
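For completeness, the server-side counterpart might look like the following sketch (the original does not show it; `write_stats` and its arguments are assumptions). Writing to a temporary file and renaming it makes the update atomic, so the watchdog script never sources a half-written file:

```shell
# Write the current counters in shell-sourceable form. The rename at the end
# is atomic on the same filesystem, so readers never see a partial file.
# (Hypothetical helper - the server program could equally do this in C.)
write_stats() {
    local stats_file="$1" clients="$2" authed="$3"
    printf 'n_clients="%s"\nn_authed="%s"\n' "$clients" "$authed" > "$stats_file.tmp"
    mv "$stats_file.tmp" "$stats_file"
}
```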
```shell
# log()
#
# Log to standard error a script related point of interest.
#
# @param $@ The message to be printed.
function log {
    echo "[$(date +%F_%H-%M-%S)] $@" >/dev/stderr
}
```
Log points of interest to standard error. If the server program suddenly reboots, it is nice to be able to check the logs to find out why.
```shell
# read_config()
#
# Read the configuration file, otherwise set default values.
function read_config {
    log "[before] n_clients = $n_clients"
    if [ -f "$STATS" ]; then
        . "$STATS"
    fi
    log "[after] n_clients = $n_clients"
}
```
A function to read our SERVER_STATS file, which we simply source. We log the before and after data as a sanity check in case we ever need to debug the reason for a reboot.
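To make the sourcing trick concrete, here is a tiny standalone demonstration (the temp file just stands in for /tmp/SERVER_STATS): the variable keeps its default unless the stats file exists and overrides it.

```shell
n_clients="0"                  # default value
demo_stats="$(mktemp)"         # stands in for /tmp/SERVER_STATS
echo 'n_clients="7"' > "$demo_stats"
if [ -f "$demo_stats" ]; then
    . "$demo_stats"            # sourcing overwrites the default
fi
rm -f "$demo_stats"
echo "n_clients = $n_clients"  # prints: n_clients = 7
```

If the file were missing, the `if` guard would be skipped and n_clients would stay at its default of 0.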
```shell
# restart_process()
#
# Stop any existing process by the same name and then start a new one.
function restart_process {
    # Ensure we create a new stats file (-f avoids an error on first run)
    rm -f "$STATS"
    # If process is running
    res="$(ps ax | grep $PROC | grep -v grep)"
    if [ ! "${res:-null}" = null ]; then
        pid="$(echo $res | awk '{print $1}')"
        log "Trying to kill process $pid"
        kill $pid
    fi
    log "Trying to start process $PROC"
    bash server.sh &
}
```
We assume the process name PROGRAM_NAME is unique. We attempt to find it, then kill it. We then restart our server script, in this case called server.sh. We wrap the server program in a small wrapper script simply because we may want to pass new parameters to the process.

NOTE: We purposely delete the SERVER_STATS file and wait for the server program to write a new one. If the server fails to reboot for some reason, we don’t want to fool ourselves with old data.
```shell
# rebuild_program()
#
# Rebuild the program.
function rebuild_program {
    root="$(pwd)"
    cd "SOURCE_CODE"
    make clean && make install
    cd "$root"
}
```
Perform some rebuilding. In this case it is just a simple Makefile.
```shell
# ensure_safe()
#
# Make sure the program is not in active use.
function ensure_safe {
    read_config
    while [ ! "$n_clients" -eq 0 ]; do
        log "Waiting to rebuild and reboot server, n_clients = $n_clients"
        sleep 30
        read_config
    done
}
```
Loop and wait until it is safe to do something with the server. Once we hit this function we are simply waiting to perform a restart when the correct conditions are met. In this case, we wait for all of the clients to disconnect.
```shell
# Restart process by default
restart_process
```
The first thing we ever do is reboot the server. We kill any existing processes and then start the server program.
```shell
# Infinite loop
while :
do
    # Fetch the latest changes
    git fetch
    # Check whether pull required
    if [ $(git rev-parse HEAD) != $(git rev-parse @{u}) ]; then
        # Pull the latest changes
        git pull
        # Rebuild the files
        ensure_safe
        rebuild_program
        # Restart the process
        restart_process
    else
        log "No changes"
    fi
    # Check if process is running
    res="$(ps ax | grep $PROC | grep -v grep)"
    if [ "${res:-null}" = null ]; then
        # As it's not running, rebuild and restart it
        make && restart_process
        # Do another loop shortly
        log "Failed build detected, quick reboot"
        sleep 30
    else
        # Sleep for 5 minutes and check again
        log "Nothing to do, sleeping"
        sleep 300
    fi
done
```
This is the main loop. Here we check every 5 minutes whether there are changes in git (300 seconds, to be friendly to the Git server). If we detect changes, we pull in the latest code, ensure there are no active users, rebuild and restart.
If no changes are pulled in, we simply ensure the process is running. If the server program is not running, we tighten our checking loop to every 30 seconds.
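The up-to-date check at the top of the loop can be pulled out into a standalone helper, which is easier to test in isolation (the name needs_pull is mine; it assumes the current branch has an upstream configured, since `@{u}` fails otherwise):

```shell
# Return success (0) when the upstream has commits we do not have yet.
needs_pull() {
    git fetch --quiet
    [ "$(git rev-parse HEAD)" != "$(git rev-parse '@{u}')" ]
}
```

Comparing the two commit hashes is cheaper than parsing `git pull` output, and `git fetch` keeps the comparison honest without touching the working tree.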
This is not foolproof and you should adjust this for your exact needs. Do not blindly use this script without first reading and understanding it.
Use at your own risk.
In the future I would make the following improvements:
Anyway, I will continue to iterate on the design. I do not want to make it more complex - the benefit of this script is its incredible simplicity. It’s not smart enough to cause too much trouble. Sure, there are some edge cases, but they are fairly obvious edge cases.