Coffee Space


Virtual Repository Environment

Here we discuss the learning points from creating the prototype NHS VRE system - mostly complaints about how Java and the internet work.

Reading From Network Stream

Implementing a very basic POST system, I found out some nice things about the way Java handles streams - in particular network streams. When writing a web server from scratch, there's normally no way around having to deal with them at some point to get the performance out of Java.

To start with, we read in the first chunk to figure out what sort of request we are dealing with - GET or POST (HEAD is not implemented on this server). In some more code above this excerpt, we figure out the positions of everything we are interested in, such as where the binary starts, how long the binary is, what the markers are for the binary, etc.
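The code that inspects the first chunk is not shown in this write-up, so the following is only a minimal sketch of the idea rather than the server's actual code. The class name RequestLine and both method names are made up for illustration. It picks out the request method and the point where the request headers end; for a multipart upload, the part headers and the binary markers would still need to be located after that point.

```java
import java.nio.charset.StandardCharsets;

/* Hypothetical helper - names are illustrative, not taken from the VRE server */
class RequestLine {
  /* Returns the method token ("GET", "POST", ...) from the first chunk,
     or null if no space-terminated token is found on the first line. */
  static String method(byte[] chunk, int len) {
    for (int i = 0; i < len; i++) {
      if (chunk[i] == ' ') {
        return new String(chunk, 0, i, StandardCharsets.US_ASCII);
      }
      if (chunk[i] == '\r' || chunk[i] == '\n') {
        break; // Line ended without a space - malformed request line
      }
    }
    return null;
  }

  /* Returns the index just past the first blank line (CRLF CRLF), which is
     where the request body starts, or -1 if the headers are incomplete. */
  static int bodyStart(byte[] chunk, int len) {
    for (int i = 0; i + 3 < len; i++) {
      if (chunk[i] == '\r' && chunk[i + 1] == '\n'
          && chunk[i + 2] == '\r' && chunk[i + 3] == '\n') {
        return i + 4;
      }
    }
    return -1;
  }
}
```

Both methods take the number of valid bytes as len, since a read rarely fills the whole buffer.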

In this state, we start with the following to make sure it's worth going into a heavy read mode:

/* Write the file */
int rSize = r;
filesize -= r;
/* Make sure end is not in data to be written */
if(filesize > readSize){

Next we begin reading into the file as fast as we can:

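  /* Write the part of the first chunk that belongs to the file */
  os.write(request, startFileBin, r - startFileBin);
  rSize = filesize > readSize ? readSize : filesize;
  r = is.read(request, 0, rSize);
  /* Only count bytes actually read; r is -1 at end of stream */
  if(r > 0){
    filesize -= r;
  }
  /* Stop at end of stream so a negative length is never passed to write() */
  while(filesize > readSize && r >= 0){
    os.write(request, 0, r);
    rSize = filesize > readSize ? readSize : filesize;
    r = is.read(request, 0, rSize);
    if(r > 0){
      filesize -= r;
    }
  }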

Note that r can be 0, meaning no bytes were read - yet there may still be bytes to come from the stream! Worse yet, we can read -1 to indicate "end of stream" - but still not have collected all the bytes we expected, contrary to what the documentation would suggest. This can be due to the client having a slow hard disk, the network being slow or a multitude of other reasons.
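The general shape of a read loop that copes with short reads can be sketched as below. This is a simplified illustration, not the server's code: the class StreamCopy and the method copyExactly are hypothetical names, and bufSize is assumed to be greater than zero. The key points are that the return value of read is checked every time, only the bytes actually returned are written, and -1 stops the loop without being subtracted from any counter.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/* Hypothetical helper - names are illustrative, not taken from the VRE server */
class StreamCopy {
  /* Copies up to 'length' bytes from in to out and returns how many were
     actually copied, which is less than 'length' if the stream ends early. */
  static long copyExactly(InputStream in, OutputStream out, long length, int bufSize)
      throws IOException {
    byte[] buf = new byte[bufSize];
    long copied = 0;
    while (copied < length) {
      int want = (int) Math.min(buf.length, length - copied);
      int r = in.read(buf, 0, want);
      if (r < 0) {
        break; // End of stream before 'length' bytes arrived
      }
      out.write(buf, 0, r); // r may be smaller than 'want'
      copied += r;
    }
    return copied;
  }
}
```

The caller compares the returned count against the expected length to decide whether the upload was cut short.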

For this reason, we do a very careful read of the end:

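  /* Buffer for the tail of the upload; filesize now tracks its free space */
  byte[] oRequest = new byte[readSize * 4];
  /* r is -1 if the stream already ended, so never copy a negative length */
  int oReqLen = r > 0 ? r : 0;
  System.arraycopy(request, 0, oRequest, 0, oReqLen);
  filesize = (readSize * 4) - oReqLen;
  try{
    while(filesize > 0 && r >= 0){
      rSize = filesize > readSize ? readSize : filesize;
      r = is.read(oRequest, oReqLen, rSize);
      /* Only count bytes actually read; -1 means end of stream */
      if(r > 0){
        oReqLen += r;
        filesize -= r;
      }
    }
  }catch(SocketTimeoutException e){
    /* Keep whatever arrived before the timeout */
    r = e.bytesTransferred;
    oReqLen += r;
  }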

Whatever we are able to read from the stream, we hunt for the end of the file by running KMPMatch over the bytes to find the boundary marker, then step back to the carriage return just before it.

  int endFileBin = KMPMatch.indexOf(oRequest, bound.getBytes());
  for(int x = endFileBin - 1; x >= 0; x--){
    if(oRequest[x] == '\r'){
      endFileBin = x;
      break;
    }
  }
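The KMPMatch class itself is not shown here. Assuming it is a Knuth-Morris-Pratt search over bytes, as the name suggests, a minimal sketch of such a search might look like the following. ByteSearch is a made-up name and this is not the class used by the server.

```java
/* Hypothetical stand-in for a KMP byte search - not the server's KMPMatch */
class ByteSearch {
  /* Returns the index of the first occurrence of pattern in data, or -1. */
  static int indexOf(byte[] data, byte[] pattern) {
    if (pattern.length == 0) {
      return 0;
    }
    /* fail[i] is the length of the longest proper prefix of pattern[0..i]
       that is also a suffix of it */
    int[] fail = new int[pattern.length];
    int k = 0;
    for (int i = 1; i < pattern.length; i++) {
      while (k > 0 && pattern[i] != pattern[k]) {
        k = fail[k - 1];
      }
      if (pattern[i] == pattern[k]) {
        k++;
      }
      fail[i] = k;
    }
    /* Scan the data once, falling back through the table on a mismatch */
    k = 0;
    for (int i = 0; i < data.length; i++) {
      while (k > 0 && data[i] != pattern[k]) {
        k = fail[k - 1];
      }
      if (data[i] == pattern[k]) {
        k++;
      }
      if (k == pattern.length) {
        return i - pattern.length + 1;
      }
    }
    return -1;
  }
}
```

The appeal of this approach for a boundary hunt is that the data is only ever scanned forwards, so no byte of the upload is compared more than a small constant number of times.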

Saving something is better than saving nothing, so if we cannot find the end we write out whatever we have - although this has never yet happened.

  /* If we fail to find the end, spit out what we have */
  if(endFileBin < 0){
    /* The best end we have */
    endFileBin = oReqLen;
  }
  os.write(oRequest, 0, endFileBin);
  System.out.println("File written."); // TODO: Remove me.

If the file was never that big, there is no need to load anything other than what we already have.

}else{
  /* Only a small file - only process current request */
  int endFileBin = KMPMatch.indexOf(request, startFileBin, bound.getBytes());
  for(int x = endFileBin - 1; x >= 0; x--){
    if(request[x] == '\r'){
      endFileBin = x;
      break;
    }
  }
  /* If we fail to find the end, spit out what we have */
  if(endFileBin < 0){
    endFileBin = r;
  }
  os.write(request, startFileBin, endFileBin - startFileBin);
}

Browsers Uploading Files

It appears that browsers are limited to uploads of 2GB, at least in the case of 32-bit and 64-bit Firefox. This will become more of a problem in the future as files get larger and larger due to the ever more demanding users of our systems.
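Whatever the cause on the browser side, 2GB is also roughly where a signed 32-bit int runs out (Integer.MAX_VALUE is 2147483647), and the excerpts above appear to track filesize in an int. The following is a small sketch of handling sizes as a long instead, assuming the size arrives as a Content-Length header value. UploadSize and its methods are hypothetical names, not part of the server.

```java
/* Hypothetical helper - names are illustrative, not taken from the VRE server */
class UploadSize {
  /* Parses a Content-Length value as a long; returns -1 if it is invalid. */
  static long parseContentLength(String value) {
    try {
      long n = Long.parseLong(value.trim());
      return n >= 0 ? n : -1;
    } catch (NumberFormatException e) {
      return -1;
    }
  }

  /* True if the size could be held in an int counter without overflowing. */
  static boolean fitsInInt(long n) {
    return n >= 0 && n <= Integer.MAX_VALUE;
  }
}
```

A server could use fitsInInt to reject or specially handle oversized uploads up front, rather than finding out halfway through the read.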

Conclusion

This write-up is far from complete, with only a fraction of the problems discussed here. Hopefully this provides some insight into the difficulties and problems associated with writing an HTTPS web server from scratch to handle GET, POST and a database securely.

Who knows what the future holds?