Coffee Space


Loader

loader.js is what is now used to convert pages on this site from a very strict raw markdown format to HTML. In the below sections we will de-construct the code and explain how it works.

There have been some design changes to remove markdown's many-to-one issue, where several different markers produce the same formatting. One example is the headers, where putting a # before a word and putting a row of = characters underneath a word both result in a h1 HTML header.

There are also a few structural changes to the format to make it easier to build parsers for, one being the way in which lists are created. Now exactly two spaces are required, instead of any number fewer than four. Again, this removes the idea that multiple different inputs lead to the same output, something I consider to be messy design.

Another change from the original specification is where HTML is escaped and where it is not. Only code indicated by four spaces is escaped; everything else follows the rules of HTML.

The last difference is another that makes parsing much easier: each line is treated separately from all others. This makes parsing quicker and simpler, but does lead to a high number of elements. It is thought that modern browsers are capable of handling this and should easily be able to abstract these away into lightweight elements in RAM.
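As a rough illustration of the per-line model, the sketch below decides what kind of line it is looking at using nothing but the line itself. It is not the loader's code: `classifyLine` and its return labels are made up for this example, and the rules are taken from the descriptions in this post.

```javascript
// Hypothetical helper: classify one line without looking at its neighbours.
// The labels ("blank", "header", "code", "list", "text") are assumptions
// chosen for this sketch only.
function classifyLine(line) {
  if (line.length === 0) return "blank";
  if (line[0] === '#') return "header";
  // Four leading spaces mark a code line
  if (line.slice(0, 4) === "    ") return "code";
  // Exactly two leading spaces mark a list item
  if (line.slice(0, 2) === "  " && line[2] !== ' ') return "list";
  return "text";
}

console.log(classifyLine("  * item")); // list
```

Because no state is carried between lines, each call can be queued as its own small task.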

The latest version of the code can be found here.

Aims

Hopefully the code you see below meets all of those expectations!

Use

Here are some simple cases showing how the format can be used:

Headers

# Header 1
## Header 2
### Header 3
#### Header 4
##### Header 5
###### Header 6

Note that, in keeping with the "per-line" model that requires only one pass, rows of equals signs or minuses underneath text do not produce headers. Why markdown has two ways of doing this is not fully understood, as it makes documents difficult to read.
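The single-pass header rule can be sketched in isolation as below. This is a minimal illustration rather than the loader's own code: the name `headerToHtml` is hypothetical, and it assumes a header is one to six hashes followed by a space, which is stricter than the loader itself.

```javascript
// Hypothetical helper: convert one "# Title" style line to an HTML header.
// Assumes 1-6 hashes followed by a space; anything else is returned unchanged.
function headerToHtml(line) {
  var h = 0;
  // Count the leading hashes
  while (h < line.length && line[h] === '#') {
    h++;
  }
  if (h < 1 || h > 6 || line[h] !== ' ') {
    return line;
  }
  // Drop the hashes and the space, then wrap what is left
  return "<h" + h + ">" + line.slice(h + 1) + "</h" + h + ">";
}

console.log(headerToHtml("## Header 2")); // <h2>Header 2</h2>
```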

Block Code

    Code after four spaces!

Code blocks work after four spaces. These will escape everything but xmp tags.
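The escaping can be tried on its own with a small sketch like the one below. The function name `escapeCodeLine` is hypothetical; the replacements follow the same order as the loader source further down, with the ampersand handled first so that the entities added afterwards are not escaped a second time.

```javascript
// Hypothetical helper mirroring the escaping order used for code lines.
function escapeCodeLine(line) {
  var text = line.slice(4);              // drop the four-space marker
  text = text.split('&').join("&amp;");  // first, or later entities get re-escaped
  text = text.split('<').join("&lt;");
  text = text.split('>').join("&gt;");
  text = text.split('"').join("&quot;");
  return text;
}

console.log(escapeCodeLine('    if(a < b && c){ print("x"); }'));
// if(a &lt; b &amp;&amp; c){ print(&quot;x&quot;); }
```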

In-line Code

Code can also go `here` too.

In-line code is not escaped in any way. This allows for ease of processing, as multiple types of formatting can be used inside.

Lists

These are basically anything with two spaces, followed by any string with no space, followed by a space. This allows for any type of system the user chooses. Some examples are below:

  * Un-ordered lists
  * Un-ordered lists

  1. Numbered lists
  2. Numbered lists

  a. Lettered lists
  b. Lettered lists

  i. Roman lists
  ii. Roman lists

You get the point!
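The list rule can be expressed as a single pattern. The sketch below is only an illustration: `splitListLine` is a hypothetical name, and it uses a regular expression where the loader itself splits on spaces.

```javascript
// Hypothetical helper: split a list line into its marker and its text.
// Assumes exactly two leading spaces, a marker with no spaces, then one space.
function splitListLine(line) {
  var match = /^ {2}(\S+) (.*)$/.exec(line);
  if (match === null) {
    return null; // not a list line under this rule
  }
  return { marker: match[1], text: match[2] };
}

console.log(splitListLine("  ii. Roman lists"));
// { marker: 'ii.', text: 'Roman lists' }
```

Any marker style works because the pattern never inspects what the marker actually is.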

Images

Images are simply in the following format:

![Image alternative text](Image URL)
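A compact way to picture the conversion is shown below. This is a sketch under assumptions, not the loader's method: `imagesToHtml` is a hypothetical name and it relies on a regular expression, whereas the loader scans the line by splitting on the opening marker.

```javascript
// Hypothetical helper: convert every ![alt](url) in a line to an <img> tag.
// The closing bracket must be followed directly by the opening parenthesis.
function imagesToHtml(line) {
  return line.replace(/!\[([^\]]*)\]\(([^)]*)\)/g, '<img alt="$1" src="$2">');
}

console.log(imagesToHtml("See ![A cat](cat.png) here"));
// See <img alt="A cat" src="cat.png"> here
```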

Text Formatting

This has been chosen to be very simple and universal. The following options work well for formatting most text.

Bold
**Two stars either side of the text do the trick**
Italics
*One star either side of the text*
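The pairing of markers can be sketched as follows. The names `replacePairs` and `formatText` are hypothetical and the splitting approach is an assumption of this example; what it shares with the loader is that markers are swapped alternately for opening and closing tags, and that bold is handled before italics so that `**` is not read as two single stars.

```javascript
// Hypothetical helper: replace markers alternately with opening and closing tags.
function replacePairs(line, marker, open, close) {
  var parts = line.split(marker);
  var out = parts[0];
  for (var i = 1; i < parts.length; i++) {
    // Odd-numbered markers open a tag, even-numbered markers close it
    out += (i % 2 === 1 ? open : close) + parts[i];
  }
  return out;
}

// Bold first, otherwise "**" would be consumed as two italic markers
function formatText(line) {
  line = replacePairs(line, "**", "<b>", "</b>");
  return replacePairs(line, "*", "<i>", "</i>");
}

console.log(formatText("**bold** and *italic*"));
// <b>bold</b> and <i>italic</i>
```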

Code

/**
 * Loader
 *
 * This loader is responsible for loading pages from source without blocking
 * the browser the code is run in. In this loader, pages are converted from
 * MarkDown to HTML for browser viewing. Please use the source below for
 * information on how the conversion is performed.
 *
 * NOTE: This is a simplified version of markdown and is designed to carry
 *       little if any complexity in order to increase usability and speed. If
 *       bugs are found, please contact B[].
 *
 * Features:
 *   * Headers
 *   * Lists
 *   * Code blocks
 *   * Links
 *   * Images
 *   * Bold
 *   * Italics
 *   * In-line code blocks
 *
 * @author B[]
 * @version 1.0.8
 **/

Just comments about the file.

/* ---- Constants ---- */

var TASK_BREAK_TIME_MS;
var TASK_PROCESS_TIME_MS;
var TAB_MAX;

Keep track of some basic information, such as the time to allow the browser to break for and the target time to process for, stored as TASK_BREAK_TIME_MS and TASK_PROCESS_TIME_MS respectively. TAB_MAX stores the tab width; in the code below it is the width that list markers are padded to.

/* ---- Global Variables ---- */

var tasksScanned;
var tasksCompleted;

tasksScanned stores the number of tasks found whilst searching for tasks and tasksCompleted stores the number of tasks that have been run.

/* ---- Stack ---- */

var stackTimeout;
var taskStack;
var varStack;

stackTimeout is pre-calculated for the future timeout of the stack, instead of re-calculating every time. The taskStack is an array of tasks to be completed, accompanied by the varStack, containing useful variables for the taskStack. It's important these remain in sync, otherwise tasks will get the wrong variables back!

/* ---- Variables ---- */

var elements;

Keep a copy of the elements we are replacing. This can help reduce some errors in browsers with poor JS implementations.

/**
 * Loader()
 *
 * The task runner for the entire program. This function is responsible for
 * scanning the entire page and converting MarkDown to valid HTML.
 **/
function Loader(){
  /* Initialise tasking system */
  init();

Initialise the variables - we have no idea whether we have run this previously on the same page.

  /* Task payloads */
  elements = document.getElementsByName("md");
  for(var i = 0; i < elements.length; i++){
    /* Add task to remove element */
    addTask(function(){
      var elem = getVar();
      elem.innerHTML = "";
    });

Add a task to clear all of the elements, ready for our new data.

    /* Add reference to element to be cleaned */
    addVar(elements[i]);

The clearing tasks will need some elements to clean.

    /* Process the element lines */
    var lines = elements[i].innerHTML.split("\n");
    for(var e = 0; e < lines.length; e++){

For each line, perform our task processing (in the future).

      /* Add task to task stack */
      addTask(function(){

Add the below function code to be processed for each line. It's relatively heavy, but you can guarantee the code will work each time.

        var elem = getVar();
        var line = getVar();
        var skip = false;
        /* <<<< Entire line tests >>>> */
        if(line.length == 0){
          line = "<br /><br />";
        }
        /* <<<< Start of line tests >>>> */
        if(line[0] == '#'){
          var temp = line;
          /* Find out type of header */
          var len = line.length;
          var h = 1;
          for(var z = 1; z < len; z++){
            if(line[z] == '#'){
              h++;
            }else{
              /* Make sure next character is space */
              if(line[z] == ' '){
                /* Remove previous markers */
                temp = line.slice(h + 1);
              }
              z = line.length;
            }
          }
          /* Add HTML */
          temp = "<h" + h + ">" + temp + "</h" + h + ">";
          /* Replace line for searching */
          line = temp;
        }
        if(line[0] == ' '){
          if(line[1] == ' '){
            /* Check whether we have a list or potential code block */
            if(line[2] == ' '){
              /* Check whether we have code block */
              if(line[3] == ' '){
                /* Escape the string */
                temp = line.slice(4);
                temp = temp.split('&').join("&amp;");
                temp = temp.split('<').join("&lt;");
                temp = temp.split('>').join("&gt;");
                temp = temp.split('"').join("&quot;");
                /* Check the length, add a space if it is zero */
                if(temp.length <= 0){
                  temp += ' ';
                }
                /* Throw some pre-tags around it */
                line = "<pre name=\"code\" style=\"margin:0px;\">" + temp + "</pre>";
                skip = true;
              }
            }else{
              /* Indent the list */
              var point = line.slice(2).split(" ");
              var pointLen = point[0].length;
              if(point[0] == "*"){
                point[0] = "&middot;&nbsp;";
              }
              var temp = "<tt name=\"list\">&nbsp;&nbsp;" + point[0];
              for(var z = point[0].length; z < TAB_MAX; z++){
                temp += "&nbsp;";
              }
              temp += "</tt>" + line.slice(2 + pointLen);
              line = temp + "<br />";
            }
          }
        }
        /* <<<< Middle of line tests >>>> */
        /* Only perform tests if we shouldn't be skipping */
        if(!skip){
          var temp = "";
          var images = line.split("![");
          if(images.length > 1){
            for(var z = 0; z < images.length; z++){
              var endS = images[z].indexOf(']');
              var begC = images[z].indexOf('(', endS);
              var endC = images[z].indexOf(')', begC);
              /* If invalid, skip over */
              if(endS < 0 || begC < 0 || endC < 0 || endS + 1 != begC){
                /* Put everything back as it was */
                if(z > 0){
                  temp += "![";
                }
                temp += images[z];
              }else{
                temp += "<img alt=\"";
                temp += images[z].slice(0, endS);
                temp += "\" src=\"";
                temp += images[z].slice(begC + 1, endC);
                temp += "\">";
                /* Add everything that wasn't part of the breakup */
                temp += images[z].slice(endC + 1);
              }
            }
            line = temp;
          }
          temp = "";
          var links = line.split("[");
          if(links.length > 1){
            for(var z = 0; z < links.length; z++){
              var endS = links[z].indexOf(']');
              var begC = links[z].indexOf('(', endS);
              var endC = links[z].indexOf(')', begC);
              /* If invalid, skip over */
              if(endS < 0 || begC < 0 || endC < 0 || endS + 1 != begC){
                /* Put everything back as it was */
                if(z > 0){
                  temp += "[";
                }
                temp += links[z];
              }else{
                temp += "<a href=\"";
                temp += links[z].slice(begC + 1, endC);
                temp += "\">";
                temp += links[z].slice(0, endS);
                temp += "</a>";
                /* Add everything that wasn't part of the breakup */
                temp += links[z].slice(endC + 1);
              }
            }
            line = temp;
          }
          var pos = 0;
          while(pos >= 0){
            /* Search for first instance */
            pos = line.indexOf("**");
            if(pos >= 0){
              /* Replace first instance */
              line = line.slice(0, pos) + "<b>" + line.slice(pos + 2);
              /* Search for second instance */
              pos = line.indexOf("**");
              if(pos >= 0){
                /* Replace second instance */
                line = line.slice(0, pos) + "</b>" + line.slice(pos + 2);
              }
            }
          }
          pos = 0;
          while(pos >= 0){
            /* Search for first instance */
            pos = line.indexOf("*");
            if(pos >= 0){
              /* Replace first instance */
              line = line.slice(0, pos) + "<i>" + line.slice(pos + 1);
              /* Search for second instance */
              pos = line.indexOf("*");
              if(pos >= 0){
                /* Replace second instance */
                line = line.slice(0, pos) + "</i>" + line.slice(pos + 1);
              }
            }
          }
          pos = 0;
          while(pos >= 0){
            /* Search for first instance */
            pos = line.indexOf("`");
            if(pos >= 0){
              /* Replace first instance */
              line = line.slice(0, pos) + "<pre class=\"inline\">" + line.slice(pos + 1);
              /* Search for second instance */
              pos = line.indexOf("`");
              if(pos >= 0){
                /* Replace second instance */
                line = line.slice(0, pos) + "</pre>" + line.slice(pos + 1);
              }
            }
          }
        }
        /* Add line to element */
        elem.innerHTML += line;
      });
      /* Add reference to elements */
      addVar(elements[i]);
      /* Allow function to access line */
      addVar(lines[e]);

Add the required variables to reference each line.

    }
    /* Add task to swap elements XMP for P */
    addTask(function(){
      var elem = getVar();
      var nElem = document.createElement('p');
      nElem.innerHTML = elem.innerHTML;
      elem.parentNode.insertBefore(nElem, elem);
      elem.parentNode.removeChild(elem);
    });

Replace the xmp tags with p tags. We want some pretty formatting after all.

    /* Add reference to element to be cleaned */
    addVar(elements[i]);

Replacing the xmp tags will require a reference to them.

  }
  /* Process tasks */
  process();

Finally, let's start actually completing some tasks!

}

/**
 * init()
 *
 * The initialiser for the tasking system.
 **/
function init(){
  /* Allow the browser to process other tasks */
  TASK_BREAK_TIME_MS = 128;

A nice power of two will do to allow the browser time to recover.

  /* Time to process the tasks for */
  TASK_PROCESS_TIME_MS = 256;

Again, a power of two for processing time, double that of the TASK_BREAK_TIME_MS, otherwise it may not even be worth coming back alive given the overhead of preparing the next task.

  /* Record progress completion */
  tasksScanned = 0;
  tasksCompleted = 0;

Reset the task status variables. Who knows, this may not be our first rodeo on this page!

  /* The overall task stack */
  taskStack = [];
  /* The variable stack */
  varStack = [];

Empty out the stacks, since we don't want the possibility of processing any previous tasks.

  /* Set tab space size */
  TAB_MAX = 4;

Set the tab size to four. In Loader() this is the width that list markers are padded to.

}

/**
 * addTask()
 *
 * Adds task to back of task list and increments the task count.
 **/
function addTask(func){
  taskStack.push(func);
  tasksScanned++;
}

This function simply adds tasks in a well defined way. In the future it may be hashed in some way to make searching easier.

/**
 * addVar()
 *
 * Adds a variable to the variable stack.
 **/
function addVar(v){
  varStack.push(v);
}

This simply allows variables to be added for the current task. Multiple variables may be added for the task, but they all must be read using getVar() in order to correctly allow the next task to read its variables.

/**
 * getVar()
 *
 * Gets a variable and removes it from the list.
 **/
function getVar(){
  var r = varStack[0];
  varStack.shift();
  return r;
}

The getVar() function makes sure that the used variables are removed from the array.
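To see why the two queues have to stay in step, here is a small standalone sketch. It repeats cut-down copies of the three helpers so that it runs on its own; the `log` array and the sample tasks are made up for the example.

```javascript
// Standalone copies of the queue helpers so the sketch runs on its own
var taskStack = [];
var varStack = [];
function addTask(func){ taskStack.push(func); }
function addVar(v){ varStack.push(v); }
function getVar(){ return varStack.shift(); }

// Hypothetical sample tasks: the first reads two variables, the second reads one
var log = [];
addTask(function(){
  var a = getVar();
  var b = getVar();
  log.push(a + b);
});
addVar("first-");
addVar("task");
addTask(function(){
  log.push(getVar());
});
addVar("second");

// Run the tasks in the order they were added
for(var i = 0; i < taskStack.length; i++){
  taskStack[i]();
}
console.log(log); // [ 'first-task', 'second' ]
```

If the first task read only one of its two variables, the second task would receive "task" instead of "second".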

/**
 * process()
 *
 * Process tasks.
 **/
function process(){
  /* Make temporary date variable */
  var now = new Date();

We will use the now variable to get the current time for this process loop.

  /* Set the stack time out for the future */
  stackTimeout = now.getTime() + TASK_PROCESS_TIME_MS;

Pre-calculate when we need to stop processing the stack and store that information in the global variable stackTimeout.

  /* Iterate over tasks that remain */
  var i = 0;
  var run = true;
  for(; i < taskStack.length && run == true; i++){

Start processing through the stack.

    now = new Date();

Regenerate the current time.

    /* Check whether we have run our time */
    if(now.getTime() >= stackTimeout){

Make sure that we haven't over-run our allowed time to process the tasks.

      /* Break out of the loop */
      run = false;
      /* Decrement indexing */
      i--;
    }else{
      /* Run next stack item */
      taskStack[i]();
      /* Increment number of tasks complete */
      tasksCompleted++;

Process the next task if we have time and record the fact we processed it in tasksCompleted.

    }
  }
  console.log("Time Took: " + (now.getTime() + TASK_PROCESS_TIME_MS - stackTimeout) + "ms");

Print a message about the progress for debug purposes. If other people have issues with browser freezing, this information is likely to be useful to them.

  /* When we get here, remove processed items from the stack */
  var tempStack = taskStack.slice(i);
  taskStack = tempStack;
  console.log("Stack Remaining = " + taskStack.length);

Remove stack items that have been used up and record how much is still left to be processed.

  /* Register break out time callback if more processing required */
  if(taskStack.length > 0){
    setTimeout(function(){ process(); }, TASK_BREAK_TIME_MS);
  }

Register our interest in being called back if more processing is required.

}
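The same time-slicing idea can be written in a form that is easy to experiment with. This is a sketch under assumptions rather than a replacement for process(): the names `runSlice` and `runAll` are hypothetical, and the `clock` parameter is an assumed stand-in for the current time so that the behaviour can be checked without waiting on a real timer.

```javascript
// Hypothetical sketch: run queued tasks until a time budget is used up.
// `clock` is any function returning the current time in milliseconds.
function runSlice(tasks, budgetMs, clock) {
  var deadline = clock() + budgetMs;
  var done = 0;
  while (done < tasks.length && clock() < deadline) {
    tasks[done]();
    done++;
  }
  // Whatever did not fit in this slice waits for the next one
  return tasks.slice(done);
}

// Keep running slices, resting for breakMs between them
function runAll(tasks, budgetMs, breakMs) {
  var remaining = runSlice(tasks, budgetMs, Date.now);
  if (remaining.length > 0) {
    setTimeout(function(){ runAll(remaining, budgetMs, breakMs); }, breakMs);
  }
}
```

With the values from init(), the call would be `runAll(tasks, 256, 128)`. A task is never interrupted part way through, so a slice can overrun its budget by the length of one task.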

Conclusion

Whilst I could have these pages pre-generated and save the end user the hassle and added complexity of building the pages themselves, I think there is some value in giving the end user a raw information format that is, in theory, timeless. Another positive to this method is that the pages are dramatically smaller and compress better, as the added size of HTML tags is avoided.

From a design perspective, this method forces pages to follow similar if not identical formatting rules, meaning that the site remains coherent despite small changes in stylisation and changes in direction over time.