`loader.js` is what is now used to convert pages on this site from a very strict raw markdown format to HTML. In the sections below we will deconstruct the code and explain how it works.
There have been some design changes to remove markdown’s many-to-one issue, where several different markers produce the same formatting. One example is headers, where putting a `#` before a word and a row of `=` underneath a word both result in an `h1` HTML header.
There are also a few structural changes to the format to make it easier to build parsers for, one being the way in which lists are created. Lists now require exactly two leading spaces, rather than any number fewer than four. Again, this removes the idea that multiple different inputs lead to the same output - something I consider to be messy design.
Another change from the original specification is where HTML is escaped and where it is not. Only code indented by four spaces is escaped; everything else follows the rules of HTML.
The last difference is another one that makes parsing much easier - each line is treated separately from all others. This makes parsing quicker and simpler, but does lead to a high number of elements. The assumption is that modern browsers can handle this and should be able to represent these as lightweight elements in RAM.
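The per-line model described above can be illustrated with a minimal sketch. The names `convertLine` and `convertText` are hypothetical and not part of `loader.js`; the only rule shown is the blank-line rule that appears in the source further down.

```javascript
// Hypothetical helper: converts one line with no knowledge of its neighbours.
function convertLine(line) {
  if (line.length === 0) {
    // Blank lines become breaks, mirroring the first test in loader.js.
    return "<br /><br />";
  }
  return line;
}

// Each line is handled independently, so a single pass is enough.
function convertText(text) {
  return text.split("\n").map(convertLine).join("");
}
```

Because no line depends on another, the lines could in principle be converted in any order, or in small batches, which is what the tasking system later in the file relies on.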
The latest version of the code can be found here.
Hopefully the code you see below meets all of those expectations!
Here are some simple cases where the code can be used:
    # Header 1
    ## Header 2
    ### Header 3
    #### Header 4
    ##### Header 5
    ###### Header 6
Note that, in keeping with the single-pass “per-line” model, equals signs and minuses under text do not produce headers. Why markdown offers two ways of doing this is not clear to me, as it makes documents harder to read.
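As a rough sketch of the header rule, the function below counts the leading hashes and wraps the rest of the line. `headerLine` is a hypothetical name, and the sketch makes two assumptions that differ from `loader.js`: it uses a regular expression instead of a character loop, and it leaves lines without a space after the hashes untouched.

```javascript
// Hypothetical helper, not part of loader.js.
function headerLine(line) {
  // One to six hashes, exactly one space, then the header text.
  var match = /^(#{1,6}) (.*)$/.exec(line);
  if (match === null) {
    return line; // not a header, leave untouched
  }
  var level = match[1].length;
  return "<h" + level + ">" + match[2] + "</h" + level + ">";
}
```

Since the decision is made from the current line alone, no look-ahead to an underline on the following line is ever needed.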
        Code after four spaces!
Code blocks work after four spaces. These will escape everything but `xmp` tags.
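The escaping applied to code lines can be sketched as follows. `escapeCode` and `codeLine` are hypothetical names; the split/join approach mirrors the one in the source, but the plain `pre` wrapper here is a simplification of what `loader.js` emits.

```javascript
// Hypothetical helper: replaces the characters HTML treats specially.
// The ampersand must be handled first so later entities are not re-escaped.
function escapeCode(text) {
  return text
    .split("&").join("&amp;")
    .split("<").join("&lt;")
    .split(">").join("&gt;")
    .split('"').join("&quot;");
}

// Hypothetical helper: only lines indented by four spaces are escaped.
function codeLine(line) {
  if (line.slice(0, 4) !== "    ") {
    return line;
  }
  return "<pre>" + escapeCode(line.slice(4)) + "</pre>";
}
```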
    Code can also go `here` too.
In-line code is not escaped in any way; this allows for ease of processing, as multiple types of formatting can be used inside.
List items are basically any line starting with two spaces, followed by any string with no spaces, followed by a space. This allows for any type of numbering system the user chooses. Some examples are below:
      * Un-ordered lists
      * Un-ordered lists

      1. Numbered lists
      2. Numbered lists

      a. Lettered lists
      b. Lettered lists

      i. Roman lists
      ii. Roman lists
You get the point!
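The list rule (two spaces, a marker with no spaces, then a space) can be expressed as a small matcher. This is only a sketch: `listLine` is a hypothetical name, and it uses a regular expression where `loader.js` checks characters one at a time.

```javascript
// Hypothetical helper: returns the marker and text of a list line,
// or null when the line does not follow the list format.
function listLine(line) {
  // Exactly two spaces, a run of non-space characters, one space, the rest.
  var match = /^ {2}(\S+) (.*)$/.exec(line);
  if (match === null) {
    return null;
  }
  return { marker: match[1], text: match[2] };
}
```

Because the marker is any run of non-space characters, `*`, `1.`, `a.` and `ii.` are all accepted without the parser knowing anything about numbering schemes.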
Links are simply in the following format:
    [Link text](Link URL)
Images are simply in the following format:
    ![Image alternative text](Image URL)
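A compact way to picture the link and image formats is the sketch below. `inlineMedia` is a hypothetical name, and the regular expressions are a substitution for the split and `indexOf` approach that `loader.js` actually uses; the order (images before links) matches the source.

```javascript
// Hypothetical helper, not part of loader.js.
function inlineMedia(line) {
  // Images first, so the "[" inside "![" is not mistaken for a link.
  line = line.replace(/!\[([^\]]*)\]\(([^)]*)\)/g, '<img alt="$1" src="$2">');
  // Whatever bracket pairs remain are ordinary links.
  line = line.replace(/\[([^\]]*)\]\(([^)]*)\)/g, '<a href="$2">$1</a>');
  return line;
}
```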
Text formatting has been kept very simple and universal. The following options work well for formatting most text.
    **Two stars either side of the text do the trick**
    *One star either side of the text*
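The bold and italic markers work as alternating open and close tags. A minimal sketch of that idea, assuming hypothetical helpers `replacePairs` and `emphasis` that are not in `loader.js`:

```javascript
// Hypothetical helper: replaces each occurrence of `marker`, alternating
// between the opening and the closing tag.
function replacePairs(line, marker, openTag, closeTag) {
  var open = true;
  var pos = line.indexOf(marker);
  while (pos >= 0) {
    var tag = open ? openTag : closeTag;
    line = line.slice(0, pos) + tag + line.slice(pos + marker.length);
    open = !open;
    // Continue searching after the tag that was just written.
    pos = line.indexOf(marker, pos + tag.length);
  }
  return line;
}

// Bold must run before italics, otherwise "**" would be read as two "*".
function emphasis(line) {
  line = replacePairs(line, "**", "<b>", "</b>");
  return replacePairs(line, "*", "<i>", "</i>");
}
```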
    /**
     * Loader
     *
     * This loader is responsible for loading pages from source without blocking
     * the browser the code is run in. In this loader, pages are converted from
     * MarkDown to HTML for browser viewing. Please use the source below for
     * information on how the conversion is performed.
     *
     * NOTE: This is a simplified version of markdown and is designed to carry
     *       little if any complexity in order to increase usability and speed. If
     *       bugs are found, please contact B[].
     *
     * Features:
     *   * Headers
     *   * Lists
     *   * Code blocks
     *   * Links
     *   * Images
     *   * Bold
     *   * Italics
     *   * In-line code blocks
     *
     * @author B[]
     * @version 1.0.8
     **/
Just comments about the file.
    /* ---- Constants ---- */

    var TASK_BREAK_TIME_MS;
    var TASK_PROCESS_TIME_MS;
    var TAB_MAX;
Keep track of some basic information, such as the time to allow the browser to break for and the target time to process for, stored as `TASK_BREAK_TIME_MS` and `TASK_PROCESS_TIME_MS` respectively. `TAB_MAX` stores the tab width, which the code below uses to pad list markers to a common width.
    /* ---- Global Variables ---- */

    var tasksScanned;
    var tasksCompleted;
`tasksScanned` stores the number of tasks found whilst searching for tasks and `tasksCompleted` stores the number of tasks that have been run.
    /* ---- Stack ---- */

    var stackTimeout;
    var taskStack;
    var varStack;
`stackTimeout` is pre-calculated for the future timeout of the stack, instead of re-calculating every time. The `taskStack` is an array of tasks to be completed, accompanied by the `varStack`, containing useful variables for the `taskStack`. It’s important these remain in sync, otherwise tasks will get the wrong variables back!
    /* ---- Variables ---- */

    var elements;
Keep a copy of the elements we are replacing; this can help reduce some errors in browsers with poor JS implementations.
    /**
     * Loader()
     *
     * The task runner for the entire program. This function is responsible for
     * scanning the entire page and converting MarkDown to valid HTML.
     **/
    function Loader(){
      /* Initialise tasking system */
      init();
Initialise the variables - we have no idea whether we have run this previously on the same page.
      /* Task payloads */
      elements = document.getElementsByName("md");
      for(var i = 0; i < elements.length; i++){
        /* Add task to remove element */
        addTask(function(){
          var elem = getVar();
          elem.innerHTML = "";
        });
Add a task to clear all of the elements, ready for our new data.
        /* Add reference to element to be cleaned */
        addVar(elements[i]);
The clearing tasks will need some elements to clean.
        /* Process the element lines */
        var lines = elements[i].innerHTML.split("\n");
        for(var e = 0; e < lines.length; e++){
For each line, perform our task processing (in the future).
          /* Add task to task stack */
          addTask(function(){
Add the function below as a task to be processed for each line. It’s relatively heavy, but you can guarantee the code will work each time.
            var elem = getVar();
            var line = getVar();
            var skip = false;
            /* <<<< Entire line tests >>>> */
            if(line.length == 0){
              line = "<br /><br />";
            }
            /* <<<< Start of line tests >>>> */
            if(line[0] == '#'){
              var temp = line;
              /* Find out type of header */
              var len = line.length;
              var h = 1;
              for(var z = 1; z < len; z++){
                if(line[z] == '#'){
                  h++;
                }else{
                  /* Make sure next character is space */
                  if(line[z] == ' '){
                    /* Remove previous markers */
                    temp = line.slice(h + 1);
                  }
                  z = line.length;
                }
              }
              /* Add HTML */
              temp = "<h" + h + ">" + temp + "</h" + h + ">";
              /* Replace line for searching */
              line = temp;
            }
            if(line[0] == ' '){
              if(line[1] == ' '){
                /* Check whether we have a list or potential code block */
                if(line[2] == ' '){
                  /* Check whether we have code block */
                  if(line[3] == ' '){
                    /* Escape the string (ampersand first, so entities survive) */
                    temp = line.slice(4);
                    temp = temp.split('&').join("&amp;");
                    temp = temp.split('<').join("&lt;");
                    temp = temp.split('>').join("&gt;");
                    temp = temp.split('"').join("&quot;");
                    /* Check the length, add some space if zero */
                    if(temp.length <= 0){
                      temp += ' ';
                    }
                    /* Throw some pre-tags around it */
                    line = "<pre name=\"code\" style=\"margin:0px;\">" + temp + "</pre>";
                    skip = true;
                  }
                }else{
                  /* Indent the list */
                  var point = line.slice(2).split(" ");
                  var pointLen = point[0].length;
                  if(point[0] == "*"){
                    point[0] = "· ";
                  }
                  var temp = "<tt name=\"list\"> " + point[0];
                  for(var z = point[0].length; z < TAB_MAX; z++){
                    temp += " ";
                  }
                  temp += "</tt>" + line.slice(2 + pointLen);
                  line = temp + "<br />";
                }
              }
            }
            /* <<<< Middle of line tests >>>> */
            /* Only perform tests if we shouldn't be skipping */
            if(!skip){
              var temp = "";
              var images = line.split("![");
              if(!(images.length == 1 && !(images[0] == '!' && images[1] == '['))){
                for(var z = 0; z < images.length; z++){
                  var endS = images[z].indexOf(']');
                  var begC = images[z].indexOf('(', endS);
                  var endC = images[z].indexOf(')', begC);
                  /* If invalid, skip over */
                  if(endS < 0 || begC < 0 || endC < 0 || endS + 1 != begC){
                    /* Put everything back as it was */
                    if(z > 0){
                      temp += "![";
                    }
                    temp += images[z];
                  }else{
                    temp += "<img alt=\"";
                    temp += images[z].slice(0, endS);
                    temp += "\" src=\"";
                    temp += images[z].slice(begC + 1, endC);
                    temp += "\">";
                    /* Add everything that wasn't part of the breakup */
                    temp += images[z].slice(endC + 1);
                  }
                }
                line = temp;
              }
              temp = "";
              var links = line.split("[");
              if(!(links.length == 1 && line[0] != '[')){
                for(var z = 0; z < links.length; z++){
                  var endS = links[z].indexOf(']');
                  var begC = links[z].indexOf('(', endS);
                  var endC = links[z].indexOf(')', begC);
                  /* If invalid, skip over */
                  if(endS < 0 || begC < 0 || endC < 0 || endS + 1 != begC){
                    /* Put everything back as it was */
                    if(z > 0){
                      temp += "[";
                    }
                    temp += links[z];
                  }else{
                    temp += "<a href=\"";
                    temp += links[z].slice(begC + 1, endC);
                    temp += "\">";
                    temp += links[z].slice(0, endS);
                    temp += "</a>";
                    /* Add everything that wasn't part of the breakup */
                    temp += links[z].slice(endC + 1);
                  }
                }
                line = temp;
              }
              var pos = 0;
              while(pos >= 0){
                /* Search for first instance */
                pos = line.indexOf("**");
                if(pos >= 0){
                  /* Replace first instance */
                  line = line.slice(0, pos) + "<b>" + line.slice(pos + 2);
                  /* Search for second instance */
                  pos = line.indexOf("**");
                  if(pos >= 0){
                    /* Replace second instance */
                    line = line.slice(0, pos) + "</b>" + line.slice(pos + 2);
                  }
                }
              }
              pos = 0;
              while(pos >= 0){
                /* Search for first instance that doesn't start with spaces */
                pos = line.indexOf("*");
                if(pos >= 0){
                  /* Replace first instance */
                  line = line.slice(0, pos) + "<i>" + line.slice(pos + 1);
                  /* Search for second instance */
                  pos = line.indexOf("*");
                  if(pos >= 0){
                    /* Replace second instance */
                    line = line.slice(0, pos) + "</i>" + line.slice(pos + 1);
                  }
                }
              }
              pos = 0;
              while(pos >= 0){
                /* Search for first instance that doesn't start with spaces */
                pos = line.indexOf("`");
                if(pos >= 0){
                  /* Replace first instance */
                  line = line.slice(0, pos) + "<pre class=\"inline\">" + line.slice(pos + 1);
                  /* Search for second instance */
                  pos = line.indexOf("`");
                  if(pos >= 0){
                    /* Replace second instance */
                    line = line.slice(0, pos) + "</pre>" + line.slice(pos + 1);
                  }
                }
              }
            }
            /* Add line to element */
            elem.innerHTML += line;
          });
          /* Add reference to elements */
          addVar(elements[i]);
          /* Allow function to access line */
          addVar(lines[e]);
Add the required variables to reference each line.
        }
        /* Add task to swap elements XMP for P */
        addTask(function(){
          var elem = getVar();
          var nElem = document.createElement('p');
          nElem.innerHTML = elem.innerHTML;
          elem.parentNode.insertBefore(nElem, elem);
          elem.parentNode.removeChild(elem);
        });
Replace the `xmp` tags with `p` tags. We want some pretty formatting after all.
        /* Add reference to element to be cleaned */
        addVar(elements[i]);
Replacing the `xmp` tags will require a reference to them.
      }
      /* Process tasks */
      process();
Finally, let’s start actually completing some tasks!
    }

    /**
     * init()
     *
     * The initialiser for the tasking system.
     **/
    function init(){
      /* Allow the browser to process other tasks */
      TASK_BREAK_TIME_MS = 128;
A power of two will do nicely to allow the browser time to recover.
      /* Time to process the tasks for */
      TASK_PROCESS_TIME_MS = 256;
Again, a power of two for processing time, double that of `TASK_BREAK_TIME_MS`; any less and it may not even be worth waking up again, given the overhead of preparing the next task.
      /* Record progress completion */
      tasksScanned = 0;
      tasksCompleted = 0;
Reset the task status variables, who knows - this may not be our first rodeo on this page!
      /* The overall task stack */
      taskStack = [];
      /* The variable stack */
      varStack = [];
      /* Set tab space size */
Empty out the stacks, we don’t want the possibility of processing any previous tasks.
      TAB_MAX = 4;
Set the tab size for code to four.
    }

    /**
     * addTask()
     *
     * Adds task to back of task list and increment task count.
     **/
    function addTask(func){
      taskStack.push(func);
      tasksScanned++;
    }
This function simply adds tasks in a well-defined way. In the future it may be hashed in some way to make searching easier.
    /**
     * addVar()
     *
     * Adds a variable to the variable stack.
     **/
    function addVar(v){
      varStack.push(v);
    }
This simply allows variables to be added for the current task. Multiple variables may be added for a task, but they must all be read using `getVar()` so that the next task reads its own variables correctly.
    /**
     * getVar()
     *
     * Gets a variable and removes it from the list.
     **/
    function getVar(){
      var r = varStack[0];
      varStack.shift();
      return r;
    }
The `getVar()` function makes sure that the used variables are removed from the array.
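The need to keep tasks and variables in sync can be seen in a small self-contained sketch. The arrays `taskQueue`, `varQueue` and `seen` are hypothetical stand-ins, and the three functions are simplified copies of the ones above. Note that because values are pushed onto the back and shifted off the front, both arrays behave as first-in, first-out queues.

```javascript
// Simplified stand-ins for the loader's task and variable arrays.
var taskQueue = [];
var varQueue = [];

function addTask(func) { taskQueue.push(func); }
function addVar(v) { varQueue.push(v); }
function getVar() { return varQueue.shift(); }

var seen = [];

// First task registers two variables, so it must read exactly two.
addTask(function () { seen.push(getVar() + getVar()); });
addVar("a");
addVar("b");

// Second task registers one variable and reads one.
addTask(function () { seen.push(getVar()); });
addVar("c");

// Run the tasks in the order they were added.
taskQueue.forEach(function (task) { task(); });
```

If the first task read only one variable, the second task would receive `"b"` instead of `"c"`, which is exactly the kind of mismatch the text warns about.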
    /**
     * process()
     *
     * Process tasks.
     **/
    function process(){
      /* Make temporary date variable */
      var now = new Date();
We will use the `now` variable to get the current time for this process loop.
      /* Set the stack time out for the future */
      stackTimeout = now.getTime() + TASK_PROCESS_TIME_MS;
Pre-calculate when we need to stop processing the stack and store that information in the global variable `stackTimeout`.
      /* Iterate over tasks that remain */
      var i = 0;
      var run = true;
      for(; i < taskStack.length && run == true; i++){
Start processing through the stack.
        now = new Date();
Regenerate the current time.
        /* Check whether we have run our time */
        if(now.getTime() >= stackTimeout){
Make sure that we haven’t over-run our allowed time to process the tasks.
          /* Break out of the loop */
          run = false;
          /* Decrement indexing */
          i--;
        }else{
          /* Run next stack item */
          taskStack[i]();
          /* Increment number of tasks complete */
          tasksCompleted++;
Process the next task if we have time and record the fact we processed it in `tasksCompleted`.
        }
      }
      console.log("Time Took: " + (now.getTime() + TASK_PROCESS_TIME_MS - stackTimeout) + "ms");
Print a message about the progress for debug purposes. If other people have issues with browser freezing, this information is likely to be useful to them.
      /* When we get here, removed processed items from the stack */
      var tempStack = taskStack.slice(i);
      taskStack = tempStack;
      console.log("Stack Remaining = " + taskStack.length);
Remove stack items that have been used up and record how much is still left to be processed.
      /* Register break out time callback if more processing required */
      if(taskStack.length > 0){
        setTimeout(function(){ process(); }, TASK_BREAK_TIME_MS);
      }
Register our interest in being called back if more processing is required.
    }
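The time-slicing idea in `process()` can be condensed into the sketch below. It is not the loader's code: `runSlice` and `processAll` are hypothetical names, and the clock is passed in as a parameter (an assumption made here so the behaviour can be checked without waiting on real time).

```javascript
// Runs queued tasks until the time budget is spent and returns the tasks
// that are left over. `now` is a function returning the time in ms.
function runSlice(tasks, budgetMs, now) {
  var deadline = now() + budgetMs;
  var i = 0;
  while (i < tasks.length && now() < deadline) {
    tasks[i]();
    i++;
  }
  return tasks.slice(i);
}

// Processes one slice, then yields to the browser before the next one.
function processAll(tasks, budgetMs, breakMs) {
  var remaining = runSlice(tasks, budgetMs, Date.now);
  if (remaining.length > 0) {
    setTimeout(function () {
      processAll(remaining, budgetMs, breakMs);
    }, breakMs);
  }
}
```

With the values from `init()`, a call would look like `processAll(tasks, 256, 128)`: up to roughly 256 ms of work, then a 128 ms pause in which the browser can handle input and rendering.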
Whilst I could have these pages pre-generated and save the end user the hassle and added complexity of building their own pages, I think there is some value in giving the end user a raw information format that is in theory timeless. Another benefit of this method is that the size of the pages comes down dramatically and the pages compress better, since the added size of HTML tags is avoided.
From a design perspective, this method forces pages to follow similar if not identical formatting rules, meaning that the site remains coherent despite small changes in styling and changes in direction over time.