Coffee Space


Serving Content

Introduction

It has become an extremely important pastime of mine to try to convey information in a clean and clear format, where the information is considered in as abstract a form as possible. There are a few key concepts I look for:

  1. Information should be stored in as pure a format as possible, containing as little as possible about the stylisation and structure of the information. This should not strip away so much of the necessary structure that the information becomes meaningless. (An extreme example would be the ordering of letters or words; a subtler one is the way in which we separate key concepts.)
  2. The information should be readable in this “raw” format, where it is possible to make use of the information as is.
  3. The information format should be able to be stylised so that it becomes more pleasant or meaningful to the reader, but via a separate format that interacts with the media. This description of stylisation can then go on to be reused for similar documentation.

With these criteria, I hope to explain why I like some formats and reject others. In the following we will discuss the advantages and disadvantages of given types of formats and where I see them in the future.

HTML

I start with this one as it is currently the key way in which human knowledge is stored in the world. Let’s start by identifying advantages and disadvantages, and then compare against my criteria for a good format.

Advantages

  • Simplicity. You have to admire how easy it was for the many amateur internet users to write in the format to convey ideas and concepts, capture the imagination of others and generally achieve worldwide adoption. This is not something to be taken lightly.
  • Plenty of ways to structure content, including bullet points, headers, text formatting, etc.

Disadvantages

  • Stylisation is highly coupled to the structure of the content.
  • There are unnecessarily large indicators for information structure, such as the large opening and closing tags.

Comparison

  • Information is not in the purest possible format: the tags are much larger than required and offer more features than just structure.
  • In its raw format, HTML is not easily readable.
  • Information can be formatted via CSS and JavaScript, which is very nice. However, most CSS stylisation can also be defined directly in the HTML, often making it cluttered and disorganised, and so non-ideal as an information container.
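
As a rough illustration of the "tags are much larger than required" point, the sketch below measures how much of a small document is markup rather than content. The sample strings, the word list and the `markup_overhead` helper are all made up for this example; it is a crude character count, not a rigorous metric.

```python
import re

# Hypothetical sample: one heading and a two-item list, written in both formats.
HTML_SAMPLE = "<h1>Formats</h1>\n<ul>\n<li>HTML</li>\n<li>Markdown</li>\n</ul>\n"
MARKDOWN_SAMPLE = "# Formats\n\n- HTML\n- Markdown\n"
CONTENT_WORDS = ["Formats", "HTML", "Markdown"]


def markup_overhead(source, words):
    """Return the fraction of non-whitespace characters that are markup."""
    total = len(re.sub(r"\s", "", source))
    content = sum(len(word) for word in words)
    return (total - content) / total


html_overhead = markup_overhead(HTML_SAMPLE, CONTENT_WORDS)
markdown_overhead = markup_overhead(MARKDOWN_SAMPLE, CONTENT_WORDS)
print(f"HTML: {html_overhead:.0%}, Markdown: {markdown_overhead:.0%}")
```

For this particular sample, roughly two thirds of the HTML characters are markup, against about one seventh for the Markdown equivalent. Real documents with longer paragraphs will show a smaller gap.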

LaTeX

In this section we will look at the merits of the source format, as opposed to the many outputs it is capable of producing.

Advantages

  • Outputs are extremely reproducible.
  • Generally accepted as a good system for producing academic work, so information is available in a variety of formats.
  • Many ways to structure content, including many types of content (books, articles, presentations, etc).

Disadvantages

  • A lot of content structuring is not necessary for conveying information.
  • LaTeX often makes a lot of decisions about how content is formatted in the output, and these are hard to control.
  • Large overhead for simple documents.
  • Not easy to learn as the language is fairly unique.

Comparison

  • The information is not in as raw a format as it could be, but ultimately it is more readable than HTML.
  • It shares some of HTML’s pitfalls, with the many formatting markers meaning that structure interferes with content.
  • Stylisation is highly coupled with content. It is possible to separate, but the fact it’s possible to couple them in the first place is not ideal.

Markdown

Finally, we are left with the up-and-coming method for representing information, now used in many wikis and for long-term storage.

Advantages

  • Input formatting and output formatting well defined.
  • May be output in a multitude of formats, including PDFs and HTML (which can be done through pandoc for example).

Disadvantages

  • For some aspects there are multiple ways of achieving the same task, such as the definition of headers, where underlining a line with === and beginning a line with # both produce a first-level header.
  • Allows for HTML to be used in-line. Ideally the format should be stricter in this sense and should have this ability removed in order to keep the format pure. Removing it would make the format less useful practically, but as a method of storing information, allowing in-line HTML goes against its own core values.
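
To make the duplicate header syntax concrete, here is a minimal sketch that finds first-level headers written in either style. The `first_level_headers` function is a hypothetical helper written for this example and only handles the simple cases; a real Markdown parser covers many more edge cases.

```python
import re


def first_level_headers(text):
    """Return the titles of first-level headers written in either style."""
    lines = text.splitlines()
    titles = []
    for i, line in enumerate(lines):
        # '#' style: exactly one '#', whitespace, then the title.
        hashed = re.match(r"^#\s+(.*?)\s*#*\s*$", line)
        if hashed:
            titles.append(hashed.group(1))
            continue
        # Underline style: a non-empty line followed by a run of '='.
        has_next = i + 1 < len(lines)
        if has_next and line.strip() and re.match(r"^=+\s*$", lines[i + 1]):
            titles.append(line.strip())
    return titles


sample = "Title A\n=======\n\n# Title B\n\n## Not first level\n"
print(first_level_headers(sample))
```

Both `Title A` and `Title B` are reported as first-level headers, which is exactly the redundancy described above: any tool reading the raw format has to understand two spellings of the same structure.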

Comparison

  • It would be fair to say that Markdown is very close to holding pure information, with minimal additional information for structure.
  • The information is easily readable in this raw format, with structure markers actually making the raw information more readable. This is because structural markers serve their purpose in both raw and compiled versions of the information.
  • As the format can be output as PDF and HTML, additional stylisation can be used in these formats which are very decoupled from the original format.
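
The decoupling of stylisation from content can be sketched with pandoc, where the stylesheet is supplied at conversion time rather than stored in the Markdown. The file names `notes.md`, `notes.html` and `style.css` are placeholders, and the helper functions are written for this example; the sketch assumes pandoc is installed and on the PATH.

```python
import subprocess


def pandoc_command(source, target, stylesheet=None):
    """Build a pandoc invocation; pandoc infers the output format from target."""
    command = ["pandoc", source, "--standalone", "-o", target]
    if stylesheet is not None:
        # --css links an external stylesheet into HTML output.
        command += ["--css", stylesheet]
    return command


def convert(source, target, stylesheet=None):
    """Run pandoc, raising CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, target, stylesheet), check=True)


# Placeholder file names; the same source could be restyled with another stylesheet.
print(" ".join(pandoc_command("notes.md", "notes.html", "style.css")))
```

Swapping `style.css` for another stylesheet changes the presentation without touching `notes.md`. Producing a PDF the same way (a target such as `notes.pdf`) additionally needs a PDF engine, by default a LaTeX installation.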

Conclusion

Markdown by far offers the best method for storing information in a way that can be future-proofed. As may be obvious from the source of these pages, I have personally opted to store these pages in a hybrid of minimal HTML, Markdown and JavaScript for converting the Markdown on the fly. This in turn allows for a minimal transfer of information from server to client whilst offering all of the content.
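
To give a feel for what on-the-fly conversion involves, here is a minimal sketch of a converter for a tiny subset of Markdown. It is written in Python for illustration and is not the JavaScript used on these pages; `render_markdown_subset` is a hypothetical function that only understands `#` headers, `-` bullet lists and plain paragraphs.

```python
import html


def render_markdown_subset(text):
    """Convert '#' headers, '-' bullet lists and paragraphs to HTML."""
    out = []
    in_list = False
    for raw in text.splitlines():
        line = raw.strip()
        is_item = line.startswith("- ")
        # Close an open list as soon as a non-item line appears.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not line:
            continue
        if line.startswith("#"):
            # Count leading '#' characters, capped at the six levels HTML offers.
            level = min(len(line) - len(line.lstrip("#")), 6)
            title = html.escape(line.lstrip("#").strip())
            out.append(f"<h{level}>{title}</h{level}>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{html.escape(line[2:])}</li>")
        else:
            out.append(f"<p>{html.escape(line)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


print(render_markdown_subset("# Title\n\n- a\n- b\n\nText & more"))
```

The server only has to send the short Markdown source; the markup is generated by the client, which is the trade-off described above.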