Serving Content
Introduction
It has become an important pastime of mine to try to convey
information in a clean and clear format, keeping the information itself
as abstract as possible. There are a few key properties I look for:
- Information is stored in as pure a format as possible, containing as
little as possible about its stylisation and structure. This should not
strip out so much structure that the information becomes meaningless.
(The structure in question may be as extreme as the ordering of letters
or words, or as subtle as the way in which we separate key concepts.)
- The information should be readable in this “raw” format, where it is
possible to make use of the information as is.
- It should be possible to stylise the information to make it more
pleasant or meaningful to the reader, but through an entirely separate
format that interacts with the media. This description of the
stylisation can then go on to be re-used for similar documents.
With these criteria, I hope to explain why I like some formats and
reject others. In the following sections we will discuss the advantages
and disadvantages of a few formats and where I see them going in the
future.
HTML
I start with this one as it is currently the main way in which human
knowledge is stored in the world. Let’s start by identifying advantages
and disadvantages, and then compare against my criteria for a good format.
Advantages
- Simplicity. You have to admire how easy it was for the many amateur
internet users to write in the format to convey ideas and concepts,
capture the imagination of others and ultimately achieve worldwide
adoption. This is not something to be taken lightly.
- Plenty of ways to structure content, including bullet points,
headers, text formatting, etc.
Disadvantages
- Stylisation is highly coupled to the structure of the content.
- The indicators for information structure are unnecessarily large,
such as the long opening and closing tags.
Comparison
- Information is not in the purest possible format: the tags are much
larger than required and offer more features than just structure.
- In its raw format, HTML is not easily readable.
- Information can be styled via CSS and JavaScript, which is very
nice. However, most CSS stylisation can also be defined in the HTML
itself, which often leaves it cluttered and disorganised, and non-ideal
as an information container.
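As a rough illustration of the point about tag size, the sketch below counts how many characters of a small fragment are markup rather than content, for the same two-item list written in HTML and in Markdown. The helper names (TextExtractor, markup_overhead) and the sample list are assumptions made up for this example, and character counting is only a crude proxy for readability.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; tags themselves are ignored.
        self.parts.append(data)


def markup_overhead(html_source):
    """Number of characters in html_source that are tags rather than text."""
    extractor = TextExtractor()
    extractor.feed(html_source)
    extractor.close()
    text = "".join(extractor.parts)
    return len(html_source) - len(text)


# The same two-item list, written in HTML and in Markdown (hypothetical sample).
html_list = "<ul><li>Simplicity</li><li>Structure</li></ul>"
markdown_list = "- Simplicity\n- Structure\n"
content = "SimplicityStructure"

html_cost = markup_overhead(html_list)
markdown_cost = len(markdown_list) - len(content)

print(f"HTML markup characters: {html_cost}")
print(f"Markdown markup characters: {markdown_cost}")
```

For this sample the HTML version spends 27 characters on structure against 6 for the Markdown version, for 19 characters of actual content.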
LaTeX
Here we will look at the merits of the source format, as opposed to
the many outputs it is capable of producing.
Advantages
- Outputs are extremely reproducible.
- Generally accepted as a good system for producing academic
documents, so information is available in a variety of formats.
- Many ways to structure content, including many types of document
(books, articles, presentations, etc.).
Disadvantages
- A lot of content structuring is not necessary for conveying
information.
- LaTeX often makes a lot of decisions about content formatting in the
output that are hard to control.
- Large overhead for simple documents.
- Not easy to learn, as the language is fairly unique.
Comparison
- The information is not in as raw a format as it could be, but
ultimately it is more readable than HTML.
- It shares some of HTML’s pitfalls: the many formatting markers mean
that structure interferes with content.
- Stylisation is highly coupled with content. It is possible to
separate them, but the fact that it’s possible to couple them in the
first place is not ideal.
Markdown
Finally, we are left with the up-and-coming method for representing
information, now used in many wikis and for long-term storage.
Advantages
- Input formatting and output formatting are well defined.
- May be output in a multitude of formats, including PDF and HTML
(which can be done through pandoc, for example).
Disadvantages
- For some aspects there are multiple ways of achieving the same task,
such as the definition of headers, where underlining a line with === and
beginning a line with # both produce a first-level header.
- Allows HTML to be used in-line. Ideally the format should be
stricter in this sense and should have this ability removed in order to
keep the format pure. Removing it could make the format less useful in
practice, but as a method of storing information, allowing in-line HTML
goes against its own core values.
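To make the duplicate header syntax concrete, here is a small sketch that finds first-level headers written either way. The function name first_level_headers is an assumption for this example, and the matching is deliberately simplified: it ignores code blocks, indentation rules and trailing # characters, so it is not a faithful Markdown parser.

```python
import re
from typing import List


def first_level_headers(markdown: str) -> List[str]:
    """Return the text of every first-level header in `markdown`.

    Recognises both spellings: a line starting with a single '#',
    and a line of text underlined with '=' characters.
    """
    headers = []
    lines = markdown.splitlines()
    for index, line in enumerate(lines):
        # Style 1: "# Title" (exactly one '#', then whitespace).
        hash_style = re.match(r"^#\s+(.*\S)\s*$", line)
        if hash_style:
            headers.append(hash_style.group(1))
            continue
        # Style 2: a non-empty line followed by a line made only of '='.
        next_line = lines[index + 1] if index + 1 < len(lines) else ""
        if line.strip() and re.match(r"^=+\s*$", next_line):
            headers.append(line.strip())
    return headers


hash_doc = "# Serving Content\n\nSome text."
underlined_doc = "Serving Content\n===============\n\nSome text."

# Both documents describe the same structure.
print(first_level_headers(hash_doc) == first_level_headers(underlined_doc))
```

Two different byte sequences carrying identical structure is exactly the redundancy described above: any tool reading the raw format has to know about both.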
Comparison
- It would be fair to say that Markdown is very close to having pure
information with minimal markup for structure.
- The information is easily readable in this raw format, with structure
markers actually making the raw information more readable. This is
because structural markers serve their purpose in both raw and compiled
versions of the information.
- As the format can be output as PDF and HTML, additional stylisation
can be applied in those formats, fully decoupled from the original
source.
Conclusion
Markdown by far offers the best method for storing information in a
way that can be future-proofed. As may be obvious from the source of
these pages, I have personally opted to store them in a hybrid of
minimal HTML, Markdown and JavaScript that converts the Markdown on the
fly. This in turn allows for a minimal transfer of information from
server to client whilst still offering all of the content.
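To give a feel for what converting Markdown on the fly involves, here is a minimal sketch written in Python rather than JavaScript. It is not the converter these pages use: the function name render and the tiny supported subset (# headers, single-line "- " bullets and plain paragraphs) are assumptions chosen for illustration. Wrapped bullet lines, emphasis, links and everything else are left out.

```python
import html
import re


def render(markdown: str) -> str:
    """Convert a small Markdown subset to HTML.

    Supported: '#'-style headers, single-line '- ' bullets and
    paragraphs. Raw HTML in the input is escaped, not passed through.
    """
    out = []
    paragraph = []
    in_list = False

    def flush_paragraph():
        # Emit any buffered paragraph lines as one <p> element.
        if paragraph:
            out.append("<p>" + " ".join(paragraph) + "</p>")
            paragraph.clear()

    def close_list():
        nonlocal in_list
        if in_list:
            out.append("</ul>")
            in_list = False

    for raw in markdown.splitlines():
        line = html.escape(raw.strip(), quote=False)
        header = re.match(r"^(#{1,6})\s+(.*)$", line)
        if header:
            flush_paragraph()
            close_list()
            level = len(header.group(1))
            out.append(f"<h{level}>{header.group(2)}</h{level}>")
        elif line.startswith("- "):
            flush_paragraph()
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{line[2:]}</li>")
        elif not line:
            # A blank line ends the current paragraph or list.
            flush_paragraph()
            close_list()
        else:
            close_list()
            paragraph.append(line)

    flush_paragraph()
    close_list()
    return "\n".join(out)


source = "# Title\n\nSome text\nmore text.\n\n- one\n- two\n"
print(render(source))
```

Escaping the input means in-line HTML is shown as text instead of being interpreted, which matches the stricter behaviour argued for earlier, at the cost of compatibility with ordinary Markdown.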