A term that refers to the set of computer characters that, when displayed, do not result in anything appearing on the screen.

In POSIX, white space characters are "space", "form feed", "carriage return", "new line", "horizontal tab" and "vertical tab". Most of these characters represent cursor movements.

In Unix, white space is often considered insignificant, meaning that it is ignored when strings are matched or compared.
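As a rough illustration of the two points above, here is a minimal Python sketch. The language is my choice, not anything POSIX mandates, and the names POSIX_SPACE and same_ignoring_whitespace are made up for this example. It lists the six characters and compares two strings while treating any run of white space as a single separator, which is one reasonable reading of "insignificant".

```python
import string

# The six POSIX white space characters named above, keyed by their
# escaped form. POSIX_SPACE is a hypothetical name for this sketch.
POSIX_SPACE = {
    " ": "space",
    "\f": "form feed",
    "\r": "carriage return",
    "\n": "new line",
    "\t": "horizontal tab",
    "\v": "vertical tab",
}


def same_ignoring_whitespace(a, b):
    """Compare two strings, treating any run of white space as one separator."""
    # str.split() with no argument splits on runs of white space and
    # drops leading and trailing white space.
    return a.split() == b.split()


# Python's string.whitespace holds the same six characters.
print(set(string.whitespace) == set(POSIX_SPACE))
print(same_ignoring_whitespace("white   space\t\n", "white space"))
```

Note that this collapses white space rather than deleting it, so "whitespace" and "white space" still compare as different.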

Making your write-ups easier to read


In page layout terms, white space is used to make a document more readable, by giving space for the eyes to rest. In designing documents, for example, wider margins are often used to direct the reader's eyes to the centre of a page.

Generally, the denser the words and paragraphs, the more tiring they are to read. You may witness this for yourself: write-ups with long paragraphs make for more difficult reading, and the longer the paragraphs are, the more difficult it becomes. In E2, you should try to use the <p> tag to open paragraphs rather than the <br> (linebreak) tag.

The <blockquote></blockquote> tags can be used to create horizontal space, to separate out part of an article from the main body. They can also be used to truncate the <hr> horizontal rule, although <hr width=150> (or similar number) will do the same.


In addition, using subheadings relieves the eye of the reader. My personal preference is for a <strong> or <b> tag, which does two things: it creates a resting place for the eye by breaking up a solid body of text, and it alerts the reader that there is another section, with its own 'theme'.

Not that it applies in E2, but in paper documents, the gutter between columns is also a kind of white space.

By checking your write-ups in the Scratch Pad and experimenting with white space, you will render your writing more accessible to your reader. Make it easier to read, and it stands to reason that it is more likely to be read as s/he will feel happier about it.

Thanks to heyoka for her input

~ the evolution of white space


In the ninth century bce, the mercantile Phoenicians of modern-day Lebanon and Syria introduced their trading partners, the Greeks, to the principles of the alphabetic system. The Greeks adapted the Phoenician system to their own phonetics, and the resulting system quickly displaced the primitive Linear B, a syllabic system poorly equipped to deal with Indo-European phonetic systems. The ninth century bce Grecian alphabet became the ancestor of all three national script systems used throughout Europe: modern Greek, Cyrillic, and Latin, which is increasingly being used throughout the world.


Ancient Greek was written entirely without punctuation or letter spacing, in a style called boustrophedon (literally, "like a cow turning," as if to plow a field). In boustrophedon, the first line of text is written from left to right, but the subsequent line is written from right to left, beginning just below where the first line ended. The next line would be written from left to right again, beginning where the second line ended, and each line would alternate its alignment. The result would look something like this:



  ANCIENTGREEKWASWRITTENENTIRELYWITHOUTPUNCTUATIONO
  LLARETILNODEHPORTSUOBDELLACELYTSANIGNICAPSRETTELR
  YLIKEACOWTURNINGASIFTOPLOWAFIELDINBOUSTROPHEDONTH
  BUSEHTTUBTHGIROTTFELMORFNETTIRWSITXETFOENILTSRIFE
  SEQUENTLINEISWRITTENFROMRIGHTTOLEFTBEGINNINGJUSTW
  ORFNETTIRWEBDLUOWENILTXENEHTDEDNEENILTSRIFEHTEREH
  MLEFTTORIGHTAGAINBEGINNINGWHERETHESECONDLINEENDED
  OWTLUSEREHTTNEMNGILASTIETANRETLADLUOWENILHCAEDNA
  ULDLOOKSOMETHINGLIKETHIS
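For anyone who wants to torture their own prose this way, the layout described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name boustrophedon and the fixed line width are invented here, and real inscriptions were of course not laid out by character count.

```python
import re


def boustrophedon(text, width):
    """Lay out text in capitals with no spaces, alternating line direction."""
    # Keep letters only: no punctuation, no word spacing, one case.
    letters = re.sub(r"[^A-Za-z]", "", text).upper()
    rows = [letters[i:i + width] for i in range(0, len(letters), width)]
    lines = []
    for number, row in enumerate(rows):
        if number % 2 == 1:
            # Odd rows run right to left, so reverse them and push a
            # short final row against the right-hand edge.
            row = row[::-1].rjust(width)
        lines.append(row)
    return "\n".join(lines)


print(boustrophedon("Like a cow turning", 6))
```

With a width of 6, the three printed lines are LIKEAC, NRUTWO and ING.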


The sick, sick thing is that this represented an improvement over Linear B; at least, in this system, you could spell things properly – a feat which was impossible in Linear B, where simple words had to be mangled to fit the restrictions of the syllabary.


Over the next few hundred years, the Greeks began to refine their system, so that by the Classical age (starting c. the sixth century bce), the Greeks had incorporated word spacing into their writing. The Greeks also disposed of boustrophedon, and adopted a strict left-to-right line structure. It was this model that influenced the Etruscan, and later the Roman written languages. Early on, all of these languages were written without spaces, but eventually word breaks were introduced; in monuments these appear as dots, but on paper dots or spaces were used. This system still only incorporated one case of letter, the capital, and still lacked punctuation. Furthermore, it entirely lacked paragraphs; but what is important here is that the innovation of white space had been introduced. This writing looked something like this:


OVER·​THE·​NEXT·​FEW·​HUNDRED·​YEARS·​THE·​GREEKS·​BEGAN·​TO·​REFINE·​THEIR·​SYSTEM·​SO·​THAT·​BY·​THE·​CLASSICAL·​AGE·​STARTING·​C·​THE·​SIXTH·​CENTURY·​BCE·​THE·​GREEKS·​HAD·​INCORPORATED·​WORD·​SPACING·​INTO·​THEIR·​WRITING·​THE·​GREEKS·​ALSO·​DISPOSED·​OF·​BOUSTROPHEDON·​AND·​ADOPTED·​A·​STRICT·​LEFT·​TO·​RIGHT·​LINE·​STRUCTURE·​IT·​WAS·​THIS·​MODEL·​THAT·​INFLUENCED·​THE·​ETRUSCAN·​AND·​LATER·​THE·​ROMAN·​WRITTEN·​LANGUAGES·​EARLY·​ON·​ALL·​OF·​THESE·​LANGUAGES·​WERE·​WRITTEN·​WITHOUT·​SPACES·​BUT·​EVENTUALLY·​WORD·​BREAKS·​WERE·​INTRODUCED·​IN·​MONUMENTS·​THESE·​APPEAR·​AS·​DOTS·​BUT·​ON·​PAPER·​DOTS·​OR·​SPACES·​WERE·​USED·​THIS·​SYSTEM·​STILL·​ONLY·​INCORPORATED·​ONE·​CASE·​OF·​LETTER·​THE·​CAPITAL·​AND·​STILL·​LACKED·​PUNCTUATION·​FURTHERMORE·​IT·​ENTIRELY·​LACKED·​PARAGRAPHS·​BUT·​WHAT·​IS·​IMPORTANT·​HERE·​IS·​THAT·​THE·​INNOVATION·​OF·​WHITE·​SPACE·​HAD·​BEEN·​INTRODUCED·​THIS·​WRITING·​LOOKED·​SOMETHING·​LIKE·​THIS


During the Roman era, uncials were developed – smaller, more rounded forms of letters which would become our modern-day lower case letters. Some forms of punctuation were developed, such as the full stop, and by the Medieval period we had a full repertoire of punctuation. This is the system which was introduced to the Saxons, and which displaced their native runic system (which itself had been more or less similar to the Roman monumental system described above).


In the Medieval period, however, writing material was costly. It would be ridiculously expensive to manufacture paper on a large scale, and difficult to make much paper for oneself. Parchment, the more available writing surface, was made from animal skin, and therefore as expensive as leather. In some monasteries, monks in need of a writing surface for their meditations would create "palimpsests" by actually scraping the existing writing off an old manuscript (sometimes an ancient classic) and writing over it. The economics of writing did not encourage deployment of a proliferation of white space.


Nevertheless, the dialectical and rhetorical importance of paragraphs was well-known. For this reason, paragraphs were marked with a pilcrow, a special piece of punctuation used only for this purpose. The results would look something like this:


during the roman era, uncials were developed – smaller, more rounded forms of letters which would become our modern-day lower case letters. some forms of punctuation were developed, such as the full stop, and by the medieval period we had a full repertoire of punctuation. this is the system which was introduced to the saxons, and which displaced their native runic system (which itself had been more or less similar to the roman monumental system described above). ¶ in the medieval period, however, writing material was costly. it would be ridiculously expensive to manufacture paper on a large scale, and difficult to make much paper for oneself. parchment, the more available writing surface, was made from animal skin, and therefore as expensive as leather. in some monasteries, monks in need of a writing surface for their meditations would create "palimpsests" by actually scraping the existing writing off an old manuscript (sometimes an ancient classic) and writing over it. the economics of writing did not encourage deployment of a proliferation of white space. ¶ nevertheless, the dialectical and rhetorical importance of paragraphs was well-known. for this reason, paragraphs were marked with a pilcrow, a special piece of punctuation used only for this purpose. the results would look something like this.


Two circumstances led to the development of our modern system of writing. One was Johann Gutenberg's invention of moveable type, which allowed printers to plan paragraph justification and line indentation. This meant it was no longer a technical problem to end a paragraph early on a line, and indent the next paragraph, or to space the words out so that the margins were no longer messy, uneven and distracting. The other major development was the invention of more efficient techniques for manufacturing paper. This meant it was also no longer an economic problem to indent and align. This development led to the creation of a book standard of justified paragraphs with indented first lines, the standard still used in almost any paper publication. The use of paper is still a concern, so lines are single spaced and there is only a single hard return between paragraphs, but there is enough white space between words and in paragraph indentations to be able to read for extended periods without getting sick.*


Another important element of white space is the margin itself. Margin white space serves a twofold purpose. First, text in moderately narrower columns tends to be easier to read. Second, in the print medium, a margin gives you something to hold on to when you're reading the book. Obviously, increasing margins increases paper area, which in turn increases the cost of the book to the consumer, which is why cheap books tend to have very narrow margins – which easily get greasy or cover your fingertips with smeary ink. Books which incorporate more white space in the margins tend to be more expensive, but are easier to scan and easier to hold. White space is a luxury.**


Enter desktop publishing. Hypertext changed everything once again, in three different ways:


1: First, publication on the Web became so easy that often, old standards went right out the window. Sometimes this was because users were ignorant of basic principles of sound layout; sometimes it was because they were ignorant of how to render layout in HTML.


2: Sometimes technology didn't support old standards. For example, in straight HTML, there's no way to double-space your work. Also, while it's possible to justify a paragraph and get rid of the ragged right margin which, for two hundred years, has been a sign of amateur and substandard layout; and while it's possible to indent the first line of text, giving an eye-pleasing piece of white space on the first line of each paragraph – it's impossible to do both. Because of concerns about browser cross-compatibility, most people didn't bother to do either.


3: Because hypertext is not displayed on paper, but on a screen, adding more white space only costs in terms of bits, not in terms of paper area. That means that you can massively deploy white space throughout a hypertext page with near-absolute impunity. However, because of the two previous points, it was very difficult for people to take advantage of this freedom; if you can't properly space your text horizontally and vertically, you can't take full advantage of all that white space.


This is the state of things on Everything2.


With the development of style sheet standards by the World Wide Web Consortium, and with some of Unicode's more crazy functions, users and readers have more control over the display of hypertext than ever before. For example, I prefer to have light text on dark backgrounds (so when I'm talking about "white space," I'm actually looking at black space, though the principles are the same), using 160% line spacing in a body paragraph, but a double-return between paragraphs, and indentation. This is not counting (what I like to think of as) my judicious use of wider margins and blocked text, which increases white space even more. This gives a very satisfying amount of white space; for an example, look at this page I designed:

< http://www.untext.com/text_jl_foucault.htm >
(This is no longer active; I'll put up a new url as soon as I can.)


Theoretically, with style sheet hypertext we've reached the high point in the evolution of white space. Space is no longer defined by breaks, returns, and justification; it can be quickly and efficiently defined using direct screen measurements. The only way it could possibly get more efficient would be to use trigonometry and distance to align text blocks, but this would not provide a significant advance over the already powerful cascading style sheets.


White space is your reader's friend. If you want to be your reader's friend, too, you'd better buddy up with white space. The history of white space shows that there has been a clear preference to increase white space wherever this was technically possible and not inordinately costly. For thousands of years, publishers have known the advantages of white space. Text without white space can make the reader sick, or simply make it easier to miss or skip text. It is a fundamental principle of sound design, and for the most part, it will only improve communication through the written media.


* RainDropUp has reminded me of a seeming example where white space is not used. When you submit a manuscript to an editor, the standard is for it to be in a monospaced font. The editorial standard is to double space your sentences, putting two spaces after every full stop instead of one, to make it easier for the editor to scan, annotate, or reckon word count. In publication, though, this white space is omitted. a scar faery corroborates this, adding that as proportional fonts and automatic kerning displaced monospaced fonts, there was greater distinction between words and sentences, making double-spacing of sentences pointless. As karmaflux points out, this is an editorial standard, and is not necessary as a publication standard, since publications are normally made with proportional fonts with justification.


On a somewhat tangential note (which is what footnotes are for, I suppose), French publication standards for punctuation are even heavier on white space. Any piece of punctuation other than a comma, a full stop, an apostrophe, or parentheses, must be preceded and followed by a single space – question marks and exclamation marks can be compounded with quotation marks (which are always angle quotes, and which can be omitted under certain circumstances), but otherwise everything is to be spaced out. So that would look like this:

Finally, Jack's resolve broke ; he asked Reba : « Where were you last night ? I waited until three for you. I was worried sick !»


** Thanks to heyoka for pointing this factor out to me.

Psst -- they have white space outside the Western world too, you know! Let's slowly work our way east...

Semitic

Arabic

The Arabic script is, odd as it may seem, actually a distant relative of our familiar Roman letters, and its rules of white space and punctuation have cross-fertilized with the Greek and Roman styles of writing described above. There are still quite a few differences, mind you!

Arabic is a cursive script, meaning that its letters flow together, and as in the West it uses white space to separate words. Arabic script has no case, omits weak vowels and is written from right to left. However, not all Arabic letters are made equal: certain letters, such as A ( alif ا ), are always followed by white space, even in the middle of a word! Thus, "God is Greatest" -- allahu akbar -- is actually written

الله اكبر
rbk a hll a
To the uninitiated this can be pretty confusing, as for example the only difference between an initial L (laam ل ) and an initial A is that the word may continue after an L, but not an A. Thus, in order to differentiate an L from an A at the end of a word, there is a special final L with a hook at the end. The amount of white space is also often wider between words than within words, but especially with more decorative scripts this alone would not be sufficient.

White space between sentences and paragraphs, on the other hand, is largely unknown in Classical Arabic, as best typified (and still retained) in the Qur'an. The end result is very much like the Medieval writing described by Cletus, except that Qur'anic Arabic has a much wider repertoire of punctuation to insert into the solid block of text. The Western pilcrow (¶) is replaced with a circle-shaped marker ( ۝ ) for the end of a verse (ayah) and another star-shaped marker ( ۞ ) for the end of a chapter (rub el-hizb). Western commas, semicolons and dashes are replaced by drawing little superscript Arabic letters, e.g. a meem means a pause is obligatory, jeem means recommended but not required, saad is not recommended but possible, etc. This system is pretty opaque without extensive study, but it does add to the hypnotic beauty of written Qur'anic verse.

Modern Arabic, on the other hand, uses slightly modified but familiar versions of Western punctuation symbols. The period is still eschewed in Arabic itself, with a wider stretch of white space substituting, but Urdu (which is written with the Arabic script) uses the period. As white space thus acquires syntactic meaning, the preferred means of justification is to stretch the "bar" (tatweel) connecting the characters: بيت and بيـــــت are exactly the same word!
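The tatweel trick is easy to poke at in code, since Unicode gives the stretching bar its own code point, U+0640 (ARABIC TATWEEL). The Python sketch below is only an illustration: the helper names stretch and strip_tatweel are my own, and the caller is assumed to pick a position between two letters that actually join, which the code makes no attempt to check.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, the connecting "bar"


def stretch(word, position, count):
    """Insert `count` tatweels after the letter at index `position`.

    Assumes the letter at `position` joins to the one after it;
    this sketch does not verify that.
    """
    return word[:position + 1] + TATWEEL * count + word[position + 1:]


def strip_tatweel(word):
    """Remove every tatweel, recovering the unstretched spelling."""
    return word.replace(TATWEEL, "")


BAYT = "\u0628\u064a\u062a"  # beh, yeh, teh: the example word above
stretched = stretch(BAYT, 1, 5)

print(len(BAYT), len(stretched))
print(strip_tatweel(stretched) == BAYT)
```

Stretching changes the length of the string but not the word, so software that compares Arabic text usually wants something like strip_tatweel applied first.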

One last tidbit: the mathematical three-dot "therefore" symbol ∴ is originally from Arabic, where it is yet another Qur'anic symbol known as the muanqah and meaning that the word thus marked "therefore" continues from the previous word.

Don't worry too much if you got a few question marks above; most browsers can't quite handle Qur'anic Unicode yet...

Hebrew

Pretty much the same pattern repeats with Hebrew, which is also derived from the same Canaanite scripts as Arabic and Roman. Classical Hebrew, namely the Torah, has its own system of punctuation, but modern Hebrew is written with Western punctuation and formatting. Unlike Arabic, Hebrew is a block script and there are no funky inter-word white space rules.

Greek

Greek and its many, many relatives and lookalikes, such as Armenian, Cyrillic, Ethiopic and Georgian, all employ modern Western white space and punctuation rules. Yes, this is a broad generalization and there are many tiny variances; drop me a note if you know of something really wacky.

Indic

Devanagari

Devanagari, the script used to write Hindi and many other Indian languages, is a left-to-right joined script much like Arabic, except that the letters in a word are always joined by a bar and the rules for ligatures are very, very complex. Words are separated by white space, sentences by a character called danda and verses with a double danda. Modern usage often substitutes the full set of Western punctuation.

Thai

The Thai script and its close cousins Lao and Myanmar are, like Devanagari, descended from the ancient Brahmi script, but they do not separate words at all! White space is only used to separate sentences. Other Western punctuation, like the exclamation point and quotation marks, is used in modern Thai.

CJK

Chinese

The Chinese, on the other hand, had a complete system of writing at the time the Egyptians were still doodling hieroglyphs on pyramid walls. In an ideographic writing system like Chinese each character essentially represents one concept or "word" -- yes, this is a simplification, but it will have to do -- so words are already separate from each other with no need for additional white space. And indeed, for a very long time Chinese was written with no white space or punctuation to speak of: text went from top to bottom in rows marching from right to left, leaving the sentences for the reader to figure out,
      ikok   951    海森
      sefi    62    之林
       tln    73    恋是
       hid    84    人大
Although for short poems and lists line breaks were often inserted at the end of each verse or item, improving readability somewhat. This classical style is still used for things like poetry and Buddhist sutras, which can thus be a royal pain to read since the characters used and their meanings have also tended to change over the millennia... but I digress.

Eventually, in China too the Western punctuation system crept in, once again with a few changes. To prevent confusion with the dots and curlies of the characters themselves, the period became a little circle "。" and the comma shifted direction and became a lot longer, "、".

Japanese

Japanese went the Chinese route and, despite the adoption of its own two kana syllabaries for phonetic writing, never saw the need to adopt white space. In a sentence like 俺が猫を食った, "I ate the cat", the content of the sentence is in the Chinese characters -- 俺 猫 食 -- and the kana syllables -- が を った -- sort out their relationship. This is considerably clearer than Chinese, where you have to rely on word order to figure out whether a particular character is acting as a noun, adjective or verb, and this is in fact one of the rare upsides of the otherwise hideously convoluted Japanese writing system.

After World War II and Japan's almost-wholesale embrace of things and ways Western, the Education Ministry decided to start writing Japanese in horizontal rows from left to right. (This had of course been practiced earlier as well on short texts like signs, but there had been no consensus about the right direction!) However, while Japanese school textbooks are to this day written Western-style, nearly all newspapers, magazines and books retain the old top-to-bottom formatting.

One last quirk: due to the similarity of the Western quote " and the voiced-sound indicator dakuten ゛, Japanese uses its own quotation marks, 「like this」, instead of the Western ones. These are not found in Chinese.

Korean

And Korean outweirds everybody with its Hangul system of writing, which involves packing little kana-like phonetic signs into square boxes. Each Hangul composed character is one syllable and consists of an optional initial consonant, a medial vowel and an optional final consonant (or two). If there is no final consonant, it can be simply omitted, but a missing initial must be indicated by drawing a circle ᄋ, which thus acts as visible white space -- "Hey! There's nothing here!". There is much more to Hangul than this, but this probably isn't the right place to get into it...
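The initial/medial/final structure is regular enough that Unicode encodes precomposed Hangul syllables arithmetically, starting at U+AC00, with 19 initials, 21 medials and 28 final slots (slot 0 meaning "no final"). The Python sketch below pulls a syllable apart using that arithmetic. The function names are invented for this example, and the only indices I rely on are the circle placeholder at initial index 11 and final index 0 for "none".

```python
S_BASE = 0xAC00        # first precomposed Hangul syllable
L_COUNT = 19           # initial consonants
V_COUNT = 21           # medial vowels
T_COUNT = 28           # final consonants, including "none" at index 0
PLACEHOLDER_INITIAL = 11  # index of the silent circle among the initials


def decompose(syllable):
    """Return (initial, medial, final) indices for one Hangul syllable."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < L_COUNT * V_COUNT * T_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(index, V_COUNT * T_COUNT)
    medial, final = divmod(rest, T_COUNT)
    return initial, medial, final


def starts_with_placeholder(syllable):
    """True if the initial slot holds the circle, i.e. no real consonant."""
    return decompose(syllable)[0] == PLACEHOLDER_INITIAL


print(decompose("\uc544"))                # the bare vowel syllable "a"
print(starts_with_placeholder("\uc544"))
print(decompose("\ud55c"))                # "han", as in Hangul
```

The first syllable comes out as (11, 0, 0): a placeholder circle, a vowel, and no final, which is exactly the "Hey! There's nothing here!" case described above.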

Korean was formerly written in Chinese characters with Chinese white space rules (or lack thereof). In modern Hangul space is used to separate both words and sentences, and once again Western punctuation is in common use.

Summary

So in all, while the majority of the world appears less than convinced about the merits of the Roman alphabet, nearly the entire planet has adopted Western rules of white space and punctuation. The exclamation mark and question mark are effectively universal and the comma, period and quotation mark are only slightly less so. Every modern script that I know of uses white space to separate its sentences, and many -- albeit far from all -- also use it between their words.

And thanks fly out to the Unicode Consortium for making this writeup possible.
