Formally, a context-free grammar is a finite set of rules. Each rule consists of a symbol (called the left-hand side), an arrow (for cosmetic purposes only), and a sequence of symbols (called the right-hand side). There are two kinds of symbols: terminals and nonterminals. The nonterminals are those symbols that occur on the left-hand side of some rule.

In general discussions about context-free grammars, it is customary to use lowercase letters to denote terminals and uppercase letters to denote nonterminals. For example,


  A -> aAb
  A ->

This is a context-free grammar with one nonterminal (A) and two rules, with three and zero symbols on the right-hand side, respectively.

The purpose of grammars is to describe languages, that is, sets of possible sequences of symbols. Context-free grammars describe languages in a generative way: pick a nonterminal (in this case, A) and apply the rules until there are no nonterminals left, and you have a sequence of (terminal) symbols; do this in every possible way, and you have a set of such sequences. This set can be infinitely large.
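This generative process can be sketched in a few lines of code. The sketch below encodes the example grammar in an ad-hoc way (a dictionary mapping each nonterminal to its right-hand sides, with uppercase letters as nonterminals, as in the text) and applies the rules in every possible way, pruned to a maximum length so the search stays finite:

```python
# Grammar: A -> aAb | (empty). Uppercase letters are nonterminals,
# lowercase letters are terminals. This encoding is an illustrative
# choice, not a standard format.
RULES = {"A": ["aAb", ""]}

def generate(max_length):
    """Collect every fully terminal string of at most max_length
    symbols that the grammar can derive, starting from A."""
    results = set()
    frontier = ["A"]                       # sentential forms still to expand
    while frontier:
        form = frontier.pop()
        # Find the first nonterminal in the form, if any.
        nt = next((c for c in form if c.isupper()), None)
        if nt is None:
            results.add(form)              # all terminals: a finished sentence
            continue
        i = form.index(nt)
        for rhs in RULES[nt]:
            new = form[:i] + rhs + form[i + 1:]
            # Prune forms that already carry too many terminals.
            if sum(c.islower() for c in new) <= max_length:
                frontier.append(new)
    return sorted(results, key=len)

print(generate(6))   # ['', 'ab', 'aabb', 'aaabbb']
```

Without the length bound the set would be infinite, which is exactly the point: a finite grammar can describe an infinitely large language.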

The example grammar, for instance, describes the language "all sequences consisting of a number of 'a's followed by an equal number of 'b's".

A context-free language is a language that can be described by a context-free grammar. Not all languages are context-free: the language "all sequences consisting of a number of 'a's followed by an equal number of 'b's followed by an equal number of 'c's" isn't (a fact usually proved with the pumping lemma for context-free languages).

What context-free grammars are essentially capable of is expressing languages with 'nested' structures, where items on the left must be paired off with items on the right in a systematically nested way. This is considered to be a basic feature of natural languages.
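The nesting is visible in a derivation of the example grammar: every application of A -> aAb adds one 'a' on the left paired with one 'b' on the right, with later pairs nested inside earlier ones. A short trace (an illustrative sketch, deriving 'aaabbb'):

```python
# Derive 'aaabbb' step by step. Each application of A -> aAb wraps
# the remaining nonterminal in a matched a...b pair.
form = "A"
steps = [form]
for _ in range(3):                  # apply A -> aAb three times
    form = form.replace("A", "aAb", 1)
    steps.append(form)
form = form.replace("A", "", 1)     # finish with the empty rule A ->
steps.append(form)
print(" => ".join(steps))   # A => aAb => aaAbb => aaaAbbb => aaabbb
```

The outermost 'a' is paired with the outermost 'b', the next 'a' with the next 'b', and so on: exactly the systematically nested pairing described above.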

The formalism was invented in the late 1950s as a way to express the structure of natural language and to define the syntax of computer programming languages. But it is not powerful enough to describe either of them in full.