World Library  
Flag as Inappropriate
Email this Article

Context-free language

Article Id: WHEBN0000006867
Reproduction Date:

Title: Context-free language  
Author: World Heritage Encyclopedia
Language: English
Subject: Pushdown automaton, Formal grammar, Formal language, Deterministic context-free language, Abstract family of languages
Collection: Formal Languages, Languages, Linguistics
Publisher: World Heritage Encyclopedia
Publication
Date:
 

Context-free language

In formal language theory, a context-free language (CFL) is a language generated by some context-free grammar (CFG). Different CF grammars can generate the same CF language. It is important to distinguish properties of the language (intrinsic properties) from properties of a particular grammar (extrinsic properties).

The set of all context-free languages is identical to the set of languages accepted by pushdown automata, which makes these languages amenable to parsing. Indeed, given a CFG, there is a direct way to produce a pushdown automaton for the grammar (and corresponding language), though going the other way (producing a grammar given an automaton) is not as direct.

Context-free languages have many applications in programming languages; for example, the language of all properly matched parentheses is generated by the grammar S\to SS ~|~ (S) ~|~ \varepsilon. Also, most arithmetic expressions are generated by context-free grammars.

Contents

  • Examples 1
  • Languages that are not context-free 2
  • Closure properties 3
    • Nonclosure under intersection and complement and difference 3.1
  • Decidability properties 4
  • Parsing 5
  • See also 6
  • Notes 7
  • References 8

Examples

An archetypal context-free language is L = \{a^nb^n:n\geq1\}, the language of all non-empty even-length strings, the entire first halves of which are a's, and the entire second halves of which are b's. L is generated by the grammar S\to aSb ~|~ ab. This language is not regular. It is accepted by the pushdown automaton M=(\{q_0,q_1,q_f\}, \{a,b\}, \{a,z\}, \delta, q_0, z, \{q_f\}) where \delta is defined as follows:[note 1]

\delta(q_0, a, z) = (q_0, az)
\delta(q_0, a, a) = (q_0, aa)
\delta(q_0, b, a) = (q_1, \varepsilon)
\delta(q_1, b, a) = (q_1, \varepsilon)

Unambiguous CFLs are a proper subset of all CFLs: there are inherently ambiguous CFLs. An example of an inherently ambiguous CFL is the union of \{a^n b^m c^m d^n | n, m > 0\} with \{a^n b^n c^m d^m | n, m > 0\}. This set is context-free, since the union of two context-free languages is always context-free. But there is no way to unambiguously parse strings in the (non-context-free) subset \{a^n b^n c^n d^n | n > 0\} which is the intersection of these two languages.[1]

Languages that are not context-free

The set \{a^n b^n c^n d^n | n > 0\} is a context-sensitive language, but there does not exist a context-free grammar generating this language.[2] So there exist context-sensitive languages which are not context-free. To prove that a given language is not context-free, one may employ the pumping lemma for context-free languages[3] or a number of other methods, such as Ogden's lemma or Parikh's theorem.[4]

Closure properties

Context-free languages are closed under the following operations. That is, if L and P are context-free languages, the following languages are context-free as well:

Context-free languages are not closed under complement, intersection, or difference. However, if L is a context-free language and D is a regular language then both their intersection L\cap D and their difference L\setminus D are context-free languages.

Nonclosure under intersection and complement and difference

The context-free languages are not closed under intersection. This can be seen by taking the languages A = \{a^n b^n c^m \mid m, n \geq 0 \} and B = \{a^m b^n c^n \mid m,n \geq 0\}, which are both context-free.[note 2] Their intersection is A \cap B = \{ a^n b^n c^n \mid n \geq 0\}, which can be shown to be non-context-free by the pumping lemma for context-free languages.

Context-free languages are also not closed under complementation, as for any languages A and B: A \cap B = \overline{\overline{A} \cup \overline{B}} .

Context-free language are also not closed under difference: LC = Σ* \ L

Decidability properties

The following problems are undecidable for arbitrary context-free grammars A and B:

  • Equivalence: Given two context-free grammars A and B, is L(A)=L(B)?
  • Intersection Emptiness: Given two context-free grammars A and B, is L(A) \cap L(B) = \emptyset ? However, the intersection of a context-free language and a regular language is context-free,[5] and the variant of the problem where B is a regular grammar is decidable.
  • Containment: Given a context-free grammar A, is L(A) \subseteq L(B) ? Again, the variant of the problem where B is a regular grammar is decidable.
  • Universality: Given a context-free grammar A, is L(A)=\Sigma^* ?

The following problems are decidable for arbitrary context-free languages:

  • Emptiness: Given a context-free grammar A, is L(A) = \emptyset ?
  • Finiteness: Given a context-free grammar A, is L(A) finite?
  • Membership: Given a context-free grammar G, and a word w, does w \in L(G) ? Efficient polynomial-time algorithms for the membership problem are the CYK algorithm and Earley's Algorithm.

According to Hopcroft, Motwani, Ullman (2003),[6] many of the fundamental closure and (un)decidability properties of context-free languages were shown in the 1961 paper of Bar-Hillel, Perles, and Shamir[3]

Parsing

Determining an instance of the membership problem; i.e. given a string w, determine whether w \in L(G) where L is the language generated by a given grammar G; is also known as recognition. Context-free recognition for Chomsky normal form grammars was shown by Leslie G. Valiant to be reducible to boolean matrix multiplication, thus inheriting its complexity upper bound of O(n2.3728639).[7][8][note 3] Conversely, Lillian Lee has shown O(n3-ε) boolean matrix multiplication to be reducible to O(n3-3ε) CFG parsing, thus establishing some kind of lower bound for the latter.[9]

Practical uses of context-free languages require also to produce a derivation tree that exhibits the structure that the grammar associates with the given string. The process of producing this tree is called parsing. Known parsers have a time complexity that is cubic in the size of the string that is parsed.

Formally, the set of all context-free languages is identical to the set of languages accepted by pushdown automata (PDA). Parser algorithms for context-free languages include the CYK algorithm and the Earley's Algorithm.

A special subclass of context-free languages are the deterministic context-free languages which are defined as the set of languages accepted by a deterministic pushdown automaton and can be parsed by a LR(k) parser.[10]

See also parsing expression grammar as an alternative approach to grammar and parser.

See also

Notes

  1. ^ meaning of \delta's arguments and results: \delta(\mathrm{state}_1, \mathrm{read}, \mathrm{pop}) = (\mathrm{state}_2, \mathrm{push})
  2. ^ A context-free grammar for the language A is given by the following production rules, taking S as the start symbol: SSc | aTb | ε; TaTb | ε. The grammar for B is analogous.
  3. ^ In Valiant's papers, O(n2.81) given, the then best known upper bound. See Matrix multiplication#Algorithms for efficient matrix multiplication and Coppersmith–Winograd algorithm for bound improvements since then.

References

  1. ^ Hopcroft & Ullman 1979, p. 100, Theorem 4.7.
  2. ^ Hopcroft & Ullman 1979.
  3. ^ a b Yehoshua Bar-Hillel, Micha Asher Perles, Eli Shamir (1961). "On Formal Properties of Simple Phrase-Structure Grammars". Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 14 (2): 143–172. 
  4. ^ How to prove that a language is not context-free?
  5. ^ Salomaa (1973), p. 59, Theorem 6.7
  6. ^ John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman (2003). Introduction to Automata Theory, Languages, and Computation. Addison Wesley.  Here: Sect.7.6, p.304, and Sect.9.7, p.411
  7. ^ Leslie Valiant (Jan 1974). General context-free recognition in less than cubic time (Technical report). Carnegie Mellon University. p. 11. 
  8. ^ Leslie G. Valiant (1975). "General context-free recognition in less than cubic time". Journal of Computer and System Sciences 10 (2): 308–315.  
  9. ^ Lillian Lee (2002). "Fast Context-Free Grammar Parsing Requires Fast Boolean Matrix Multiplication". JACM 49 (1): 1–15.  
  10. ^  
  •  
  • Hopcroft, John E.; Ullman, Jeffrey D. (1979). Introduction to Automata Theory, Languages, and Computation (1st ed.). Addison-Wesley. 
  • Arto Salomaa (1973). Formal Languages. ACM Monograph Series. 
  •   Chapter 2: Context-Free Languages, pp. 91–122.
  • Jean-Michel Autebert, Jean Berstel, Luc Boasson, Context-Free Languages and Push-Down Automata, in: G. Rozenberg, A. Salomaa (eds.), Handbook of Formal Languages, Vol. 1, Springer-Verlag, 1997, 111-174.
This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply. World Heritage Encyclopedia content is assembled from numerous content providers, Open Access Publishing, and in compliance with The Fair Access to Science and Technology Research Act (FASTR), Wikimedia Foundation, Inc., Public Library of Science, The Encyclopedia of Life, Open Book Publishers (OBP), PubMed, U.S. National Library of Medicine, National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health (NIH), U.S. Department of Health & Human Services, and USA.gov, which sources content from all federal, state, local, tribal, and territorial government publication portals (.gov, .mil, .edu). Funding for USA.gov and content contributors is made possible from the U.S. Congress, E-Government Act of 2002.
 
Crowd sourced content that is contributed to World Heritage Encyclopedia is peer reviewed and edited by our editorial staff to ensure quality scholarly research articles.
 
By using this site, you agree to the Terms of Use and Privacy Policy. World Heritage Encyclopedia™ is a registered trademark of the World Public Library Association, a non-profit organization.
 



Copyright © World Library Foundation. All rights reserved. eBooks from World eBook Library are sponsored by the World Library Foundation,
a 501c(4) Member's Support Non-Profit Organization, and is NOT affiliated with any governmental agency or department.