Context-Free Grammars and Languages

We have seen that many languages are not regular. Thus we need to consider a larger class of languages, called context-free languages (CFLs). These languages have a natural, recursive notation, called context-free grammars (CFGs). CFGs have played a central role in the study of natural languages since the 1950s, and in compilers since the 1960s. Today CFGs are increasingly important for XML (Extensible Markup Language) and its DTDs (document type definitions). We'll look at: CFGs, the languages they generate, parse trees, pushdown automata, and closure properties of CFLs.
Nisan 2006 (April 2006), Ankara Üniversitesi Bilgisayar Mühendisliği - TY

An Informal Example of CFGs
Consider the language of palindromes, L_pal. A palindrome is a string that reads the same forward and backward, such as 0110 or 11011. L_pal is not a regular language.

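Palindrome Example
It is easy to show with the pumping lemma that L_pal is not a regular language. Suppose it were regular, with pumping-lemma constant n, and consider the palindrome ω = 0^n 1 0^n (n > 0). Write ω = xyz with |xy| ≤ n and y ≠ ε, so that y consists only of 0s from the first group. Then xy^0z = xz would also have to be in L_pal, but it is not a palindrome, because the numbers of 0s before and after the 1 are no longer equal. This is a contradiction.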

Inductive Definition of L_pal
Basis: ε, 0, and 1 are palindromes.
Induction: If ω is a palindrome, so are 0ω0 and 1ω1.
No string is a palindrome unless it follows from this basis and induction rule.
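The inductive definition can be read backwards as a membership test. The following is a minimal Python sketch under that reading; the function name is_palindrome is only illustrative and does not come from the slides.

```python
def is_palindrome(w: str) -> bool:
    """Decide membership in L_pal over {0, 1} by undoing the inductive rule."""
    if any(c not in "01" for c in w):
        return False          # only strings over the alphabet {0, 1} qualify
    if len(w) <= 1:
        return True           # basis: ε, 0, and 1 are palindromes
    # induction: w must have the form 0x0 or 1x1, with x itself a palindrome
    return w[0] == w[-1] and is_palindrome(w[1:-1])


print(is_palindrome("0110"), is_palindrome("011"))
```

Each recursive call strips one matching pair of outer symbols, mirroring one application of the induction rule.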

Formal Definition of CFGs
There are four important components in a grammatical description of a language:
1. A finite set of symbols that form the strings of the language being defined. This set was {0, 1} in the palindrome example. These symbols are called the terminals, or terminal symbols.
2. A finite set of variables, also called nonterminals or syntactic categories. In our example there is only one variable, P.
3. One of the variables represents the language being defined; it is called the start symbol. In our example it is P.
4. A finite set of productions, or rules, that represent the recursive definition of the language. Each production consists of a variable (the head), the production symbol →, and a string of zero or more terminals and variables, which is called the body of the production.

Formal Definition of CFGs
A context-free grammar is a quadruple G = (V, T, P, S), where
V is a finite set of variables.
T is a finite set of terminals.
P is a finite set of productions of the form A → α, where A is a variable and α ∈ (V ∪ T)*.
S is a designated variable called the start symbol.
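As an illustration, the quadruple can be written down directly as a small data structure. This is a sketch only: the class name CFG, its field names, and the check is_well_formed are assumptions made for this example, and the palindrome productions P → ε | 0 | 1 | 0P0 | 1P1 are reconstructed from the inductive definition of L_pal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # T
    productions: tuple     # P, as (head, body) pairs; a body is a tuple of symbols
    start: str             # S

    def is_well_formed(self) -> bool:
        """Check the side conditions of the definition G = (V, T, P, S)."""
        symbols = self.variables | self.terminals
        return (
            self.start in self.variables
            and not (self.variables & self.terminals)
            and all(
                head in self.variables and all(s in symbols for s in body)
                for head, body in self.productions
            )
        )


# Palindrome grammar; the empty tuple () stands for the body ε.
G_PAL = CFG(
    variables=frozenset({"P"}),
    terminals=frozenset({"0", "1"}),
    productions=(
        ("P", ()),
        ("P", ("0",)),
        ("P", ("1",)),
        ("P", ("0", "P", "0")),
        ("P", ("1", "P", "1")),
    ),
    start="P",
)

print(G_PAL.is_well_formed())
```

Representing each body as a tuple of symbols keeps ε (the empty body) distinct from any terminal.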

Example
Regular expressions over {0,1} can be defined by the grammar G_regex = ({E}, {0,1}, A, E), where
A = {E → ε, E → 0, E → 1, E → E.E, E → E+E, E → E*, E → (E)}
or, equivalently,
A = {E → ε | 0 | 1 | E.E | E+E | E* | (E)}
The second representation of the productions is called the compact notation. (Strictly speaking, the operator symbols ., +, *, and the parentheses are also terminals of this grammar.)

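Example: the grammar for simple expressions is G = ({E, I}, T, P, E), with the productions
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
Notice that E and I are variables, the elements of T = {+, *, (, ), a, b, 0, 1} are terminal symbols, P is the set of productions above, and E is the start symbol.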

Derivations Using a Grammar
There are two ways to show that a string belongs to the language of a variable. Recursive inference uses productions from body to head. Derivation uses productions from head to body.
Recursive inference example: (table not reproduced in this transcript)

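Derivations, Using Productions From Head To Body
A derivation step replaces one variable in a string by the body of one of its productions: if A → γ is a production of G, then αAβ ⇒ αγβ for any strings α, β ∈ (V ∪ T)*.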

We define ⇒* to be the reflexive and transitive closure of ⇒, i.e., ⇒* represents zero, one, or many derivation steps.

Example 5.5
The inference that a*(a+b00) is in the language of variable E can be reflected in a derivation of that string, starting with the string E. Here is one such derivation:
E ⇒ E*E ⇒ I*E ⇒ a*E ⇒ a*(E) ⇒ a*(E+E) ⇒ a*(I+E) ⇒ a*(a+E) ⇒ a*(a+I) ⇒ a*(a+I0) ⇒ a*(a+I00) ⇒ a*(a+b00)
We can conclude that E ⇒* a*(a+b00).
The two viewpoints, recursive inference and derivation, are equivalent: a string of terminals ω is inferred to be in the language of some variable A if and only if A ⇒* ω.

Example 5.5 (cont.)
E ⇒ E*E ⇒ I*E ⇒ a*E ⇒ a*(E) ⇒ a*(E+E) ⇒ a*(I+E) ⇒ a*(a+E) ⇒ a*(a+I) ⇒ a*(a+I0) ⇒ a*(a+I00) ⇒ a*(a+b00)
Note 1: At each step we might have several rules to choose from, e.g. I*E ⇒ a*E ⇒ a*(E) versus I*E ⇒ I*(E) ⇒ a*(E).
Note 2: Not all choices lead to successful derivations of a particular string; for instance, E ⇒ E+E won't lead to a derivation of a*(a+b00).

Leftmost and Rightmost Derivations
In order to restrict the number of choices we have in deriving a string, it is often useful to require that at each step we replace the leftmost (or rightmost) variable by one of its production bodies. Such a derivation is called a leftmost derivation (or rightmost derivation). A leftmost derivation step is denoted by ⇒_lm, and a rightmost derivation step by ⇒_rm.
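A leftmost derivation is fully determined by the sequence of production bodies chosen, since the variable to replace is always the leftmost one. The sketch below replays such a sequence for the expression grammar E → I | E+E | E*E | (E), I → a | b | Ia | Ib | I0 | I1; the names PRODUCTIONS and leftmost_derivation are illustrative, and every grammar symbol is assumed to be a single character.

```python
# Expression grammar; each symbol is one character, so sentential forms are strings.
PRODUCTIONS = {
    "E": {"I", "E+E", "E*E", "(E)"},
    "I": {"a", "b", "Ia", "Ib", "I0", "I1"},
}


def leftmost_derivation(start: str, bodies: list[str]) -> list[str]:
    """Apply the given production bodies, always to the leftmost variable."""
    forms = [start]
    for body in bodies:
        form = forms[-1]
        # position of the leftmost variable in the current sentential form
        pos = next((i for i, s in enumerate(form) if s in PRODUCTIONS), None)
        if pos is None:
            raise ValueError("no variable left to replace")
        if body not in PRODUCTIONS[form[pos]]:
            raise ValueError(f"{form[pos]} -> {body} is not a production")
        forms.append(form[:pos] + body + form[pos + 1:])
    return forms


steps = leftmost_derivation(
    "E", ["E*E", "I", "a", "(E)", "E+E", "I", "a", "I", "I0", "I0", "b"]
)
print(" => ".join(steps))
```

With this choice of bodies the replay ends in a*(a+b00), following the same steps as the derivation in Example 5.5.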

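Leftmost vs. Rightmost: Example 5.5
The derivation in Example 5.5 is a leftmost derivation: at every step the leftmost variable is replaced. A rightmost derivation of the same string is
E ⇒_rm E*E ⇒_rm E*(E) ⇒_rm E*(E+E) ⇒_rm E*(E+I) ⇒_rm E*(E+I0) ⇒_rm E*(E+I00) ⇒_rm E*(E+b00) ⇒_rm E*(I+b00) ⇒_rm E*(a+b00) ⇒_rm I*(a+b00) ⇒_rm a*(a+b00)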

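The Language of a Grammar
If G = (V, T, P, S) is a CFG, the language of G is L(G) = {ω ∈ T* : S ⇒* ω}, i.e., the set of terminal strings that can be derived from the start symbol. For the palindromes, G_pal = ({P}, {0,1}, A, P) with the productions P → ε | 0 | 1 | 0P0 | 1P1.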

We shall prove that a string ω in {0,1}* is in L(G_pal) if and only if it is a palindrome.
Proof (if direction): Suppose ω = ω^R. We show by induction on |ω| that ω ∈ L(G_pal).

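Induction Hypothesis
Assume that every palindrome shorter than ω is in L(G_pal), where G_pal has the productions P → ε | 0 | 1 | 0P0 | 1P1.
Basis: If |ω| = 0 or |ω| = 1, then ω is ε, 0, or 1. Since P → ε, P → 0, and P → 1 are productions, P ⇒* ω.
Induction: Suppose |ω| ≥ 2. Since ω = ω^R, ω begins and ends with the same symbol, so ω = 0x0 or ω = 1x1 with x = x^R. By the induction hypothesis, P ⇒* x. If ω = 0x0, then P ⇒ 0P0 ⇒* 0x0 = ω; the case ω = 1x1 is symmetric. Thus ω ∈ L(G_pal).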

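Proof (only if direction): We assume that ω ∈ L(G_pal) and must show that ω = ω^R, that is, ω is a palindrome. The argument is by induction on the number of steps in a derivation of ω from P. A one-step derivation yields ε, 0, or 1, all of which are palindromes. A longer derivation must begin with P ⇒ 0P0 or P ⇒ 1P1, so ω = 0x0 or ω = 1x1, where x has a shorter derivation and is therefore a palindrome by the induction hypothesis; hence ω is a palindrome too.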
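The claim can be sanity-checked on short strings (this is a bounded check, not a substitute for the proof). The sketch below assumes the productions P → ε | 0 | 1 | 0P0 | 1P1 for G_pal and builds the derivable strings bottom-up, in the style of recursive inference; the function names generated and palindromes are illustrative.

```python
from itertools import product


def generated(max_len: int) -> set[str]:
    """Terminal strings of length <= max_len derivable from P in G_pal."""
    # P -> ε | 0 | 1
    result = {w for w in ("", "0", "1") if len(w) <= max_len}
    frontier = set(result)
    while frontier:
        nxt = set()
        for x in frontier:
            for c in "01":                  # P -> 0P0 | 1P1 applied around x
                w = c + x + c
                if len(w) <= max_len and w not in result:
                    nxt.add(w)
        result |= nxt
        frontier = nxt
    return result


def palindromes(max_len: int) -> set[str]:
    """All strings over {0, 1} of length <= max_len that equal their reverse."""
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product("01", repeat=n)
        if t == t[::-1]
    }


assert generated(8) == palindromes(8)
```

The final assertion compares the two sets for all strings of length at most 8, covering both directions of the "if and only if" within that bound.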

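Sentential Forms
Let G = (V, T, P, S) be a CFG and α ∈ (V ∪ T)*. If S ⇒* α, we say that α is a sentential form. If S ⇒*_lm α, then α is a left-sentential form, and if S ⇒*_rm α, then α is a right-sentential form. L(G) consists of those sentential forms that are in T*.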

Examples of Sentential Forms
Example: E*(I+E) is a sentential form, since
E ⇒ E*E ⇒ E*(E) ⇒ E*(E+E) ⇒ E*(I+E)
This derivation is neither leftmost nor rightmost.

Parse Trees
Let G = (V, T, P, S) be a CFG. The parse trees for G are trees that satisfy the following conditions:
1. Each interior node is labelled by a variable in V.
2. Each leaf is labelled by a symbol in V ∪ T ∪ {ε}. Any ε-labelled leaf is the only child of its parent.
3. If an interior node is labelled A, and its children (from left to right) are labelled X_1, X_2, ..., X_k, then A → X_1 X_2 ... X_k ∈ P.

Parse Tree Examples
In the grammar
1. E → I
2. E → E + E
3. E → E * E
4. E → (E)
(Parse tree figure not reproduced in this transcript.)

Yield of a Parse Tree
If we look at the leaves of any parse tree and concatenate them from the left, we get a string, called the yield of the tree, which is always a string that is derived from the root variable. Of special importance are those parse trees such that:
1. The yield is a terminal string. That is, all leaves are labeled either with a terminal or with ε.
2. The root is labeled by the start symbol.
We shall see that the set of yields of these important parse trees is the language of the underlying grammar.
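Computing a yield is a left-to-right traversal of the leaves. The following sketch represents a parse tree with a hypothetical Node class (not from the slides) and builds a tree for a*(a+b00) in the expression grammar E → I | E+E | E*E | (E), I → a | b | Ia | Ib | I0 | I1; the helpers ident and expr_of are also only illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def tree_yield(node: Node) -> str:
    """Concatenate the leaf labels from left to right; ε contributes nothing."""
    if not node.children:
        return "" if node.label == "ε" else node.label
    return "".join(tree_yield(child) for child in node.children)


def ident(name: str) -> Node:
    """Parse tree for an identifier, using I -> a | b | Ia | Ib | I0 | I1."""
    tree = Node("I", [Node(name[0])])
    for ch in name[1:]:
        tree = Node("I", [tree, Node(ch)])
    return tree


def expr_of(identifier: Node) -> Node:
    """Apply the production E -> I."""
    return Node("E", [identifier])


inner = Node("E", [expr_of(ident("a")), Node("+"), expr_of(ident("b00"))])  # E -> E+E
paren = Node("E", [Node("("), inner, Node(")")])                             # E -> (E)
tree = Node("E", [expr_of(ident("a")), Node("*"), paren])                    # E -> E*E

print(tree_yield(tree))
```

Because the root is labeled E and every leaf is a terminal, this is one of the "important" parse trees, and its yield is a string of the language.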

Yield of a Parse Tree: Example
In the parse tree for a*(a+b00), all leaves are labeled either with a terminal or with ε, and the root is labeled by the start symbol E. Concatenating the leaves from the left gives the yield a*(a+b00).

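Equivalence of Inference, Derivations, and Parse Trees
Let G = (V, T, P, S) be a CFG and A ∈ V. Then the following are equivalent:
1. The recursive inference procedure determines that the terminal string ω is in the language of A.
2. A ⇒* ω
3. A ⇒*_lm ω
4. A ⇒*_rm ω
5. There is a parse tree of G with root A and yield ω.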

Ambiguity in Grammars and Languages
In the grammar E → I | E+E | E*E | (E), the sentential form E+E*E has two derivations:
E ⇒ E+E ⇒ E+E*E and E ⇒ E*E ⇒ E+E*E
This gives us two parse trees.

The mere existence of several derivations is not dangerous; it is the existence of several parse trees that ruins a grammar.
Example: In the same grammar the string a+b has several derivations, e.g.,
E ⇒ E+E ⇒ I+E ⇒ a+E ⇒ a+I ⇒ a+b and E ⇒ E+E ⇒ E+I ⇒ E+b ⇒ I+b ⇒ a+b
However, their parse trees are the same, and the structure of a+b is unambiguous.

Let G = (V, T, P, S) be a CFG. We say that G is ambiguous if there is a string in T* that has more than one parse tree with root S. If every string in L(G) has at most one parse tree, G is said to be unambiguous.
Example: The terminal string a+a*a has two parse trees, one grouping it as a+(a*a) and one as (a+a)*a.
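For a small grammar, the number of parse trees of a given string can be counted by trying every way to split the string among the symbols of each production body. The sketch below does this for the expression grammar E → I | E+E | E*E | (E), I → a | b | Ia | Ib | I0 | I1. It assumes single-character symbols, no ε-productions, and no cycles of unit productions (all true for this grammar); the names count_trees and AMBIGUOUS are illustrative.

```python
from functools import lru_cache

AMBIGUOUS = {
    "E": ("I", "E+E", "E*E", "(E)"),
    "I": ("a", "b", "Ia", "Ib", "I0", "I1"),
}


def count_trees(grammar: dict, start: str, s: str) -> int:
    """Number of parse trees with root `start` and yield `s`."""

    @lru_cache(maxsize=None)
    def trees(sym: str, i: int, j: int) -> int:
        if sym not in grammar:                       # terminal: must match one character
            return int(j == i + 1 and s[i] == sym)
        return sum(seq(body, i, j) for body in grammar[sym])

    @lru_cache(maxsize=None)
    def seq(body: str, i: int, j: int) -> int:
        if len(body) == 1:
            return trees(body[0], i, j)
        # the first symbol covers s[i:k]; every symbol must cover at least one character
        return sum(
            trees(body[0], i, k) * seq(body[1:], k, j)
            for k in range(i + 1, j - len(body) + 2)
        )

    return trees(start, 0, len(s))


print(count_trees(AMBIGUOUS, "E", "a+a*a"))
```

For a+a*a the count is 2, matching the two parse trees above, while a+b has exactly one parse tree even though it has several derivations.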

Removing Ambiguity From Grammars
Good news: Sometimes we can remove ambiguity by hand.
Bad news: There is no algorithm to do it.
More bad news: Some CFLs have only ambiguous CFGs.
We are studying the grammar
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
There are two problems:
1. There is no precedence between * and +.
2. A sequence of identical operators can group either from the left or from the right. For example, E+E+E.

Solution: We introduce more variables, each representing expressions of the same binding strength.
1. A factor is an expression that cannot be broken apart by an adjacent * or +. Our factors are (a) identifiers and (b) parenthesized expressions.
2. A term is an expression that cannot be broken apart by +. For instance, a*b is a term but not a factor: placing a1* to its left gives a1*a*b, which groups as (a1*a)*b and so breaks a*b apart. It cannot be broken by +, since e.g. a1+a*b is (by the precedence rules) the same as a1+(a*b), and a*b+a1 is the same as (a*b)+a1.
3. The rest are expressions, i.e. they can be broken apart by * or +.

Example 5.27
Let F stand for factors, T for terms, and E for expressions. In place of the previous grammar E → I | E+E | E*E | (E), consider the following grammar:
E → T | E+T
T → F | T*F
F → I | (E)
I → a | b | Ia | Ib | I0 | I1
Now a+a*a has only one parse tree, which groups it as a+(a*a).
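The layered grammar maps naturally onto a small hand-written parser, with one function per variable. The sketch below is an assumption-laden illustration, not the slides' method: the left-recursive productions E → E+T and T → T*F are implemented as loops (a standard recursive-descent substitution that yields the same left-to-right grouping), and the tree is returned as nested tuples of the form (operator, left, right).

```python
def parse(s: str):
    """Parse s with E -> T | E+T, T -> F | T*F, F -> I | (E); return a nested tuple."""
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else None

    def expression():            # E -> T | E+T, grouping from the left
        nonlocal pos
        node = term()
        while peek() == "+":
            pos += 1
            node = ("+", node, term())
        return node

    def term():                  # T -> F | T*F, grouping from the left
        nonlocal pos
        node = factor()
        while peek() == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def factor():                # F -> I | (E)
        nonlocal pos
        if peek() == "(":
            pos += 1
            node = expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        return identifier()

    def identifier():            # I -> a | b | Ia | Ib | I0 | I1
        nonlocal pos
        start = pos
        if peek() not in ("a", "b"):
            raise SyntaxError(f"expected identifier at position {pos}")
        while peek() in ("a", "b", "0", "1"):
            pos += 1
        return s[start:pos]

    tree = expression()
    if pos != len(s):
        raise SyntaxError(f"unexpected input at position {pos}")
    return tree


print(parse("a+a*a"))
print(parse("a+a+a"))
```

The first call groups the input as a+(a*a), reflecting the precedence of * over +, and the second as (a+a)+a, reflecting grouping from the left; in both cases the parser has no choice to make, which is the practical meaning of the grammar being unambiguous.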

Leftmost Derivations & Ambiguity
Derivations are not necessarily unique, even if the grammar is unambiguous. In an unambiguous grammar, however, leftmost derivations and rightmost derivations are unique. We shall consider leftmost derivations.
Theorem 5.29: For any CFG G, a terminal string ω has two distinct parse trees if and only if ω has two distinct leftmost derivations from the start symbol.

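Example of Non-Unique Derivations
In the ambiguous grammar E → I | E+E | E*E | (E), the two parse trees for a+a*a correspond to two distinct leftmost derivations:
E ⇒_lm E+E ⇒_lm I+E ⇒_lm a+E ⇒_lm a+E*E ⇒_lm a+I*E ⇒_lm a+a*E ⇒_lm a+a*I ⇒_lm a+a*a
E ⇒_lm E*E ⇒_lm E+E*E ⇒_lm I+E*E ⇒_lm a+E*E ⇒_lm a+I*E ⇒_lm a+a*E ⇒_lm a+a*I ⇒_lm a+a*a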

Inherent Ambiguity
A CFL L is inherently ambiguous if all grammars for L are ambiguous.
