České vysoké učení technické v Praze, Fakulta dopravní
Grammars and languages
Hybrid and uncertain systems

Language
A language L is a set of strings over an alphabet T.
–an alphabet is a finite set of symbols (letters)
Example: T = {a, b, c}, L = {abc, abbc, ab}

Grammars
A grammar is a quadruple G = (N, T, P, S), where:
N - a set of non-terminal symbols
T - a set of terminal symbols
P - a set of rules
S - the start symbol of the grammar, S ∈ N

Rules
The set of rules: P ⊆ (N ∪ T)* N (N ∪ T)* × (N ∪ T)*
–the first component is the left side of the rule, the second component is the right side of the rule
–the right side is an arbitrary string of terminal and non-terminal symbols

Rules
The rule (α, β) ∈ P is written in the form α → β
–the sense: „α is rewritten to β“
The left side always contains at least one non-terminal symbol (a non-terminal symbol can be rewritten).

An example of a simple grammar
The grammar generating symmetric strings of zeros and ones 0000…01…11111
G = (N, T, P, S)
N = {S, A}
T = {0, 1}
P = {S → 0A1, A → 0A1, A → ε}
(the symbol ε is the empty string)

An example of a simple grammar
Generated string (sentence): S ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000111
Terminology:
φ = γ1 α γ2 generates ψ = γ1 β γ2 directly if the rule α → β exists
–it is denoted φ ⇒ ψ
–example: 00A11 ⇒ 000A111

Terminology
φ generates ψ if a sequence α1, α2, …, αn exists such that φ = α1, ψ = αn and αi ⇒ αi+1 for i = 1 … n−1
–it is denoted φ ⇒* ψ
–the sequence of strings is a derivation
–example: 0A1 ⇒* 000A111 (two steps of the derivation on the previous slide)
A derivation can be described by:
–the sequence of applied rules (see the previous slide)
–a derivation tree

Derivation tree
[Figure: derivation tree with root S and nested nodes A; each level branches into 0, A, 1]

Languages and grammars
The language L_G generated by the grammar G is the set of all terminal strings derivable from the start symbol: L_G = {w ∈ T* : S ⇒* w}
The grammar G and the language L_G generated by it are equivalent.
Note: sentences of the language consist only of terminal symbols.
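
The definition of a generated language can be tried out on the simple grammar with rules S → 0A1, A → 0A1, A → ε. The following is only a sketch: the function name `sentences`, the encoding of rules as string pairs, and the convention that upper-case letters are non-terminals are assumptions made for illustration.

```python
from collections import deque

# Rules of the example grammar: S -> 0A1, A -> 0A1, A -> empty string.
# Assumed encoding: upper-case letters are non-terminals, "" is the empty string.
RULES = [("S", "0A1"), ("A", "0A1"), ("A", "")]


def sentences(max_len):
    """Breadth-first search over derivations; returns all sentences up to max_len."""
    found = set()
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if not any(ch.isupper() for ch in form):
            found.add(form)  # only terminal symbols left: a sentence
            continue
        for left, right in RULES:
            pos = form.find(left)
            while pos != -1:
                new = form[:pos] + right + form[pos + len(left):]
                # A form of this grammar holds at most one non-terminal,
                # so forms longer than max_len + 1 cannot yield a short sentence.
                if len(new) <= max_len + 1 and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(left, pos + 1)
    return sorted(found, key=len)
```

Calling `sentences(6)` enumerates the sentences of length at most 6; every one of them consists of terminal symbols only, as the note above states.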

Grammar classification by Chomsky
Grammars are classified by the shape of their rules:
–general (unlimited; usually called unrestricted)
–context (usually called context-sensitive)
–context-free
–regular

Grammar classification
unlimited - L(0)
–rules are general
context - L(1)
–γ1 A γ2 → γ1 β γ2, A ∈ N, γ1, γ2 ∈ (N ∪ T)* (γ1, γ2 is the context), β ∈ (N ∪ T)+
context-free
–A → β, A ∈ N, β ∈ (N ∪ T)+
regular
–A → aB or A → a, A, B ∈ N, a ∈ T

Grammar classification
unlimited grammars generate unlimited languages - L(0)
context grammars generate context languages - L(1)
context-free grammars generate context-free languages
regular grammars generate regular languages

Example of unlimited grammar
G = (N, T, P, S)
N = {S, B}
T = {a, b, c}
P = {S → abc, S → aSBc, cB → Bc, bB → bb}
The grammar generates the language L = {a^n b^n c^n : n ≥ 1}

Example of context grammar
The third rule of the previous example, cB → Bc, is not a rule of a context grammar; the others are valid.
We transform the previous grammar into a context one.
A rule AB → BA is replaced with a set of rules of a context grammar (the context of each rule is given in parentheses):
AB → XB (context B)
XB → XA (context X)
XA → BA (context A)

Example of context grammar
But the swapping of symbols cannot be applied to the third rule directly. Why? Because a terminal symbol cannot be replaced.
We add a new non-terminal symbol C and the rule cC → cc, and we modify the other rules.

Example of context language
G = (N, T, P, S)
N = {S, B, C, X}
T = {a, b, c}
P = {S → abC, S → aSBC, CB → XB, XB → XC, XC → BC, bB → bb, bC → bc, cC → cc}
The grammar generates the same language.
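
Whether a string belongs to the language of this context grammar can be checked by a bounded search, because no rule shortens the string it is applied to. The sketch below assumes the last rule is cC → cc, as stated on the preceding slide; the function name `derivable` and the string encoding of the rules are illustrative choices, not part of the slides.

```python
from collections import deque

# Rules of the context grammar; the last rule is taken as cC -> cc.
RULES = [
    ("S", "abC"), ("S", "aSBC"),
    ("CB", "XB"), ("XB", "XC"), ("XC", "BC"),
    ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]


def derivable(word):
    """Return True if word can be derived from S.

    No rule makes a string shorter, so every form longer than the
    target word can be discarded and the search always terminates.
    """
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if form == word:
            return True
        for left, right in RULES:
            pos = form.find(left)
            while pos != -1:
                new = form[:pos] + right + form[pos + len(left):]
                if len(new) <= len(word) and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(left, pos + 1)
    return False
```

This brute-force search is only practical for short strings; it illustrates the definition rather than an efficient analysis method.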

Using grammars in programming
Lexical elements of programming languages (keywords, constants) are defined by regular grammars.
The syntax of programming languages is defined by context-free grammars.

Regular grammars
The shape of the rules: A → aB or A → a, where A, B ∈ N, a ∈ T
Note:
–rules of the shape A → aB belong to a right regular grammar
–rules of the shape A → Ba belong to a left regular grammar

Example
A grammar that generates positive integer constants in the C programming language
–decimal constants start with 1-9
–octal constants start with 0
–hexadecimal constants start with 0x
G = (N, T, P, S)
N = {S, X, D, H, O}
T = {0, ..., 9, x, A, ..., F}

Finite State Machines
A regular language generated by a regular grammar can be accepted by a finite state machine (FSM).
–an FSM is a model of a lexical analyzer that recognizes whether the input string belongs to the language
–an FSM is equivalent to a regular grammar

FSMs
An FSM is a five-tuple (T, Q, δ, K, q0), where
T is a finite set of input symbols
Q is a finite set of internal states
δ is a transition:
–function δ: Q × T → Q for a deterministic FSM
–relation δ ⊆ Q × T × Q for a nondeterministic FSM

FSMs
K is a set of final states
q0 is the initial state
Note:
–the FSM has no output function
–if the FSM accepts a string of the language, the state s reached after reading the string satisfies s ∈ K
–an FSM can be nondeterministic; it can be transformed into a deterministic one

Algorithm for constructing an FSM from a regular grammar
–the set of input symbols is X = T
–the set of internal states is Q = N ∪ {U}, U ∉ N
–each rule A → aB implies the transition δ(A, a) = B
–each rule A → a implies the transition δ(A, a) = U
–the set of final states is K = {U}, or K = {U, S} if the rule S → ε exists
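
The construction steps above can be sketched in a few lines of Python. The rule encoding `(A, a, B)`, the function names and the small example grammar (binary numbers without leading zeros) are hypothetical and chosen only for illustration; the rule S → ε is not handled in this sketch.

```python
def grammar_to_nfa(rules, start="S", final="U"):
    """Build a nondeterministic FSM from a right regular grammar.

    rules: (A, a, B) stands for A -> aB, (A, a, None) stands for A -> a.
    The extra state `final` plays the role of U and must not be a non-terminal.
    """
    delta = {}
    for left, symbol, right in rules:
        target = final if right is None else right
        delta.setdefault((left, symbol), set()).add(target)
    return delta, start, {final}


def accepts(nfa, word):
    """Simulate the FSM by tracking the set of states reachable so far."""
    delta, start, finals = nfa
    current = {start}
    for symbol in word:
        current = {t for state in current for t in delta.get((state, symbol), ())}
    return bool(current & finals)


# Hypothetical example grammar: binary numbers without leading zeros.
EXAMPLE_RULES = [
    ("S", "0", None), ("S", "1", None), ("S", "1", "A"),
    ("A", "0", "A"), ("A", "1", "A"), ("A", "0", None), ("A", "1", None),
]
EXAMPLE_NFA = grammar_to_nfa(EXAMPLE_RULES)
```

Because the rules S → 1 and S → 1A share the same left side and terminal, the resulting machine is nondeterministic, which is why `accepts` tracks a set of states.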

Equivalent FSM to the regular grammar
The FSM is nondeterministic.
[Figure: state diagram of the FSM; S is the initial state, U is the final state]

Equivalent FSM to the regular grammar
The corresponding deterministic FSM.
[Figure: state diagram of the deterministic FSM; S is the initial state, final states are marked]
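
The slides do not state which algorithm produces the deterministic FSM; a common choice is the subset construction, sketched below under that assumption. The function names and the small nondeterministic machine `NFA_DELTA` are hypothetical examples, not the machine from the figure.

```python
def determinize(delta, start, finals, alphabet):
    """Subset construction: each deterministic state is a frozenset of original states."""
    start_set = frozenset([start])
    dfa_delta = {}
    dfa_finals = set()
    seen = {start_set}
    todo = [start_set]
    while todo:
        current = todo.pop()
        if current & finals:
            dfa_finals.add(current)  # final if it contains an original final state
        for symbol in alphabet:
            target = frozenset(
                t for state in current for t in delta.get((state, symbol), ())
            )
            if not target:
                continue  # no transition defined: the input is rejected
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa_delta, start_set, dfa_finals


def dfa_accepts(dfa, word):
    """Run the deterministic FSM; a missing transition means rejection."""
    dfa_delta, state, dfa_finals = dfa
    for symbol in word:
        state = dfa_delta.get((state, symbol))
        if state is None:
            return False
    return state in dfa_finals


# Hypothetical nondeterministic FSM (binary numbers without leading zeros).
NFA_DELTA = {
    ("S", "0"): {"U"},
    ("S", "1"): {"A", "U"},
    ("A", "0"): {"A", "U"},
    ("A", "1"): {"A", "U"},
}
EXAMPLE_DFA = determinize(NFA_DELTA, "S", {"U"}, "01")
```

In this example the deterministic machine has three states, {S}, {U} and {A, U}, and the last two are final.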

Regular expressions
A finite alphabet T is given.
Regular expressions generate regular languages; they are defined recursively using the operations „*“ (iteration), „·“ (concatenation) and „+“ (union).
Definition:
1) Each letter x ∈ T is a regular expression.
2) If E1, E2 are regular expressions, then E1 · E2, E1 + E2, E1*, (E1) are regular expressions too.

Regular expression generating constants in C language
(1+...+9)(0+...+9)* + 0(0+...+7)* + 0x(0+...+9+A+...+F)(0+...+9+A+...+F)*
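
The same three alternatives (decimal, octal, hexadecimal) can be written with Python's `re` module, where `|` plays the role of the union operator „+“. This is a sketch that follows the slide's simplified alphabet (lower-case x, upper-case A to F, no suffixes); the names `C_INT_CONSTANT` and `is_c_int_constant` are illustrative.

```python
import re

# decimal: starts with 1-9; octal: starts with 0; hexadecimal: starts with 0x
C_INT_CONSTANT = re.compile(r"[1-9][0-9]*|0[0-7]*|0x[0-9A-F][0-9A-F]*")


def is_c_int_constant(text):
    """Return True if the whole string is one constant of the slide's language."""
    return C_INT_CONSTANT.fullmatch(text) is not None
```

`fullmatch` is used so that the whole input has to belong to the language, which mirrors an FSM ending in a final state after reading the last symbol.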

Equivalence
Regular grammars, regular expressions and FSMs are equivalent and mutually convertible.
[Figure: Regular grammars ⇔ FSMs ⇔ Regular expressions]

Example of context-free grammar
The grammar generating a simple programming language
G = (N, T, P, S)
N = {S, Seq, Block, Comm, Cond}
T = { main, {, }, ;, read_x, write_x, ++, --, if, (, ), else, ==, !=, 0, x, >, < }

Rules
S → main { Seq }
Seq → Comm
Seq → Comm Seq
Block → Comm
Block → { Seq }
Comm → read_x;
Comm → write_x;
Comm → x++;
Comm → x--;
Comm → if( Cond ) Block
Comm → if( Cond ) Block else Block
Cond → x==0
Cond → x>0
Cond → x<0
Cond → x!=0

Generated sequence
(⇒* marks a step that combines several rule applications)
S ⇒ main { Seq }
⇒ main { Comm }
⇒ main { if( Cond ) Block }
⇒ main { if(x!=0) Block }
⇒ main { if(x!=0) Comm }
⇒ main { if(x!=0)if( Cond ) Block else Block }
⇒* main { if(x!=0)if(x<0) Comm else Comm }
⇒ main { if(x!=0)if(x<0)x++; else Comm }
⇒ main { if(x!=0)if(x<0)x++; else x--; }

Other sequence
(⇒* marks a step that combines several rule applications)
S ⇒ main { Seq }
⇒ main { Comm }
⇒ main { if( Cond ) Block else Block }
⇒ main { if( Cond ) Comm else Block }
⇒* main { if(x!=0) Comm else Comm }
⇒ main { if(x!=0)if( Cond ) Block else Comm }
⇒* main { if(x!=0)if(x<0) Comm else Comm }
⇒ main { if(x!=0)if(x<0)x++; else Comm }
⇒ main { if(x!=0)if(x<0)x++; else x--; }
In a left derivation the leftmost non-terminal symbol is always the one replaced.

Ambiguity
Remark: the same sentence is generated by two different derivations (the same syntax, but different semantics)
–such grammars are ambiguous
Solutions:
–define additional rules in the programming language, for example else is assigned to the nearest if

The analysis of context-free languages
A context-free language is analyzed by an FSM with a stack (LIFO), the pushdown automaton.
Note:
–the analysis of context languages is much harder (PSPACE-complete in general), and membership in unlimited languages is undecidable in general
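
As an illustration of an FSM with a stack, the following sketch accepts the language of the earlier simple grammar, strings of n zeros followed by n ones with n ≥ 1. It is a hand-written deterministic automaton, not one derived mechanically from the grammar; the state names "push" and "pop" and the function name are assumptions.

```python
def pda_accepts(word):
    """Pushdown automaton sketch for strings 0...01...1 with equally many 0s and 1s."""
    stack = []
    state = "push"  # "push": still reading zeros, "pop": reading ones
    for symbol in word:
        if state == "push" and symbol == "0":
            stack.append("0")  # remember each zero on the stack
        elif symbol == "1" and stack:
            state = "pop"
            stack.pop()  # match one zero with this one
        else:
            return False  # a zero after a one, an unmatched one, or a foreign symbol
    return state == "pop" and not stack
```

A plain FSM cannot do this, since it would need an unbounded number of states to count the zeros; the stack supplies that memory.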

Translation regular grammars
A translation regular grammar is G = (N, T, D, P, S), where:
N - a set of non-terminal symbols
T - a set of terminal (input) symbols
D - a set of output symbols
P - a set of rules
S - the start symbol, S ∈ N

Rules
Rules are of the form A → aωB or A → aω, where A, B ∈ N, a ∈ T, ω ∈ D*
(D* is the set of all strings over the alphabet D)

Example of the translation grammar
(output symbols are written as circled symbols)
G = (N, T, D, P, S)
N = {S, A, K, X}
T = {a, +, *}
D = {ⓐ, ⊕, ⊛}
P = {S → aⓐA, A → +K, A → *X, K → aⓐ⊕, X → aⓐ⊛}
Example: S ⇒ aⓐA ⇒ aⓐ+K ⇒ aⓐ+aⓐ⊕
–the grammar translates the expression a+a in infix form to the output expression ⓐⓐ⊕ in postfix form

Translation FSM
A translation FSM is a six-tuple (T, D, Q, ψ, K, q0), where
T is a set of input symbols
D is a set of output symbols
Q is a set of internal states
K is a set of terminal states
q0 is the initial state

Translation FSM
ψ is a mapping:
–ψ: Q × T → {M_i : M_i ⊆ Q × D*}
If the grammar contains only rules A → ayB or A → ay, where y ∈ D (each rule contains only one output symbol), and there are no two rules A → ayB and A → ayC with B ≠ C, then the translation FSM is deterministic and has the properties of a sequential mapping.

Translation FSM
The mapping ψ can be divided into:
–a transition function δ: Q × T → Q
–an output function λ: Q × T → D
The FSM is then a Mealy machine.
Note:
–FSMs in the hardware domain usually have no set of terminal states K
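
A deterministic translation FSM can be sketched as a table that maps (state, input symbol) to (next state, output string). The table below is a hypothetical machine built from the infix-to-postfix translation grammar with non-terminals S, A, K, X; the extra terminal state U, the use of plain characters for output symbols, and the function name `translate` are assumptions. Since some transitions emit zero or two output symbols, this is the general translation FSM rather than a strict Mealy machine with exactly one output symbol per step.

```python
# (state, input symbol) -> (next state, emitted output string)
TRANSITIONS = {
    ("S", "a"): ("A", "a"),
    ("A", "+"): ("K", ""),    # the operator is remembered in the state
    ("A", "*"): ("X", ""),
    ("K", "a"): ("U", "a+"),  # operand first, then the delayed operator
    ("X", "a"): ("U", "a*"),
}
FINAL_STATES = {"U"}


def translate(word, start="S"):
    """Return the translation of word, or None if the input is not accepted."""
    state = start
    output = []
    for symbol in word:
        step = TRANSITIONS.get((state, symbol))
        if step is None:
            return None  # no transition defined for this input
        state, emitted = step
        output.append(emitted)
    return "".join(output) if state in FINAL_STATES else None
```

For example, `translate("a+a")` yields the postfix form `aa+`, while incomplete input such as `a+` is rejected.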

Equivalency
There is an equivalency: Regular translation grammars ⇔ Translation FSMs

Examples
–Construct a regular grammar which generates decimal numbers with a sign +/-.
–Construct a context-free grammar which generates Boolean expressions in disjunctive form using and (*), or (+), negation (-) and input variables a, b, c; the output variable is y. The expression is terminated by a semicolon ";".

Notes
Grammars are used not only with languages; other generative systems can be defined by grammars:
–grammars of "nature"
–L-systems (Lindenmayer systems)
–a group of fractals defined by grammars

Sierpinski triangle
G = (V, P, S)
V = {S, G, F, +, -} –a finite set of symbols
P = {S → FGF++FF++FF, F → FF, G → ++FGF--FGF--FGF++}
Interpretation using "turtle graphics":
–"F" – move the turtle forward (drawing a line)
–"G" – ignore
–"+" – rotate to the left by a given angle
–"-" – rotate to the right by a given angle

[Figure: Sierpinski triangle, angle = 60 degrees (triangles)]

Helge von Koch curve
G = (V, P, S)
V = {S, F, +, -} –a finite set of symbols
P = {S → F+F--F+F, F → F+F--F+F}
"Turtle graphics":
–"F" – move the turtle forward (drawing a line)
–"+" – rotate to the left by a given angle
–"-" – rotate to the right by a given angle

[Figure: Helge von Koch curve, angle = 60 degrees]
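
The rewriting and the turtle interpretation of the Koch curve can be sketched as follows. All rules are applied in parallel to every symbol in each step; the function names `expand` and `turtle_points`, the unit step length and the starting heading along the x axis are assumptions made for this illustration.

```python
import math

# Rules of the Koch curve grammar; "+" and "-" have no rule and stay unchanged.
KOCH_RULES = {"S": "F+F--F+F", "F": "F+F--F+F"}


def expand(axiom, rules, steps):
    """Rewrite every symbol of the string in parallel, `steps` times."""
    current = axiom
    for _ in range(steps):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


def turtle_points(commands, angle_deg=60.0, step=1.0):
    """Interpret the string with turtle graphics and return the visited points."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for command in commands:
        if command == "F":    # move forward, drawing a line
            x += step * math.cos(math.radians(heading))
            y += step * math.sin(math.radians(heading))
            points.append((x, y))
        elif command == "+":  # rotate to the left
            heading += angle_deg
        elif command == "-":  # rotate to the right
            heading -= angle_deg
    return points
```

The list returned by `turtle_points(expand("S", KOCH_RULES, 3))` can be passed to any plotting tool to draw the curve; the number of line segments grows fourfold with each rewriting step.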