Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Expression parser
Message
From
25/11/2007 08:51:02
 
General information
Forum:
Visual FoxPro
Category:
Coding, syntax & commands
Environment versions
Visual FoxPro:
VFP 8 SP1
OS:
Windows XP
Database:
Visual FoxPro
Miscellaneous
Thread ID:
01271086
Message ID:
01271115
Views:
31
This message has been marked as a message which has helped to the initial question of the thread.
>Hi everybody,
>
>I'm trying to work out an algorithm for expression parser. A while ago I read something about it in Knut, but I could not find this book at the moment, besides, my expressions are much easier.
>
>Let me try to describe the problem.
>
>The expression may include units, such as sf, lb, etc.
>
>It also may include variables that correspond to properties of the class, say,
>
>cw - this.nCW
>
>f - this.nFloor
>
>c - this.cCeiling
>
>w - this.nWalls
>
>etc.
>
>Examples of expressions:
>
>cf is the same as c*f = this.cCeiling * this.nFloor
>
>2f the same as 2 * this.nFloor
>
>2@ is the same as 2
>
>2sf is the same as 2 (units)
>
>multiplication may be expressed by *, by x or just by placing var var (no space) or Number var or var number.
>
>There could be multiple parts, such as, I think:
>
>sf + cf
>
>This would be 1 + this.nCeiling * this.nFloor
>
>My original idea based on limited number of examples was to first remove all units from string, replace x with * and then work with variables translating them into class properties and inserting * if needed.
>
>I'm wondering if you can give me some direction about approaching this problem.
>
>Thanks.

>Hi everybody,
>
>I'm trying to work out an algorithm for expression parser. A while ago I read something about it in Knut, but I could not find this book at the moment, besides, my expressions are much easier.
>
>Let me try to describe the problem.
>
>The expression may include units, such as sf, lb, etc.
>
>It also may include variables that correspond to properties of the class, say,
>
>cw - this.nCW
>
>f - this.nFloor
>
>c - this.cCeiling
>
>w - this.nWalls
>
>etc.
>
>Examples of expressions:
>
>cf is the same as c*f = this.cCeiling * this.nFloor
>
>2f the same as 2 * this.nFloor
>
>2@ is the same as 2
>
>2sf is the same as 2 (units)
>
>multiplication may be expressed by *, by x or just by placing var var (no space) or Number var or var number.
>
>There could be multiple parts, such as, I think:
>
>sf + cf
>
>This would be 1 + this.nCeiling * this.nFloor
>
>My original idea based on limited number of examples was to first remove all units from string, replace x with * and then work with variables translating them into class properties and inserting * if needed.
>
>I'm wondering if you can give me some direction about approaching this problem.
>
>Thanks.
__________________________________________________________________________________________________
(1) multiplication may be expressed by *, by x or just by placing var var (no space) or Number var or var number
Difficult
2xy
is this 2 * x * y or 2 * y ?

y2

is this Y * 2 or a variable

3y is ok, since you have a number preceding the x

3e2

is this 3 * e * 2 or 300 ?

3aw

is this 3 * aw or 3 * a * w ?

(2) First you have to make an algorithm which breaks the 'input stream' into tokens (variables, operators, parens)
You may even accept [] as parens besides the usual ()
Therefore
- you have to define what syntax a variable can have, eg RegEx : [_a-zA-Z][_0-9a-zA-z]*
- what operators there are, eg * / + - ^
- be able to distinguish between a minus as an operator (eg in 1-3) and a unary minus (eg -3)

3) Token: following are all tokens
I make a distinction between
- an operator : + - * / ^
- unary minus : - (which is an operator with higher precedence than any other operator)
- a symbol or operand: variable or number
- parenthesis

To find out whether an infix expression is valid, you have to convert it to rpn (Reverse Polish Notation )
infix: 1 - 3
rpn: 1 3 -

4) for each of the above define
- an input precedence
- a stack precedence
- a rank

Input precedence : precedence in the input stream
Stack precedence: precedence when the token has been pushed on the stack


Rank:
- for an operator : 1 - number of operands ( eg for + it becomes 1 - 2 = -1 and for unary operators the ranks is 1 - 1 = 0)
- for an operand it is 1
- for parens it is undefined

example
&& ranks:
&&		operators	         1 - number of operands
&&		operand		1
&&		parens		undefined

&& except for parentheses, there should be no equal value in(input precedence and stack precedence)
&& when IP == SP, the symbols are not pushed on the stack, and do not appear in the output

&& left  associative : IP < SP
&& right associative : IP > SP

&& sample
&& symbol	  input		stack
&&	  precedence	precedence		rank
&& 
&& +, -		1	2			-1  = 1 - 2
&& *, /		3	4			-1
&& ^		6	5			-1
&& unary -	7	8			0
&& symbol		9	10			1
&& (		99998	-1			undefined
&& )		-1	undefined			undefined
&& LeftInternal[  99999     -2                         undefined
&& RightInternal]  -2       undefined                  undefined
An expression is well formed if
(1) Its Rank equals 1
(2) at no time does the rank of an expression < 1

Algo:
(1) a function (Token_Next(@m.Token)) which reads the input stream and returns a token each time it is called

return values: TRUE = I have a token, FALSE = no token or EndOfStream

(2)
Initialize a lookup array with all the operators (including unary minus), left parens, right parens (not symbol nor LeftInternal[ and RightInternal]) and columns for their input precedence, stack precedence and ranks

Initialize a stack
=Stack.push(LeftInternal[)
Rank = 0
EndOfInputStream = FALSE

Success = TRUE

do while m.Success
   do case
   case EndOfInputStream
       exit

   case !Token_next(@Token)
        EndOfInputStream = TRUE
        s_InputPrecedence = -2
        s_StackPrecedence = undefined
        s_Rank = undefined

   case Token in (LeftInternal[, RightInternal])
        Success = FALSE

   case lookup of token in array succeeds
        && operator or parens
         s_InputPrecedence = the input precedence of Token
         s_StackPrecedence = the stack precedence of Token
         s_Rank = the rank of Token

   otherwise && assume variable/number
        
         s_InputPrecedence = 9
         s_StackPrecedence = 10
         s_Rank = 1

    endcase

   && remove symbols with greater precedence from Stack
   do while m.Success
       do case
       && Peek stack, do not pop, ie non destructive
       && peek returns FALSE if stack is empty
       case !Stack.Peek( , , @TopStack_SP)
          Success = FALSE 

       case s_StackPrecedence >= m.TopStack_SP
	exit   

       case !m.this.Stack.Pop(@p_Symbol, @p_IP, @p_SP, @p_Rank)
           Success = FALSE

       && parens mismatch, ie not enough )).  We have popped a left parens
       case p_Rank is undefined
           Success = FALSE

       && output symbol
       case !Output( p_symbol )
           Success = FALSE

       otherwise
	Rank = m.Rank + m.p_Rank
					
	if( m.Rank < 1 )
		Success = FALSE
	endif
					
      endcase
				
    enddo && InnerLoop

    && Matching parenthesis ?
    do case
    case !m.Success
			
    case (m.s_InputPrecedence == m.TopStack_SP)
			
	if( !m.this.Stack.Pop() )
		Success = FALSE
	endif
		
   && we are about to push an unmatched  right parens
   case UNDEFINED(m.s_Rank) and UNDEFINED(s_StackPrecedence)
	Success = FALSE
		
   case !m.this.Stack.Push(Token, m.s_InputPrecedence, s_StackPrecedence, m.s_Rank)
       Success = FALSE
			
   endcase

enddo
do case
case !m.Success
		
case !m.this.Stack.Empty()
	Success = FALSE
		
case m.Rank <> 1
	Success = FALSE

endcase
Notes
(1) this is basically what a compiler does. a + b * c is converted to a b c * +

(2) To distinguish between a unary minus and a binary minus: a unary minus is at the beginning of the expression or immediatly after a left parens

(3) if you support expressions like 3ab meaning 3 * ab, you may have to return the extra token * in between

(4) for further explanations, google for infix to rpn algo's
Gregory
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform