Formal grammars
The new expression parser is defined using a formal grammar: a set of rules describing how tokens, the smallest units of the language, can be combined into something meaningful. The example below is a simple grammar that allows adding and multiplying numbers:
expr -> expr PLUS expr
expr -> expr MUL expr
expr -> LEFT_PARENTHESIS expr RIGHT_PARENTHESIS
expr -> NUMBER
Of course, we would still have to define the operator precedence to make it fully correct, but the basic idea is simple. We write a set of rules (called 'productions') saying how to build more and more complex parts of an expression from the smallest units. For a number of grammar classes, there are algorithms that can generate complete parser code automatically, and the class most popular for programming languages is LALR(1). All we have to do is write the grammar in the format accepted by the parser generator and run it to produce ready-to-use code.
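To make the precedence remark concrete, here is a minimal sketch of the same add-and-multiply language, written by hand in Python. It is a recursive-descent parser, not the LALR(1) technique a parser generator would use, and the names `tokenize`, `Parser` and `evaluate` are made up for this illustration. Precedence is encoded by splitting `expr` into three levels, so multiplication binds tighter than addition.

```python
import re

# Map each operator character to the token name used in the grammar above.
KINDS = {"+": "PLUS", "*": "MUL", "(": "LEFT_PARENTHESIS", ")": "RIGHT_PARENTHESIS"}


def tokenize(text):
    """Split the input into (kind, value) tokens, skipping whitespace."""
    tokens = []
    for lexeme in re.findall(r"[0-9]+|\S", text):
        if lexeme[0] in "0123456789":
            tokens.append(("NUMBER", int(lexeme)))
        elif lexeme in KINDS:
            tokens.append((KINDS[lexeme], lexeme))
        else:
            raise SyntaxError(f"unexpected character {lexeme!r}")
    tokens.append(("END", None))  # sentinel so peek() never runs off the list
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def take(self, kind):
        token_kind, value = self.tokens[self.pos]
        if token_kind != kind:
            raise SyntaxError(f"expected {kind}, found {token_kind}")
        self.pos += 1
        return value

    def expr(self):
        # expr -> term (PLUS term)*      lowest precedence
        value = self.term()
        while self.peek() == "PLUS":
            self.take("PLUS")
            value += self.term()
        return value

    def term(self):
        # term -> factor (MUL factor)*   binds tighter than PLUS
        value = self.factor()
        while self.peek() == "MUL":
            self.take("MUL")
            value *= self.factor()
        return value

    def factor(self):
        # factor -> NUMBER | LEFT_PARENTHESIS expr RIGHT_PARENTHESIS
        if self.peek() == "LEFT_PARENTHESIS":
            self.take("LEFT_PARENTHESIS")
            value = self.expr()
            self.take("RIGHT_PARENTHESIS")
            return value
        return self.take("NUMBER")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    parser.take("END")  # reject trailing garbage
    return value
```

With this layering, `evaluate("2 + 3 * 4")` gives 14 and `evaluate("(2 + 3) * 4")` gives 20. A parser generator reaches the same result from precedence declarations instead of hand-written levels.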
For Open Power Template, I chose PHP Parser Generator, a port of the Lemon parser generator from C to PHP. The same tool is used by the developers of Smarty 3, but they use it to parse the whole template. In OPT this is not necessary, because the template itself is processed by an ordinary XML parser. Below is the part of the grammar file that defines function calls:
calculated(res) ::= function_call(fc). { res = fc; }
calculated(res) ::= method_call(oc). { res = oc; }
function_call(res) ::= functional(fun). { res = $this->_expr->_makeFunction(fun); }
functional(f) ::= IDENTIFIER(s) L_BRACKET argument_list(a) R_BRACKET. { f = $this->_expr->_makeFunctional(s, a); }
functional(f) ::= IDENTIFIER(s) L_BRACKET container_def(a) R_BRACKET. { f = $this->_expr->_makeFunctional(s, array($this->_expr->_containerValue(a, Opt_Expression_Standard::CONTAINER_WEIGHT))); }
functional(f) ::= IDENTIFIER(s) L_BRACKET R_BRACKET. { f = $this->_expr->_makeFunctional(s, array()); }
As you can see, for each production we can attach a PHP code snippet that runs when the specified sequence of symbols is found. This makes defining a language much easier.
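The idea of attaching an action to a production can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how the generated parser works internally (a real LALR(1) parser decides when to reduce from state tables). The names `make_functional`, `PRODUCTIONS` and `reduce_top` are hypothetical, loosely modelled on the `functional` rules above.

```python
def make_functional(name, args):
    # Hypothetical stand-in for the _makeFunctional() helper in the grammar.
    return {"call": name, "args": list(args)}


# Each entry pairs the right-hand side of a production with the action
# that runs once that sequence of symbols has been recognised.
PRODUCTIONS = {
    ("IDENTIFIER", "L_BRACKET", "argument_list", "R_BRACKET"):
        lambda s, _l, a, _r: make_functional(s, a),
    ("IDENTIFIER", "L_BRACKET", "R_BRACKET"):
        lambda s, _l, _r: make_functional(s, []),
}


def reduce_top(stack):
    """Replace a matching tail of the stack with a single 'functional' symbol.

    The stack holds (symbol, value) pairs. Returns True if a production
    matched and its action was executed, False otherwise.
    """
    for rhs, action in PRODUCTIONS.items():
        if len(stack) < len(rhs):
            continue
        tail = stack[-len(rhs):]
        if tuple(symbol for symbol, _ in tail) == rhs:
            result = action(*(value for _, value in tail))
            del stack[-len(rhs):]
            stack.append(("functional", result))
            return True
    return False
```

Given a stack holding `IDENTIFIER`, `L_BRACKET`, `R_BRACKET` for the text `now()`, `reduce_top` collapses the three symbols into one `functional` entry whose value was built by the action, which is the role the `{ ... }` snippets play in the grammar file.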
New features of the OPT expression language
The new parser allowed me to implement lots of new and useful features. What OPT 2.0 lacked most was the ability to create a container in a template, like an array in PHP. We could not write:
{url(container('controller' => 'foo', 'action' => 'bar'))}
The routing path had to be either defined in a script or saved as a string. In OPT 2.1 this limitation is finally removed. We can create new containers dynamically, using a convenient syntax designed for use within XML documents:
{url( [ 'controller': 'foo', 'action': 'bar'] )}
{url('controller': 'foo', 'action': 'bar')}
As we can see, a shortened version (syntactic sugar) is available for function calls. The two lines above do exactly the same thing. The container call syntax has also been extended: it is now possible to select the container index dynamically:
$container.($index).foo
However, be aware that some data formats may not support it, so it will not always work.
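As a rough picture of what the dynamic index means, the following Python sketch walks a path through nested dictionaries. It assumes a container behaves like a nested dictionary, which is only one possible data format. The function `resolve` and the `("var", name)` convention for a dynamic segment are invented for this example.

```python
def resolve(variables, root, segments):
    """Follow a path such as $container.($index).foo.

    A plain string segment is a literal key. A ("var", name) tuple stands
    for a dynamic segment like ($index): its key is read from `variables`.
    """
    current = variables[root]
    for segment in segments:
        if isinstance(segment, tuple):
            key = variables[segment[1]]  # dynamic: look the key up first
        else:
            key = segment                # static: use the name as written
        current = current[key]
    return current


variables = {
    "index": "second",
    "container": {"first": {"foo": 1}, "second": {"foo": 2}},
}

# Equivalent of $container.($index).foo with $index == 'second'
value = resolve(variables, "container", [("var", "index"), "foo"])
```

Here `value` is 2, and changing `variables["index"]` to `"first"` would select the other branch without touching the path itself.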
A lot of effort went into developing new operators, which you should find comfortable and convenient, especially in longer expressions:
$number is between 5 and 10
$number is not between 5 and 10
$number is either 5 or 10
$number is neither 5 nor 10

$container contains 'foo'
$container contains either 'foo' or 'bar'
$container contains neither 'foo' nor 'bar'
$container contains both 'foo' and 'bar'

'foo' is in $container
'foo' is not in $container
'foo' is either in $foo or $bar
'foo' is neither in $foo nor $bar
'foo' is both in $foo and $bar
The first group of operators is a shortened form for expressions like $number > 5 and $number < 10. The second group tests the contents of containers, checking whether they contain one or more specified values. In the last group, the situation is reversed: we check whether an element exists in one or more containers. The advantage of the new operators is that data formats will be able to reprogram them, hiding even more implementation details from the template designer and improving template portability.
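The intended meaning of the three operator groups can be written down as plain Python predicates. This is only a sketch under two assumptions: `is between` uses strict bounds, following the `$number > 5 and $number < 10` example, and a container is modelled as a Python list whose values are tested. The function names are invented for the illustration, and a data format could implement each operator differently.

```python
# Group 1: comparisons against fixed values
def is_between(value, low, high):
    return low < value < high          # strict bounds (assumption)

def is_either(value, a, b):
    return value == a or value == b

def is_neither(value, a, b):
    return value != a and value != b


# Group 2: one container, one or more values
def contains_either(container, a, b):
    return a in container or b in container

def contains_neither(container, a, b):
    return a not in container and b not in container

def contains_both(container, a, b):
    return a in container and b in container


# Group 3: one value, one or more containers
def is_in_either(value, first, second):
    return value in first or value in second

def is_in_neither(value, first, second):
    return value not in first and value not in second

def is_in_both(value, first, second):
    return value in first and value in second
```

For example, `is_between(7, 5, 10)` is true while `is_between(5, 5, 10)` is false under the strict-bounds assumption, and `contains_both(['foo', 'bar'], 'foo', 'bar')` is true.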
Only one feature has been removed. In OPT 2.0 it is possible to write eq eq eq, and the parser correctly recognizes which "eq" is an operator and which is a string. Unfortunately, such a trick is not possible with ordinary LALR(1) grammars, so I decided not to implement it rather than produce a mess.
The last change concerning expressions is that direct access to PHP arrays and objects, $foo['bar'] and $foo::field, is disabled by default. Using them instead of containers and data formats is considered bad programming practice in OPT and reduces template portability. The idea is to encourage programmers to learn what containers are and why they suit templates better than PHP objects and arrays.
Conclusion
There is still a lot of work to do in Open Power Template 2.1. I hope that after the exams at university I will find more time to finish it. Anyway, I'll do my best to make OPT 2.1 the best template engine ever.