CodeSmith Generator 6.0 Template Parser Progress Report – Part 1

I am currently working on the new parser for CodeSmith Generator 6.0. When I wrote the original template parser back in 2004, I really had no idea what I was doing. 🙂 I didn’t know what an AST or an LL(*) parser was. That being said, I think I did an OK job, and it has certainly served its purpose, but it can’t be easily changed and it is more complicated than it needs to be.

I have been working on our new parser for a few weeks now and I have a pretty good start. Here is a list of things that we want to get out of the new parser:

  • Structure – Generate an AST as an intermediate step so that we can do transformations using a Visitor pattern and even LINQ queries.
  • Grammar – Create a formal grammar for the template syntax so that we can report meaningful syntax errors.
  • Testability – Make it easy to unit test.
  • Maintainability – Make it easy to implement changes and add new features.
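
To make the Structure goal a little more concrete, here is a minimal sketch of an AST that can be walked with a visitor or queried with LINQ. The node classes, the `ITemplateVisitor` interface and the `AstDemo` helper are hypothetical names made up for illustration; they are not the actual Generator 6.0 types (only the idea of an `ExpressionNode` with a `Value` property comes from the grammar sample later in this post).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical node types for illustration only; not the real Generator 6.0 AST.
public abstract class TemplateNode
{
    public abstract void Accept(ITemplateVisitor visitor);
}

public sealed class TextNode : TemplateNode
{
    public string Text { get; set; } = "";
    public override void Accept(ITemplateVisitor visitor) { visitor.Visit(this); }
}

public sealed class ExpressionNode : TemplateNode
{
    public string Value { get; set; } = "";
    public override void Accept(ITemplateVisitor visitor) { visitor.Visit(this); }
}

public interface ITemplateVisitor
{
    void Visit(TextNode node);
    void Visit(ExpressionNode node);
}

// A visitor that ignores literal text and collects every expression it sees.
public sealed class ExpressionCollector : ITemplateVisitor
{
    public List<string> Expressions { get; } = new List<string>();

    public void Visit(TextNode node) { }
    public void Visit(ExpressionNode node) { Expressions.Add(node.Value.Trim()); }
}

public static class AstDemo
{
    // Hand-built AST for the template text "Generated on <%= DateTime.Now %>".
    public static List<TemplateNode> BuildSample()
    {
        return new List<TemplateNode>
        {
            new TextNode { Text = "Generated on " },
            new ExpressionNode { Value = " DateTime.Now " }
        };
    }

    // Visitor pattern: each node dispatches to the matching Visit overload.
    public static List<string> CollectWithVisitor(IEnumerable<TemplateNode> nodes)
    {
        var collector = new ExpressionCollector();
        foreach (var node in nodes)
        {
            node.Accept(collector);
        }
        return collector.Expressions;
    }

    // LINQ: the same question asked as a query over the node list.
    public static List<string> CollectWithLinq(IEnumerable<TemplateNode> nodes)
    {
        return nodes.OfType<ExpressionNode>()
                    .Select(n => n.Value.Trim())
                    .ToList();
    }
}
```

Both `CollectWithVisitor` and `CollectWithLinq` pull the same expression text out of the sample tree. The visitor is the better fit for transformations that need per-node-type logic, while LINQ is handy for quick one-off questions about a template.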

I decided to make use of the parser framework that Actipro includes with their SyntaxEditor control. It is an LL(*) parser framework that lets you define the grammar right in C#. Here is what a sample grammar production looks like:

codeExpression.Production = @expressionStart + @expressionText["expr"] + @expressionEnd
    > Ast<ExpressionNode>()
        .SetProperty(n => n.Value, AstFrom("expr"));

This code defines the grammar rule for a code expression such as <%= SomeValue %> in a template. It says that there will be an expression start token, followed by some expression text, and then an expression end token. It also constructs an AST node for the expression. The parser framework has really helped get our new parser off the ground quickly and I highly recommend it.

The cool thing about defining a grammar is that if there are errors in the template, you get much more intuitive information about what the parser saw and what it was expecting to see. Previously, we pretty much just parsed the template, generated code from it and tried to compile it. As an example of the better errors coming in Generator 6.0, here is a template that is missing the % at the end of the first line:

<%@ CodeTemplate Language="C#" >
<%= DateTime.Now %>

Here is the error that Generator 5.3 gives:
error CodeSmith0104: You must specify a valid template language.

And here is the error message that the new parser gives:
Line 2, Char 3: %> expected.
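
To show where an "expected" message like this can come from, here is a small hand-written sketch. It is not the Actipro framework and not the actual Generator 6.0 parser; `DirectiveParser` and `ParseError` are hypothetical names, and the position it reports is simply where this toy scanner gives up, so it will not necessarily match the line and character numbers shown above.

```csharp
using System;

// Hypothetical error type that carries a 1-based line and character position.
public sealed class ParseError : Exception
{
    public int Line { get; }
    public int Column { get; }

    public ParseError(int line, int column, string message)
        : base($"Line {line}, Char {column}: {message}")
    {
        Line = line;
        Column = column;
    }
}

public static class DirectiveParser
{
    // Parses a "<%@ ... %>" directive at the start of the source and returns its body.
    public static string ParseDirective(string source)
    {
        if (!StartsWithAt(source, 0, "<%@"))
        {
            throw Error(source, 0, "<%@ expected.");
        }

        int i = 3;
        while (i < source.Length)
        {
            if (StartsWithAt(source, i, "%>"))
            {
                return source.Substring(3, i - 3).Trim();
            }
            // A new tag opening before the directive was closed means "%>" is missing.
            if (StartsWithAt(source, i, "<%"))
            {
                throw Error(source, i, "%> expected.");
            }
            i++;
        }

        // Ran off the end of the input without ever seeing the closing token.
        throw Error(source, i, "%> expected.");
    }

    private static bool StartsWithAt(string source, int index, string token)
    {
        return index + token.Length <= source.Length
            && string.CompareOrdinal(source, index, token, 0, token.Length) == 0;
    }

    // Converts a character offset into a 1-based line and character position.
    private static ParseError Error(string source, int index, string message)
    {
        int line = 1;
        int lastNewline = -1;
        for (int k = 0; k < index && k < source.Length; k++)
        {
            if (source[k] == '\n')
            {
                line++;
                lastNewline = k;
            }
        }
        return new ParseError(line, index - lastNewline, message);
    }
}
```

Calling `DirectiveParser.ParseDirective` on the broken template above throws a `ParseError` whose message says that %> was expected, along with the position where the scanner noticed the problem. Because the parser knows which token it was looking for, it can say so directly instead of failing later with an unrelated complaint about the template language.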

In my next post I will talk about the AST and how we are using the visitor pattern to do transformations on the parsed template content.

P.S. If you haven’t figured it out by now, I’m really bad about writing to my blog. Rather than break out the random excuse generator, I’m just going to start posting. Happy New Year!
