C++ Blog

Boost.Spirit – Easy to use Parser

Posted in boost by Umesh Sirsiwal on December 20, 2009

Note that this post applies to Spirit.Classic or 2.0.

I recently had write a parser in C++ and decided to give Boost.Spirit a chance. I was delighted with ease of use of the parser itself. It was a steep learning curve to get started. However, once I got started, it made life significantly simple.

Boost.Spirit provides a very simple way to create a parser using EBNF grammar. Let us consider an e-mail address parser as an example.
+alnum_p >> '@' >> +alnum_p >> *('.' >> +alnum_p);
Simple! Let us understand the above statement. In Boost.Spirit rule<> template defines basic grammar rule. A rule is made up of parsers and other rules. In Boost.Spirit built-in parser all have _p as suffix. The alnum_p parser matches any character or number. ‘+’ represents operator which matches one more more instances of the enclosed parser.

‘>>’ is an overloaded operator which simply says followed by.’@’ is short form of parser which matches character ‘@’.

So the above statement says any alphanumeric string followed by ‘@’ followed by any other alpha numeric string. This can than be followed by any number of “.<alphanumber>”.

By default the >> includes white space skipping. This implies the above parser will match (a@b as well as a @ b).

We don’t want white space skipping for e-mail address. So we need to modify the above rule such that >> does not skip white space. For this and other purposes, the Spirit allows directives. These are special instructions which can be applied to a part of the rule. No white space skipping is achieved by using lexeme_d.
lexeme_d[ +alnum_p >> '@' >> +alnum_p >> *('.' >> +alnum_p);];

I have no clue why the name. But the suffix is _d for all directives.OK! we got a rule. Now let us check if this rule matches a specified string:

bool isEmailAddress(const std::string& str){
return parse(str, lexeme_d[ +alnum_p >> '@' >> +alnum_p >> *('.' >> +alnum_p);], space_p).full;

The above statement will check if the supplied string is an e-mail address or not.

In the next post we will discuss how to create more complex examples as well as deal with parse actions.


One Response

Subscribe to comments with RSS.

  1. […] the last post we discussed how to write a simple parser using Spirit. Most real life parsers are a lot more […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: