Take note that the factor
rule does not explicitly use ch_p
.
Instead, the '('
, ')'
, and '-'
are applied directly as
arguments to the >>
operator. If one of the operands in a binary
expression is a Spirit parser, such as the factor
rule, the other operand
may be a literal character. The result of such an expression is another Spirit
parser. ch_p
is used in the integer
rule because a unary operator
(!)
is applied. Without the ch_p
, !'-'
would simply evaluate
to false.
This example is not very useful, although it is valid code. All it can do is recognize a string that belongs to the grammar. Listing One contains the complete code for a calculator that evaluates the expression given to it.
Listing One: A calculator that evaluates the expression given to it .
// Copyright (c) 2001, Joel de Guzman and Dan Nuffer // Permission is granted to use this code without restriction as // long as this copyright notice appears in all source files. #include <boost/spirit/spirit.hpp> #include <iostream> #include <stack> #include <functional> using namespace std; using namespace spirit; stack<long> evaluation_stack; struct push_int { void operator()(char const* str, char const*) const { long n = std::strtol(str, 0, 10); evaluation_stack.push(n); } }; template <class op> struct do_op { do_op(op const& the_op) : m_op(the_op) {} op m_op; void operator()(char const*, char const*) const { long rhs = evaluation_stack.top(); evaluation_stack.pop(); long lhs = evaluation_stack.top(); evaluation_stack.pop(); evaluation_stack.push(m_op(lhs, rhs)); } }; template <class op> do_op<op> make_op(op const& the_op) { return do_op<op>(the_op); } struct do_negate { void operator()(char const*, char const*) const { long lhs = evaluation_stack.top(); evaluation_stack.pop(); evaluation_stack.push(-lhs); } }; int main() { rule<> expression, term, factor, integer; integer = lexeme[ (!ch_p('-') >> +digit)[push_int()] ]; factor = integer | '(' >> expression >> ')' | ('-' >> factor)[do_negate()]; term = factor >> *( ('*' >> factor)[make_op(std::multiplies<long>())] | ('/' >> factor)[make_op(std::divides<long>())]); expression = term >> *( ('+' >> term)[make_op(std::plus<long>())] | ('-' >> term)[make_op(std::minus<long>())]); char str[256]; cin.getline(str, 256); if (parse(str, expression, space).full) { cout << "parsing succeeded\n"; cout << "result = " << evaluation_stack.top() << "\n\n"; evaluation_stack.pop(); } else { cout << "parsing failed\n"; } }
A semantic action can be either a free function or function object (or functor
in STL terminology). One may be attached to any expression within a Spirit grammar
definition, using the expression p[a]
where p
is a parser and
a
is a semantic action. When a match is made, the action will be called
and passed beginning and ending iterators (much like an STL algorithm) to the
input that was matched.
The push_int
semantic action functor converts its input from a string
into a long int
and then pushes it onto evaluation_stack
.
The do_op
semantic action template struct
will apply another
function to the top two values popped from evaluation_stack
and then
push the result back onto the stack. It is used to perform all binary arithmetic
operations done in the calculator.
For each binary operation (+
, -
, *
, and /
), the
appropriate do_op
is created using the function objects from the standard
library to do the operations. The make_op
helper function facilitates
creating do_op
classes.
The do_negate
semantic action will be called when the unary negation
-
operator is invoked. do_negate
pops a value from evaluation_stack
,
negates it, and pushes the result back onto the stack.
After the parser has been created, the program reads a line from cin
.
The expression is then parsed using the free parse
function from the
library. Various parse
functions in Spirit can be used in different situations.
The parse
function used in the example takes as its parameters a NULL
terminated string, the top-level rule (traditionally called the start symbol)
of the grammar to be used for parsing, and a skip parser, which in this case
is spirit::space
. The skip parser instructs the involved parser what
to skip. Using spirit::space
as the skip parser simply means that all
space characters in between symbols and words in the input will be skipped.
parse
returns a parse_info struct
:
template <typename IteratorT> struct parse_info { IteratorT stop; bool match; bool full; unsigned length; };
The member stop
points to the final parse position (i.e., parsing processed
the input up to this point). match
is true if parsing is successful, which
may be so if the parser consumed all the input (full match) or if the parser consumed
only a portion of the input (partial match). full
will be true when a full
match occurs, meaning the parser consumed all the input. length
is the
number of characters consumed by the parser and is valid only if a successful
match has occurred.
Finally, if the parse was successful, the result of the expression is printed along with a success message.
Compiling the example is straightforward because the Spirit library consists
of only headers, as all the classes are templates. This makes it easy to use.
The library can be used straight out of the box. You only need to include the
spirit.hpp
header. There is no library to link against.
The Parser
The most fundamental concept behind Spirit's design is the parser
class. A parser models a recognizer of a language from the simplest to the most
complex. It has a conceptual member function:
parse(iterator& first, iterator last)
This function does the work of inspecting the iterator
range and reporting
success or failure. The iterator first
, which is passed by reference, is
advanced accordingly when a match is found.
The parse
member function is conceptual, as opposed to virtual, in
the sense that the base class parser
does not really have any such member
function. Subclasses must provide one. The conceptual base class is a template
class parametized by its subclass, which gives it access to its subclass. Whenever
a parser is asked to do its task, it delegates the task to its subclass. This
process is very similar to how virtual functions work, but the difference is
that the member function is bound statically instead of dynamically (run-time
bound). James Coplien first popularized this technique of compile-time polymorphism
in an article in C++ Report entitled "Curiously Recurring Template
Patterns" [1]. Listing Two shows the
parser
class and some examples of trivial subclasses.
Listing Two: The parser class and some trivial subclass examples.
// Copyright (c) 2001, Joel de Guzman and Dan Nuffer // Permission is granted to use this code without restriction as // long as this copyright notice appears in all source files. template <typename DerivedT> struct parser { DerivedT& derived() { return *static_cast<DerivedT*>(this); } DerivedT const& derived() const { return *static_cast<DerivedT const*>(this); } }; template <typename DerivedT> struct char_parser : public parser<DerivedT> { template <typename IteratorT> match parse(IteratorT& first, IteratorT const& last) const { if (first != last) if (bool r = this->derived().test(*first)) { ++first; return match(1); } return match(); } ... }; template <typename CharT = char> class chlit : public char_parser<chlit<CharT> > { public: ... template <typename T> bool test(T ch_) const { return T(ch) == ch_; } private: CharT ch; };
Though quite simple, the example is not contrived, but it is an actual part
of the library. A chlit
object merely compares a single character for
a match and advances the iterator one step forward when successful. The success
of a parse is encoded in a match
object returned by the member function
parse
. Aside from reporting a true or false result, the number of matching
characters from the input can also be obtained from this object.