Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Open Source

Lua: an Extensible Embedded Language


In recent years, a multitude of little languages have been proposed for extending and customizing applications. In general, these extension languages should have the following attributes:

  • Clear and simple syntax (since it is not the main language for most of its users).
  • Small size and small implementation (so the cost of adding it to the host will not be too high).
  • Good data-description facilities (to make it useful as a configuration language).
  • Adequate extensibility (to allow its use in high abstraction levels — for interfacing with users in diverse domains).

Since extension languages are not for writing large pieces of software, mechanisms for supporting programming-in-the-large, like static type checking and information hiding, are not essential.

Lua, the extensible, embedded language we present here, satisfies these requirements. Its syntax and control structures are simple and familiar. Lua is small--the whole implementation is less than 6000 lines of ANSI C. Besides the facilities common to most procedural languages, Lua has special features that make it a powerful high-level extensible language:

  • Ability to define and manipulate functions as first-class values, which greatly simplifies the implementation of object-oriented facilities.
  • Associative arrays — powerful language constructs that implement most data containers.
  • Garbage collection, negating the need for explicit managing of memory allocations — a major source of programming errors.
  • A fallback mechanism, allowing extension of the semantics of the language.
  • Reflexive facilities, allowing the creation of highly polymorphic parts.

Lua is a general-purpose embedded programming language designed to support procedural programming with data-description facilities. Although it is not in the public domain (TeCGraf retains the copyright), Lua is freely available for both academic and commercial purposes at http://www.inf.puc-rio.br/~roberto/lua.html. The distribution also includes a standard library of mathematical functions (sin, cos, and so on), I/O and system functions, and string-manipulation functions. This optional library adds some 1000 lines of code to the system. Also included are a debugger and a separate compiler that produces portable binary files containing bytecodes. The code compiles without change in most ANSI C compilers, including gcc (on AIX, IRIX, Linux, Solaris, SunOS, and ULTRIX), Turbo C (on DOS), Visual C++ (on Windows 3.1/95/NT), Think C (MacOS), and CodeWarrior (MacOS).

All external identifiers are prefixed with lua to avoid name clashing when linking with applications. Even the code generated by yacc passes through a sed filter to comply with this rule, so that it is possible to link Lua with applications that use yacc for other purposes.

The Lua Implementation

Lua is provided as a small library of C functions to be linked to host applications. For example, the simplest Lua client is the interactive, stand-alone interpreter in Listing One. In this program, the function lua_dostring calls the interpreter over a section of code contained in a string. Each chunk of Lua code may contain a mixture of statements and function definitions.

The header file lua.h defines Lua's API, which has about 30 functions. Besides lua_dostring, there is a lua_dofile function to interpret Lua code contained in files, lua_getglobal and lua_setglobal to manipulate Lua global variables, lua_call to call Lua functions, lua_register to make C functions accessible from Lua, and so on.

Lua has a syntax somewhat similar to Pascal. To avoid dangling elses, control structures like ifs and whiles finish with an explicit end. Comments follow the Ada convention, starting with "--" and run until the end of the line. Lua supports multiple assignment; for example, x, y = y, x swaps the values of x and y. Likewise, functions can return multiple values.

Lua is a dynamically typed language. This means that values have types but variables don't, so there are no type or variable declarations. Internally, each value has a tag that identifies its type; the tag can be queried at run time with the built-in function type. Variables are typeless and can hold values of any type. Lua's garbage collection keeps track of which values are being used, discarding those that are not.

Lua provides the types nil, string, number, user data, function, and table. nil is the type of the value nil; its main property is that it is different from any other value. This is handy to use as the initial value of variables, for instance. The type number represents floating-point real numbers. string has the usual meaning. Type user data corresponds to a generic void* pointer in C, and represents host objects in Lua. All of these types are useful, but the flexibility of Lua is due to functions and tables, the product of two key lessons from Lisp and Scheme:

  • Functions should be first-class values.
  • Languages should have a single and strong unifying data constructor (lists in Lisp, tables in Lua).

Function values in Lua can be stored into variables, passed as parameters to other functions, stored in tables, and the like.

When you declare a function in Lua (see Listing Two), the function body is precompiled into bytecodes, creating a function value. This value is assigned to a global variable with the given name. C functions, on the other hand, are provided by the host program through an appropriate call to the API. Lua cannot call C functions that have not been registered by their host. Therefore, the host has complete control over what a Lua program can do, including any potentially dangerous access to the operating system.

Tables

Tables are for Lua what lists are for Lisp: powerful data-structuring mechanisms. A table in Lua is similar to an associative array. Associative arrays can be indexed with values of any type, not just numbers.

Many algorithms become trivial when implemented with associative arrays, because the data structures and algorithms for searching them are implicitly provided by the language. Lua implements associative arrays as hash tables.

Unlike other languages that implement associative arrays, tables in Lua are not bound to a variable name. Instead, they are dynamically created objects that can be manipulated much like pointers in conventional languages. In other words, tables are objects, not values. Variables do not contain tables, only references to them. Assignment, parameter passing, and function returns always manipulate references to tables, and do not imply any kind of copy. While this means that a table must be explicitly created before it is used, it also allows tables to freely refer to other tables. So, tables in Lua can be used to represent recursive data types and to create generic graph structures, even those with cycles.

Tables simulate records simply by using field names as indices. Lua makes this easier by providing a.name as syntactic sugar for a["name"]. Sets also can be easily implemented by storing their elements as indices of a table. Note that tables (and therefore sets) need not be homogeneous; they can store values of all types simultaneously, including functions and tables.

Lua provides a constructor, a special kind of expression to create tables, that is handy for initializing lists, arrays, records, etc. See Example 1.

User-Defined Constructors

Sometimes you need finer control over the data structures you are building. Following the philosophy of providing only a few general metamechanisms, Lua provides user-defined constructors. These constructors are written name{...}, which is a more intuitive version of name({...}). In other words, with such a constructor, a table is created, initialized, and passed as a parameter to a function. This function can do whatever initialization is needed, such as dynamic type checking, initialization of absent fields, and auxiliary data-structure update, even in the host program. (Listing Three)

User-defined constructors can be used to provide higher-level abstractions. So, in an environment with proper definitions, you can write window1=Window{x=200, y=300, color="blue"} and think about "windows," not plain tables. Moreover, because constructors are expressions, they can be nested to describe more complex structures in a declarative style, as in Listing Four.

Object-Oriented Programming

Because functions are first-class values, table fields can refer to functions. This is a step toward object-oriented programming, and one made easier by simpler syntax for defining and calling methods.

A method definition is written as Example 2(a), which is equivalent to Example 2(b). In other words, defining a method is equivalent to defining a function, with a hidden first parameter called self and storing the function in a table field.

A method call is written as receiver: method(params), which is translated to receiver.method(receiver,params). The receiver of the method is passed as the first argument of the method, giving the expected meaning to the parameter self.

These constructions do not provide information hiding, so purists may (correctly) claim that an important part of object orientation is missing. Moreover, Lua does not provide classes; each object carries its own method-dispatch tables. Nevertheless, these constructions are extremely light, and classes can be simulated using inheritance, as is common in other prototype-based languages, such as Self.

Fallbacks

Because Lua is an untyped language, many abnormal run-time events can happen: arithmetic operations being applied to nonnumerical operands, nontable values being indexed, nonfunction values being called. In typed, stand-alone languages, some of these conditions are flagged by the compiler; others result in aborting the program at run time. It's rude for an embedded language to abort its host program, so embedded languages usually provide hooks for error handling.

In Lua, these hooks are called "fallbacks" and are also used for handling situations that are not strictly error conditions, such as accessing an absent field in a table and signaling garbage collection. Lua provides default fallback handlers, but you can set your own handlers by calling the built-in function setfallback with two arguments: a string identifying the fallback condition (see Table 1), and the function to be called whenever the condition occurs. setfallback returns the old fallback function, so you can chain fallback handlers if necessary.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.