Querying
xOpen() is called when SQLite opens a cursor on a virtual table and prepares to iterate over it. xOpen() more or less just allocates a cursor object and passes it back. Similarly, when SQLite is done iterating over the table, it uses xClose() to clean up the cursor.
All of this so far is intuitive. The query process, however, is more involved.
SQLite's query process consists of three phases: compilation, execution, and finalization. The virtual table module interface addresses all three.
In compilation (when you call sqlite3_prepare()), SQLite calls xBestIndex(). This is where you can, to a degree, tell SQLite about certain optimizations that can be made to narrow the size of the result set.
In execution, SQLite calls xFilter() to initiate the search, followed by xNext() to iterate over each row of the result set. xNext() is called as a result of calling sqlite3_step(). xColumn() provides the values of each column of the current row. xColumn() is called as a result of calling one of the sqlite3_column_xxx() functions. SQLite uses xRowid() to obtain the primary key value for any given row, and xEof() to tell whether it has reached the end of the result set.
Using Indexes
Without xBestIndex(), the query process is really just a glorified for loop that iterates over the same set of data each time. In this case, it would just be a matter of starting at the root filesystem and recursively descending into each directory until you reach the end.
In this case, however, it would be horrible performance-wise to have to search the entire filesystem every time you did a query. What if you only wanted to search your home directory? It would be a complete waste of time for the virtual table to search every other directory outside of it as part of the search. This is where xBestIndex() comes in.
In some ways, xBestIndex() may be the most important part of the virtual table implementation. It is the only place where you can really dramatically reduce the size of your search path. And it may strongly influence the rest of your implementation, such as the contents of your vtab and cursor structures. Therefore, this is the area where we want to concentrate.
xBestIndex() and xFilter() work together, implementing a specific protocol whereby a set of columns can be identified as indexes. xBestIndex() works at query compile time and identifies the columns to be used as indexes. xFilter() works at runtime and obtains the constraint values for those columns.
To start, the xBestIndex() implementation is given by vt_best_index(), whose declaration is:
static int vt_best_index(sqlite3_vtab *tab, sqlite3_index_info *p_info)
SQLite passes xBestIndex() a reference to a sqlite3_index_info structure (p_info), which contains extensive information on the compiled query. This structure is organized into input and output sections. The input is an array of sqlite3_index_constraint pointers (Listing Five, available online). The output is an array of sqlite3_index_constraint_usage pointers (Listing Six, available online).
The aConstraint and aConstraintUsage contain the same number of elements, and each of their elements corresponds to the other. That is, aConstraintUseage[i] and aConstraint[i] both refer to the same column.
Consider the query:
SELECT * from fs WHERE name= 'foo' AND path='/var/log'
The constraint name='foo' AND path='/var/log' would be represented in xBestIndex() as:
p_info->aConstraint[0].iColumn = 0 p_info->aConstraint[0].op = SQLITE_INDEX_CONSTRAINT_EQ p_info->aConstraint[1].iColumn = 1 p_info->aConstraint[1].op = SQLITE_INDEX_CONSTRAINT_EQ
In this case, the name column is referenced in aConstraint[0]. We know it is the name column because its ordinal is 0 (given by aConstraint[0].iColumn), and according to our table schema, column ordinal 0 corresponds to name. The same follows for path. Furthermore, we are given the relational operator used in the expression as well. These operators are:
#define SQLITE_INDEX_CONSTRAINT_EQ 2 #define SQLITE_INDEX_CONSTRAINT_GT 4 #define SQLITE_INDEX_CONSTRAINT_LE 8 #define SQLITE_INDEX_CONSTRAINT_LT 16 #define SQLITE_INDEX_CONSTRAINT_GE 32 #define SQLITE_INDEX_CONSTRAINT_MATCH 64
Thus, we have the flexibility of choosing not only the column of interest, but the equality conditions under which it is being used.
Here, xBestIndex() reads through the inputs (p_info->aConstraint[]) and specifies which columns it wants to use in the output (p_info->aConstraintUsage[]). To use both columns as indexes in xFilter(), you would do this in xBestIndex():
p_info->aConstraintUsage[0] = 1 p_info->aConstraintUsage[1] = 2
The values assigned to the aConstraintUsage[] elements are significant. They specify the order in which they are passed to xFilter(). In the example, this says "pass xFilter() the name column first, then the path column."
To understand how this works, you need to know a little about xFilter(). The declaration of xFilter() is as follows:
static int vt_filter ( sqlite3_vtab_cursor *p_vtc, int idxNum, const char *idxSint argc, sqlite3_value **argv );
The values from xBestIndex() are passed via the argc and argv arguments. argc contains the size of argv, and argv values themselves. The order of the values in argv follows the order specified in p_info->aConstraintUsage in xBestIndex(). So, in our example above, argc will be 2 and argv will be as follows:
argv[0] = 'foo' argv[1] = '/var/log'
Now that xFilter() has this information, its job is to set up the cursor object so it can iterate through a specific set of data. In this case, we know that since path='/var/log', we will be limiting our search to just that directory (and not searching the whole filesystem). From that point forward, the cursor basically holds the state and iterates through all the files and directories in /var/log. The implementation is straightforwardit's just a standard recursive directory search using the APR library.