JDBC Statement Caching: Implicit vs. Explicit

05 Aug 2019

The Oracle JDBC driver offers two methods for client-side statement caching. Enabling statement caching changes the behavior when the client closes a prepared statement. With statement caching disabled, the driver simply closes the statement. This is a server call which closes the associated cursor on the database. When the same statement is executed again it has to be at least soft parsed again. With statement caching enabled, the statement and its associated database cursor are not closed. Instead the statement is put in a pool (basically a LRU list of statements) for reuse. This way soft parses can be avoided which saves database resources and improves performance, which is the main reason to use statement caching.

At a first glance, the two statement caching methods differ mainly in the way a statement can be retrieved from the cache, but they also differ in what kind of data is cached. There’s no detailed information about the latter. This post tries to shed some light into that area and reveals some interesting behavior. All tests where done with the 18.3 JDBC driver. If you want to test your driver version, check out the code on Github.

Cached Statement Retrieval

First, let’s have a look at some code to recap how these two methods are used.

Please note that while the examples always use PreparedStatement, caching is also done for CallableStatement but not for plain Statement.

Implicit Caching

The following code snippet demonstrates the use of implicit statement caching:

// Enable implicit cache. Cache 5 statements.
connection.setImplicitCachingEnabled(true);
connection.setStatementCacheSize(5);

// Prepare statement
PreparedStatement statement =
  connection.prepareStatement("select * from dual");

// Execute and fetch ...

// Statement is not closed but put it in the cache
statement.close();

// Statement is retrieved from the cache
statement = connection.prepareStatement("select * from dual");

The code dealing with the prepared statement looks exactly the same as it would without caching enabled. When preparing a statement, the cache is searched for an already existing statement with the exact same SQL text and result set type. This has the huge advantage that you could benefit from it without any code changes. It’s transparent to the client and therefore can be used with any framework in use. The only thing that the client needs to do, is to enable it either on the connection or on the data source.

Explicit Caching

Let’s have a look at code using explicit statement caching:

// Enable explicit cache. Cache 5 statements.
connection.setExplicitCachingEnabled(true);
connection.setStatementCacheSize(5);

// Prepare statement
OraclePreparedStatement statement =
  (OraclePreparedStatement) connection.prepareStatement("select * from dual");

// Execute and fetch ...

// Statement is not closed but put it in the cache
statement.closeWithKey("myCacheKey");

// Statement is retrieved from the cache
connection.getStatementWithKey("myCacheKey");

The main difference to implicit caching is how the statement is put into the cache and how it’s retrieved from it. The statement has to be closed by providing a key. This closeWithKey method is declared in OraclePreparedStatement and is not available in PreparedStatement. The cached statement can be retrieved with the same key. Again, getStatementWithKey needs an OracleConnection.

So explicit caching needs some more work to be used in your code and leads to tighter coupling to Oracle specific classes and methods. I’m not aware of any data access framework supporting it.

Cached Data

It looks like the two methods differ mainly in how statements are put in and retrieved from the cache. But there’s one more major difference. The Oracle JDBC Guide states the following:

Because explicit Statement caching retains statement data and state as well as metadata, it has a performance edge over implicit Statement caching, which retains only metadata. However, you must be cautious when using this type of caching, because explicit Statement caching saves all three types of information for reuse and you may not be aware of what data and state are retained from prior use of the statements.

The documentation talks about state, data and metadata. I don’t know what kind of data is supposed to fit into which of these categories. So, after some tests I defined the following categories to illustrate the caching behavior:

Properties - Properties set on the statement, which don’t retrieve or send any user data to the database server.
Data - User data or statement metadata to be sent or retrieved from the database server.

The test code can be found on Github.

Properties

The following table shows the different statement properties and whether these settings are retained when a statement is retrieved from the cache.

Property	Used method to set the property	Implicit	Explicit
Result Set Type	`Connection#prepareStatement`	N/A	✓
Result Set Concurrency	`Connection#prepareStatement`	N/A	✓
Fetch Size	`OraclePreparedStatement#setRowPrefetch`	✗	✓
LOB Prefetch Size	`OraclePreparedStatement#setLobPrefetchSize`	✓	✓
Max Field Size	`OraclePreparedStatement#setMaxFieldSize`	✗	✓
Query Timeout	`OraclePreparedStatement#setQueryTimeout`	✗	✓

Result Set Type and Concurrency

These properties are set by calling Connection#prepareStatement. For implicitly cached statements they are, together with the SQL text, part of the cache key. Therefore, a prepared statement is only found in the cache when these 2 properties match.

LOB Prefetch Size

As the regular fetch size setting is not retained in implicitly cached statements, I expected the same for LOB prefetch size. Interestingly this is not the case.

Data

The table below shows a summary of which kind of data is cached by which method. There are some interesting irregularities explained afterwards.

Data	Implicit	Explicit
Result Set	✗	✗
Result Set Metadata	✓	✓
Parameter Metadata	✓	✗
Update Count	✓	✓
Generated Keys	✗	✓
Bind Data	✗	✓
Batches	✗	✗

Result Set

When calling Statement#close the associated ResultSet is closed as well. This is also true when statement caching is enabled. Any subsequent calls on a result set returned by Statement#getResultSet will result in a SQLException.

Result Set Metadata

Result set metadata (e.g. retrieved by PreparedStatement#getMetaData) is retained by both explicit and implicit statement caching as it is stated in the documentation.

Parameter Metadata

As one would expect, parameter metadata (retrieved by PreparedStatement#getParameterMetaData) behaves exactly the same as result set metadata regarding caching. But interestingly parameter metadata is only cached when implicit caching is enabled. I assume it is because internally the JDBC driver executes some statements to retrieve this information from the database. With implicit caching enabled, these statements are automatically cached as well.

Update Count

After calling PreparedStatement#executeUpdate to insert or update records, the number of affected rows can be returned by calling Statement#getUpdateCount. The update count is retained for statements put into the cache. Interestingly this is also the case for implicitly cached statements. I don’t see any reason why this value is not reset for implicitly cached statements, and it’s not the behavior described in the documentation.

This is only a problem if Statement#getUpdateCount is called after retrieving a statement from the cache but before doing an update. In this case the method reports the update count from the previous execution, otherwise the value simply gets overwritten when an update statement is executed. The reported row count also depends on whether the Statement#getUpdateCount was called on the statement before it was put back into the cache. It either reports the updated rows (if the method was not called previously) or -1, which means the value was already retrieved.

Generated Keys

By calling Statement#getGeneratedKeys one can retrieve the keys generated by an insert statement. In fact, any columns could be returned, but it makes most sense to return generated keys, as they are not already known by the client. This method returns a regular ResultSet, but unlike the one returned by Statement#getResultSet this one is retained for explicitly cached statements.

Bind Data

Oracle uses three different ways for binding data depending on data size and type of SQL statement (SQL or PL/SQL): direct, stream or LOB binding. See the documentation for more information. Irrespective of the method used, bind data is always cached by explicit statement caching. You should be aware of this, as it can hugely impact your memory footprint when dealing with LOBs or other large data. Also keep it in mind when sizing the statement cache (OracleConnection#setStatementCacheSize).

Batches

To minimize round trips between the client and the database, a common technique is to batch bind data (PreparedStatement#addBatch) and send it to the database in one go. In contrast to non-batched statements where bind data is cached by explicit statement caching, it is not cached when using batches. Batches are always cleared when a statement is put into the cache.

Conclusion

Most data access frameworks nowadays take away the burden of SQL cursor handling. One of the benefits is, that it avoids cursor leaks, but on the other hands leads to more soft parses and cursor creation on the database side. JDBC statement caching does mitigate these issues, so using it is strongly recommended.

Implicit caching is very easy to use and there is no reason not to enable it.

Explicit caching requires changes in the application code. According to Oracle it can give you better performance because more data is cached. Whether this is only true if you actually reuse data from previously cached statements or also when you override this data (think of bind data) has to be thoroughly tested.

After all, you are not limited to use only one of the caching methods but can use both simultaneously and benefit from the strengths of each method. For this reason, you should be aware of the different data that is retained by the two caching methods. The tests have shown the behavior is not always what one would expect after reading the documentation.

blog. aregger.io