JDBC Statement Caching: Implicit vs. Explicit
05 Aug 2019The Oracle JDBC driver offers two methods for client-side statement caching. Enabling statement caching changes the behavior when the client closes a prepared statement. With statement caching disabled, the driver simply closes the statement. This is a server call which closes the associated cursor on the database. When the same statement is executed again it has to be at least soft parsed again. With statement caching enabled, the statement and its associated database cursor are not closed. Instead the statement is put in a pool (basically a LRU list of statements) for reuse. This way soft parses can be avoided which saves database resources and improves performance, which is the main reason to use statement caching.
At a first glance, the two statement caching methods differ mainly in the way a statement can be retrieved from the cache, but they also differ in what kind of data is cached. There’s no detailed information about the latter. This post tries to shed some light into that area and reveals some interesting behavior. All tests where done with the 18.3 JDBC driver. If you want to test your driver version, check out the code on Github.
Cached Statement Retrieval
First, let’s have a look at some code to recap how these two methods are used.
Please note that while the examples always use PreparedStatement
, caching is also done for CallableStatement
but not for plain Statement
.
Implicit Caching
The following code snippet demonstrates the use of implicit statement caching:
The code dealing with the prepared statement looks exactly the same as it would without caching enabled. When preparing a statement, the cache is searched for an already existing statement with the exact same SQL text and result set type. This has the huge advantage that you could benefit from it without any code changes. It’s transparent to the client and therefore can be used with any framework in use. The only thing that the client needs to do, is to enable it either on the connection or on the data source.
Explicit Caching
Let’s have a look at code using explicit statement caching:
The main difference to implicit caching is how the statement is put into the cache and how it’s retrieved from it. The statement has to be closed by providing a key. This closeWithKey
method is declared in OraclePreparedStatement
and is not available in PreparedStatement
. The cached statement can be retrieved with the same key. Again, getStatementWithKey
needs an OracleConnection
.
So explicit caching needs some more work to be used in your code and leads to tighter coupling to Oracle specific classes and methods. I’m not aware of any data access framework supporting it.
Cached Data
It looks like the two methods differ mainly in how statements are put in and retrieved from the cache. But there’s one more major difference. The Oracle JDBC Guide states the following:
Because explicit Statement caching retains statement data and state as well as metadata, it has a performance edge over implicit Statement caching, which retains only metadata. However, you must be cautious when using this type of caching, because explicit Statement caching saves all three types of information for reuse and you may not be aware of what data and state are retained from prior use of the statements.
The documentation talks about state, data and metadata. I don’t know what kind of data is supposed to fit into which of these categories. So, after some tests I defined the following categories to illustrate the caching behavior:
- Properties - Properties set on the statement, which don’t retrieve or send any user data to the database server.
- Data - User data or statement metadata to be sent or retrieved from the database server.
The test code can be found on Github.
Properties
The following table shows the different statement properties and whether these settings are retained when a statement is retrieved from the cache.
Property | Used method to set the property | Implicit | Explicit |
---|---|---|---|
Result Set Type | Connection#prepareStatement |
N/A | ✓ |
Result Set Concurrency | Connection#prepareStatement |
N/A | ✓ |
Fetch Size | OraclePreparedStatement#setRowPrefetch |
✗ | ✓ |
LOB Prefetch Size | OraclePreparedStatement#setLobPrefetchSize |
✓ | ✓ |
Max Field Size | OraclePreparedStatement#setMaxFieldSize |
✗ | ✓ |
Query Timeout | OraclePreparedStatement#setQueryTimeout |
✗ | ✓ |
Result Set Type and Concurrency
These properties are set by calling Connection#prepareStatement
. For implicitly cached statements they are, together with the SQL text, part of the cache key. Therefore, a prepared statement is only found in the cache when these 2 properties match.
LOB Prefetch Size
As the regular fetch size setting is not retained in implicitly cached statements, I expected the same for LOB prefetch size. Interestingly this is not the case.
Data
The table below shows a summary of which kind of data is cached by which method. There are some interesting irregularities explained afterwards.
Data | Implicit | Explicit |
---|---|---|
Result Set | ✗ | ✗ |
Result Set Metadata | ✓ | ✓ |
Parameter Metadata | ✓ | ✗ |
Update Count | ✓ | ✓ |
Generated Keys | ✗ | ✓ |
Bind Data | ✗ | ✓ |
Batches | ✗ | ✗ |
Result Set
When calling Statement#close
the associated ResultSet
is closed as well. This is also true when statement caching is enabled. Any subsequent calls on a result set returned by Statement#getResultSet
will result in a SQLException
.
Result Set Metadata
Result set metadata (e.g. retrieved by PreparedStatement#getMetaData
) is retained by both explicit and implicit statement caching as it is stated in the documentation.
Parameter Metadata
As one would expect, parameter metadata (retrieved by PreparedStatement#getParameterMetaData
) behaves exactly the same as result set metadata regarding caching. But interestingly parameter metadata is only cached when implicit caching is enabled. I assume it is because internally the JDBC driver executes some statements to retrieve this information from the database. With implicit caching enabled, these statements are automatically cached as well.
Update Count
After calling PreparedStatement#executeUpdate
to insert or update records, the number of affected rows can be returned by calling Statement#getUpdateCount
. The update count is retained for statements put into the cache. Interestingly this is also the case for implicitly cached statements. I don’t see any reason why this value is not reset for implicitly cached statements, and it’s not the behavior described in the documentation.
This is only a problem if Statement#getUpdateCount
is called after retrieving a statement from the cache but before doing an update. In this case the method reports the update count from the previous execution, otherwise the value simply gets overwritten when an update statement is executed.
The reported row count also depends on whether the Statement#getUpdateCount
was called on the statement before it was put back into the cache. It either reports the updated rows (if the method was not called previously) or -1, which means the value was already retrieved.
Generated Keys
By calling Statement#getGeneratedKeys
one can retrieve the keys generated by an insert statement. In fact, any columns could be returned, but it makes most sense to return generated keys, as they are not already known by the client.
This method returns a regular ResultSet
, but unlike the one returned by Statement#getResultSet
this one is retained for explicitly cached statements.
Bind Data
Oracle uses three different ways for binding data depending on data size and type of SQL statement (SQL or PL/SQL): direct, stream or LOB binding. See the documentation for more information. Irrespective of the method used, bind data is always cached by explicit statement caching. You should be aware of this, as it can hugely impact your memory footprint when dealing with LOBs or other large data. Also keep it in mind when sizing the statement cache (OracleConnection#setStatementCacheSize
).
Batches
To minimize round trips between the client and the database, a common technique is to batch bind data (PreparedStatement#addBatch
) and send it to the database in one go. In contrast to non-batched statements where bind data is cached by explicit statement caching, it is not cached when using batches. Batches are always cleared when a statement is put into the cache.
Conclusion
Most data access frameworks nowadays take away the burden of SQL cursor handling. One of the benefits is, that it avoids cursor leaks, but on the other hands leads to more soft parses and cursor creation on the database side. JDBC statement caching does mitigate these issues, so using it is strongly recommended.
Implicit caching is very easy to use and there is no reason not to enable it.
Explicit caching requires changes in the application code. According to Oracle it can give you better performance because more data is cached. Whether this is only true if you actually reuse data from previously cached statements or also when you override this data (think of bind data) has to be thoroughly tested.
After all, you are not limited to use only one of the caching methods but can use both simultaneously and benefit from the strengths of each method. For this reason, you should be aware of the different data that is retained by the two caching methods. The tests have shown the behavior is not always what one would expect after reading the documentation.