Migrating from Oracle to Azure SQL caveat — prepared statement set string causes implicit conversion
In the previous post, I discuss the issue we face with storing date and time after migrating to azure SQL and the solution. If interested, you can check it out here. In this post, I talk about an issue we encountered with regards to implicit conversion, causing high CPU usage and performance degradation.
While monitoring the azure database for issues after the migration, we noticed very high CPU and memory consumption because of a query that ran continuously and required an index scan. We have an index on the filtering column in the where clause, and the query returned only one record per run. Therefore, we were intrigued as to why the query caused high CPU usage. In fact, when we ran the query directly via SQL Server Management Studio, we got the result almost instantaneously because the database performed an index seek as expected. The slowness only occurred when the query originated from our java application.
Below is a simplified version of the query coming from the application:
SELECT id, column_1 from mySchema.myTable Where stringId = @PO
In the above query, @PO is the parameter in the prepared statement.
Below shows the simplified code in the java application for generating and executing the prepared statement:
String selectByStringId = "SELECT id, column_1 cd from mySchema.myTable WHERE stringId = ?";
String id = "abcde"
Connection connection = getConnection();
PreparedStatement preparedStatement = connection.prepareStatement(selectByStringId);
ResultSet resultSet = preparedStatement.executeQuery();
In the codes, we used
setString() method on the prepared statement to set the id, which is a string. When sending the parameter, Microsoft SQL JDBC driver set the type as nvarchar, which is for Unicode. However, the corresponding column's type is varchar. Because nvarchar precedes varchar, the database implicitly converted the value to varchar. The implicit conversion caused the high CPU usage.
When an operator combines expressions of different data types, the data type with the lower precedence is first converted to the data type with the higher precedence
The reason the JDBC driver sent the parameter as nvarchar is to avoid client side conversion as Java’s native string type is Unicode.
We needed a way to avoid implicit conversion from happening. One way is to migrate the data type of the target column from varchar to nvarchar. However, we did not go with this approach since we have not had the need to support Unicode characters, and using nvarchar could take more space.
Another approach requires setting the property sendStringParametersAsUnicode to false. As the name suggests, this property tells the JDBC driver to not send string parameters as nvarchar. The default value is true. Per the document, there are a couple ways to set the property. For us, we append the property to the connection string.
Once we set the property, we verified that the database no longer performed the implicit conversion, and the query no longer caused high CPU usage when running continuously.