Improving Data Load using the new Ingres 10 Batch Execution Feature
Posted 2010-04-14 at 09:14 PM by teresa
Updated 2010-04-14 at 09:24 PM by teresa (removed blank space)
Updated 2010-04-14 at 09:24 PM by teresa (removed blank space)
One of the new features coming with Ingres 10 is the complete support of batched statements in JDBC. The Ingres JDBC Driver has always supported the addBatch and executeBatch methods but under the covers executed each statement in the batch individually. While this worked, it resulted in less than optimal performance. With Ingres 10, a number of changes were made to the Ingres DBMS, Data Access (DAS) Servers and the JDBC Driver to utilize a feature of the Ingres communications layer to send statements across in batches, thereby reducing communications overhead. In addition, the Ingres DBMS implemented a number of optimizations to improve the processing of these batched queries. All of these changes have enabled big improvements in response time when processing large batches of statements through JDBC.
Batch execution is restricted to queries that return either a response or a row count, but no data. For example,
To trigger the optimizations in the DBMS Server the following criteria must be met:
In testing out the functionality internally, the following behaviors were seen.
Turning off the Improved Batch Execution Support
This new functionality will be enabled by default with the Ingres JDBC Driver as it is expected to provide significant performance benefits; however, if for your particular environment you find it doesn’t, you can force individual statement execution by using the ingres.jdbc.batch.enabled system property to disable batch query execution.
Availability in other Clients
Batch statement execution will also be supported in OpenAPI with Ingres 10. Future Ingres versions will extend this capability to other drivers.
How can I get access to the new functionality?
The feature is now available for testing and was recently released to the community in Ingres 10 b 121. The feature is enabled in JDBC Driver version 4.0.1 which is either available as part of the Ingres 10 b 121 release or from the JDBC 4.0 Community Preview download. More information regarding the features in the 4.0.x JDBC Driver can be found at JDBC 4.0.x New Features.
If you are interested in the details of the implementation in Ingres, the design specification can be found at here.
Sample programs demonstrating the batch capabilities for Java and OpenAPI can be found at Ingres Coding Samples.
Summary:
Performance testing so far has shown a significant improvement in processing batched rows with the largest improvements being with larger batch sizes.
Smaller batch sizes, say 10 records per batched query, don’t generate as much improvement due to the overhead of setting up the optimizations in the DBMS that offsets the performance improvements in sending the data in batch across the communications pipe. In fact, better performance for this case is gotten by using individual insert statements in a batch rather than using prepared statements because individual statements don’t invoke the new optimizations in the DBMS.
JDBC batched statement execution does not exactly match the performance of the Ingres Copy command but it is a significant improvement over Ingres 9.x and earlier.
Many factors can come into play when using the feature and generating performance numbers, from the table structure to the row size.
To benefit from this feature requires an updated JDBC Driver (minimum 4.0.1) and Ingres 10 b121 DBMS and Data Access (DAS) Servers.
We are very interested in having the functionality tested by some real world applications to determine whether the changes made are on the right track or do we need to step back and approach from another angle.
Feedback on your testing can either be provided to the forum or via a Service Desk issue. Any and all feedback greatly appreciated.
Batch execution is restricted to queries that return either a response or a row count, but no data. For example,
- Statements allowed include INSERT, DELETE, and statements that return no data, such as CREATE TABLE.
- SELECT statements and row-producing procedures are not allowed in batched queries.
- If the DBMS does not support batch processing, the Ingres JDBC driver will detect it and automatically execute the statements individually.
To trigger the optimizations in the DBMS Server the following criteria must be met:
- Statement must be a prepared dynamic statement
- Statement must be of the form “INSERT INTO……VALUES”
- Any values that are expected to change for each insert must be parameterized. As with all inserts, columns can be omitted if they are nullable or have default values (including identity columns).
- Parameters for the query must include only column values, other parameters (e.g. table name) are not allowed
In testing out the functionality internally, the following behaviors were seen.
- For local execution of non-dynamic statements and procedures, batch appears to have only a small performance improvement.
- For remote execution of non-dynamic statements, JDBC batch performance increases significantly up to a batch size of about 100 statements.
- For non-dynamic statements, there does not appear to be a significant improvement in performance for batch sizes over 100.
- Batch sizes of 100 and greater reduced communications round-trips to the point that little difference was seen between local and remote JDBC execution.
- For dynamic statements, performance just continues to increase as batch size increases due to the copy optimization.
- The copy-optimization for prepared statements is actually very detrimental for small batch sizes. Single statement execution ran at about a minute. Doing the same thing with a batch size of two statements increased the time to between 12 and 13 minutes! Execution times then dropped to 2 1/2 minutes at a batch size of 10 statements and only regained parity at a batch size approaching 100 statements. On the other hand, really large batches went screaming fast.
Turning off the Improved Batch Execution Support
This new functionality will be enabled by default with the Ingres JDBC Driver as it is expected to provide significant performance benefits; however, if for your particular environment you find it doesn’t, you can force individual statement execution by using the ingres.jdbc.batch.enabled system property to disable batch query execution.
Availability in other Clients
Batch statement execution will also be supported in OpenAPI with Ingres 10. Future Ingres versions will extend this capability to other drivers.
How can I get access to the new functionality?
The feature is now available for testing and was recently released to the community in Ingres 10 b 121. The feature is enabled in JDBC Driver version 4.0.1 which is either available as part of the Ingres 10 b 121 release or from the JDBC 4.0 Community Preview download. More information regarding the features in the 4.0.x JDBC Driver can be found at JDBC 4.0.x New Features.
If you are interested in the details of the implementation in Ingres, the design specification can be found at here.
Sample programs demonstrating the batch capabilities for Java and OpenAPI can be found at Ingres Coding Samples.
Summary:
Performance testing so far has shown a significant improvement in processing batched rows with the largest improvements being with larger batch sizes.
Smaller batch sizes, say 10 records per batched query, don’t generate as much improvement due to the overhead of setting up the optimizations in the DBMS that offsets the performance improvements in sending the data in batch across the communications pipe. In fact, better performance for this case is gotten by using individual insert statements in a batch rather than using prepared statements because individual statements don’t invoke the new optimizations in the DBMS.
JDBC batched statement execution does not exactly match the performance of the Ingres Copy command but it is a significant improvement over Ingres 9.x and earlier.
Many factors can come into play when using the feature and generating performance numbers, from the table structure to the row size.
To benefit from this feature requires an updated JDBC Driver (minimum 4.0.1) and Ingres 10 b121 DBMS and Data Access (DAS) Servers.
We are very interested in having the functionality tested by some real world applications to determine whether the changes made are on the right track or do we need to step back and approach from another angle.
Feedback on your testing can either be provided to the forum or via a Service Desk issue. Any and all feedback greatly appreciated.






