Optimizing Connector Performance for S3 to Teradata

Summary

User is developing a connector in Java and Kotlin to transfer 36 GB of data (162M records) from an S3 bucket to Teradata, and is experiencing performance issues: the transfer takes over 7 hours. They seek advice on modifying the batch size and other optimizations.
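As a rough sanity check on the figures above (a sketch only; the 36 GB, 162M-record, and 7-hour numbers are taken from the question), the implied sustained throughput can be computed directly:

```java
public class ThroughputEstimate {
    // Average throughput in MB/s for a given byte count and duration in hours.
    public static double mbPerSecond(long bytes, double hours) {
        return bytes / (1024.0 * 1024.0) / (hours * 3600.0);
    }

    // Average rows per second for a given row count and duration in hours.
    public static double rowsPerSecond(long rows, double hours) {
        return rows / (hours * 3600.0);
    }

    public static void main(String[] args) {
        long bytes = 36L * 1024 * 1024 * 1024; // 36 GB, as reported
        long rows = 162_000_000L;              // 162M records, as reported
        System.out.printf("%.2f MB/s, %.0f rows/s%n",
                mbPerSecond(bytes, 7.0), rowsPerSecond(rows, 7.0));
    }
}
```

At roughly 1.5 MB/s and ~6,400 rows/s, the bottleneck is far below what either S3 reads or Teradata bulk loads can sustain, which is why batch size and load protocol are the natural first things to tune.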


Question

hello, I am coding my own connector in Java and Kotlin, fetching data from an S3 bucket (AWS) into Teradata. I have about 36 GB of data to load; each record contains an id, a JSON object, and a date (162M records in total). This is taking too much time: 7+ hours. I want to optimize that. How can I modify the batch_size? The default value seems to be 25 MB.
Can you suggest additional tweaks to optimize the performance, please?

Info: I am using the com.teradata.jdbc:terajdbc4:17.20.00.12 driver.
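In plain JDBC the batch size is under your control: you choose how many rows to accumulate before calling `executeBatch()`. A minimal sketch, assuming a simple three-column target table; the host, credentials, table name, and the 50,000-row batch size are placeholders to tune, and the `TYPE=FASTLOAD` URL parameter asks the Teradata JDBC driver to use its FastLoad protocol (which requires the target table to be empty):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class TeradataBatchLoader {
    // Hypothetical connection details; TYPE=FASTLOAD enables the driver's
    // bulk-load protocol, SESSIONS controls parallel FastLoad sessions.
    private static final String URL =
            "jdbc:teradata://tdhost/TYPE=FASTLOAD,SESSIONS=8";
    private static final int BATCH_SIZE = 50_000; // rows per executeBatch(); tune this

    public record Record(long id, String json, java.sql.Timestamp date) {}

    public static void load(Iterable<Record> records) throws Exception {
        try (Connection conn = DriverManager.getConnection(URL, "user", "pass")) {
            conn.setAutoCommit(false); // batches are sent as blocks, committed once
            String sql = "INSERT INTO mydb.target (id, payload, created_at) VALUES (?, ?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                int pending = 0;
                for (Record r : records) {
                    ps.setLong(1, r.id());
                    ps.setString(2, r.json());
                    ps.setTimestamp(3, r.date());
                    ps.addBatch();
                    if (++pending == BATCH_SIZE) { // flush a full batch
                        ps.executeBatch();
                        pending = 0;
                    }
                }
                if (pending > 0) ps.executeBatch(); // flush the final partial batch
            }
            conn.commit();
        }
    }

    // Number of executeBatch() calls needed for a given row count (ceiling division).
    public static long batchCount(long rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }
}
```

With large row batches the per-round-trip overhead is amortized far better than with small ones; 162M records at 50,000 rows per batch is 3,240 `executeBatch()` calls. Reading multiple S3 objects concurrently and feeding several loader threads is a common complementary tweak, though with FastLoad the empty-table restriction usually means loading into a staging table first.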



This topic has been created from a Slack thread to give it more visibility.
It is read-only here; the original thread remains available on Slack.


Tags: s3, teradata, batch-size, performance-optimization, java, kotlin, jdbc-driver