Response streaming in Python CDK custom connector

I have an API endpoint that returns a very large CSV in a single response.

Is it safe to pass “stream=True” through the request_kwargs method and iterate over the response rows in the parse_response method? Will the response connection be closed automatically?

I am referring to the iter_lines method of the requests.Response object.

Hey, can you help us understand how big the file can be? The Amazon source has a similar implementation, and it works fine even with fairly large files, so I would suggest trying it.

I am talking about 3-4 GB files. I don’t think it is a good idea to hold all of that data in memory at once: we noticed a few MemoryError exceptions in our old non-CDK connector and decided to rewrite the download logic to stream the response using requests.get(url, stream=True).

Can you please clarify which implementation you are referring to?

That’s a big file. In that case, yes, I think streaming is the way to go. You can find the Amazon implementation, which also downloads CSV, in this file: https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-amazon-seller-partner/source_amazon_seller_partner/streams.py

Also, which source are you talking about?
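
For reference, here is a rough sketch of the streaming approach discussed in this thread. It is not taken from the Amazon connector or from the CDK itself. The class name StreamingCsvSketch is hypothetical, and the class deliberately does not subclass anything from the CDK so that it runs on its own; in a real connector the two methods would live on your stream class and use the CDK's full method signatures. It also assumes the CSV is UTF-8 encoded, has a header row, and has no line breaks inside quoted fields (iter_lines splits on every newline). On the connection question: as far as the requests documentation goes, with stream=True the connection is only released once the body is fully consumed or Response.close() is called, so the sketch closes the response explicitly rather than relying on anything automatic.

```python
import csv
from typing import Any, Iterator, Mapping


class StreamingCsvSketch:
    """Hypothetical stand-in for a connector stream class (not a CDK class)."""

    def request_kwargs(self, *args: Any, **kwargs: Any) -> Mapping[str, Any]:
        # Ask requests to defer downloading the body until it is iterated.
        return {"stream": True}

    def parse_response(self, response: Any, **kwargs: Any) -> Iterator[Mapping[str, str]]:
        try:
            # iter_lines() yields the body lazily as bytes, one line at a time.
            # Decoding as UTF-8 is an assumption about the API's output.
            lines = (line.decode("utf-8") for line in response.iter_lines())
            # DictReader uses the first line as the header and yields one dict per row.
            yield from csv.DictReader(lines)
        finally:
            # Runs when the rows are exhausted, on error, or when the consumer
            # stops early and the generator is closed.
            response.close()
```

Because parse_response is a generator, only one row is held in memory at a time. Note that the finally block only runs once iteration has started, so a response that is never iterated would still need to be closed by the caller.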