Response streaming in Python CDK custom connector

dipertq · May 28, 2022, 3:22pm

I have an API endpoint that returns a very large CSV file in a single response.

Is it safe to set "stream=True" in the request_kwargs method and iterate over the response rows in the parse_response method? Will the response connection be closed automatically?

I am referring to the iter_lines method of the requests.Response object.
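For reference, here is a minimal sketch of the pattern being asked about. It is not taken from the CDK or from any existing connector. The helper name iter_csv_rows is hypothetical, and the sketch assumes a UTF-8 encoded CSV with a header row. As far as the requests documentation goes, a streamed response only releases its connection once the body is fully consumed or Response.close() is called, so the sketch closes the response explicitly rather than relying on automatic cleanup.

```python
import csv
from typing import Dict, Iterator


def iter_csv_rows(response) -> Iterator[Dict[str, str]]:
    """Yield CSV rows one at a time from a streamed response.

    `response` is expected to behave like a requests.Response obtained
    with stream=True: it must provide iter_lines() and close().
    Assumption: the body is UTF-8 text and the first line is the header.
    """
    try:
        # iter_lines() yields bytes one line at a time, so the whole
        # body is never held in memory at once.
        lines = (line.decode("utf-8") for line in response.iter_lines())
        yield from csv.DictReader(lines)
    finally:
        # Release the connection even if the consumer stops early
        # or an exception interrupts the iteration.
        response.close()


# Hypothetical usage inside a parse_response method:
#     yield from iter_csv_rows(response)
```

One caveat: iter_lines strips line terminators, so a quoted CSV field that contains an embedded newline would not be reconstructed exactly. If the data can contain such fields, a different chunking approach may be needed.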

harshith · May 30, 2022, 6:00am

Hey, can you help us understand how big the file can be? The Amazon source has a similar implementation, and it works well even with fairly large files, so I would suggest trying it.

dipertq · May 30, 2022, 10:12pm

I am talking about 3-4 GB files. I don't think it is a good idea to hold all of this data in memory at once: we saw a few MemoryError exceptions in our old non-CDK connector and decided to rewrite the download logic to stream the response using requests.get(url, stream=True).

Can you please clarify which implementation you are referring to?

harshith · May 31, 2022, 4:47am

That's a big file. Then yes, I think streaming is the way to go. The Amazon implementation, which also downloads CSV, is in this file: https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-amazon-seller-partner/source_amazon_seller_partner/streams.py

Also, which source are you talking about?

marcosmarxm · July 13, 2022, 12:00am

Hi there from the Community Assistance team.
We’re letting you know about an issue we discovered with the back-end process we use to handle topics and responses on the forum. If you experienced a situation where you posted the last message in a topic that did not receive any further replies, please open a new topic to continue the discussion. In addition, if you’re having a problem and find a closed topic on the subject, go ahead and open a new topic on it and we’ll follow up with you. We apologize for the inconvenience, and appreciate your willingness to work with us to provide a supportive community.
