Memory / Disk: around 4 GB / 1 TB
Deployment: Docker
Airbyte Version: v0.40.11
Source name/version: source-marketo
Destination name/version: destination-redshift
Step: The issue is happening during sync
Description: The Marketo sync fails with the following error:
destination > Starting a new buffer for stream marketo_leads (current state: 0 bytes in 1 buffers)
source > Encountered an exception while reading stream leads
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 113, in read
yield from self._read_stream(
File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 182, in _read_stream
for record in record_iterator:
File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 246, in _read_incremental
stream_state = stream_instance.get_updated_state(stream_state, record_data)
File "/airbyte/integration_code/source_marketo/source.py", line 99, in get_updated_state
self.cursor_field: max(
TypeError: '>' not supported between instances of 'str' and 'NoneType'
This happens after an update of the Marketo connector; the connection only syncs leads, campaigns, and programs from Marketo.
At the end of the sync, only the campaigns stream has a connection state.
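The `TypeError` in the traceback above comes from `max()` comparing a string cursor value with `None`. Below is a minimal sketch of a None-safe cursor update; the class and field names here are illustrative (mirroring the CDK `get_updated_state` pattern), and the guard itself is an assumption about a possible fix, not the shipped connector code.

```python
from typing import Any, Mapping


class LeadsLikeStream:
    # Hypothetical stream with a cursor field that may be missing or null
    # in some records, as appears to happen with older Marketo data.
    cursor_field = "updatedAt"

    def get_updated_state(
        self,
        current_stream_state: Mapping[str, Any],
        latest_record: Mapping[str, Any],
    ) -> Mapping[str, Any]:
        # Drop None before calling max(); comparing str > NoneType raises
        # the TypeError seen in the traceback.
        candidates = [
            current_stream_state.get(self.cursor_field),
            latest_record.get(self.cursor_field),
        ]
        non_null = [value for value in candidates if value is not None]
        return {self.cursor_field: max(non_null)} if non_null else {}
```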
Hello there! You are receiving this message because none of your fellow community members has stepped in to respond to your topic post. (If you are a community member and you are reading this response, feel free to jump in if you have the answer!) As a result, the Community Assistance Team has been made aware of this topic and will be investigating and responding as quickly as possible.
Some important considerations that will help you get your issue solved faster:
It is best to use our topic creation template; if you haven't yet, we recommend posting a follow-up with the requested information. With that information the team will be able to more quickly search for similar issues with connectors and the platform, and troubleshoot your specific question or problem faster.
Make sure to upload the complete log file; a common investigation roadblock is that sometimes the error for the issue happens well before the problem is surfaced to the user, and so having the tail of the log is less useful than having the whole log to scan through.
Be as descriptive and specific as possible; when investigating it is extremely valuable to know what steps were taken to encounter the issue, what version of connector / platform / Java / Python / docker / k8s was used, etc. The more context supplied, the quicker the investigation can start on your topic and the faster we can drive towards an answer.
We in the Community Assistance Team are glad you've made yourself part of our community, and we'll do our best to answer your questions and resolve the problems as quickly as possible. Expect to hear from a specific team member as soon as possible.
Thank you for your time and attention.
Best,
The Community Assistance Team
Hey @CBarbault, sorry to hear you're having this issue! Let me look into it, my guess is that something got updated in the new version and caused breaking changes with your data. As a first step, could you try making a new test connector and do a small trial sync?
Hey, when I run a small sync (just a few days) the sync succeeds and the state is correctly saved.
But when I run a full sync of my data (2 years), my sync worker fails.
(The connector I'm using is a new connector)
EDIT: I tried syncing 1 year and 1 month without success
EDIT 2: When running on my local machine (Docker, Airbyte v0.40.25) I got the following error message:
Additional Failure Information: message='java.lang.IllegalStateException: Job ran during migration from Legacy State to Per Stream State.
One of the streams that did not have state is: io.airbyte.protocol.models.StreamDescriptor@13dbea3d[name=leads,namespace=<null>,additionalProperties={}].
Job must be retried in order to properly store state.', type='java.lang.RuntimeException', nonRetryable=false
Yes, I'm using "Normalized tabular data". The amount of data I'm trying to sync is quite large (e.g. 400,000+ leads), so I'm using the incremental dedup sync.
I'm thinking there must be something in your older data that is causing a type error - the connector is trying to compare a string and a null value and that's causing the exception. Do you see any airbyte_raw tables? Look for the leads one and look for a null value if you can! Then we can take it from there!
We do have data in Marketo (e.g. more than 400k leads).
I've tried running the connection in full refresh overwrite mode, but this time it failed with the error message: Additional Failure Information: invalid literal for int() with base 10: 'Asia/Bangkok'
Ah! So it looks like more type errors. I think there is something that needs to be corrected in the connector code, or something is being set incorrectly in Marketo. Could you tell me what field this 'Asia/Bangkok' datapoint comes from? You ran only the leads stream, right? We need to pinpoint where this is happening. I'm looking through the leads stream and see a few integer fields. https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-marketo/source_marketo/schemas/leads.json
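To illustrate the failure mode being discussed (the field and function names here are hypothetical, not from the connector; the actual declared types live in the leads.json schema linked above): casting a field declared as integer fails when Marketo returns a timezone string, so a defensive cast could fall back to the raw value instead of raising.

```python
def cast_lead_field(value: str, declared_type: str):
    # Coerce a raw CSV value to the type declared in the stream schema.
    # Falls back to the raw string when the cast fails, instead of letting
    # int("Asia/Bangkok") raise "invalid literal for int() with base 10".
    if declared_type == "integer":
        try:
            return int(value)
        except ValueError:
            return value
    return value
```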
@natalyjazzviolin
I was able to fix the issue by using the unicodecsv library instead of the csv one and removing decode_unicode=True.
Edit: I found an even simpler fix: setting response.encoding = 'utf-8'.
In fact, response.encoding was set to the default value of ISO-8859-1, as requests wasn't able to detect the UTF-8 encoding.
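The effect of that fix can be sketched with plain stdlib calls (sample data, not the actual source-marketo code): requests falls back to ISO-8859-1 when the server sends no charset, so UTF-8 bytes in the CSV export decode to mojibake unless you force `response.encoding = 'utf-8'` before reading `response.text`.

```python
import csv
import io

# A UTF-8 CSV chunk as a bulk export might return it (illustrative data).
raw_bytes = "email,firstName\nsomchai@example.com,Précïs\n".encode("utf-8")

# What requests' ISO-8859-1 fallback yields: multibyte chars become mojibake.
garbled = raw_bytes.decode("iso-8859-1")

# What forcing response.encoding = 'utf-8' yields: the text decodes correctly.
correct = raw_bytes.decode("utf-8")

rows = list(csv.DictReader(io.StringIO(correct)))
```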
Hey @natalyjazzviolin, I'm still in need of a review on the PR; you're on the list of reviewers, could you take a look? That would be super nice of you!