Marketo connector error TypeError: '>' not supported between instances of 'str' and 'NoneType'

CBarbault · December 13, 2022, 2:52pm

Is this your first time deploying Airbyte?: No
OS Version / Instance: docker on EC2
Memory / Disk: you can use something like 4Gb / 1 Tb
Deployment: Docker
Airbyte Version: v0.40.11
Source name/version: source-marketo
Destination name/version: destination-redshift
Step: The issue is happening during sync
Description: Marketo sync fails with the following error:

destination > Starting a new buffer for stream marketo_leads (current state: 0 bytes in 1 buffers)
source > Encountered an exception while reading stream leads
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 113, in read
    yield from self._read_stream(
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 182, in _read_stream
    for record in record_iterator:
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 246, in _read_incremental
    stream_state = stream_instance.get_updated_state(stream_state, record_data)
  File "/airbyte/integration_code/source_marketo/source.py", line 99, in get_updated_state
    self.cursor_field: max(
TypeError: '>' not supported between instances of 'str' and 'NoneType'

This started happening after an update of the Marketo connector. The connection only syncs leads, campaigns and programs from Marketo.
At the end of the sync, only the campaigns stream has a connection state.

27960fb9_f0c0_414c_bda4_792667d7d031_logs_1222_txt.txt (186.1 KB)

marcosmarxm · December 13, 2022, 5:02pm

Hello there! You are receiving this message because none of your fellow community members has stepped in to respond to your topic post. (If you are a community member and you are reading this response, feel free to jump in if you have the answer!) As a result, the Community Assistance Team has been made aware of this topic and will be investigating and responding as quickly as possible.
Some important considerations that will help you get your issue solved faster:

It is best to use our topic creation template; if you haven’t yet, we recommend posting a followup with the requested information. With that information the team will be able to search for similar issues with connectors and the platform, and troubleshoot your specific question or problem more quickly.
Make sure to upload the complete log file; a common investigation roadblock is that sometimes the error for the issue happens well before the problem is surfaced to the user, and so having the tail of the log is less useful than having the whole log to scan through.
Be as descriptive and specific as possible; when investigating it is extremely valuable to know what steps were taken to encounter the issue, what version of connector / platform / Java / Python / docker / k8s was used, etc. The more context supplied, the quicker the investigation can start on your topic and the faster we can drive towards an answer.
We in the Community Assistance Team are glad you’ve made yourself part of our community, and we’ll do our best to answer your questions and resolve the problems as quickly as possible. Expect to hear from a specific team member as soon as possible.

Thank you for your time and attention.
Best,
The Community Assistance Team

natalyjazzviolin · December 13, 2022, 6:21pm

Hey @CBarbault, sorry to hear you’re having this issue! Let me look into it. My guess is that something got updated in the new version and caused breaking changes with your data. As a first step, could you try making a new test connector and doing a small trial sync?

CBarbault · December 14, 2022, 9:56am

Hey, when I run a small sync (just a few days), the sync succeeds and the state is correctly saved.
But when I run a full sync of my data (2 years), my sync worker fails.

(The connector I’m using is a new connector)

EDIT : I tried syncing 1 year and 1 month without success
EDIT 2: When running on my local machine (docker Airbyte v0.40.25) I got the following error message:

Additional Failure Information: message='java.lang.IllegalStateException: Job ran during migration from Legacy State to Per Stream State. 
One of the streams that did not have state is: io.airbyte.protocol.models.StreamDescriptor@13dbea3d[name=leads,namespace=<null>,additionalProperties={}]. 
Job must be retried in order to properly store state.', type='java.lang.RuntimeException', nonRetryable=false

CBarbault · December 19, 2022, 2:18pm

I’m still getting this error. Do you have any idea what could be causing it, @natalyjazzviolin?

natalyjazzviolin · December 19, 2022, 9:29pm

Hey Cyprien! I have completed a successful sync and was not able to replicate this issue. Are you using normalization?

CBarbault · December 20, 2022, 8:57am

Yes, I’m using "Normalized tabular data". The amount of data I’m trying to sync is quite large (e.g. 400,000+ leads), so I’m using the incremental dedup sync.

natalyjazzviolin · December 21, 2022, 3:48pm

I’m thinking there must be something in your older data that is causing a type error - the connector is trying to compare a string and a null value and that’s causing the exception. Do you see any airbyte_raw tables? Look for the leads one and look for a null value if you can! Then we can take it from there!
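A minimal sketch of that failure mode, assuming the cursor update boils down to a `max()` over the record's cursor value and the saved state. The traceback points at a `max(` call inside `get_updated_state`, but its exact arguments are not shown in this thread, so the values below are made up. `safe_max_cursor` is a hypothetical helper, not part of the connector:

```python
def safe_max_cursor(record_value, state_value):
    """Return the larger non-None cursor value, or None if both are missing."""
    candidates = [v for v in (record_value, state_value) if v is not None]
    return max(candidates) if candidates else None


# Comparing a missing (None) cursor with an ISO timestamp string raises
# a TypeError of the same kind as the one in the traceback.
try:
    max(None, "2022-12-01T00:00:00Z")
except TypeError as exc:
    error_message = str(exc)

# The None-safe variant ignores the missing value instead of comparing it.
result = safe_max_cursor(None, "2022-12-01T00:00:00Z")
```

A guard like this would only hide the symptom. A null cursor on a record may well mean that the record itself was parsed incorrectly, so it is still worth finding the offending row.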

CBarbault · December 22, 2022, 2:35pm

I do have those tables, but it appears that _airbyte_raw_marketo_programs is empty.

natalyjazzviolin · December 22, 2022, 4:34pm

And does that data exist for you in Marketo?

CBarbault · December 26, 2022, 9:30am

We do have data in Marketo (e.g. more than 400k leads).

I’ve tried running the connection in full refresh override but this time it failed with the error message Additional Failure Information: invalid literal for int() with base 10: 'Asia/Bangkok'

natalyjazzviolin · December 27, 2022, 3:40pm

Ah! So looks like more type errors. I think there is something that needs to be corrected in the connector code, or something is being set incorrectly in Marketo. Could you tell me what field this ‘Asia/Bangkok’ datapoint comes from? You ran only the leads stream, right? We need to pinpoint where this is happening. I’m looking through the leads stream and see a few integer fields.
https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-marketo/source_marketo/schemas/leads.json
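One way to pinpoint the field is to test every integer-typed field of a raw record and report the ones that cannot be cast. This is a rough sketch, not connector code: `find_bad_integer_fields`, the field names and the sample record are all hypothetical, and the real list of integer fields would have to come from the `leads.json` schema linked above.

```python
def find_bad_integer_fields(record, integer_fields):
    """Return {field: value} for integer-typed fields whose value is not int-parsable."""
    bad = {}
    for name in integer_fields:
        value = record.get(name)
        if value is None or value == "":
            continue  # empty values are not the problem we are hunting
        try:
            int(value)
        except (TypeError, ValueError):
            bad[name] = value
    return bad


# Made-up record: one integer field holds a timezone string.
sample_record = {"id": "Asia/Bangkok", "leadScore": "42", "company": "Acme"}
bad_fields = find_bad_integer_fields(sample_record, ["id", "leadScore"])
```

Running this over the raw rows of a failing sync should narrow the search down to the few records where a non-numeric value has ended up in a numeric column.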

CBarbault · January 2, 2023, 10:23am

Hello, I wish you a happy new year,

I was not able to find which field is causing the error, but here are the logs from a failing sync :
27960fb9_f0c0_414c_bda4_792667d7d031_logs_1334_txt.txt (46.3 KB)

To narrow down the problem, this sync only targets the “leads” stream, starting from 2023-12-20.

Edit: Looking at our data, “Asia/Bangkok” refers to a persontimezone value from Marketo.

CBarbault · January 2, 2023, 5:15pm

Hey again.

I pulled the Airbyte repository to investigate the Marketo connector and found the following:

The lead that breaks the pipeline has a name containing the letters ễ Đ ứ c (Vietnamese characters, I guess).
All of its fields are shifted into the wrong columns (e.g. billingCity contains the email, id contains the timezone…).

I guess this has something to do with character encoding not being handled correctly.
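One plausible mechanism for the shifted fields, which is not confirmed in this thread: the UTF-8 encoding of "ễ" ends with the byte 0x85, and when that byte is decoded as ISO-8859-1 it becomes U+0085 (NEXT LINE), which Python's `str.splitlines()` treats as a line break. A CSV row containing that character is then cut in two, and the second half is read as a new row starting at the first column. The sketch below reproduces this with the standard library only; the header and row are made up.

```python
import csv

# Made-up export: a header and one lead whose name contains "ễ".
HEADER = "id,firstName,email,timezone"
ROW = "7,Nguyễn,lead@example.com,Asia/Bangkok"
raw = (HEADER + "\n" + ROW + "\n").encode("utf-8")


def parse(raw_bytes, encoding):
    """Decode the payload, split it into lines, and parse each line as CSV."""
    lines = raw_bytes.decode(encoding).splitlines()
    return list(csv.reader(lines))


wrong = parse(raw, "iso-8859-1")  # the byte 0x85 turns into a line break
right = parse(raw, "utf-8")       # the name stays in one piece

# With the wrong encoding, the lead is split into two rows and the tail of
# the row starts again at the first column, so values land under the wrong headers.
shifted = dict(zip(wrong[0], wrong[2]))
```

In this example the email ends up under `firstName` and the timezone under `email`, which is the same kind of shift described above.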

CBarbault · January 3, 2023, 10:08am

@natalyjazzviolin
I was able to fix the issue by using the unicodecsv library instead of the csv one and removing decode_unicode=True.

Edit: I found an even simpler fix: setting response.encoding = 'utf-8'.
In fact, response.encoding had been set to the default value of ISO-8859-1 because it wasn’t able to detect the utf-8 encoding.
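A rough sketch of the simpler fix, under stated assumptions. `read_leads` is a hypothetical function, not the connector's actual code. `FakeResponse` is a stand-in that mimics only the two pieces of `requests.Response` used here (the `encoding` attribute and `iter_lines(decode_unicode=True)`), so that the example runs without a network call or the `requests` package.

```python
import csv


class FakeResponse:
    """Hypothetical stand-in for requests.Response, for this demo only."""

    def __init__(self, content, encoding="ISO-8859-1"):
        self.content = content
        self.encoding = encoding  # fallback guess, as described above

    def iter_lines(self, decode_unicode=False):
        # Simplified: always decodes with self.encoding and splits into lines.
        yield from self.content.decode(self.encoding).splitlines()


def read_leads(response):
    """Force UTF-8 before decoding, then parse the CSV export row by row."""
    response.encoding = "utf-8"
    return list(csv.DictReader(response.iter_lines(decode_unicode=True)))


body = "id,firstName,timezone\n7,Nguyễn,Asia/Bangkok\n".encode("utf-8")
rows = read_leads(FakeResponse(body))
```

With a real `requests.Response`, the same assignment to `response.encoding` before iterating over the lines should have the same effect, since the library uses that attribute when decoding text.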

natalyjazzviolin · January 9, 2023, 6:44pm

That is wonderful to hear, thank you so much for the update!

CBarbault · January 10, 2023, 3:32pm

Thanks for your assistance.

Link to the PR → 🐛 Source Marketo: fix encoding error for Lead sync by CyprienBarbault · Pull Request #20973 · airbytehq/airbyte · GitHub

CBarbault · January 27, 2023, 8:47am

Hey @natalyjazzviolin, I’m still in need of a review on the PR. You’re in the list of reviewers, could you take a look? That would be super nice of you!

github.com/airbytehq/airbyte

🐛 Source Marketo: fix encoding error for Lead sync

airbytehq:master ← CyprienBarbault:feat-20641/marketo-encoding

opened 04:01PM - 03 Jan 23 UTC

CyprienBarbault

+5 -3

## What

This is a fix for an issue I reported -> https://github.com/airbytehq/a…irbyte/issues/20641

When syncing Marketo leads, it appears that those with a name containing non-Latin characters make the sync fail. In particular, we have a lead whose name is Vietnamese (cf. https://en.wikipedia.org/wiki/Vietnamese_name for example names).

In fact, the response from the Marketo API was falling back to the default encoding `ISO-8859-1` instead of `utf-8` (cf. https://github.com/psf/requests/issues/5445#issuecomment-661654120).

## How

Forcing the encoding to `utf-8` fixed the issue.

## Recommended reading order

1. `source.py`

## 🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

I'm not sure but I don't think so.

## Pre-merge Checklist

Updating a connector

### Community member or Airbyter

- [x] Grant edit access to maintainers ([instructions](https://docs.github.com/en/github/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests))
- [ ] Secrets in the connector's spec are annotated with `airbyte_secret`
- [ ] Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run `./gradlew :airbyte-integrations:connectors:<name>:integrationTest`.
- [ ] Code reviews completed
- [ ] Documentation updated
    - [ ] Connector's `README.md`
    - [ ] Connector's `bootstrap.md`. See [description and examples](https://docs.google.com/document/d/1ypdgmwmEHWv-TrO4_YOQ7pAJGVrMp5BOkEVh831N260/edit?usp=sharing)
    - [ ] Changelog updated in `docs/integrations/<source or destination>/<name>.md` including changelog. See changelog [example](https://docs.airbyte.io/integrations/sources/stripe#changelog)
- [x] PR name follows [PR naming conventions](https://docs.airbyte.com/contributing-to-airbyte/issues-and-pull-requests)

### Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

- [ ] Create a non-forked branch based on this PR and test the below items on it
- [ ] Build is successful
- [ ] If new credentials are required for use in CI, add them to GSM. [Instructions](https://docs.airbyte.io/connector-development#using-credentials-in-ci).
- [ ] [`/test connector=connectors/<name>` command](https://docs.airbyte.io/connector-development#updating-an-existing-connector) is passing
- [ ] New Connector version released on Dockerhub and connector version bumped by running the `/publish` command described [here](https://docs.airbyte.io/connector-development#updating-an-existing-connector)

## Tests

- Unit: [unit_test_result.txt](https://github.com/airbytehq/airbyte/files/10338292/unit_test_result.txt)
- Integration: *Put your integration tests output here.*
- Acceptance: *Put your acceptance tests output here.*
