Hi,
I succeeded in getting a custom report data from GA4 API , then send it to bigquery.
Now, I wanna update only incremental data from the source but there is nowhere to setup custom cursor field or set up the end date is yesterday’s date.
Is there any way to do so?
Let’s me explain a little bit:
So, here is some of my setup info:
GA4 version: Google Analytics Data API ALPHA 0.0.3
Custom reports: [{“name”: “Revenue by ad events”, “dimensions”: [“date”, “country”, “appVersion”, “eventName”], “metrics”: [“totalAdRevenue”, “totalPurchasers”, “totalRevenue”]}]
If I choose Sync Mode = Incremental | Append (or Deduped History), the Cursor Field is default as date, which means when I call api at, ex: 9am 15/09/2022, then all record after 9am of 15/09/2022 won’t be taken anymore.
Even though I can use Sync Mode = Full Refresh | Overwrite, this means is definitely a bad idea for long term usage (in case I get more custom report, and the date range is longer)
Hello there! You are receiving this message because none of your fellow community members has stepped in to respond to your topic post. (If you are a community member and you are reading this response, feel free to jump in if you have the answer!) As a result, the Community Assistance Team has been made aware of this topic and will be investigating and responding as quickly as possible.
Some important considerations that will help your to get your issue solved faster:
It is best to use our topic creation template; if you haven’t yet, we recommend posting a followup with the requested information. With that information the team will be able to more quickly search for similar issues with connectors and the platform and troubleshoot more quickly your specific question or problem.
Make sure to upload the complete log file; a common investigation roadblock is that sometimes the error for the issue happens well before the problem is surfaced to the user, and so having the tail of the log is less useful than having the whole log to scan through.
Be as descriptive and specific as possible; when investigating it is extremely valuable to know what steps were taken to encounter the issue, what version of connector / platform / Java / Python / docker / k8s was used, etc. The more context supplied, the quicker the investigation can start on your topic and the faster we can drive towards an answer.
We in the Community Assistance Team are glad you’ve made yourself part of our community, and we’ll do our best to answer your questions and resolve the problems as quickly as possible. Expect to hear from a specific team member as soon as possible.
Thank you for your time and attention.
Best,
The Community Assistance Team
Thanks for your post and welcome to the community! I may be mistaken but the Incremental | Deduped + History mode should sync all of the data as long it’s a daily sync. Have you tried creating this connection and are there indeed records that are not synced?
Unfortunately, the GA4 connector doesn’t support any other cursors other than the date.
Hi @sajarin,
Thanks for replying.
I did try Incremental | Deduped + History mode first, and the problem occurs like I explained above:
When I made a daily API call at 7am. And saying for example today is 19/09/2022
Then any record between 7AM of 19/09/2022 and 12AM 20/09/2022 will NOT be updated since Cursors Field of the Source is Date
(I assume Airbyte syncs data of date 19/09 at 7AM, and when it checks the source again, the date 19/09 already exists, so it skips updating record between 7AM and 12AM)
Thanks for clarifying. This issue happens because the date field is not granular enough to pick up records that are updated in the course of a day. You can either schedule one sync within a 24 hour period everyday, or use Incremental | Append (if the specific stream supports it) and perform a clean job to remove the duplicate/stale records. Another option would be to update the GA4 connector to use something more granular for the cursor but that would require some time to investigate and merge onto our main branch.
Hi @sajarin,
Hope you can help!
I tried to apply Incremental | Append methods since my data full refresh are getting bigger and bigger (take hours to query if I use Full Refresh | Overwrite).
So I have read about cursor field value in this doc. And in Known Limitations Part, it talks about max cursor field value.
Thus, I have a question about where the ‘last_sync_max_cursor_field_value’ stored? Is there anyway I can modify this value without writing custom code?
FYI: I did try to delete today’s records in destination (in my case is Big Query), then sync again using Incremental | Append, but it always returns no new records.
Here some logs:
First sync return all records up to today’s date which is 03/10/2022. Then I deleted all 03/10/2022 record in destination
Second sync when using Incremental | Append
2022-10-03 03:23:22 source > Starting syncing SourceGoogleAnalyticsDataApi
2022-10-03 03:23:22 source > Syncing stream: revenue by ad events
2022-10-03 03:23:22 source > Setting state of revenue by ad events stream to {'date': '2022-10-03'}
2022-10-03 03:23:22 source > Read 0 records from revenue by ad events stream
2022-10-03 03:23:22 source > Finished syncing revenue by ad events
2022-10-03 03:23:22 source > SourceGoogleAnalyticsDataApi runtimes:
Try to sync again:
2022-10-03 03:32:04 source > Starting syncing SourceGoogleAnalyticsDataApi
2022-10-03 03:32:04 source > Syncing stream: revenue by ad events
2022-10-03 03:32:04 source > Setting state of revenue by ad events stream to {'date': '2022-10-04'}
2022-10-03 03:32:04 source > Read 0 records from revenue by ad events stream
2022-10-03 03:32:04 source > Finished syncing revenue by ad events
2022-10-03 03:32:04 source > SourceGoogleAnalyticsDataApi runtimes:
The date cursor field automatedly increase to tomorrow’s date!!
Sorry for the delay. I saw that you opened up a new post for this issue. Can you try adding data instead of removing data and see if it appends properly? My guess is that when you delete data it’s not resetting the state of the cursor (it’s still 10/3/22) and so it bumps whatever is the current state is which is reflective of the data.
Hey there
I would also love to try your fix to the GA incremental issue
Any chance there is also some PR in airbyte git to see the code fix ?
I’m wrapping my head around this issue for the last few days, but I’m airbyte noob
hi @phucdinh I already pull your docker image and add the source in my airbyte, but I got error when I create connection from your source to redshit, this is the error. I hope you can help me