Google Analytics V4 - Incremental does not work

Hi, I have been trying to use the GA4 API (Google Analytics v4) source for a while.
Currently, I’m using version 0.03.

I have set up Incremental | Append and Incremental | Deduped many times.
But after waiting 24 hours, no records are ever synced from the source.

At first I thought it was because the end_date for the first run was today. In that case, when tomorrow comes, Airbyte would sync no records for its "yesterday" (which is today’s date), because that date’s data was already captured.

So I tried waiting 2 days before syncing. But the same thing happened → no records

Because I thought the problem was end_date = today, I modified the code base a little and built a custom version whose only change is that end_date stops at yesterday. That way, when tomorrow comes, Airbyte should continue by syncing tomorrow’s "yesterday" (that is, today’s data).

But it still returns no records…

Hope that someone can help out with this problem because full refresh | overwrite is extremely heavy to run daily.

Hi @phucdinh,
I see that you were discussing this issue earlier -
https://discuss.airbyte.io/t/source-google-analytics-4-ga4-api-to-bigquery-problem-with-incremental-data/2591/6

I agree with Sajarin that the cursor field (example 2022-10-03) is not granular enough. I wonder how difficult it would be to add a timestamp to it so it doesn’t skip records that could be added on the same day (like the midnight - 7:00 example you mentioned).
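To illustrate the concern, here is a toy model of a date-only cursor versus a timestamp cursor. It is only a sketch, not the connector's actual code: `new_records`, the record layout, and the strict `>` comparison are all assumptions made for the example, and GA4 report rows may not carry a timestamp at all.

```python
import datetime


def new_records(records, cursor):
    """Hypothetical filter: keep records strictly newer than the saved cursor."""
    return [r for r in records if r["cursor"] > cursor]


# Date-only cursor: a first sync at 07:00 saves 2022-10-03 as state.
saved = datetime.date(2022, 10, 3)
# A record added later the same day carries the same date value...
late = {"cursor": datetime.date(2022, 10, 3), "label": "added after 07:00"}
# ...so it is not newer than the cursor and gets skipped.
skipped = new_records([late], saved)

# Timestamp cursor (assumed to be available): the same situation is told apart.
saved_ts = datetime.datetime(2022, 10, 3, 7, 0)
late_ts = {"cursor": datetime.datetime(2022, 10, 3, 15, 0), "label": "added after 07:00"}
kept = new_records([late_ts], saved_ts)

print(len(skipped), len(kept))
```

With the date-only cursor the late record is dropped; with the finer-grained cursor it is picked up on the next sync.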

Let me look into this and I’ll get back to you!

Thank you @natalyjazzviolin
Really appreciate!!!

Hi @natalyjazzviolin,
Just for the sake of education, I have a question (just one :smile:)
I have read the stream_slices method in the code base (below). I wonder why we have to set end_date = start_date + n days (where n is defined by the user; in my case it is 1, and I guess most people will enter 1 as well).
Why don’t we set end_date = start_date, which in turn creates a date dict {"startDate": "2022-10-06", "endDate": "2022-10-06"}?
I have read the doc about adding incremental sync, and the example code there just has end_date = start_date.
Is there any problem with doing so? Why do we have to add the n days in the code?

    def stream_slices(
        self, *, sync_mode: SyncMode, cursor_field: List[str] = None, stream_state: Mapping[str, Any] = None
    ) -> Iterable[Optional[Mapping[str, Any]]]:
        dates = []

        today: datetime.date = datetime.date.today()
        start_date: datetime.date = self.state[self.cursor_field]

        timedelta: int = self.config["window_in_days"] or self._default_window_in_days

        while start_date <= today:
            end_date: datetime.date = start_date + datetime.timedelta(days=timedelta)
            if timedelta > 1 and end_date > today:
                end_date: datetime.date = start_date + datetime.timedelta(days=timedelta - (end_date - today).days)

            dates.append(
                {
                    "startDate": utils.date_to_string(start_date, self._date_format),
                    "endDate": utils.date_to_string(end_date, self._date_format),
                }
            )

            start_date: datetime.date = end_date + datetime.timedelta(days=1)

        return dates or [None]
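
To make the question concrete, here is a minimal standalone sketch of the loop above next to the end_date = start_date variant I am asking about. The function names `windowed_slices` and `single_day_slices` are made up for this post, `start`, `today` and `window` are passed in as plain arguments instead of being read from state and config, and I assume the date format is YYYY-MM-DD as in the dict above.

```python
import datetime
from typing import Dict, List

ONE_DAY = datetime.timedelta(days=1)


def windowed_slices(start: datetime.date, today: datetime.date, window: int) -> List[Dict[str, str]]:
    """Standalone copy of the loop above: end_date = start_date + window days."""
    out = []
    while start <= today:
        end = start + datetime.timedelta(days=window)
        # Same clamp as the original: only applied when window > 1.
        if window > 1 and end > today:
            end = start + datetime.timedelta(days=window - (end - today).days)
        out.append({"startDate": start.isoformat(), "endDate": end.isoformat()})
        start = end + ONE_DAY
    return out


def single_day_slices(start: datetime.date, today: datetime.date) -> List[Dict[str, str]]:
    """The variant from the question: end_date = start_date, one slice per day."""
    out = []
    while start <= today:
        out.append({"startDate": start.isoformat(), "endDate": start.isoformat()})
        start += ONE_DAY
    return out


start = datetime.date(2022, 10, 6)
today = datetime.date(2022, 10, 8)
for s in windowed_slices(start, today, window=1):
    print("window=1  ", s)
for s in single_day_slices(start, today):
    print("single day", s)
```

In this sketch, window = 1 gives two-day slices (06 to 07, then 08 to 09), and the last endDate lands one day after today because the clamp only applies when the window is greater than 1. The single-day variant gives one slice per day and never goes past today.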