De-duplicating ads_insights data in Facebook Marketing source


Struggling to de-duplicate ads_insights data in Facebook Marketing source due to changes in ad performance metrics after reporting date. Facing challenge with using row_number() function and update_time column in date format.


[Facebook Marketing source]
I am using the Facebook Marketing source and I am currently struggling to de-duplicate the ads_insights data. So by utilizing incremental-append sync mode, I would get duplicates as e.g. in this case the performance/metrics of an ad could change/improve after the reporting date (e.g. the conversions for an ad could still increase days after the reporting data aka the day when the ad got clicked/the impression happened). So I am struggling now to de-duplicate as I would have used the row_number () function on the ads_insights table and would have created a “rank” supporting column that would have helped me to identify the most recent row. Then I could filter the table for only the most recent record.
However, I’ve noticed that the column update_time is in date format and not a timestamp or datetime .
Would be great if anyone could give me some hints to resolve this issue :pray:

This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want to access the original thread.

Join the conversation on Slack

["facebook-marketing-source", "de-duplicate", "ads-insights", "incremental-append-sync", "row-number", "update-time"]

Not sure if the case here, but when airbyte syncs, it creates a airbyte sync datetime value, we use that to get the latest entry for a particular update to make sure we use the latest value during our transformations

thank you <@U06C5BK9H9P>! This solves it! We’ve somehow skipped these _airbyte* columns so far