Summary
User is experiencing synchronization issues with the Facebook Marketing connector to PostgreSQL/MongoDB, questioning the impact of the free trial account and seeking architectural guidance for managing data from multiple sources and storage formats.
Question
Hello everyone,
I have started exploring Airbyte recently for one of my projects. I’m able to configure the connection properly from FB Marketing to Postgres/Mongo. However, syncing seems to be stuck forever; it never progresses beyond one or two tables/collections. PFA
- Is this because of the free trial account or some other issue? I can see records in a couple of tables/collections but no more than that.
- How can I fix this? I want to make a decision from an architecture point of view.
- If I have multiple FB Marketing sources for different users and the same destination, either Postgres or Mongo, how will Airbyte manage the data?
a. Is it in the same table, separated by account ID / ad ID?
b. How are actual images and videos stored by Airbyte for any connector?
- I found https://dbdocs.io/airbyteio/source-facebook-marketing?view=relationships in one of the articles on the FB schema; however, Airbyte is storing the data as jsonb in Postgres. This is a bit confusing. PFA
- What would be the ideal destination for FB Marketing data?
Appreciate your help on this.
This topic has been created from a Slack thread to give it more visibility.
['facebook-marketing', 'postgres', 'mongo', 'data-synchronization', 'connector-issues', 'jsonb']
Same log repetition in the Cloud version:
```
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-11-25 -> 2024-11-25]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO InsightAsyncJob(id=592251676551573, <AdAccount> {
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-11-26 -> 2024-11-26]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO InsightAsyncJob(id=629711752715432, <AdAccount> {
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-11-27 -> 2024-11-27]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO InsightAsyncJob(id=1144173110674473, <AdAccount> {
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-11-28 -> 2024-11-28]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO InsightAsyncJob(id=1285987509379961, <AdAccount> {
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-11-29 -> 2024-11-29]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO InsightAsyncJob(id=1906383469886466, <AdAccount> {
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-11-30 -> 2024-11-30]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO InsightAsyncJob(id=1975275356253960, <AdAccount> {
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-12-01 -> 2024-12-01]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO InsightAsyncJob(id=1243714613347522, <AdAccount> {
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-12-02 -> 2024-12-02]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO InsightAsyncJob(id=1291120645164373, <AdAccount> {
"account_id":"**********",
"id": "act_13849284192*****"
}, time_range=<Period [2024-12-03 -> 2024-12-03]>, breakdowns=[]): is 0 complete (Job Started)
2024-12-04 02:32:24 source INFO Completed jobs: 0, Failed jobs: 0, Running jobs: 9
2024-12-04 02:32:24 source INFO No jobs ready to be consumed, wait for 30 seconds
```
It is not progressing at all. Hardly any data has been fetched from Facebook. It got stuck after schema creation. I have been trying this for the last 2 days.
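For anyone watching a sync like this, a small script can confirm whether the job counters in the log ever move. The following is a minimal sketch, assuming the summary line keeps the exact "Completed jobs: N, Failed jobs: N, Running jobs: N" wording shown in the excerpt; the names `parse_status` and `is_stalled` and the three-round threshold are illustrative choices, not part of Airbyte.

```python
import re
from typing import NamedTuple, Optional

# Matches the summary line the source prints after each polling round.
STATUS_RE = re.compile(
    r"Completed jobs: (\d+), Failed jobs: (\d+), Running jobs: (\d+)"
)


class JobStatus(NamedTuple):
    completed: int
    failed: int
    running: int


def parse_status(line: str) -> Optional[JobStatus]:
    """Return the job counts from one log line, or None if it is not a summary line."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return JobStatus(*(int(group) for group in match.groups()))


def is_stalled(lines: list[str], min_rounds: int = 3) -> bool:
    """True if the last `min_rounds` summary lines all show zero completed jobs
    while at least one job is still running. The threshold is an assumption."""
    statuses = [s for s in map(parse_status, lines) if s is not None]
    recent = statuses[-min_rounds:]
    return len(recent) >= min_rounds and all(
        s.completed == 0 and s.running > 0 for s in recent
    )


line = "2024-12-04 02:32:24 source INFO Completed jobs: 0, Failed jobs: 0, Running jobs: 9"
print(parse_status(line))
print(is_stalled([line] * 3))
```

Feeding it the full sync log (one string per line) gives a quick yes/no answer that can be attached to a support ticket.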
<@U083BAT6JLT>, are you using Airbyte open-source or cloud? If it’s the former, have you requested a rate limit increase for your Facebook account? I recommend creating a sync to Postgres and selecting a recent start date to run tests.
Facebook requires several steps to obtain a valid token; please read more in the Facebook Marketing Connector documentation (Airbyte Documentation).
Hi
There is no rate limit issue from Facebook. It is not even reaching the threshold level. Although I did recently see one error about reducing the number of fields.
Start date: I have set it to just one week ago.
Are you using custom insights/reports or the default reports?
No, default reports, and in fact I disabled a lot of tables from fetching.
See, this has been syncing for 2 hours to fetch data for a single table, and the time range is within the last 7 days. How can I use this in production to fetch data for multiple users over 6 months or more?
And this has been syncing only one table. The rest have been in the queue for the last 2 hours.
<@U083BAT6JLT> you’re using Airbyte Cloud. Can you open a Zendesk Ticket? I’m going to ask the tech support team to take a look into it.
I did try to create one, but it is also quite frustrating to receive an email like this:
Also, the Postgres destination stored the data in jsonb. Will this always be the case? Why not traditional columns for Facebook Marketing data?
You’re looking at the raw tables, not the final tables.
The non-tmp table is also the same.
What I mean is, if the sync finishes successfully, it will generate a table called `ad_creatives` in the namespace you selected during the connection creation.
But does that `ad_creatives` table have the same schema, with jsonb for storing data as in the screenshot above, or a different one?
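The raw-versus-final distinction above can be illustrated with a small sketch. It assumes that a raw table holds each record as one JSON document and that a final table exposes the top-level fields as separate columns, while nested objects and arrays remain JSON (which would explain jsonb columns still appearing in a final table). The function name `flatten_raw_record` and the sample payload are hypothetical and are not taken from the thread or from Airbyte's code.

```python
import json
from typing import Any


def flatten_raw_record(raw_json: str) -> dict[str, Any]:
    """Project the top-level keys of one raw JSON payload into column values.

    Scalars become plain column values; nested objects and arrays are kept as
    JSON text, mimicking a JSON-typed column in the final table.
    """
    payload = json.loads(raw_json)
    row: dict[str, Any] = {}
    for key, value in payload.items():
        if isinstance(value, (dict, list)):
            row[key] = json.dumps(value, sort_keys=True)
        else:
            row[key] = value
    return row


# Hypothetical payload for illustration only.
raw = (
    '{"id": "123", "account_id": "456", "name": "Spring promo", '
    '"object_story_spec": {"page_id": "789"}}'
)
row = flatten_raw_record(raw)
print(row["name"])
print(row["object_story_spec"])
```

In this sketch `id`, `account_id`, and `name` end up as ordinary values, while `object_story_spec` stays a JSON string because it is a nested object.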
Somehow I managed to solve the syncing failure issue… However, I still don’t understand why Airbyte is taking hours to sync only 1k-3k records for a single user. I started this sync 5-6 hours ago. How can I rely on Airbyte Cloud to sync 6 months or more of data for 1000 customers in parallel?
Using free trial for exploring.
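As a rough way to size the workload being asked about, the log excerpt earlier in the thread shows one insights job per calendar day for the account (nine jobs for 2024-11-25 through 2024-12-03). The sketch below only does that arithmetic; it assumes one job per day per account per insights stream, ignores any further splitting or retries, and uses hypothetical names (`daily_windows`, `estimated_jobs`).

```python
from datetime import date, timedelta


def daily_windows(start: date, end: date) -> list[tuple[date, date]]:
    """Return one (day, day) window per calendar day from start to end, inclusive."""
    if end < start:
        return []
    windows = []
    for offset in range((end - start).days + 1):
        day = start + timedelta(days=offset)
        windows.append((day, day))
    return windows


def estimated_jobs(start: date, end: date, accounts: int, insight_streams: int = 1) -> int:
    """Lower-bound job count, assuming one job per day, account, and stream."""
    return len(daily_windows(start, end)) * accounts * insight_streams


# The nine-day range seen in the log excerpt.
print(len(daily_windows(date(2024, 11, 25), date(2024, 12, 3))))
# Roughly six months of history for 1000 accounts.
print(estimated_jobs(date(2024, 6, 4), date(2024, 12, 3), accounts=1000))
```

Under these assumptions, six months of history for 1000 accounts means on the order of 183,000 asynchronous jobs per insights stream, before any retries.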
I asked someone from the connector team to take a look, Darshan. I don’t have much knowledge of the limitations and limits of the Facebook Marketing API. Let’s wait to see their input.
Hi <@U083BAT6JLT>! I haven’t checked the details of this case, but here are a couple of things I can say about why the sync takes so long for you:
• We extract data from the Facebook API by creating jobs and polling them until they are completed.
• Regarding the “Please reduce the amount of data you’re asking for, then retry your request” error: Facebook can stop jobs if they use more computation than a certain limit. The details of this limit are very unclear, and I wouldn’t be surprised if the fact that it is a trial account is taken into account here (although I can’t confirm it). You can always request fewer fields to reduce the complexity of the job on their side.
• Jobs normally go from `Started` to `Running` to `Completed`. In your case, I see jobs that don’t even exit the `Started` state and are reported as `Failed` after ~20 minutes (see job ID `613607577776140` in the latest execution, for example). I see very few jobs even starting, which seems to indicate that Facebook is throttling your account in some way.
• When a job fails, our connector will try to split it into smaller chunks. We go from querying at the account level to requerying for every campaign. If a job for a campaign fails, we might split it at the adset level. If a job at the adset level fails, we might split it at the ad level. All this retrying adds significantly to the sync time.
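The polling and splitting behaviour described in the bullets above can be sketched roughly as follows. This is a simplified simulation, not the connector's actual code: `Job`, `poll_job`, `run_with_splitting`, and the fake hierarchy are hypothetical names. The 30-second interval comes from the log excerpt, and 40 polls approximates the ~20 minute limit mentioned above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable

LEVELS = ["account", "campaign", "adset", "ad"]


@dataclass(frozen=True)
class Job:
    level: str
    entity_id: str


def poll_job(get_state: Callable[[], str], max_polls: int = 40,
             interval: float = 30.0,
             sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the job is Completed or Failed; give up after max_polls."""
    for _ in range(max_polls):
        state = get_state()
        if state in ("Completed", "Failed"):
            return state
        sleep(interval)
    return "Failed"  # a job that never leaves Started/Running counts as failed


def run_with_splitting(root: Job, run_job: Callable[[Job], bool],
                       children_of: Callable[[Job], Iterable[str]]):
    """Run a job; on failure, retry it as smaller jobs one level down."""
    completed, failed, pending = [], [], [root]
    while pending:
        job = pending.pop()
        if run_job(job):
            completed.append(job)
            continue
        idx = LEVELS.index(job.level)
        children = list(children_of(job)) if idx + 1 < len(LEVELS) else []
        if not children:
            failed.append(job)  # nothing smaller to split into
            continue
        pending.extend(Job(LEVELS[idx + 1], child) for child in children)
    return completed, failed


# Simulated account: the account-level job and campaign c2 fail, the rest succeed.
hierarchy = {"act_1": ["c1", "c2"], "c2": ["s1", "s2"]}
failing = {"act_1", "c2"}


def fake_run(job: Job) -> bool:
    final = "Failed" if job.entity_id in failing else "Completed"
    states = iter(["Started", "Running", final])
    return poll_job(lambda: next(states), sleep=lambda _: None) == "Completed"


completed, failed = run_with_splitting(
    Job("account", "act_1"), fake_run, lambda job: hierarchy.get(job.entity_id, [])
)
print(sorted(job.entity_id for job in completed), failed)
```

Even in this toy run, one failed account-level job turns into five job executions, which shows how repeated failures multiply the total sync time.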
TLDR: this seems to point to Facebook not letting us execute jobs for your account. Could you please confirm this with them so that we know what the path forward is? Once we have more information, we can work out the next steps.