Sandbox accounts with sample data?

Suppose I want to build an analytics tool for Marketo users, using Airbyte to pull their data into a data warehouse. If I don’t yet have a customer using Marketo, I am stuck since I have no idea what the data would look like. What is needed is a “sandbox” Marketo account. Is there a place where I can find such sandbox accounts for various sources?

You can try reaching out to Marketo to see if they can provide you with a sandbox account. Airbyte has an integration account, but it is used for tests and development; unfortunately, we can’t share it.

Requesting a sandbox Marketo account is a really tedious process.
I had another idea – the thing I really want is the set of dbt transformations that do the basic normalization after an ingest from (say) Marketo. The various JSON files under source_marketo/schemas contain all the “hard work” of specifying the details of the data elements that are returned from an API call. My theory is that I can run the test_normalization.py in integration_tests, pointing it at the Marketo JSON schemas, and that this will produce the needed dbt transformations.

The dbt transformations would of course have no data yet to actually run on, but I would have an idea of the resulting tables, and I could prepare the downstream dbt transformation models for my analytics tool. This way, I can be “ready” for a Marketo customer, i.e. as soon as I connect Airbyte to their Marketo, I will have the downstream processes ready.

I may be totally wrong about generating the dbt models in this way though; let me know your thoughts :slight_smile:

I see. I can run one example and share the Marketo output with you.

Thanks @marcosmarxm … I was actually able to create a catalog.json for Marketo under base-normalization and produce the dbt base-normalization models.
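For anyone else trying this, here is roughly the shape of the file I ended up with. This is a sketch, not the exact file: the `streams` layout follows my reading of the Airbyte protocol’s ConfiguredAirbyteCatalog, and the `leads` stream with its two-field schema is just a placeholder — the real `json_schema` values come from the files under source_marketo/schemas.

```shell
# Hand-built configured catalog (shape assumed from the Airbyte protocol's
# ConfiguredAirbyteCatalog). The "leads" stream and its tiny schema are
# placeholders; paste in the real schemas from source_marketo/schemas.
cat > catalog.json <<'EOF'
{
  "streams": [
    {
      "stream": {
        "name": "leads",
        "json_schema": {
          "type": "object",
          "properties": {
            "id": { "type": "integer" },
            "email": { "type": "string" }
          }
        },
        "supported_sync_modes": ["full_refresh"]
      },
      "sync_mode": "full_refresh",
      "destination_sync_mode": "overwrite"
    }
  ]
}
EOF
```

With one entry like this per schema file, base-normalization has everything it needs to know about the source’s tables without ever talking to the source.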

But Marketo is just one example, and from the POV of someone trying to build a general analytics tool that can work for a variety of MarTech platforms, it would be nice to have a way to produce the “base-normalization” dbt models (such as airbyte_ctes and airbyte_incremental) without having any credentials for the source.

I do have HubSpot credentials for a test account, and looking closely at the Docker logs during a sync, the key step appears to be this command:

transform-config --config destination_config.json \
  --integration-type bigquery --out /data/12/0/normalize

This depends on some configs and JSON files generated by earlier Docker commands; I don’t know if there is an easy way to identify those.
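To make the question concrete, here is the sequence I imagine, pieced together from the sync logs and a skim of the base-normalization code. Everything in it is an assumption on my part — the image name, the `--entrypoint` overrides, and the transform-catalog flags — not a verified recipe:

```shell
# Speculative sketch, not verified end to end: image name, entrypoint
# overrides, and flags are my guesses from the sync logs and the
# base-normalization source.

# 1. Turn a destination config (here: BigQuery) into a dbt profiles.yml.
docker run --rm -v "$(pwd)":/data --entrypoint transform-config \
  airbyte/normalization \
  --config /data/destination_config.json \
  --integration-type bigquery --out /data/normalize

# 2. Turn a hand-built catalog.json into the generated dbt models
#    (airbyte_ctes, airbyte_incremental, ...) -- no source credentials needed.
docker run --rm -v "$(pwd)":/data --entrypoint transform-catalog \
  airbyte/normalization \
  --integration-type bigquery \
  --profile-config-dir /data/normalize \
  --catalog /data/catalog.json \
  --out /data/normalize/models/generated \
  --json-column _airbyte_data
```

If something like this works, step 2 is the only part that actually matters for my use case, since it only needs the schemas, not the data.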

So to summarize, my basic question is: is there a sequence of commands I can use to generate the base-normalization dbt models for a source X, without any credentials for X?
I know this may not be your target use case, but this ability would expand your potential use scenarios from “end-users with data they want to pull from sources X, Y, Z”, to “meta-level app-builders who want to build analytics tools for end-users who have data in X, Y, Z”.

Don’t know if that made sense, any pointers appreciated.

Today, the only way is the manual process you used to generate the models.
I created the issue Generate normalization models without running the sync or having credentials · Issue #12047 · airbytehq/airbyte · GitHub to track implementing this feature in the future.
