Summary
After upgrading Airbyte from 0.58.0 to 0.63.3, the API returns a 500 error when checking job status for a specific job ID or filtering by status. The general /v1/jobs endpoint works fine.
Question
Hi everyone,
I upgraded Airbyte from 0.58.0 to 0.63.3, and I’m using the API to check the status of jobs, but now I get an error:
• when I call http://localhost:8006/v1/jobs, it works fine
• when I call http://localhost:8006/v1/jobs/{jobId} or http://localhost:8006/v1/jobs?status=succeeded, I get:
```json
{
  "type": "about:blank",
  "status": 500
}
```
I followed this documentation: [Airbyte API](https://reference.airbyte.com/reference/listjobs), and it works with 0.58.0.
Can you please help me?
---
This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1719931599230069) if you want
to access the original thread.
[Join the conversation on Slack](https://slack.airbyte.com)
<sub>
["error", "airbyte-api", "job-status", "upgrade", "500-error"]
</sub>
thank you so much, it works
thanks <@U035912NS77> for providing new link
Hello. I tried the same but got an access-denied response. I run Airbyte as standalone Docker. Do I need to add a bearer token?
<@U07DC8RBN3V> If you’re running in stand-alone docker with basic auth enabled (meaning you have a username/password to enter to log into Airbyte, which is the default for the Docker Compose method), you’ll need to pass basic auth to the API as well.
The normal format for this is an HTTP header:
```
Authorization: Basic <ENCODED_CREDENTIAL>
```
. . . where `<ENCODED_CREDENTIAL>` is a base64-encoded string of the username and password separated by a colon. So if your username is “admin” and your password is “abc123”, the string you’d encode is `admin:abc123`.
(Depending on your platform, you may have to first set the encoding of the text to UTF-8)
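As a quick illustration in Python (using the example credentials above):

```python
import base64

# Encode "username:password" as UTF-8 bytes, then base64
# (example credentials from above, not real ones)
credential = base64.b64encode("admin:abc123".encode("utf-8")).decode("ascii")
print(credential)  # YWRtaW46YWJjMTIz
```

That string is what goes after `Basic ` in the Authorization header.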
How you do this depends on your platform or programming language, but many curl implementations handle the encoding for you. For example, in PHP you would set this before your curl request with:
```php
curl_setopt($ch, CURLOPT_USERPWD, "admin:abc123");

// Or, if you're setting a headers array yourself:
$headers = array(
    'Content-Type: application/json',
    'Authorization: Basic ' . base64_encode("admin:abc123")
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
```
Some platforms will also let you pass this in the URL using a format like `https://admin:abc123@hostname.example.com/path`.
TL;DR: look up how to use HTTP Basic auth in an outbound HTTP request for your specific platform.
Thanks for your answer. I work in Python. I’ll try your suggestion and check if it works.
<@U07DC8RBN3V> Cool, if you’re using the `requests` library you can just do something like:
```python
import requests

r = requests.get('https://example.com/api/public/v1/jobs', auth=('admin', 'abc123'))
```
Or in `httplib2`, you could do something like:
```python
import httplib2

h = httplib2.Http()
h.add_credentials('admin', 'abc123')
resp, content = h.request('https://example.com/api/public/v1/jobs', 'GET')
```
. . . or do it yourself with `pycurl` and other libraries.
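If you’d rather avoid third-party libraries entirely, here’s a minimal sketch with just the standard library, reusing the example credentials from above (the request object is only built here, not sent):

```python
import base64
import urllib.request

# Build the Basic-auth header by hand and attach it to a request object
cred = base64.b64encode(b"admin:abc123").decode("ascii")
req = urllib.request.Request(
    "http://localhost:8006/v1/jobs",
    headers={"Authorization": f"Basic {cred}"},
)
# urllib.request.urlopen(req) would actually send it against a running instance
print(req.get_header("Authorization"))  # Basic YWRtaW46YWJjMTIz
```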
Thanks for your reply. And how do I select which streams to fetch in the connection? I found how to do it in the Airbyte UI, but not via an API request.
In RESTful APIs, the URL path (or “endpoint”) determines the object being represented, and the HTTP method/verb (e.g. `GET`, `POST`, etc.) represents the action taken against it. So make sure you’re not only using the right endpoint, but also the appropriate HTTP method.
In this case, setting the streams would be achieved by calling the `/connections` endpoint (either using `POST` to create a new connection or `PATCH` to update an existing one) and then passing the `configurations.streams` config. You can play with building the payload here:
https://reference.airbyte.com/reference/createconnection
Note that you select the streams at the connection level, NOT the source or destination (because different connections may need to pull different tables from the source).
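A minimal sketch of the `PATCH` case with `requests` (the connection ID is a placeholder, the payload shape follows the Create/Update Connection reference, and the request is only prepared here so you can inspect it rather than sent):

```python
import requests

# Hypothetical connection ID; the payload shape follows the public API reference
connection_id = "00000000-0000-0000-0000-000000000000"
payload = {
    "configurations": {
        "streams": [
            {"name": "issues", "syncMode": "incremental_deduped_history"}
        ]
    }
}

# Build the request without sending it; requests.Session().send(req)
# would issue it against a running instance
req = requests.Request(
    "PATCH",
    f"http://localhost:8006/v1/connections/{connection_id}",
    json=payload,
    auth=("admin", "abc123"),  # basic auth as discussed above
).prepare()
print(req.method, req.url)
```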
It’s also worth noting that, for working with the Airbyte API in Python, you may find it easier to use the [PyPI package](https://pypi.org/project/airbyte-api) or the [Python SDK](https://github.com/airbytehq/airbyte-api-python-sdk), as this will give you a smoother experience and a clearer expectation of the objects to pass to the various actions (sorry, probably should have mentioned these before!)
So how do I filter streams? For example, I want to create a GitHub-to-S3 connection and only activate issues. What is the query for that?
Effectively, the starting point for that request would end up something like this:
```json
{
  "configurations": {
    "streams": [
      {
        "name": "issues",
        "syncMode": "incremental_deduped_history"
      }
    ]
  },
  "schedule": {
    "scheduleType": "manual"
  },
  "namespaceDefinition": "destination",
  "namespaceFormat": null,
  "nonBreakingSchemaUpdatesBehavior": "ignore",
  "sourceId": "95e66a59-8045-4307-9678-63bc3c9b8c93",
  "destinationId": "e478de0d-a3a0-475c-b019-25f7dd29e281",
  "name": "Example Connection",
  "status": "active"
}
```
Note that you'll have to replace the source/destination IDs with yours, and you may also need to add the cursor or primary key to the incremental streams if the connector type doesn't supply a default. You'll see that pages like [Create Connection](https://reference.airbyte.com/reference/createconnection) have a little sample request builder on the right side that shows what the result should look like if you fill out the form in the middle (with the ability to grab it as Python code too; just make sure you change the endpoint, since it's set up for Airbyte Cloud by default).
And again, it may be easier to do this in the SDK, where it would look more like . . .
```python
import airbyte_api
from airbyte_api import models

s = airbyte_api.AirbyteAPI(
    server_url="https://example.com/api/public/v1",
    security=models.Security(
        basic_auth=models.SchemeBasicAuth(
            password="<YOUR_PASSWORD_HERE>",
            username="<YOUR_USERNAME_HERE>",
        )
    ),
)

res = s.connections.create_connection(request=models.ConnectionCreateRequest(
    destination_id='e478de0d-a3a0-475c-b019-25f7dd29e281',
    source_id='95e66a59-8045-4307-9678-63bc3c9b8c93',
    name='Example-Connector',
    namespace_format='${SOURCE_NAMESPACE}',
    configurations=models.StreamConfigurations(
        streams=[
            models.StreamConfiguration(
                name='issues',
                sync_mode=models.ConnectionSyncModeEnum.INCREMENTAL_DEDUPED_HISTORY,
            )
        ]
    ),
))

if res.connection_response is not None:
    # handle response
    pass
```
Again, the [SDK docs](https://github.com/airbytehq/airbyte-api-python-sdk?tab=readme-ov-file#sdk-example-usage) define all the details of what models you need to pass; just make sure you follow the nested links to the child objects.
Thanks a lot for your answer. To use `incremental_deduped_history`, I don’t have to add extra keys like `cursorField` and `primaryKey`?
You only need to include them if the source doesn’t define a default for them.
In this case they do, and you could check that by either looking at the YAML source in the Airbyte repo or by making a test connection in the UI and seeing if it selects them by default. (Most do, with the exception of database sources)
Great, thanks for these details. I will check the YAML files.
Lastly, when I create a source via an API call, does the endpoint automatically test the source like the UI does? Or how can I be sure, after the API returns a 200 status, that the source is created and tested and all is good?
I believe it still runs a connection check, but I would recommend passing it an invalid credential when creating/updating a source to see what the response looks like. IIRC it will return a 400 with a specific error, but it’s been a while since I implemented our logic for this, so I don’t remember the specifics.
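A hedged sketch of that probe with `requests` (the workspace ID and the GitHub `configuration` fields here are placeholders; check the Create Source reference for the exact payload your connector needs):

```python
import requests

# Deliberately invalid credential to see what a failed connection check returns;
# the workspaceId and configuration fields below are placeholders
payload = {
    "name": "github-test",
    "workspaceId": "00000000-0000-0000-0000-000000000000",
    "configuration": {
        "sourceType": "github",
        "credentials": {"personal_access_token": "invalid-token"},
        "repositories": ["airbytehq/airbyte"],
    },
}
try:
    r = requests.post(
        "http://localhost:8006/v1/sources",
        json=payload,
        auth=("admin", "abc123"),
        timeout=30,
    )
    # A 200 means created; a 4xx body should describe the failed check
    print(r.status_code, r.text)
except requests.exceptions.ConnectionError:
    print("no Airbyte instance reachable at localhost:8006")
```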
Thanks for your help. Really appreciate it.