Is there any way to do Cookie-based authentication with Airbyte?

The API I’m working with today only supports cookie-based authentication, i.e., you send a POST request with a username and password, and the response has a Set-Cookie header in it.

I’m aware that requests can handle this for me, but are there any hooks into it from the Airbyte CDK?

Hi @cornjuliox,
The CDK uses a requests.Session on a per-stream basis to send requests to the source API. According to the requests documentation:

The Session object allows you to persist certain parameters across requests. It also persists cookies across all requests made from the Session instance, and will use urllib3’s connection pooling. So if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase (see HTTP persistent connection).

TL;DR: the CDK should handle cookies out of the box.
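
To illustrate the behavior described in that quote, here is a minimal sketch that runs against a throwaway local server rather than a real API. The handler, the /login and /data paths, and the sessionid cookie are all made up for the demo. It logs in through one Session and then compares what that Session sends with what a fresh Session sends.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an API that uses cookie-based login."""

    def do_POST(self):
        # Drain the request body, then pretend the login succeeded.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(200)
        self.send_header("Set-Cookie", "sessionid=abc123; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Echo back whatever Cookie header the client sent.
        body = self.headers.get("Cookie", "").encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def new_session():
    session = requests.Session()
    session.trust_env = False  # ignore proxy env vars so the call stays local
    return session


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_address[1]}"
    try:
        with new_session() as logged_in:
            login = logged_in.post(f"{base}/login", json={"username": "u", "password": "p"})
            login.raise_for_status()
            # The same Session sends the stored cookie on later requests.
            with_login = logged_in.get(f"{base}/data").text
        with new_session() as fresh:
            # A Session that never logged in has no cookie to send.
            without_login = fresh.get(f"{base}/data").text
    finally:
        server.shutdown()
        server.server_close()
    return with_login, without_login


if __name__ == "__main__":
    print(run_demo())
```

Module-level calls such as requests.get(...) create a temporary Session for each call, so they behave like the fresh Session here: cookies set by one call are not sent with the next.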

Got it. Thank you. I actually got it working. I’ve taken to accessing self._session directly; is this safe? My code works, it’s just that the _ preceding session implies that we’re not supposed to be doing this, yet I see no other way.

It’s great if your code works, but could you explain why you need to access/modify the self._session object?

My understanding is that requests won’t automatically set the cookie if you’re not using self._session, and it seemed like self._session was being used for the entirety of any given run so that’s where I decided to put this.

class CustomApiStream(CustomStream):
    def __init__(self, config, *args, **kwargs):
        # ....<SEVERAL LINES OF CODE OMITTED FOR BREVITY>
        
        #  API requires that the api key be obfuscated.
        obfuscated = self.obfuscateApiKey(self.config["api_key"])
        login_body = {
            "username": self.config["username"],
            "password": self.config["password"],
            "apiKey": obfuscated["obfuscated_key"],
            "timestamp": obfuscated["timestamp"]
        }
        login_url = urljoin(self.url_base, "authenticatedSession")
        # NOTE: This should handle auth for the whole run, as Requests
        #       will automatically track and send cookies with each request 
        #       after this one.
        resp = self._session.post(url=login_url, json=login_body)
        # NOTE: Maybe raise_for_status() here?
        resp.raise_for_status()

Thank you for the code sample.
I would suggest that you perform your authentication in a custom authenticator class that inherits from HttpAuthenticator. I wrote an example for you:

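import requests
from typing import Any, Mapping
from urllib.parse import urljoin

# Import path is an assumption; it may differ between CDK versions.
from airbyte_cdk.sources.streams.http.auth import HttpAuthenticator


class CookieAuthenticator(HttpAuthenticator):
    # NOTE: HttpAuthenticator does not provide url_base or obfuscateApiKey;
    #       define both on this class (e.g. move them over from your stream).

    def __init__(self, config):
        # Log in once and keep the cookies returned by the API.
        self.cookie_jar = self.login(config["api_key"], config["username"], config["password"])

    def login(self, api_key, username, password):
        obfuscated = self.obfuscateApiKey(api_key)
        login_body = {
            "username": username,
            "password": password,
            "apiKey": obfuscated["obfuscated_key"],
            "timestamp": obfuscated["timestamp"]
        }
        login_url = urljoin(self.url_base, "authenticatedSession")
        resp = requests.post(url=login_url, json=login_body)
        resp.raise_for_status()
        return resp.cookies

    def get_auth_header(self) -> Mapping[str, Any]:
        # Serialize the cookie jar into a single Cookie request header.
        cookies = requests.utils.dict_from_cookiejar(self.cookie_jar)
        return {"Cookie": "; ".join(f"{k}={v}" for k, v in cookies.items())}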

You can then pass the authenticator to your CustomApiStream when you instantiate it, for example:

cookie_authenticator = CookieAuthenticator(config)
custom_api_stream = CustomApiStream(config, authenticator=cookie_authenticator)

This is a cleaner approach in my opinion, by separating concerns and following our CDK practices.
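
If it helps, here is a small standalone sketch of just the header-building step, so you can try it without the CDK. The cookie_header function name and the cookie names and values are made up for illustration. It relies on requests.utils.dict_from_cookiejar to flatten the jar into a plain dict.

```python
import requests


def cookie_header(cookie_jar):
    """Serialize a requests cookie jar into a single Cookie request header."""
    cookies = requests.utils.dict_from_cookiejar(cookie_jar)
    return {"Cookie": "; ".join(f"{name}={value}" for name, value in cookies.items())}


# Build a jar by hand; after a real login this would be resp.cookies.
jar = requests.cookies.RequestsCookieJar()
jar.set("sessionid", "abc123")
jar.set("token", "xyz")

print(cookie_header(jar))
```

Note that dict_from_cookiejar drops each cookie's domain and path, so this is only reasonable when every cookie in the jar belongs to the one API you are calling.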

Thank you very much. I’ll be sure to put this to good use.