We are seeing the following error in almost every sync. Most syncs do eventually succeed, but some keep running indefinitely, and I have to cancel and rerun them manually.
2022-07-07 10:51:29 WARN ActivityExecutionContextImpl(doHeartBeat):165 - Heartbeat failed io.grpc.StatusRuntimeException: UNKNOWN: maximum attempts exceeded to update history
    at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262) ~[grpc-stub-1.42.1.jar:1.42.1]
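To put a number on "almost every sync", the warning quoted above can be counted per minute in a saved log. This is only a minimal sketch: the timestamp layout is inferred from that single sample line, and the function name `count_heartbeat_failures` and the second sample line are hypothetical.

```python
import re
from collections import Counter

# Matches the WARN line quoted above; the timestamp layout is an assumption
# based on that one sample. The captured group is truncated to the minute.
HEARTBEAT_FAILURE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} WARN .*Heartbeat failed"
)

def count_heartbeat_failures(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH:MM' to failed-heartbeat count."""
    per_minute = Counter()
    for line in lines:
        match = HEARTBEAT_FAILURE.match(line)
        if match:
            per_minute[match.group("minute")] += 1
    return per_minute

sample = [
    "2022-07-07 10:51:29 WARN ActivityExecutionContextImpl(doHeartBeat):165 - "
    "Heartbeat failed io.grpc.StatusRuntimeException: UNKNOWN: "
    "maximum attempts exceeded to update history",
    # Hypothetical unrelated line, included only to show that it is skipped.
    "2022-07-07 10:51:30 INFO some unrelated message",
]

print(count_heartbeat_failures(sample))
```

Passing an open file handle instead of `sample` works the same way, since the function only iterates over lines.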
The deployment runs on EKS with 4 m5.xlarge nodes, backed by an external DB (m5.2xlarge) whose usage never exceeds 30%. Having reviewed your architecture, I don’t see a separate DB for Temporal.
- What is causing this issue?
- Which parameters should I tune to avoid this?
- What are the implications if this goes unresolved? (For example, a higher failure rate?)