clouddriver-caching errors | ClusteredAgentScheduler - Unable to run agents

Issue

When attempting to run agents, the following errors can be found in the clouddriver-caching logs:

Exception: ERROR 1 --- [ClusteredAgentScheduler-0] c.n.s.c.r.c.ClusteredAgentScheduler : Unable to run agents
redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    redis.clients.jedis.util.Pool.getResource(Pool.java:59) ~[jedis-3.1.0.jar:na]
    redis.clients.jedis.JedisPool.getResource(JedisPool.java:234) ~[jedis-3.1.0.jar:na]
    com.netflix.spinnaker.kork.jedis.telemetry.InstrumentedJedisPool.getResource(InstrumentedJedisPool.java:60) ~[kork-jedis-7.41.0.jar:na]
    com.netflix.spinnaker.kork.jedis.telemetry.InstrumentedJedisPool.getResource(InstrumentedJedisPool.java:26) ~[kork-jedis-7.41.0.jar:na]
    com.netflix.spinnaker.kork.jedis.JedisClientDelegate.withCommandsClient(JedisClientDelegate.java:54) ~[kork-jedis-7.41.0.jar:na]
    com.netflix.spinnaker.cats.redis.cluster.ClusteredAgentScheduler.acquireRunKey(ClusteredAgentScheduler.java:183) ~[cats-redis-GCSFIX.jar:na]
    com.netflix.spinnaker.cats.redis.cluster.ClusteredAgentScheduler.acquire(ClusteredAgentScheduler.java:136) ~[cats-redis-GCSFIX.jar:na]
    com.netflix.spinnaker.cats.redis.cluster.ClusteredAgentScheduler.runAgents(ClusteredAgentScheduler.java:163) ~[cats-redis-GCSFIX.jar:na]
    com.netflix.spinnaker.cats.redis.cluster.ClusteredAgentScheduler.run(ClusteredAgentScheduler.java:156) ~[cats-redis-GCSFIX.jar:na]
    java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[na:na]
    java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[na:na]
    java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[na:na]
    java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
    java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
    java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Failed connecting to host redis-clouddriver-master:6379
    redis.clients.jedis.Connection.connect(Connection.java:204) ~[jedis-3.1.0.jar:na]
    redis.clients.jedis.BinaryClient.connect(BinaryClient.java:100) ~[jedis-3.1.0.jar:na]
    redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:1866) ~[jedis-3.1.0.jar:na]
    redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:117) ~[jedis-3.1.0.jar:na]
    org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:889) ~[commons-pool2-2.7.0.jar:2.7.0]
    org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:424) ~[commons-pool2-2.7.0.jar:2.7.0]
    org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:349) ~[commons-pool2-2.7.0.jar:2.7.0]
    redis.clients.jedis.util.Pool.getResource(Pool.java:50) ~[jedis-3.1.0.jar:na]
    ... 14 common frames omitted
Caused by: java.net.SocketTimeoutException: connect timed out
    java.base/java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:na]
    java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399) ~[na:na]
    java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242) ~[na:na]
    java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224) ~[na:na]
    java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403) ~[na:na]
    java.base/java.net.Socket.connect(Socket.java:609) ~[na:na]
    redis.clients.jedis.Connection.connect(Connection.java:181) ~[jedis-3.1.0.jar:na]
    ... 21 common frames omitted
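
The innermost cause in the trace is a TCP connect timeout to redis-clouddriver-master:6379. To confirm whether the Redis host is reachable from the affected pod, a quick connectivity check can help. The following is a minimal sketch, not part of Spinnaker or Jedis: the helper name `can_connect` and the 2-second timeout are assumptions chosen for illustration, and the host and port are taken from the log line above.

```python
import socket


def can_connect(host: str, port: int, timeout_seconds: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout.

    Hypothetical helper for illustration; it only tests TCP reachability
    and does not speak the Redis protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers connect timeouts, refused connections, and DNS failures.
        return False


if __name__ == "__main__":
    # Host and port as they appear in the clouddriver-caching log.
    host, port = "redis-clouddriver-master", 6379
    status = "reachable" if can_connect(host, port) else "unreachable"
    print(f"{host}:{port} is {status}")
```

If the check reports the host as unreachable from inside the clouddriver-caching pod, the pod has the same connectivity problem the trace reports. A successful check only shows that the port accepts TCP connections at that moment.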

Cause

This is a rare Redis issue that occurs when two pods try to start at the same time. Because they start simultaneously, they compete for shared resources, and only one pod ends up starting.