I have two Apache Ignite servers with three clients already connected to them, each using separate data region configurations for their respective caches. These three clients work fine, but now, when I connect a fourth client, the node occasionally stops.
WARN 1 --- [vent-worker-#44] o.a.i.i.m.d.GridDiscoveryManager : Node FAILED:
TcpDiscoveryNode [id=6ff310ca-dd51-4115-9fdf-fbf3d093b5b3, consistentId=6ff310ca-dd51-4115-9fdf-fbf3d093b5b3,
addrs=ArrayList [0:0:0:0:0:0:0:1%lo, x.y.z.a, 127.0.0.1], sockAddrs=null, discPort=0, order=967,
intOrder=487, lastExchangeTime=1731680665767, loc=false, ver=2.15.0#20230425-sha1:f98f7f35, isClient=true]
Whenever I get this error, my entire Spring Boot application restarts from the beginning. Why is this happening, and how can I avoid it? Below is my configuration:
import java.util.Arrays;
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.EventType;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;
import org.apache.ignite.spi.eventstorage.memory.MemoryEventStorageSpi;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class IgniteConfig {
    @Bean
    public Ignite igniteInstance() {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // Set client mode
        cfg.setClientMode(true);
        // Configure discovery SPI
        TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList(
            "x.y.z.a:47500..47509"
        ));
        discoverySpi.setIpFinder(ipFinder);
        discoverySpi.setNetworkTimeout(10000); // Network timeout (10 seconds)
        discoverySpi.setJoinTimeout(10000); // Join timeout (10 seconds)
        cfg.setDiscoverySpi(discoverySpi);
        // Set failure detection timeouts
        cfg.setFailureDetectionTimeout(120000); // 120 seconds
        cfg.setClientFailureDetectionTimeout(120000); // 120 seconds
        // Configure TCP communication SPI
        TcpCommunicationSpi spi = new TcpCommunicationSpi();
        spi.setConnectTimeout(30000); // Initial connection timeout (30 seconds)
        spi.setMaxConnectTimeout(10000); // Max connection timeout (10 seconds)
        spi.setReconnectCount(3); // Number of reconnection attempts
        spi.setIdleConnectionTimeout(3000); // Idle connection timeout (3 seconds)
        cfg.setCommunicationSpi(spi);
        // Configure event logging to capture node failures and disconnections
        cfg.setIncludeEventTypes(
            EventType.EVT_NODE_FAILED,
            EventType.EVT_NODE_LEFT,
            EventType.EVT_NODE_JOINED,
            EventType.EVT_NODE_SEGMENTED
        );
        // Configure event storage for diagnostics
        MemoryEventStorageSpi eventStorageSpi = new MemoryEventStorageSpi();
        eventStorageSpi.setExpireCount(1000); // Store up to 1000 events in memory
        cfg.setEventStorageSpi(eventStorageSpi);
        // Set metrics log frequency to zero to reduce logging noise
        cfg.setMetricsLogFrequency(0);
        // Start the Ignite instance
        return Ignition.start(cfg);
    }
}
asked Nov 21, 2024 at 10:30 by Dude Ramasamy (edited Nov 21, 2024 at 10:41)
1 Answer
First, why are you using three separate data regions? It may well be fine to do so, but it pre-divides your memory. That is not a problem in itself unless one of your applications needs more than its slice of the pie. If all three caches were in the same data region, you would be limited only by total memory, since all consumers draw from the one and only pie.

As for the node failure, you would need to look at the log file for indications of what went wrong. I have seen long GC pauses ultimately cause a node to crash. I can't say that is your issue, but if your log showed long GC pauses before a crash, that would be one example of a reason for a node failure. Hope that helps.
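If the logs don't show GC activity, one rough way to test the GC theory is to sample the JVM's own GC counters from inside the client application. The sketch below uses only the standard `java.lang.management` API. The class name `GcPauseCheck`, the one-second sampling interval, and the 500 ms warning threshold are all made up for illustration and are not anything Ignite provides. Note that `getCollectionTime()` reports accumulated collection time, which for some collectors includes concurrent work, so treat it as a hint rather than an exact pause measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; not part of Ignite or Spring.
class GcPauseCheck {

    /** Accumulated GC time in milliseconds, summed over all collectors in this JVM. */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** True if the GC time accumulated between two samples is above the threshold. */
    public static boolean exceeds(long beforeMillis, long afterMillis, long thresholdMillis) {
        return afterMillis - beforeMillis > thresholdMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long thresholdMillis = 500; // assumed warning level, tune for your setup
        long before = totalGcTimeMillis();
        Thread.sleep(1000); // assumed sampling interval
        long after = totalGcTimeMillis();

        System.out.println("GC time during the last interval: " + (after - before) + " ms");
        if (exceeds(before, after, thresholdMillis)) {
            System.out.println("GC took a large share of the interval; check heap sizing.");
        }
    }
}
```

In a real application you would run the sampling on a scheduled background thread and compare the timestamps of any spikes with the time of the `Node FAILED` message. Enabling the JVM's GC logging gives more precise pause times if this points in that direction.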