Cluster startup time - 1 minute?

Cacheonix support forum is dedicated to support requests, bug reports and feature suggestions.
stodge

Cluster startup time - 1 minute?

Postby stodge » Fri Jan 25, 2013 1:53 pm

If I run a single app, it takes a minute for the cluster to become operational:

08:47:54,402 INFO: DistributedCacheonix(309) [ main] - Starting up cluster node: 225.0.1.2:9998:8879
08:47:54,501 INFO: TCPServer(152) [ main] - Starting TCP server bound to 127.0.0.1:8879
08:47:54,507 INFO: MulticastServerImpl(425) [ main] - Starting multicast server, 225.0.1.2:9998
08:47:54,507 INFO: MulticastServerImpl(415) [ main] - Started multicast server, address: 225.0.1.2:9998, receive buffer size: 131071
08:47:54,518 INFO: DistributedCacheonix(357) [ main] - Started up cluster node: 225.0.1.2:9998:8879

2013-01-25 08:48:54,397 DEBUG ( Cloud.java:199) - Cluster event subscription state changed ClusterStateChangedEventImpl{newClusterState=ClusterState{code=2, name='Reconfiguring'}}
2013-01-25 08:48:54,398 DEBUG ( Cloud.java:199) - Cluster event subscription state changed ClusterStateChangedEventImpl{newClusterState=ClusterState{code=1, name='Operational'}}
2013-01-25 08:48:54,401 DEBUG ( Cloud.java:189) - Cluster event subscription member joined ClusterMemberJoinedEventImpl{joinedMembers=[ClusterMemberImpl{clusterName='Cacheonix', clusterMemberAddresses=[ClusterMemberAddressImpl{inetAddress=/127.0.0.1}], clusterMemberPort=8879}]}

Is this expected? This seems like a long time. Thanks

stodge

Postby stodge » Fri Jan 25, 2013 3:08 pm

My simple Cacheonix application is running at 100% CPU on Linux

3109 mike 20 0 6608m 218m 9.9m S **100.1** 1.4 1:47.65 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java -agentl 2205 root 20 0 147m 33m 11m S 2.0 0.2 0:09.92 /usr/bin/Xorg :0 -br -verbose -logverbose 7 -auth /var/run/gdm/auth-..

I'll try a simpler example but I'm using the OpenJDK 1.6 on Fedora 16. Does Cacheonix require Oracle's JDK 1.6 or 1.7? Thanks

stodge

Postby stodge » Fri Jan 25, 2013 3:22 pm

CPU usage rises to 100% as soon as I do:

setCacheManager(Cacheonix.getInstance(configFile));

I have JDK7 installed so I'll try that.

stodge

Postby stodge » Fri Jan 25, 2013 3:24 pm

No difference.

simeshev

Postby simeshev » Fri Jan 25, 2013 3:42 pm

This doesn't sound right. Could you take a stacktrace and it to support@cacheonix.com when your app is running at 100%? You can do this by sending

kill -3 <application pid>

The stack trace will go to stdout.


Also, how many servers in the cluster do you have?

Slava

stodge

Postby stodge » Fri Jan 25, 2013 3:49 pm

I'll try to do that later. I have one server.

simeshev

Postby simeshev » Fri Jan 25, 2013 3:55 pm

Just in case, if you are running your app in an IDE like IntelliJ, it should have a button to take a stack trace.

Slava

simeshev

Postby simeshev » Fri Jan 25, 2013 3:59 pm

stodge wrote:I'll try a simpler example but I'm using the OpenJDK 1.6 on Fedora 16. Does Cacheonix require Oracle's JDK 1.6 or 1.7? Thanks


Cacheonix requires JDK 1.5 minimum.

Slava

stodge

Postby stodge » Fri Jan 25, 2013 4:02 pm

Code: Select all

2013-01-25 10:59:20
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.5-b02 mixed mode):

"DestroyJavaVM" prio=10 tid=0x00007f0fe000a000 nid=0x1c6a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CacheonixEventExecutor" prio=10 tid=0x00007f0f68001800 nid=0x1c8b waiting on condition [0x00007f0fb590f000]
   java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x0000000705e1c010> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
   at java.util.concurrent.locks.LockSupport.park(Unknown Source)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
   at java.util.concurrent.LinkedBlockingQueue.take(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)

"McastServer:0" daemon prio=10 tid=0x00007f0fe042a800 nid=0x1c8a runnable [0x00007f0fb5a10000]
   java.lang.Thread.State: RUNNABLE
   at java.net.PlainDatagramSocketImpl.receive0(Native Method)
   - locked <0x0000000705e1e028> (a java.net.PlainDatagramSocketImpl)
   at java.net.AbstractPlainDatagramSocketImpl.receive(Unknown Source)
   - locked <0x0000000705e1e028> (a java.net.PlainDatagramSocketImpl)
   at java.net.DatagramSocket.receive(Unknown Source)
   - locked <0x00000007b97405d0> (a java.net.DatagramPacket)
   - locked <0x0000000705e1e058> (a java.net.MulticastSocket)
   at com.cacheonix.net.multicast.server.MulticastServerImpl.run(MulticastServerImpl.java:228)
   at java.lang.Thread.run(Unknown Source)

"ClusterProcessor:8879" prio=10 tid=0x00007f0fe0422000 nid=0x1c89 runnable [0x00007f0fb5b11000]
   java.lang.Thread.State: RUNNABLE
   at java.io.FileInputStream.readBytes(Native Method)
   at java.io.FileInputStream.read(Unknown Source)
   at sun.security.provider.NativePRNG$RandomIO.readFully(Unknown Source)
   at sun.security.provider.NativePRNG$RandomIO.ensureBufferValid(Unknown Source)
   at sun.security.provider.NativePRNG$RandomIO.implNextBytes(Unknown Source)
   - locked <0x0000000705e2d8f8> (a java.lang.Object)
   at sun.security.provider.NativePRNG$RandomIO.access$200(Unknown Source)
   at sun.security.provider.NativePRNG.engineNextBytes(Unknown Source)
   at java.security.SecureRandom.nextBytes(Unknown Source)
   - locked <0x0000000705e12218> (a java.security.SecureRandom)
   at com.cacheonix.net.processor.UUID.randomUUID(UUID.java:61)
   at com.cacheonix.net.processor.Request.<init>(Request.java:57)
   at com.cacheonix.net.cluster.ClusterRequest.<init>(ClusterRequest.java:46)
   at com.cacheonix.net.cluster.MarkerRequest.<init>(MarkerRequest.java:70)
   at com.cacheonix.net.cluster.OperationalMarker.<init>(OperationalMarker.java:74)
   at com.cacheonix.net.cluster.BlockedMarker.<init>(BlockedMarker.java:57)
   at com.cacheonix.net.cluster.BlockedMarker$Builder.create(BlockedMarker.java:782)
   at com.cacheonix.net.serializer.WireableFactory.createWireable(WireableFactory.java:307)
   at com.cacheonix.net.serializer.JavaSerializer.deserialize(JavaSerializer.java:165)
   at com.cacheonix.net.serializer.JavaSerializer.deserialize(JavaSerializer.java:114)
   at com.cacheonix.net.processor.Router.route(Router.java:245)
   at com.cacheonix.net.processor.RequestProcessor.route(RequestProcessor.java:294)
   at com.cacheonix.net.processor.RequestProcessor.post(RequestProcessor.java:262)
   at com.cacheonix.net.cluster.BlockedMarker.forward(BlockedMarker.java:154)
   at com.cacheonix.net.cluster.BlockedMarker.processBlocked(BlockedMarker.java:560)
   at com.cacheonix.net.cluster.ClusterRequest.execute(ClusterRequest.java:80)
   at com.cacheonix.net.cluster.MarkerRequest.execute(MarkerRequest.java:93)
   at com.cacheonix.net.processor.RequestProcessor.processMessage(RequestProcessor.java:217)
   at com.cacheonix.net.cluster.ClusterProcessor.processMessage(ClusterProcessor.java:229)
   at com.cacheonix.net.cluster.ClusterProcessor.processMessage(ClusterProcessor.java:207)
   at com.cacheonix.net.processor.RequestProcessor.processCommand(RequestProcessor.java:154)
   at com.cacheonix.net.processor.SimpleProcessor$Worker.process(SimpleProcessor.java:233)
   at com.cacheonix.net.processor.SimpleProcessor$Worker.run(SimpleProcessor.java:185)
   at java.lang.Thread.run(Unknown Source)

"Receiver:8879" prio=10 tid=0x00007f0fe0421000 nid=0x1c88 runnable [0x00007f0fb5c12000]
   java.lang.Thread.State: RUNNABLE
   at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
   at sun.nio.ch.EPollArrayWrapper.poll(Unknown Source)
   at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
   at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
   - locked <0x0000000705e1c148> (a sun.nio.ch.Util$2)
   - locked <0x0000000705e1c158> (a java.util.Collections$UnmodifiableSet)
   - locked <0x0000000705e17170> (a sun.nio.ch.EPollSelectorImpl)
   at sun.nio.ch.SelectorImpl.select(Unknown Source)
   at com.cacheonix.cluster.node.SelectorWorker.select(SelectorWorker.java:230)
   at com.cacheonix.cluster.node.SelectorWorker.run(SelectorWorker.java:83)
   at java.lang.Thread.run(Unknown Source)

"Sender:8879" prio=10 tid=0x00007f0fe0406000 nid=0x1c87 runnable [0x00007f0fb5f14000]
   java.lang.Thread.State: RUNNABLE
   at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
   at sun.nio.ch.EPollArrayWrapper.poll(Unknown Source)
   at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
   at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
   - locked <0x0000000705e1e138> (a sun.nio.ch.Util$2)
   - locked <0x0000000705e1e148> (a java.util.Collections$UnmodifiableSet)
   - locked <0x0000000705e1e0f0> (a sun.nio.ch.EPollSelectorImpl)
   at sun.nio.ch.SelectorImpl.select(Unknown Source)
   at com.cacheonix.cluster.node.SelectorWorker.select(SelectorWorker.java:230)
   at com.cacheonix.cluster.node.SelectorWorker.run(SelectorWorker.java:83)
   at java.lang.Thread.run(Unknown Source)

"Skippy" prio=10 tid=0x00007f0fe037c800 nid=0x1c86 in Object.wait() [0x00007f0fb6015000]
   java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on <0x0000000705e1c168> (a java.util.TaskQueue)
   at java.lang.Object.wait(Object.java:503)
   at java.util.TimerThread.mainLoop(Unknown Source)
   - locked <0x0000000705e1c168> (a java.util.TaskQueue)
   at java.util.TimerThread.run(Unknown Source)

"CacheonixTimer" prio=10 tid=0x00007f0fe0391800 nid=0x1c85 in Object.wait() [0x00007f0fb6116000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on <0x0000000705e17b90> (a java.util.TaskQueue)
   at java.util.TimerThread.mainLoop(Unknown Source)
   - locked <0x0000000705e17b90> (a java.util.TaskQueue)
   at java.util.TimerThread.run(Unknown Source)

"Service Thread" daemon prio=10 tid=0x00007f0fe0104800 nid=0x1c83 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007f0fe0102000 nid=0x1c82 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007f0fe00ff000 nid=0x1c81 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"JDWP Command Reader" daemon prio=10 tid=0x00007f0f8c001000 nid=0x1c7e runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"JDWP Event Helper Thread" daemon prio=10 tid=0x00007f0fe00fd000 nid=0x1c7d runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"JDWP Transport Listener: dt_socket" daemon prio=10 tid=0x00007f0fe00f9800 nid=0x1c7a runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f0fe00ec800 nid=0x1c78 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f0fe00a4000 nid=0x1c76 in Object.wait() [0x00007f0fa3efd000]
   java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on <0x0000000705e1e1f8> (a java.lang.ref.ReferenceQueue$Lock)
   at java.lang.ref.ReferenceQueue.remove(Unknown Source)
   - locked <0x0000000705e1e1f8> (a java.lang.ref.ReferenceQueue$Lock)
   at java.lang.ref.ReferenceQueue.remove(Unknown Source)
   at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)

"Reference Handler" daemon prio=10 tid=0x00007f0fe00a1800 nid=0x1c75 in Object.wait() [0x00007f0fa3ffe000]
   java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on <0x0000000705e22010> (a java.lang.ref.Reference$Lock)
   at java.lang.Object.wait(Object.java:503)
   at java.lang.ref.Reference$ReferenceHandler.run(Unknown Source)
   - locked <0x0000000705e22010> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x00007f0fe009a800 nid=0x1c74 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f0fe0018000 nid=0x1c6c runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f0fe0019800 nid=0x1c6d runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f0fe001b800 nid=0x1c6e runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f0fe001d800 nid=0x1c6f runnable

"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007f0fe001f000 nid=0x1c70 runnable

"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007f0fe0021000 nid=0x1c71 runnable

"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007f0fe0023000 nid=0x1c72 runnable

"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007f0fe0024800 nid=0x1c73 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f0fe010f000 nid=0x1c84 waiting on condition

JNI global references: 2141

Heap
 PSYoungGen      total 73472K, used 56035K [0x00000007aca00000, 0x00000007b9760000, 0x0000000800000000)
  eden space 73344K, 76% used [0x00000007aca00000,0x00000007b00a0c90,0x00000007b11a0000)
  from space 128K, 75% used [0x00000007b9740000,0x00000007b9758000,0x00000007b9760000)
  to   space 192K, 0% used [0x00000007b9700000,0x00000007b9700000,0x00000007b9730000)
 ParOldGen       total 170752K, used 1939K [0x0000000705e00000, 0x00000007104c0000, 0x00000007aca00000)
  object space 170752K, 1% used [0x0000000705e00000,0x0000000705fe4de8,0x00000007104c0000)
 PSPermGen       total 21248K, used 11986K [0x0000000700c00000, 0x00000007020c0000, 0x0000000705e00000)
  object space 21248K, 56% used [0x0000000700c00000,0x00000007017b4bb0,0x00000007020c0000)

stodge

Postby stodge » Fri Jan 25, 2013 6:29 pm

This basic sample causes 100+%:

Code: Select all

import cacheonix.Cacheonix;
import cacheonix.cluster.Cluster;
import cacheonix.cluster.ClusterEventSubscriber;
import cacheonix.cluster.ClusterEventSubscriptionEndedEvent;
import cacheonix.cluster.ClusterEventSubscriptionStartedEvent;
import cacheonix.cluster.ClusterMemberJoinedEvent;
import cacheonix.cluster.ClusterMemberLeftEvent;
import cacheonix.cluster.ClusterStateChangedEvent;

public class TestCache {

   public TestCache() {

      try {
         Cacheonix cacheManager = Cacheonix.getInstance("config/test.xml");
         Cluster cluster = cacheManager.getCluster();
      } catch (cacheonix.cache.ConfigurationException e) {
         e.printStackTrace();
      }
   }

   /**
    * @param args
    */
   public static void main(String[] args) {

      TestCache cache = new TestCache();

   }
}


And here's the config:

Code: Select all

<?xml version ="1.0"?>
<cacheonix xmlns="http://www.cacheonix.com/schema/configuration"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://www.cacheonix.com/schema/configuration http://www.cacheonix.com/schema/cacheonix-config-2.0.xsd">

   <server>

      <listener>
         <tcp port="8879" buffer="128k" />
      </listener>

      <broadcast>
         <multicast multicastAddress="225.0.1.2" multicastPort="9998"
            multicastTTL="0" />
      </broadcast>

      <partitionedCache name="default">
         <store>
            <lru maxElements="10000" maxBytes="10mb"/>
            <expiration idleTime="120s"/>
         </store>
      </partitionedCache>
     
   </server>
</cacheonix>

simeshev

Postby simeshev » Mon Feb 11, 2013 7:38 am

Mike,

We have tried the same configuration and the same code, and has gotten 357 milliseconds for the cluster node to start up, see below. It seems that something is missing in your code example, because the log output in your post also shows similar startup times, but also includes lines from Cloud.java:199. It looks like you are trying to subscribe to events - could you show us how you do it, and what does the subscriber do?

Slava


Log output:

Code: Select all

23:29:43,755  INFO:           DistributedCacheonix(309) [                    main] - Starting up cluster node: 225.0.1.2:9998:8879
23:29:43,896  INFO:                      TCPServer(152) [                    main] - Starting TCP server bound to 127.0.0.1:8879
23:29:43,904  INFO:            MulticastServerImpl(425) [                    main] - Starting multicast server, 225.0.1.2:9998
23:29:43,905  INFO:            MulticastServerImpl(415) [                    main] - Started multicast server, address: 225.0.1.2:9998, receive buffer size: 188032
23:29:43,922  INFO:           DistributedCacheonix(357) [                    main] - Started up cluster node: 225.0.1.2:9998:8879
23:29:43,924  INFO:  DistrbutedCacheonixBug415Test( 25) [                    main] - Time to start: 357


Test code:

Code: Select all

package com.cacheonix.cluster.node;

import java.io.IOException;

import cacheonix.Cacheonix;
import cacheonix.CacheonixTestCase;
import cacheonix.TestUtils;
import com.cacheonix.util.logging.Logger;

/**
 * Tests a bug report in CACHEONIX-445 - "Single node cluster takes a lot of time to start"
 */
public final class DistrbutedCacheonixBug415Test extends CacheonixTestCase {

   @SuppressWarnings("UnusedDeclaration")
   private static final Logger LOG = Logger.getLogger(DistrbutedCacheonixBug415Test.class); // NOPMD


   public void testClusterStartupTime() throws IOException {

      // Get cluster
      final long start = System.currentTimeMillis();
      final Cacheonix cacheonix = Cacheonix.getInstance(TestUtils.getTestFile("cacheonix-config-CACHEONIX-445.xml"));
      cacheonix.getCluster();
      final long end = System.currentTimeMillis();
     
      // Print time
      LOG.info("Time to start: " + (end - start));
   }
}


simeshev

Postby simeshev » Mon Feb 11, 2013 7:44 am

Actually, we've been able to reproduce the problem and now working on the fix.

Slava

simeshev

Postby simeshev » Mon Feb 11, 2013 8:32 am

This problem has been fixed:

http://forums.cacheonix.com/viewtopic.php?p=106

Slava

stodge

Postby stodge » Mon Feb 11, 2013 3:57 pm

I still see the same problem.. :(

simeshev

Postby simeshev » Mon Feb 11, 2013 8:07 pm

Just to make sure, did you download the new build?

Slava


Return to “Cacheonix Support”

Who is online

Users browsing this forum: No registered users and 1 guest