Troubleshooting user session load balancing

Incorrect load balancing of user sessions occurs when one application server node receives the majority of active user sessions.

Active user sessions may also cycle between application server nodes. When viewed through an external monitoring tool, such as JConsole or Java VisualVM, the session counts on each node rise and fall in a roller-coaster pattern.

Typical causes

Incorrect load balancing is typically caused by an imbalance in CPU and JVM memory utilization among application server nodes.

Cycling of user sessions is typically due to NexJ load-balancing or Cluster Manager parameters being incorrectly tuned for the garbage collection policy used by WAS, JBoss, or NexJ Server. This may include the CPU threshold, memory threshold, and memory limit settings.
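
One way to see how a threshold can conflict with a garbage collection policy is to count how often a node's heap utilization rises through a candidate threshold. The following is a minimal sketch, not NexJ code: the class name, method name, and sample values are invented for illustration, and it assumes that heap utilization is the metric compared against the memory threshold. If the heap routinely climbs past the threshold just before each collection, the node's status flips on every garbage collection cycle.

```java
final class ThresholdCrossingSketch {

    /** Counts how many times consecutive samples move from below the threshold to at or above it. */
    static int upwardCrossings(double[] heapPercentSamples, double thresholdPercent) {
        int crossings = 0;
        for (int i = 1; i < heapPercentSamples.length; i++) {
            boolean wasBelow = heapPercentSamples[i - 1] < thresholdPercent;
            boolean isAtOrAbove = heapPercentSamples[i] >= thresholdPercent;
            if (wasBelow && isAtOrAbove) {
                crossings++;
            }
        }
        return crossings;
    }

    public static void main(String[] args) {
        // Invented sawtooth: heap fills, a collection runs, heap drops, and the cycle repeats.
        double[] samples = {60, 75, 90, 96, 55, 70, 88, 97, 50, 72, 91, 96};

        System.out.println(upwardCrossings(samples, 95.0)); // prints 3
        System.out.println(upwardCrossings(samples, 98.0)); // prints 0
    }
}
```

In this invented series, a 95% threshold is crossed on every cycle, while a 98% threshold is never reached. Real values should come from the verbose garbage collection logs and JMX snapshots for your deployment.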

Information to collect

Collect the following information to assist with troubleshooting:

  • Snapshots of JMX stats, including session count, CPU, and JVM memory statistics
  • Cluster Manager CPU and memory threshold values
  • Verbose garbage collection logs on each node
  • Thread dumps from each node while the pattern is present. Collect three dumps, 30 seconds apart.
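
The heap figure and the timed thread dumps in the list above can be sketched with the standard java.lang.management API. This is only an illustration that runs inside the JVM being inspected; the class and method names are hypothetical. In practice, dumps are usually taken from outside the process with tools such as jstack or jcmd, and the session count comes from application-specific MBeans that are not shown here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

final class NodeSnapshotSketch {

    /** Heap in use as a percentage of the maximum heap, or of committed heap if no maximum is defined. */
    static double heapUtilizationPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long ceiling = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return 100.0 * heap.getUsed() / ceiling;
    }

    /** Takes the requested number of thread dumps of this JVM, pausing between consecutive dumps. */
    static List<String> collectThreadDumps(int count, long intervalMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> dumps = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            StringBuilder dump = new StringBuilder();
            // ThreadInfo.toString() abbreviates deep stacks; it is enough for a sketch.
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                dump.append(info);
            }
            dumps.add(dump.toString());
            if (i < count - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                    break;
                }
            }
        }
        return dumps;
    }
}
```

For the collection described above, a caller would use collectThreadDumps(3, 30_000L) and record heapUtilizationPercent() alongside each dump so the two can be compared later.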

Troubleshooting

When troubleshooting the issue, review the JMX statistics and attempt to correlate the session count, CPU utilization, and JVM memory utilization statistics or graphs. Determine whether any of the nodes are busy and consider what may be causing that condition.

Verify that Cluster Manager thresholds are configured with the settings that were determined for your deployment. Validate the CPU Threshold (%), Memory Threshold (%), and Memory Limit (%) settings.

If your environment uses WebSphere Application Server (WAS), verify that session stickiness is enabled in the WebSphere plug-in for the HTTP server by confirming the following setting: IgnoreAffinityRequests = false.
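
The affinity check can also be scripted. The following sketch assumes the setting appears as an IgnoreAffinityRequests attribute on ServerCluster elements in the plug-in configuration file (plugin-cfg.xml); the class name, method name, and sample XML are hypothetical. It flags any cluster where the attribute is not explicitly false, including clusters where it is absent, so that those can be reviewed manually.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class AffinityCheckSketch {

    /** Returns the names of ServerCluster elements whose IgnoreAffinityRequests is not explicitly "false". */
    static List<String> clustersIgnoringAffinity(String pluginCfgXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pluginCfgXml.getBytes(StandardCharsets.UTF_8)));
            NodeList clusters = doc.getElementsByTagName("ServerCluster");
            List<String> flagged = new ArrayList<>();
            for (int i = 0; i < clusters.getLength(); i++) {
                Element cluster = (Element) clusters.item(i);
                // getAttribute returns "" when the attribute is missing, so missing values are flagged too.
                if (!"false".equalsIgnoreCase(cluster.getAttribute("IgnoreAffinityRequests"))) {
                    flagged.add(cluster.getAttribute("Name"));
                }
            }
            return flagged;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse plug-in configuration", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample; a real check would read the deployed plug-in configuration file.
        String sample = "<Config>"
                + "<ServerCluster Name=\"clusterA\" IgnoreAffinityRequests=\"false\"/>"
                + "<ServerCluster Name=\"clusterB\" IgnoreAffinityRequests=\"true\"/>"
                + "</Config>";
        System.out.println(clustersIgnoringAffinity(sample)); // prints [clusterB]
    }
}
```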

Additional considerations

Load balancers in the web tier typically distribute traffic equally among the available HTTP server processes. NexJ's load balancer, specifically the Cluster Manager and Session Manager components, manages user session distribution across application server nodes. The CPU threshold, memory threshold, and memory limit parameters for the Cluster Manager directly affect the load-balancing behavior. You can view the configured values in NexJ System Admin Console under Statistics > nexj.application > Administration.

Note: Do not change Cluster Manager parameters before consulting NexJ Support. The currently configured values were determined after load testing the application in combination with your garbage collection policy.

If a node is busy, the node is not assigned any new sessions. A node is busy when the JVM heap utilization reaches the specified memory threshold value, for example, a value of 95%. If a node is overloaded, the load balancer will actively swap user sessions to other available nodes. A node is considered overloaded when the JVM heap reaches the specified memory limit value, for example, a value of 99%.
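
The busy and overloaded rules above can be written as a small classification. This is a sketch of the described behavior only, not the Cluster Manager implementation: the class, enum, and method names are hypothetical, and it assumes that "reaches" means greater than or equal to the configured value.

```java
final class NodeStateSketch {

    enum NodeState { AVAILABLE, BUSY, OVERLOADED }

    /**
     * Classifies a node from its JVM heap utilization.
     * memoryThresholdPercent corresponds to the Memory Threshold (%) setting,
     * memoryLimitPercent to the Memory Limit (%) setting.
     */
    static NodeState classify(double heapPercent, double memoryThresholdPercent, double memoryLimitPercent) {
        if (heapPercent >= memoryLimitPercent) {
            return NodeState.OVERLOADED; // sessions are swapped to other available nodes
        }
        if (heapPercent >= memoryThresholdPercent) {
            return NodeState.BUSY;       // no new sessions are assigned
        }
        return NodeState.AVAILABLE;
    }

    public static void main(String[] args) {
        // Example values from the text: threshold 95%, limit 99%.
        System.out.println(classify(80.0, 95.0, 99.0)); // prints AVAILABLE
        System.out.println(classify(96.0, 95.0, 99.0)); // prints BUSY
        System.out.println(classify(99.5, 95.0, 99.0)); // prints OVERLOADED
    }
}
```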

Elevated CPU utilization makes a node appear less attractive to the load balancing logic. CPU utilization is used as a secondary metric.