Debug High CPU Usage

High CPU usage can make your server unresponsive and difficult to manage. This guide helps you identify and resolve common causes of high CPU load.

Common Causes

1. Failed Service Connections

One of the most common causes of high CPU usage is failed connection attempts to services (databases, Redis, etc.). When an application can’t connect to a required service, it may:

  • Continuously retry connections
  • Open many connection attempts at once
  • Consume excessive CPU resources
  • Eventually make the server unresponsive

This can happen when:

  • A service or database is unlinked from an app but the app still tries to use it
  • Service credentials or connection strings are incorrect
  • The service itself is unavailable
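
As an illustration of why unbounded retries are costly, the sketch below caps the number of attempts and doubles the wait between them. It is a minimal example, not something Blossom provides: `retry_with_backoff` is a hypothetical helper name, and the attempt count and delays are assumptions to tune for your app.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is
# reached, doubling the wait between attempts so that an unavailable
# service is not retried in a tight loop.
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example (the command being retried is a placeholder):
#   retry_with_backoff 5 1 some-connection-check
```

With a base delay of 1 second and 5 attempts, the waits are 1, 2, 4, and 8 seconds, so a missing service costs a handful of attempts instead of a continuous stream.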

2. Server Resource Constraints

High CPU can also occur when:

  • The server is undersized for the application workload
  • Multiple resource-intensive applications share the same server
  • Memory constraints cause excessive swapping
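
If you suspect swapping, a rough check over SSH could look like the sketch below. It assumes a Linux server that exposes `/proc/meminfo`; `swap_used_kb` is a hypothetical helper name, not a Blossom command.

```shell
# swap_used_kb: reads /proc/meminfo-style text on stdin and prints the
# amount of swap currently in use, in kB (SwapTotal minus SwapFree).
swap_used_kb() {
  awk '/^SwapTotal:/ { total = $2 }
       /^SwapFree:/  { free = $2 }
       END { printf "%d\n", total - free }'
}

# On the server:
#   swap_used_kb < /proc/meminfo
```

A large value on its own only shows that swap has been used at some point. To see whether swapping is happening right now, the `si` and `so` columns of `vmstat 5` are a more direct signal.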

Identifying the Problem

Check Application Logs

  1. Go to your app’s App Logs section
  2. Look for repeated error messages about:
    • Failed database connections
    • Redis connection errors
    • Other service connection failures
    • Timeout errors
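
If you have copied a chunk of log output into a file, a quick count of matching lines can show how often the errors repeat. This is a sketch under assumptions: the patterns are illustrative guesses at common error text, and `count_connection_errors` and `app.log` are hypothetical names.

```shell
# count_connection_errors: reads log text on stdin and prints how many
# lines look like service connection failures. The patterns are
# illustrative; adjust them to the messages your app actually logs.
count_connection_errors() {
  grep -E -i -c 'connection refused|could not connect|ECONNREFUSED|timed out|timeout' || true
}

# Example, with log output saved to a hypothetical file:
#   count_connection_errors < app.log
```

Hundreds of matches within a short time window usually point to a retry loop rather than an occasional network blip.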

Check Server Status

If the server is becoming unresponsive:

  • The “Check SSH” button may fail or time out
  • SSH connections may be slow or fail completely
  • Web requests to your application may become very slow or time out
  • Docker commands on the server may be slow to respond or fail
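
If you can still get a shell on the server, comparing the load average with the number of CPU cores gives a quick read on how overloaded it is. The sketch below assumes a Linux server with `/proc/loadavg` and `nproc`; `load_per_core_status` is a hypothetical helper, and "load above core count" is a rule of thumb rather than a Blossom threshold.

```shell
# load_per_core_status LOAD CORES
# Prints "overloaded" when the load average exceeds the number of cores,
# otherwise "ok".
load_per_core_status() {
  awk -v load="$1" -v cores="$2" \
    'BEGIN { if (load + 0 > cores + 0) print "overloaded"; else print "ok" }'
}

# On the server (1-minute load average versus core count):
#   load_per_core_status "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
#
# To see which processes and containers are using the CPU:
#   ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 6
#   docker stats --no-stream
```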

Immediate Actions

If your server becomes unresponsive due to high CPU:

  1. Scale Down the App

    • Set the app’s process count to 0 to stop connection retry attempts
    • This gives you time to investigate without constant retries
  2. Emergency Stop (if you can SSH)

    SSH into the server, change into the /blossom/apps/UID/UID-ROLE folder, and run:

    docker compose down
    

    Remember to run docker compose up -d later to start the app service again.

  3. Last Resort

    • If the server is completely unresponsive, you may need to:
      1. Force restart the server
      2. SSH in quickly to investigate before issues recur
      3. In extreme cases, delete and recreate the server

Prevention and Resolution

  • Review your app’s Databases section
  • Ensure all required services are properly linked
  • Verify connection strings in the Environment section
  • Test service connections before deploying
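
One way to test a service connection before deploying is a plain TCP check from the server. This is a minimal sketch: it assumes `bash` and the coreutils `timeout` command are installed, `check_tcp` is a hypothetical helper, and the host and port in the example are placeholders. It only confirms that the port accepts connections, not that the credentials are valid.

```shell
# check_tcp HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT can be opened within the
# timeout (default 3 seconds), non-zero otherwise.
# Relies on bash's /dev/tcp redirection and the `timeout` command.
check_tcp() {
  local host="$1" port="$2" wait_seconds="${3:-3}"
  timeout "$wait_seconds" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (placeholder host and port):
#   check_tcp db.example.internal 5432 && echo "reachable" || echo "unreachable"
```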

Monitor Resource Usage

  • Use appropriate server sizes for your workload
  • Monitor server metrics through your cloud provider’s dashboard:
    • CPU utilization
    • Memory usage
    • Disk I/O
    • Network traffic
  • Consider splitting apps across servers if needed

When to Upgrade Server Size

Consider upgrading your server if you see:

  • Consistently high CPU usage (>60%)
  • Frequent connection timeouts
  • Slow application response times
  • Regular SSH connection failures
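
To check the CPU figure directly on the server instead of through a dashboard, you can compare two samples of `/proc/stat`. The sketch below assumes a Linux server; `cpu_busy_percent` and `read_cpu_sample` are hypothetical helper names, and idle time here includes I/O wait.

```shell
# cpu_busy_percent PREV_TOTAL PREV_IDLE CUR_TOTAL CUR_IDLE
# Prints the integer percentage of CPU time spent busy between two samples.
cpu_busy_percent() {
  local total=$(( $3 - $1 )) idle=$(( $4 - $2 ))
  if [ "$total" -le 0 ]; then
    echo 0
    return
  fi
  echo $(( (total - idle) * 100 / total ))
}

# read_cpu_sample: prints "TOTAL IDLE" (in clock ticks) from the aggregate
# "cpu" line of /proc/stat. Fields 2-9 are summed; idle + iowait count as idle.
read_cpu_sample() {
  awk '/^cpu / {
         total = 0
         for (i = 2; i <= 9 && i <= NF; i++) total += $i
         printf "%.0f %.0f\n", total, $5 + $6
         exit
       }' /proc/stat
}

# Example: average CPU usage over 5 seconds
#   before=$(read_cpu_sample); sleep 5; after=$(read_cpu_sample)
#   cpu_busy_percent $before $after
```

A single reading above 60% is not a concern by itself. Repeating the measurement over several minutes shows whether the usage is consistently high.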

Remember: While larger servers can handle more load, they won’t fix underlying connection issues. Always investigate the root cause of high CPU usage.