How do I monitor the health of my community?

System Status

Some configuration and service accessibility issues can go undetected for a period of time, resulting in temporary problems that are difficult to reproduce by the time they reach support. To help confirm that each installation of Community is properly installed, configured, and working well, the System Status page gives an overview of your community.

The System Status panel is available at Administration > Monitoring > System Status and reports on system indicator plugins (implementations of Telligent.Evolution.Extensibility.Administration.Version1.ISystemIndicator).

Each indicator reports its overall status as Good, Information, Warning, or Critical and can communicate the details of that status along with a URL to take action on the related functionality. 

Badging and Notifications

Indicators reporting in the Warning or Critical state are represented in the badge count for the System Status and Monitoring panel and are shown in the front-UI in the Management toolbar.
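
The badge rule above can be sketched in a few lines of Python. This is an illustrative model only, not the platform's implementation: the `Status` enum, the `badge_count` function, and the sample snapshot are hypothetical names introduced here, and only the four state names come from this article.

```python
from enum import IntEnum


class Status(IntEnum):
    """The four indicator states named in this article, ordered by severity."""
    GOOD = 0
    INFORMATION = 1
    WARNING = 2
    CRITICAL = 3


def badge_count(statuses):
    """Count indicators in the Warning or Critical state (hypothetical helper)."""
    return sum(1 for status in statuses.values() if status >= Status.WARNING)


# Hypothetical snapshot of indicator states, for illustration only.
snapshot = {
    "Cache": Status.WARNING,
    "Database": Status.GOOD,
    "Email": Status.CRITICAL,
    "Reporting": Status.INFORMATION,
}

print(badge_count(snapshot))  # 2
```

Indicators in the Good or Information state do not contribute to the count in this model.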

Indicators that are identified as Critical will be monitored by the platform to determine if they quickly self-resolve. If not, they will be sent via system notifications to administrators to identify the critical issues requiring resolution. All system notifications related to system indicators will self-resolve when the indicator is no longer in a critical state.
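
The escalation behavior described above (watch a critical indicator, notify only if it does not clear quickly, then resolve the notification once the indicator recovers) can be modeled as a small state machine. The sketch below is an assumption-laden illustration, not the platform's code: `CriticalTracker`, `observe`, and the 300-second grace period are all hypothetical, and the article does not state how long the platform actually waits.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed grace period; the article gives no figure for "quickly".
GRACE_SECONDS = 300


@dataclass
class CriticalTracker:
    """Tracks one indicator and decides when to notify or resolve (hypothetical)."""
    critical_since: Optional[float] = None
    notified: bool = False

    def observe(self, is_critical: bool, now: float) -> Optional[str]:
        """Return 'notify', 'resolve', or None for a single status check."""
        if not is_critical:
            was_notified = self.notified
            self.critical_since = None
            self.notified = False
            # A raised notification self-resolves once the indicator recovers.
            return "resolve" if was_notified else None
        if self.critical_since is None:
            self.critical_since = now
        if not self.notified and now - self.critical_since >= GRACE_SECONDS:
            self.notified = True
            return "notify"
        return None


tracker = CriticalTracker()
checks = [(True, 0), (True, 120), (True, 300), (True, 400),
          (False, 500), (True, 600), (False, 650)]
events = [tracker.observe(critical, t) for critical, t in checks]
print(events)
# [None, None, 'notify', None, 'resolve', None, None]
```

The last two checks show a critical state that clears within the grace period, so no notification is raised for it.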


Status Details/Diagnostics

System indicators can provide helpful diagnostic data related to their current state.

Platform-defined Indicators

The platform defines the following indicators and validations/information:

  • Cache (Infrastructure)
    • Distributed caching available but not enabled (warning)
    • Detecting unbalanced cache pressure across web nodes (warning)
    • High cache pressure (warning)
  • Database (Infrastructure)
    • Mismatched configuration on nodes (critical)
    • Excessive errors (warning)
  • Email (Infrastructure)
    • Enabled but not configured (critical)
    • Excessive email template errors (critical)
    • Email is disabled (warning)
    • Excessive Email sending errors (warning)
  • File Storage (Infrastructure)
    • Mismatched configuration on nodes in connectionstrings.config or communityserver.config (critical)
    • Mismatched CDN configuration (critical)
    • Cannot read known file from system file store on a node (critical)
    • Cannot write/read/delete from a file store group configuration on a node (critical)
  • Interface (Infrastructure)
    • Mismatched site URL configuration on nodes in connectionstrings.config (critical)
    • Script configuration errors detected (critical)
    • Legacy options enabled (warning)
    • Excessive execution errors (warning)
  • Job Server (Infrastructure)
    • Multiple job servers detected (critical)
    • Configured to run jobs locally and a job server is running (critical)
    • Configured to run jobs locally but multiple web nodes detected (critical)
    • No job server running (and not running locally) (critical)
    • Configured to run jobs locally (warning)
    • Stalled but running job server (warning)
  • Licensing
    • Not licensed (critical)
    • Over-license (critical)
    • Expiring soon (warning)
  • Message Bus (Infrastructure)
    • No bus is enabled (and not running jobs locally, so expecting multiple nodes) (critical)
    • Bus is enabled and connected but no job server is detected (and not running jobs locally) (critical)
    • Not connected (critical)
    • Connected nodes are not running the same version of Community (critical)
    • No bus is enabled (and running jobs locally) (warning)
    • Database bus enabled (warning)
    • Excessive reconnects (warning)
  • Plugins (Infrastructure)
    • Misconfigured enabled plugins (critical)
    • Plugin initialization errors (critical)
    • Excessive errors by individual plugins (warning)
  • Reporting
    • Disabled (information)
    • Can’t connect (critical)
    • Mismatched configuration (critical)
    • ETL not run (critical)
    • ETL not recent (critical)
    • Long running ETL (warning)
    • ETL running (information)
  • Search (Infrastructure)
    • No search provider configured (critical)
    • SOLR unavailable (either content and/or conversations) (critical)
    • Mismatched configuration on nodes (critical)
    • Excessive errors (warning)

Disabling Infrastructure System Notifications

While not recommended, critical infrastructure notifications can be disabled. This can be useful in deployments where infrastructure is managed by a different team, or in development environments where service reliability fluctuates or services are intentionally turned off.

  1. Create a file named 'hosting.config' and set the attribute values for 'enableBadging' and 'enableSystemNotification' on the <systemStatusIndicators> element to false (see below).
  2. Deploy the hosting.config file to the root of each web node and base directory of the jobs service.
  3. Restart the job service and recycle each web node.

Example hosting.config file

<?xml version="1.0" encoding="utf-8"?>
<hosting>
    <systemStatusIndicators enableBadging="false" enableSystemNotification="false" emailTo="it-support@mycompany.com" endpointAllowedIpAddresses="127.0.0.*" />
</hosting>
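
For installations with several web nodes, the file can be generated and copied by a script. The following Python sketch is one possible approach under stated assumptions: the function names `build_hosting_config` and `deploy` are hypothetical, and the temporary directories stand in for the real web node roots and job service base directory, whose paths depend on your installation. The job service restart and web node recycle still have to be done separately.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def build_hosting_config(enable_badging=False, enable_system_notification=False):
    """Return hosting.config text with the two attributes from this article."""
    root = ET.Element("hosting")
    ET.SubElement(root, "systemStatusIndicators", {
        "enableBadging": str(enable_badging).lower(),
        "enableSystemNotification": str(enable_system_notification).lower(),
    })
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body + "\n"


def deploy(config_text, target_dirs):
    """Write hosting.config into each target directory; return the paths written."""
    written = []
    for directory in target_dirs:
        path = Path(directory) / "hosting.config"
        path.write_text(config_text, encoding="utf-8")
        written.append(path)
    return written


# Demonstration against temporary directories (placeholders for real locations).
with tempfile.TemporaryDirectory() as web_root, \
        tempfile.TemporaryDirectory() as job_root:
    for path in deploy(build_hosting_config(), [web_root, job_root]):
        print(path.name, path.read_text(encoding="utf-8").count("false"))
```

Optional attributes such as emailTo can be added to the attribute dictionary in the same way.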

Configuration options for <systemStatusIndicators> include:

  • enableBadging: (optional) When set to 'false', the badge counts on the 'System Notifications' panel will not be updated when issues are identified.
  • enableSystemNotification: (optional) When set to 'false', in-site system notifications and emails will not be raised/sent when issues are identified.
  • emailTo: (optional) When set, all infrastructure notifications will be sent to the configured email address. Only one email address is allowed, so an alias is recommended when more than one person needs to be notified.
  • endpointAllowedIpAddresses: (optional) When set, unauthenticated requests from the specified IP (or range) can retrieve system status details using the /api.ashx/v2/systemstatus.json|xml endpoint. Note: Authenticated requests are always allowed for users with 'Manage Settings' permissions.
  • Notes:
    • Regardless of configured values, community administrators can always view the current health by going to Administration > Monitoring > System Status.
    • If email is disabled site-wide, no email notifications will be sent, including emails configured in 'emailTo'.
    • If the job service is unavailable/down, emails will not be sent until the service is restarted.
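
An external monitor can poll the status endpoint described under endpointAllowedIpAddresses. The Python sketch below only builds the URL and fetches the JSON document. The helper names `status_url` and `fetch_status` and the host community.example.com are hypothetical, and the shape of the response is not documented in this article, so the sketch returns the decoded payload without interpreting it.

```python
import json
import urllib.request


def status_url(site_root, fmt="json"):
    """Build the system status endpoint URL for a site root (hypothetical helper)."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    return site_root.rstrip("/") + "/api.ashx/v2/systemstatus." + fmt


def fetch_status(site_root, timeout=10):
    """Fetch and decode the JSON status document.

    Assumes the caller's IP address is listed in endpointAllowedIpAddresses,
    since this request is unauthenticated.
    """
    with urllib.request.urlopen(status_url(site_root), timeout=timeout) as response:
        return json.load(response)


# Placeholder host; replace with your community's root URL.
print(status_url("https://community.example.com/"))
# https://community.example.com/api.ashx/v2/systemstatus.json
```

Calling `fetch_status` from a machine whose address is not allowed should be expected to fail, so a monitoring script would normally wrap the call in error handling.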

Jobs

Jobs (located at Administration > Monitoring > Jobs) lists all jobs, their scheduled operation, and their outcome. Depending on its type, an error can appear in the Exceptions log or in the JavaScript errors log.

Events

Events will give you a link to the Events log that shows failed operations, messages, event date(s), machine name, event ID, event type, and Settings ID.

Also see How can I diagnose a problem or get help?