The Alarms view group provides the following views:
- Node Alarms - information about node alarms in the cluster
- Volume Alarms - information about volume alarms in the cluster
- User/Group Alarms - information about users or groups that have exceeded quotas
- Alarm Notifications - configure where notifications are sent when alarms are raised
The following controls are available on views:
- Clicking any column name sorts data in ascending or descending order by that column.
- Selecting the Filter checkbox displays the Filter toolbar, which provides additional data filtering options.
- Clicking the Column Controls icon ( ) opens a dialog that lets you select which columns to view. Click any item to toggle its column on or off. You can also specify the refresh rate for updating data on the page. For example:
The Node Alarms view displays information about alarms on any node in the cluster that has raised an alarm.
The first two columns display
- Hlth - a color indicating the status of each node (see Cluster Heat Map)
- Hostname - the hostname of the node
The remaining columns are based on alarm type, such as:
- Version Alarm - one or more services on the node are running an unexpected version
- No Heartbeat Alarm - no heartbeat has been detected for over 5 minutes, and the node is not undergoing maintenance
- UID Mismatch Alarm - services in the cluster are being run with different usernames (UIDs)
- Duplicate HostId Alarm - two or more nodes in the cluster have the same Host ID
- Too Many Containers Alarm - the number of containers on this node reached the maximum limit
- Excess Logs Alarm - debug logging is enabled on this node, which can fill up disk space
- Disk Failure Alarm - a disk has failed on the node (the disk health log indicates which one failed)
- Time Skew Alarm - the clock on the node is out of sync with the master CLDB by more than 20 seconds
- Root Partition Full Alarm - the root partition ("/") on the node is 99% full and running out of space
- Installation Directory Full Alarm - the partition /opt/mapr on the node is running out of space (95% full)
- Core Present Alarm - a service on the node has crashed and created a core dump file
- High FileServer Memory Alarm - the FileServer service on the node has high memory consumption
- Pam Misconfigured Alarm - the PAM authentication on the node is configured incorrectly
- TaskTracker Local Directory Full Alarm - the local directory used by the TaskTracker is full, and the TaskTracker cannot operate as a result
- CLDB Alarm - the CLDB service on the node has stopped running
- FileServer Alarm - the FileServer service on the node has stopped running
- JobTracker Alarm - the JobTracker service on the node has stopped running
- TaskTracker Alarm - the TaskTracker service on the node has stopped running
- HBase Master Alarm - the HBase Master service on the node has stopped running
- HBase RegionServer Alarm - the HBase RegionServer service on the node has stopped running
- NFS Gateway Alarm - the NFS Gateway service on the node has stopped running
- Webserver Alarm - the WebServer service on the node has stopped running
- HostStats Alarm - the HostStats service on the node has stopped running
- Metrics write problem Alarm - metric data was not written to the database, or there were issues writing to a logical volume
See Alarms Reference.
Note the following behavior on the Node Alarms view:
- Clicking a node's Hostname navigates to the Node Properties View, which provides detailed information about the node.
- The left pane of the Node Alarms view displays the available topologies. Click a topology name to view only the nodes in that topology.
- Properties - navigates to the Node Properties View
- Forget Node - opens the Forget Node dialog to remove the node(s) from active management in this cluster. Services on the node must be stopped before the node can be forgotten.
- Manage Services - opens the Manage Node Services dialog, which lets you start and stop services on the node
- Change Topology - opens the Change Node Topology dialog, which lets you change the rack or switch path for a node
The Volume Alarms view displays information about volume alarms in the cluster:
- Mnt - whether the volume is mounted
- Vol Name - the name of the volume
- Snapshot Alarm - last Snapshot Failed alarm
- Mirror Alarm - last Mirror Failed alarm
- Replication Alarm - last Data Under-Replicated alarm
- Data Alarm - last Data Unavailable alarm
- Vol Advisory Quota Alarm - last Volume Advisory Quota Exceeded alarm
- Vol Quota Alarm- last Volume Quota Exceeded alarm
Clicking any column name sorts data in ascending or descending order by that column. Clicking a volume name displays the Volume Properties dialog
Selecting the Show Unmounted checkbox shows unmounted volumes as well as mounted volumes.
Selecting the Filter checkbox displays the Filter toolbar, which provides additional data filtering options.Buttons:
- New Volume displays the New Volume Dialog.
- Properties - if the checkboxes beside one or more volumes is selected,displays the Volume Properties dialog
- Mount (Unmount) - if an unmounted volume is selected, mounts it; if a mounted volume is selected, unmounts it
- Remove - if the checkboxes beside one or more volumes is selected, displays the Remove Volume dialog
- Start Mirroring - if a mirror volume is selected, starts the mirror sync process
- Snapshots - if the checkboxes beside one or more volumes is selected,displays the Snapshots for Volume dialog
- New Snapshot - if the checkboxes beside one or more volumes is selected,displays the Snapshot Name dialog
The User/Group Alarms view displays information about user and group quota alarms in the cluster:
- Name - the name of the user or group
- User Advisory Quota Alarm - the last Advisory Quota Exceeded alarm
- User Quota Alarm - the last Quota Exceeded alarm
- Edit Properties - opens a User Properties dialog box that lets you change user properties and clear alarms.
The Alerts dialog lets you specify which alarms cause a notification event and where email notifications are sent.
- Alarm Name - select the alarm to configure
- Standard Notification - send notification to the default for the alarm type (the cluster administrator or volume creator, for example)
- Additional Email Address - specify an additional custom email address to receive notifications for the alarm type
- OK - save changes and exit
- Cancel - exit without saving changes