### Getting CAF Metrics

CAF generates a variety of statistics. You can use them to find out how the system is performing and how it is being used. To access CAF metrics, run the following command and choose the metric to download:

```
$ ioxclient plt cafmetrics
Currently active profile : def_newapi
Command Name: plt-cafmetrics
Available Metrics:

  1. RESTAPI
  2. SYSTEM
  3. HASYNC
  4. PUSHNOT
  5. MONITOR
  6. RESOURCE
  7. CAF
  8. all
Choose the metrics file you want to download [1-8] :6
-------------CAF Metrics----------------
{
  "global": {
    "available_cpu": { "value": 1000 },
    "available_memory": { "value": 256 },
    "available_persistent_disk": { "value": 256 },
    "total_cpu": { "value": 1000 },
    "total_memory": { "value": 256 },
    "total_persistent_disk": { "value": 256 }
  }
}
```
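
When the command output is captured by a script, the JSON document has to be separated from the banner lines printed before it. The following Python sketch assumes that the output always contains the `-------------CAF Metrics----------------` separator and that everything after it is a single JSON document. The helper name `parse_caf_metrics` is hypothetical and is not part of ioxclient.

```python
import json

# Separator line that ioxclient prints before the metrics JSON.
SEPARATOR = "-------------CAF Metrics----------------"

def parse_caf_metrics(output: str) -> dict:
    """Return the JSON document that follows the CAF Metrics separator."""
    _, found, payload = output.partition(SEPARATOR)
    if not found:
        raise ValueError("CAF Metrics separator not found in output")
    return json.loads(payload)

# Captured output, shortened from the RESOURCE example.
sample = """Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{ "global": { "available_cpu": { "value": 1000 }, "total_cpu": { "value": 1000 } } }
"""

metrics = parse_caf_metrics(sample)
print(metrics["global"]["available_cpu"]["value"])
```

Raising an error when the separator is missing keeps a failed or interactive run from being mistaken for an empty metrics document.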

### CAF Metrics RESTAPI
The RESTAPI metric provides the following information for a REST API request:

*    Time taken to service the request
*    Total number of requests
*    Average time taken to serve the request
*    Maximum time taken to serve the request
*    Minimum time taken to serve the request
*    Total number of errors
*    Most recent error

The following example shows the metrics for REST API requests:

```
$ ioxclient plt cafmet RESTAPI
Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{
  "/iox/api/v2/hosting/apps": {
    "error_cnt": { "count": 1 },
    "time": {
      "75_percentile": 0.510253,
      "95_percentile": 0.510253,
      "999_percentile": 0.510253,
      "99_percentile": 0.510253,
      "avg": 0.17064899999999997,
      "count": 3,
      "max": 0.510253,
      "min": 0.000562,
      "std_dev": 0.2941058293148913
    }
  },
  "/iox/api/v2/hosting/apps/lxc/state": {
    "error_cnt": { "count": 1 },
    "time": {
      "75_percentile": 0.462845,
      "95_percentile": 0.462845,
      "999_percentile": 0.462845,
      "99_percentile": 0.462845,
      "avg": 0.231929,
      "count": 2,
      "max": 0.462845,
      "min": 0.001013,
      "std_dev": 0.3265645389689456
    }
  }
}
```
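
A per-endpoint summary is often easier to read than the raw document. The sketch below ranks endpoints by average service time and reports request and error counts. It assumes that `time.count` is the number of requests served for that endpoint, which the example output suggests but does not state. The function name `summarize` is hypothetical.

```python
import json

# Shortened copy of the RESTAPI example output (percentile fields omitted).
restapi = json.loads("""
{
  "/iox/api/v2/hosting/apps": {
    "error_cnt": { "count": 1 },
    "time": { "avg": 0.17064899999999997, "count": 3, "max": 0.510253, "min": 0.000562 }
  },
  "/iox/api/v2/hosting/apps/lxc/state": {
    "error_cnt": { "count": 1 },
    "time": { "avg": 0.231929, "count": 2, "max": 0.462845, "min": 0.001013 }
  }
}
""")

def summarize(metrics):
    """Return (endpoint, requests, errors, avg_time) rows, slowest average first."""
    rows = []
    for endpoint, stats in metrics.items():
        requests = stats["time"]["count"]  # assumed to be the request count
        errors = stats.get("error_cnt", {}).get("count", 0)
        rows.append((endpoint, requests, errors, stats["time"]["avg"]))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows

for endpoint, requests, errors, avg in summarize(restapi):
    print(f"{endpoint}: {requests} requests, {errors} errors, avg {avg:.3f}")
```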

### CAF Metrics SYSTEM
The SYSTEM metric provides the following details about system health and system resources:

*  CPU
    *  CPU usage percentage
    *  Statistics for each CPU
    *  CPU stats for system, user, guest
*  Memory
    *  Memory usage percentage
    *  Physical memory stats - used, free, total
    *  Virtual memory stats - used, free, total
    *  Swap memory stats - used, free, total
*  Disk
    *  Disk usage stats for each mounted partition - free, used, total
    *  Number of reads and writes
*  Network
    *  Network statistics for each interface
    *  Bytes received and sent
    *  Errors - in, out
    *  Packets - sent, received
    *  Drops - in, out
*  System load
    *  Average over 1 minute, 5 minutes, 15 minutes

The following example shows the metrics for CPU, disk, network, memory, and system load:

```
$ ioxclient plt cafmet SYSTEM
Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{
  "cpu": {
    "cpu_percent": {
      "75_percentile": 1,
      "95_percentile": 2.4299999999999997,
      "999_percentile": 8,
      "99_percentile": 8,
      "avg": 0.8521739130434783,
      "count": 46,
      "max": 8,
      "min": 0,
      "std_dev": 1.2152254716141786
    }
  },
  "cpu0": {
    "guest": { "value": 0 },
    "guest_nice": { "value": 0 },
    "idle": { "value": 17766.72 },
    "iowait": { "value": 4.99 },
    "irq": { "value": 0 },
    "nice": { "value": 0 },
    "softirq": { "value": 4.52 },
    "steal": { "value": 0 },
    "system": { "value": 60.41 },
    "user": { "value": 165.5 }
  },
  "df-root": {
    "free": { "value": 2.805202944e+09 },
    "total": { "value": 1.9945680896e+10 },
    "used": { "value": 1.6103698432e+10 }
  },
  "disk-loop0": {
    "read_bytes": { "value": 0 },
    "read_count": { "value": 0 },
    "read_time": { "value": 0 },
    "write_bytes": { "value": 0 },
    "write_count": { "value": 0 },
    "write_time": { "value": 0 }
  },
  "disk-sda1": {
    "read_bytes": { "value": 1.307874304e+09 },
    "read_count": { "value": 58236 },
    "read_time": { "value": 22140 },
    "write_bytes": { "value": 1.336438784e+09 },
    "write_count": { "value": 33865 },
    "write_time": { "value": 13348 }
  },
  "global": {
    "uptime": { "value": 18065 }
  },
  "loadavg": {
    "15min": { "value": 0.05 },
    "1min": { "value": 0 },
    "5min": { "value": 0.02 }
  },
  "nic-eth0": {
    "bytes_recv": { "value": 8.0668528e+07 },
    "bytes_sent": { "value": 9.446851e+06 },
    "dropin": { "value": 0 },
    "dropout": { "value": 0 },
    "errin": { "value": 0 },
    "errout": { "value": 0 },
    "packets_recv": { "value": 63178 },
    "packets_sent": { "value": 30176 }
  },
  "nic-lo": {
    "bytes_recv": { "value": 6.795144e+06 },
    "bytes_sent": { "value": 6.795144e+06 },
    "dropin": { "value": 0 },
    "dropout": { "value": 0 },
    "errin": { "value": 0 },
    "errout": { "value": 0 },
    "packets_recv": { "value": 7473 },
    "packets_sent": { "value": 7473 }
  },
  "phymem": {
    "active": { "value": 1.91465472e+09 },
    "available": { "value": 6.443024384e+09 },
    "buffers": { "value": 2.36244992e+08 },
    "cached": { "value": 1.18059008e+09 },
    "free": { "value": 5.026189312e+09 },
    "inactive": { "value": 7.3773056e+08 },
    "percent": { "value": 22.6 },
    "total": { "value": 8.329035776e+09 },
    "usage_percent": {
      "75_percentile": 3.302940672e+09,
      "95_percentile": 3.3031110656e+09,
      "999_percentile": 3.303133184e+09,
      "99_percentile": 3.303133184e+09,
      "avg": 3.2841046817391305e+09,
      "count": 46,
      "max": 3.303133184e+09,
      "min": 2.147483647e+09,
      "std_dev": 2.127291142627017e+07
    },
    "used": { "value": 3.302846464e+09 }
  },
  "swap": {
    "free": { "value": 1.071640576e+09 },
    "percent": { "value": 0 },
    "sin": { "value": 0 },
    "sout": { "value": 0 },
    "total": { "value": 1.071640576e+09 },
    "used": { "value": 0 }
  },
  "virtmem": {
    "free": { "value": 1.071640576e+09 },
    "percent": { "value": 0 },
    "sin": { "value": 0 },
    "sout": { "value": 0 },
    "total": { "value": 1.071640576e+09 },
    "usage_percent": {
      "75_percentile": 0,
      "95_percentile": 0,
      "999_percentile": 0,
      "99_percentile": 0,
      "avg": 0,
      "count": 46,
      "max": 0,
      "min": 0,
      "std_dev": 0
    },
    "used": { "value": 0 }
  }
}
```

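The SYSTEM output reports disk space per partition as raw `free`, `used`, and `total` values rather than as a percentage. The sketch below derives a usage percentage for each partition. It assumes that every partition appears under a key with the `df-` prefix, as `df-root` does in the example; the function name `disk_usage_percent` is hypothetical.

```python
def disk_usage_percent(system_metrics):
    """Map partition name to used/total as a percentage for every "df-" entry."""
    usage = {}
    for key, stats in system_metrics.items():
        if not key.startswith("df-"):  # assumed prefix for partition entries
            continue
        total = stats["total"]["value"]
        if total > 0:
            usage[key[len("df-"):]] = 100.0 * stats["used"]["value"] / total
    return usage

# Values copied from the SYSTEM example; unrelated keys are ignored.
system = {
    "df-root": {
        "free": {"value": 2.805202944e+09},
        "total": {"value": 1.9945680896e+10},
        "used": {"value": 1.6103698432e+10},
    },
    "loadavg": {"1min": {"value": 0}},
}

for partition, percent in disk_usage_percent(system).items():
    print(f"{partition}: {percent:.1f}% used")
```

Note that `free` plus `used` is less than `total` in the example, so a percentage computed from `free` would differ from the one computed here.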

### CAF Metrics CAF
The CAF metric provides the following information about the CAF process:

*  Threads - Detailed information about the CAF threads and their states
    *  Current state and whether the thread is a daemon
*  Processes
    *  Number, state, and whether the process is a daemon
*  Memory
    *  Current memory used
    *  Increase in memory, maximum memory used, minimum memory used
*  Garbage collector statistics
    *  Object count, referrer count, referent count

The following example shows the CAF metrics for threads, memory, and processes:

```
$ ioxclient plt cafmet CAF
Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{
  "gc": {
    "collection.count0": { "value": 641 },
    "collection.count1": { "value": 11 },
    "collection.count2": { "value": 1 },
    "objects.count": { "value": 32839 },
    "referents.count": { "value": 0 },
    "referrers.count": { "value": 0 }
  },
  "memory": {
    "memory.increase": {
      "75_percentile": 0,
      "95_percentile": 0.34746093749999873,
      "999_percentile": 45.2890625,
      "99_percentile": 45.2890625,
      "avg": 0.5328634510869565,
      "count": 92,
      "max": 45.2890625,
      "min": 0,
      "std_dev": 4.721528918533509
    },
    "memory.usage": { "value": 49.0234375 }
  },
  "processes": {
    "alive": { "value": 0 },
    "count": { "value": 0 },
    "daemon": { "value": 0 }
  },
  "threads": {
    "CP Server Thread-10.alive": { "value": true },
    "CP Server Thread-10.daemon": { "value": true },
    "CP Server Thread-11.alive": { "value": true },
    "CP Server Thread-11.daemon": { "value": true },
    "CP Server Thread-4.alive": { "value": true },
    "CP Server Thread-4.daemon": { "value": true },
    "CP Server Thread-5.alive": { "value": true },
    "CP Server Thread-5.daemon": { "value": true },
    "CP Server Thread-6.alive": { "value": true },
    "CP Server Thread-6.daemon": { "value": true },
    "CP Server Thread-7.alive": { "value": true },
    "CP Server Thread-7.daemon": { "value": true },
    "CP Server Thread-8.alive": { "value": true },
    "CP Server Thread-8.daemon": { "value": true },
    "CP Server Thread-9.alive": { "value": true },
    "CP Server Thread-9.daemon": { "value": true },
    "MainThread.alive": { "value": true },
    "MainThread.daemon": { "value": false },
    "MonitoringService.alive": { "value": true },
    "MonitoringService.daemon": { "value": true },
    "total_threads": { "value": 13 }
  }
}
```
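
The `threads` section stores each thread as two flat keys, `<name>.alive` and `<name>.daemon`. The sketch below regroups them into one record per thread so that, for example, non-daemon threads can be listed. It assumes that the attribute is always the text after the last dot in the key; the function name `group_threads` is hypothetical.

```python
def group_threads(thread_metrics):
    """Turn "<name>.<attribute>" keys into {name: {attribute: value}}."""
    threads = {}
    for key, entry in thread_metrics.items():
        name, dot, attribute = key.rpartition(".")
        if not dot:
            continue  # e.g. "total_threads", which is not a per-thread entry
        threads.setdefault(name, {})[attribute] = entry["value"]
    return threads

# A few entries copied from the "threads" section of the CAF example.
sample_threads = {
    "CP Server Thread-10.alive": {"value": True},
    "CP Server Thread-10.daemon": {"value": True},
    "MainThread.alive": {"value": True},
    "MainThread.daemon": {"value": False},
    "MonitoringService.alive": {"value": True},
    "MonitoringService.daemon": {"value": True},
    "total_threads": {"value": 13},
}

grouped = group_threads(sample_threads)
non_daemon = sorted(name for name, attrs in grouped.items() if not attrs.get("daemon"))
print(non_daemon)
```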

### CAF Metrics HASYNC
The HASYNC metric provides the following information about the CAF HA service:

*   Date and time when the last successful sync was performed 
*   Last successful command
*   Total number of errors
*   Last error

The following example shows the HASYNC metrics:

```
$ ioxclient plt cafmet HASYNC
Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{
  "hasync": {
    "last_rsync_cmd": {
      "value": [
        "/home/mrathees/CAF/iox-dev/core/caf/scripts/sync_data.sh",
        "/sw/opt/cisco/caf",
        "/tmp",
        "/tmp/sync_exclude.lst",
        "300",
        "120"
      ]
    },
    "last_rsync_at": { "value": "2017-02-08 14:10:47.615738" }
  }
}
```
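
A monitoring script may want to know how long ago the last successful sync happened. The sketch below parses `last_rsync_at` and returns its age in seconds. It assumes that the timestamp always has the `YYYY-MM-DD HH:MM:SS.ffffff` form shown in the example and that it is in the same time zone as the reference time passed in. The function name `seconds_since_last_sync` is hypothetical.

```python
from datetime import datetime

# Timestamp layout observed in the HASYNC example (assumed to be stable).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def seconds_since_last_sync(hasync_metrics, now):
    """Return the seconds elapsed between last_rsync_at and the given time."""
    value = hasync_metrics["hasync"]["last_rsync_at"]["value"]
    last_sync = datetime.strptime(value, TIMESTAMP_FORMAT)
    return (now - last_sync).total_seconds()

hasync = {"hasync": {"last_rsync_at": {"value": "2017-02-08 14:10:47.615738"}}}

# Illustrative reference time, five minutes after the recorded sync.
now = datetime(2017, 2, 8, 14, 15, 47, 615738)
age = seconds_since_last_sync(hasync, now)
print(age)
```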

### CAF Metrics MONITOR
The MONITOR metric provides the following information about the CAF monitoring service:

*   Last operation performed by the monitoring service
*   Total number of errors
*   Last error
*   Total number of successful operations

The following example shows monitoring metrics:

```
$ ioxclient plt cafmt MONITOR
Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{
  "monitor": {
    "last_operation": { "value": "2017-02-07 23:38:06.822355:App:lxc state is changed to start by monitoring service" },
    "success_cnt": { "count": 1 }
  }
}
```

### CAF Metrics RESOURCE
The RESOURCE metric provides the following information about CAF resources and how they are being used:

*   Current resource allocation (memory, CPU, persistent storage)
*   Resource allocation per app

The following example shows resource metrics:

```
$ ioxclient plt cafmet RESOURCE
Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{
  "global": {
    "available_cpu": { "value": 800 },
    "available_memory": { "value": 192 },
    "available_persistent_disk": { "value": 246 },
    "total_cpu": { "value": 1000 },
    "total_memory": { "value": 256 },
    "total_persistent_disk": { "value": 256 }
  },
  "lxc": {
    "cpu": { "value": 200 },
    "disk": { "value": 10 },
    "memory": { "value": 64 }
  }
}
```
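
In the example, the difference between the total and available values in `global` equals the allocation of the single app `lxc`. The sketch below makes that comparison for every resource. It assumes that every top-level key other than `global` is an app, and that the per-app `disk` value corresponds to `persistent_disk` in `global`. The names `GLOBAL_SUFFIX` and `allocation_report` are hypothetical.

```python
# Per-app key -> suffix used in the "global" section (assumed mapping).
GLOBAL_SUFFIX = {"cpu": "cpu", "memory": "memory", "disk": "persistent_disk"}

def allocation_report(metrics):
    """Return {resource: (total - available, sum of per-app allocations)}."""
    glob = metrics["global"]
    apps = {name: stats for name, stats in metrics.items() if name != "global"}
    report = {}
    for app_key, suffix in GLOBAL_SUFFIX.items():
        allocated = glob[f"total_{suffix}"]["value"] - glob[f"available_{suffix}"]["value"]
        per_app = sum(stats[app_key]["value"] for stats in apps.values())
        report[app_key] = (allocated, per_app)
    return report

# Values copied from the RESOURCE example.
resource = {
    "global": {
        "available_cpu": {"value": 800},
        "available_memory": {"value": 192},
        "available_persistent_disk": {"value": 246},
        "total_cpu": {"value": 1000},
        "total_memory": {"value": 256},
        "total_persistent_disk": {"value": 256},
    },
    "lxc": {"cpu": {"value": 200}, "disk": {"value": 10}, "memory": {"value": 64}},
}

report = allocation_report(resource)
for name, (allocated, per_app) in report.items():
    print(f"{name}: allocated={allocated}, sum of apps={per_app}")
```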

### CAF Metrics PUSHNOT
The PUSHNOT metric provides the following information about the push/async notification service:

*    Total number of events sent
*    Last event
*    Total number of connection errors
*    Last error
*    Total number of solicited events
*    Total number of unsolicited events

The following example shows push notification metrics:

```
$ ioxclient plt cafmet PUSHNOT
Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{
  "websocket": {
    "last_connection_url": { "value": "wss://10.0.2.16:8444/api/v1/appmgr/notification" },
    "connection_error": { "count": 3 },
    "first_connection_opened_at": { "value": "2017-01-31 13:28:04.893358" },
    "total_connections_closed": { "count": 1 },
    "websocket_uptime": {
      "count": 1,
      "999_percentile": 569.472773,
      "99_percentile": 569.472773,
      "min": 569.472773,
      "95_percentile": 569.472773,
      "75_percentile": 569.472773,
      "std_dev": 0,
      "max": 569.472773,
      "avg": 569.472773
    },
    "last_connection_opened_at": { "value": "2017-01-31 13:28:04.893368" },
    "last_connection_tried_at": { "value": "2017-01-31 13:37:29.358293" },
    "last_retry_count": { "value": 3 },
    "connection_tries": { "count": 2 },
    "last_connection_closed_at": { "value": "2017-01-31 13:37:34.366171" },
    "last_websocket_uptime": { "value": "0:09:29.472773" },
    "total_connection_open_try": { "count": 4 },
    "total_connections_opened": { "count": 1 }
  },
  "event": {
    "total_events": { "count": 7 },
    "metrics_cnt": { "count": 5 },
    "last_event": { "value": "Source: None. Type : publish_app_metrics. AppID: None Payload [OrderedDict([('host_id', '8A1477CA-73D0-45A6-8637-71B5F2738AC2'), ('resource_usage', [OrderedDict([('app_id', 'python'), ('type', 'APP'), ('app_status', 'RUNNING'), ('timestamp', '2017-01-31 13:36:19'), ('memory', {'current': 9172L, 'unit': 'KB'}), ('cpu', {'current': 0.01, 'unit': 'percent'}), ('network', {'current': 51573, 'unit': 'bytes'}), ('disk', {'current': 0.03, 'unit': 'MB'})])])])]" },
    "app_metrics_collect_freq": { "value": "120" },
    "last_event_time": { "value": "2017-01-31 13:36:22.908443" },
    "app_metrics_publish_freq": { "value": "120" },
    "unsoliciated_cnt": { "count": 2 }
  }
}
```
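
The `last_websocket_uptime` value is a string, while the `websocket_uptime` statistics are plain numbers. In the example the two agree once the string is converted to seconds. The sketch below does that conversion. It assumes the `H:MM:SS.ffffff` form shown in the example and does not handle any longer form that might appear for uptimes of a day or more. The function name `parse_uptime` is hypothetical.

```python
from datetime import timedelta

def parse_uptime(text):
    """Convert an "H:MM:SS.ffffff" string into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))

# Value copied from the PUSHNOT example.
uptime = parse_uptime("0:09:29.472773")
print(uptime.total_seconds())
```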


### CAF Metrics TASKMGR
The TASKMGR metric provides the following task manager statistics:

*    Total number of tasks in queue
*    Execution time information for each task
*    Status of each task

The following example shows the task manager metrics:

```
$ ioxclient plt cafmet TASKMGR
Currently active profile : def_newapi
Command Name: plt-cafmetrics
-------------CAF Metrics----------------
{
  "task-75a6e92c-e2ff-47b4-b78a-5a27f81754e0": {
    "runtime": {
      "count": 9,
      "999_percentile": 0.005129,
      "99_percentile": 0.005129,
      "min": 0.000036,
      "95_percentile": 0.005129,
      "75_percentile": 0.0012374999999999999,
      "std_dev": 0.001711348769570689,
      "max": 0.005129,
      "avg": 0.000933111111111111
    },
    "task_status": { "value": "queued" }
  },
  "task-c265d99c-802f-45cc-9d21-d89b0ea961b6": {
    "runtime": {
      "count": 1,
      "999_percentile": 0.168591,
      "99_percentile": 0.168591,
      "min": 0.168591,
      "95_percentile": 0.168591,
      "75_percentile": 0.168591,
      "std_dev": 0,
      "max": 0.168591,
      "avg": 0.168591
    },
    "task_status": { "value": "queued" }
  },
  "task": {
    "total_tasks": { "count": 4 }
  },
  "task-605f9e4e-f204-4844-8251-3561159dc65a": {
    "runtime": {
      "count": 7,
      "999_percentile": 0.007237,
      "99_percentile": 0.007237,
      "min": 0.000052,
      "95_percentile": 0.007237,
      "75_percentile": 0.000108,
      "std_dev": 0.0027045195154636634,
      "max": 0.007237,
      "avg": 0.0011038571428571428
    },
    "task_status": { "value": "queued" }
  },
  "task-e3fa8afc-f775-4735-8167-62da0540fb92": {
    "runtime": {
      "count": 7,
      "999_percentile": 0.003189,
      "99_percentile": 0.003189,
      "min": 0.000039,
      "95_percentile": 0.003189,
      "75_percentile": 0.000097,
      "std_dev": 0.0011817132840559446,
      "max": 0.003189,
      "avg": 0.0005094285714285715
    },
    "task_status": { "value": "queued" }
  }
}
```
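
The TASKMGR document mixes per-task entries with a summary entry named `task`. The sketch below counts tasks by status and compares the result with `total_tasks`. It assumes that every per-task key starts with `task-`, as in the example; the function name `task_status_counts` is hypothetical, and the `runtime` blocks are left out of the sample for brevity.

```python
from collections import Counter

def task_status_counts(metrics):
    """Count per-task entries (keys starting with "task-") by their status."""
    return Counter(
        stats["task_status"]["value"]
        for key, stats in metrics.items()
        if key.startswith("task-")  # skips the "task" summary entry
    )

# Task IDs, statuses, and total copied from the TASKMGR example.
taskmgr = {
    "task-75a6e92c-e2ff-47b4-b78a-5a27f81754e0": {"task_status": {"value": "queued"}},
    "task-c265d99c-802f-45cc-9d21-d89b0ea961b6": {"task_status": {"value": "queued"}},
    "task": {"total_tasks": {"count": 4}},
    "task-605f9e4e-f204-4844-8251-3561159dc65a": {"task_status": {"value": "queued"}},
    "task-e3fa8afc-f775-4735-8167-62da0540fb92": {"task_status": {"value": "queued"}},
}

counts = task_status_counts(taskmgr)
total = taskmgr["task"]["total_tasks"]["count"]
print(dict(counts), "total_tasks:", total)
```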


### Resetting CAF
The following command restores CAF to its factory default settings. It also removes all CAF artifacts, including apps and cartridges. As the output shows, the reset takes effect on the next restart.

```
$ ioxclient plt reset
Currently active profile : def_newapi
Command Name: plt-reset
Reset flag set. Reset will be done on next restart.
```


### Managing core files

Core dumps may be generated on the platform when a process crashes. You can list, download, and delete core files.

Core file commands:

```
~$ ioxclient platform core
NAME:
   ioxclient platform core - Manage core files on the platform

USAGE:
   ioxclient platform core command [command options] [arguments...]

COMMANDS:
   list, li         List all existing core files
   delete, d        Delete a corefile snapshot file
   download, dnld   Download a corefile snapshot file
   help, h          Shows a list of commands or help for one command

OPTIONS:
   --help, -h                   show help
   --generate-bash-completion
```


#### Listing core files on the platform

```
~$ ioxclient platform core list
Currently using profile : default
Command Name: plt-core-list
No core files found!
```


#### Downloading core files

```
~$ ioxclient platform core download
```


#### Deleting core files

```
~$ ioxclient platform core delete
```