Debuggability and Diagnostics

Debuggability

The debuggability commands report errors that occur during operations on the platform. The help output below lists the available subcommands.

~$ ioxclient platform errors -h
NAME:
   ioxclient platform errors - Get platform errors

USAGE:
   ioxclient platform errors command [command options] [arguments...]

COMMANDS:
   list, li             Get most recent CAF errors.
   forward, fwd         List errors from the beginning
   detail, d            Get details about a record
   filter, flt          Display records based on a filter phrase
   statistics, stat     Get statistics
   help, h              Shows a list of commands or help for one command

OPTIONS:
   --help, -h                   show help
   --generate-bash-completion

Display platform errors in reverse order

To display records starting with the most recent errors, use the command below. By default, it displays 10 records.

~$ ioxclient platform errors list
Currently active profile :  mycaf
Command Name:  plt-errors-list
Defaulting number of records to 10
Record: 148
ERROR   2017-11-14 23:45:31,101 [controller.py:2412 - _upgradeConnector()] Exception while upgrading app: <type 'exceptions.IOError'>
Record: 147
ERROR   2017-11-14 23:45:30,197 [controller.py:2367 - _upgradeConnector()] Exception while deploying new package : [Errno 28
Record: 146
ERROR   2017-11-14 23:45:30,180 [controller.py:2119 - _deployConnector()] Exception while deploying connector:paas. Exception:[Errno 28
Record: 145
ERROR   2017-11-14 23:42:51,871 [libvirtcontainer.py:148 - start()] Failed to start container LXC Container : Name: lxcapp, UUID: aac8191f-2a0e-4183-ac14-225b54a6bdd8, Error: unsupported configuration: Unable to find security driver for model smack
Record: 144
ERROR   2017-11-14 23:42:08,130 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 143
ERROR   2017-11-14 23:42:08,129 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 142
ERROR   2017-11-14 23:41:24,267 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 141
ERROR   2017-11-14 23:41:24,266 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 140
ERROR   2017-11-14 23:41:20,907 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 139
ERROR   2017-11-14 23:41:20,906 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Press enter to list more records, or record number to see details about the record, or q to quit:

To display a different number of records, specify the count after the list command.

~$ ioxclient platform errors list 5
Currently active profile :  mycaf
Command Name:  plt-errors-list
Record: 148
ERROR   2017-11-14 23:45:31,101 [controller.py:2412 - _upgradeConnector()] Exception while upgrading app: <type 'exceptions.IOError'>
Record: 147
ERROR   2017-11-14 23:45:30,197 [controller.py:2367 - _upgradeConnector()] Exception while deploying new package : [Errno 28
Record: 146
ERROR   2017-11-14 23:45:30,180 [controller.py:2119 - _deployConnector()] Exception while deploying connector:paas. Exception:[Errno 28
Record: 145
ERROR   2017-11-14 23:42:51,871 [libvirtcontainer.py:148 - start()] Failed to start container LXC Container : Name: lxcapp, UUID: aac8191f-2a0e-4183-ac14-225b54a6bdd8, Error: unsupported configuration: Unable to find security driver for model smack
Record: 144
ERROR   2017-11-14 23:42:08,130 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Press enter to list more records, or record number to see details about the record, or q to quit: 144

Enter a record number at the prompt above to see the details of that record. In this example, record 144 was entered.
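
When a listing is captured to a file or a variable, the Record / ERROR line pairs can be turned into structured data for further processing. The following is a minimal Python sketch, assuming the two-line layout shown in the transcripts above. The names ErrorRecord, parse_records, RECORD_RE and ENTRY_RE are illustrative and are not part of ioxclient.

```python
import re
from typing import List, NamedTuple


class ErrorRecord(NamedTuple):
    number: int
    level: str
    timestamp: str
    source_file: str
    line: int
    function: str
    message: str


# "Record: 145"
RECORD_RE = re.compile(r"^Record:\s*(\d+)\s*$")
# "ERROR   2017-11-14 23:42:51,871 [file.py:148 - func()] message"
ENTRY_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+) - (?P<func>\w+)\(\)\]\s*"
    r"(?P<message>.*)$"
)


def parse_records(text: str) -> List[ErrorRecord]:
    """Pair each 'Record: N' line with the log entry that follows it."""
    records = []
    pending = None  # record number waiting for its log entry
    for raw in text.splitlines():
        line = raw.strip()
        number = RECORD_RE.match(line)
        if number:
            pending = int(number.group(1))
            continue
        entry = ENTRY_RE.match(line)
        if entry and pending is not None:
            records.append(ErrorRecord(
                pending,
                entry.group("level"),
                entry.group("timestamp"),
                entry.group("file"),
                int(entry.group("line")),
                entry.group("func"),
                entry.group("message"),
            ))
            pending = None
    return records


# Two records copied from the listing above.
SAMPLE = """\
Record: 145
ERROR   2017-11-14 23:42:51,871 [libvirtcontainer.py:148 - start()] Failed to start container LXC Container : Name: lxcapp, UUID: aac8191f-2a0e-4183-ac14-225b54a6bdd8, Error: unsupported configuration: Unable to find security driver for model smack
Record: 144
ERROR   2017-11-14 23:42:08,130 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
"""

if __name__ == "__main__":
    for rec in parse_records(SAMPLE):
        print(rec.number, rec.source_file, rec.function, rec.message[:40])
```

Lines that match neither pattern, such as the profile banner and the interactive prompt, are skipped, so the whole transcript can be passed in unchanged.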

Display platform errors in ascending order

To display records starting with the earliest recorded errors, use the command below. By default, it displays 10 records.

~$ ioxclient platform errors forward
Currently active profile :  mycaf
Command Name:  plt-errors-forward
Defaulting number of records to 10
Record: 137
ERROR   2017-11-14 23:32:12,051 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 138
ERROR   2017-11-14 23:32:12,052 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 139
ERROR   2017-11-14 23:41:20,906 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 140
ERROR   2017-11-14 23:41:20,907 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 141
ERROR   2017-11-14 23:41:24,266 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 142
ERROR   2017-11-14 23:41:24,267 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 143
ERROR   2017-11-14 23:42:08,129 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 144
ERROR   2017-11-14 23:42:08,130 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 145
ERROR   2017-11-14 23:42:51,871 [libvirtcontainer.py:148 - start()] Failed to start container LXC Container : Name: lxcapp, UUID: aac8191f-2a0e-4183-ac14-225b54a6bdd8, Error: unsupported configuration: Unable to find security driver for model smack
Record: 146
ERROR   2017-11-14 23:45:30,180 [controller.py:2119 - _deployConnector()] Exception while deploying connector:paas. Exception:[Errno 28
Press enter to list more records, or record number to see details about the record, or q to quit:

To display a different number of records, specify the count after the forward command.

~$ ioxclient platform errors forward 5
Currently active profile :  mycaf
Command Name:  plt-errors-forward
Record: 137
ERROR   2017-11-14 23:32:12,051 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 138
ERROR   2017-11-14 23:32:12,052 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 139
ERROR   2017-11-14 23:41:20,906 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 140
ERROR   2017-11-14 23:41:20,907 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 141
ERROR   2017-11-14 23:41:24,266 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Press enter to list more records, or record number to see details about the record, or q to quit:
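
Listings like the ones above often repeat the same failure several times. One way to spot this is to count entries by the log site in square brackets. The following Python sketch works on captured output; count_by_site and SITE_RE are hypothetical names chosen for illustration, and the bracket layout is assumed to match the transcripts above.

```python
import re
from collections import Counter

# Captures "controller.py:1446 - _activateConnector()" from inside the brackets.
SITE_RE = re.compile(r"\[([\w.]+:\d+ - \w+\(\))\]")


def count_by_site(text: str) -> Counter:
    """Count log entries per 'file:line - function()' site."""
    counts = Counter()
    for line in text.splitlines():
        match = SITE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The five records shown by "forward 5" above.
SAMPLE = """\
Record: 137
ERROR   2017-11-14 23:32:12,051 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 138
ERROR   2017-11-14 23:32:12,052 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 139
ERROR   2017-11-14 23:41:20,906 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
Record: 140
ERROR   2017-11-14 23:41:20,907 [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Record: 141
ERROR   2017-11-14 23:41:24,266 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132
"""

if __name__ == "__main__":
    for site, count in count_by_site(SAMPLE).most_common():
        print(count, site)
```

For this sample, the two sites in _activateConnector() account for all five records, which points at a single underlying cause (insufficient CPU) rather than five separate problems.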

Display detailed information about a record

Detailed information about a record, including the traceback and the log lines that precede and follow the error, can be viewed using the command below. The record number must be specified as an argument to the command.

~$ ioxclient platform errors detail
NAME:
   detail - Get details about a record

USAGE:
   command detail <error_id>

Example:

~$ ioxclient platform errors detail 137
Currently active profile :  mycaf
Command Name:  plt-errors-detail

                ----LEADING LINES----
2017-11-14 23:32:12,044 [runtime.hosting:DEBUG] [Thread-14] [resourcemanager.py:263 - construct_cpu_shares_from_cpu_units()] app cpu shares : 6831
2017-11-14 23:32:12,045 [runtime.hosting:INFO] [Thread-14] [resourcemanager.py:807 - resolve_app_resource_dependencies()] Network manifest: [{'ipv6_required': True, 'interface-name': 'eth0', 'ports': {'udp': [10000], 'tcp': [9000]}}]
2017-11-14 23:32:12,046 [runtime.hosting:DEBUG] [Thread-14] [resourcemanager.py:814 - resolve_app_resource_dependencies()] Associating eth0 with network iox-bridge0
2017-11-14 23:32:12,047 [runtime.hosting:DEBUG] [Thread-14] [resourcemanager.py:913 - resolve_app_resource_dependencies()] Updated resources: {'profile': u'custom', 'network': [{'interface-name': 'eth0', 'port_map': None, 'network-name': u'iox-bridge0', 'ipv6_required': True, 'mode': None, 'ipv6': {}, 'ports': {'udp': [10000], 'tcp': [9000]}, 'ipv4': {}}], 'memory': 64, 'vcpu': 1, 'disk': 2, 'cpu': 6831}
2017-11-14 23:32:12,047 [runtime.hosting:DEBUG] [Thread-14] [controller.py:1432 - _activateConnector()] App resources: {'profile': u'custom', 'network': [{'interface-name': 'eth0', 'port_map': None, 'network-name': u'iox-bridge0', 'ipv6_required': True, 'mode': None, 'ipv6': {}, 'ports': {'udp': [10000], 'tcp': [9000]}, 'ipv4': {}}], 'memory': 64, 'vcpu': 1, 'disk': 2, 'cpu': 6831}
2017-11-14 23:32:12,048 [runtime.hosting:DEBUG] [Thread-14] [controller.py:1436 - _activateConnector()] Resolved app resource dependencies: {'profile': u'custom', 'network': [{'interface-name': 'eth0', 'port_map': None, 'network-name': u'iox-bridge0', 'ipv6_required': True, 'mode': None, 'ipv6': {}, 'ports': {'udp': [10000], 'tcp': [9000]}, 'ipv4': {}}], 'memory': 64, 'vcpu': 1, 'disk': 2, 'cpu': 6831}
2017-11-14 23:32:12,048 [runtime.hosting:DEBUG] [Thread-14] [controller.py:1439 - _activateConnector()] Checking resource availability..
2017-11-14 23:32:12,049 [runtime.hosting:DEBUG] [Thread-14] [resourcemanager.py:271 - construct_cpu_units_from_shares()] construct cpu units
2017-11-14 23:32:12,050 [runtime.hosting:DEBUG] [Thread-14] [resourcemanager.py:276 - construct_cpu_units_from_shares()] total cpu:732.0 parent cpu shares: 10000.0 app shares: 6831 app_cpu_units: 500
2017-11-14 23:32:12,050 [runtime.hosting:DEBUG] [Thread-14] [platformcapabilities.py:192 - check_runtime_resource_availability()] check runtime resources, requested by App, cpu: 500, memory: 64

                ----RECORD LINE----
2017-11-14 23:32:12,051 [runtime.hosting:ERROR] [Thread-14] [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132

                ----FOLLOWING LINES----
2017-11-14 23:32:12,052 [runtime.hosting:ERROR] [Thread-14] [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Traceback (most recent call last):
  File "/tmp/tmpdrDwuj/ir829/caf/src/appfw/runtime/controller.py", line 1447, in _activateConnector
AppActivationError: App Activation error: Platform does not have enough cpu resource, available cpu 132
2017-11-14 23:32:12,053 [runtime.hosting:DEBUG] [Thread-14] [controller.py:1637 - _deactivateConnector()] DeActivating connector testLxc
2017-11-14 23:32:12,054 [runtime.hostingmanager:DEBUG] [Thread-14] [hostingmgmt.py:129 - get_instance()] Hosting manager initialization not completed
2017-11-14 23:32:12,054 [runtime.hostingmanager:DEBUG] [Thread-14] [hostingmgmt.py:129 - get_instance()] Hosting manager initialization not completed
2017-11-14 23:32:12,055 [runtime.hostingmanager:DEBUG] [Thread-14] [hostingmgmt.py:129 - get_instance()] Hosting manager initialization not completed
2017-11-14 23:32:12,056 [runtime.hostingmanager:DEBUG] [Thread-14] [hostingmgmt.py:129 - get_instance()] Hosting manager initialization not completed
2017-11-14 23:32:12,056 [utils:DEBUG] [Thread-14] [utils.py:1005 - wrappedFunction()] Synchronizing function 'deactivateConnector' with args '(<appfw.runtime.connectorwrapper.ConnectorWrapper object at 0x7f9308090d10>, False), {}'
2017-11-14 23:32:12,057 [utils:DEBUG] [Thread-14] [utils.py:986 - wrappedFunction()] Synchronizing function 'setConnectorState' with args '(<appfw.runtime.connectorwrapper.ConnectorWrapper object at 0x7f9308090d10>, 'DEPLOYED'), {}'
2017-11-14 23:32:12,057 [utils:DEBUG] [Thread-14] [utils.py:1009 - wrappedFunction()] Done synchronizing on function 'deactivateConnector'
2017-11-14 23:32:12,058 [runtime.hosting:DEBUG] [MonitoringService] [monitoring.py:428 - _update_app_infomap()] Updating the apps map
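
The detail output is divided into three parts by the LEADING LINES, RECORD LINE and FOLLOWING LINES markers. If the output is saved for later analysis, it can be split on those markers. The following is a minimal Python sketch under that assumption; split_detail and MARKER_RE are illustrative names, not ioxclient features.

```python
import re
from typing import Dict, List

# Matches marker lines such as "                ----RECORD LINE----".
MARKER_RE = re.compile(
    r"^\s*-{4}(LEADING LINES|RECORD LINE|FOLLOWING LINES)-{4}\s*$"
)


def split_detail(text: str) -> Dict[str, List[str]]:
    """Group the lines of a detail transcript under their section marker."""
    sections = {"LEADING LINES": [], "RECORD LINE": [], "FOLLOWING LINES": []}
    current = None  # lines before the first marker (banner text) are ignored
    for line in text.splitlines():
        marker = MARKER_RE.match(line)
        if marker:
            current = marker.group(1)
        elif current is not None and line.strip():
            sections[current].append(line)
    return sections


# Abbreviated copy of the detail output for record 137 shown above.
SAMPLE = """\
                ----LEADING LINES----
2017-11-14 23:32:12,050 [runtime.hosting:DEBUG] [Thread-14] [platformcapabilities.py:192 - check_runtime_resource_availability()] check runtime resources, requested by App, cpu: 500, memory: 64

                ----RECORD LINE----
2017-11-14 23:32:12,051 [runtime.hosting:ERROR] [Thread-14] [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132

                ----FOLLOWING LINES----
2017-11-14 23:32:12,052 [runtime.hosting:ERROR] [Thread-14] [controller.py:1596 - _activateConnector()] Error in Activating the container , cause App Activation error: Platform does not have enough cpu resource, available cpu 132
Traceback (most recent call last):
  File "/tmp/tmpdrDwuj/ir829/caf/src/appfw/runtime/controller.py", line 1447, in _activateConnector
AppActivationError: App Activation error: Platform does not have enough cpu resource, available cpu 132
"""

if __name__ == "__main__":
    parts = split_detail(SAMPLE)
    print(parts["RECORD LINE"][0])
    print(len(parts["FOLLOWING LINES"]), "following lines")
```

Reading the sections together shows the cause in this example: the leading lines record a request for 500 CPU units, and the record line reports that only 132 are available.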

Filter platform errors based on category or user input

Platform records can be filtered by severity (critical, error, or warning) or by a user-supplied keyword. The keyword filter accepts regular expressions.


~$ ioxclient platform errors filter
NAME:
   ioxclient platform errors filter - Display records based on a filter phrase

USAGE:
   ioxclient platform errors filter command [command options] [arguments...]

COMMANDS:
   critical, cr         Filter critical records
   error, err           Filter error records
   warning, warn        Filter warning records
   keyword, key         Filter based on the user-input keyword provided. Accepts regular expressions
   help, h              Shows a list of commands or help for one command

OPTIONS:
   --help, -h                   show help
   --generate-bash-completion
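
The help text states that the keyword filter accepts regular expressions, but it does not say which regular expression syntax the platform uses. As a rough local analogue, the following Python sketch applies a pattern to captured record lines with the standard re module. The function name filter_lines is hypothetical, and Python's syntax may differ from what the platform accepts.

```python
import re
from typing import Iterable, List


def filter_lines(lines: Iterable[str], pattern: str) -> List[str]:
    """Return the lines in which the regular expression matches anywhere."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


# Three record lines copied from the listings above.
LINES = [
    "ERROR   2017-11-14 23:45:30,197 [controller.py:2367 - _upgradeConnector()] Exception while deploying new package : [Errno 28",
    "ERROR   2017-11-14 23:42:51,871 [libvirtcontainer.py:148 - start()] Failed to start container LXC Container : Name: lxcapp, UUID: aac8191f-2a0e-4183-ac14-225b54a6bdd8, Error: unsupported configuration: Unable to find security driver for model smack",
    "ERROR   2017-11-14 23:42:08,129 [controller.py:1446 - _activateConnector()] Error in check resources availability Platform does not have enough cpu resource, available cpu 132",
]

if __name__ == "__main__":
    # Alternation selects either an errno failure or a security driver failure.
    for line in filter_lines(LINES, r"Errno \d+|security driver"):
        print(line)
```

A pattern with alternation, as in the usage above, selects several unrelated failure types in one pass, which a plain substring search cannot do.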

Display platform statistics

This command displays error statistics along with CAF uptime, device uptime, and local time.

~$ ioxclient platform errors statistics
Currently active profile :  mycaf
Command Name:  plt-errors-statistics

caf_uptime:     3d 20:43:59
device_uptime:  3d 20:44:8
last_timestamp_str:     2017-11-09 16:15:14,966
local_time_str: 2017-11-13 12:26:04 UTC
num_critical_last_caf:  0
num_error_last_caf:     27
num_warning_last_caf:   41
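
The statistics output is a list of key: value lines, so it is straightforward to load into a dictionary for monitoring scripts. The following Python sketch assumes the layout shown above, including the "3d 20:43:59" uptime form; parse_statistics and parse_uptime are illustrative names, not ioxclient features.

```python
import re
from datetime import timedelta
from typing import Dict, Union

# "3d 20:43:59" or "20:44:8"; the day part is treated as optional.
UPTIME_RE = re.compile(r"^(?:(\d+)d\s+)?(\d+):(\d+):(\d+)$")
KEY_RE = re.compile(r"^[a-z_]+$")


def parse_uptime(value: str) -> timedelta:
    """Convert an uptime string such as '3d 20:43:59' to a timedelta."""
    match = UPTIME_RE.match(value.strip())
    if not match:
        raise ValueError("unrecognised uptime: %r" % value)
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def parse_statistics(text: str) -> Dict[str, Union[int, str, timedelta]]:
    """Parse 'key: value' lines; counters become int, uptimes timedelta."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        key, value = key.strip(), value.strip()
        if not sep or not KEY_RE.match(key):
            continue  # skips banner lines such as "Command Name: ..."
        if key.startswith("num_"):
            stats[key] = int(value)
        elif key.endswith("_uptime"):
            stats[key] = parse_uptime(value)
        else:
            stats[key] = value
    return stats


# Copied from the statistics output above.
SAMPLE = """\
caf_uptime:     3d 20:43:59
device_uptime:  3d 20:44:8
last_timestamp_str:     2017-11-09 16:15:14,966
local_time_str: 2017-11-13 12:26:04 UTC
num_critical_last_caf:  0
num_error_last_caf:     27
num_warning_last_caf:   41
"""

if __name__ == "__main__":
    stats = parse_statistics(SAMPLE)
    print(stats["num_error_last_caf"], stats["caf_uptime"])
```

Keeping uptimes as timedelta values makes comparisons simple, for example checking how long after device boot CAF came up.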

Diagnostics

Diagnostics can be run on the platform to report on memory, disk usage, processes, networking, and applications. Use the command below to run diagnostics, then choose a category from the menu.

~$ ioxclient platform diagnostics
Currently active profile :  mica
Command Name:  plt-diagnostics
1. summary
2. memory
3. disk
4. process
5. networking
6. application
7. all
Choose the diagnostic [1-7] : 2
Do you want detailed output (y/n)?n
eid: iox-ir809-11
pfm: IR809G-LTE-GA-K9
s/n: JMX2020X021
boot: 2017-11-09 02:46:00
time: 2017-11-14 23:57:50
load: 23:57:50 up 5 days, 21:11, 2 users, load average: 0.00, 0.07, 0.09

--Free Memory--
 total used free shared buff/cache available
Mem: 936 102 422 12 412 787

--Meminfo--
MemTotal: 959484 kB
MemFree: 432636 kB
MemAvailable: 806760 kB

--Top Memory Usage--
 PPID PID %CPU %MEM RSS TIME STIME CMD
 1 12008 0.4 8.3 80040 00:34:46 Nov09 python /home/root/iox/caf/scripts/startup.pyc /home/root/iox/caf/config/system-config.ini /home/root/iox/caf/config/log-config.ini
 1 726 0.0 2.0 19916 00:04:18 Nov09 python /home/root/fap/tpmc.py
 1 11220 0.0 1.2 12084 00:00:49 Nov09 /usr/sbin/libvirtd --daemon --listen
 1 30604 0.0 0.8 7692 00:00:00 23:41 /usr/lib64/libvirt/libvirt_lxc --name nt02Test --console 22 --security=none --handshake 25 --veth vnet1
 1 623 0.0 0.5 5612 00:00:00 Nov09 /sbin/klogd
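
The --Meminfo-- values above can be used to compute how much memory is still available to applications. The following Python sketch assumes the "Name: value kB" layout shown in the output; parse_meminfo and available_percent are hypothetical helper names.

```python
import re
from typing import Dict

# "MemTotal: 959484 kB"; the "Mem:" row of the free table does not match.
MEMINFO_RE = re.compile(r"^(Mem\w+):\s+(\d+)\s*kB$")


def parse_meminfo(text: str) -> Dict[str, int]:
    """Collect MemTotal, MemFree and MemAvailable in kB."""
    values = {}
    for line in text.splitlines():
        match = MEMINFO_RE.match(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def available_percent(values: Dict[str, int]) -> float:
    """Share of total memory reported as available, in percent."""
    return 100.0 * values["MemAvailable"] / values["MemTotal"]


# Copied from the memory diagnostics output above.
SAMPLE = """\
--Free Memory--
 total used free shared buff/cache available
Mem: 936 102 422 12 412 787

--Meminfo--
MemTotal: 959484 kB
MemFree: 432636 kB
MemAvailable: 806760 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("available: %.1f%%" % available_percent(info))
```

For the sample values, roughly 84% of memory is reported as available. MemAvailable is used rather than MemFree because it also counts memory the kernel can reclaim from caches.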

A detailed version of any diagnostic can be displayed by answering y at the detailed-output prompt, as in the session below.

~$ ioxclient platform diagnostics
Currently active profile :  mica
Command Name:  plt-diagnostics
1. summary
2. memory
3. disk
4. process
5. networking
6. application
7. all
Choose the diagnostic [1-7] : 3
Do you want detailed output (y/n)?y

eid: iox-ir809-11
pfm: IR809G-LTE-GA-K9
s/n: JMX2020X021
boot: 2017-11-09 02:46:00
time: 2017-11-14 23:55:14
load: 23:55:14 up 5 days, 21:09, 2 users, load average: 0.01, 0.11, 0.12

--Free Disk--
Filesystem             1024-blocks   Used Available Capacity Mounted on
/dev/root                   407321 255478    130391      67% /oldroot
devtmpfs                    478876    216    478660       1% /dev
tmpfs                           40      0        40       0% /oldroot/mnt/.psplash
tmpfs                       479740    200    479540       1% /run
tmpfs                       479740   7944    471796       2% /var/volatile
tmpfs                       479740   4552    475188       1% /oldroot/ovfs-rw
none                        479740   4552    475188       1% /
/dev/sdb                    807088 443988    321272      59% /software
cgroup                      479740      0    479740       0% /sys/fs/cgroup
/dev/mapper/caf_cc_dev        6907     83      6415       2% /software/caf/CC
/dev/loop1                    2989     34      2666       2% /software/caf/work/repo-lxc/lxc-data/nt02Test
/dev/loop2                  114877   9618     99327       9% /software/caf/work/repo-lxc/nt02Test/rootfs_mnt
/dev/loop3                    2989     34      2666       2% /software/caf/work/repo-lxc/lxc-data/lxcapp
/dev/loop4                   83956  68889     15067      83% /software/caf/work/repo-lxc/lxcapp/rootfs_mnt

--Mount--
/dev/sda1 on /oldroot type ext3 (ro,noatime,errors=continue,user_xattr,acl,barrier=1,data=ordered)
devtmpfs on /oldroot/dev type devtmpfs (rw,relatime,size=478876k,nr_inodes=119719,mode=755)
sysfs on /oldroot/sys type sysfs (rw,relatime)
proc on /oldroot/proc type proc (rw,relatime)
tmpfs on /oldroot/mnt/.psplash type tmpfs (rw,relatime,size=40k)
debugfs on /oldroot/sys/kernel/debug type debugfs (rw,relatime)
tmpfs on /oldroot/run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /oldroot/var/volatile type tmpfs (rw,relatime)
smackfs on /oldroot/sys/fs/smackfs type smackfs (rw,relatime)
devpts on /oldroot/dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /oldroot/ovfs-rw type tmpfs (rw,relatime)
none on / type overlay (rw,relatime,lowerdir=/,upperdir=/ovfs-rw/upperdir,workdir=/ovfs-rw/workdir)
devtmpfs on /dev type devtmpfs (rw,relatime,size=478876k,nr_inodes=119719,mode=755)
sysfs on /sys type sysfs (rw,relatime)
proc on /proc type proc (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /var/volatile type tmpfs (rw,relatime)
smackfs on /sys/fs/smackfs type smackfs (rw,relatime)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
/dev/sdb on /software type ext3 (rw,noatime,errors=continue,barrier=1,data=ordered)
/dev/sdb on /oldroot/software type ext3 (rw,noatime,errors=continue,barrier=1,data=ordered)
cgroup on /sys/fs/cgroup type tmpfs (rw,relatime,mode=755)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,relatime,net_cls)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
cgroup on /sys/fs/cgroup/debug type cgroup (rw,relatime,debug)
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
/dev/mapper/caf_cc_dev on /software/caf/CC type ext3 (rw,relatime,errors=continue,user_xattr,acl,barrier=1,data=ordered)
/software/caf/datastore/nt02Test_data.ext4 on /software/caf/work/repo-lxc/lxc-data/nt02Test type ext4 (rw,noatime,data=ordered)
/software/caf/work/repo/nt02Test/extract_archive/app.ext2 on /software/caf/work/repo-lxc/nt02Test/rootfs_mnt type ext2 (rw,noatime,errors=continue,user_xattr,acl)
/software/caf/datastore/lxcapp_data.ext4 on /software/caf/work/repo-lxc/lxc-data/lxcapp type ext4 (rw,noatime,data=ordered)
/software/caf/work/repo/lxcapp/extract_archive/rootfs.img on /software/caf/work/repo-lxc/lxcapp/rootfs_mnt type ext2 (rw,noatime,errors=continue,user_xattr,acl)
tracefs on /oldroot/sys/kernel/debug/tracing type tracefs (rw,relatime)


--Top Disk Usage--
/*:
Error: 651M     /oldroot
25M     /home
11M     /lib64
7.5M    /boot
5.7M    /bin
3.8M    /lib
3.3M    /etc
1.1M    /usr
84K     /var
12K     /lost+found
6.0K    /sbin
4.0K    /mnt
2.0K    /selinux
2.0K    /ovfs-rw
2.0K    /ovfs
2.0K    /media
2.0K    /downloads
du: cannot access '/oldroot/proc/455/task/455/fdinfo/4': No such file or directory
du: cannot access '/oldroot/proc/455/task/455/fd/4': No such file or directory
du: cannot access '/oldroot/proc/455/fdinfo/4': No such file or directory
du: cannot access '/oldroot/proc/455/fd/4': No such file or directory
/software/*:
507M    /software/caf
48K     /software/techsupport
20K     /software/apps
16K     /software/lost+found
12K     /software/downloads
4.0K    /software/tmp
4.0K    /software/backup
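
The --Free Disk-- table in the detailed disk output above can be scanned for filesystems that are close to full. The following Python sketch assumes the six-column layout shown there and mount points without spaces; parse_df, over_threshold and the 80% limit are illustrative choices, not values defined by the platform.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def parse_df(text: str) -> List[Row]:
    """Parse rows of: filesystem, blocks, used, available, capacity%, mount."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # The header splits into seven fields ("Mounted on") and is skipped.
        if len(parts) != 6 or not parts[4].endswith("%"):
            continue
        try:
            rows.append({
                "filesystem": parts[0],
                "blocks": int(parts[1]),
                "used": int(parts[2]),
                "available": int(parts[3]),
                "capacity": int(parts[4].rstrip("%")),
                "mount": parts[5],
            })
        except ValueError:
            continue  # not a data row
    return rows


def over_threshold(rows: List[Row], limit: int = 80) -> List[Row]:
    """Return rows whose capacity percentage is at or above the limit."""
    return [row for row in rows if row["capacity"] >= limit]


# A few rows copied from the --Free Disk-- table above.
SAMPLE = """\
Filesystem             1024-blocks   Used Available Capacity Mounted on
/dev/root                   407321 255478    130391      67% /oldroot
/dev/sdb                    807088 443988    321272      59% /software
/dev/loop2                  114877   9618     99327       9% /software/caf/work/repo-lxc/nt02Test/rootfs_mnt
/dev/loop4                   83956  68889     15067      83% /software/caf/work/repo-lxc/lxcapp/rootfs_mnt
"""

if __name__ == "__main__":
    for row in over_threshold(parse_df(SAMPLE)):
        print(row["capacity"], row["mount"])
```

With the sample rows, only the lxcapp root filesystem on /dev/loop4 is reported, at 83% capacity.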