This section provides information to help you use logs and daemon information, and provides useful commands for Appliance troubleshooting. The Appgate SDP appliance runs a customized version of Ubuntu 22.04. The information below is based on standard Linux, but includes some Appgate SDP-specific information where relevant. The User/device troubleshooting section has more information, particularly with respect to Gateways.
Health warnings and errors
The Collective performs about 50 Site, appliance, and functional healthchecks. The results of these healthchecks are shown in the dashboard in the Sites widget or the Appliances widget. From there you can get to the Appliance Health Details, where any warnings or errors will be shown. It is important to take any corrective actions before user access is impacted. The table below suggests some actions to take when your Collective is reporting that it is unhealthy:
| Source | Error Level | Urgency | Message | Action to be taken |
|---|---|---|---|---|
appliance | Offline | High | Current Controller cannot reach the appliance on appliance-hostname:443 | Verify the Appliance is running. Verify that the hostnames of the Controller(s) are resolvable on the appliance and that port 443 is open to the Controller(s). Verify the time on the appliance. SSH to the appliance and run: "nc -zv controller-hostname 443". This should return succeeded if the Collective is in TCP-SPA mode. If not, verify the network firewall for 'Man In the Middle' interference. |
appliance | Error | High | I/O Stalled | This error indicates an underlying storage issue. Disk I/O stalling for several seconds indicates either an availability issue or a capacity problem in the underlying storage system. Verify the hardware diagnostics or check your hypervisor stack for storage issues. |
appliance | Error | Low | I/O Error | This error is caused by the underlying storage system of the hardware or hypervisor. Verify that the storage is working correctly, run storage diagnostics, and verify potential read or write issues on the underlying disk system. I/O errors could lead to data corruption. |
appliance | Error | Medium | Geoip database missing | The appliance cannot download geoip data from https://bin.appgate-sdp.com nor https://updates.maxmind.com. Make sure your appliance has access to the internet and DNS is working correctly. If no geoip data is required or no external server connection is allowed, it can be disabled on the Settings > Global Settings page. |
appliance | Error | Medium | Failed to read ntp status | NTP status cannot be verified. Go to the appliance and verify if the command `sudo ntpq -np` returns any errors. Most likely the appliance has a DNS or connectivity issue, as it cannot receive the current time from the configured NTP servers. |
appliance | Error | High | Failed to perform Healthcheck for <appliance name> | The healthcheck service is not running on this appliance. Check cz-configd logs for more info. |
appliance | Error | Medium | Not connected to any Controller | The appliance is not able to reach any of the Controllers in the Collective. Make sure the appliance can reach Controller TCP port <default 443>. If UDP-SPA is enabled, make sure it can connect to UDP port 53 and 443 of the Controllers. Also, ensure that the time is set the same on the appliances. |
appliance | Error | High | Customization error | The appliance has a broken customization script. Download the appliance logs and verify the logs_by_daemon/cz-customization.log file. |
appliance | Error | High | Stuck initializing cloud instance | The appliance is expecting cloud-init information and is not receiving it. Verify your network settings and check with your cloud provider that it is sending the cloud-init information. Additionally, make sure DHCP settings for DNS and the default Gateway are enabled, as they are typically required on most platforms to receive cloud-init information. |
appliance | Error | Medium | The following services are not running: | This error is generated when certain daemons are not started when they should. Sign in to the appliance and verify the status of the daemon: `sudo systemctl status <daemon name>` |
appliance | Error | Medium | High volume usage <name> [X%] | The specific volume on the disk is >90% full. Check what is taking up space and remove files that are not required, such as old core dumps. |
Controller | Error | High | Unable to connect dbd instance | The Controller cannot reach the database daemon. Check cz-dbs for status and contact support. |
Controller | Error | High | Unexpected state for running Controller | Please contact support. |
Controller | Error | High | IP Pool <name> has X IPs allocated out of Y (Error on 90+% usage) | This Controller is running out of IPs from the IP pool. Check the IP Pools page in the admin UI under Identity > IP Pools and compare the currently used IPs with the total size of that pool. If the used IPs are almost equal to the total size, you can either lower the lease time (to clear out some IPs) or add additional ranges. When adding additional ranges, make sure those ranges are routed properly for each Site that is not using SNAT. If the currently used IPs are at least 50% lower than the total size, the likely reason is that your users can only reach one of the Controllers. Fix the connectivity issue and the other Controller will be able to assign the unused IPs. |
Gateway | Error | High | Failed to query cz-sessiond for status | The cz-sessiond daemon does not seem to be working correctly. Try restarting the daemon with sudo systemctl restart cz-sessiond. |
Gateway | Error | High | Very high number of active connections | The connection tracking table has reached 95% of capacity. Once it reaches 100%, your users might experience dropped application sessions. Verify the conntrack settings on the appliance with the following command: sudo sysctl -a | grep nf_conntrack, and compare the _max value with the _count value. If needed, the maximum can be raised with sudo sysctl -w net.nf_conntrack_max=<New Value>. In 6.1 and 6.2 this requires a customization; in later versions, check the cz-config command to change conntrack limits. |
Portal | Error | High | Failed to query cz-nginx@urlaccess for status | Run sudo systemctl status cz-nginx@urlaccess to check that it is started correctly. |
Portal | Error | High | cz-nginx@urlaccess: Shared memory size is not enough to save all the HTTP up Action objects + the auxiliary data | Add more memory, or reduce the number of HTTP Up actions. |
LogServer | Error | Medium | Opensearch is down or starting up. | Run sudo systemctl status cz-opensearch to check that it is started correctly. |
LogServer/LogForwarder | Error | Medium | cz-logd: Unable to connect to elasticsearch | LogServer or LogForwarder is unable to communicate with Elasticsearch. Check to see if Elasticsearch is down. |
LogServer/LogForwarder | Error | Medium | cz-logd: Unable to prepare POST request for inserting data into elasticsearch | Unable to post inserts into Elasticsearch. Check connectivity to the http port of Elasticsearch. Check that the configuration of Elasticsearch matches the configuration of the LogForwarder. |
LogServer/LogForwarder | Error | Medium | cz-logd: Unable to create health request for elasticsearch | LogForwarder or LogServer is unable to query health information from Elasticsearch. Check the connectivity and configuration of Elasticsearch, or in the case of a LogServer, if the Elasticsearch service is running at all. |
LogServer/LogForwarder | Error | Medium | cz-logd: Unable to create index into elasticsearch, got status: X | LogServer/LogForwarder is unable to create indexes in Elasticsearch. Check the Elasticsearch configuration. |
LogServer/LogForwarder | Error | Medium | cz-logd: Elasticsearch status is not green/yellow | The Elasticsearch status seen by the LogServer/LogForwarder is not green. Check the Elasticsearch server status and restore it to green/yellow. |
LogForwarder | Error | Medium | cz-logd: Unable to get stream, does it exist?, streamname: X, details: Y | LogForwarder is configured with a stream that does not exist in AWS. Check configured name or create it in AWS. |
LogForwarder | Error | Medium | cz-logd: Unable to get delivery stream, does it exist?, streamname: %v, details: %v | LogForwarder is unable to get delivery stream. Check IAM roles for the Kinesis output. |
LogForwarder | Error | Medium | cz-logd: Could not compile filters, X | LogForwarder has been configured with a filter that does not compile. Check the configuration of filters for the LogForwarder. |
LogForwarder | Error | Medium | cz-logd: No credials provided | LogForwarder, Kinesis output: no credentials or faulty credentials provided. Check the Kinesis LogForwarder configuration. |
LogForwarder | Error | Medium | cz-logd: Unable to get AWS region, details: X | LogForwarder is unable to get info about an AWS region. Ensure that the correct region is configured for the Kinesis output. |
LogForwarder | Error | Medium | cz-logd: Unable to create AWS session, details: X | LogForwarder is unable to create an AWS session using the AWS SDK. |
appliance | Warning | Low | Geoip database was last updated X days ago | The appliance missed receiving the latest geoip data from https://bin.appgate-sdp.com or https://updates.maxmind.com. Make sure your appliance has access to the internet and DNS is working correctly. You can manually force an update using sudo /etc/cron.daily/geoIpDbUpdate --force |
appliance | Warning | Medium | High volume usage <name> [X%] | The specific volume on the disk is >75% full. Check what is taking up space and remove files that are not required, such as old core dumps. |
appliance | Warning | Medium | Certificate with subject <name> for <appliance name> has expired. You must replace it now if it's in use. | When an appliance certificate has expired, this warning appears and the appliance stops accepting connections. Since version 6.1, certificates renew automatically, so this message should appear only if the appliance was offline. |
appliance | Warning | Medium | Certificate with subject <name> for <appliance name> is expiring. You must replace it before <date>. | A 30-day warning is given when an appliance certificate is about to expire. Use the renew certificate option for the appliance in the System > Appliances menu. Renewing the certificate will restart all services on this appliance. |
appliance | Warning | Medium | The following services have debug logs enabled: | Running with debug logs enabled may harm performance. Switch back to normal logs as soon as possible. |
appliance | Warning | Low | Configuration from Controller is incompatible with this appliance | The configuration from the Controller does not match the configuration format of this appliance. This might be because of a version incompatibility. |
appliance | Warning | Low | cz-ffwd: Unable to connect to X@X (X) | The appliance is unable to establish a websocket connection to the LogServer/LogForwarder. Check connectivity between appliances (default TCP port 443, UDP ports 53 and 443). |
appliance | Warning | Low | This system has a CD drive attached | The appliance is running on VMware and still has a CD drive attached. Go to the VMware console and remove the attached CD drive from the virtual machine. |
Controller | Warning | High | There are more X than your license allows. | You have exceeded the licensed number of users. Only the first licensed users will be granted access. |
Controller | Warning | High | IP Pool <name> is too small to be utilized by this Controller. | This error occurs if you assign an IP pool that has fewer IPs than there are Controllers. For example, a /30 with 6 Controllers will give this error. |
Controller | Warning | High | Controller appliance certificate does not include the Client profile DNS name <name>. You must renew it to allow Client connections to this Controller. | The Client Profile DNS name included in the Client profile is not present as a SAN in the appliance's certificate. The certificate can be renewed from the Appliances page in the admin UI. |
Controller | Warning | Medium | X of the user licenses are in use. | You are almost running out of user licenses. Contact support or sales to update your license count. |
Controller | Warning | Medium | X of the Portal licenses are in use. | You are almost out of Portal licenses. Contact support or sales to update your license count. |
Controller | Warning | Medium | X of the service licenses are in use. | You are almost out of service licenses. Contact support or sales to update your license count. |
Controller | Warning | Medium | Controller is running in maintenance mode | The Controller is running in maintenance mode due to an ongoing upgrade. If the upgrade failed and your node is still in maintenance mode, you can take it out of maintenance with the following command: `sudo cz-config set -j Controller/maintenance false`. Use this only when the upgrade has been cancelled. |
Controller | Warning | Medium | Database node not replicating | This Controller is unable to replicate the database with another Controller. Check the connectivity between the two Controllers. Bidirectional connectivity is required. |
Controller | Warning | Medium | IP Pool <name> has X IPs allocated out of Y (warning between 75-90% usage) | This Controller is running out of IPs from the IP pool. First, check the IP Pools page in the admin UI under Identity > IP Pools and compare the currently used IPs with the total size of that pool. If the used IPs are almost equal to the total size, you can either lower the lease time (to clear out IPs) or add additional ranges. When adding additional ranges, make sure those ranges are routed properly for each Site that is not using SNAT. If the currently used IPs are at least 50% lower than the total size, the likely reason is that your users can only reach one of the Controllers. Fix the connectivity issue and the other Controller will be able to assign the unused IPs. |
Controller | Warning | Low | BDR conflict | This error occurs when different Controllers have conflicting versions of the data, typically after a temporary network connectivity issue between Controllers. Most of these conflicts are automatically resolved by accepting the latest update of the record. You can run the following command to resolve the remaining conflicts: `sudo cz-config bdr resolve-conflicts`. If it keeps appearing, please contact support. |
Controller | Warning | Low | The following are using deprecated Risk Based Access feature. Please migrate them to Condition Based Access. <Entitlement name>, <entitlement name> | The same functionality can be achieved using Conditions and checking the risk score criteria. |
LogForwarder | Warning | Medium | cz-logd: Not connected to X | An output named X in the LogForwarder configuration is not connected. Check connectivity from the LogForwarder to the log destination. |
LogForwarder | Warning | Medium | cz-logd: Unable to perform log-retention, check access or elasticsearch status. Details: X | LogForwarder has an Elasticsearch output configured but is unable to remove indexes in Elasticsearch. Check the connectivity or configuration of Elasticsearch. |
LogForwarder | Warning | Medium | cz-logd: Failed to send logs to X | Connectivity for an HTTP-based output is not working. Check connectivity from the LogForwarder to output destination X. |
LogForwarder | Warning | Medium | cz-logd: Kinisis was unable to handle all records, not enough shards? | LogForwarder has Kinesis configured for output. The Kinesis output is being throttled and might require additional shards. |
LogForwarder | Warning | Medium | cz-logd: Kinesis error codes: %v | LogForwarder has Kinesis configured for output, but AWS is returning an error code. In many cases these are IAM based errors that need to be fixed in the AWS configuration. |
LogForwarder | Warning | Medium | cz-logd: Firehose was unable to handle all records, throughput exceeded? | LogForwarder has Kinesis Firehose configured and is being throttled. More resources need to be configured on the AWS side to handle the load. |
LogForwarder | Warning | Medium | cz-logd: unable to connect to LogForwarding destination %s (%s) (TLS) %v | LogForwarder has a TCP-based output configured. Check connectivity towards the configured output. |
LogForwarder | Warning | Medium | cz-logd: tcp output (%s) is slow, incoming amount of logs exceeds outgoing amout | More logs are being generated than can be sent. This might indicate a slow destination SIEM, a very large volume of logs being generated by the appliance, or a very slow connection to the destination SIEM. |
Gateway | Warning | High | Gateway appliance certificate does not include the profile hostname <name>. You must renew it to allow client connections to this gateway. | The Client hostname included in the configuration is not present as a SAN in the appliance's certificate. The certificate can be renewed from the Appliances page in the admin UI. |
Gateway | Warning | High | cz-sessiond: High watermark event queue | The Gateway is struggling to keep up with firewall rule generation and sessions. Make sure you are running version 6.1.x or later; add more Gateways to handle the load; refactor Entitlements to be less demanding; or change dynamic rules to be more static. |
Gateway | Warning | High | cz-sessiond: No revocation has been received for X secs | Gateway-to-Controller communication is not working. The Controller is unable to push the revocation list to the Gateway, and the Gateway is unable to pull it from the Controller. |
Gateway | Warning | Medium | The following DNS names are unstable: <DNS names> | The DNS names used in the Entitlements are not always returning the same answer. This causes the firewall rules to be updated constantly, and could also lead to different IPs being resolved on the Client vs the Gateways. To solve this, create a DNS Policy for the DNS name or domains and add it to the DNS Forwarder configuration. Then replace the dns://<name> in the Entitlement with the domain name (for example, *.domain.com). This ensures the DNS result sent to the Client is the same as the one used by the Gateway. This is typically needed for public DNS names. |
Gateway | Warning | Medium | cz-sessiond: The following applications are reported as unhealthy: X, Y, Z | Gateway has flagged the applications X, Y, Z as unhealthy. See manual about App Monitoring feature. |
Connector | Warning | Low | Connector client X: Waiting for configuration Connection failed. | The Client cannot connect to the Controller. Check the Global Client Profile DNS name and make sure the Connector can resolve it. |
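Several of the actions in the table above (the Offline, "Not connected to any Controller", and cz-ffwd rows) come down to checking TCP reachability between appliances. Where nc is not installed, the same check can be approximated with bash's built-in /dev/tcp redirection. This is a minimal sketch, not an Appgate tool; `port_open` is an illustrative helper name and controller.example.com is a placeholder hostname:

```shell
# Approximate "nc -zv <host> <port>" using bash's /dev/tcp.
# Prints "open" if a TCP connection succeeds within 2 seconds, else "closed".
port_open() {
  # $1 = host, $2 = port
  if timeout 2 bash -c "cat < /dev/null > /dev/tcp/$1/$2" 2>/dev/null; then
    echo open
  else
    echo closed
  fi
}

# Placeholder hostname -- substitute your Controller's hostname and port (default 443).
port_open controller.example.com 443
```

Note that this only tests the TCP handshake; with UDP-SPA enabled, the Controller will not accept a bare TCP connection, which is why the guide limits the nc check to TCP-SPA mode.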
Run Commands (admin UI)
Run Commands will open the Remote Commands window.
There are eight limited remote commands which can be run on this appliance, avoiding any immediate need to SSH to remote machines to perform basic diagnostics.
| • address show • dig • ip route show • netcat • ntpq • ping • tcpdump • traceroute |
Most of the commands have a Timeout field that accepts a value in seconds.
NOTE
The max number of concurrently running commands allowed is five.
Daemon Log commands (SSH)
journalctl | To see live logs: `sudo journalctl -f`
To show a specific service: `sudo journalctl -u <daemon name>`
To show logs in reverse order: `sudo journalctl -r`
To show logs since last boot: `sudo journalctl -b`
To show logs in a specific time range: `sudo journalctl --since "2025-01-01 09:00" --until "2025-01-01 17:00"`
All the above flags can be combined. For more information, see: http://manpages.ubuntu.com/manpages/jammy/man1/journalctl.1.html |
SYSLOG | journalctl reads the binary journal logs. However, if the binary logs are corrupt, you can fall back to the plain-text syslog files under /var/log/. |
Saving logs | Appgate SDP has two types of log records: daemon logs and audit logs. Daemon logs are used to examine the workings of the Appgate SDP system; audit logs record the actions performed by the system. Logs are automatically saved on the appliance. Logs can be downloaded from System > Appliances for local examination, or copied to another machine using the secure copy commands scp or sftp.
Debug log level: the types of system events stored in the appliance's Debug Log depend on the appliance Debug log level setting. For details of changing Debug log levels, refer to: System Logs > Debug Logs |
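After copying logs off the appliance with scp or sftp, a quick first pass is often just counting error lines in a daemon log. A small illustrative helper follows; the function name and the cz-sessiond.log example filename are not part of the product (only the logs_by_daemon/ directory is mentioned elsewhere in this guide):

```shell
# Count lines containing "error" or "fatal" (case-insensitive) in a
# downloaded daemon log file.
count_log_errors() {
  grep -c -i -E 'error|fatal' "$1"
}

# Example (placeholder path):
#   count_log_errors logs_by_daemon/cz-sessiond.log
```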
Troubleshooting commands (SSH)
You can use the SSH command line to run the following troubleshooting commands:
NOTE
There is a list of more cz-config commands in the cz-setup and cz-config commands section.
Current version |
|
Current state of the appliance |
|
Up time and system load | `uptime` |
Reboot the appliance (Some Clients may need to reconnect) | `sudo reboot` |
Restart appliance services in lieu of 'reboot' command | sudo service cz-configd restart |
Restart an arbitrary daemon | `sudo systemctl restart <daemon name>` |
Bypass root (requires access to GRUB menu) | Reboot the appliance. Press 'e' to edit the GRUB menu. Append
|
Collect appliance diagnostics |
This command will collect a full set of appliance diagnostic information, including license usage, and save it to /tmp/cz-system-info.txt.gz. The --full option dumps additional information about the state of certain processes. The file will be owned by root. Alternatively, running the command(s) below will make the file owned by the cz user, which may make it easier to scp or sftp it from the appliance.
It can be downloaded from there using SCP; example command: `scp cz@<appliance hostname>:/tmp/cz-system-info.txt.gz .` |
Remove a core dump (and warning in dashboard) | Core dumps are stored under /mnt/data/core; use sudo rm to remove any files from there. |
Update the geoIP database now | `sudo /etc/cron.daily/geoIpDbUpdate --force` |
View the memory available and used on the appliance | `free -h` |
View the processes, for example those using the most CPU | `top` |
View running processes | `ps aux`
Alternatively, issue: `ps -ef` |
View the current system firewall rules | `sudo iptables -L -n -v`
For IPv6: `sudo ip6tables -L -n -v` |
View network interface addresses | `ip address show`
For IPv6: `ip -6 address show` |
View routes | `ip route show`
For IPv6: `ip -6 route show` |
View configuration files | Daemons: The configuration file for each Appgate SDP daemon is stored under
System: the current combined system configuration is stored under
To view these files use the commands jql (local) and jqr (remote) from anywhere in the file system. The previous combined system configuration is stored under |
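The core dump row above recommends removing old files from /mnt/data/core. One cautious approach, sketched below with an arbitrary 7-day threshold, is to list candidates first and keep the actual deletion as a separate, commented-out step:

```shell
# List core dump files older than 7 days so they can be reviewed
# before deletion. The 7-day threshold is an example, not a product default.
old_core_dumps() {
  find "$1" -type f -mtime +7 -print 2>/dev/null
}

old_core_dumps /mnt/data/core

# After reviewing the output, delete with:
#   sudo find /mnt/data/core -type f -mtime +7 -delete
```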
These advanced tools are detailed here to help recover from situations where adding or removing a Controller has failed. Please contact support before trying to use any of these tools. |
Display human-readable BDR status for all Controller databases
Show also the nodes already parted when showing the status (--show-parted-nodes)
Exclude RAFT status
Output JSON
Table format when not outputting JSON. Use psql for compatibility with terminals like PuTTY (VALUES: fancy_grid, psql)
Force appliance to single_controller_ready state, use with caution
Force appliance to appliance_ready state, use with caution
Clear all BDR barriers, use with caution
Forcefully remove a node from BDR on the current node
Take the current BDR leave barrier in the name of a dead node
Enable IP allocation for current node
Disable IP allocation for current node
Re-partition IP allocations to match current controllers
Run the clean-up query on this node only; by default it runs on every node
Don't delete the conflict history
Access help |
System internals (excluding LogServer) showing Daemons

Purple | Represents the communication between appliances inside the Collective. All communication occurs over TCP port 443 using mTLS. |
Blue | Represents all user metadata flows. The Client connects over port 443 with mTLS to the Controller and/or Gateway, where the initial packet is handled by spaD and proxyD. Subsequent packets are then routed using unix domain sockets. |
Red | Represents user application traffic that travels over the mTLS tunnel. |
Black | Represents internal appliance traffic between daemons. |
Green | Represents appliance-initiated traffic for DNS resolution, API connectivity, or ARP traffic. |
