Node Error Codes and Troubleshooting
Understanding common errors and their solutions helps you keep your node healthy.
Common Error Codes
Here are the most frequent errors you might encounter and their solutions:
Consensus Errors
When you encounter consensus errors, quick and appropriate action is essential:
Error: "Consensus failure - height halted"
Solution: Check for network upgrades or chain halts
Command: seid status | jq .SyncInfo
Error: "Private validator file not found"
Solution: Restore validator key or check file permissions
Location: $HOME/.sei/config/priv_validator_key.json
Error: "Duplicate signature"
Solution: IMMEDIATELY STOP NODE - potential double signing risk
Action: Check validator operation on other machines
Network Errors
Network errors can prevent your node from participating in consensus:
Error: "Dial tcp connection refused"
Solution: Check network connectivity and firewall rules
Commands:
- netstat -tulpn | grep seid
- ufw status
Error: "No peers available"
Solution: Verify peer connections and network config
Commands:
- curl localhost:26657/net_info
Database Errors
Database corruption can require immediate attention:
Error: "Database is corrupted"
Solution: Reset database or restore from backup
Commands:
- seid tendermint unsafe-reset-all
- cp -r backup/data $HOME/.sei/
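Because a reset discards the local database, it can be worth copying the data directory aside before running it. The following is a minimal sketch of that idea, not a seid feature: the function name backup_data_dir and the timestamped destination layout are assumptions made for illustration.

```shell
# Hypothetical helper: copy <home>/data to <backup-root>/data-<timestamp>.
# Usage: backup_data_dir HOME_DIR BACKUP_ROOT   (stop seid before calling it)
backup_data_dir() {
  src="$1/data"
  dest="$2/data-$(date +%Y%m%d-%H%M%S)"
  if [ ! -d "$src" ]; then
    echo "no data directory at $src" >&2
    return 1
  fi
  # Print the destination path on success so callers can record it
  mkdir -p "$2" && cp -r "$src" "$dest" && echo "$dest"
}

# Example (not run here):
#   systemctl stop seid
#   backup_data_dir "$HOME/.sei" "$HOME/sei-backups"
```

The printed path can later be used as the source for the restore command shown above.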
Diagnostic Commands
These commands help you investigate issues and monitor your node:
# Check node synchronization (the key may be SyncInfo or sync_info, depending on the build)
seid status | jq '.SyncInfo // .sync_info'
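# Check validator status (expects the operator address, not the consensus key; <key-name> is your validator key's name)
seid query staking validator $(seid keys show <key-name> --bech val -a)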
# Monitor real-time logs
journalctl -fu seid -o cat
# View system resource usage
top -p $(pgrep seid)
AppHash Mismatch Errors
If you encounter an AppHash mismatch, you’ll need to capture the state for comparison with a known good version:
# For SeiDB (most non-archive nodes):
git clone https://github.com/sei-protocol/sei-db.git
cd sei-db/tools
make install
systemctl stop seid
seidb dump-iavl -d $HOME/.sei/data/committer.db -o /home/ubuntu/iavl-dump
systemctl restart seid
# For Legacy IAVL DB:
seid debug dump-iavl <latest height>
Always include the app hash, commit hash, and block height from your logs when reporting issues.
Identifying AppHash Errors
AppHash errors typically appear in logs as:
ERR wrong Block.Header.AppHash. Expected [EXPECTED_HASH], got [ACTUAL_HASH]
block_id={"hash":"...","parts":{"hash":"...","total":1}} height=[HEIGHT]
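When reporting an AppHash error, the expected hash, actual hash, and height can be pulled out of the logs with a small filter. This is a sketch under two assumptions that may not hold for every log format: the hashes are printed as hex strings, and the height=... field sits on the same log line as the error message. The function name apphash_mismatches is illustrative.

```shell
# Read log lines on stdin; print "HEIGHT EXPECTED_HASH ACTUAL_HASH" per mismatch.
# Assumes hex hashes and a height=N field on the same line as the error.
apphash_mismatches() {
  grep 'wrong Block.Header.AppHash' |
    sed -n 's/.*Expected \([0-9A-Fa-f]*\), got \([0-9A-Fa-f]*\).*height=\([0-9]*\).*/\3 \1 \2/p'
}

# Example (not run here):
#   journalctl -u seid --no-pager | apphash_mismatches
```

Lines that do not match the assumed layout are silently skipped, so an empty result does not prove that no mismatch occurred.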
Common Causes:
- Using incorrect node version during sync (ensure you’re on the latest version)
- Corrupted or incorrectly applied snapshots
- Database inconsistencies from improper shutdowns
- Syncing with outdated or incompatible peers
Resolution Steps:
- Stop the node immediately.
- Try a node rollback first (see the Node Rollback section below).
- If the rollback fails, restore from a fresh snapshot:
  - Download a recent snapshot from a trusted provider (Polkachu, PublicNode)
  - Ensure you’re using the correct node version
  - Verify peer configurations are up to date
- Restart the node and monitor logs for continued errors.
Peer Connection Issues as AppHash Red Herrings
Important Note: Peer connection failures are often symptoms of underlying AppHash errors, not the root cause.
When you see extensive peer connection errors like:
ERR failed to handshake with peer
ERR failed to send request for peers
ERR peer handshake failed endpoint={} err=EOF
Don’t focus solely on fixing peer connections first. Instead:
- Scan your logs carefully for AppHash errors that may appear intermittently
- Look for the actual error pattern:
ERR wrong Block.Header.AppHash. Expected [HASH], got [HASH]
- Check if your node is stuck at a specific height despite peer connection attempts
Why This Happens:
- AppHash mismatches prevent proper block validation
- Node cannot advance to new blocks due to state inconsistency
- Peers may reject connections from nodes with corrupted state
- Network appears to be the problem when it’s actually a local state issue
Debugging Approach:
- First, check for AppHash errors in your logs (search for “wrong Block.Header.AppHash”)
- If AppHash errors are found, treat this as the primary issue
- Only focus on peer connection fixes if no AppHash errors exist
This approach can save hours of debugging time by addressing the root cause rather than symptoms.
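The debugging order above can be sketched as a one-pass log classifier that reports an AppHash problem ahead of any peer errors. The function name triage_log and its three output labels are assumptions for illustration; the matched strings are the log patterns quoted in this section.

```shell
# Read log lines on stdin and print the issue to investigate first:
#   apphash - at least one AppHash mismatch was logged (takes priority)
#   peers   - only peer handshake / PEX errors were logged
#   clean   - neither pattern was found
triage_log() {
  awk '
    /wrong Block\.Header\.AppHash/ { app = 1 }
    /failed to handshake with peer|peer handshake failed|failed to send request for peers/ { peer = 1 }
    END {
      if (app) print "apphash"
      else if (peer) print "peers"
      else print "clean"
    }
  '
}

# Example (not run here):
#   journalctl -u seid --since "1 hour ago" --no-pager | triage_log
```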
Peer Connection and Handshake Issues
Identifying Peer Issues:
Look for these error patterns in your logs:
ERR failed to handshake with peer err="expected to connect with peer \"[EXPECTED_ID]\", got \"[ACTUAL_ID]\""
ERR failed to send request for peers err="no available peers to send a PEX request to (retrying)"
ERR peer handshake failed endpoint={} err=EOF module=p2p
Common Causes:
- Outdated peer configurations with mismatched node IDs
- Network infrastructure changes on peer side
- Firewall blocking connections on port 26656
- DNS resolution issues
Resolution Steps:
- Update peer configurations with current node IDs:
# Example updated persistent peers for Sei mainnet
persistent-peers = "3be6b24cf86a5938cce7d48f44fb6598465a9924@p2p.state-sync-0.pacific-1.seinetwork.io:26656,b21279d7092fde2e41770832a1cacc7d0051e9dc@p2p.state-sync-1.pacific-1.seinetwork.io:26656,616c05e9ba24acc89c0de630b5e3adbedaebb478@p2p.state-sync-2.pacific-1.seinetwork.io:26656"
- Verify network connectivity:
# Test connection to peer endpoints
nc -zv p2p.state-sync-0.pacific-1.seinetwork.io 26656
# Check if port 26656 is open for inbound connections
netstat -tulpn | grep :26656
- Check current peer status:
curl http://localhost:26657/net_info | jq '.result.peers | length'
curl http://localhost:26657/lag_status | jq .
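To test every configured peer rather than a single endpoint, the persistent-peers value can be split into host and port pairs and fed to nc. The sketch below assumes entries of the form id@host:port separated by commas, with no surrounding whitespace; the function name peer_endpoints is illustrative.

```shell
# Turn "id@host:port,id@host:port" into one "host port" pair per line.
# Entries that do not match id@host:port are skipped.
peer_endpoints() {
  printf '%s\n' "$1" | tr ',' '\n' |
    sed -n 's/^[^@]*@\(.*\):\([0-9]*\)$/\1 \2/p'
}

# Example (not run here): probe each peer with a 3-second timeout.
#   peer_endpoints "$PEERS" | while read -r host port; do
#     nc -zv -w 3 "$host" "$port"
#   done
```

Here PEERS stands for the quoted value of persistent-peers copied from config.toml.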
Sync Performance Issues
Identifying Sync Problems:
Monitor these indicators:
# Check sync status and lag
curl http://localhost:26657/lag_status | jq .
# Monitor if height is progressing
curl http://localhost:26657/status | jq '.result.sync_info'
Common Solutions:
- Increase packet payload size for large block processing:
# In config.toml [p2p] section
max-packet-msg-payload-size = 1024000 # Increase from default 102400
- Optimize mempool settings in config.toml:
# In [mempool] section
keep-invalid-txs-in-cache = true
ttl-duration = "5s"
ttl-num-blocks = 5
- If the node gets stuck at a specific height:
  - Try restarting the node
  - If a restart doesn’t help, perform a rollback
  - Consider restoring from a fresh snapshot
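Before changing these settings, it helps to confirm what the node is currently configured with. The helper below is a rough sketch, not a full TOML parser: it assumes section headers appear exactly as [name] at the start of a line, that keys are written as key = value with spaces around the equals sign, and that values contain no # character. The name toml_get is hypothetical.

```shell
# Usage: toml_get FILE SECTION KEY
# Prints the value of KEY inside [SECTION], without any trailing comment.
toml_get() {
  awk -v section="$2" -v key="$3" '
    $0 == "[" section "]" { in_section = 1; next }  # entered the wanted section
    /^\[/                 { in_section = 0 }        # any other header ends it
    in_section && $1 == key && $2 == "=" {
      sub(/^[^=]*=[ \t]*/, "")   # drop "key = "
      sub(/[ \t]*#.*$/, "")      # drop a trailing comment
      print
      exit
    }
  ' "$1"
}

# Example (not run here):
#   toml_get "$HOME/.sei/config/config.toml" p2p max-packet-msg-payload-size
#   toml_get "$HOME/.sei/config/config.toml" mempool ttl-num-blocks
```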
Warning Signs to Watch For:
- Current height not increasing over time
- Increasing lag between current height and max peer height
- Repeated timeout errors in logs
- Mempool size consistently reaching limits
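The first warning sign, a height that stops increasing, can be checked by sampling the height twice and comparing the results. This is a minimal sketch: the function name height_progress is illustrative, and the commented polling example assumes the status response exposes the height at .result.sync_info.latest_block_height.

```shell
# Compare two block heights sampled some time apart.
# Prints "stalled" if the height did not increase, else "advanced N".
height_progress() {
  prev=$1
  cur=$2
  if [ "$cur" -gt "$prev" ]; then
    echo "advanced $((cur - prev))"
  else
    echo "stalled"
  fi
}

# Example (not run here): sample the RPC twice, 30 seconds apart.
#   h1=$(curl -s http://localhost:26657/status | jq -r '.result.sync_info.latest_block_height')
#   sleep 30
#   h2=$(curl -s http://localhost:26657/status | jq -r '.result.sync_info.latest_block_height')
#   height_progress "$h1" "$h2"
```

A single "stalled" result can be a short pause, so repeat the check a few times before concluding that the node is stuck.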
Crash and Panic Debugging
For crashes, panics, or nil pointer exceptions:
- Capture at least 1,000 lines of logs leading up to the crash
- Or collect 15 minutes of log data, whichever provides more context
- Include the full stack trace if available
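One way to apply the "whichever provides more context" rule is to take both captures and keep the longer one. The sketch below assumes seid runs as a systemd unit named seid, as in the journalctl command earlier in this guide; the function name larger_capture and the /tmp file names are illustrative.

```shell
# Print whichever of two capture files has more lines (ties go to the first).
larger_capture() {
  a=$(($(wc -l < "$1")))   # arithmetic expansion strips any padding from wc
  b=$(($(wc -l < "$2")))
  if [ "$a" -ge "$b" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}

# Example (not run here):
#   journalctl -u seid -n 1000 --no-pager > /tmp/seid-last-1000.log
#   journalctl -u seid --since "15 minutes ago" --no-pager > /tmp/seid-last-15m.log
#   larger_capture /tmp/seid-last-1000.log /tmp/seid-last-15m.log
```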
Logging Configuration
Proper logging configuration is essential for debugging and monitoring:
# In config.toml
# Set appropriate log level
log_level = "debug" # Use "trace" for maximum detail
# Choose log format
log_format = "json" # Use "plain" for human-readable logs
Configure log rotation to manage storage effectively:
# Example logrotate configuration
sudo tee /etc/logrotate.d/seid << EOF
/var/log/seid/*.log {
daily
rotate 14
compress
delaycompress
notifempty
create 0640 sei sei
sharedscripts
postrotate
systemctl reload seid
endscript
}
EOF
Enable core dumps for crash analysis:
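# Set unlimited core dump size (applies to the current shell only;
# for a systemd-managed seid, set LimitCORE=infinity in the unit file)
ulimit -c unlimited
# Configure core dump location (requires root)
echo "/tmp/core.%e.%p" | sudo tee /proc/sys/kernel/core_pattern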
Other Common Issues and Fixes
- Sync Problems
  - Check available disk space (df -h)
  - Ensure proper peer connections (curl http://localhost:26657/net_info)
  - Verify firewall settings (port 26656 open)
- Performance Issues
  - Monitor system resources (htop or iotop)
  - Check disk I/O performance (iostat)
  - Analyze network traffic (iftop)
- Database Issues
  - Run database integrity checks using:
    seid debug dump-db | grep -i error
    If errors are detected, consider restoring from a recent backup.
  - Consider pruning excessive historical data by adjusting ss-keep-recent in app.toml. As a last resort, wipe the local chain data and resync (this deletes all block data and keeps only the address book):
    seid unsafe-reset-all --home=$HOME/.sei --keep-addr-book
    Alternatively, manually remove old state snapshots to free up space:
    rm -rf $HOME/.sei/data/snapshots/*
- Node Rollback
To roll back a node that has hit an AppHash mismatch, first stop the node using your usual method (for example, systemctl stop seid).
Next, do a soft rollback with:
seid rollback
And then a hard rollback with:
seid rollback --hard
Then, restart the node.
If you see the following error while trying to roll back:
failed to initialize database: resource temporarily unavailable
This means the node was not shut down properly, and a seid process is most likely still holding the database lock. Shut down or kill the seid process directly. If this doesn’t help, restart your machine.
Then try the rollback steps again.
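The rollback steps and the lock check can be combined in a small wrapper that runs one command and flags the "resource temporarily unavailable" case. This is a sketch under stated assumptions: the function name rollback_step and the return code 2 are illustrative conventions, not seid behavior.

```shell
# Usage: rollback_step COMMAND [ARGS...]
# Runs the command, echoes its output, and returns:
#   2 if the output shows the database is locked,
#   otherwise the exit status of the command itself.
rollback_step() {
  status=0
  out=$("$@" 2>&1) || status=$?
  printf '%s\n' "$out"
  case $out in
    *'resource temporarily unavailable'*)
      echo 'database is locked: a seid process is probably still running' >&2
      return 2
      ;;
  esac
  return "$status"
}

# Example (not run here): soft rollback, then hard rollback, then restart.
#   systemctl stop seid
#   rollback_step seid rollback && rollback_step seid rollback --hard && systemctl start seid
```

Chaining the calls with && stops the sequence at the first failure, so the node is not restarted on top of a half-finished rollback.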