Skip to Content
OperateTroubleshooting

Error Codes and Troubleshooting of Nodes

Understanding common errors and their solutions helps maintain a healthy node operation.

Common Error Codes

Here are the most frequent errors you might encounter and their solutions:

Consensus Errors

When you encounter consensus errors, quick and appropriate action is essential:

Error: "Consensus failure - height halted" Solution: Check for network upgrades or chain halts Command: seid status | jq .SyncInfo
Error: "Private validator file not found" Solution: Restore validator key or check file permissions Location: $HOME/.sei/config/priv_validator_key.json
Error: "Duplicate signature" Solution: IMMEDIATELY STOP NODE - potential double signing risk Action: Check validator operation on other machines

Network Errors

Network errors can prevent your node from participating in consensus:

Error: "Dial tcp connection refused" Solution: Check network connectivity and firewall rules Commands: - netstat -tulpn | grep seid - ufw status
Error: "No peers available" Solution: Verify peer connections and network config Commands: - curl localhost:26657/net_info

Database Errors

Database corruption can require immediate attention:

Error: "Database is corrupted" Solution: Reset database or restore from backup Commands: - seid tendermint unsafe-reset-all - cp -r backup/data $HOME/.sei/

Diagnostic Commands

These commands help you investigate issues and monitor your node:

# Check node synchronization seid status | jq '.sync_info' # Check validator status seid query staking validator $(seid tendermint show-validator) # Monitor real-time logs journalctl -fu seid -o cat # View system resource usage top -p $(pgrep seid)

AppHash Mismatch Errors

If you encounter an AppHash mismatch, you’ll need to capture the state for comparison with a known good version:

# For SeiDB (most non-archive nodes): git clone https://github.com/sei-protocol/sei-db.git cd sei-db/tools make install systemctl stop seid seidb dump-iavl -d $HOME/.sei/data/committer.db -o /home/ubuntu/iavl-dump systemctl restart seid # For Legacy IAVL DB: seid debug dump-iavl <latest height>

Always include the app hash, commit hash, and block height from your logs when reporting issues.

Identifying AppHash Errors

AppHash errors typically appear in logs as:

ERR wrong Block.Header.AppHash. Expected [EXPECTED_HASH], got [ACTUAL_HASH] block_id={"hash":"...","parts":{"hash":"...","total":1}} height=[HEIGHT]

Common Causes:

  • Using incorrect node version during sync (ensure you’re on the latest version)
  • Corrupted or incorrectly applied snapshots
  • Database inconsistencies from improper shutdowns
  • Syncing with outdated or incompatible peers

Resolution Steps:

  1. Stop the node immediately.

  2. Try a node rollback first:, see here

  3. If rollback fails, restore from a fresh snapshot:

    • Download a recent snapshot from trusted providers (Polkachu, PublicNode)
    • Ensure you’re using the correct node version
    • Verify peer configurations are up to date
  4. Restart the node and monitor logs for continued errors

Peer Connection Issues as AppHash Red Herrings

Important Note: Peer connection failures are often symptoms of underlying AppHash errors, not the root cause.

When you see extensive peer connection errors like:

ERR failed to handshake with peer ERR failed to send request for peers ERR peer handshake failed endpoint={} err=EOF

Don’t focus solely on fixing peer connections first. Instead:

  1. Scan your logs carefully for AppHash errors that may appear intermittently
  2. Look for the actual error pattern:
    ERR wrong Block.Header.AppHash. Expected [HASH], got [HASH]
  3. Check if your node is stuck at a specific height despite peer connection attempts

Why This Happens:

  • AppHash mismatches prevent proper block validation
  • Node cannot advance to new blocks due to state inconsistency
  • Peers may reject connections from nodes with corrupted state
  • Network appears to be the problem when it’s actually a local state issue

Debugging Approach:

  1. First, check for AppHash errors in your logs (search for “wrong Block.Header.AppHash”)
  2. If AppHash errors are found, treat this as the primary issue
  3. Only focus on peer connection fixes if no AppHash errors exist

This approach can save hours of debugging time by addressing the root cause rather than symptoms.

Peer Connection and Handshake Issues

Identifying Peer Issues:

Look for these error patterns in your logs:

ERR failed to handshake with peer err="expected to connect with peer \"[EXPECTED_ID]\", got \"[ACTUAL_ID]\"" ERR failed to send request for peers err="no available peers to send a PEX request to (retrying)" ERR peer handshake failed endpoint={} err=EOF module=p2p

Common Causes:

  • Outdated peer configurations with mismatched node IDs
  • Network infrastructure changes on peer side
  • Firewall blocking connections on port 26656
  • DNS resolution issues

Resolution Steps:

  1. Update peer configurations with current node IDs:

    # Example updated persistent peers for Sei mainnet persistent-peers = "3be6b24cf86a5938cce7d48f44fb6598465a9924@p2p.state-sync-0.pacific-1.seinetwork.io:26656,b21279d7092fde2e41770832a1cacc7d0051e9dc@p2p.state-sync-1.pacific-1.seinetwork.io:26656,616c05e9ba24acc89c0de630b5e3adbedaebb478@p2p.state-sync-2.pacific-1.seinetwork.io:26656"
  2. Verify network connectivity:

    # Test connection to peer endpoints nc -zv p2p.state-sync-0.pacific-1.seinetwork.io 26656 # Check if port 26656 is open for inbound connections netstat -tulpn | grep :26656
  3. Check current peer status:

    curl http://localhost:26657/net_info | jq '.result.peers | length' curl http://localhost:26657/lag_status | jq .

Sync Performance Issues

Identifying Sync Problems:

Monitor these indicators:

# Check sync status and lag curl http://localhost:26657/lag_status | jq . # Monitor if height is progressing curl http://localhost:26657/status | jq '.result.sync_info'

Common Solutions:

  1. Increase packet payload size for large block processing:

    # In config.toml [p2p] section max-packet-msg-payload-size = 1024000 # Increase from default 102400
  2. Optimize mempool settings in config.toml:

    # In [mempool] section keep-invalid-txs-in-cache = true ttl-duration = "5s" ttl-num-blocks = 5
  3. If node gets stuck at specific height:

    • Try restarting the node
    • If restart doesn’t help, perform rollback
    • Consider taking a fresh snapshot

Warning Signs to Watch For:

  • Current height not increasing over time
  • Increasing lag between current height and max peer height
  • Repeated timeout errors in logs
  • Mempool size consistently reaching limits

Crash and Panic Debugging

For crashes, panics, or nil pointer exceptions:

  • Capture at least 1,000 lines of logs leading up to the crash
  • Or collect 15 minutes of log data, whichever provides more context
  • Include the full stack trace if available

Logging Configuration

Proper logging configuration is essential for debugging and monitoring:

# In config.toml # Set appropriate log level log_level = "debug" # Use "trace" for maximum detail # Choose log format log_format = "json" # Use "plain" for human-readable logs

Configure log rotation to manage storage effectively:

# Example logrotate configuration sudo tee /etc/logrotate.d/seid << EOF /var/log/seid/*.log { daily rotate 14 compress delaycompress notifempty create 0640 sei sei sharedscripts postrotate systemctl reload seid endscript } EOF

Enable core dumps for crash analysis:

# Set unlimited core dump size ulimit -c unlimited # Configure core dump location echo "/tmp/core.%e.%p" > /proc/sys/kernel/core_pattern

Other common Issues and Fixes

  1. Sync Problems

    • Check available disk space (df -h)
    • Ensure proper peer connections (curl http://localhost:26657/net_info)
    • Verify firewall settings (port 26656 open)
  2. Performance Issues

    • Monitor system resources (htop or iotop)
    • Check disk I/O performance (iostat)
    • Analyze network traffic (iftop)
  3. Database Issues

    • Run database integrity checks using:

      seid debug dump-db | grep -i error

      If errors are detected, consider restoring from a recent backup.

    • Consider pruning excessive historical data by adjusting ss-keep-recent in app.toml or running:

      seid unsafe-reset-all --home=$HOME/.sei --keep-addr-book

      Alternatively, manually remove old state snapshots to free up space:

      rm -rf $HOME/.sei/data/snapshots/*

Node Rollback

To rollback a node from an AppHashed state, you need to stop the node first. Do this in your preferred way.

Next, do a soft rollback with:

seid rollback

And then a hard rollback with:

seid rollback --hard

Then, restart the node.

In case you see the following error while trying to rollback:

failed to initialize database: resource temporarily unavailable

This means that you did not shutdown the node properly. Try to shutdown or kill the seid process directly in that case. If this doesn’t help, restart your machine.

Then try the rollback steps again.

Last updated on