Performance, Monitoring and Alerting

Optimizing System Configuration

There is a virtually unlimited number of possible setups, and this document cannot cover them all. Even very similar builds and configurations can behave very differently due to external factors, so your results may vary.

Here are some general guidelines to use as a starting point. Be cautious: make incremental changes, and test and observe each one before moving on. Focus on one area at a time and avoid changing memory, storage, and CPU settings all at once; otherwise, diagnosing problems becomes nearly impossible.

Memory Management

The following settings in /etc/sysctl.conf can optimize memory usage and disk I/O patterns:


# Minimize swapping
vm.swappiness = 1
 
# Control disk write behavior
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
vm.dirty_expire_centisecs = 300
vm.dirty_writeback_centisecs = 100

Apply changes: sudo sysctl -p
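
After applying the settings, it can help to confirm that the running kernel actually picked them up. The following is a minimal sketch rather than an official tool: check_sysctl is a hypothetical helper name, and PROC_SYS_ROOT is an assumed override variable added only so the check can be pointed at a test directory instead of /proc/sys.

```shell
#!/usr/bin/env bash
# Sketch: compare live kernel parameters with the values set in /etc/sysctl.conf.
# PROC_SYS_ROOT is a hypothetical override (not a kernel or sysctl feature).
PROC_SYS_ROOT="${PROC_SYS_ROOT:-/proc/sys}"

# check_sysctl <key> <expected value>
check_sysctl() {
  local key="$1" expected="$2" path actual
  # vm.swappiness -> /proc/sys/vm/swappiness
  path="$PROC_SYS_ROOT/$(printf '%s' "$key" | tr '.' '/')"

  if [ ! -r "$path" ]; then
    echo "MISSING $key"
    return 2
  fi

  # Collapse tabs and newlines so multi-value keys compare cleanly.
  actual="$(tr -s '[:space:]' ' ' < "$path" | sed 's/ $//')"

  if [ "$actual" = "$expected" ]; then
    echo "OK $key = $actual"
  else
    echo "MISMATCH $key = $actual (expected $expected)"
    return 1
  fi
}
```

For example, check_sysctl vm.swappiness 1 and check_sysctl vm.dirty_ratio 10 report whether the live values match the ones listed above. For a single value, sysctl -n vm.swappiness gives the same information.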

Network Stack

The following settings in /etc/sysctl.conf may improve network performance:


# Increase connection handling capacity
net.core.somaxconn = 32768
net.core.netdev_max_backlog = 32768
net.ipv4.tcp_max_syn_backlog = 16384
 
# Optimize buffer sizes
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216

Storage Configuration

For NVMe drives, optimize I/O scheduling:

Storage Optimization Commands


# Set IO scheduler (tee is used because a plain redirect would not run with sudo privileges)
echo "none" | sudo tee /sys/block/nvme0n1/queue/scheduler
 
# Set read-ahead buffer
sudo blockdev --setra 4096 /dev/nvme0n1
 
# Set IO priority in systemd service
sudo tee -a /etc/systemd/system/seid.service << EOF
[Service]
IOSchedulingClass=realtime
IOSchedulingPriority=2
EOF
 
# Configure disk mount options
sudo tee -a /etc/fstab << EOF
/dev/nvme0n1p1 /data ext4 defaults,noatime,nosuid,nodev,noexec,commit=60 0 0
EOF
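
To confirm which scheduler is in effect, the kernel's scheduler file can be read back: it lists the available schedulers and marks the active one in square brackets. The helper below is a small sketch; active_scheduler is a hypothetical name, and the device path in the usage note assumes the same nvme0n1 device used above.

```shell
#!/usr/bin/env bash
# Sketch: report which I/O scheduler is active for a block device.
# The scheduler file looks like "[none] mq-deadline kyber", where the
# bracketed entry is the one currently in use.

# active_scheduler <path to queue/scheduler file>
active_scheduler() {
  local sched_file="$1"
  # Print only the text between the square brackets.
  sed -n 's/.*\[\([^]]*\)].*/\1/p' "$sched_file"
}
```

Running active_scheduler /sys/block/nvme0n1/queue/scheduler should print none once the setting is applied, and sudo blockdev --getra /dev/nvme0n1 reads back the read-ahead value. Values written under /sys do not survive a reboot, so they need to be reapplied at boot by whatever mechanism your distribution provides.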

Infrastructure Monitoring

Monitoring is one of the most critical components of network infrastructure. This section covers monitoring, performance tuning, and alerting configuration for Cosmos-SDK/Tendermint nodes.

Prometheus Setup

First, install Prometheus:


wget https://github.com/prometheus/prometheus/releases/download/v2.42.0/prometheus-2.42.0.linux-amd64.tar.gz
tar xvf prometheus-2.42.0.linux-amd64.tar.gz

Example Prometheus configuration:


global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'sei_node'
    static_configs:
      - targets: ['node1_ip:port']
    metrics_path: /metrics
  - job_name: 'node'
    static_configs:
      - targets: ['node2_ip:port']
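
YAML does not allow tab characters for indentation, so a configuration pasted with tabs will fail to load. The sketch below is one possible pre-flight check before restarting Prometheus: validate_prometheus_config is a hypothetical helper name, and it falls back to a tab check only when promtool (shipped in the Prometheus release tarball) is not on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: basic sanity check for a Prometheus configuration file.

# validate_prometheus_config <path to prometheus.yml>
validate_prometheus_config() {
  local cfg="$1"

  # YAML forbids tabs as indentation; list offending lines if any.
  if grep -n "$(printf '^\t')" "$cfg" >&2; then
    echo "tab-indented lines found in $cfg" >&2
    return 1
  fi

  # Run the full check when promtool is available.
  if command -v promtool >/dev/null 2>&1; then
    promtool check config "$cfg"
  else
    echo "promtool not found, skipped semantic check" >&2
  fi
}
```

A typical call would be validate_prometheus_config prometheus.yml, run from the directory that holds the configuration.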

Grafana Integration

Install and configure Grafana:


sudo apt install -y apt-transport-https software-properties-common
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
sudo apt update && sudo apt-get install grafana

Sample Grafana Dashboard JSON


{
	"annotations": {
		"list": [
			{
				"builtIn": 1,
				"datasource": "-- Grafana --",
				"enable": true,
				"hide": true,
				"iconColor": "rgba(0, 211, 255, 1)",
				"name": "Annotations & Alerts",
				"type": "dashboard"
			}
		]
	},
	"editable": true,
	"gnetId": null,
	"graphTooltip": 0,
	"id": 1,
	"links": [],
	"panels": [
		{
			"alerting": {},
			"aliasColors": {},
			"bars": false,
			"dashLength": 10,
			"dashes": false,
			"datasource": null,
			"fieldConfig": {
				"defaults": {
					"custom": {}
				},
				"overrides": []
			},
			"fill": 1,
			"fillGradient": 0,
			"gridPos": {
				"h": 8,
				"w": 12,
				"x": 0,
				"y": 0
			},
			"hiddenSeries": false,
			"id": 2,
			"legend": {
				"avg": false,
				"current": false,
				"max": false,
				"min": false,
				"show": true,
				"total": false,
				"values": false
			},
			"lines": true,
			"linewidth": 1,
			"nullPointMode": "null",
			"options": {
				"alertThreshold": true
			},
			"percentage": false,
			"pluginVersion": "7.2.0",
			"pointradius": 2,
			"points": false,
			"renderer": "flot",
			"seriesOverrides": [],
			"spaceLength": 10,
			"stack": false,
			"steppedLine": false,
			"targets": [
				{
					"expr": "tendermint_consensus_height",
					"interval": "",
					"legendFormat": "",
					"refId": "A"
				}
			],
			"thresholds": [],
			"timeRegions": [],
			"title": "Block Height",
			"tooltip": {
				"shared": true,
				"sort": 0,
				"value_type": "individual"
			},
			"type": "graph",
			"xaxis": {
				"buckets": null,
				"mode": "time",
				"name": null,
				"show": true,
				"values": []
			},
			"yaxes": [
				{
					"format": "short",
					"label": null,
					"logBase": 1,
					"max": null,
					"min": null,
					"show": true
				},
				{
					"format": "short",
					"label": null,
					"logBase": 1,
					"max": null,
					"min": null,
					"show": true
				}
			],
			"yaxis": {
				"align": false,
				"alignLevel": null
			}
		}
	],
	"schemaVersion": 26,
	"style": "dark",
	"tags": [],
	"templating": {
		"list": []
	},
	"time": {
		"from": "now-6h",
		"to": "now"
	},
	"timepicker": {},
	"timezone": "",
	"title": "Sei Node Metrics",
	"uid": "sei_metrics",
	"version": 1
}

Alert Management

Install Alertmanager:


wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar xvf alertmanager-0.25.0.linux-amd64.tar.gz

Create Alert Rules Configuration


groups:
  - name: validator_alerts
    rules:
      - alert: NodeDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Node {{ $labels.instance }} down'

      - alert: BlockProductionSlow
        expr: rate(tendermint_consensus_height[5m]) < 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: 'Block production is slow on {{ $labels.instance }}'

      - alert: ValidatorMissedBlocks
        expr: increase(tendermint_consensus_validator_missed_blocks[1h]) > 0
        labels:
          severity: critical
        annotations:
          summary: 'Validator missing blocks'

      - alert: ValidatorJailed
        expr: tendermint_consensus_validator_status == 0
        labels:
          severity: critical
        annotations:
          summary: 'Validator has been jailed'

      - alert: ConsensusStalled
        expr: tendermint_consensus_height_status == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Consensus has stalled'
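
The BlockProductionSlow threshold is easier to reason about as a block time. rate(tendermint_consensus_height[5m]) is the per-second increase in height averaged over five minutes, so a threshold of 0.1 corresponds to one block every 10 seconds. The helpers below are an illustrative sketch for picking a threshold for your own network; the function names and the slowdown factor are assumptions, not part of Prometheus.

```shell
#!/usr/bin/env bash
# Sketch: convert between average block time and a rate() threshold.

# blocks_per_second <average block time in seconds>
blocks_per_second() {
  awk -v t="$1" 'BEGIN { printf "%.3f\n", 1 / t }'
}

# slow_threshold <normal block time in seconds> <slowdown factor>
# Rate below which blocks are arriving <factor> times slower than normal.
slow_threshold() {
  awk -v t="$1" -v f="$2" 'BEGIN { printf "%.3f\n", 1 / (t * f) }'
}
```

For instance, blocks_per_second 10 prints 0.100, which matches the rule above. With an assumed normal block time of 0.5 seconds, slow_threshold 0.5 4 prints 0.500, meaning the alert would fire once blocks arrive four times slower than usual.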

Log Management

Loki Setup

Using Loki for log aggregation:


wget https://github.com/grafana/loki/releases/download/v2.8.0/loki-linux-amd64.zip
unzip loki-linux-amd64.zip

Promtail Configuration


server:
  http_listen_port: 9080
 
positions:
  filename: /tmp/positions.yaml
 
clients:
  - url: http://localhost:3100/loki/api/v1/push
 
scrape_configs:
  - job_name: sei_logs
    static_configs:
      - targets:
          - localhost
        labels:
          job: seid_logs
          __path__: /var/log/seid/*.log

Log Rotation

Configure logrotate to manage log files:


sudo tee /etc/logrotate.d/sei << EOF
/var/log/seid/*.log {
    daily
    rotate 14
    compress
    delaycompress
    notifempty
    create 0640 sei sei
    sharedscripts
    postrotate
        systemctl reload seid
    endscript
}
EOF

Security Configuration

Network Security

UFW firewall configuration:


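sudo ufw default deny incoming
sudo ufw default allow outgoing
# Allow SSH before enabling the firewall to avoid locking yourself out (adjust the port if sshd does not use 22)
sudo ufw allow 22/tcp comment 'SSH'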
sudo ufw allow 26656/tcp comment 'Sei P2P'
sudo ufw allow 26657/tcp comment 'Sei RPC'
sudo ufw allow 9090/tcp comment 'Sei gRPC'
sudo ufw enable

Rate Limiting

Example Nginx Configuration with Rate Limiting


http {
    limit_req_zone $binary_remote_addr zone=sei_rpc:10m rate=10r/s;
 
    server {
        listen 26657; # nginx and seid cannot both bind this port on the same address
        location / {
            limit_req zone=sei_rpc burst=20 nodelay;
            proxy_pass http://localhost:26657;
        }
    }
}
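
One way to see the limit in action is to send a burst of requests and count the response codes. By default nginx answers rejected requests with 503. The sketch below assumes curl is installed; tally_status_codes and burst_probe are hypothetical helper names, and the URL in the usage note is a placeholder for wherever the proxy is reachable.

```shell
#!/usr/bin/env bash
# Sketch: fire a burst of requests and tally the HTTP status codes.

# tally_status_codes: read one status code per line on stdin and print
# "<code> <count>" pairs, sorted by code.
tally_status_codes() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# burst_probe <url> <count>: send <count> concurrent requests.
burst_probe() {
  local url="$1" count="$2" i=0
  {
    while [ "$i" -lt "$count" ]; do
      # Print only the status code of each response.
      curl -s -o /dev/null -w '%{http_code}\n' "$url" &
      i=$((i + 1))
    done
    wait
  } | tally_status_codes
}
```

Something like burst_probe http://<proxy_host>:26657/status 50 should show a mix of 200 and 503 lines. With rate=10r/s, burst=20 and nodelay, roughly 21 requests (one plus the burst of 20) are expected to pass immediately and the rest to be rejected, though exact counts depend on timing.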

Validator-Specific Monitoring

Status Query

Query validator status through SDK:


seid query staking validator $(seid keys show --bech val -a <validator_keyfile_name>)

Query through REST API:


curl -s "http://localhost:1317/cosmos/staking/v1beta1/validators/<valoper_address>"

Validator “Status” Query Script


#!/bin/bash
 
MONIKER="$1"
API_URL="http://localhost:1317/cosmos/staking/v1beta1/validators?pagination.limit=500"
 
echo "Querying validators from $API_URL..."
 
VALIDATOR_DATA=$(curl -s "$API_URL" | jq -c --arg MONIKER "$MONIKER" '.validators[] | select(.description.moniker == $MONIKER)')
 
if [[ -z "$VALIDATOR_DATA" ]]; then
    echo "❌ No validator found with moniker: $MONIKER"
    exit 1
fi
 
echo "Validator details:"
echo "$VALIDATOR_DATA" | jq '.'

Critical Metrics

Monitor these validator-specific metrics:


# Check signing status
seid query slashing signing-info $(seid tendermint show-validator)
 
# Check current delegations
seid query staking delegations-to $(seid keys show --bech val -a $VALIDATOR_KEY)

Oracle Price Feeder Monitoring

The price feeder exposes metrics at <listen_addr>/api/v1/metrics when telemetry is enabled in config.toml. Health status is available at <listen_addr>/api/v1/healthz.

Backup Management

Complete Automated Backup Script


#!/bin/bash
# Abort on the first error so a failed step is never logged as a success
set -euo pipefail

BACKUP_DIR="/backup/sei"
DATE=$(date +%Y%m%d)
NODE_HOME="/root/.sei"
 
# Create backup directory
mkdir -p "$BACKUP_DIR"
 
# Stop service, and make sure it is restarted even if a backup step fails
systemctl stop seid
trap 'systemctl start seid' EXIT
 
# Backup configuration
tar czf "$BACKUP_DIR/sei-config-$DATE.tar.gz" "$NODE_HOME/config"
 
# Backup data directory
tar czf "$BACKUP_DIR/sei-data-$DATE.tar.gz" "$NODE_HOME/data"
 
# Backup key files
tar czf "$BACKUP_DIR/sei-keys-$DATE.tar.gz" "$NODE_HOME/keyring-file"
 
# Start service
systemctl start seid
trap - EXIT
 
# Remove backups older than 7 days
find "$BACKUP_DIR" -type f -mtime +7 -name '*.tar.gz' -delete
 
# Log backup completion
echo "Backup completed successfully on $(date)" >> "$BACKUP_DIR/backup.log"
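
A backup is only useful if it can be read back. As a complement to the script above, the following sketch lists each archive end to end, which catches truncated or corrupt files. verify_backup and verify_backup_dir are hypothetical helper names; the directory in the usage note is the BACKUP_DIR from the script.

```shell
#!/usr/bin/env bash
# Sketch: verify that backup archives are readable.

# verify_backup <archive.tar.gz>: succeed only if the whole archive can be listed.
verify_backup() {
  local archive="$1"
  if tar tzf "$archive" >/dev/null 2>&1; then
    echo "OK $archive"
  else
    echo "CORRUPT $archive"
    return 1
  fi
}

# verify_backup_dir <dir>: check every .tar.gz file in a directory.
# Returns 1 if at least one archive failed the check.
verify_backup_dir() {
  local dir="$1" archive failed=0
  for archive in "$dir"/*.tar.gz; do
    [ -e "$archive" ] || continue   # no archives present
    verify_backup "$archive" || failed=1
  done
  return "$failed"
}
```

Calling verify_backup_dir /backup/sei after each run prints one line per archive and returns a non-zero status if any of them cannot be read.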

Host System Monitoring

Resource Usage Tracking

Install and configure node_exporter:


wget https://github.com/prometheus/node_exporter/releases/download/v1.5.0/node_exporter-1.5.0.linux-amd64.tar.gz
tar xvf node_exporter-1.5.0.linux-amd64.tar.gz

Add to Prometheus configuration:


scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
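
Before wiring an exporter into Prometheus, it can be useful to read a metric straight from its endpoint. The helper below is a rough sketch that parses the Prometheus text format with awk; metric_value is a hypothetical name, and it assumes samples carry no trailing timestamp, so the value is the last field on the line.

```shell
#!/usr/bin/env bash
# Sketch: extract a single metric value from Prometheus text-format output.

# metric_value <metric name>: read metrics on stdin and print the value of
# the first sample whose name matches exactly (labels are ignored).
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                     # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)        # drop the label set, if any
      if (metric == name) { print $NF; exit }
    }'
}
```

For example, curl -s http://localhost:9100/metrics | metric_value node_load1 prints the one-minute load average reported by node_exporter. The same helper can be pointed at the node's own metrics endpoint to read tendermint_consensus_height.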

Performance Testing

Example Benchmark Script using `eth_getLogs`


import { ethers } from 'ethers';
 
// Configuration
const EVM_RPC_URL = 'http://localhost:8545'; // EVM RPC endpoint to test
const CONTRACT_ADDRESS = '0x0000000000000000000000000000000000001002'; // replace with very active contract for best results
const INITIAL_BLOCK_RANGE = 50; // range of blocks to query using 'eth_getLogs'
const RANGE_INCREMENT = 10; // additional blocks to query each consecutive round
const MAX_TESTS = 50; // total number of rounds for testing
 
// Store metrics for final analysis
const metrics = [];
 
function getResponseSize(logs) {
  return Buffer.byteLength(JSON.stringify(logs), 'utf8');
}
 
function formatBytes(bytes) {
  if (bytes === 0) return '0 B';
  const k = 1024;
  const sizes = ['B', 'KB', 'MB', 'GB'];
  const i = Math.floor(Math.log(bytes) / Math.log(k));
  return `${parseFloat((bytes / Math.pow(k, i)).toFixed(2))} ${sizes[i]}`;
}
 
function padString(str, length) {
  return String(str).padEnd(length);
}
 
function analyzeResults(metrics) {
  console.log('\nPerformance Analysis');
  console.log('='.repeat(50));
 
  // Filter out queries with no logs for meaningful statistics
  const queriesWithLogs = metrics.filter((m) => m.logsCount > 0);
  const totalQueries = metrics.length;
 
  console.log(`\nGeneral Statistics:`);
  console.log(`Total Queries Run: ${totalQueries}`);
  console.log(`Queries with Logs: ${queriesWithLogs.length}`);
  console.log(`Empty Responses: ${totalQueries - queriesWithLogs.length}`);
 
  if (queriesWithLogs.length > 0) {
    const avgResponseTime = queriesWithLogs.reduce((acc, m) => acc + m.responseTime, 0) / queriesWithLogs.length;
    const avgLogsPerQuery = queriesWithLogs.reduce((acc, m) => acc + m.logsCount, 0) / queriesWithLogs.length;
    const maxLogs = Math.max(...queriesWithLogs.map((m) => m.logsCount));
    const maxLogsQuery = queriesWithLogs.find((m) => m.logsCount === maxLogs);
 
    console.log(`\nPerformance Metrics:`);
    console.log(`Average Response Time (with logs): ${avgResponseTime.toFixed(2)}ms`);
    console.log(`Average Logs per Query: ${avgLogsPerQuery.toFixed(2)}`);
    console.log(`Maximum Logs in Single Query: ${maxLogs}`);
    if (maxLogsQuery) {
      console.log(`- At Range Size: ${maxLogsQuery.rangeSize} blocks`);
      console.log(`- Response Time: ${maxLogsQuery.responseTime}ms`);
      console.log(`- Efficiency: ${maxLogsQuery.logsPerMs.toFixed(3)} logs/ms`);
    }
 
    // Identify optimal range size based on logs/ms
    const bestEfficiency = queriesWithLogs.reduce((best, m) => (m.logsPerMs > best.logsPerMs ? m : best));
    console.log(`\nOptimal Performance:`);
    console.log(`Best Efficiency: ${bestEfficiency.logsPerMs.toFixed(3)} logs/ms`);
    console.log(`- At Range Size: ${bestEfficiency.rangeSize} blocks`);
    console.log(`- Retrieved ${bestEfficiency.logsCount} logs in ${bestEfficiency.responseTime}ms`);
  }
}
 
async function testEthGetLogs() {
  const provider = new ethers.JsonRpcProvider(EVM_RPC_URL);
 
  try {
    const latestBlock = await provider.getBlockNumber();
    console.log(`Latest block: ${latestBlock} (0x${latestBlock.toString(16)})`);
 
    let currentToBlock = latestBlock;
    let currentRange = INITIAL_BLOCK_RANGE;
    let testCount = 0;
 
    // Column headers with fixed widths
    console.log('\nBlock Range         Time  Logs    Size     B/ms   Logs/ms  KB/Log  Range');
    console.log('='.repeat(80));
 
    while (testCount < MAX_TESTS && currentToBlock > 0) {
      const fromBlock = Math.max(0, currentToBlock - currentRange);
 
      try {
        const startTime = Date.now();
        const filter = {
          fromBlock: fromBlock,
          toBlock: currentToBlock,
          address: CONTRACT_ADDRESS
        };
 
        const logs = await provider.getLogs(filter);
 
        const endTime = Date.now();
        const responseTime = endTime - startTime;
        const logsCount = logs.length;
        const responseSize = getResponseSize(logs);
 
        // Calculate metrics
        const bytesPerMs = (responseSize / responseTime).toFixed(1);
        const logsPerMs = (logsCount / responseTime).toFixed(3);
        const kbPerLog = logsCount > 0 ? (responseSize / 1024 / logsCount).toFixed(2) : 'N/A';
 
        // Store metrics for analysis
        metrics.push({
          rangeSize: currentRange,
          responseTime,
          logsCount,
          responseSize,
          bytesPerMs: parseFloat(bytesPerMs),
          logsPerMs: parseFloat(logsPerMs),
          kbPerLog: kbPerLog !== 'N/A' ? parseFloat(kbPerLog) : 0
        });
 
        // Format block range
        const rangeDisplay = `${fromBlock.toString(16)}-${currentToBlock.toString(16)}`;
 
        // Log with fixed column widths
        console.log(padString(rangeDisplay, 17) + padString(responseTime, 6) + padString(logsCount, 8) + padString(formatBytes(responseSize), 9) + padString(bytesPerMs, 8) + padString(logsPerMs, 9) + padString(kbPerLog, 8) + currentRange);
 
        if (logsCount === 10000) {
          console.log(`\nWarning: Hit 10000 log limit at range ${currentRange}`);
        }
 
        currentToBlock = fromBlock - 1;
        currentRange += RANGE_INCREMENT;
        testCount++;
      } catch (error) {
        console.log(`Error at range ${currentRange}: ${error.message}`);
        currentRange = Math.max(INITIAL_BLOCK_RANGE, currentRange - RANGE_INCREMENT);
        currentToBlock = fromBlock - 1;
        testCount++;
      }
 
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
 
    // Perform final analysis
    analyzeResults(metrics);
  } catch (error) {
    console.error('Failed to initialize or get latest block:', error);
    process.exit(1);
  }
}
 
// Run the test
testEthGetLogs();
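
For a quick spot check without Node.js, a single eth_getLogs request can also be timed with curl. This is a minimal sketch, not a replacement for the benchmark above: build_get_logs_payload and time_get_logs are hypothetical helper names, and the endpoint and address in the usage note are the same placeholder values used in the script.

```shell
#!/usr/bin/env bash
# Sketch: time one raw eth_getLogs JSON-RPC call with curl.

# build_get_logs_payload <from_block> <to_block> <address>
# Block numbers are decimal on input and hex-encoded in the request.
build_get_logs_payload() {
  local from="$1" to="$2" address="$3"
  printf '{"jsonrpc":"2.0","id":1,"method":"eth_getLogs","params":[{"fromBlock":"0x%x","toBlock":"0x%x","address":"%s"}]}\n' \
    "$from" "$to" "$address"
}

# time_get_logs <rpc_url> <from_block> <to_block> <address>
# Prints "<http_code> <response bytes> <total seconds>" using curl's timers.
time_get_logs() {
  local url="$1"
  shift
  curl -s -o /dev/null \
    -H 'Content-Type: application/json' \
    --data "$(build_get_logs_payload "$@")" \
    -w '%{http_code} %{size_download} %{time_total}\n' \
    "$url"
}
```

A call such as time_get_logs http://localhost:8545 100 150 0x0000000000000000000000000000000000001002 queries blocks 100 through 150 and prints the status code, response size, and elapsed time on one line.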

For specific customizations or additional metrics, consult the Sei technical communities in Telegram or Discord.