Performance, Monitoring and Alerting
Optimizing System Configuration
There is a virtually unlimited number of unique individual setups, and this document cannot cover them all. Even very similar builds and configurations can behave quite differently due to external factors, so your results may vary.
Here are some general guidelines to use as a starting point. Be cautious: make incremental changes, testing and observing before moving forward. Focus on only one specific area at a time - avoid changing memory, storage, and CPU configurations all at once, since diagnosing problems otherwise becomes nearly impossible.
Memory Management
The following settings in /etc/sysctl.conf can optimize memory usage and disk I/O patterns:
# Minimize swapping
vm.swappiness = 1
# Control disk write behavior
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
vm.dirty_expire_centisecs = 300
vm.dirty_writeback_centisecs = 100
Apply changes: sudo sysctl -p
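Note that percentage-based dirty ratios scale with installed RAM, so on large hosts vm.dirty_ratio = 10 can allow many gigabytes of unwritten data to accumulate. The back-of-the-envelope sketch below illustrates this (the 128 GB host size is an illustrative assumption); operators who find the result too large can use the absolute vm.dirty_bytes / vm.dirty_background_bytes knobs instead.

```shell
# Rough arithmetic: how many bytes vm.dirty_ratio = 10 permits to sit
# unwritten on a hypothetical 128 GB host (illustrative assumption).
ram_gb=128
dirty_ratio=10
dirty_bytes=$(( ram_gb * 1024 * 1024 * 1024 * dirty_ratio / 100 ))
echo "${dirty_bytes} bytes ($(( dirty_bytes / 1024 / 1024 / 1024 )) GiB) may be dirty before forced writeback"
```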
Network Stack
The following settings in /etc/sysctl.conf may improve network performance:
# Increase connection handling capacity
net.core.somaxconn = 32768
net.core.netdev_max_backlog = 32768
net.ipv4.tcp_max_syn_backlog = 16384
# Optimize buffer sizes
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
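As a reading aid, each tcp_rmem/tcp_wmem value is a triple of per-socket buffer sizes in bytes: the minimum, the default, and the maximum the kernel will auto-tune up to. The snippet below simply unpacks the triple used above; it is illustrative only.

```shell
# Unpack the min/default/max fields of the tcp_rmem triple above.
tcp_rmem="4096 87380 16777216"
set -- $tcp_rmem   # word-split into $1 $2 $3
echo "min=$1 bytes, default=$2 bytes, max=$(( $3 / 1024 / 1024 )) MiB"
```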
Storage Configuration
For NVMe drives, optimize I/O scheduling:
Storage Optimization Commands
# Set I/O scheduler (plain "sudo echo >" does not work across the redirect; use tee)
echo "none" | sudo tee /sys/block/nvme0n1/queue/scheduler
# Set read-ahead buffer (4096 x 512-byte sectors = 2 MiB)
sudo blockdev --setra 4096 /dev/nvme0n1
# Set I/O priority via a systemd drop-in (safer than editing the unit file directly)
sudo mkdir -p /etc/systemd/system/seid.service.d
sudo tee /etc/systemd/system/seid.service.d/io.conf << EOF
[Service]
IOSchedulingClass=realtime
IOSchedulingPriority=2
EOF
sudo systemctl daemon-reload
# Configure disk mount options
sudo tee -a /etc/fstab << EOF
/dev/nvme0n1p1 /data ext4 defaults,noatime,nosuid,nodev,noexec,commit=60 0 0
EOF
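Writing to /sys takes effect immediately but is lost on reboot. One way to make the scheduler choice persistent is a udev rule along these lines - a sketch, where the file name and device glob are assumptions to adapt to your hardware:

```
# /etc/udev/rules.d/60-io-scheduler.rules (file name is an assumption)
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"
```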
Infrastructure Monitoring
Monitoring is one of the most critical components of operating network infrastructure. This section covers monitoring, performance tuning, and alerting configuration for Cosmos SDK/Tendermint-based nodes.
Prometheus Setup
First, install Prometheus:
wget https://github.com/prometheus/prometheus/releases/download/v2.42.0/prometheus-2.42.0.linux-amd64.tar.gz
tar xvf prometheus-2.42.0.linux-amd64.tar.gz
Example Prometheus configuration:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'sei_node'
    static_configs:
      - targets: ['node1_ip:port']
    metrics_path: /metrics
  - job_name: 'sei_node_2'
    static_configs:
      - targets: ['node2_ip:port']
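The release tarball does not install Prometheus as a service. A minimal systemd unit along the following lines keeps it running across reboots; the binary path, config path, and prometheus user are assumptions to match your installation:

```
# /etc/systemd/system/prometheus.service (paths and user are assumptions)
[Unit]
Description=Prometheus
After=network-online.target

[Service]
User=prometheus
ExecStart=/usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with sudo systemctl daemon-reload && sudo systemctl enable --now prometheus.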
Grafana Integration
Install and configure Grafana:
sudo apt install -y apt-transport-https software-properties-common
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
sudo apt update && sudo apt install -y grafana
sudo systemctl enable --now grafana-server
Sample Grafana Dashboard JSON
{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": 1,
  "links": [],
  "panels": [
    {
      "alerting": {},
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "custom": {}
        },
        "overrides": []
      },
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 0
      },
      "hiddenSeries": false,
      "id": 2,
      "legend": {
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "7.2.0",
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "tendermint_consensus_height",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Block Height",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    }
  ],
  "schemaVersion": 26,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": []
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "",
  "title": "Sei Node Metrics",
  "uid": "sei_metrics",
  "version": 1
}
Alert Management
Install Alertmanager:
wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar xvf alertmanager-0.25.0.linux-amd64.tar.gz
Create Alert Rules Configuration
groups:
  - name: validator_alerts
    rules:
      - alert: NodeDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Node {{ $labels.instance }} down'
      - alert: BlockProductionSlow
        expr: rate(tendermint_consensus_height[5m]) < 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: 'Block production is slow on {{ $labels.instance }}'
      - alert: ValidatorMissedBlocks
        expr: increase(tendermint_consensus_validator_missed_blocks[1h]) > 0
        labels:
          severity: critical
        annotations:
          summary: 'Validator missing blocks'
      - alert: ValidatorJailed
        expr: tendermint_consensus_validator_status == 0
        labels:
          severity: critical
        annotations:
          summary: 'Validator has been jailed'
      - alert: ConsensusStalled
        expr: tendermint_consensus_height_status == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Consensus has stalled'
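The rules above only decide when alerts fire; Alertmanager decides where they are delivered. A minimal alertmanager.yml sketch, assuming a Slack webhook receiver (the api_url and channel are placeholders to replace with your own):

```
route:
  receiver: 'ops'
  group_by: ['alertname']
  repeat_interval: 4h
receivers:
  - name: 'ops'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/REPLACE_ME'
        channel: '#validator-alerts'
```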
Log Management
Loki Setup
Using Loki for log aggregation:
wget https://github.com/grafana/loki/releases/download/v2.8.0/loki-linux-amd64.zip
unzip loki-linux-amd64.zip
Promtail Configuration
server:
  http_listen_port: 9080
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: sei_logs
    static_configs:
      - targets:
          - localhost
        labels:
          job: seid_logs
          __path__: /var/log/seid/*.log
Log Rotation
Configure logrotate to manage log files:
sudo tee /etc/logrotate.d/sei << EOF
/var/log/seid/*.log {
    daily
    rotate 14
    compress
    delaycompress
    notifempty
    create 0640 sei sei
    sharedscripts
    postrotate
        systemctl reload seid
    endscript
}
EOF
Security Configuration
Network Security
UFW firewall configuration:
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 26656/tcp comment 'Sei P2P'
sudo ufw allow 26657/tcp comment 'Sei RPC'
sudo ufw allow 9090/tcp comment 'Sei gRPC'
sudo ufw enable
Rate Limiting
Example Nginx Configuration with Rate Limiting
http {
  limit_req_zone $binary_remote_addr zone=sei_rpc:10m rate=10r/s;

  server {
    # Assumes the node's own RPC listens only on 127.0.0.1, so nginx can own the public port
    listen 26657;
    location / {
      limit_req zone=sei_rpc burst=20 nodelay;
      proxy_pass http://localhost:26657;
    }
  }
}
Validator-Specific Monitoring
Status Query
Query validator status through SDK:
seid query staking validator $(seid keys show --bech val -a <validator_keyfile_name>)
Query through REST API:
curl -s "http://localhost:1317/cosmos/staking/v1beta1/validators/<valoper_address>"
Validator “Status” Query Script
#!/bin/bash
MONIKER="$1"
API_URL="http://localhost:1317/cosmos/staking/v1beta1/validators?pagination.limit=500"
echo "Querying validators from $API_URL..."
VALIDATOR_DATA=$(curl -s "$API_URL" | jq -c --arg MONIKER "$MONIKER" '.validators[] | select(.description.moniker == $MONIKER)')
if [[ -z "$VALIDATOR_DATA" ]]; then
  echo "❌ No validator found with moniker: $MONIKER"
  exit 1
fi
echo "Validator details:"
echo "$VALIDATOR_DATA" | jq '.'
Critical Metrics
Monitor these validator-specific metrics:
# Check signing status
seid query slashing signing-info $(seid tendermint show-validator)
# Check current delegations
seid query staking delegations-to $(seid keys show -a $VALIDATOR_KEY)
Oracle Price Feeder Monitoring
The price feeder exposes metrics at <listen_addr>/api/v1/metrics when telemetry is enabled in config.toml. Health status is available at <listen_addr>/api/v1/healthz.
Backup Management
Complete Automated Backup Script
#!/bin/bash
BACKUP_DIR="/backup/sei"
DATE=$(date +%Y%m%d)
NODE_HOME="/root/.sei"
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Stop service
systemctl stop seid
# Backup configuration
tar czf "$BACKUP_DIR/sei-config-$DATE.tar.gz" "$NODE_HOME/config"
# Backup data directory
tar czf "$BACKUP_DIR/sei-data-$DATE.tar.gz" "$NODE_HOME/data"
# Backup key files
tar czf "$BACKUP_DIR/sei-keys-$DATE.tar.gz" "$NODE_HOME/keyring-file"
# Start service
systemctl start seid
# Remove backups older than 7 days
find "$BACKUP_DIR" -type f -mtime +7 -name '*.tar.gz' -delete
# Log backup completion
echo "Backup completed successfully on $(date)" >> "$BACKUP_DIR/backup.log"
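To run the backup unattended, schedule the script with cron; the install path and the 03:00 daily schedule below are assumptions:

```
# crontab entry (script path and schedule are assumptions)
0 3 * * * /usr/local/bin/sei-backup.sh >> /backup/sei/backup-cron.log 2>&1
```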
Host System Monitoring
Resource Usage Tracking
Install and configure node_exporter:
wget https://github.com/prometheus/node_exporter/releases/download/v1.5.0/node_exporter-1.5.0.linux-amd64.tar.gz
tar xvf node_exporter-1.5.0.linux-amd64.tar.gz
Add to Prometheus configuration:
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
Performance Testing
Example Benchmark Script using eth_getLogs
import { ethers } from 'ethers';

// Configuration
const EVM_RPC_URL = 'http://localhost:8545'; // EVM RPC endpoint to test
const CONTRACT_ADDRESS = '0x0000000000000000000000000000000000001002'; // replace with very active contract for best results
const INITIAL_BLOCK_RANGE = 50; // range of blocks to query using 'eth_getLogs'
const RANGE_INCREMENT = 10; // additional blocks to query each consecutive round
const MAX_TESTS = 50; // total number of rounds for testing

// Store metrics for final analysis
const metrics = [];

function getResponseSize(logs) {
  return Buffer.byteLength(JSON.stringify(logs), 'utf8');
}

function formatBytes(bytes) {
  if (bytes === 0) return '0 B';
  const k = 1024;
  const sizes = ['B', 'KB', 'MB', 'GB'];
  const i = Math.floor(Math.log(bytes) / Math.log(k));
  return `${parseFloat((bytes / Math.pow(k, i)).toFixed(2))} ${sizes[i]}`;
}

function padString(str, length) {
  return String(str).padEnd(length);
}

function analyzeResults(metrics) {
  console.log('\nPerformance Analysis');
  console.log('='.repeat(50));

  // Filter out queries with no logs for meaningful statistics
  const queriesWithLogs = metrics.filter((m) => m.logsCount > 0);
  const totalQueries = metrics.length;

  console.log(`\nGeneral Statistics:`);
  console.log(`Total Queries Run: ${totalQueries}`);
  console.log(`Queries with Logs: ${queriesWithLogs.length}`);
  console.log(`Empty Responses: ${totalQueries - queriesWithLogs.length}`);

  if (queriesWithLogs.length > 0) {
    const avgResponseTime = queriesWithLogs.reduce((acc, m) => acc + m.responseTime, 0) / queriesWithLogs.length;
    const avgLogsPerQuery = queriesWithLogs.reduce((acc, m) => acc + m.logsCount, 0) / queriesWithLogs.length;
    const maxLogs = Math.max(...queriesWithLogs.map((m) => m.logsCount));
    const maxLogsQuery = queriesWithLogs.find((m) => m.logsCount === maxLogs);

    console.log(`\nPerformance Metrics:`);
    console.log(`Average Response Time (with logs): ${avgResponseTime.toFixed(2)}ms`);
    console.log(`Average Logs per Query: ${avgLogsPerQuery.toFixed(2)}`);
    console.log(`Maximum Logs in Single Query: ${maxLogs}`);
    if (maxLogsQuery) {
      console.log(`- At Range Size: ${maxLogsQuery.rangeSize} blocks`);
      console.log(`- Response Time: ${maxLogsQuery.responseTime}ms`);
      console.log(`- Efficiency: ${maxLogsQuery.logsPerMs.toFixed(3)} logs/ms`);
    }

    // Identify optimal range size based on logs/ms
    const bestEfficiency = queriesWithLogs.reduce((best, m) => (m.logsPerMs > best.logsPerMs ? m : best));
    console.log(`\nOptimal Performance:`);
    console.log(`Best Efficiency: ${bestEfficiency.logsPerMs.toFixed(3)} logs/ms`);
    console.log(`- At Range Size: ${bestEfficiency.rangeSize} blocks`);
    console.log(`- Retrieved ${bestEfficiency.logsCount} logs in ${bestEfficiency.responseTime}ms`);
  }
}

async function testEthGetLogs() {
  const provider = new ethers.JsonRpcProvider(EVM_RPC_URL);
  try {
    const latestBlock = await provider.getBlockNumber();
    console.log(`Latest block: ${latestBlock} (0x${latestBlock.toString(16)})`);

    let currentToBlock = latestBlock;
    let currentRange = INITIAL_BLOCK_RANGE;
    let testCount = 0;

    // Column headers with fixed widths
    console.log('\nBlock Range      Time  Logs    Size     B/ms    Logs/ms  KB/Log  Range');
    console.log('='.repeat(80));

    while (testCount < MAX_TESTS && currentToBlock > 0) {
      const fromBlock = Math.max(0, currentToBlock - currentRange);
      try {
        const startTime = Date.now();
        const filter = {
          fromBlock: fromBlock,
          toBlock: currentToBlock,
          address: CONTRACT_ADDRESS
        };

        const logs = await provider.getLogs(filter);
        const endTime = Date.now();
        const responseTime = endTime - startTime;
        const logsCount = logs.length;
        const responseSize = getResponseSize(logs);

        // Calculate metrics
        const bytesPerMs = (responseSize / responseTime).toFixed(1);
        const logsPerMs = (logsCount / responseTime).toFixed(3);
        const kbPerLog = logsCount > 0 ? (responseSize / 1024 / logsCount).toFixed(2) : 'N/A';

        // Store metrics for analysis
        metrics.push({
          rangeSize: currentRange,
          responseTime,
          logsCount,
          responseSize,
          bytesPerMs: parseFloat(bytesPerMs),
          logsPerMs: parseFloat(logsPerMs),
          kbPerLog: kbPerLog !== 'N/A' ? parseFloat(kbPerLog) : 0
        });

        // Format block range
        const rangeDisplay = `${fromBlock.toString(16)}-${currentToBlock.toString(16)}`;

        // Log with fixed column widths
        console.log(
          padString(rangeDisplay, 17) +
            padString(responseTime, 6) +
            padString(logsCount, 8) +
            padString(formatBytes(responseSize), 9) +
            padString(bytesPerMs, 8) +
            padString(logsPerMs, 9) +
            padString(kbPerLog, 8) +
            currentRange
        );

        if (logsCount === 10000) {
          console.log(`\nWarning: Hit 10000 log limit at range ${currentRange}`);
        }

        currentToBlock = fromBlock - 1;
        currentRange += RANGE_INCREMENT;
        testCount++;
      } catch (error) {
        console.log(`Error at range ${currentRange}: ${error.message}`);
        currentRange = Math.max(INITIAL_BLOCK_RANGE, currentRange - RANGE_INCREMENT);
        currentToBlock = fromBlock - 1;
        testCount++;
      }
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }

    // Perform final analysis
    analyzeResults(metrics);
  } catch (error) {
    console.error('Failed to initialize or get latest block:', error);
    process.exit(1);
  }
}

// Run the test
testEthGetLogs();
For specific customizations or additional metrics, consult the Sei technical communities on Telegram or Discord.