Module 14 (Advanced)

System Health & Monitoring

Develop a proactive monitoring strategy for PI System: health checks, performance monitoring with PerfMon and PI SMT, alerting strategies, root cause analysis workflows, and common failure resolutions.

75 min, 8 topics, 1 code example

Proactive Monitoring Philosophy

The best PI System administrators catch problems before they impact operations. A proactive monitoring strategy has three layers:

Layer 1: Real-time Alerts (< 5 min response)
  - Interface disconnections
  - Buffer overflow
  - Archive fill > 90%
  - Service failures

Layer 2: Trend Monitoring (hourly review)
  - Event rates declining
  - Latency increasing
  - Disk usage trending up (free space shrinking)
  - CPU/memory creep

Layer 3: Scheduled Audits (daily/weekly)
  - Data quality review
  - Security audit
  - Archive management
  - Backup validation
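
The Layer 2 items are slow drifts rather than sudden threshold crossings, so a simple slope over recent samples can help surface them during the hourly review. The following is a minimal sketch, assuming evenly spaced readings have already been collected by some other means; the function names and the sample values are hypothetical.

```python
from statistics import mean


def trend_slope(samples):
    """Least-squares slope of evenly spaced samples (units per sample interval)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def is_declining(samples, tolerance=0.0):
    """True when the fitted slope is more negative than the tolerance."""
    return trend_slope(samples) < -tolerance


# Hypothetical hourly event-rate readings (events/sec), for illustration only.
hourly_event_rate = [100, 98, 97, 93, 90, 88]
print(is_declining(hourly_event_rate))
```

The same slope can be read the other way for latency, disk usage, or memory, where an upward trend is the concern. The tolerance is an assumption to tune against normal day-to-day variation.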

PI System Health Checks

Interface Health Metrics

Event Rate (events/sec):
  Normal:   matches expected scan rate
  Warning:  < 50% of expected rate
  Critical: 0 events/sec (interface disconnected)

Buffer Queue Depth:
  Normal:   < 1,000 events
  Warning:  > 10,000 events (backlog building)
  Critical: > 100,000 events (data at risk)
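
The thresholds above can be sketched as two small classification functions. This is an illustrative sketch only: the function names are hypothetical, and because the list does not say how to treat a queue depth between 1,000 and 10,000 events, the sketch assumes anything at or below 10,000 is still normal.

```python
def classify_event_rate(actual, expected):
    """Classify an interface event rate (events/sec) against its expected scan rate."""
    if expected <= 0:
        raise ValueError("expected rate must be positive")
    if actual <= 0:
        return "critical"  # 0 events/sec: interface disconnected
    if actual < 0.5 * expected:
        return "warning"   # below 50% of the expected rate
    return "normal"


def classify_queue_depth(events):
    """Classify buffer queue depth (number of queued events)."""
    if events > 100_000:
        return "critical"  # data at risk
    if events > 10_000:
        return "warning"   # backlog building
    # Assumption: the 1,000 to 10,000 band is not defined above; treated as normal.
    return "normal"


print(classify_event_rate(actual=4.0, expected=10.0))
print(classify_queue_depth(50_000))
```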

Archive Health

Archive Fill Monitoring:
  Warning:  80% fill
  Critical: 95% fill (create new archive immediately)

PI System Tags to Monitor:
  System.Archive.CurrentArchive.PercentFull
  System.Archive.CurrentArchive.Name
  System.Snapshot.QueueCount
  System.Snapshot.EventsPerSec
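
To make the fill thresholds actionable, it can help to estimate how long remains before the critical level is reached. The sketch below assumes the percent-full value and its recent growth rate have already been read from the system; how those values are retrieved is outside its scope, and the function names are hypothetical.

```python
def archive_fill_status(percent_full):
    """Map archive fill percentage to the warning and critical levels listed above."""
    if percent_full >= 95:
        return "critical"  # create new archive immediately
    if percent_full >= 80:
        return "warning"
    return "normal"


def hours_until(percent_now, percent_per_hour, threshold=95.0):
    """Linear estimate of hours until the threshold; None if fill is not growing."""
    if percent_now >= threshold:
        return 0.0
    if percent_per_hour <= 0:
        return None
    return (threshold - percent_now) / percent_per_hour


print(archive_fill_status(82.0))
print(hours_until(percent_now=80.0, percent_per_hour=0.5))
```

The linear projection is a simplifying assumption; fill rate usually varies with process activity, so the estimate is best treated as a rough lead time.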

Performance Monitoring with PerfMon

Key Windows Performance Counters

Disk Performance:
  PhysicalDisk - Avg. Disk Queue Length  < 2 (alert if > 5)
  PhysicalDisk - % Disk Time             < 80%

Memory:
  Memory - Available MBytes              > 2,048 MB (alert if < 1,024 MB)
  Memory - Pages/sec                     < 100 (alert if > 1000)

CPU:
  Processor - % Processor Time           < 80% (alert if > 90%)
  System - Processor Queue Length        < 4
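
The alert thresholds above can be expressed as a small rule table and checked against a set of counter readings. This is a sketch under stated assumptions: the readings are supplied as a plain dictionary (collecting them from PerfMon is not shown), the `_Total` instance is assumed for the disk and processor counters, and only the counters that have an explicit "alert if" value above are included.

```python
import operator

# (counter path, comparison that trips the alert, alert threshold)
ALERT_RULES = [
    (r"\PhysicalDisk(_Total)\Avg. Disk Queue Length", operator.gt, 5),
    (r"\Memory\Available MBytes", operator.lt, 1024),  # 1 GB expressed in MB
    (r"\Memory\Pages/sec", operator.gt, 1000),
    (r"\Processor(_Total)\% Processor Time", operator.gt, 90),
]


def find_alerts(samples):
    """Return (counter, value, threshold) for every reading that trips a rule."""
    alerts = []
    for path, trips, limit in ALERT_RULES:
        value = samples.get(path)
        if value is not None and trips(value, limit):
            alerts.append((path, value, limit))
    return alerts


# Hypothetical readings, for illustration only.
readings = {
    r"\Memory\Available MBytes": 512,
    r"\Processor(_Total)\% Processor Time": 50,
}
for counter, value, limit in find_alerts(readings):
    print(f"ALERT {counter}: {value} (threshold {limit})")
```

Keeping the rules in a table rather than in code makes it easier to adjust thresholds per server without touching the check itself.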

PI SMT Diagnostic Tools

PI Log Files

C:\PI\log\pipc.log          - Main PI Server log
C:\PI\log\piarchss.log      - Archive subsystem
C:\PI\log\pisnapss.log      - Snapshot subsystem
C:\PI\log\pinetmgr.log      - Network manager

Common Error Patterns:
  "Connection refused"     → Interface node network issue
  "Archive full"           → Create new archive immediately
  "License exceeded"       → Point count > licensed limit
  "Authentication failed"  → Kerberos/mapping issue
  "Timeout"               → Network latency or server overload

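The pattern list above lends itself to a simple first-pass log triage. The sketch below assumes the log content is available as plain text lines and matches the patterns case-insensitively; the function name is hypothetical, and the actual wording of messages in a given log may differ from these patterns.

```python
# Pattern (lowercase) mapped to the likely cause from the list above.
ERROR_HINTS = {
    "connection refused": "Interface node network issue",
    "archive full": "Create new archive immediately",
    "license exceeded": "Point count > licensed limit",
    "authentication failed": "Kerberos/mapping issue",
    "timeout": "Network latency or server overload",
}


def triage_log_lines(lines):
    """Return (line number, pattern, hint) for each known pattern found in the lines."""
    findings = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for pattern, hint in ERROR_HINTS.items():
            if pattern in lowered:
                findings.append((number, pattern, hint))
    return findings
```

Because the function accepts any iterable of strings, an open text file handle can be passed to it directly. A match is only a starting point for investigation, not a diagnosis.
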
piartool Commands

piartool -al          List all archives
piartool -collstatus  Check collective status
piartool -verify      Verify archive integrity
piartool -stats       Show server statistics
piartool -connections List active connections

Root Cause Analysis: Data Gap Investigation

Step 1: Identify scope
  - Which tags? Single tag or multiple?
  - Which time range?
  - Is gap ongoing or historical?

Step 2: Check interface status
  - PI SMT → Interfaces → Check event rate
  - Review pipc.log for errors during gap period

Step 3: Check buffer
  - PI SMT → Buffering → Queue depth
  - If buffer is full: data may be lost

Step 4: Check PI point configuration
  - Is the tag active (Scan = On)?
  - Is the interface scanning this tag?
  - Check ExDesc for correct source address

Step 5: Verify data quality
  - Check system digital states (Shutdown, No Data)
  - Review compression settings
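
Step 1 (identifying the scope of a gap) can be sketched as a scan over event timestamps for one tag. This is an illustrative sketch, assuming the timestamps have already been exported from the archive by some other means; the function name and the `slack` parameter are hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, slack=2.0):
    """Return (start, end) pairs where consecutive events are further apart
    than slack * expected_interval."""
    limit = expected_interval * slack
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Hypothetical event times for a tag scanned every 10 seconds.
start = datetime(2024, 1, 1, 0, 0, 0)
events = [start + timedelta(seconds=s) for s in (0, 10, 20, 90, 100)]
for gap_start, gap_end in find_gaps(events, timedelta(seconds=10)):
    print(f"gap from {gap_start} to {gap_end}")
```

Compression can legitimately widen the spacing between archived events, which is why Step 5 reviews compression settings; the slack factor is an assumption and may need to be generous for slowly changing tags. Running the same check across several tags on one interface helps answer whether the gap affects a single tag or many.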

Common Failure Scenarios

| Scenario | Symptoms | Root Cause | Resolution |
|---|---|---|---|
| Interface disconnection | Data gaps on all interface tags | Network failure, DCS restart | Restart interface, check network |
| Archive full | System digital state Shutdown | Archive not pre-created | Create new archive immediately |
| Buffer overflow | Large data gaps, high queue | PI Server overloaded | Increase buffer size, tune server |
| Kerberos failure | Authentication errors | SPN missing or expired | Re-register SPN, check AD |
| AF analysis failure | Calculated attributes not updating | Expression error, circular ref | Check analysis log in PSE |
| PI Vision slow | Dashboard load > 10s | Too many symbols | Reduce symbols, optimize queries |
