Any Linux-Based System Performance Analysis in 60 Seconds:
The following ten commands give you a high-level picture of system resource usage and running processes on Linux in about 60 seconds.
uptime
dmesg | tail
vmstat 1
mpstat -P ALL 1
pidstat 1
iostat -xz 1
free -m
sar -n DEV 1
sar -n TCP,ETCP 1
top
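The list above can be wrapped in a small script so the whole pass runs unattended. This is only a sketch: the function names checklist and run_checklist are hypothetical, and the sample count of 5 and the batch-mode flags for top (-b -n 1) are assumptions added so that each command terminates on its own.

```shell
#!/bin/sh
# Hypothetical helper: print the ten commands, one per line.
# A count of 5 is appended to the interval-based tools so they stop by themselves,
# and top runs once in batch mode instead of interactively.
checklist() {
    cat <<'EOF'
uptime
dmesg | tail
vmstat 1 5
mpstat -P ALL 1 5
pidstat 1 5
iostat -xz 1 5
free -m
sar -n DEV 1 5
sar -n TCP,ETCP 1 5
top -b -n 1
EOF
}

# Run each command in turn and label its output.
# Several of these tools come from the sysstat package and may not be installed.
run_checklist() {
    checklist | while IFS= read -r cmd; do
        printf '=== %s ===\n' "$cmd"
        sh -c "$cmd" </dev/null 2>&1 || printf '(command failed or is not installed)\n'
    done
}
```

Calling run_checklist prints each section in order; redirecting it to a file keeps a snapshot that can be compared with a later run.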
1. uptime
This is a quick way to view the load averages, which indicate the number of tasks (processes) wanting to run. On Linux, these numbers also include tasks blocked in uninterruptible I/O (usually disk I/O).
This gives a high-level idea of resource load (or demand), but it can’t be properly understood without other tools.
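One rough way to put a load average in context is to divide it by the number of CPUs. The sketch below assumes a hypothetical helper named load_per_cpu. The ratio is a rule of thumb rather than a diagnosis, since the load average also counts tasks waiting on I/O.

```shell
# load_per_cpu LOAD NCPU: print LOAD divided by NCPU, to two decimals.
# Hypothetical helper; values well above 1.00 suggest more demand than CPUs.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}
```

On a live system it could be fed from /proc/loadavg and nproc, for example: load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"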
2. dmesg | tail
This shows the last 10 system messages, if there are any. Look for errors that can cause performance issues, such as the oom-killer being invoked or TCP dropping a request.
Don’t miss this step! dmesg is always worth checking.
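When the kernel log is long, a filter can help surface the kinds of messages described above. The sketch below uses a hypothetical function name, scan_kernel_log, and a keyword list that is only an assumption; adjust it to the errors that matter on your systems.

```shell
# scan_kernel_log: read kernel messages on stdin and print the lines that
# match a few error-like keywords (case-insensitive). The keyword list is an
# illustrative assumption, not an exhaustive set.
scan_kernel_log() {
    grep -Ei 'oom|out of memory|killed process|drop|error|fail' || true
}
```

A possible invocation is: dmesg | scan_kernel_log | tail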
3. vmstat 1
vmstat(8), short for virtual memory stat, is a commonly available tool. Run with an argument of 1, it prints a one-second summary of key server statistics on each line.
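As one example of reading those per-line statistics, the sketch below flags samples where the swap-in (si) or swap-out (so) columns are nonzero. It assumes the default procps column order, where si and so are the 7th and 8th fields; the function name flag_swapping is hypothetical.

```shell
# flag_swapping: read vmstat output on stdin and report samples that show
# swap activity. Header lines are skipped because their first field is not
# a number. Assumes si and so are fields 7 and 8 (default vmstat layout).
flag_swapping() {
    awk '$1 ~ /^[0-9]+$/ && ($7 > 0 || $8 > 0) { print "swapping: si=" $7 " so=" $8 }'
}
```

For instance: vmstat 1 5 | flag_swapping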
4. mpstat -P ALL 1
This command prints CPU time breakdowns per CPU, which can be used to check for an imbalance.
A single hot CPU can be evidence of a single-threaded application.
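Spotting a hot CPU by eye gets harder as the CPU count grows, so a filter can help. The sketch below looks up the CPU and %idle columns from the header row instead of hard-coding positions, because the time column can take one or two fields depending on locale. The function name hot_cpus and the default threshold of 10% idle are assumptions.

```shell
# hot_cpus [MAX_IDLE]: read `mpstat -P ALL` output on stdin and print the
# CPUs whose %idle is below MAX_IDLE (default 10, an arbitrary choice).
hot_cpus() {
    awk -v max_idle="${1:-10}" '
        {
            # Header row: remember which columns hold the CPU id and %idle.
            is_header = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   { cpu_col = i; is_header = 1 }
                if ($i == "%idle") { idle_col = i }
            }
            if (is_header) next
        }
        # Data row for a single CPU (the "all" row is skipped).
        cpu_col && idle_col && $cpu_col ~ /^[0-9]+$/ && $idle_col + 0 < max_idle {
            print "cpu " $cpu_col " idle=" $idle_col "%"
        }
    '
}
```

For instance: LC_ALL=C mpstat -P ALL 1 5 | hot_cpus 10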
5. pidstat 1
pidstat is a little like top’s per-process summary, but it prints a rolling summary instead of clearing the screen. This is useful for watching patterns over time.
6. iostat -xz 1
This is a great tool for understanding block devices (disks): both the workload applied and the resulting performance.
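One column that is often read first in the extended output is %util, the percentage of time the device was busy. The sketch below prints devices at or above a threshold. The function name busy_disks and the default of 60% are assumptions, and the column is located from the header row because its position differs between sysstat versions.

```shell
# busy_disks [MIN_UTIL]: read `iostat -xz` output on stdin and print devices
# whose %util is at or above MIN_UTIL (default 60, an illustrative value).
busy_disks() {
    awk -v min_util="${1:-60}" '
        # Device header row: find the %util column.
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") util_col = i
            next
        }
        # Device rows have at least util_col fields; the shorter avg-cpu rows do not.
        util_col && NF >= util_col && $util_col + 0 >= min_util {
            print $1 " util=" $util_col "%"
        }
    '
}
```

For instance: LC_ALL=C iostat -xz 1 5 | busy_disks 60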
7. free -m
The right two columns show:
buffers: For the buffer cache, used for block device I/O.
cached: For the page cache, used by file systems.
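The same two quantities can also be read from /proc/meminfo, which reports them in kB as Buffers and Cached. The sketch below converts them to megabytes. The function name cache_mb is hypothetical, and how closely the numbers match what free prints depends on the free version, so treat them as approximate.

```shell
# cache_mb [FILE]: print the Buffers and Cached values from a meminfo-style
# file (default /proc/meminfo), converted from kB to MB (divided by 1024).
cache_mb() {
    awk '
        $1 == "Buffers:" { printf "buffers_mb=%d\n", $2 / 1024 }
        $1 == "Cached:"  { printf "cached_mb=%d\n",  $2 / 1024 }
    ' "${1:-/proc/meminfo}"
}
```

Running cache_mb with no argument reads the live system; passing a file path reads a saved copy instead.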
8. sar -n DEV 1
Use this tool to check network interface throughput, rxkB/s and txkB/s, as a measure of workload, and to check whether any limit has been reached.
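To judge whether a limit is close, the kB/s figure can be converted into a percentage of the link speed. The sketch below is plain arithmetic under two assumptions: 1 kB is treated as 1024 bytes, and the link speed is supplied by you in Mbit/s. The function name link_util_pct is hypothetical.

```shell
# link_util_pct KB_PER_SEC LINK_MBIT: print throughput as a percentage of
# link capacity. Assumes 1 kB = 1024 bytes and 1 Mbit = 1,000,000 bits.
link_util_pct() {
    awk -v kb="$1" -v mbit="$2" \
        'BEGIN { printf "%.1f\n", (kb * 1024 * 8) / (mbit * 1000000) * 100 }'
}
```

For example, link_util_pct 1220.703125 1000 evaluates a rate of about 1.2 MB/s against a 1 Gbit/s link.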
9. sar -n TCP,ETCP 1
This is a summarized view of some key TCP metrics. These include:
active/s: Number of locally-initiated TCP connections per second (e.g., via connect()).
passive/s: Number of remotely-initiated TCP connections per second (e.g., via accept()).
retrans/s: Number of TCP retransmits per second.
The active and passive counts are often useful as a rough measure of server load: number of newly accepted connections (passive), and number of downstream connections (active). It might help to think of active as outbound, and passive as inbound, but this isn’t strictly true.
Retransmits are a sign of a network or server issue; it may be an unreliable network, or it may be due to a server being overloaded and dropping packets.
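A raw retransmit count is easier to judge relative to the amount of traffic sent. The sketch below divides retrans/s by oseg/s, the transmitted-segments column that sar -n TCP prints alongside active/s and passive/s. The function name retrans_pct is hypothetical and no particular threshold is implied.

```shell
# retrans_pct RETRANS_PER_SEC OSEG_PER_SEC: print retransmits as a percentage
# of segments sent, or "n/a" when nothing was sent in the interval.
retrans_pct() {
    awk -v r="$1" -v o="$2" \
        'BEGIN { if (o + 0 == 0) print "n/a"; else printf "%.2f\n", r / o * 100 }'
}
```

For example, retrans_pct 5 1000 expresses 5 retransmits per second against 1000 segments sent per second.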
10. top
The top command includes many of the metrics we checked earlier. It can be handy to run it to see if anything looks wildly different from the earlier commands, which would indicate that the load is variable.
A downside of top is that it is harder to see patterns over time, which may be clearer in tools like vmstat and pidstat that provide rolling output.
When running these commands, look for errors and saturation metrics first, as they are both easy to interpret, and then resource utilization.
Credits: Netflix Blog
https://netflixtechblog.com/linux-performance-analysis-in-60-000-milliseconds-accc10403c55