CLI madness part III – Useful vsish and esxcli commands for iSCSI environments

In the Linux world, /proc nodes are crucial for troubleshooting issues with the operating system.  They provide kernel-level configuration info as well as statistics.  What about ESXi?  Where are those /proc nodes when we need them?  Try entering “#cd /proc” in the ESXi shell and you will be disappointed – well, not really.  There’s ‘vsish’! :-D  Inspired by William Lam’s post on vsish, I took a few stabs at narrowing down the nodes and commands that could be useful for ESX environments using iSCSI storage.  Here we go:
With most deployments of ESXi using the software iSCSI initiator, knowing which NICs are dedicated to the vmk port for iSCSI storage is important.  Remember, it is not just the ESX host or the storage array – the network in between is crucial to your success in building out a solid foundation (“private cloud” – COUGH, damn marketing jargon :-P).  Here’s an example of a NIC driver setting tweak that could impact performance: adjusting the interrupt throttle rate of the Intel igb driver could boost performance for 1G Intel NICs (check out this VMW KB).  Now here’s the fun part:
Q. How the hell do I find out what NIC driver my iSCSI vmnics are using?
A. This is totally NOT obvious in the VI Client – I tried clicking through “Network Adapters” and other places in the VI Client and couldn’t find that shit.  If you know an easy way that I overlooked, please post a comment so I stand corrected.  With the CLI, you can find it pretty easily:
# esxcli iscsi networkportal list
In the output, you can easily see the vmhba#, the vmnic, and the NIC driver that the sw/iscsi initiator is bound to.  From there, you can determine whether specific ESX patches contain updates to the driver that your iSCSI storage traffic depends on, or whether there are tunable parameters that could help boost or resolve performance issues (like the recent VMW KB above).
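If you want to double-check the driver (and its version/firmware) behind a specific vmnic, or go ahead and apply the igb interrupt throttle tweak from that KB, something along these lines should work on ESXi 5.  Treat it as a sketch: the InterruptThrottleRate value below is only an example (take the real value from the KB for your driver version), and module parameter changes need a host reboot to take effect.
# esxcli network nic list   <— driver per vmnic at a glance
# esxcli network nic get -n vmnic2   <— driver version/firmware details for a single vmnic
# esxcli system module parameters set -m igb -p "InterruptThrottleRate=16000"   <— example value only
# esxcli system module parameters list -m igb   <— confirm the setting, then reboot the host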
The same info can be found in vsish as well – first find out which vmnic the sw/iscsi vmhba is tied to.  The quickest way is to go into the VI Client->click on the ESX host->click on the “Configuration” tab->select “Storage Adapters”->click on iSCSI Software Adapter->Properties->then select the “Network Configuration” tab.
Once you’ve found the vmnic, do the following in the ESXi shell:

#vsish
/> cat /net/pNics/vmnic2/properties
The output will clearly show the driver that the vmnic uses, its version, as well as the NIC capabilities:

Server-class NICs nowadays have a pretty rich set of capabilities to offload tasks such as packet checksumming from the OS.  If you care to know what each capability means, this site has a pretty good reference for each one.  If I were you, I’d simply check to make sure the TSO capability exists and is activated.
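By the way, if you'd rather do that check without dropping into the interactive vsish shell, vsish also takes a command directly on the command line.  A quick sketch (swap in your own vmnic, and note the capability names can vary a bit between driver versions):
# vsish -e get /net/pNics/vmnic2/properties | grep -i tso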
Another interesting place to look is the pNIC stats:
#cd /net/pNics/vmnic2
/net/pNics/vmnic2/> cat stats
Pay attention to the Tx/Rx error and CRC error counters.  Seeing a non-zero value for these counters doesn’t necessarily mean your network is screwed up.  The typical rule of thumb is no more than 1 CRC error for every 5000 packets received, and no more than 1 transmit error for every 5000 packets transmitted.  If you do see a large number of errors, try swapping out the network adapter cable first, trigger some I/O, then check the counters again.  The typical culprit for NIC adapter errors is layer 1 (cables, NIC/switch ports), so do the usual problem isolation and test one factor at a time.  Layer 2 could also trigger these errors – a common culprit would be a speed/duplex mismatch between the NIC and the switch port.  #esxtop is useful for seeing real-time performance data/network packet stats on the ESX host, but tracking historical stats is much easier with vsish.
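Here's a rough sketch of tracking those counters over time straight from the ESXi shell: it samples the vsish stats node every minute and appends to a log file.  Assumptions baked in: vmnic2 is your iSCSI uplink, and the grep terms may need tweaking since counter names differ between drivers.
# while true; do echo "=== $(date) ==="; vsish -e get /net/pNics/vmnic2/stats | grep -i -e err -e crc -e drop; sleep 60; done >> /tmp/vmnic2_stats.log &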
Now that you have checked the stats for the NICs that your iSCSI storage relies on, here’s a quick command to list all the datastores/LUNs that have been presented to your ESX host from the iSCSI array:
# esxcli iscsi adapter target list

Pay attention to the last column, “Last Error”, to make sure there are no errors associated with any datastore on your ESX host.
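If a target does show an error and you want to dig one level deeper, the session and connection state of the sw/iscsi adapter can be dumped as well (vmhba37 here is just the adapter name on my host, so substitute your own):
# esxcli iscsi session list -A vmhba37
# esxcli iscsi session connection list -A vmhba37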
Before wrapping up this post, here’s a little food for thought on further tuning for iSCSI environments.
# esxcli iscsi adapter param get -A vmhba37 <— this command lists the parameters for the sw/iscsi adapter; I’ve noticed a few that look interesting.  There are three things I plan to try out next to see what effect they have on the ESXi environment, if any…if you have tried these out, feel free to share your results!
Name                Current     Default     Min  Max       Settable  Inherit
------------------  ----------  ----------  ---  --------  --------  -------
FirstBurstLength    262144      262144      512  16777215  true      false
MaxBurstLength      262144      262144      512  16777215  true      false
MaxRecvDataSegment  131072      131072      512  16777215  true      false
MaxOutstandingR2T   1           1           1    8         true      false
LoginTimeout        5           5           1    60        true      false
NoopOutInterval     15          15          1    60        true      false
NoopOutTimeout      10          10          10   30        true      false
RecoveryTimeout     10          10          1    120       true      false
DelayedAck          true        true        na   na        true      false
HeaderDigest        prohibited  prohibited  na   na        true      false
DataDigest          prohibited  prohibited  na   na        true      false
Food for thought 1: increase MaxOutstandingR2T (this parameter controls how many R2Ts (ready-to-transfer PDUs) can be outstanding while earlier PDUs have not yet been acknowledged)
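Changing it would look something like the lines below.  This is a sketch, not a recommendation: the value has to stay within the 1 to 8 range shown in the table above, your array has to support it, and you should re-run the param get afterwards to confirm it stuck.
# esxcli iscsi adapter param set -A vmhba37 -k MaxOutstandingR2T -v 8
# esxcli iscsi adapter param get -A vmhba37 | grep MaxOutstandingR2T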
Food for thought 2: the LUN queue depth for the sw/iscsi adapter is also tunable (with a default value of 128 for ESXi 5); per the vSphere documentation, increasing it reduces the number of LUNs you can have in the environment
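On ESXi 5 that queue depth is a module parameter of iscsi_vmk, so tweaking it would look roughly like this (iscsivmk_LunQDepth is the parameter name as I understand it, so confirm it with the list command first; the change only takes effect after a reboot):
# esxcli system module parameters list -m iscsi_vmk   <— confirm the parameter name and current value
# esxcli system module parameters set -m iscsi_vmk -p iscsivmk_LunQDepth=64   <— example value only; reboot afterwards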
Food for thought 3: the PSP_RR (Round Robin) policy is recommended for iSCSI storage so all active paths get utilized; by default, the switching point is 1000 IOs per path.  If we reduce it to 10, I/O would move to the next path sooner while incurring some CPU overhead.
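That knob lives in the Round Robin device configuration, so it is set per device.  A sketch (the naa ID is a placeholder for one of your iSCSI LUNs; grab the real IDs with the first command, and make sure the device is actually on Round Robin before touching the IOPS value):
# esxcli storage nmp device list   <— lists devices with their naa IDs and current PSP
# esxcli storage nmp device set -d naa.xxxxxxxxxxxxxxxx --psp=VMW_PSP_RR   <— switch the device to Round Robin if it isn't already
# esxcli storage nmp psp roundrobin deviceconfig set -d naa.xxxxxxxxxxxxxxxx --type=iops --iops=10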
