OpenStack: locate a failed Ceph node disk
Issue:
We have a failed disk on a Ceph storage node in an OpenStack deployment.
Resolution:
To locate the failed disk's slot, use MegaRAID's StorCLI utility (storcli) to pull this information from the Ceph node.
Steps below:
1. SSH to the Fuel node (it has SSH access to the other nodes).
2. Run fuel node list to see the node list and the IP address of the alerting device.
3. ssh node-# <- insert the node number from the list
4. Run ceph osd tree and note which OSD is in a down state.
5. cd /opt/MegaRAID/storcli/
6. Run the controller summary, e.g.: root@node-8:/opt/MegaRAID/storcli# ./storcli64 /call show
7. Check the output to see which disk slot has failed.
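Step 4's "down" check can be scripted rather than eyeballed. A minimal sketch, assuming a ceph osd tree layout where the status is the fourth whitespace-separated column (the sample output below is illustrative, not from a real cluster; column positions vary by Ceph release, so adjust the field number to match your output):

```shell
# Illustrative 'ceph osd tree' output; on a real node you would pipe the
# actual command: ceph osd tree | awk '$4 == "down" {print $3}'
sample_tree='ID WEIGHT  TYPE NAME    UP/DOWN REWEIGHT
-1 3.00000 root default
 0 1.00000     osd.0    up      1.00000
 1 1.00000     osd.1    down    1.00000
 2 1.00000     osd.2    up      1.00000'

# Print only the OSDs whose status column reads "down"
printf '%s\n' "$sample_tree" | awk '$4 == "down" {print $3}'
```

Here the filter reports osd.1, which tells you which node and OSD to inspect with StorCLI in the following steps.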
Notes:
Other useful StorCLI commands:
> Show everything for all controllers
./storcli64 /call show
> Virtual disks
./storcli64 /c0 /vall show
> Drives in enclosure 0
./storcli64 /c0 /e0 /sall show
> Drives in all enclosures
./storcli64 /c0 /eall /sall show
> Adapter info (disk slots etc.) - legacy MegaCLI-style syntax
./storcli64 -AdpAllInfo -aALL
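The drive table printed by the show commands above has a State column (Onln for online members, UBad/Offln for bad or offline drives), so finding the failed slot can also be scripted. A minimal sketch over illustrative sample output (the rows below are made up; on a real node you would pipe the actual storcli64 output through the same awk filter):

```shell
# Illustrative drive-state rows in the style of './storcli64 /call show'
sample_drives='EID:Slt DID State DG Size     Intf Med
0:0     4   Onln  0  931.0GB  SATA HDD
0:1     5   Onln  0  931.0GB  SATA HDD
0:3     7   UBad  -  931.0GB  SATA HDD'

# Flag drives whose State is not Onln; column 1 is Enclosure:Slot
printf '%s\n' "$sample_drives" | awk 'NR>1 && $3 != "Onln" {print $1, $3}'
```

The EID:Slt value it prints (here 0:3, i.e. enclosure 0, slot 3) is the physical slot to pass to the remote-hands engineer or to the locate-LED command.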