Mapping Virtual Machines to Datastores to Storage, Part 1

summary of the problem

This past week, I had the (unfortunate) opportunity to provide on-site support to a customer suffering through an outage (as a matter of fact, I am there again this week).  The customer lost a large amount of storage due to an array failure.  This, of course, caused their DR plans to kick in: they declared failover for their critical workloads and began the process of recovery.  Along the way, they discovered that a large number of workloads did not, in fact, have any plan for backup and/or DR.  As a result, they needed to quickly establish the scope of the impact: which virtual machines were on which storage, and so on.  They didn’t (and still don’t) have any tools appropriate to that ask.  Ideally they would have some sort of storage resource management view that tells them, QUICKLY, which VM is on which datastore, and therefore on which LUN and array.  Yes, you can get this information out of vCenter, but it is a top-down view rather than a storage-centric view… mapping datastores to LUNs and arrays isn’t quick, intuitive, or a native feature of vCenter.  They also didn’t have the appropriate vCenter plug-ins from their storage vendor(s) to tie that information together quickly.

Aside from the fact that the ESX hosts and vCenter were all seriously compromised due to the sudden loss of boot LUNs, datastores, etc. (separate post to come later), their only option was to:
1- pull a list of WWNs for the impacted LUNs off the array,
2- run a report of some kind from vCenter to pull the VMs and their datastores, and then
3- run a report of some kind from vCenter to pull the datastores and their NAA IDs…

My first thought was of course to try to pull the appropriate information from vCenter, even though it is not an accurate source of truth in a compromised environment (separate post to come later).  For this customer, SSH is not enabled on the hosts, only a select few people have rights to it, and many of the hosts were in such a compromised state that you couldn’t gain any useful information from them anyway.  Therefore our only choice was to pull information from vCenter, which had a view of the state of the datacenter at a particular point in time, prior to the event.  To compound matters, this particular customer has quite a few vCenter servers, and only a subset of them represent the environment in question.

Therefore, from the superset of all vCenters, we needed to pull ALL virtual machines and their datastores, then pull ALL their datastore / LUN information, and cross-reference it all against the WWNs pulled from the array to arrive at the information in question.
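
The cross-reference itself boils down to two lookups. Here is a minimal sketch in plain PowerShell, assuming the three data sets have already been gathered. The VM names, datastore names, and IDs are made-up sample values, and the assumption that a datastore's NAA ID is simply the LUN WWN with a "naa." prefix should be checked against how your array reports WWNs.

```powershell
# Hypothetical sample data standing in for the three reports.
# Report 1: WWNs of the impacted LUNs, as pulled from the array.
$failedWwns = @(
    '60000000000000000000000000000001'
    '6000000000000000000000000000000A'
)
# Report 2: virtual machines and their datastores.
$vmRows = @(
    [pscustomobject]@{ VM = 'app01'; Datastore = 'ds-gold-01' }
    [pscustomobject]@{ VM = 'db01';  Datastore = 'ds-gold-02' }
    [pscustomobject]@{ VM = 'web01'; Datastore = 'ds-silver-01' }
)
# Report 3: datastores and their NAA IDs.
$dsRows = @(
    [pscustomobject]@{ Datastore = 'ds-gold-01';   NAA = 'naa.60000000000000000000000000000001' }
    [pscustomobject]@{ Datastore = 'ds-gold-02';   NAA = 'naa.60000000000000000000000000000002' }
    [pscustomobject]@{ Datastore = 'ds-silver-01'; NAA = 'naa.6000000000000000000000000000000a' }
)

# Assumption: an NAA ID is the WWN with a "naa." prefix, so strip the
# prefix and lowercase both sides before comparing.
function Get-NormalizedId {
    param([string]$Id)
    ($Id -replace '^naa\.', '').ToLowerInvariant()
}

# Lookup 1: the set of failed WWNs.
$failed = @{}
foreach ($wwn in $failedWwns) {
    $failed[(Get-NormalizedId $wwn)] = $true
}

# Lookup 2: datastores whose NAA ID is in the failed set.
$impactedDatastores = @{}
foreach ($row in $dsRows) {
    if ($failed.ContainsKey((Get-NormalizedId $row.NAA))) {
        $impactedDatastores[$row.Datastore] = $row.NAA
    }
}

# Final pass: every VM that sits on an impacted datastore.
$impactedVMs = @(foreach ($row in $vmRows) {
    if ($impactedDatastores.ContainsKey($row.Datastore)) {
        [pscustomobject]@{
            VM        = $row.VM
            Datastore = $row.Datastore
            NAA       = $impactedDatastores[$row.Datastore]
        }
    }
})

$impactedVMs | Format-Table -AutoSize
```

Note that this sketch keeps one NAA ID per datastore; a datastore spanning several extents would need one row per extent in the third report.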

option #1 – try PowerShell

Knowing that we had to pull a LOT of information from (perhaps) dozens of vCenters, it seemed to me that this was a perfect opportunity to practice a little PowerShell.  Not being a guru in that field myself, I did a little research, pulled in a couple of favors, and put together the following few lines of code – very quick and dirty, but it looked promising:

1 – Get information on virtual machines and their datastores:

Once we have the information on the virtual machines, I then needed to…

2 – Get information on datastores and their LUNs:

 

No Love…

I decided to just run each of these little scripts against the vCenters in question (taking the time to put it all into one script and then loop through a list of vCenters wasn’t really an option, as the list of vCenters in question changed), and then, once the data was gathered, use some functions in Excel to merge it all (which would be quicker in the short term for me, as my Excel skills far exceed my PowerShell!).  Well, as it turned out in a couple of test runs, the above scripts took hours to run… I mean, really, really horrible…  Abysmal.

In the end that first day, we ended up just going into the vSphere Web Client and exporting data into CSV files from the GUI…  It was neither the time nor the place to try to ‘root cause’ the sluggishness of the scripts, but in any case, it was not auspicious…

option #2 – try PowerShell, but better this time!

Well, this bothered me, needless to say.  Not only had it taken longer than I had wanted / hoped / expected, I had also wasted both my and my customer’s time on a method that ultimately yielded poor results.  So I did some more digging in my off time, and discovered the PowerCLI cmdlet “Get-View.”  The Get-Datastore and Get-Datacenter cmdlets I used above talk to vCenter over the vSphere API and then make additional queries to fill in some of the information (and really, it was the Get-Datastore loop above that was incredibly slow).  Get-View, by contrast, just queries vCenter and returns a .NET view of the underlying vSphere API object, and therefore doesn’t need to parse through individual hosts in the inventory.

Ultimately I ended up with the following lines of code in 2 different scripts:

The Get-View cmdlet can return information on many types of vSphere objects:

  • ComputeResource
  • ClusterComputeResource
  • Datacenter
  • Datastore
  • Network
  • DistributedVirtualPortgroup
  • DistributedVirtualSwitch
  • Folder
  • HostSystem
  • ResourcePool
  • VirtualApp
  • VirtualMachine
  • VmwareDistributedVirtualSwitch

The really cool part of Get-View is that when you run it on a certain type, you can drill down into additional properties of the returned object, since you are basically just walking down through the API.  For example, in the above example, you will see I defined a calculated property as follows: n='NAA';e={$_.info.vmfs.extent.diskname}.  In PowerShell, $_ is a self-referential notation: it refers to the current object coming down the pipeline (in this case, each datastore returned by Get-View -ViewType datastore).  Then, we ask for “.info.vmfs.extent.diskname” on that object, which asks specifically for the:

  • disk name assigned to the
  • extent assigned to the
  • VMFS volume for the
  • datastore.
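
To see that drill-down in isolation, here is a minimal sketch that runs without a vCenter. The objects below are hand-built stand-ins shaped like the Info.Vmfs.Extent.DiskName path described above (the datastore names and NAA IDs are made up), and the -join is my own addition so that a datastore with several extents flattens into one CSV-friendly string rather than an array.

```powershell
# Mock objects mimicking the shape of a Datastore view:
# datastore -> Info -> Vmfs -> Extent[] -> DiskName.
# Against a live vCenter these would presumably come from something like:
#   Get-View -ViewType Datastore -Property Name, Info
$datastoreViews = @(
    [pscustomobject]@{
        Name = 'ds-gold-01'
        Info = [pscustomobject]@{
            Vmfs = [pscustomobject]@{
                Extent = @(
                    [pscustomobject]@{ DiskName = 'naa.60000000000000000000000000000001'; Partition = 1 }
                )
            }
        }
    }
    [pscustomobject]@{
        Name = 'ds-gold-02'
        Info = [pscustomobject]@{
            Vmfs = [pscustomobject]@{
                Extent = @(
                    [pscustomobject]@{ DiskName = 'naa.60000000000000000000000000000002'; Partition = 1 }
                    [pscustomobject]@{ DiskName = 'naa.60000000000000000000000000000003'; Partition = 1 }
                )
            }
        }
    }
)

# The calculated property: n = column name, e = expression.
# Inside the expression, $_ is the current datastore object.
$report = @($datastoreViews |
    Select-Object Name, @{ n = 'NAA'; e = { $_.Info.Vmfs.Extent.DiskName -join ';' } })

$report | Format-Table -AutoSize
```

Each output row carries the datastore name plus its NAA ID(s), which is the shape needed for the datastore-to-LUN half of the cross-reference.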

The next statement then requests the disk layout information for the virtual machine, which returns the name(s) of all the VMDK files assigned to the virtual machine (including the datastore and path of each VMDK), and the size of each VMDK.
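
As a rough illustration of what that disk layout data looks like once flattened, here is a sketch using mock objects. I am assuming the layout comes from the VM view's LayoutEx.File property, where each file has a Name of the form "[datastore] folder/file" and a Size in bytes; the VM, datastore, and file names are invented for the example.

```powershell
# Mock object mimicking a VirtualMachine view with a LayoutEx.File list.
# Against a live vCenter this would presumably come from something like:
#   Get-View -ViewType VirtualMachine -Property Name, LayoutEx
$vmViews = @(
    [pscustomobject]@{
        Name     = 'app01'
        LayoutEx = [pscustomobject]@{
            File = @(
                [pscustomobject]@{ Name = '[ds-gold-01] app01/app01.vmx';         Size = 3000 }
                [pscustomobject]@{ Name = '[ds-gold-01] app01/app01-flat.vmdk';   Size = 20GB }
                [pscustomobject]@{ Name = '[ds-gold-02] app01/app01_1-flat.vmdk'; Size = 40GB }
            )
        }
    }
)

# One output row per VMDK: the datastore name is parsed out of the
# square brackets at the start of the file name.
$diskRows = @(foreach ($vm in $vmViews) {
    foreach ($file in $vm.LayoutEx.File) {
        if ($file.Name -like '*.vmdk' -and $file.Name -match '^\[(?<ds>[^\]]+)\]\s') {
            [pscustomobject]@{
                VM        = $vm.Name
                Datastore = $Matches['ds']
                File      = $file.Name
                SizeGB    = [math]::Round($file.Size / 1GB, 2)
            }
        }
    }
})

$diskRows | Format-Table -AutoSize
```

Parsing the datastore out of the file name means a VM with disks on several datastores produces one row per disk, which matters when only some of those datastores were lost.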

Success!!

This provided the required results… data from vCenter, as fast as or faster than the C# or Web client…

Along the way, I learned a bunch about PowerCLI, and in a future post I hope to show how to pull this information together into a single script.

Note: these scripts worked in my environment.  They come with no warranty or support; please use at your own risk.

About Tom

Just a guy, mostly a father and husband, pretending to be a technologist, constantly underestimating how much there is to learn.... content is mine
This entry was posted in Adventures in Scripting, Automation, Best Practice, From the field, TechInfo.
