using pyvmomi to get a list of all virtual machines — fast

Something that often comes up from people new to the vSphere API is how to get information about all the VirtualMachines in the inventory very quickly. There is a sample that comes with pyvmomi called getallvms.py, which is an obvious place to start. When it's run on an inventory that only has 30 VirtualMachines it seems pretty fast: it takes about 0.5 seconds to complete. Try it on a larger inventory, something with 500+ VirtualMachines, and it really starts to slow down, going from 0.5 seconds all the way up to over 6 seconds. This number just keeps growing as the inventory gets larger. Once the inventory reaches over 1000 VirtualMachines it can take over 10 seconds for this info to be returned. In other words, this solution just doesn't scale, but it's often the only way newcomers know. The good news is that VMware provides other ways to get this info. The bad news is that the solution isn't obvious and it's kind of complicated to use, but that's what I am here for 🙂
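To make the scaling problem concrete, here is a minimal sketch of the per-object access pattern. It is not the actual code of getallvms.py. It assumes only that reading a property such as `summary` on a live pyVmomi VirtualMachine object sends a request to vCenter. `CountingVM` is a hypothetical stand-in that counts those reads, so the example runs without a vCenter.

```python
from types import SimpleNamespace


def summarize_one_by_one(vms):
    """Read name and power state from each VM object, one object at a time."""
    rows = []
    for vm in vms:
        # On a live pyVmomi VirtualMachine, reading .summary sends a request
        # to vCenter, so this loop costs at least one round trip per VM.
        summary = vm.summary
        rows.append((summary.config.name, summary.runtime.powerState))
    return rows


class CountingVM:
    """Stand-in for a VirtualMachine (hypothetical, for illustration only).

    It counts how often .summary is read, mimicking one request per read.
    """

    requests = 0

    def __init__(self, name, power_state="poweredOn"):
        self._name = name
        self._power_state = power_state

    @property
    def summary(self):
        CountingVM.requests += 1
        return SimpleNamespace(
            config=SimpleNamespace(name=self._name),
            runtime=SimpleNamespace(powerState=self._power_state),
        )
```

Passing 576 `CountingVM` objects through `summarize_one_by_one` leaves `CountingVM.requests` at 576. The number of requests grows linearly with the inventory.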

Where I work we have over 45,000 vSphere-powered Virtual Machines, and it's my job as a Sr. Developer there to make sure our code is streamlined, efficient, and scales the way we do. This is why I use property collectors when I need to work with objects from the vSphere inventory. To help new users I provided a sample I call vminfo_quick which, as its name implies, gets info about VirtualMachines quickly. To test this, let's run the getallvms.py from above on a vCenter with 576 VirtualMachines and time it.


time python getallvms.py -s 10.12.254.119 -u 'administrator@vsphere.local' -p password

real 0m6.300s
user 0m2.476s
sys 0m0.123s

About 6.3 seconds. That's not too bad, right? Now let's run the vminfo_quick sample I provided against that same vCenter and see how it does. I included a counter and a timer in this sample so we don't have to run it under time.


python vminfo_quick.py -s 10.12.254.119 -u 'administrator@vsphere.local' -p password

Found 576 VirtualMachines.
Completion time: 0.368282 seconds.

As you can see, using a property collector vastly improves performance: about 0.37 seconds versus 6.3 seconds for the same 576 VirtualMachines. I have tested this on an inventory with 1500 VirtualMachines and it still finishes in just under 1 second. I plan to cover details around what the property collector is and how it works in future posts. Stay tuned!
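For readers who want a feel for what a property collector query involves, here is a minimal sketch. It is not the code of vminfo_quick or its helper. It assumes pyVmomi is installed and that `content` is the ServiceContent returned by `si.RetrieveContent()` on a connected service instance. The function names (`build_filter_spec`, `rows_from`, `collect_vm_properties`) are made up for illustration.

```python
def build_filter_spec(view, obj_type, path_set):
    """Describe which objects to visit and which properties to return."""
    # Imported lazily so rows_from() below can be used without pyVmomi.
    from pyVmomi import vim, vmodl

    pc = vmodl.query.PropertyCollector
    # Walk from the container view to the objects it holds.
    traversal = pc.TraversalSpec(
        name="traverseView", path="view", skip=False,
        type=vim.view.ContainerView)
    # Start at the view itself, but do not report the view as a result.
    obj_spec = pc.ObjectSpec(obj=view, skip=True, selectSet=[traversal])
    # Only fetch the listed property paths, not every property.
    prop_spec = pc.PropertySpec(type=obj_type, pathSet=path_set, all=False)
    return pc.FilterSpec(objectSet=[obj_spec], propSet=[prop_spec])


def rows_from(object_contents):
    """Flatten ObjectContent results into one dict per managed object."""
    rows = []
    for oc in object_contents:
        row = {prop.name: prop.val for prop in oc.propSet}
        row["obj"] = oc.obj
        rows.append(row)
    return rows


def collect_vm_properties(content, path_set):
    """Fetch path_set for every VirtualMachine in the inventory."""
    from pyVmomi import vim, vmodl

    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    try:
        spec = build_filter_spec(view, vim.VirtualMachine, path_set)
        collector = content.propertyCollector
        options = vmodl.query.PropertyCollector.RetrieveOptions()
        result = collector.RetrievePropertiesEx([spec], options)
        rows = []
        # The server may page large results; follow the token until done.
        while result is not None:
            rows.extend(rows_from(result.objects))
            if not result.token:
                break
            result = collector.ContinueRetrievePropertiesEx(result.token)
        return rows
    finally:
        view.Destroy()
```

A call such as `collect_vm_properties(content, ["name", "runtime.powerState"])` would return one dict per VM. The whole inventory is requested in a single query (plus a few continuation calls if the server pages the result) rather than one request per VM.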

14 thoughts on “using pyvmomi to get a list of all virtual machines — fast

  1. Running this today on a few dozen machines, great stuff!

    I’m new to the SDK and a pretty basic python developer – how could I modify it to return IPs and Datastores that fast?

    Thanks!

  2. Thanks for the feedback! You would need to add the properties for each thing you wish to get. datastore is a property, but it returns the MOR (managed object reference), so you then need to add another filter to the search spec and relate the Datastore objects it returns to the Datastore references on the VM. It's not really simple when you first try doing these things, but once you have done it a few times it's not so bad. For the IP you could just add the property. This normally requires VMware Tools to be installed on the VMs to get the correct info back. The property you need is probably “guest.ipAddress”.

  3. This two-year-old post just saved my bacon. I was getting all the hostnames from a distant vCenter on old-ish hardware, and switching to your example brought a 30-second query down to 2 seconds. Thank you!

  4. Thanks for the post. Can you explain why your approach is faster than the simple one? Does it cut down the size of the response, make fewer round trips, something else?

  5. Hi Eric,

    The reason is the number of API calls. A property collector will get all the info you ask for in 1 call, whereas the “simple” approach makes at least 1 call for each VM and, depending on the properties you are trying to get, possibly more than 1 call per VM.

  6. Hello,

    Like others, only recently trying to get into this, and have only rudimentary Python skills. It seems the secret sauce you list as the “property collector” is specific to a file called pchelper.py.

    Similar to the question above, if I want to make sure I’m using the property collector, do I have to always import the pchelper file?

    Should VMware have made pchelper part of the pyvmomi release instead of something found in sample tools?

    Thanks for your patience and the cool sample.

  7. The property collector is part of pyVmomi; I just made a wrapper around it to make it easier to use. If you want to use my wrapper then yes, you do need to include it, but it's not required. You could implement the code I have in my helper in your own code. To answer your last question, I do not think VMware should include higher-level things like pchelper in pyVmomi, but I do think it would be nice if they offered a higher-level API on top of pyVmomi, because pyVmomi is the low-level SOAP wrapper and that can be a bit difficult to use for people new to the API.

  8. I’m new to Python and coding as well, so maybe it’s a small issue I’m facing. When I run vminfo_quick I get “from tools import cli ImportError: cannot import name ‘cli’”. From my basic understanding, the tools library doesn’t have a cli function? How do I fix this?

  9. Thank you, I think that might have fixed it as I don’t see that error anymore. However, when I run the code I get this error: ‘Unable to connect to host with supplied info.’ I’m guessing I’m not passing the arguments correctly. I’m trying to run ‘python3 vCenter.py -s 10.38.154.10 -u 'administrator@vsphere.local' -p xxxxx’. I tried running getallvms.py the same way and it works perfectly fine. vCenter.py is the same as vminfo_quick.py. Appreciate your help.

  10. Fantastic. Thank you. Is there a way to apply further filters when creating the view?
    Instead of all objects of type virtual machine I need only the VMs in a given cluster. I know I can do all that manually by getting all VMs in a cluster and then only using the props from those, but it would be awesome if you could filter the initial container view directly and keep it to one API call.
    Any smoke and mirrors to share here?

  11. Great stuff. Thank you. Is there a way to further filter by adding something like cluster=cluster_mor/cluster_name?
    I would want to filter by cluster/datastore or folder. I know I can do all that manually, but if it could be done in one smooth API call that would save me so much time and code. As far as I can tell you can only filter the object type and the props you want for it.

    thx
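
Several comments above ask about limiting the query to one cluster and about relating datastore references back to names. The sketch below is one possible approach, not something taken from vminfo_quick. It assumes that `CreateContainerView` accepts a cluster as its container (the vSphere API reference lists ComputeResource among the supported container types) and that you already have property collector results as plain dicts. The helper names and the row shape are hypothetical.

```python
def scoped_view(content, obj_types, container=None):
    """Create a recursive container view rooted at `container`.

    `content` is the ServiceContent from si.RetrieveContent() and
    `obj_types` is a list such as [vim.VirtualMachine]. With no container
    the view covers the whole inventory (content.rootFolder).
    """
    root = content.rootFolder if container is None else container
    return content.viewManager.CreateContainerView(root, obj_types, True)


def attach_datastore_names(vm_rows, datastore_rows):
    """Replace each VM row's datastore references with datastore names.

    Assumed row shape (for illustration): VM rows carry "datastore", a list
    of managed object references; datastore rows carry "obj" (the
    reference) and "name". Unknown references fall back to str(ref).
    """
    names = {row["obj"]: row["name"] for row in datastore_rows}
    joined = []
    for row in vm_rows:
        out = dict(row)
        out["datastore"] = [
            names.get(ref, str(ref)) for ref in row.get("datastore", [])
        ]
        joined.append(out)
    return joined


# Hypothetical usage, assuming `cluster` is a vim.ClusterComputeResource
# that you have already looked up:
#   from pyVmomi import vim
#   view = scoped_view(content, [vim.VirtualMachine], container=cluster)
#   ... query the view with a property collector, asking for
#   ["name", "guest.ipAddress", "datastore"] ...
#   view.Destroy()
```

Rooting the view at the cluster keeps the scoping on the server side, so the property collector only ever sees that cluster's VMs. The datastore names still need a second query against `vim.Datastore` objects, as comment 2 describes, and the join happens locally.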
