28 April, 2014

vMotion fails on MAC-address of virtual NIC

During one of my recent projects (replacing ESXi hosts, from rack servers to blades) there was also a second project ongoing that touched the VMware environment. The current EMC SAN solution was being replaced by a new EMC SAN solution comprised of VPLEX and VMAX components.
One of the inevitable tasks involved is moving VM's and Templates to datastores that reside on the new SAN. After all VM's of a particular datacenter were moved successfully,  it was time to move the templates.
As templates cannot be moved by the use of Storage vMotion, the customer first converted them to normal VM's. In this way they could leverage the ease of migrating them by Storage vMotion. Well so much for the idea, about 80% of the former template VM's failed the storage migration task. They failed at 99% with a "invalid configuration for device 12" error.
When I looked at this issue at first I had no idea what could be the cause of this issue, although it looked like it had something to do with the VM virtual hardware. I took a look at the former template VM's that did go through a successful storage migration and compared the virtual hardware to the ones that failed. There was no difference between the two. The only thing different was the OS used, this was also pointed out by the customer. Now the difference in OS is not what is important, but the point in time the template was created is!.
It stood out that the former template VM's with the older OS's where failing, so I asked to customer if he knew when these templates were created on more importantly on which version of vSphere.
As you might know the MAC-address of virtual NIC's has a relation to the vCenter which is managing the virtual environment, I don't know the exact details but there is a relation. And I remembered reading a old blog post about a invalid configuration for virtual hardware device 12, this post related device 12 to the Virtual NIC of the VM. The templates where originally created on a vSphere 4.1 environment of which the vCenter was decommissioned instead of upgraded along with the rest of the environment. When you put this information (or assumptions) together it could very well be that the MAC-address of the virtual NIC was not in a "good" relation with the current vCenter and that this resulted in failing Storage vMotion tasks. I know it was a bit far fetched, but still I gave it a go and removed the current vNIC from one of the failed VM's and added a new vNIC. I checked and the replacement changed the MAC-address of the vNIC.
After the replacement I re-tried to Storage vMotion and this time it succeeded!. Did the same replacement action of the remaining failed VM's and they all could now successfully be migrated to the new datastores.
So for some reason when doing a Storage vMotion vCenter needs a VM to have a "compatible" MAC-address to succeed.
In short if you ever run into a error "invalid configuration for device 12" when trying to perform a Storage vMotion, check if the MAC-address of this VM "aligns" with the MAC-adresses of VM's that can be Storage vMotioned.
If they don't, replacing the virtual NIC might solve your issue.

No comments:

Post a Comment