As VMware continues to push toward Unix-based appliances for its vSphere management components, those of us without a Unix background (like myself) are having to come to grips with the Unix versions of common administrative tasks. Increasing the disk size on a vCenter Server appliance (VCSA) is one such task. In vCenter 6.0, VMware introduced Logical Volume Management (LVM), which greatly simplifies increasing the size of a disk and allows it to be done while the appliance is online. VMware KB 2126276 covers all the steps required, but this guide goes into slightly more detail.

Step 1: Identify which disk (if any) has a problem with free space

To do this, I connect to the appliance via SSH or the console, enable and enter the shell, and use the df -h command.
More information on command line tools for working with disk space can be found in my post Useful Unix commands for managing disk space on VMware appliances.


I can see that both /storage/core and /storage/log are 100% used. I’m guessing that /storage/core is full of vpxd crashdumps that are being generated because vCenter is crashing after being unable to write logs to /storage/log. Based on this guess, I’ll increase the size of /storage/log, then manually delete the crashdumps on /storage/core and monitor the situation. I won’t cover the steps involved in deleting the vpxd crashdumps in this post, but it basically involves deleting the core.vpxd.* and *.tgz files in the /storage/core directory.

Step 2: Increase the size of the affected disk using the vSphere Web Client

Looking at the table in VMware KB 2126276, it tells me that the disk mounted to /storage/log is VMDK5.  The way this is presented is a bit confusing in my opinion, because the disk we’re looking for is listed as hard disk 5 in the web client, but the filename of the disk is vmname_4.vmdk (the numbering of virtual disks is thrown out in this way because hard disk 1 is vmname.vmdk, and hard disk 2 is vmname_1.vmdk).  Where the KB article says “VMDK5”, it really just means “the fifth VMDK file”.
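The off-by-one numbering is easy to get wrong, so here is a minimal sketch of the mapping. The function name vmdk_filename is my own, and it assumes the disks still follow the default naming described above (it won't hold if disks have been removed and re-added).

```shell
# Map a "Hard disk N" label in the Web Client to its VMDK filename.
# vm_name and disk_number are placeholders supplied by the caller.
vmdk_filename() {
    vm_name=$1
    disk_number=$2
    if [ "$disk_number" -eq 1 ]; then
        # The first disk carries no numeric suffix.
        printf '%s.vmdk\n' "$vm_name"
    else
        # Every later disk is numbered one lower than its label.
        printf '%s_%d.vmdk\n' "$vm_name" "$((disk_number - 1))"
    fi
}

vmdk_filename vmname 5   # prints vmname_4.vmdk
```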

My /storage/log disk filled up because I’ve increased the logging levels on my vCenter appliance to try to catch an issue that had been occurring. Because of the increased volume of logs being generated, I’m going to increase the size of this VMDK to 25GB. I don’t want to go overboard because the disks are thick provisioned by default.


Step 3: Expand the logical drive and confirm that it has grown successfully

Return to the SSH session and expand the logical drive(s) that have been resized.  The following command will expand any disks that have had their vmdk files resized.

If the operation is successful, you should see a message similar to the following.
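For reference, the command given in KB 2126276 is vpxd_servicecfg storage lvm autogrow, and as far as I can tell a successful run reports VC_CFG_RESULT=0. The sketch below wraps the two together; the function name autogrow_vcsa_disks is my own, and the guard is only there so it fails cleanly if you paste it somewhere other than the appliance.

```shell
# Run as root from the BASH shell on the appliance.
# autogrow_vcsa_disks is a hypothetical wrapper name, not a VMware tool.
autogrow_vcsa_disks() {
    if ! command -v vpxd_servicecfg >/dev/null 2>&1; then
        echo "vpxd_servicecfg not found; run this on the appliance itself" >&2
        return 1
    fi
    # Grow every logical volume whose backing VMDK has been enlarged.
    output=$(vpxd_servicecfg storage lvm autogrow 2>&1)
    printf '%s\n' "$output"
    # Treat the run as successful only if the result line reports 0.
    printf '%s\n' "$output" | grep -q 'VC_CFG_RESULT=0'
}
```

Calling autogrow_vcsa_disks prints whatever the tool prints, so any errors along the way are still visible.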

In my case, I did get that message eventually, but I also got a bunch of the following errors:

I saw that error because my /storage/core disk is 100% used. As mentioned, I’m going to free up space on that drive manually, so I’ll ignore that error for now.

If I run df -h again, I can see that /storage/log is now 25GB in total size.  Job done!


Note: In the vCenter 5.x appliance, increasing disk sizes was a bit of a pain. The operation had to be performed while vCenter was offline, and involved adding a brand new disk, copying files from the old disk to the new one, and editing mount points.  For anyone who is working with a vCenter 5.x appliance, the steps are in KB 2056764.

Coming from a Windows background without much knowledge of Unix commands, I often find myself at a loss when trying to figure out how to do things on VMware’s vSphere appliances.  Managing disk space from the command line on an appliance is something I’ve had to do more than a few times, so I thought I’d create a quick list of the Unix commands I use most often to identify which partitions are filling up, and then which folders and files on that partition are consuming the most space.

When I’m working on a disk space problem, there are a few things I need to do. First, list disk space by partition. Second, identify the biggest consumers on a partition by listing disk usage of child files and folders. Third, figure out if any of the directories identified in the previous step are symbolic links, and find the link target. Lastly, depending on what files are consuming all that space, I may want to delete them.

List disk space per partition

The df command (an abbreviation for disk free) does the trick. The -h switch displays sizes in human-readable units (KB, MB and GB).
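On an appliance with a lot of mount points it can help to filter the list down to the partitions that are actually in trouble. This is a rough sketch rather than anything VMware provides: full_mounts is a name I made up, and it reads the POSIX-format output of df -P (one line per filesystem) instead of df -h because that is easier to parse.

```shell
# Print mount points whose usage is at or above a threshold (default 90%).
# Reads `df -P` output on stdin; full_mounts is a hypothetical helper name.
full_mounts() {
    threshold=${1:-90}
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # "100%" -> "100"
        if (use + 0 >= limit) print $6, $5
    }'
}
```

On the appliance, df -P | full_mounts 90 prints just the mount points that are 90% full or worse, alongside their usage. It assumes mount points without spaces in their names.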

Now that we know /storage/core and /storage/log are both 100% full, we need to work out what is consuming the space on those partitions.

List disk usage of child files and folders on a partition

The du command (which is an abbreviation for disk usage) estimates the size of directories and files under a specific path. The best way to use this command is to sort the results by file size, as follows:
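One way to do the sort, as a sketch: the helper name biggest_children is mine, and the exact switches are a matter of taste. It asks du for a per-entry summary in 1 KB blocks, sorts numerically in reverse, and keeps the top few.

```shell
# List the largest entries directly under a directory, biggest first.
# Sizes are in 1 KB blocks (du -k), so a numeric sort orders them correctly.
# biggest_children is a hypothetical helper name.
biggest_children() {
    dir=$1
    count=${2:-10}
    # The * glob skips hidden (dot) files and folders.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, biggest_children /storage/log 10 shows the ten largest files and folders directly under /storage/log. Run it again on the biggest folder to drill down.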

You can also use the -h switch to present the sizes in a friendlier format. The downside is that you can’t pipe the results to a plain numeric sort: it only considers the numbers and not the units, so it doesn’t understand that 10GB is larger than 100MB, and the list comes out in the wrong order.
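There is a way around this if the sort on your appliance is a reasonably recent GNU coreutils build, which supports a -h (human-numeric) switch. I haven't confirmed that on every appliance version, so treat it as an assumption and check first. The difference is easy to see with two sample sizes:

```shell
# Two sizes as du -h would print them.
sizes='10G
100M'

# Numeric sort compares only the leading numbers, so 100M lands above 10G.
printf '%s\n' "$sizes" | sort -rn

# Human-numeric sort understands the unit suffixes, so 10G comes first.
printf '%s\n' "$sizes" | sort -rh
```

If sort -h is available, something like du -sh /storage/log/* | sort -rh | head gives a friendly and correctly ordered list.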

One thing to note is that if a child folder is actually a symbolic link, the file size will be listed as zero. These need to be identified and handled separately.

List symbolic links and discover link targets

To identify symbolic links, use the ls command with the -la switches. The results will be colour coded, and symbolic links will be listed in a light blue colour. The real path of the symbolic link will be listed to the right. In the snipped example below, you can see that /var/log/vmware is actually a symbolic link to /storage/log/vmware.
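If the colours don't come through in your terminal, the links can also be picked out in a small loop. This is a sketch under the assumption that readlink with the GNU -f switch is available; list_links is a name I made up.

```shell
# Print every symbolic link directly under a directory with its real path.
# list_links is a hypothetical helper; readlink -f is the GNU coreutils form.
list_links() {
    dir=$1
    for entry in "$dir"/*; do
        if [ -L "$entry" ]; then
            printf '%s -> %s\n' "$entry" "$(readlink -f "$entry")"
        fi
    done
}
```

Running list_links /var/log on the appliance should include a line for /var/log/vmware pointing at /storage/log/vmware, matching what ls -la /var/log shows.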

Delete files or folders

It’s possible to delete files via the command line using the rm command and specifying the file or folder name. To remove a file:

To remove a folder:

But use that with caution, because it could end badly.
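A minimal sketch of both forms, using a throwaway scratch directory and made-up file names so that nothing real is touched:

```shell
# Work in a scratch directory so nothing real is touched.
scratch=$(mktemp -d)
touch "$scratch/old.log"
mkdir -p "$scratch/old-dumps/nested"

# Remove a single file.
rm "$scratch/old.log"

# Remove a folder and everything inside it. There is no undo.
rm -r "$scratch/old-dumps"
```

On the appliance you would substitute the real paths. Listing them with ls first is a cheap way to confirm what a wildcard is about to match.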

A third option is actually my preferred choice, but it doesn’t involve using the command line at all. This option is to use a program like WinSCP to connect to the appliance and delete the files via the GUI. This is a good thing in my opinion, as there’s less risk of accidentally deleting a folder, and it’s much easier to delete multiple files at once.

If you’ve set up a vCenter 6.0 appliance or a Platform Services Controller and tried to connect via WinSCP, you will have noticed the following error:

Host is not communicating for more than 15 seconds.  Still waiting…


This error arises because vSphere 6.0 appliances now come with two shells: the appliance shell (which is the default shell for the root user), and the BASH shell.  WinSCP throws the above error when the root user is configured to use the appliance shell.  The error is easily resolved by configuring the root user to use the BASH shell.

To do so, connect to the appliance via SSH or the console and enter the following commands:
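As I understand KB 2100508, the sequence is: at the appliance shell prompt, type shell.set --enabled True and then shell to drop into BASH, and from there run chsh -s /bin/bash root. The first two are appliance shell commands, so they appear only as comments in the sketch below. The function name use_bash_for_root and the guard are my own additions, there to stop the snippet changing anything on a machine that isn't a 6.0 appliance.

```shell
# At the appliance shell prompt (not BASH), first enable and enter BASH:
#   shell.set --enabled True
#   shell

# Then, from BASH, make it root's login shell.
# use_bash_for_root is a hypothetical wrapper name.
use_bash_for_root() {
    # /bin/appliancesh should only exist on a vSphere 6.0 appliance.
    if [ ! -x /bin/appliancesh ]; then
        echo "no /bin/appliancesh here; leaving root's shell alone" >&2
        return 1
    fi
    chsh -s /bin/bash root
}
```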

Problem solved!

After you’ve done what you need to do with WinSCP, you can change the default shell back easily enough using the command below. However, I’m not aware of any downside to leaving it set to the BASH shell (and the upside is you won’t need to manually change the shell every time you want to connect with WinSCP).
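To the best of my knowledge, the command to switch back is chsh -s /bin/appliancesh root. Here it is as a sketch with a guard around it; the function name restore_appliance_shell is hypothetical.

```shell
# Put root back on the appliance shell. Run from BASH as root.
# restore_appliance_shell is a hypothetical wrapper name.
restore_appliance_shell() {
    # Refuse to point root at a shell that does not exist on this machine.
    if [ ! -x /bin/appliancesh ]; then
        echo "no /bin/appliancesh here; nothing to restore" >&2
        return 1
    fi
    chsh -s /bin/appliancesh root
}
```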

VMware published a KB article (KB 2100508) in March 2015 on the subject, but if you’re seeing this error for the first time, chances are you have no idea what the root cause is, so good luck finding the solution through Google. Hopefully this helps!