Monday, December 17, 2007

Performance Monitoring: Correlations

Although I have been arguing that performance monitoring and capacity planning require a decent server monitoring environment, they also require more. This extra part comes from the fact that services often depend on each other. A web service connects to a database (hosted on a different server) and fetches data from the file server (local, or SAN/NAS). Often, one part in the chain is a bottleneck for the process as a whole. This is a shame and can be avoided by careful analysis of the correlations between performance data.

Again, this is an argument in favor of what I called a 'load profile' earlier. By modeling a server by means of a load profile, we get a representation of that server in terms of measurable quantities. Statistics and mathematics in general can then help us analyze the correlations between those load profiles.
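
To give an idea of what such an analysis could look like, here is a minimal sketch in Python that computes the (Pearson) correlation between the CPU profiles of two servers; the series names and numbers are made up for illustration:

import statistics

# Hourly CPU utilization (%) of two servers over the same period (hypothetical data)
web_cpu = [12, 15, 40, 75, 80, 78, 35, 20]
db_cpu = [10, 14, 38, 70, 85, 80, 30, 18]

def pearson(xs, ys):
    # Pearson correlation coefficient between two equally long series
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print("correlation web <-> db:", round(pearson(web_cpu, db_cpu), 2))

A correlation close to 1 means the two servers peak together, which is exactly the kind of dependency we want to spot before putting them on the same (virtualization) host.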

Performance Monitoring: More about Peaks

In a previous post, we talked about averages (the different types of averages) and peaks, and how peaks can tell you something about the spread (variance, standard deviation) of the data.

Information about peaks is required (especially in capacity planning situations) to understand the sizing of the platform you're running on. On the other hand, having a peak utilization of (say) 80% and an average of 20% still does not tell you that much: how long was the system running at high CPU levels? Maybe only for 10 seconds during the day (a scheduled database operation, backup procedures, etc.)? Is it crucial for our service that this high level of CPU can be guaranteed at that moment, or is it acceptable to let the application/server wait a little longer for CPU? Think of a mail server, for instance, where it wouldn't be a big deal if the server forwarded your mail a few milliseconds later or earlier (would it?).
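
As a small illustration, the sketch below (hypothetical samples, taken every minute) shows how the average, the peak and the 'time spent above a threshold' can tell three very different stories about the same data:

# CPU samples (%) taken every 60 seconds (hypothetical data)
samples = [5, 8, 12, 95, 97, 92, 10, 7, 6, 9]
interval = 60   # seconds between samples
threshold = 80  # what we consider 'high CPU'

busy = [s for s in samples if s > threshold]
print("average utilization: %.1f%%" % (sum(samples) / len(samples)))
print("peak utilization: %d%%" % max(samples))
print("time above %d%%: about %d seconds" % (threshold, len(busy) * interval))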

Basically, what we need is a load profile for a server. A load profile contains information like:

  • Load during hour, day, week, month (or any other relevant period for this server)
  • Expected response times instead of observed response times (basically, a cutoff on the resources)
  • Current hardware inventory
  • Current 'scaled' hardware inventory (20% CPU usage on a quad core is not the same as 20% on a single core; a scaled inventory takes that into account and enables easy comparison of systems; see the sketch after this list)
  • Etc.
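
As a small sketch of what 'scaling' could mean: express CPU usage in consumed MHz instead of a percentage, so that systems with different hardware can be compared directly (the inventory numbers below are made up):

def consumed_mhz(cpu_percent, cores, mhz_per_core):
    # Translate a utilization percentage into the number of MHz actually consumed
    return cpu_percent / 100.0 * cores * mhz_per_core

# 20% on a single-core 3.0 GHz box versus 20% on a quad-core 3.0 GHz box (hypothetical inventory)
print(consumed_mhz(20, 1, 3000))  # 600 MHz
print(consumed_mhz(20, 4, 3000))  # 2400 MHz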


More about load profiles later...

Friday, December 14, 2007

Performance Monitoring: Averages and Peaks

Now that we are into the topic of performance (or capacity) monitoring and planning, let us continue with something that has kept me busy for the last couple of days: averages of performance data (and other statistical information) versus peaks.

This goes back to a classic textbook example in statistics, where the mean value of a series of data points is a poor representation of the data itself. Consider the series of data in the table below:


X Y
A 1
B 2
C 4
D 7
E 100
F 4
G 9
H 7
I 3
J 5

These may, for instance, represent scores (0-100) given to students (by a very strange teacher). In a graph, this is presented below:



The red line represents the average value. It is clear that everyone (except the teacher's little friend with 100 points) is below the average. As a consequence, the average is not a good representation of the data as a whole. Some say it is too strongly influenced by extreme values. In fact, this average (the sum of the data values divided by the total number of data points) is called the arithmetic mean. There is another notion of 'an average' called the 'geometric mean'. In the example above, its value would be 5.4, which is much more representative. The median would describe the data set even better, but that would lead us too far.

Basically, the fact that the arithmetic mean does not give a good indication of the data set is caused by the large spread of the data. In statistics, there is another indicator for this: the variance, or standard deviation. It is a measure of how close together or far apart the values are. In our example, the (sample) standard deviation would read 30.2. Suppose the value of E were 10 instead of 100; the standard deviation would drop to about 3. In other words, the lower the standard deviation, the closer together the data points are.
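
For those who want to check the numbers: the Python snippet below reproduces them using the standard statistics module (statistics.geometric_mean requires Python 3.8 or later; the standard deviation shown is the sample standard deviation):

import statistics

scores = [1, 2, 4, 7, 100, 4, 9, 7, 3, 5]

print(statistics.mean(scores))            # arithmetic mean: 14.2
print(statistics.geometric_mean(scores))  # geometric mean: about 5.4
print(statistics.median(scores))          # median: 4.5
print(statistics.stdev(scores))           # sample standard deviation: about 30.2

scores[4] = 10                            # replace the extreme value 100 by 10
print(statistics.stdev(scores))           # about 3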

The above brings me back to performance monitoring. Somehow, I want to summarize the performance data of a server by a small set of indicators (averages over time etc.) that give a reasonable picture of the actual performance, or in other words: that are representative of the system's actual performance. Typically, when looking at the percentage a CPU is used over time, we see fluctuations similar to the figure above (don't believe me? below is an actual example of a system with some high peaks that otherwise sits idle most of the time; this server has 4 cores, by the way). We conclude that it therefore does not make much sense to look at simple averages of the performance counters to get an idea of the behavior of the system.



Coming back to VMware Capacity Planner: the tool keeps track of the average value (yes, the simple average that does not say much) and internally also uses the geometric mean, but not the variance (as far as I can tell). From that perspective alone, the performance data gathered with this tool would be of little value. Luckily, it also keeps track of peak values (and calculates averages over these peaks, but that is another story). Comparing the peak values with the average tells us a lot about the spread of the data points. The system behind the graph above has an average CPU value of less than 20% while the peak CPU utilization is higher than 90%! This tells us that the variance/spread of the data points is large.

In a virtualization/consolidation assessment, these cases have to be taken into account, as we do not want our systems to become unresponsive because they have their peaks at the same time. More about this and other topics later...

Capacity Planning: What to monitor and how to interpret

Capacity planning starts with capacity (or performance) monitoring.

Everybody who is involved in the monitoring of systems will acknowledge that the most difficult aspects in monitoring a server (or set of servers) are:


  1. Finding the proper indicators for the performance of the system (CPU usage, CPU cycles, memory usage, paging, etc.)

  2. Making sure they are queried regularly, but not too often, in order to avoid impacting the performance of the system by monitoring it (a minimal sampling sketch follows this list).

  3. Storing the resulting data

  4. Summarizing, creating views, averaging, etc. (this also depends on what you want to know about the system)

  5. Analyzing, interpreting, etc.
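
As a minimal sketch of points 2 and 3, assuming the third-party psutil library is available (any performance counter API would do), the loop below samples the CPU once per minute and appends the result to a CSV file:

import csv
import time

import psutil  # third-party library, assumed available for this illustration

INTERVAL = 60  # seconds between samples: frequent enough to be useful, rare enough not to hurt the system

with open("cpu_samples.csv", "a", newline="") as f:
    writer = csv.writer(f)
    while True:
        # cpu_percent() measures utilization over the given interval, so it also paces the loop
        cpu = psutil.cpu_percent(interval=INTERVAL)
        writer.writerow([time.strftime("%Y-%m-%d %H:%M:%S"), cpu])
        f.flush()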


Did I say the most difficult aspects? Are there any other aspects? Well, not really... capacity monitoring (and planning as a further step) is not an easy task:

  • Are you aware of the utilization of your systems? Even of your workstations?

  • Would you have any idea how many of your servers could be placed on a virtualization platform with a specific set of hardware characteristics?

  • Would you know when your mail server had the hardest time managing mailboxes over the last couple of weeks?


Probably the answer is 'no'. Maybe the answer is 'I don't care'?

Most companies do care, because of several reasons: cost, manageability, flexibility, scalability, environment, space, etc.

There are already some players on the market: VMware Capacity Planner (see earlier posts), PlateSpin PowerRecon, Veeam Monitor, etc. I'm mostly used to VMware Capacity Planner (VMCP) but recently, I have also evaluated both PowerRecon and Veeam Monitor. More about this later.

VMware Capacity Planner: taking the data offline (Part 3: Getting CSV data)

This is part 3 of a series concerning VMware Capacity Planner.

First a general remark about the update to version 2.6 of the tool: it was not always pleasant to use the web interface last week because of downtime (scheduled maintenance), errors in the interface, latency, etc. It all feels normal again, and some of the new features were definitely worth it.

Back to business: What interests me most when it comes to downloading data from the website is the pure performance data. VMware averages this information over a week, so data can be downloaded on a weekly basis.

I'm not going into the full details of everything that can be tweaked, configured or queried, but most of it will be clear when looking at the actual command line used to get a CSV file. Most important is understanding that some of the choices you make are based on the session that is open on the server, and are not selected using POST or GET. Some of these options can be configured in the HTML view, but not in the export view (although the syntax is available), so we have to apply a trick: first request the HTML view for the type of data we want, and then export the data.

Below, you find a typical command-line to download the core VMware statistics for week 45 of a company with ID 1234:


wget --keep-session-cookies --load-cookies cookies.txt --save-cookies cookies.txt \
  --post-data 'CID=1234&YearWeek=200745' --no-check-certificate -O Stats.csv \
  'https://optimize.vmware.com/inv_report.cfm?page=ViewStatsMain2.cfm&action=export&section=View&sort=ObjectName&sortDir=ASC&YearWeek=200745&Mon=2&ISO=2&opt=P&PerfGroupID=1&grpt=S&HostType=0&menutab=Performance'


Note that quite a few things have to be specified: week, company, type of average, type of metrics, system group, etc.
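
For the curious, here is a minimal sketch of how the export could be looped over several weeks by shelling out to wget with the same parameters as above (the week numbers and company ID are placeholders; this is an illustration, not the full script):

import subprocess

BASE = "https://optimize.vmware.com/inv_report.cfm"
CID = "1234"  # company ID, as in the example above

for yearweek in ["200743", "200744", "200745"]:  # whichever weeks you need
    url = (BASE + "?page=ViewStatsMain2.cfm&action=export&section=View"
           "&sort=ObjectName&sortDir=ASC&YearWeek=" + yearweek +
           "&Mon=2&ISO=2&opt=P&PerfGroupID=1&grpt=S&HostType=0&menutab=Performance")
    subprocess.call([
        "wget", "--keep-session-cookies",
        "--load-cookies", "cookies.txt", "--save-cookies", "cookies.txt",
        "--post-data", "CID=" + CID + "&YearWeek=" + yearweek,
        "--no-check-certificate", "-O", "Stats_" + yearweek + ".csv",
        url,
    ])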

Drop me a line or leave a comment in case you want to know more about the precise parameters that are important or in case you're interested in my little script that does the magic for me.

This concludes the practical part of this series. I plan to write more about capacity planning and capacity monitoring in general later...

Wednesday, November 21, 2007

Word vs. Excel

I stumbled upon the iWork suite (Apple's office suite for the Mac) some time ago already, but thought back to it today while thinking about Excel and Word...

The main problem I have with Excel and Word is that there is nothing in-between. Let me explain what I mean by means of some examples:

Text-based tables: Excel (like most spreadsheets) is a numerical calculation sheet application. Although it has some functions to work with strings, that is not the primary goal of the application. This is fine, but in practice, we see that a lot of people (and even big companies) use it to store text-based data (e.g.: IP ranges, application overviews, user accounts, passwords, etc.)

Side-by-side comparison: this is in fact an example of the previous point: how often do we want to compare the pros and cons of something? Excel is not really optimized for this, but a table in Word is even worse.

Documents with a lot of tables: this really is on the boundary of both products. In order to keep the exact formatting as set in Excel, I usually copy/paste the table into Word as a picture. This is not very efficient when some numbers in the tables have to change. Using the OLE features with a conventional copy/paste is usually not an option, as the table in Word does not look half as nice as it should.

Reports: this is again a special case of the previous point: sometimes the data in the tables is expected to change because it contains, for instance, KPI information or extracts from a database. Using Excel to nicely print a paper/PDF report is hard; copying all the tables and graphs to a Word document every day or week is even harder. In practice, I solve this by using a macro that automatically creates a Word document and pastes the tables and graphs (as pictures) where they belong. This approach has been the most successful for me up till now.
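
As an illustration of the idea (not my actual macro), here is a minimal sketch using the pywin32 COM bindings from Python; the workbook path, sheet name and range are made up:

import win32com.client  # pywin32, assumed installed

excel = win32com.client.Dispatch("Excel.Application")
word = win32com.client.Dispatch("Word.Application")

wb = excel.Workbooks.Open(r"C:\reports\kpi.xls")  # hypothetical workbook
ws = wb.Worksheets("KPI")                         # hypothetical sheet name
ws.Range("A1:D10").CopyPicture()                  # copy the table as a picture
doc = word.Documents.Add()
word.Selection.Paste()                            # paste the picture into the new document
doc.SaveAs(r"C:\reports\kpi_report.doc")

wb.Close(False)
excel.Quit()
word.Quit()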

It seems Numbers (part of iWork) makes a step in the right direction. Take a look at some of the demos on the page and see for yourself: graphics, text and tables are mixed easily.

Is this another 'sign' I have to switch to the Mac?


Friday, October 26, 2007

VMware Capacity Planner: taking the data offline (Part 2: Cygwin & wget)

I started thinking about avoiding the manual exports from the VMCP website and remembered I had used wget in the past for this kind of stuff. The difference this time was that I needed to log in to the website before being able to get data from it.

Logging in to the VMCP website is done using a POST request, and luckily wget supports that. The next thing is to understand what the POST data needs to be. This can be fetched from the source of the logon page:


The relevant POST data is derived to be

fuseaction=Security
Page=users_login.cfm
username=???
password=???


Since I run Windows XP on my laptop, I use Cygwin to run wget. Cookies are used to store a session ID for your logon session, so wget has to be told to store those cookies in a file. This is the resulting command line:


wget --keep-session-cookies --save-cookies cookies.txt 'https://optimize.vmware.com/index.cfm' --post-data 'fuseaction=Security&Page=users_login.cfm&username=???&password=???' --no-check-certificate -O output.html


Parsing the output.html file, you should be able to see whether the logon was successful. Success depends on many factors (local proxy, network settings, username/password, etc.). You can get more info by adding suitable options to wget.

This concludes part 2 of this series of articles. In this part, we used wget to logon to the VMware capacity planner website.

Thursday, October 25, 2007

VMware Capacity Planner: taking the data offline (Part 1: Introduction)

Lately, I have been involved in some VMware Capacity Planner (VMCP) tracks. VMCP deals with monitoring a set of (physical) servers in order to assess possible virtualization scenarios.

VMCP works in the following way: a monitoring server is set up with a tool that does (but is not limited to) regular performance measurements. This tool sends the data to a VMware database over the web (through a secure channel). Via a web interface, one can then query the information, get reports, view trends, show graphs, and configure and create consolidation scenarios. Usually, we let the system run for around 4 weeks to get a realistic idea of the performance characteristics.

What I like about VMCP is that the data is not located at the customer site and is available at all times (once it has been uploaded). This gives me the opportunity to regularly check on the status of the performance measurements.

The biggest disadvantage of VMCP is that the web interface is not the most flexible and fast interface around. Some things I would like to do are not available (lack of flexibility) but could easily be done in, e.g., Excel, and at times when everyone in the world is awake, it takes ages to refresh a page and get the result of a query. Moreover, it is not easy to get good-looking information to paste into a document.

When it comes to writing a report, the customer is obviously not only interested in a statement like: you will need 5 ESX servers of type X to cope with the load. Therefore, I like to add tables with the most useful metrics (CPU, network, disk I/O, ...) for later reference. I add this information as an appendix.

This is where I would spend at least half a day exporting CSV files from the VMCP website, loading them into Excel, laying them out as nice tables and pasting them into the document. I started thinking about automating some of the required steps, and I have covered the most time-consuming one already: exporting the information from the website as a CSV file.

In the following part, I'll explain how I started this little adventure...

Wednesday, October 03, 2007

Friday, September 28, 2007

VCB for NFS

Read this post: http://storagefoo.blogspot.com/2007/09/vmware-on-nfs-backup-tricks.html. It discusses a free alternative for VMware Consolidated Backup, but over NFS instead of Fibre Channel (SAN).

Wednesday, September 26, 2007

The thing we want Softgrid to do (Active Upgrade+)

This entry started as a comment on
http://blogs.technet.com/softgrid/archive/2007/09/25/methods-for-upgrading-or-updating-virtualized-applications.aspx

The thing with Active Upgrade is that, as the article points out, the user gets the update automatically on the next launch of the application. This is a major step forward for application updates. However, it does not solve the whole application testing problem: before pushing changes to the users, I need to test the application. In order to test, both versions need to coexist, but with Softgrid it is not X and Y but X or Y. The fact that a rollback to an earlier version of an active upgrade is not possible adds to the complexity.

What I basically want is:
1) Make a parallel application branch Y (as it is called in the article) from version X
2) Test this branch (without impacting the existing version)
3) Update the parallel application branch if required (using active upgrade)
4) When tests are successful, use Active Upgrade to update the production version X

This scenario is inherently not feasible because in stage 1, a new asset directory and a new GUID are created, so that settings are stored in a different location.

The scenario that comes closest to what I want is the following:
1) Make a parallel application branch Y
2) Test this branch
3) When tests are successful, apply exactly the same procedure to the original version X, but this time using active upgrade.

The main issue in this case is the fact that you need to record the exact upgrade scenario in order to apply it again in step 3.

Another workaround exists, though:
1) Make an active upgrade
2) Test this update by manually importing the SFT file on the client (instead of streaming it)
3) When successful, do the update centrally with active upgrade.

I hear you thinking: but we have a test environment in order to do application tests. OK, right, but do you have a test copy of every production database, to name just one example of why production tests may be required?

Monday, September 17, 2007

File Versioning

File versioning is something I have been interested in since I started out doing programming for my simulations back in university. Later, I was lucky enough to be able to use (La)TeX to type my PhD thesis, which enabled me to use CVS to track changes to my text.

Years later, I'm less lucky in that I do not use LaTeX anymore (nor do I program in C ;-), but I do still want my documents and scripts to be versioned, especially when actively working on them. I used to create copies for major versions of Word or PowerPoint documents. In the end, I deleted most of the intermediate copies because they were no longer relevant.

But now, everything has changed dramatically... due to FileHamster. FileHamster is a versioning tool that automatically tracks changes to files and directories and allows you to keep those changes, revert to earlier versions, make backups and, by means of some handy plugins, add notes and remarks to versions. And... it is free!



Behind the scenes, the tool creates a copy of every version you save, so you do use a lot of space for big files, but you cannot blame a tool for the fact that most of the files we work with nowadays are composed of binary data!

Friday, August 24, 2007

ESX 3.0 on Workstation 6 (Update)

By accident, I found out that a series of white papers has been published on the website of XtraVirt around this theme, ranging from installing ESX in a VM to creating a NAS virtual machine for hosting your virtual-virtual machines, doing VMotion, etc. Interesting...

ESX 3.0 host as NAS server

A client asked me to configure one of their ESX hosts as an NFS server for hosting ISO files. This host had 300GB of remaining disk space that would otherwise just go to waste. So, I started right away: creating an export directory, formatting the file system (converting VMFS to ext3), testing mounting it from a different ESX host... no problem!

Until I tried to import the new volume as a datastore via the VI Client: it complained that the NFS server does not support NFS v3 over TCP. Although NFS over TCP has been around for ages, it appears to have been stripped from the NFSD module in the service console. Is this done on purpose by VMware to make sure you cannot easily run a standalone ESX host with shared storage out of the box?

Anyhow, by recompiling the NFSD module (luckily VMware has added the source for the kernel) one can make it work. See this document (in Dutch) for a procedure on how to make this work. Works like a charm!

Tuesday, August 21, 2007

ESX on VMware Workstation 6

I installed ESX 3.0 on VMware Workstation 6 today. People have done this before without too many issues, just some important notes:
- Make sure your processor supports VT (or the equivalent for AMD), because otherwise performance is extremely poor.
- Enable VT support in your BIOS (my Dell Latitude D620 had this feature disabled by default - duh!).
- Make sure your virtual disk is 4GB
- Choose a combination of LSILOGIC and SCSI

Doing this, installation of ESX is straightforward and takes about 30 minutes (rough estimate). I've seen messages from people that claim to have virtual machines up and running (with VMotion and all). This is the next step...

Keep you posted...

Monday, August 20, 2007

Disable MMC through GPO

I've been looking for an hour now for the policy setting that prohibited me from launching the SQL manager MMC snap-in (packaged using Softgrid, by the way). Stupid: it was there all the time:



I was checking and testing different path rules in the Software Restriction Policies section but forgot there is an administrative template section specifically for the MMC...

Citrix buys XenSource (I know, it's old news!)

Sure I know this is old news, and people have been discussing this fact extensively over the last couple of days.

My colleague Michel Roth has written an interesting article on his website (see here) discussing a possible explanation of the broader picture of the whole deal. Interesting...

Friday, August 17, 2007

Renaming several Files at once

Freeware products exist for renaming files in batch; think of pictures related to the same theme or taken on the same day, for instance. It turns out no additional software is required, because Windows Explorer on XP can handle it. Here's how it goes:

Select all the files you want to rename:


Press F2 (or right-click on a file and select Rename). Give the files a new name and ...


Done !


Please note that file extensions are not handled properly!

Friday, July 20, 2007

Softgrid: Machine vs User Cache on the Client

Using Softgrid, applications are virtualized and packaged into (basically) one file, streamed to the user and launched locally on the client. On the client, these applications run in a 'bubble' so that they are isolated from other virtual applications. One of the nice consequences of virtualization is that some applications that would normally not be server-based computing (SBC) compliant can be made so. This is a consequence of the fact that a user cache exists on the client.

Let me give the example of an INI file that is used by application X to store user-related data. Normally, on a terminal server, this INI file would be shared by all users and thus only one user would be able to use the application at a time. Using Softricity, this INI file is virtualized and configured as 'user configuration' (see also here). This way, changes to this file are stored in a cache specific to the user (in the user's profile). Every user has their own version of the INI file and no conflicts can arise.

Setting the INI file as 'user configuration' is done automatically by the sequencer during the sequencing process, because the sequencer has certain rules saying 'this type of file needs to be user-specific'. In my case, the files that need to be user-specific are not yet created at the time of sequencing (they are only created later, during normal usage of the application). No problem, you say?! When a file is created inside a virtual directory (a directory that does not exist locally on the client, but comes with the virtual environment), it is stored in the cache. Yes, but in the machine cache, not the user cache!

What I would need in this specific case is the possibility to define which types of files need to be cached machine-wide and which per user, not on the sequencer (see the SeqTypes tool by VirtualApp) but on the client. Or in other words, I would like to have the SeqTypes tool available on the Softgrid client. Does anyone have experience or more information on this?

Friday, June 22, 2007

ITWorks Virtualization Seminar

Two days ago, I spoke at a seminar on virtualization (http://www.eyeon.be/event.php?id=VIRTD1). Together with Wim van Balen, I gave an overview of different virtualization types (from hardware virtualization to application virtualization) and various examples of tools per type. This is a big scope, and perhaps too big for one afternoon.

One thing I would have liked to do is put VMware ESX next to Xen (and possibly Virtual Iron) and compare the products on different levels. This, however, is hard because unfortunately I only have theoretical knowledge of the latter two. Ideally, I would like to sit together with some people who know more about Xen and discuss the strong and weak points of both. In the end, this is what an IT manager in the field is asking for... Any volunteers?

Friday, June 15, 2007

ScribeFire again

Did I mention I'm using ScribeFire to post these blog entries? The thing that drives me crazy is the following: (.  ). I'll have to spend some time finding out how to get rid of this.

To use the mouse or not?


Some time ago, I installed Enso (http://www.humanized.com) and fell in love with its easy and fast way of working. Tools like this can really give your productivity a boost. There is one disadvantage for poor consultants like me: Enso is not free (a little less than 20 dollars). Nevertheless, this company is one to keep an eye on (check their blog as well while you're at it), as other great ideas are under way.

This morning, I noticed that a plugin for Launchy (open source) has been created that enables you to switch to already open windows. Launchy is free and the plugin as well, so this means there is a free alternative to Enso. Well, not completely, but I'll leave it up to you to take a look at both products and find the differences.



Thursday, June 07, 2007

Google and Virtualization

Check this post: http://www.virtualization.info/2007/06/google-acquires-application.html



Not much can be found on the website, but from the description, the idea sounds familiar...

Tuesday, May 15, 2007

How Windows comes closer to Linux ...

Some days ago, I got inspired by the following topic: "Things you can do in Linux but not Windows" (click to read).

For the larger part, I agree with the author. Some remarks, however. One of the points the author makes is:

Take my settings with me where ever I go. In Linux, all your personal settings are stored in your Home folder, most in folders that begin with a period (like .gaim). So, I can copy all these settings from one computer to another. I can put these settings on a USB drive. When I switched from Gentoo to Ubuntu, I kept all my settings. On Windows, some settings are under your home folder and some are in the registry. So your settings are not portable.

True of course, but not the final word: there is a way around this... storing the registry information in a file. This can be done in several ways, but one of the most flexible ones is probably using an OPS file by means of the Office Resource Kit (a free download).

Interested? Check out the Flex Profile Kit (Login Consultants), which does exactly that for a TS/Citrix environment but can easily be used in a desktop environment as well. In principle, even a registry export/import would do...

This does not mean I don't like the UNIX way (quite the contrary!)... but as long as we can make Windows behave a little bit the same, we are already happy!

Monday, May 07, 2007

Good and bad applications


I'm doing some Solution4 packaging these days at a client's site. You'll probably ask "what the heck is Solution4"? Take a look at www.loginconsultants.com for more info. In short, it is our own script-based installation and maintenance framework that is primarily used for TS/Citrix environments.

When talking about packaging, we talk about getting applications installed on the Citrix box and making them ready for users to use. And, sadly enough, some applications can really spoil your day (or even days). Take for instance the free Autocad file viewer DWG True View. The installation medium comes with a ready-made MSI, but... you have to tweak the MSI tables to get the thing installed properly. Or, to give another example: the installation gives a strange error if "%ProfileFolder%\Application Data" does not exist.

There are also nice applications, luckily. For people living in Belgium, installing ITT Promedia is very easy: just copy the full contents of the CD and start it. Easy isn't it?!





Tuesday, April 24, 2007

Altiris SVS

I'm starting to become a real fan of the Software Virtualization Solution (SVS) from Altiris (now owned by Symantec).

I have already been running 'Launchy' using SVS for some time. Today, I tried running the new VisionApp Remote Desktop tool using SVS. Since I did not have the correct .NET version, I had to somehow make that version available for the installation wizard. Here's the procedure I followed:
  1. Download .Net v2 (http://www.microsoft.com/downloads/details.aspx?familyid=0856eacb-4362-4b0d-8edd-aab15c5e04f5&displaylang=en)
  2. Close Outlook if it is open.
  3. Create a new layer (VisionApp RD) and install the downloaded executable.
  4. Test the layer.

  5. Download the VisionApp RD tool (http://www.visionapp.com/141.0.html?)
  6. Select to update the layer created before (we will change the name later).
  7. Install RD
  8. Start the layer to test the installation
It works perfectly! This is also the first time I have tested RD from VisionApp and I can tell you it is amazing, especially when you visit different environments all the time and don't want to remember IP addresses and account information.

Tuesday, April 03, 2007

Bloody Applications

This is an interesting one. Below, you see first a screenshot of a PowerFuse shortcut that does not work and after that a shortcut that works. Working means that the application starts up without errors.

Do you see the difference? OK, I should have avoided the active cursor in the window, as it is a real spoiler... This is due to the fact that the application itself checks whether it is started from the proper location, to make sure a valid license is used. This check is based on the startup command, and the string comparison is case sensitive...
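
A tiny illustration of the kind of check that bites here (the paths are made up): the Windows file system does not care about case, but a naive string comparison does.

expected = r"C:\Program Files\Vendor\App\app.exe"  # what the license check expects (hypothetical)
launched = r"C:\Program Files\Vendor\App\APP.EXE"  # what the shortcut actually passes (hypothetical)

print(launched == expected)                  # False: the case-sensitive comparison fails
print(launched.lower() == expected.lower())  # True: a case-insensitive comparison would have worked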

Not working:

Working:




Tuesday, March 27, 2007

Test of ScribeFire

If this works, it will be faster to upload something to my blog...

Friday, March 23, 2007

SAN Design for VMware

I stumbled upon an ebook from VMware today: SAN System Design and Deployment Guide. I just browsed through it and it seems a very interesting read about SAN infrastructures, i.e., one of the main prerequisites when implementing and using VMware ESX.

Tuesday, March 13, 2007

VMware Training

I'm currently on a VMware training session (Virtual Infrastructure 2.0, ESX 3). This is really interesting, especially because I can refresh my Linux background a little bit.
