Tuesday 4 December 2012

The dangers in cherrypicking data

This post follows on from my previous one, where you may recall that I was worrying about having done something wrong.

So that you don’t necessarily have to read it again, I was discussing the ludicrous scenario where an investor switches, at the start of each year, into the asset class that performs the best over the whole of that year, resulting in frankly awesome returns.  A scenario, by the way, that had made it into the newspapers.

I chose to ignore this unbelievable scenario.  Not unbelievable because of the returns (they’re there for the world to see), but unbelievable because, as I pointed out, one has to either be magically psychic to use this strategy or be in a position to influence the world’s financial markets.

My fears about having botched this exercise were not entirely mistaken, even though I had done everything right with the data that I had.  Let me digress for a moment and point out a couple of things that I didn’t mention.

Firstly, the table from The Advertiser was awe-inspiringly wrong in its assessment.  Usually newspapers like to beat things up, but the Addie managed to really blunder – if you invested $10,000 at the start of the period and it grew to $37,000-odd, your money wouldn’t be doubling, it would be nearly quadrupling.
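For anyone who wants to check the arithmetic, here is a minimal Python sketch.  It uses the rounded figures quoted above ($10,000 growing to roughly $37,000); the function name is my own invention, and the end value is an approximation of "$37,000-odd".

```python
import math

def growth_multiple(start_value, end_value):
    """Return how many times over the starting amount has grown."""
    return end_value / start_value

# Rounded figures: the table only tells us "$37,000-odd".
start, end = 10_000, 37_000

multiple = growth_multiple(start, end)
doublings = math.log2(multiple)  # how many times the money has doubled

print(f"Multiple of starting amount: {multiple:.1f}x")
print(f"Number of doublings: {doublings:.2f}")
```

That comes out at a multiple of 3.7, or a little under two doublings, which is a long way from "doubling".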

Secondly, the ‘conventional wisdom’ that I referred to has a bias built into it: the asset classes that should post the really bad results (over the long run) are also the ones that are likely to post the best results.  By presupposing that one follows a pattern where one invests in an asset class that has underperformed in previous years, one is necessarily sparing oneself quite a lot of the really bad results, not by design, but more by assisted good fortune.

Lastly, however, do you remember when I pointed out the pitfalls of cherrypicking data?  Well, I did that.  My sample sizes were too small – in fact, I only looked at four ten-year periods, all of which ended in the noughties.  So really, yes, I was looking at data that was consistent, just not terribly meaningful data.  A bit too consistent, if you like.

Cherrypicking, incidentally, is precisely what the Addie did to create this story.

So I stretched this out to include the entire 30-year dataset contained in Vanguard’s wall chart of 2010.  Remember, Vanguard obviously think that this is enough.  I don’t, but I’ll show you what it turns up anyway.  To this, I also added the data from the 2012 wall chart.  That gave me a good 32 years of data.  I had to get some more, but I’ll come to this shortly.  Also, I’ve only used very basic statistics here.  I’m not a statistician, nor am I going to bother running this past someone who is.

The dataset includes figures on 9 asset classes: Australian shares, international shares (unhedged), international shares (hedged), US shares (unhedged), Australian bonds, international bonds (hedged), cash, Australian listed property and international listed property (unhedged).

Why the international bonds are hedged and equivalent unhedged figures have not been provided, I’ll never know.  Well, I could contact Vanguard, but I won’t get a straight answer.  I have a fair idea why there is no hedged international property and it’s all to do with availability – this is not a problem that unhedged international fixed interest should have.

The data is full financial year returns and makes no allowances for fees, costs, brokerage, taxes etc.  To put that another way, it assumes that you held an asset from 1 July to 30 June and faced no costs of any nature.

The returns are measured by index performance – don’t worry, I’ve checked them and they’re reasonably good.  International performance is measured as the world (including the US) ex-Australia.  On this subject, I deleted the US shares column entirely for two reasons:

  1. US shares make up about a half of the international shares index; and
  2. Superannuation funds rarely provide US shares options to their members.

International bonds (hedged) only goes back as far as 1986 and international listed property (unhedged) only goes back as far as 1991.

The first thing that I did was work out the rolling 10-year averages, going back as far as the 1981-1990 period.  This turned up some rather interesting results.  Remember, we’re trying to see if one is advantaged by picking the previous year’s best result and going with that asset class for a year, or picking the previous year’s worst result and going with that for a year.
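For the curious, the two strategies and the rolling average can be sketched in a few lines of Python.  This is only an illustration of the idea, not the spreadsheet I actually used: the returns below are made up (they are not the Vanguard figures), the function names are hypothetical, and I'm assuming a plain arithmetic average rather than a compound one.

```python
def rolling_average(values, window=10):
    """Arithmetic average over each run of `window` consecutive years."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

def follow_last_year(returns_by_year, pick):
    """Each year, hold the asset class that `pick` selects from LAST year's returns.

    returns_by_year: list of dicts mapping asset class -> that year's return (%).
    pick: max to chase last year's best, min to chase last year's worst.
    """
    results = []
    for prev, curr in zip(returns_by_year, returns_by_year[1:]):
        chosen = pick(prev, key=prev.get)  # decided on last year's numbers only
        results.append(curr[chosen])       # what that choice earned this year
    return results

# Made-up returns purely for illustration, NOT the Vanguard data.
sample = [
    {"shares": 12.0, "bonds": 6.0, "cash": 4.0},
    {"shares": -8.0, "bonds": 9.0, "cash": 5.0},
    {"shares": 20.0, "bonds": 3.0, "cash": 5.0},
    {"shares": 7.0, "bonds": 8.0, "cash": 4.0},
]

chase_best = follow_last_year(sample, max)
chase_worst = follow_last_year(sample, min)

print(chase_best)                              # [-8.0, 3.0, 7.0]
print(chase_worst)                             # [5.0, 20.0, 8.0]
print(rolling_average(chase_best, window=2))   # [-2.5, 5.0]
```

Note that the first year of data produces no result, because there is no "previous year" to pick from.  The sketch also assumes every asset class has a figure for every year, which is not true of the real dataset.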

Just to be interesting, I’m going to add in a column of random results – what might one get if one picked asset classes to invest in via a dartboard.  I didn’t use a dartboard, though.  I used a random number generator.
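The dartboard approach looks something like this in Python.  Again, this is a sketch under my own assumptions: the data is made up, the function name is hypothetical, and each asset class is given an equal chance of being picked each year.

```python
import random

def dartboard_returns(returns_by_year, seed=None):
    """Pick one asset class at random each year and record what it returned."""
    rng = random.Random(seed)  # a seed makes the "throws" repeatable
    results = []
    for year in returns_by_year:
        chosen = rng.choice(sorted(year))  # every asset class equally likely
        results.append(year[chosen])
    return results

# Made-up returns purely for illustration, NOT the Vanguard data.
sample = [
    {"shares": 12.0, "bonds": 6.0, "cash": 4.0},
    {"shares": -8.0, "bonds": 9.0, "cash": 5.0},
    {"shares": 20.0, "bonds": 3.0, "cash": 5.0},
]

picks = dartboard_returns(sample, seed=2012)
print(picks)
```

A single run of this tells you very little, of course.  Run it again with a different seed and you may well get a different answer.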

Secondly, I’m going to use a weighted average result as well.  This will be weighted in line with my own invented asset allocation, which I will call Dikkii’s Balanced Fund (I've tried to keep it vaguely in line with super fund default options):
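As a rough sketch of how a weighted average return like this is worked out, here is a Python version.  The weights and returns below are placeholders that I have made up for the example; they are not the actual allocation of Dikkii’s Balanced Fund, and the function name is hypothetical.

```python
def weighted_return(returns, weights):
    """Weighted average of one year's returns; weights must sum to 1."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total}, expected 1.0")
    return sum(weights[asset] * returns[asset] for asset in weights)

# Placeholder weights and returns, purely hypothetical.
example_weights = {"shares": 0.6, "bonds": 0.3, "cash": 0.1}
example_year = {"shares": 10.0, "bonds": 5.0, "cash": 4.0}

balanced = weighted_return(example_year, example_weights)
print(f"Weighted return: {balanced:.1f}%")
```

With these placeholder numbers the result is 0.6 × 10 + 0.3 × 5 + 0.1 × 4 = 7.9%.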



What did I find out?

Most obviously, I had way too little data to make a call.  About anything.  So how the Addie (or the rest of the News tabloids who carry the Money section) can justify serving up this piece of crap to their readers is beyond me.

But when I looked at this 32-year period (which is really only twenty, because we can’t start counting until ten years in), there was some interesting stuff to be found.

But I had to try to stretch this data out.  The problem with the data is that 32 years is still too short to get anything meaningful.  So I went and retrieved as much extra data as I could find.

This was problematic.  To begin with, in the Vanguard data, the property and bonds don’t go back the full 32 years that I had originally, and finding this data for the previous ten years has so far proven elusive.

This has resulted in a compromise ‘Balanced’ portfolio, which I’ve had to construct for the first 10 years as follows:



There are a number of ways to look at this data.  In the following few posts, you’ll see why focussing on any one area is dangerous, because, if you looked at the following graph:




…you’d think that it’s hedged international bonds all the way.  Don’t.  Read the next few posts first.

I'll leave it there for today.
