Great topic -
You are right, Colin - stats are bandied about without people understanding their true significance.
A key thing to note about using statistics is to be clear about the question you are attempting to answer. Here, what you are really asking is 'how confident can I be that the profit/loss performance of this trainer's runners at this course (as represented by the data) will be replicated in future?' - I think that's what you are asking?
In this case a couple of things are important to think about -
Important to note that, in statistical terms, your historical numbers aren't a sample - they are the entire population of that trainer's runners over the period you've looked at. There are therefore things you can say with certainty about that population (e.g. number of winners, prices, etc.).
But what you are looking to establish is the likely statistical relationship between that population (the historical data you have) and a totally separate population (the future results of that trainer).
That is an important distinction - looking at a sample (part of a data set) of a population (the entire data set) is, from a statistics viewpoint, very different to looking at one data set and attempting to predict the make-up of another data set.
In this case your working hypothesis is that the yard have a modus operandi (consciously or not) which means that the horses they run at a certain course are over-priced.
For horse racing prediction the difficulty is that there are so many variables that you cannot establish a clear figure for how many historical data points would be required to give firm or meaningful confidence about how the future will look. That's the basic problem with 'back-fitting' (which is what you are doing here) - there is no guarantee that the future will look like the past, and with so many random variables in play you cannot put any meaningful number (calculated or estimated) on how likely it is that it will.
But there is a useful rule of thumb, which is the 'law of large numbers'. In layman's terms it says, among other things, that the more observations you have, the closer your observed averages will sit to the true underlying ones - and so the greater confidence you can have that the future will resemble the past. The insurance industry is based on this law.
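To make that concrete, here is a rough Python sketch with entirely made-up numbers (a 25% true win rate at decimal odds of 4.2, i.e. a small positive edge of about 5p per £1 staked). The observed profit per bet bounces around wildly over a handful of bets and only settles near the true figure once the number of bets gets large:

```python
import random

random.seed(1)

# Purely illustrative, made-up figures: a bet that wins 25% of the time
# at decimal odds of 4.2, giving a true edge of 0.25 * 4.2 - 1 = +0.05
# per unit staked.
TRUE_WIN_PROB = 0.25
DECIMAL_ODDS = 4.2

def average_profit(n_bets: int) -> float:
    """Average profit per 1-unit bet over n_bets simulated bets."""
    total = 0.0
    for _ in range(n_bets):
        if random.random() < TRUE_WIN_PROB:
            total += DECIMAL_ODDS - 1.0   # win: returns minus stake
        else:
            total -= 1.0                  # loss: stake gone
    return total / n_bets

# Observed average profit per bet for increasingly large samples -
# it wanders a lot early on and settles towards +0.05 as n grows.
for n in (20, 200, 2000, 20000):
    print(n, round(average_profit(n), 3))
```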
You'll find, Colin, that if you look at trainer/course/race type relationships you will turn up hundreds (maybe thousands) of historically positive profit/loss correlations. If you test these you'll quickly find that they are wholly unreliable in isolation - the data sets are just too small. Sometimes they will pay off, but that will be down to chance (or rather, you will have no way of establishing whether it was chance or otherwise, no matter how intuitively appealing the hypothesis may be) and the odds will be heavily against you.
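Here is a quick illustration of why those small data sets mislead. The sketch invents 1,000 hypothetical trainer/course angles with 15 runners each and no edge whatsoever (every bet priced exactly fairly), then counts how many still show a historical profit purely by chance - all the numbers are assumptions for illustration only:

```python
import random

random.seed(2)

# Illustrative only: 1,000 hypothetical trainer/course "angles", each with
# 15 past runners and NO real edge at all - win probability 20% at fair
# decimal odds of 5.0, so the true expected profit per bet is exactly zero.
N_ANGLES = 1000
RUNS_PER_ANGLE = 15
WIN_PROB = 0.20
FAIR_ODDS = 5.0

profitable = 0
for _ in range(N_ANGLES):
    profit = 0.0
    for _ in range(RUNS_PER_ANGLE):
        if random.random() < WIN_PROB:
            profit += FAIR_ODDS - 1.0
        else:
            profit -= 1.0
    if profit > 0:
        profitable += 1

# Even with zero edge anywhere, a large chunk of the angles show a
# historical profit purely by chance.
print(f"{profitable} of {N_ANGLES} zero-edge angles look profitable historically")
```

On a typical run somewhere around a third of those zero-edge angles come out in historical profit - which is exactly the trap.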
One way of improving your odds (reducing the risk) is to apply the law of large numbers and aggregate such instances - find maybe twenty or thirty trainer/course correlations that have worked historically and follow them as a group. That will reduce variation and so reduce the risk.
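A rough sketch of why the aggregation helps. Again using made-up, fair-priced bets (so there is no edge either way and the difference is purely about variance), it compares the spread of results from following a single 15-runner angle against a basket of 25 such angles - the basket's outcomes cluster far more tightly around the true expectation:

```python
import random
import statistics

random.seed(3)

# Illustrative assumptions: every bet wins 20% of the time at fair decimal
# odds of 5.0 (zero expected profit), and each angle contributes 15 bets.
WIN_PROB, FAIR_ODDS, BETS_PER_ANGLE = 0.20, 5.0, 15

def profit_per_bet(n_angles: int) -> float:
    """Average profit per bet from following n_angles angles."""
    total, bets = 0.0, n_angles * BETS_PER_ANGLE
    for _ in range(bets):
        total += (FAIR_ODDS - 1.0) if random.random() < WIN_PROB else -1.0
    return total / bets

# Repeat each strategy 2,000 times and compare how widely the results vary.
single = [profit_per_bet(1) for _ in range(2000)]
basket = [profit_per_bet(25) for _ in range(2000)]

# Aggregating angles shrinks the spread of outcomes (the risk), though it
# does nothing to the expected value itself.
print("std dev, single angle :", round(statistics.stdev(single), 3))
print("std dev, 25-angle mix :", round(statistics.stdev(basket), 3))
```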
But, putting numbers against the risks/confidence levels based on data set sizes - almost impossible. If it were that simple we'd all be rich.