It feels like I owe you an apology BH, but the result is an incredibly good piece of work, so thank you.
My lamentations were (mostly) for comedic effect so an apology is not necessary. As a consequence of this research I have gone from having a faint conception of dosages to knowing a thing or two about the subject and my spreadsheet skills have also seen improvement. Indeed if there is to be gratitude then it should also extend in your direction. The study of juvenile hurdlers is a labour of love and the development of skills and a broader understanding of any subject is very much its own reward. Although if you must feel apologetic then I am sure we can negotiate a retainer of sorts. Gratitude also for Danny as it is always nice when an amusing thought is met with appreciation rather than concern...
What happens when you apply dosage to sires and damsires?
I'm glad you asked...
Given that I already have some data lying around for sires and damsires, looking into the dosage element would be a natural progression. Particularly if, as mentioned in this thread's opening post, it might reveal that certain stallions have an aptitude for this rather niche discipline.
Using those who have sired ten or more juvenile hurdlers since the 2011/12 season, I will first look at the average DIs grouped by the sire's winner to runner strike rate percentage.
________ Total Mean Median
64%-30% 35 1.33 1.11
29%-20% 43 1.59 1.53
19%-10% 42 2.02 1.76
09%-00% 31 1.86 1.61
As with the juveniles themselves, there is also a very discernible pattern with their sires. In broad terms, the lower the DI, the higher the winner to runner strike rate, although this takes a stumble among those with poor strike rates. An examination of this anomaly reveals that the average DI is dragged down by Beneficial, Kayf Tara, Milan, Oscar, Presenting and Scorpion. It would be unfair to suggest that these sires cannot produce juvenile hurdlers as they will have been presented with little to no suitable opportunities. The mares serviced by these stallions will not be given to precocity and the offspring who do run during their juvenile season will almost always be store horses out for experience. As such, when these stallions are removed from the figures, the average DIs for the band read as 2.15 and 1.95 mean and median respectively, which leaves a firmly consistent set of figures.
Interestingly, another stallion who weighs down the 9%-0% average is Kalanisi (1.43) who is without a winner from twelve foals during this time period. However, that he could count the likes of Alaivan, Barizan, Simarian, European Dream and, of course, Triumph Hurdle-winning Katchit among his first crops supports the notion that the manner in which produce is bred and prepared for racing will impact its sire's success rates.
The second table looks at the percentage of offspring to have achieved RPRs exceeding 107. The National Hunt stallions have been removed from this data in advance and while there is not a pronounced difference towards the lower bands, the general pattern can still be seen.
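For anyone wanting to reproduce this sort of banding outside a spreadsheet, a minimal Python sketch follows. The sire names, winner and runner counts and DIs in it are invented purely for illustration and are not taken from my data. Rounding strike rates to whole percentages is also an assumption, and the function names are hypothetical. It groups sires with ten or more runners by strike rate band and can leave named stallions out of the averages.

```python
from statistics import mean, median

# Hypothetical rows for illustration only: (sire, winners, runners, dosage index)
SIRES = [
    ("Sire A", 6, 15, 0.80),
    ("Sire B", 5, 20, 1.40),
    ("Sire C", 3, 20, 2.10),
    ("Sire D", 1, 20, 2.30),
    ("Sire E", 0, 12, 0.90),  # stands in for a stoutly bred National Hunt sire
]

# Strike rate bands as used in the tables (whole percentages, inclusive)
BANDS = [(30, 64), (20, 29), (10, 19), (0, 9)]


def strike_rate(winners, runners):
    """Winner to runner strike rate as a whole percentage (assumed rounding)."""
    return round(100 * winners / runners)


def band_summary(sires, exclude=frozenset(), min_runners=10):
    """Return {band: (total, mean DI, median DI)} for each non-empty band."""
    summary = {}
    for low, high in BANDS:
        dis = [
            di
            for name, winners, runners, di in sires
            if runners >= min_runners
            and name not in exclude
            and low <= strike_rate(winners, runners) <= high
        ]
        if dis:
            summary[(low, high)] = (len(dis), round(mean(dis), 2), round(median(dis), 2))
    return summary


print(band_summary(SIRES))
print(band_summary(SIRES, exclude={"Sire E"}))
```

Comparing the two printed summaries shows how a single low-DI stallion with no winners can pull down the average of the bottom band, which is the effect described above.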
________ Total Mean Median
64%-30% 42 1.36 1.16
29%-20% 47 1.62 1.53
19%-10% 36 2.14 2.01
09%-00% 21 2.04 2.00
This next chart may be of particular interest to those attempting to find the more unheralded National Hunt sires as it looks at the percentage of a stallion's progeny that improves for switching code. The figure is reached by subtracting the lower of the horse's official flat rating or highest flat RPR from the highest achieved jumps RPR and subtracting a further thirty-five pounds. Those with positive figures are assumed to have improved and those with negative figures are not. For the sake of integrity, only horses who have raced more than twice over hurdles AND have achieved an official rating are considered so as to reduce the skewing of the figures by lightly raced sorts.
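The calculation described above can be sketched in a few lines of Python. The horses in the example are made up, the function and field names are hypothetical, and two points are assumptions on my part: a figure of exactly zero is treated as not improved, and the "official rating" requirement is read as the flat official rating.

```python
def improvement_figure(best_jumps_rpr, flat_official_rating, best_flat_rpr, allowance=35):
    """Highest jumps RPR minus the lower flat figure, minus a further 35lb."""
    flat_mark = min(flat_official_rating, best_flat_rpr)
    return best_jumps_rpr - flat_mark - allowance


def qualifies(hurdle_runs, flat_official_rating):
    """More than two runs over hurdles AND an official rating (assumed to be the flat one)."""
    return hurdle_runs > 2 and flat_official_rating is not None


def improver_percentage(horses):
    """Percentage of qualifying horses with a positive improvement figure."""
    eligible = [h for h in horses if qualifies(h["hurdle_runs"], h["flat_or"])]
    if not eligible:
        return None
    improved = sum(
        1
        for h in eligible
        if improvement_figure(h["jumps_rpr"], h["flat_or"], h["flat_rpr"]) > 0
    )
    return 100 * improved / len(eligible)


# Hypothetical progeny of one sire, for illustration only
progeny = [
    {"jumps_rpr": 130, "flat_or": 85, "flat_rpr": 90, "hurdle_runs": 4},    # +10, improved
    {"jumps_rpr": 110, "flat_or": 80, "flat_rpr": 78, "hurdle_runs": 5},    # -3, not improved
    {"jumps_rpr": 125, "flat_or": 70, "flat_rpr": 75, "hurdle_runs": 2},    # too lightly raced
    {"jumps_rpr": 120, "flat_or": None, "flat_rpr": 72, "hurdle_runs": 6},  # no official rating
]

print(improver_percentage(progeny))
```

Only the first two horses qualify here, and one of the two has a positive figure, so the sire would be credited with an improvement rate of 50%.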
________ Total Mean Median
79%-52% 23 1.45 1.48
50%-41% 27 1.72 1.58
40%-00% 33 1.94 1.48
These results show that sires whose progeny are more likely than not to improve for the switch of codes will have a lower DI than those who do not. However, there is a familiar cautionary tale here in that a slight shifting of parameters can quite dramatically alter a statistical narrative.
________ Total Mean Median
79%-52% 23 1.45 1.48
50%-40% 35 1.90 1.72
39%-00% 25 1.75 1.33
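The size of that shift can be worked back from the two tables. The sketch below assumes that the only difference between them is that the sires sitting on exactly 40% move from the bottom band to the middle one, which the unchanged top band and the unchanged overall total of 83 would suggest. The function name is hypothetical, and because the published means are rounded to two decimal places the answer is only approximate.

```python
def implied_moved_group(n_before, mean_before, n_after, mean_after):
    """Size and approximate mean DI of the group that joined or left a band."""
    moved = abs(n_after - n_before)
    total_di_moved = abs(n_after * mean_after - n_before * mean_before)
    return moved, total_di_moved / moved


# Middle band: 27 sires at a mean of 1.72 becomes 35 sires at a mean of 1.90
middle = implied_moved_group(27, 1.72, 35, 1.90)
# Bottom band: 33 sires at a mean of 1.94 becomes 25 sires at a mean of 1.75
bottom = implied_moved_group(33, 1.94, 25, 1.75)

print(middle)
print(bottom)
```

Both bands point to the same thing: eight sires on exactly 40% with a mean DI of roughly 2.5, which is well above either band's average and explains why moving them changes the picture so much.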
Incidentally, the top ten stallions by percentage of progeny who improve for switching codes (DI in brackets) are;-
79 Pour Moi (0.43)
78 Authorized (0.62)
67 Montjeu (0.89)
65 Fast Company (2.06)
64 Sinndar (1.56)
64 Aussie Rules (1.73)
62 Casamento (1.77)
61 Sixties Icon (0.65)
60 Canford Cliffs (0.82)
59 Holy Roman Emperor (2.48)
Putting stallions in bands of their DIs and evaluating their strike rates for both winners/runners and plus 108 performers produces highly consistent results;-
Winners from runners
________ Total Mean % Median %
0.00-0.95 34 27.66 26.54
1.04-1.48 36 23.53 21.68
1.53-2.00 30 20.56 20.71
2.06-2.79 25 18.10 16.67
3.00-5.00 20 17.13 16.67
108+
________ Total Mean % Median %
0.00-0.95 34 32.24 30.95
1.04-1.48 36 26.22 22.22
1.53-2.00 30 23.45 22.22
2.06-2.79 25 18.60 17.65
3.00-5.00 20 16.88 16.67
However, these figures are not quite so well replicated when accounting for the transition from flat to jumps;-
________ Total Mean % Median %
0.00-0.95 18 46.48 49.33
1.04-1.48 24 41.16 41.42
1.53-2.00 17 45.48 47.06
2.06-2.79 11 46.54 47.37
3.00-5.00 13 38.40 40.00
When grouping damsires by bands of strike rates, it appears that the impact of their influence diminishes quite considerably as while there is still a vague trend overall, it lacks the distinction that occurs in the above figures;-
Damsire winners/runners%
________ Total Mean Median
64%-30% 27 1.79 1.29
29%-20% 43 2.07 1.95
19%-10% 34 1.92 1.72
09%-00% 8 1.89 1.62
As with the sires, there are a couple of damsires weighing down the lower figures in the shapes of Presenting and Supreme Leader. However, while their removal (which leaves figures of 2.34 and 2.18) helps to mitigate the inconsistencies of the winners/runners table, it acts only as scant ballast against the comparative chaos that is the plus 108 table;-
________ Total Mean Median
64%-30% 35 1.94 1.50
29%-20% 35 1.83 1.79
19%-10% 33 2.14 1.67
09%-00% 7 1.99 1.71
This chaos explodes into pure anarchy when it comes to improvement figures;-
________ Total Mean Median
65%-50% 20 2.19 2.06
47%-40% 21 1.68 1.40
39%-19% 21 1.86 1.56
While one might be tempted to hypothesise on why these figures occur as they do, whether there is a tangible relationship to be extrapolated or if the figures are chaotic because the parameters are largely redundant, any theory at this stage might well be pure speculation.
Can you begin to eliminate certain sires and breeding lines within the data bands? What I mean by that is, are there sires that fit the dosage profiles in the higher bands but just don't sire winners of that class and above?
With the data available, it is certainly possible to recognise sires who underperform despite having a favourable dosage index. While they won't be having any more juvenile hurdlers, Rip Van Winkle (0.69) and Marju (0.83) both had poor records in the field with neither having a double-figure strike rate by any of the aforementioned metrics. Nevertheless, this is not to say that they could not sire jumpers, rather their more successful progeny tended to produce their best efforts at later stages. Marju's top jumpers had their most fruitful campaigns at the ages of 5-9 (Simenon), 7 (Aspirant Dancer), 7-8 (Bobs Pride), 7-10 (Almaydan and Oodachee). An underperforming sire who may have juveniles this season is Havana Gold (0.74) who currently has just one winner from eleven runners achieving an RPR of just 100 in the process (although he currently has an improvement rate of 33%). Conversely, despite its higher dosage indexes, the Danehill Dancer (2.09) line mentioned in the previous post boasts several stallions with above average figures including Jeremy (1.82), Mastercraftsman (1.82), Fast Company (2.06), Indesatchel (2.50) and Choisir (2.60). The fact that Kingston Hill could join the ranks this season with a DI of just 0.90 could make him a very interesting prospect.
If you began to hate me for setting you off on the arduous task you've just undertaken, I've also had another thought that is kind of the reverse of what you're working on at the moment. Often juveniles are accused of 'not training on'. I'm wondering if there is a sweet spot within the bands for juveniles improving/declining beyond their juvenile season, and again are there other criteria that can then be used to filter out further, thus statistically eliminating some data groups on probability.
Just as a Professor of the Middle Ages might balk at the idea of touching anything post-Reformation, I must stress that anything a four-year-old does after Punchestown would not be in my field of research. Nevertheless, because I found myself rather intrigued by the idea, I did some work relevant to your idea - possibly without fully grasping your intentions.
Firstly, I used the leading juveniles with the highest and lowest dosages of the seasons between 2011/12 and 2016/17 - the latter season chosen to allow the form to mature. The table shows Name/DI/Sire/Sire's DI/Highest RPR/Season/Three highest RPRs achieved after the juvenile season and the distances at which they were achieved/Average distances of top performances/Difference between top juvenile RPR and subsequent RPR - a larger figure may be demonstrative of "training on".
I am not sure that the forum's format is conducive to presenting the table in this post but hopefully the link to the image will still be there. The sample size is probably too small but two ideas that can be taken from these figures are that horses with higher DIs seem to be more adept at training on and that while those with higher DIs tend to stick at the minimum trip, those with the lowest largely only step up a half mile in trip. Although it is worth noting that the two who fared best enjoyed success at three miles and beyond and it could well be that the placing of these horses played a greater role than genetics. It goes without saying that considerably more research would be necessary before drawing any firm conclusions. Nevertheless, I applied this format in a similar fashion to the Triumph Hurdle winners with the highest and lowest DIs in the RPR era. There are some horses whose best RPRs were equally attained over a range of distances. In these instances, I used an average figure whenever applicable.
Here, in contrast to the previous table, those with the lower DIs enjoy far greater success after the Triumph than their more speedily bred counterparts. However, of the stoutly bred winners, only Commanche Court, Paddy's Return and Tiger Roll would establish themselves as bona fide stayers and while the likes of Mysilv and Celestial Halo ran close to or at their best over three miles, they were equally capable at two miles. Zarkandar, the most successful of the speedier sorts, would also perform well over both two and three miles which may suggest that for many high quality horses, the difference between two and three miles can be much of a muchness.
And finally, there were forty horses who posted RPRs of 150+ over jumps last season whose careers began in the juvenile division. (This will not include those who exclusively raced at three or four in France). The figures below are RPR/Distance RPR achieved/DI/Sire's DI/Age/Horse.
155 24.0 1.00 1.10 08 Apple's Jade
150 16.5 0.71 1.22 06 Ballywood
156 20.0 0.85 0.90 08 Ben Dundee
174 25.5 1.12 1.10 09 Bristol De Mai
158 17.8 0.85 0.76 07 Call Me Lord
178 24.0 1.22 0.93 08 Clan Des Obeaux
156 16.5 0.90 1.78 05 Coeur Sublime
159 15.8 2.20 1.29 06 Cornerstone Lad
171 15.5 1.00 1.29 07 Defi Du Seuil
163 17.0 1.00 1.22 08 Diego Du Charmil
166 15.5 1.40 0.93 07 Dolos
152 20.5 0.74 1.78 07 Ex Patriot
165 16.0 1.00 0.93 05 Fakir D'oudairies
165 26.0 0.86 0.94 08 Footpad
167 20.5 1.67 1.00 08 Frodon
153 15.5 0.71 0.71 05 Fusil Raffles
155 15.5 0.71 1.00 06 Grand Sancy
154 16.5 0.33 1.00 06 Gumball
153 22.0 0.71 0.90 11 Mala Beach
155 17.0 0.50 0.53 09 Marracudja
150 22.5 2.08 2.16 07 Mengli Khan
156 15.5 1.00 1.00 06 Monsieur Lecoq
153 15.8 1.00 1.29 06 Nube Negra
157 15.5 0.77 1.04 05 Pentland Hills
150 16.5 2.00 1.48 05 Pic D'Orhy
157 16.0 1.40 1.40 05 Quel Destin
151 21.0 0.67 0.71 08 Romain De Senam
161 16.5 0.40 0.58 05 Saldier
150 20.0 1.07 1.77 08 San Benedeto
168 16.5 1.40 1.67 08 Sceau Royal
166 16.5 1.86 1.67 07 Sharjah
157 16.2 1.67 4.00 07 Silver Streak
154 20.5 2.56 1.48 07 Siruh Du Lac
154 19.5 2.08 3.67 05 Song For Someone
150 18.3 1.00 1.04 06 Stormy Ireland
155 30.0 0.58 0.62 10 Tiger Roll
159 20.3 0.88 1.11 09 Top Notch
155 25.0 1.13 1.24 07 Tout Est Permis
156 20.0 1.00 1.29 08 Voix Du Reve
154 20.5 3.00 1.82 08 Who Dares Wins
Top 40 ex-juveniles 2019/20 (SDI = sire's DI; the first five columns after Total are means, the last five are medians)
________ Total Age RPR DI SDI Dist Age RPR DI SDI Dist
_____ALL 40 7.03 157.95 1.18 1.31 19.0 7.0 156 1.00 1.11 17.40
Distance
15.5-17.8 21 6.24 158.57 1.09 1.28 16.2 6.0 157 1.00 1.22 16.20
18.3-21.0 11 7.45 154.82 1.41 1.51 20.1 8.0 154 1.00 1.29 20.30
22.0-30.0 08 8.50 160.63 1.09 1.12 24.9 8.0 155 1.06 1.02 24.50
DI
0.33-0.90 16 7.19 155.63 0.70 0.97 19.2 7.0 155 0.71 0.92 17.40
1.00-1.29 12 7.17 160.50 1.05 1.18 19.7 7.5 156 1.00 1.16 19.15
1.40-3.00 12 6.67 158.50 1.94 1.88 18.0 7.0 157 1.93 1.58 16.50
Given that it is only a sample size of forty, there is really nothing to glean from these figures other than the fact that no current high-class graduate-juvenile has a dosage index exceeding 3.00.
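For anyone wishing to check or extend the summary, the listing of forty horses can be read straight into Python. This is only a rough sketch: the function names are hypothetical, the data is copied from the list above, and the band edges are those used in the DI section of the summary table.

```python
from statistics import mean, median

# RPR / distance / DI / sire's DI / age / horse, copied from the listing above
LISTING = """\
155 24.0 1.00 1.10 08 Apple's Jade
150 16.5 0.71 1.22 06 Ballywood
156 20.0 0.85 0.90 08 Ben Dundee
174 25.5 1.12 1.10 09 Bristol De Mai
158 17.8 0.85 0.76 07 Call Me Lord
178 24.0 1.22 0.93 08 Clan Des Obeaux
156 16.5 0.90 1.78 05 Coeur Sublime
159 15.8 2.20 1.29 06 Cornerstone Lad
171 15.5 1.00 1.29 07 Defi Du Seuil
163 17.0 1.00 1.22 08 Diego Du Charmil
166 15.5 1.40 0.93 07 Dolos
152 20.5 0.74 1.78 07 Ex Patriot
165 16.0 1.00 0.93 05 Fakir D'oudairies
165 26.0 0.86 0.94 08 Footpad
167 20.5 1.67 1.00 08 Frodon
153 15.5 0.71 0.71 05 Fusil Raffles
155 15.5 0.71 1.00 06 Grand Sancy
154 16.5 0.33 1.00 06 Gumball
153 22.0 0.71 0.90 11 Mala Beach
155 17.0 0.50 0.53 09 Marracudja
150 22.5 2.08 2.16 07 Mengli Khan
156 15.5 1.00 1.00 06 Monsieur Lecoq
153 15.8 1.00 1.29 06 Nube Negra
157 15.5 0.77 1.04 05 Pentland Hills
150 16.5 2.00 1.48 05 Pic D'Orhy
157 16.0 1.40 1.40 05 Quel Destin
151 21.0 0.67 0.71 08 Romain De Senam
161 16.5 0.40 0.58 05 Saldier
150 20.0 1.07 1.77 08 San Benedeto
168 16.5 1.40 1.67 08 Sceau Royal
166 16.5 1.86 1.67 07 Sharjah
157 16.2 1.67 4.00 07 Silver Streak
154 20.5 2.56 1.48 07 Siruh Du Lac
154 19.5 2.08 3.67 05 Song For Someone
150 18.3 1.00 1.04 06 Stormy Ireland
155 30.0 0.58 0.62 10 Tiger Roll
159 20.3 0.88 1.11 09 Top Notch
155 25.0 1.13 1.24 07 Tout Est Permis
156 20.0 1.00 1.29 08 Voix Du Reve
154 20.5 3.00 1.82 08 Who Dares Wins
"""


def parse(listing):
    """Turn each line of the listing into a dict of typed fields."""
    rows = []
    for line in listing.strip().splitlines():
        rpr, dist, di, sdi, age, name = line.split(maxsplit=5)
        rows.append({"rpr": int(rpr), "dist": float(dist), "di": float(di),
                     "sdi": float(sdi), "age": int(age), "name": name})
    return rows


def summarise(rows):
    """Total plus unrounded mean and median of RPR and DI for a group of rows."""
    rprs = [r["rpr"] for r in rows]
    dis = [r["di"] for r in rows]
    return {"total": len(rows), "mean_rpr": mean(rprs), "median_rpr": median(rprs),
            "mean_di": mean(dis), "median_di": median(dis)}


def by_di_band(rows):
    """Split rows into the three DI bands used in the summary table."""
    bands = {"0.33-0.90": [], "1.00-1.29": [], "1.40-3.00": []}
    for r in rows:
        key = "0.33-0.90" if r["di"] <= 0.90 else "1.00-1.29" if r["di"] <= 1.29 else "1.40-3.00"
        bands[key].append(r)
    return {label: summarise(group) for label, group in bands.items()}


rows = parse(LISTING)
print(summarise(rows))
for label, stats in by_di_band(rows).items():
    print(label, stats)
```

Run over the forty rows, this gives a mean RPR of 157.95 and a mean DI of 1.18 to two decimal places, with medians of 156 and 1.00, which agrees with the ALL row of the summary. The three DI bands contain 16, 12 and 12 horses respectively.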