AI Cheltenham Festival Project - Crowdsource?

Steeplechasing

I’m compiling Cheltenham Festival info from a range of sources. I plan to load it into ChatGPT and ask it for a preview and recommendations for each race based on everything: trends, trainer plans, pundit articles, online gossip, and race reports from the likes of the DRF and last year’s festival. It’s a fun project, not meant as a serious tipping platform.

I plan to blog on it and perhaps run a thread in one of the popular forums (from which I also intend to crowdsource input data).

Subject to approval from Horseracebase, Racing Post, and Timeform, I hope to include: (1) trend data; (2) raceday Spotlights (evening before); and (3) Timeform’s free horse-by-horse analysis (evening before), as well as Timeform racecard content. I have no intention of singling out sources in each race preview, since it will be an amalgamation of all knowledge, analysed and produced by ChatGPT.

The AI engine will decide what it considers relevant to mention. I will not edit that except for clarity.

But I will credit the organisations who have joined in the fun by allowing inclusion of their content.

If you have anything you believe worth adding, please post it here and I will paste it into the project.

I don't subscribe to Racing Post or Timeform (though I will probably take out a monthly Race Passes subscription before the meeting). So I am collecting material from free sites like Sportinglife, on the basis that it is publicly available at no charge, which I hope makes copyright less of an issue. I have written to the other organisations mentioned. Chris at Horseracebase will have no problem with it, I'm sure. I am not quite so confident about Tom Kerr at the Post or the Timeform chiefs, but we shall see.

Any other content suggestions are welcome.

All the best.
Joe
 
Try asking the same question with the same data multiple times & observe the variation in output
 
Ah, right. What kind of results have you been seeing? Is it simply not worthwhile? I am looking at it as a fun thing.
Even when I feed in my own precise trial/prep/pointer race data via CSV, so that it isn't just a glorified internet search and collation, I still get slightly different results when asking the same question multiple times.
It's worth looking at but blind faith is folly.
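For anyone who wants to put a number on that variation, here is a minimal sketch. It assumes you have pasted each answer into a Python string yourself; the three answers shown are made up for illustration, and the function names are my own, not part of any AI tool.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_similarity(answers):
    """Return a similarity ratio (0.0 to 1.0) for every pair of answers."""
    return [SequenceMatcher(None, a, b).ratio() for a, b in combinations(answers, 2)]


def unstable_lines(answers):
    """Return the lines that appear in some answers but not in all of them."""
    line_sets = [
        {line.strip() for line in answer.splitlines() if line.strip()}
        for answer in answers
    ]
    in_all = set.intersection(*line_sets)
    in_any = set.union(*line_sets)
    return sorted(in_any - in_all)


# Made-up answers to the same question, for illustration only.
answers = [
    "Age: 7 or 8 years old\nOdds: 9/2 to 12/1",
    "Age: 7 or 8 years old\nOdds: 9/2 to 12/1\nWeight: no strong pattern",
    "Age: 7 or 8 years old\nOdds: 9/2 to 12/1",
]

print(pairwise_similarity(answers))
print(unstable_lines(answers))
```

The lines returned by `unstable_lines` are the bits that get added or dropped between runs, which are probably the ones most worth checking by hand.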
 
Okay. Thanks. I know it comes out with a lot of crap stuff, but thought that data analysis was its strong point. Those who've put billions into this AI must be starting to worry.

All the best.
Joe
 
It is very good at producing a written interpretation of data. When I say slightly different results, I mean they don't contradict each other; rather, it can add and/or remove bits each time you ask the question.

From what someone more into it than me has said, how you word your questions is all-important, i.e. broad questions get broad answers that are likely to vary.

I've run it against data I've spent a long time looking at myself, and it can tell you useful stuff that you'd perhaps only skimmed over in your mind.

I think it works best alongside your own thinking.
 
I'll carry on with it out of personal interest. What it helps greatly with is immediately turning a page like the attached (from the Horseracebase Trends Profiler view) into a text report:

TRENDS – Coral Cup (Last 10 Years)
Source: horseracinghistories.php

Ideal Statistical Profile


Age


  • 7 or 8 years old
    • 7yo: 5 wins (best record)
    • 8yo: 3 wins

Last Time Out


  • Finished 1st, 3rd, 4th or 5th
  • Ignore horses who were 2nd last time (0 wins)

Odds


  • Sweet spot: 9/2 – 12/1
  • 14/1–20/1 can win, but lower strike rate
  • Very few winners from 22/1+

Official Rating


  • Strong band: 141–152
  • Particularly productive: 141, 146, 148–152

Weight


  • Winners scattered, but positive returns around:
    • 10-2
    • 10-7
    • 11-2 to 11-8
    • 11-10
  • No strong case for very low weights.

Days Since Run


  • Best window: 61–90 days
  • Acceptable: 16–60 days
  • Poor: 8–15 days

Runs in Last 12 Months


  • Ideal: 3 runs
  • Also viable: 5–7 runs
  • Avoid lightly raced (1–2 runs)

Distance Profile


  • Most winners had previous form around:
    • 2m4f to 2m5½f
  • Very few stepping down from 3m+
  • Ideally has already won around 2m2f–2m4½f

Season Runs


  • 1 run or 6 runs show best returns
  • Avoid those with 4–5 runs (weak record)

Position in Market


  • Not dominated by favourites
  • Often mid-market: 6th–7th in betting
  • Occasional bigger priced winner (15th–20th in market)



Composite “Ideal” Coral Cup Horse


  • 7-year-old
  • Official Rating 146–150
  • Carries around 10-7 to 11-4
  • Finished 3rd or 4th last time out
  • Ran 61–90 days ago
  • Has had 3 runs this season
  • Proven winner around 2m4f
  • Priced 8/1–12/1
  • Sitting around 6th–8th in the betting



File stored under TRENDS – Coral Cup.
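If it helps, the composite list above can be turned into a simple checklist. This is only a rough sketch: the criteria are copied as-is from the AI's composite profile (not checked against Horseracebase), the `Runner` fields are my own naming, and the example runner is entirely hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    age: int
    official_rating: int
    weight_lb: int          # weight carried, in pounds
    last_position: int      # finishing position last time out
    days_since_run: int
    season_runs: int
    won_around_2m4f: bool
    odds_to_one: float      # 8/1 is 8.0, 9/2 is 4.5
    market_rank: int        # 1 = favourite


def st_lb(stones, pounds):
    """Convert a weight such as 10-7 (stones-pounds) to pounds."""
    return stones * 14 + pounds


# Each check mirrors one bullet of the composite "ideal" profile.
CHECKS = {
    "7-year-old": lambda r: r.age == 7,
    "official rating 146-150": lambda r: 146 <= r.official_rating <= 150,
    "weight 10-7 to 11-4": lambda r: st_lb(10, 7) <= r.weight_lb <= st_lb(11, 4),
    "3rd or 4th last time out": lambda r: r.last_position in (3, 4),
    "ran 61-90 days ago": lambda r: 61 <= r.days_since_run <= 90,
    "3 runs this season": lambda r: r.season_runs == 3,
    "proven winner around 2m4f": lambda r: r.won_around_2m4f,
    "priced 8/1-12/1": lambda r: 8.0 <= r.odds_to_one <= 12.0,
    "6th-8th in the betting": lambda r: 6 <= r.market_rank <= 8,
}


def profile_matches(runner):
    """Return a dict of criterion name -> True/False for one runner."""
    return {name: check(runner) for name, check in CHECKS.items()}


# Hypothetical runner, not a real horse.
example = Runner(
    age=7, official_rating=148, weight_lb=st_lb(10, 12), last_position=2,
    days_since_run=75, season_runs=3, won_around_2m4f=True,
    odds_to_one=10.0, market_rank=7,
)

result = profile_matches(example)
print(sum(result.values()), "of", len(result), "criteria met")
for name, ok in result.items():
    print("PASS" if ok else "FAIL", name)
```

A count of boxes ticked is a blunt tool, since it treats every trend as equally important, but it does make it obvious which criteria a given runner fails.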
 

Attachments

  • CleanShot 2026-02-25 at 21.59.56@2x.png
That's the sort of thing I mean, but I would look at the data yourself alongside what the AI has come up with and see how much you agree.
Figures are not my strong point. I couldn't even begin to work out how to start structuring a document to manually extract the info from every field in the Horseracebase layout and make sense of it. I suspect it comes easily to some, but sadly not me.
 
I asked ChatGPT and Google Gemini for data on the record of 6-year-olds running in the Champion Chase. Both came back and said Marine Nationale won the race as a 6-year-old on multiple occasions. Bloody melt.
 
Figures are not my strong point. I couldn't even begin to work out how to start structuring a document to manually extract the info from every field in the Horseracebase layout and make sense of it. I suspect it comes easily to some, but sadly not me.
Take one of the statements such as "Proven winner around 2m4f", go through the figures and see if you agree with that.
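As a rough illustration of that kind of check, here is a sketch that counts how many past winners had already won within the 2m2f to 2m4½f range the report mentions. The winner names and distances below are placeholders I made up, not real results; you would swap in the figures from your own data.

```python
import re


def furlongs(distance):
    """Convert a distance such as '2m4f' or '2m4½f' to furlongs (8 per mile)."""
    text = distance.strip()
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)?(½)?f)?", text)
    if not text or not match:
        raise ValueError(f"unrecognised distance: {distance!r}")
    miles, whole, half = match.groups()
    return int(miles or 0) * 8 + int(whole or 0) + (0.5 if half else 0.0)


def winners_with_win_in_range(winners, low="2m2f", high="2m4½f"):
    """Return (names with a prior win in [low, high], share of all winners)."""
    lo, hi = furlongs(low), furlongs(high)
    hits = [
        name
        for name, win_distances in winners.items()
        if any(lo <= furlongs(d) <= hi for d in win_distances)
    ]
    return hits, len(hits) / len(winners)


# Placeholder data only: name -> distances of wins before the race in question.
past_winners = {
    "Winner A": ["2m", "2m4f"],
    "Winner B": ["2m5f", "3m"],
    "Winner C": ["2m3½f"],
    "Winner D": ["2m", "2m1f"],
}

hits, share = winners_with_win_in_range(past_winners)
print(hits, share)
```

If the share that comes out of your real figures is nowhere near what the AI's statement implies, that is a statement to treat with suspicion.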
 
Thanks Pawras. Your responses here are much appreciated. And, as you mentioned, I understand that the prompts used to reach your ideal output in AI are crucial. I read of one guy who seemed to have a good system: he would spend ages inputting prompts, refining based on the answer, and refining again and again until he got what he wanted. He was then clever enough to put in one final question: "What prompt should I have started with that would have got me this answer without constantly refining?"
 