Let’s Use a Chatbot to Fill Out a Bracket
We all want to win our N.C.A.A. tournament bracket pools. Could Microsoft’s new Bing chatbot help?
After all, chatbots seem able to do anything these days. The Bing bot is frequently impressive (if occasionally creepy) and is based on the new GPT-4 artificial intelligence system. Unlike its cousin ChatGPT, Bing has access to information about this year’s teams, and it seems more willing to make predictions.
Our colleague Sarah Lyall engaged in a long exchange with Bing about college basketball, but it wouldn’t churn out an entire bracket’s worth of predictions in a single response. So, we asked for its picks round by round, region by region.
Then we filled out our own bracket, using Bing’s responses. Here’s what it recommended for the men’s bracket:

The bot won’t respond the same way every time, and the phrasing of a question matters. Simply asking which team is more likely to win each game might yield a bracket with no upsets. Instead, we asked Bing to try to win a bracket pool while accounting for potential upsets. We also kept reminding it to use information about this year’s teams, since it would often refer to past seasons or players despite its ability to incorporate more current details.
The resulting men’s bracket did contain upsets, including No. 3-seeded Baylor winning the championship. Here’s how the bot explained picking Baylor to overcome the tournament’s top seed, Alabama:
(Bing also said that Baylor had won two of its previous three meetings against Alabama, but the reverse is true.)
Bing picked a Final Four without any No. 1 seeds; that has happened three times since seeding began (in 1980, 2006 and 2011). Is the bot correct that we’re due for another? Is this a good strategy to win a pool this year? We can only wait and see.
Perhaps wisely, the bot picked a more traditional Final Four in the women’s bracket, where upsets have been less common. And it made the safest possible pick for the winner: undefeated South Carolina.

The Bing chatbot wasn’t exactly designed to compete with expert forecasts or mathematical tournament prediction models. Microsoft has said that its system has struggled to keep up with live sports information. The chatbot frequently cited outdated or incorrect details about teams, even if its overall impressions seemed valid.
So taking Bing’s advice — with its dash of unpredictability — is probably just as good as other amateur bracket strategies, like picking which teams’ mascots would beat the others.
Speaking of which, we did, in fact, ask Bing which teams’ mascots would beat the others.
You can be the judge of these judgment calls. (You can also ask similar questions to ChatGPT and get similarly humorous responses; the Bing chatbot is not yet widely available to the public.)
Methodology
All of our conversations were with the Bing chatbot on its “Balanced” conversation style setting. Through experimentation, we crafted queries that would keep the chatbot’s responses in a consistent format, force it to make selections for each matchup, allow it to make upset picks and encourage it to use information about teams’ current seasons (though it often mixed information from this season and previous seasons).
A typical query was formatted as follows:
Hey, Bing. I’ll list the first-round matchups in the South region of the 2023 NCAA men’s basketball tournament. I have included their seeds in the 2023 tournament. Make selections for each game, as if you were filling out a bracket in an office pool, attempting to win the pool while accounting for potential upsets. Be sure to use information about the teams and their seedings in the 2023 tournament, not previous years!
(1) Alabama vs. (16) Texas A&M-Corpus Christi
(8) Maryland vs. (9) West Virginia
(5) San Diego State vs. (12) Charleston
(4) Virginia vs. (13) Furman
(6) Creighton vs. (11) N.C. State
(3) Baylor vs. (14) U.C. Santa Barbara
(7) Missouri vs. (10) Utah State
(2) Arizona vs. (15) Princeton
We recorded the chatbot’s selections for winners in each matchup. Then we took those winners and asked it about the matchups that would occur in the next round of the tournament, given its selections. We repeated this round by round and region by region for both the men’s and women’s tournaments.
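The round-by-round process described above is mechanical enough to sketch in code. The sketch below is our illustration, not anything Microsoft or Bing provides: `pick_winner` is a hypothetical stand-in for pasting a matchup query into the chatbot and recording its answer.

```python
# Sketch of the round-by-round bracket process described above.
# pick_winner is a hypothetical stand-in for asking the chatbot
# which team wins a given matchup; it is not a real Bing API.

def next_round_matchups(winners):
    """Pair this round's winners into next-round games, in bracket order."""
    return [(winners[i], winners[i + 1]) for i in range(0, len(winners), 2)]

def simulate_region(first_round, pick_winner):
    """Advance a region round by round until one team remains.

    first_round: list of (team_a, team_b) matchups in bracket order.
    pick_winner: callable returning the chosen winner of a matchup
                 (in our case, the chatbot's pick).
    """
    matchups = first_round
    while True:
        winners = [pick_winner(m) for m in matchups]
        if len(winners) == 1:
            return winners[0]
        matchups = next_round_matchups(winners)

# Example with the South region's first round, always advancing the
# higher seed (listed first) as a stand-in for the chatbot's picks:
south = [
    ("Alabama", "Texas A&M-Corpus Christi"),
    ("Maryland", "West Virginia"),
    ("San Diego State", "Charleston"),
    ("Virginia", "Furman"),
    ("Creighton", "N.C. State"),
    ("Baylor", "U.C. Santa Barbara"),
    ("Missouri", "Utah State"),
    ("Arizona", "Princeton"),
]
print(simulate_region(south, lambda m: m[0]))  # 8 teams -> 4 -> 2 -> 1
```

With an all-higher-seeds picker this prints "Alabama"; swapping in the chatbot's answers is what produced the upset-laden bracket above.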
We asked the chatbot for its picks in the “First Four” games of each tournament before they were played. We advanced its selected winners into the first round of the tournament.
There is no guarantee that the chatbot will make the same selections even when asked the same questions in the same format. And wording questions differently may also produce different results.