I think it's easiest to understand this by first considering a tournament with only 8 teams (A, B, C, D, E, F, G, H) and hence three rounds:
quarterfinals with 4 games (A vs B, C vs D, E vs F, and G vs H)
semifinals with 2 games (winner of A vs B against winner of C vs D and winner of E vs F against winner of G vs H)
final with 1 game (winner of the ABCD group against winner of the EFGH group)
For each game in the quarterfinals, you can choose either of the two teams, giving you \[ 2 \times 2 \times 2 \times 2 = 2^4 = 16 \] possible ways to pick the winners of the quarterfinal games (ACEG, ACEH, ACFG, ACFH, ADEG, ADEH, ADFG, ADFH, BCEG, BCEH, BCFG, BCFH, BDEG, BDEH, BDFG, and BDFH).
No matter which 4 teams you pick to win in the quarterfinals, you can choose either of 2 teams in each of the 2 semifinal games, so you have \(2 \times 2 = 2^2 = 4\) possible choices. For example, if you chose A, C, E, and G to win their quarterfinal games, then your possibilities would be AE, AG, CE, and CG. But you have 4 possible choices no matter which of the 16 sets of teams you picked in the first round, so there are \(16 \times 4 = 64\) possible ways to fill out the first two rounds (quarterfinals and semifinals).
No matter which teams you picked to win in the quarterfinals and semifinals, you can choose either of your two semifinal winners to win the final game. Thus, no matter which of the 64 possible ways you chose to fill in the quarterfinal and semifinal rounds, you have 2 possible ways to finish your bracket, leaving you with a total of \(64 \times 2 = 128\) possible ways to fill in the complete bracket.
Notice that there were 4 games in the first round, 2 games in the second round, and 1 game in the final round, for a total of 7 games, and the number of possible brackets turned out to be \[ 2^4 \times 2^2 \times 2^1 = 2^{(4 + 2 + 1)} = 2^7 = 128. \]
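If you want to check the count by brute force, here is a minimal Python sketch that lists every possible bracket for the eight-team example. The function name `enumerate_brackets` and the tuple-of-rounds representation are my own choices for illustration.

```python
from itertools import product

def enumerate_brackets(teams):
    """Return every complete bracket for a single-elimination field.

    `teams` is listed in bracket order (neighbors play each other).
    A bracket is a tuple of rounds; each round is a tuple of picked winners.
    """
    if len(teams) == 1:
        return [()]  # no games left: exactly one (empty) way to finish
    # Pair up neighbors to form this round's games.
    games = [(teams[i], teams[i + 1]) for i in range(0, len(teams), 2)]
    brackets = []
    # Choose one winner per game, then fill in the later rounds recursively.
    for winners in product(*games):
        for rest in enumerate_brackets(list(winners)):
            brackets.append((winners,) + rest)
    return brackets

brackets = enumerate_brackets(list("ABCDEFGH"))
print(len(brackets))                        # 128
print(len({b[0] for b in brackets}))        # 16 quarterfinal outcomes
print(len({b[:2] for b in brackets}))       # 64 ways to fill the first two rounds
```

Brute-force listing like this only works for tiny fields; for 64 teams you have to rely on the counting argument instead.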
Now, considering the real March Madness tournament, if we ignore the play-in games, there are 63 games in the tournament: 32 first round games, 16 second round games, 8 "Sweet 16" games, 4 "Elite 8" games, 2 "Final Four" games, and 1 championship game (32 + 16 + 8 + 4 + 2 + 1 = 63). Following the same logic as we did for the eight-team tournament:
You could pick either of the two teams in each of the 32 first round games, so altogether there are \(2^{32} = \text{4,294,967,296}\) possible sets of winners you could pick for the first round.
Similarly, for whatever set of 32 teams you picked to win the first round games, there are \(2^{16} = \text{65,536}\) different sets of teams that you could pick to win their second round games, so by the time you fill in your teams for the first 2 rounds, you have chosen 1 of \(2^{32} \times 2^{16} = 2^{48} = \text{281,474,976,710,656}\) possibilities.
Continuing in this way, we find that there are \[ 2^{32} \times 2^{16} \times 2^8 \times 2^4 \times 2^2 \times 2^1 = 2^{63} \] possible ways to fill in your bracket. That's \[ 2^{63} = \text{9,223,372,036,854,775,808} \] or roughly 9 quintillion (a quintillion is a billion billion).
So, if you flipped a coin to pick your team in each of the 63 games, meaning that you picked your bracket completely at random from among the 9 quintillion possibilities, then the probability that you would pick the perfect bracket for that year's tournament would be 1 in 9 quintillion.
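The same counting can be done in a few lines of Python, which handles integers this large exactly. This is just a sketch of the arithmetic above; the helper name `count_brackets` is mine.

```python
from fractions import Fraction

def count_brackets(num_teams):
    """Count games per round, total games, and possible brackets
    for a single-elimination field whose size is a power of 2."""
    games_per_round = []
    games = num_teams // 2
    while games >= 1:
        games_per_round.append(games)
        games //= 2  # each round has half as many games as the one before
    total_games = sum(games_per_round)
    return games_per_round, total_games, 2 ** total_games

rounds, total_games, n_brackets = count_brackets(64)
p_perfect = Fraction(1, n_brackets)  # chance for a coin-flip bracket

print(rounds)               # [32, 16, 8, 4, 2, 1]
print(total_games)          # 63
print(f"{n_brackets:,}")    # 9,223,372,036,854,775,808
print(float(p_perfect))
```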
For comparison, the probability that a randomly chosen Floridian would be killed by lightning in the next year is a little less than 1 in 3 million, so that randomly chosen Floridian has a 3 trillion (3,000,000,000,000) times greater chance of being killed by lightning next year than a randomly chosen bracket has of being perfect.
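To see where the 3 trillion comes from, divide the two probabilities, treating the lightning risk as a round 1 in 3 million (an approximation of the figure above):
\[ \frac{1 / 3{,}000{,}000}{1 / 2^{63}} = \frac{2^{63}}{3 \times 10^{6}} \approx \frac{9.22 \times 10^{18}}{3 \times 10^{6}} \approx 3.07 \times 10^{12}. \]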
However, and I can't emphasize this enough, this probability of choosing a perfect bracket is only correct if you randomly choose your bracket by flipping a fair coin 63 times. Of course, absolutely no one picks their bracket this way.
If you didn't have any opinion at all about the relative strengths of the teams, and if you weren't concerned about competing with other bracket pickers for a prize, then it would be much better to just pick the higher seeded team to win each game, and then pick whichever of the 1 seeds you prefer in the semifinal and final games (there are \(2^3 = 8\) ways you could choose the winners of these last 3 games). The probability that the tournament will go completely "by the chalk" and give you a perfect bracket is still vanishingly small, but it's bigger than 1 in 9 quintillion.
For example, in 2023, I would have estimated the probability that this "chalk" method would yield a perfect bracket to be somewhere between 1 in 291 billion and 1 in 84 billion, depending on which teams you chose to win the final three games. Those are still extremely small probabilities, and our randomly chosen Floridian is still somewhere between 25,000 and 93,000 times more likely to die from a lightning strike this year, but the chalk method is a lot better than flipping coins.
If you are competing in a bracket pool, then your objective is to maximize your expected winnings. This is not the same as choosing your bracket to maximize your chances of being perfect (which is practically impossible anyway), and it's not even the same as maximizing your expected final "score". For example, picking the higher seeded team to win every game is generally not a good strategy, because lots of other people will do the same, or close to it, so even if you did win, you would probably split the pot with a lot of other players.
If you have a way to estimate the probability that team A wins over team B in any potential matchup in the tournament, then you can estimate the probability that a given bracket turns out to be perfect in a given year. Many of the "power ratings" that are provided by various sites on the web can be used to estimate the needed probabilities, although the sites don't usually tell you how to do it. For the calculation I did above for the 2023 tournament, I used the team ratings from FiveThirtyEight.com. Unfortunately, these won't be available anymore due to cutbacks at Disney/ABC, so I'll have to use something else for 2024.
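As a rough illustration of that idea, here is a Python sketch that multiplies the win probabilities of every pick in a bracket. Everything specific in it is an assumption made for the example: the Elo-style logistic formula in `win_prob` (with a scale of 400) is just one common way to turn a rating difference into a probability, not necessarily how any particular site's ratings should be converted, and the four teams and their ratings are made up. It also treats game outcomes as independent given the matchups.

```python
def win_prob(rating_a, rating_b, scale=400.0):
    """ASSUMPTION: Elo-style logistic model for P(team A beats team B)."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))

def bracket_probability(teams, picks, ratings):
    """Probability that every pick in a bracket is correct.

    teams:   teams in bracket order (neighbors meet in the first round)
    picks:   one list of picked winners per round, in bracket order
    ratings: dict mapping team -> rating (hypothetical numbers here)
    """
    prob = 1.0
    alive = list(teams)
    for round_picks in picks:
        games = [(alive[i], alive[i + 1]) for i in range(0, len(alive), 2)]
        for (a, b), winner in zip(games, round_picks):
            if winner not in (a, b):
                raise ValueError(f"{winner} is not playing in {a} vs {b}")
            loser = b if winner == a else a
            prob *= win_prob(ratings[winner], ratings[loser])
        alive = list(round_picks)  # picked winners advance to the next round
    return prob

# Hypothetical four-team field with made-up ratings.
teams = ["A", "B", "C", "D"]
ratings = {"A": 1700, "B": 1500, "C": 1600, "D": 1550}
chalk = [["A", "C"], ["A"]]          # always pick the higher-rated team
p_chalk = bracket_probability(teams, chalk, ratings)
p_coin_flip = 0.5 ** 3               # three games, fair coin each time
print(p_chalk, p_coin_flip)
```

Even in this toy field, the bracket that follows the ratings is noticeably more likely to be perfect than a coin-flip bracket, which is the same effect as the "chalk" estimate above, just on a much smaller scale.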
Note: If you have to pick the 4 play-in games as well, then you have to pick the winners of \(63 + 4 = 67\) games, so in this case there are \[ 2^{67} = \text{147,573,952,589,676,412,928} \] possible brackets, and the probability that a bracket chosen by flipping coins turns out perfect is about 1 in 148 quintillion.
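A quick check of that figure in Python (the variable names are mine):

```python
main_draw_games = 63
play_in_games = 4

with_play_ins = 2 ** (main_draw_games + play_in_games)
without_play_ins = 2 ** main_draw_games

print(f"{with_play_ins:,}")               # 147,573,952,589,676,412,928
print(with_play_ins // without_play_ins)  # 16
```

Each extra game doubles the count, so adding the 4 play-in games multiplies the number of possible brackets by \(2^4 = 16\).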
You said:
We’d also give you 3 additional basketball statistics (points per game, assists per game, win/loss ratio, etc.) to analyze their statistical importance (or lack thereof) in determining the winner of a game.
I'm not sure that I would have the data at hand, or the spare time, to properly answer this question. Maybe, but given the other things I need to get done in the meantime, it's very doubtful.
Since at least the early 2000s I have participated in a small "portfolio pool" with some other statisticians and computer scientists, as well as some "civilians". I started thinking a little more seriously about it in the 2010s, and around 2016 I got an idea for how I might optimize my picks for the pool. This led to me simulating the March Madness tournament on my home computer and using the outcomes to help me make my selections. My methods have evolved over the years, and these days I run hundreds of thousands, or even millions, of simulations of the tournament to generate training and test sets for choosing my picks and evaluating how well they are likely to perform. Almost every year I have ideas for improvements that I frantically implement and test out at the last minute so that I can submit my picks before the tournament begins. I have certainly learned a lot in the process.
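To give a flavor of what simulating a tournament means, here is a toy Python sketch. It is not my actual method: the Elo-style `win_prob` formula, the four-team field, and the ratings are all assumptions made up for illustration, and a real simulation would use the full 64-team bracket and a better-calibrated probability model.

```python
import random
from collections import Counter

def win_prob(rating_a, rating_b, scale=400.0):
    """ASSUMPTION: Elo-style logistic model for P(team A beats team B)."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))

def simulate_tournament(teams, ratings, rng):
    """Play out one single-elimination tournament and return the champion."""
    alive = list(teams)  # bracket order: neighbors play each other
    while len(alive) > 1:
        next_round = []
        for i in range(0, len(alive), 2):
            a, b = alive[i], alive[i + 1]
            a_wins = rng.random() < win_prob(ratings[a], ratings[b])
            next_round.append(a if a_wins else b)
        alive = next_round
    return alive[0]

def champion_frequencies(teams, ratings, n_sims, seed=0):
    """Count how often each team wins the title across n_sims simulations."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return Counter(simulate_tournament(teams, ratings, rng)
                   for _ in range(n_sims))

# Hypothetical four-team field with made-up ratings.
teams = ["A", "B", "C", "D"]
ratings = {"A": 1700, "B": 1500, "C": 1600, "D": 1550}
counts = champion_frequencies(teams, ratings, n_sims=2000)
for team, wins in counts.most_common():
    print(team, wins / 2000)
```

The simulated outcomes can then be used to score candidate sets of picks, which is the general idea behind using simulations as training and test sets.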
I know of at least one other pool participant who is doing similar things: he leads a large-scale optimization research team at Google. A group of us started comparing ideas a couple of years ago and we hope at some point to write a paper (or two) about it.