MBL2 2023: A Closer Look

Date

Jan 11, 2024 2:05 PM

The Morons Betting League (MBL2) is a small, informal group of friends (morons) who compete every week by picking ten football bets (NCAAF and NFL), totaling 100 “units”, against the spread and/or the over-under. Here’s a small sample of the data. Note that:

The players (morons) are represented by one-letter codes.
Some weeks may not have happened yet, so you may see a lot of missing (NA) data.
There is one exceptional (bowl) week where the morons place 20 bets totaling 200 units. Note also that one moron placed 10 bets totaling 200 units that week, which is technically against the “rules” of the MBL2.
mia is an indicator of whether the moron failed to enter their picks. According to the MBL2 “rules”, a player is allowed to have one missed week replaced by their score from the subsequent week. The “rules” do not specify what happens if either the missed week or the week after it is the 20-bet week.
The listing of the teams/games and the line is very irregular because these are emailed and entered by hand with no standardization. For this reason, the league information (NCAAF vs. NFL) is also unavailable. So the only useful data are the moron, week, number of units wagered, and number of units won.

mbl |>
    select(-cells) |>
    slice_sample(n = 10) |>
    arrange(week, moron)

# A tibble: 10 × 7
   moron  week mia   team_s                         line      wager units_won
   <chr> <int> <lgl> <chr>                          <chr>     <dbl>     <dbl>
 1 K         0 FALSE uscar                          2.5          10         0
 2 P         3 FALSE FSU @ CLEM                     OVER 55.0    10         5
 3 P         3 FALSE NE @ NYJ                       NYJ +3.0     10         0
 4 P         7 FALSE (17) Duke at (4) Florida State FSU -13.5    20        20
 5 D         8 FALSE Horns                          -17.5        20        20
 6 W         9 FALSE kssu/UTEX                      >50.5        10        10
 7 K        11 FALSE MIA                            -11.5        10         0
 8 S        11 FALSE Clemson                        -6.5         10        10
 9 D        14 FALSE Army                           -2.5         10        10
10 S        17 FALSE LV                             3            10         5

The next bit of code creates a result column (win, lose, or push) from the wager and units_won columns.

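mbl <- mbl |>
    mutate(
        ## Index into the lookup vector below: 1 = lose, 2 = push, 3 = win.
        result = (units_won == wager) - (units_won == 0) + 2,
        ## Bets with missing data get index 4, which maps to NA.
        result = ifelse(is.na(result), 4, result),
        result = c("lose", "push", "win", NA)[result])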
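The index arithmetic in that mutate() is compact, so here is a small sketch of how it plays out on a few made-up values. The toy vectors are hypothetical and only illustrate the logic: a bet that returns the full wager is a win, one that returns nothing is a loss, and anything in between is treated as a push.

```r
# Toy values (hypothetical): a full win, a push, a loss, and a missing bet.
wager     <- c(10, 10, 10, NA)
units_won <- c(10,  5,  0, NA)

# TRUE/FALSE coerce to 1/0, so the index is 3 for a win, 2 for a push,
# and 1 for a loss.
idx <- (units_won == wager) - (units_won == 0) + 2
idx <- ifelse(is.na(idx), 4, idx)   # missing bets point at the NA slot

result <- c("lose", "push", "win", NA)[idx]
result
```

The last line should show win, push, lose, and NA, in that order.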

The next bit of code creates a summary data frame of results by moron and week. It uses tidyr::fill() to backfill the units won for weeks where mia == TRUE. The variable units_won10 adjusts the number of units won to a 10-bet basis, in order to standardize the 20-bet bowl week to match the other weeks.

mbl_by_moron_week <- mbl |>
    summarize(
        n_bets = n(),
        wins = sum(result == "win"),
        losses = sum(result == "lose"),
        pushes = sum(result == "push"),
        units_won = sum(units_won),
        .by = c(moron, week, mia)
    ) |>
    ## Handle MIA weeks with tidyr::fill().
    arrange(moron, week) |>    # Making sure that observations are ordered correctly.
    group_by(moron) |>         # Grouping means filling done only within group (by moron).
    fill(units_won, .direction = "up") |>
    ungroup() |>
    filter(!is.na(units_won)) |>
    mutate(units_won10 = (10/n_bets) * units_won) |>
    mutate(season_total = cumsum(units_won), .by = moron)
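To see what the tidyr::fill() step does in isolation, here is a minimal sketch on a made-up player. The tibble `toy` and the player code "X" are hypothetical; the point is only that `.direction = "up"` copies the next non-missing value backward into the missed week.

```r
library(dplyr)
library(tidyr)

# Hypothetical weekly totals for one player; week 2 was missed (mia == TRUE).
toy <- tibble(
    moron     = "X",
    week      = 1:4,
    mia       = c(FALSE, TRUE, FALSE, FALSE),
    units_won = c(40, NA, 60, 55)
)

toy_filled <- toy |>
    arrange(moron, week) |>                 # fill() depends on row order
    group_by(moron) |>                      # never borrow from another player
    fill(units_won, .direction = "up") |>   # the NA in week 2 takes week 3's value
    ungroup()

toy_filled$units_won
```

Week 2 ends up with week 3's total of 60, which matches the "replaced by the subsequent week" rule described earlier. A missed final week would stay NA, which is why the real pipeline filters out remaining NA values afterward.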

Here are some quantiles of the weekly units won (10-bet basis) by the morons. This is an attempt to determine what constitutes a good or bad weekly total.

mbl_by_moron_week |>
    pull(units_won10) |>
    quantile(probs = seq(0.1, 0.9, by = 0.1))

10% 20% 30% 40% 50% 60% 70% 80% 90% 
 30  40  40  45  50  57  60  67  75

And here is a histogram of the same information.

## Histogram of weekly units won by the morons.
mbl_by_moron_week |>
    ggplot(aes(x = units_won10)) +
    geom_histogram(binwidth = 10)

This is a line plot of the morons’ season totals by week.

mbl_by_moron_week |>
    ggplot(aes(x = week, y = season_total,
               color = fct_reorder2(moron, week, season_total))) +
    geom_line(linewidth = 1) +
    labs(color = "moron", x = "Week", y = "Season Total")

How many weeks has each moron been one of the top scorers?

mbl_by_moron_week |>
    group_by(week) |>
    mutate(
        week_max = max(units_won),
        top_scorer = (units_won == week_max)
    ) |>
    ungroup() |>
    filter(top_scorer) |>
    summarize(n = n(), .by = moron) |>
    arrange(desc(n))

# A tibble: 6 × 2
  moron     n
  <chr> <int>
1 H         5
2 K         5
3 W         5
4 P         4
5 S         4
6 D         3

Which moron’s weekly totals (10-bet basis) are the most (and least) variable (ordered by standard deviation)?

mbl_by_moron_week |>
    summarize(mean = mean(units_won10), sd = sd(units_won10),
              median = median(units_won10), mad = mad(units_won10),
              .by = moron) |>
    arrange(desc(sd))

# A tibble: 6 × 5
  moron  mean    sd median   mad
  <chr> <dbl> <dbl>  <dbl> <dbl>
1 W      51.8  20.2     50 14.8 
2 K      49.6  18.3     50 22.2 
3 P      56.2  16.1     55 22.2 
4 H      51.4  16.1     50 14.8 
5 D      55.7  15.6     55 22.2 
6 S      49.9  14.6     40  7.41
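The mad values in the table are all multiples of 7.413 because R's mad() multiplies the raw median absolute deviation by a constant (1.4826 by default), which makes it comparable to a standard deviation for normal data. A quick sketch with made-up weekly totals (the vector `x` is hypothetical) shows the relationship:

```r
# Hypothetical weekly totals, not taken from the league data.
x <- c(30, 40, 40, 50, 60)

# Raw median absolute deviation: median distance from the median.
raw_mad <- median(abs(x - median(x)))

# stats::mad() applies the default scale factor of 1.4826.
scaled_mad <- mad(x)

c(raw = raw_mad, scaled = scaled_mad)
```

Read this way, table entries of 7.413, 14.826, and 22.239 correspond to raw deviations of 5, 10, and 15 units.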

Which moron’s weekly totals are the most (and least) variable (ordered by median absolute deviation)? Using knitr::kable() here for a prettier table, but I’m getting misaligned column headers. I may report this as a bug.

mbl_by_moron_week |>
    summarize(
        mean = mean(units_won10), sd = sd(units_won10),
        median = median(units_won10), mad = mad(units_won10),
        .by = moron) |>
    arrange(desc(mad), desc(sd)) |>
    kable()

moron	mean	sd	median	mad
K	49.60526	18.33832	50	22.239
P	56.18421	16.12384	55	22.239
D	55.65789	15.63121	55	22.239
W	51.84211	20.22172	50	14.826
H	51.44737	16.07957	50	14.826
S	49.86842	14.63618	40	7.413

Overall number of winning, losing, and pushed wagers for each moron. The stated win percentage counts pushes as half a win.

mbl |>
    filter(!is.na(result)) |>
    summarize(n = n(), .by = c(moron, result)) |>
    pivot_wider(names_from = result, values_from = n, values_fill = 0) |>
    mutate(
        n = win + push + lose,
        win_pct = 100 * (win + 0.5*push) / n) |>
    relocate(n, win, lose, push, .after = moron) |>
    arrange(desc(win_pct))

# A tibble: 6 × 6
  moron     n   win  lose  push win_pct
  <chr> <int> <int> <int> <int>   <dbl>
1 P       200   109    88     3    55.2
2 D       200   108    88     4    55  
3 W       200   103    96     1    51.8
4 S       200    98    97     5    50.2
5 H       190    91    94     5    49.2
6 K       180    84    92     4    47.8
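As a check on the win percentage formula, the arithmetic for moron P's row in the table above can be worked by hand. The second calculation, which drops pushes from the denominator, is included only for comparison and is an assumption of this sketch, not something the league uses.

```r
# Counts taken from moron P's row in the table above.
win  <- 109
lose <- 88
push <- 3

n <- win + push + lose                     # 200 graded wagers
win_pct <- 100 * (win + 0.5 * push) / n    # a push counts as half a win
win_pct

# For comparison only (not the stated method): ignore pushes entirely.
win_pct_no_push <- 100 * win / (win + lose)
win_pct_no_push
```

The first value is 55.25, which the tibble displays rounded to three significant digits.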

Do these morons know what they’re doing when they wager different amounts? The overall number of winning, losing, and pushed wagers by units wagered.

mbl |>
    filter(!is.na(result)) |>
    summarize(n = n(), .by = c(wager, result)) |>
    pivot_wider(names_from = result, values_from = n, values_fill = 0) |>
    mutate(
        n = win + push + lose,
        win_pct = 100 * (win + 0.5*push) / n) |>
    relocate(n, win, lose, push, .after = wager) |>
    arrange(wager)

# A tibble: 4 × 6
  wager     n   win  lose  push win_pct
  <dbl> <int> <int> <int> <int>   <dbl>
1     5   104    51    50     3    50.5
2    10   996   507   472    17    51.8
3    15    16     4    11     1    28.1
4    20    54    31    22     1    58.3

The overall percentages of winning, losing, and pushed wagers by units wagered.

mbl |>
    filter(!is.na(result)) |>
    summarize(n = n(), .by = c(wager, result)) |>
    pivot_wider(names_from = result, values_from = n, values_fill = 0) |>
    mutate(
        n = win + push + lose,
        win = 100*win/n,
        push = 100*push/n,
        lose = 100*lose/n
    ) |>
    relocate(n, win, lose, push, .after = wager) |>
    arrange(wager)

# A tibble: 4 × 5
  wager     n   win  lose  push
  <dbl> <int> <dbl> <dbl> <dbl>
1     5   104  49.0  48.1  2.88
2    10   996  50.9  47.4  1.71
3    15    16  25    68.8  6.25
4    20    54  57.4  40.7  1.85

The numbers of winning, losing, and pushed wagers, and the resulting win percentage, by units wagered for the individual morons. Note that two morons (S and W) only ever wager exactly 10 units per bet.

mbl |>
    filter(!is.na(result)) |>
    summarize(n = n(), .by = c(moron, wager, result)) |>
    pivot_wider(names_from = result, values_from = n, values_fill = 0) |>
    mutate(
        n = win + push + lose,
        win_pct = 100 * (win + 0.5*push) / n) |>
    relocate(n, win, lose, push, .after = wager) |>
    arrange(moron, wager) |>
    print(n = Inf)

# A tibble: 17 × 7
   moron wager     n   win  lose  push win_pct
   <chr> <dbl> <int> <int> <int> <int>   <dbl>
 1 D         5    17     6    10     1    38.2
 2 D        10   174    99    72     3    57.8
 3 D        15     1     0     1     0     0  
 4 D        20     8     3     5     0    37.5
 5 H         5    40    17    21     2    45  
 6 H        10   128    63    62     3    50.4
 7 H        15     4     1     3     0    25  
 8 H        20    18    10     8     0    55.6
 9 K         5     6     3     3     0    50  
10 K        10   161    74    84     3    46.9
11 K        20    13     7     5     1    57.7
12 P         5    41    25    16     0    61.0
13 P        10   133    70    61     2    53.4
14 P        15    11     3     7     1    31.8
15 P        20    15    11     4     0    73.3
16 S        10   200    98    97     5    50.2
17 W        10   200   103    96     1    51.8
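The win/lose/push tallies above repeat the same pipeline with different grouping columns. One possible way to factor that out is sketched below. The helper name `tally_results()` and the toy data are hypothetical and not part of the original analysis; the sketch assumes a data frame with a `result` column coded as "win", "lose", or "push".

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: tally results by the columns named in `by`
# (a character vector), counting a push as half a win.
tally_results <- function(data, by) {
    data |>
        filter(!is.na(result)) |>
        count(across(all_of(c(by, "result")))) |>
        pivot_wider(names_from = result, values_from = n, values_fill = 0) |>
        mutate(
            n = win + push + lose,
            win_pct = 100 * (win + 0.5 * push) / n) |>
        relocate(n, win, lose, push, .after = all_of(by[length(by)]))
}

# Made-up data for two players, four bets each.
toy <- tibble(
    moron  = c("X", "X", "X", "X", "Y", "Y", "Y", "Y"),
    wager  = 10,
    result = c("win", "win", "push", "lose", "win", "lose", "lose", "push")
)

out <- tally_results(toy, "moron")
out
```

With the real data, calls such as `tally_results(mbl, "wager")` or `tally_results(mbl, c("moron", "wager"))` would play the role of the pipelines above. Like those pipelines, the helper fails if one of the three outcomes never occurs in the data, since the corresponding column would be missing after pivot_wider().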

Brett Presnell

Emeritus Professor of Statistics

My research interests include nonparametric and computationally intensive statistics, model misspecification, statistical computing, and the analysis of directional data.