Penn’s Gabe Furey. (Photo: Kevin P. Tucker)

Beyond the Basics: An Easy Alternative to the RPI Selection Process

June 22, 2023
Zack Capozzi

What’s the most controversial aspect of college lacrosse? Faceoff rules? Drama between coaches? Or is it the process we go through to select NCAA tournament teams? To some, I’m sure each of those three feels like the most salient issue.

For my money, though, given how the selection process shapes so much about this sport, it’s not even close. 

To be clear, I don’t think the selection process is controversial because of the teams that ultimately get selected. It’s the process that we use to get there. And there is one aspect of that process that I think should be reformed: RPI (and how it’s used).

I hope this article stimulates some discussion. I would love to hear your reactions and suggestions. As a sport, I think it’s clear that we aren’t happy with the way that RPI is used, but there hasn’t really been an alternative. My goal is to introduce another method, in a tangible way, so that we can compare pros and cons. To me, this is a common-sense change, but I’m open to being persuaded that it’s not.

So, without further ado, here’s what I’m proposing. We make two changes to the official selection criteria:

  • Remove RPI as an official selection criterion, but use it as an input to calculate RPI Strength of Record

  • Remove “Record against ranked teams 1-5; 6-10; 11-15; 16-20; 21+” and “Average RPI win/loss” and replace them with RPI Strength of Record

MUCH MALIGNED, NOT ENTIRELY BAD

I’m not going to rehash all of RPI’s ills. It’s a well-worn path that doesn’t need to be walked for the 100th time. The NCAA has made it clear that RPI is not an ideal way to rank teams for selection purposes in a sport with fewer than 20 games played per team.

That said, I think it still has a place in the process because it’s an acceptable (and understood) way to create a single score that combines a team’s winning percentage and strength of schedule. There just aren’t a lot of options for doing this that only use data from the current season. 

But rather than using RPI as the de facto metric to describe how strong a team’s resume is, it should be an input to a Strength of Record calculation instead of a selection criterion itself. If that seems like splitting hairs, consider the downstream effects of RPI being first-among-equals among the selection criteria.

RP-WHY NOT?

Before the tournament, Peter Milliman spoke with D-Fly and Dixie about the scheduling considerations that he has to contend with when it comes to RPI (29:30 in the interview).

“It actually brings your RPI down to play a 15th game against a team that’s not in the top 30” was the quote that caught my ear. That’s the mechanical problem of having RPI as an actual selection criterion. You can win and have your RPI go down.

But just before, he made what is an equally important point: “If we can play a part in the expansion of lacrosse in any capacity ... I would love to continue to help that. I think there is actually one factor that hinders it a little bit, and that is that the RPI factors in every team on your schedule and not just your top competition." 

To be clear, the problem I have with RPI is not that it considers all your opponents. I’m actually strongly in favor of this, which is why I don’t mind keeping RPI part of the process. The problem is that with the way RPI is used, who is on your schedule is too important relative to the actual results on the field.

It’s true to say that the committee doesn’t just go by RPI to award bids, but it’s clearly a big part of the decision-making process. And regardless of how it’s used by the committee, if its usage is forcing coaches to make these sorts of decisions when scheduling, it’s a problem. This is why, to reiterate my point above, it’s not about which teams get in, it’s about the negative effects of the process that we go through to get there.

But why do we need Strength of Record to solve this impasse? Why not just change the RPI formula to increase the weight of the win percentage factor? If strength of schedule is weighted too much, just increase the weight given to each team’s winning percentage (it is currently 25 percent of the total), and that would make actual results more important, right?
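For reference, here is the standard RPI blend that the 25 percent figure above refers to, as a minimal sketch. The weights are the standard NCAA formula; the function itself is just mine for illustration:

```python
def rpi(wp: float, owp: float, oowp: float) -> float:
    """Standard NCAA RPI blend: 25% a team's own winning percentage (wp),
    50% opponents' winning percentage (owp), and 25% opponents' opponents'
    winning percentage (oowp)."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp

# Example: a team that wins 80% of its games against a middling schedule.
print(rpi(0.80, 0.55, 0.50))  # 0.6
```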

STRENGTH OF RECORD IS THE WAY

I think it’s clear what would happen in that scenario. Teams with tournament aspirations would be less likely to play each other in the non-conference. That would be disastrous. Strength of Record (SOR) is the way to square the circle here.

I can’t propose an SOR component to be added to the process without explaining what SOR is. The core principle is that you sum up the value of a team’s wins and subtract the value of their losses. The resulting number is their SOR, and you order teams based on this number. Higher SOR is better.

The “value” of a win or loss is based on the ranking of the opponent. In the simplest implementation, that would be RPI. That means that, using Division I men’s lacrosse as an example, you’d earn 76 points for beating the No. 1 RPI team. You’d earn one point for beating the last-place RPI team. If you lost to the No. 1 RPI team, you’d lose one point. If you lost to the last-place RPI team, you’d lose 76 points.

(If an RPI SOR criterion were to be added, it would probably need to be a bit more complicated than this, but this at least provides a basic understanding of what RPI SOR is. See the appendix for more discussion on this point and an example of how this plays out for Yale.)
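To make that arithmetic concrete, here’s a minimal sketch of the vanilla scoring rule, assuming a 76-team Division I field. The function and naming are mine, for illustration only:

```python
N_TEAMS = 76  # Division I men's lacrosse, per the example above

def vanilla_sor(results: list[tuple[int, bool]]) -> int:
    """Sum the value of the wins, minus the value of the losses.

    Each result is (opponent_rpi_rank, won). A win over the No. r team earns
    N_TEAMS + 1 - r points; a loss to the No. r team costs r points. So
    beating No. 1 earns 76 points and losing to No. 1 costs only 1."""
    score = 0
    for rank, won in results:
        score += (N_TEAMS + 1 - rank) if won else -rank
    return score

# A win over No. 1 (+76) and a loss to the last-place team (-76) cancel out.
print(vanilla_sor([(1, True), (76, False)]))  # 0
```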

Another way to present this is to lay out the core principles of an SOR metric that could work as a selection criterion:

  • No loss can improve your SOR. Any victory must improve your SOR. And the degree to which your resume is helped or hurt is based on how notable the result is.

  • No arbitrary cutoff points (i.e., the current large difference between beating the No. 20 team versus the No. 21 team)

  • The system should be designed so that the risks and rewards for scheduling any game are balanced (i.e., no scheduling strategy should be inherently better)

  • All things equal, a stronger schedule should still improve your tournament chances

IS THIS REALLY THAT DIFFERENT?

If you’ve made it this far, you might be thinking, “Wait a second, isn’t this what we have tiers for?” And you’d be 100 percent right. There is a criterion in the current process that differentiates whether your wins are against good teams or not. We constantly hear about how one team has three top-10 RPI victories while another has zero, but went 5-0 against teams ranked 11-20 in the RPI.

Fundamentally, SOR uses tiers, but there is a single team in each tier. It differentiates between the value of a win over the No. 6 RPI team and the No. 10 RPI team. And it doesn’t have huge drop-offs when an opponent falls out of the top 10 or the top 20. Today, the value of a win over the No. 26 RPI team is effectively zero, while there is really no difference between beating No. 11 and No. 20. That doesn’t pass the metric “eye test.”

It also avoids the problem of trying to compare Team A’s record against the various tiers with Team B’s record against those same tiers. With no real rules about how to weigh a 3-1 record against the top 10 versus a 7-1 record against teams ranked 21+, you end up with a result that is more of an “eye-of-the-beholder” outcome than something objective. RPI SOR gives you a single score, so the comparison is much easier. A higher SOR equals a more impressive resume.

Using RPI SOR as an official selection criterion essentially takes what we have today, removes the worst parts of it and makes the process more transparent. It’s not a radical change; it’s an evolution. If you are of the mindset that the selection process doesn’t really need an overhaul, but you are sympathetic to the concerns that Milliman and others have raised, then you should be for this proposal.

WHAT DO WE HAVE TO LOSE?

College hockey uses a fully formula-based process to select its tournament field. College basketball has already evolved past RPI with the NET system. It’s time for lacrosse to do the same, and I hope that I’ve at least made a start at laying out a viable alternative.  

I think it should be the responsibility of the committee, the RAC and the coaches to set the rules of the RPI SOR system. How valuable should three top-30 RPI wins be relative to a top-5 RPI win? Do we care more about consistency or peak performances or avoiding bad losses? Is there a head-to-head flipping mechanism? (I am more than happy to help coordinate this process of calibrating the RPI SOR system.) 

I also would love to have RPI SOR stress tested. I’ve done a good bit of this, but part of the motivation for writing this piece was to have others poke holes in the concept. I have created a simulator tool that allows you to play around with alternate scheduling strategies to try and game the system. If you can build a schedule that gets a team in when they don’t deserve it, then the system may have holes. (To try and beat the system, click a team logo, then go to the “Schedule” tab to build a schedule.)

You can also use it to simply get more familiar with how RPI SOR works. It shows the calculation for every team, game by game, so you can see how a team’s RPI SOR score is arrived at. (If you really want to go down a rabbit hole, you can even change the RPI weights and see how that changes things.)

There are many ways to design an SOR system that reflect the values and incentives we want to give the coaches who are building schedules. You can design a system to focus on the best wins and worst losses. It can more heavily weight a team’s peak performance and give more of a pass for an off day. If we can agree on what we want to incentivize, you can design an SOR system to reflect that.

This feels like a relatively low-risk proposal. What do we have to lose? 

I’ll say this one more time so that you know I really mean it. Poke holes in this. Let’s debate it. Let me know what you think about SOR and whether it should be a part of the selection process. Show me how to use SOR to get an undeserving team into the NCAA field. DMs are open at @laxreference, or you can drop me a line by email here.

FAQ

Doesn’t this mean it doesn’t matter which games you win and lose? No; see more below about the focus on the most notable results. It’s true that a vanilla SOR system puts no extra weight on your most notable results, but I don’t think that’s a good idea.

Doesn’t this favor teams who play more games? No, what it does is balance risk/reward better. It’s true that every win helps your resume, but the idea behind SOR is that the risk of potentially losing to a team should balance the potential reward for beating them. We should give credit for every win; the key is not to give too much for lesser wins and certainly not give credit for “good” losses.

But there are “good” losses; shouldn’t a team get credit for taking No. 1 to overtime, even if they lose? Great point. They should not get credit for losing to the No. 1 team; that is the root cause of the issues with RPI as a criterion. That said, the team does get credit for losing to No. 1 indirectly. Imagine that Team A and Team B both have three losses. They both lost to No. 10 and No. 7. Then Team A lost to No. 1 and Team B lost to No. 15. Team A is going to have a better SOR in this case because the first comparison (Team A lost to No. 10 and Team B lost to No. 15) is the highest-weighted comparison. Team A losing to No. 7 is then compared to Team B’s loss to No. 10 (weighted less, but still advantage Team A). Finally, we compare Team A’s loss to No. 1 against Team B’s loss to No. 7. This is the least-weighted comparison, but it is still to Team A’s advantage. Team A’s losses are less damaging than Team B’s losses, so they would get the edge (assuming their wins are equivalent).
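Here’s a small sketch of that worst-loss-first comparison. The geometric decay factor is a placeholder I picked for illustration; the real weights would be up to the committee:

```python
DECAY = 0.8  # hypothetical: each successive comparison counts 80% as much

def weighted_loss_penalty(loss_ranks: list[int]) -> float:
    """Order losses worst-first (largest RPI rank number first), then apply
    decreasing weights so the worst loss counts the most."""
    worst_first = sorted(loss_ranks, reverse=True)
    return sum(rank * DECAY**i for i, rank in enumerate(worst_first))

team_a = [10, 7, 1]   # lost to No. 10, No. 7 and No. 1
team_b = [10, 7, 15]  # lost to No. 10, No. 7 and No. 15

# Lower is better. Team A wins all three pairwise comparisons described
# above: 10 vs. 15, then 7 vs. 10, then 1 vs. 7.
print(weighted_loss_penalty(team_a))  # 10 + 7*0.8 + 1*0.64  = 16.24
print(weighted_loss_penalty(team_b))  # 15 + 10*0.8 + 7*0.64 = 27.48
```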

Does this ignore head-to-head results? No. You could have a rule in the system that if the first team out beat the last team in, they flip based on the head-to-head result. I don’t have a strong opinion on whether the system should have that rule, but it’s easy to include and the RPI Simulator does have it as an option.

Would you use SOR for seeding? I suppose you could, but I think it’s best suited for the last-team-in/first-team-out discussion. I’m happy to leave the seeding process alone, especially if travel considerations play a role there.

Could this include a margin-of-victory (MOV) component? It could, but “should it?” is the first question to answer there. Is the extra complexity of MOV worth whatever benefit you might see it bringing? I’m not sure. Including it would mean you are forcing a coach to decide between getting some playing time for his or her bench players or maintaining a four- or five-goal lead to maximize their tournament chances. I don’t like systems that force coaches to make these types of decisions.

If the powers-that-be wanted MOV included, it would be fine as long as there are no arbitrary cutoffs and each marginal goal is worth less than the one before. An extra goal to win by two rather than one should be more valuable than an extra goal to win by four. Winning by six should be the same as winning by 50. It would be a good exercise to try to find some consensus about what sort of MOV advantage would be enough to overcome a lesser RPI SOR.
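As one illustration of a scheme that satisfies those constraints (my construction, not a formal proposal): each extra goal is worth half the previous one, and the bonus flattens at six.

```python
def mov_bonus(margin: int, cap: int = 6) -> float:
    """Diminishing, capped margin-of-victory bonus: each extra goal is worth
    half the previous one, and nothing counts past a six-goal margin."""
    m = min(margin, cap)
    return sum(0.5**k for k in range(m))

print(mov_bonus(1))   # 1.0
print(mov_bonus(2))   # 1.5     (second goal adds 0.5)
print(mov_bonus(4))   # 1.875   (fourth goal adds only 0.125)
print(mov_bonus(6))   # 1.96875
print(mov_bonus(50))  # 1.96875 (same as winning by six)
```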

Doesn’t your SOR model use LaxElo, which means it uses data from previous seasons? The Strength of Record ratings that I have on lacrossereference.com do use LaxElo to determine the strength of the opponents. What I’m proposing for the selection process would use RPI as the input, which means that LaxElo would not be involved. It is imperative that any metric that is used for selections only include results and rankings from the current season.

Would this mean that teams have less incentive to schedule difficult opponents? Absolutely not. If you beat a top team, it’s a huge boost to your resume. If you lose to that team, it doesn’t hurt you much. I don’t think the logic for aspiring tournament teams changes much in this regard. I tested this with Notre Dame. The Irish finished second in the RPI SOR this year. Let’s replace Maryland, Georgetown, Marquette, Michigan and Ohio State with Navy, Providence, Dartmouth, Harvard and St. John’s. Much easier schedule for ND. 

Under this scenario, the 10-2 Irish would see their RPI SOR drop from No. 2 to No. 6. Still safely in. Here’s the kicker, though: let’s say Notre Dame lost to Duke and ended up 3-3 in the ACC rather than 4-2. Now the RPI SOR is No. 11 (Bubble Out). If they’d kept their original schedule and still made the Duke game a loss, their RPI SOR would be No. 3. You could schedule easier non-conference opponents, but it would be a risk, just like it is today.

Click here to play around with this exact scenario to see if you can game the system. What bad downstream effects do you think SOR might cause? Use the simulator to prove it.

What about crediting a team more for road wins? You could do that, sure. Perhaps you add a 10-percent multiplier for road victories and home losses. So, you gain 10 percent more SOR points for a road win and lose 10 percent more SOR points for a home loss. I’m not sure what the right number is, but if the powers-that-be want it included, it is trivial to add.
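Here’s how trivial it would be, sketched with that hypothetical 10-percent multiplier:

```python
ROAD_MULT = 1.10  # hypothetical 10-percent location adjustment

def location_adjusted(points: float, won: bool, away: bool) -> float:
    """Boost road wins and penalize home losses by 10 percent.

    points is positive for a win and negative for a loss, as in the
    vanilla scoring sketch earlier in the piece."""
    if won and away:          # road win: worth 10 percent more
        return points * ROAD_MULT
    if not won and not away:  # home loss: costs 10 percent more
        return points * ROAD_MULT
    return points

print(location_adjusted(70, won=True, away=True))     # 77.0
print(location_adjusted(-30, won=False, away=False))  # -33.0
```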

Would this mean more games? I would think so. Milliman’s comments suggest that today a coach might avoid adding an extra game against a lower-ranked team because of RPI considerations. I can’t imagine why SOR would reduce the number of games, so I assume there’d be more filler games scheduled. More lacrosse!

Could you use something other than RPI to calculate SOR? Yes, you could. As I’ve thought about this, I think RPI is the best method. Everyone understands it, and it only relies on wins and losses. As you start to include data other than wins and losses, you get farther away from the process being easily understandable. I think this is a big deal.

I support exploring whether a different set of weights for the RPI calculation would be better; it feels like the win-percentage weight should be higher if RPI were used as an input to SOR. If there were consensus around something else to represent team strength, I would be totally fine with that, but I worry about metrics that are too “black boxy.” I also worry about metrics, as in the case of MOV, that force a coach to make decisions beyond how to win the game he or she is currently playing.

APPENDIX

More notable results get more weight.

The most basic vanilla SOR system simply assigns points based on opponent RPI. The problem you’d run into with simple SOR is that beating the No. 36 and No. 37 teams is worth more than a win over No. 1 (41 points for beating No. 36, plus 40 points for beating No. 37, is 81 points; beating No. 1 earns you 76). A simple way to game the system would be to schedule extra games against mid-tier teams. Win a few of those and you can make up for no great wins. Not ideal. Kind of a nightmare scenario.

The solution is to care less about the less notable wins. First you compare every team's best win. Then you compare every team's next-best win. Then the third, fourth, fifth, etc. The key is that as you go down the list, the weight assigned to those victories is less and less.

By the time you get to those wins over No. 36 and No. 37, they are weighted approximately 30 percent compared to your best win. So instead of earning more points than another team got for their win over No. 1, you might earn 30 percent of that. Beating those teams improves your resume, but not too much.
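In code, that weighting looks something like the sketch below. The geometric decay schedule is my stand-in; a factor of about 0.776 per step leaves the eighth-best win with roughly 17 percent of its value, which lines up with the Yale example later in this appendix:

```python
N_TEAMS = 76
DECAY = 0.776  # hypothetical per-step decay; the committee would set this

def weighted_win_score(win_ranks: list[int]) -> float:
    """Order wins best-first (lowest RPI rank number first), then apply a
    geometric decay so less notable wins add less to the resume."""
    best_first = sorted(win_ranks)
    return sum((N_TEAMS + 1 - r) * DECAY**i for i, r in enumerate(best_first))

# The nightmare scenario, defused: under vanilla scoring, wins over No. 36
# and No. 37 (81 points) outscore a single win over No. 1 (76 points).
print(weighted_win_score([36, 37]))  # 41 + 40 * 0.776 = 72.04
print(weighted_win_score([1]))       # 76.0
```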

And the risk of scheduling those teams is commensurate with the reward. If you lose to the No. 36 team, it is likely your worst (and most highly weighted) loss, knocking your “best” losses down into territory where they get less weight compared to other teams’ losses. This is such a critical part of the logic here. The problem today is that beating a terrible team is always bad for your resume. If you are a bubble team, a game against a bad team is either your seventh- or eighth-best win or your worst loss. That’s how SOR balances the risk/reward of scheduling.

Better wins or no bad losses.

I mentioned that SOR systems can be fine-tuned. Maybe we want to give berths to the teams with the best peak performance (i.e., the best chance to advance). You can do that by making wins lose less weight than losses as you go from most to least notable results. Or you can credit consistency by making the penalty for bad losses greater. Penn v. Denver would have been a fascinating choice as far as this goes. It sounds like the committee favored Denver’s consistency and better record over Penn’s better top-end wins.

An example.

Here’s how the RPI SOR process would play out when we evaluate the wins for Yale. (If you want the full detail, including Yale’s losses, it’s here.) 

Yale’s best win is Cornell, which finished No. 7 in the RPI. That earns Yale 54 RPI SOR points. The second-best win was over Denver, which earned Yale another 42 RPI SOR points. All told, the nine victories rated as the 10th-best set of wins in the country. Since Yale’s losses were the seventh-least damaging, its ultimate RPI SOR ranked No. 9.

(Note: the decay column shows how much weight the result loses. Every team’s eighth-best win loses 83 percent of its RPI-based value. That means that Yale only earns 7 points for beating the Danes. This is the effect of weighting the most notable results first.)

Another example.

It can also be helpful to look at a real-world head-to-head example of how RPI SOR produces a resume score. The committee dodged a bullet in the 2023 season with all the surprising conference tournament results. They nearly had to pick between Denver and Penn for the last at-large spot. The committee said that Denver would have had the edge, but you could make an argument for Penn since its peak performance was more impressive. You can examine the head-to-head in more detail here.

Here’s how their resumes compare. Look, the committee is right that Denver should have gotten the nod. Denver had the more impressive set of victories and the less damaging set of losses. But there have been years in which a resume like Penn’s got a team in because of the quality victories and “good” losses.