Wow, I have been busy, and the Early-Access Beta todo items are piling up! I will just get right to talking about stuff, no introduction necessary!
The Depth Jam and Deduction
Recently, a small group of us did an intensive game design retreat we called the Depth Jam, during which I dug into the deductivity of the Sniper gameplay. I want there to be some deductivity on the Sniper side, so the player has some method or process he or she can step through to eliminate suspects and not just completely drown in information, but I don’t want you to be able to completely grind out who the Spy is by a series of rote operations, like in a game of Clue. Since one of the aesthetic themes for SpyParty is about “making consequential decisions with partial information”, it’s very important that the Sniper not be able to be 100% sure of the Spy’s identity, unless the Spy screws up and the Sniper sees a hard tell. There are a lot of places where deductivity comes into play for the Sniper, but an important one is the highlight/lowlight mechanic.
If you’ve read the documentation, you know that the Sniper can highlight and lowlight people at the party to manage levels of suspicion. This is purely a Sniper-side mechanic: the Spy doesn’t know for sure which characters are being tagged in this way or how, except that the Sniper’s laser has to hit the character to tag them, and this is visible to the Spy. The current version of this mechanic has two levels of highlight above neutral, and two levels of lowlight below neutral. The original intent was that you’d want to vary the amount of suspicion or lack thereof based on how the party is progressing, and for the most part, this worked. However, gamers are very good at optimizing things, and soon one of the elite players realized he could use these five levels as a 5-counter, using both the highlights and lowlights to count total number of visits to the statues and bookcases. Depending on the missions selected for the game, this Sniper would have a pretty good idea of who had to be the Spy.
From a game design standpoint, this kind of player behavior is a delicate thing to direct in a player-skill game. Should the game help the Sniper do this bookkeeping because the player is going to want to do it anyway? Or, does the existence of multiple levels actually encourage this kind of bookkeeping? Should I nerf the mechanic, or just make the NPCs go to the statues more often so the bookkeeping is less relevant? It’s very hard to know the answers to these questions, but after thinking about it for a while, and talking to the elite players, I decided I needed to deal with this in a few different ways, and it will be an ongoing iterative design challenge:
- I proposed nerfing the Sniper’s mechanic to a single level of highlight and a single level of lowlight. Snipers could still use this as a 3-counter, but that’s less valuable, and each individual highlight or lowlight becomes more consequential. On the flip side, it removes the cognitive load associated with trying to figure out if you’re going to single- or double-highlight, which is subtle but significant. It’s also a lot cleaner controls-wise. This was the first real nerf, and the beta community responded well to the idea. There was some grumbling, but players posted thoughtful critiques and analyses of how they thought they’d be affected. I realized I was going to have to test this thoroughly, but it was easy to isolate and I chose it as my question for the Depth Jam.
- I decided against changing the NPC behavior in the short term, but in the long term, instead of simply increasing the probability that NPCs will visit the statues (for example), I might make it so some NPCs end up going to the statues a minimum number of times. However, either of these is a very complex change, because if an NPC is at a statue, it means they’re not at a bookshelf or in a conversation, so the entire flow of the party will change. This is a very nonlinear change, and it’s hard to predict what will happen.
- I can also change the missions that require the Spy to be in “deductivity-susceptible situations”. Keeping with the statue example, at the Depth Jam we ended up modifying the Inspect Statues mission so that instead of requiring three visits to the statues, the Spy could also inspect the neighboring statues if they’re not being held by an NPC. This allows the Spy to trade off duration and number of visits, and requires Snipers to have a feeling for both these quantities. There’s also a discount on the amount of time it takes to inspect each additional statue in a single visit, so you’re encouraged to inspect more than one. Finally, it makes the middle pedestal the most valuable statue to visit, because it allows the Spy to inspect either side, so characters going to the middle statue are a bit more suspicious. I love these tradeoffs, and these changes felt great when we tested them.
To give you an idea of how we went about testing this stuff at the Depth Jam, here is a screenshot of a special build we played. In it, I had the computer automatically increment two counters over the characters’ heads, one for statue visits, and one for bookcase visits, so the Sniper didn’t even need to click. This was never something I’d release to players, but I wanted to see how the game felt if the deductivity was “turned to 11”. Especially before the Inspect Statues changes, you could definitely just shoot (or at least watch) the person with the highest numbers.

I’m very happy with both the single level of highlight/lowlight and the Inspect Statues changes, and they’re going to roll out in the next build.
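As a rough illustration of why the number of highlight/lowlight levels matters for counting, here is a minimal sketch. The `SuspicionTag` class and `countable_visits` function are hypothetical names made up for this example, not the game's actual code, and the sketch assumes a tag simply clamps at its top and bottom levels.

```python
class SuspicionTag:
    """Hypothetical per-character tag: 0 is neutral, positive is highlight, negative is lowlight."""

    def __init__(self, max_level):
        # max_level=2 models the old two-level scheme, max_level=1 the nerfed one.
        self.max_level = max_level
        self.level = 0

    def highlight(self):
        # Assumption: tagging past the top level has no further effect.
        self.level = min(self.level + 1, self.max_level)

    def lowlight(self):
        # Assumption: tagging past the bottom level has no further effect.
        self.level = max(self.level - 1, -self.max_level)

    def distinct_states(self):
        # Levels below neutral, neutral itself, and levels above neutral.
        return 2 * self.max_level + 1


def countable_visits(max_level):
    """How many visits a Sniper could tally on one character by abusing the tag as a counter."""
    tag = SuspicionTag(max_level)
    # Start the "counter" at the lowest available level.
    for _ in range(max_level):
        tag.lowlight()
    visits = 0
    # Bump the tag once per observed visit until it saturates.
    while tag.level < max_level:
        tag.highlight()
        visits += 1
    return visits
```

Under these assumptions, `SuspicionTag(2)` has five distinct states and can tally up to four visits per character, while `SuspicionTag(1)` has three states and saturates after two, which is the sense in which the single-level version is a weaker counter.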
Beta Balance
I haven’t run detailed metrics yet, since I’m scrambling just to keep things humming along for players, but I did get curious about whether the game is even roughly balanced right now. Qualitatively, I’ve been surprised by the seeming consensus in the private beta forums that the Sniper is the harder side to play. In the private testing I did with Ian and Paul, it seemed like at elite levels the Sniper had the advantage, but I wasn’t seeing people complaining about this, which I found interesting. Even more surprising was that players had settled on Pick 4 of 5 Missions on Ballroom as the balanced game mode. Back in the day, Pick 3 of 4 Missions on Ballroom was considered balanced, and even then we were worried about a Sniper advantage at elite levels.
Because SpyParty is so intensely player-skill oriented, I’ve implemented a number of game types so players can handicap matches to make up for skill differences. You can tune the Spy’s difficulty by choosing modes with more or fewer missions to accomplish, and giving the Sniper more or less information about which missions will be enabled. Some modes allow the Spy to complete any small subset of the missions chosen opportunistically while playing. Other modes require the Spy to divulge exactly which missions he or she will attempt. A large skill gap in favor of the Spy can be handicapped by choosing “6 Known Missions”, whereas choosing “Any 3 of 6” makes up for a lot of skill on the Sniper side.
Pick 4 of 5 is significantly harder for the Spy than Pick 3 of 4, yet people were playing it as the default mode once they graduated from Beginner Ballroom, so I decided to run some quick numbers.
The first thing I did was look at the results of all games ever played in the beta. There are four possible outcomes of a game of SpyParty: the Spy can accomplish the missions, the Spy can run out of time, the Sniper can shoot the Spy, or the Sniper can shoot a civilian. We call the first and the last a Spy Win, and the middle two a Sniper Win. The results were even more balanced than I thought they’d be!
Total Games: 13238, SpyWins: 6494 (49.1%), SniperWins: 6744 (50.9%)
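For readers who want to reproduce this kind of tally, here is a minimal sketch of the bookkeeping. The outcome labels and function names are assumptions made up for illustration, since the beta's real log format is not shown here; only the mapping of the four outcomes to Spy and Sniper wins comes from the description above.

```python
from collections import Counter

# Hypothetical outcome labels; the actual beta logs may name these differently.
SPY_WIN_OUTCOMES = {"missions_completed", "civilian_shot"}
SNIPER_WIN_OUTCOMES = {"time_ran_out", "spy_shot"}


def winner(outcome):
    """Map one of the four game outcomes to the side that wins."""
    if outcome in SPY_WIN_OUTCOMES:
        return "spy"
    if outcome in SNIPER_WIN_OUTCOMES:
        return "sniper"
    raise ValueError("unknown outcome: " + repr(outcome))


def tally(outcomes):
    """Count Spy and Sniper wins over a list of outcome labels."""
    counts = Counter(winner(o) for o in outcomes)
    return counts["spy"], counts["sniper"]


def percentages(spy_wins, sniper_wins):
    """Return (spy %, sniper %) rounded to one decimal place."""
    total = spy_wins + sniper_wins
    if total == 0:
        raise ValueError("no games to summarize")
    return (round(100.0 * spy_wins / total, 1),
            round(100.0 * sniper_wins / total, 1))
```

Feeding in the totals above, `percentages(6494, 6744)` gives 49.1 and 50.9, matching the line reported for all beta games.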
Okay, so this is great, but it’s pretty silly, since this includes every newbie game, every game where somebody said “shoot me because I’m stuck on the briefcase”, and the like. So, next I decided to look just at the elite games. I defined “elite game” as one played between the people on the leaderboard with more than 20 hours of game time. That part of the leaderboard looks like this:

I did a few different queries here. First, all games these guys played in the last three weeks:
Total Games: 1794, SpyWins: 826 (46%), SniperWins: 968 (54%)
This includes teaching games, which these guys do a lot because they’re awesome, and everything else, so then I narrowed it to all games they played versus each other in the last three weeks, so these are the most advanced games going on in SpyParty right now…and these guys are very good…I know this because they beat me routinely:
Total Games: 266, SpyWins: 128 (48.1%), SniperWins: 138 (51.9%)
Wow, again, that’s just incredibly balanced!? I couldn’t believe it when I ran these numbers, but I’m super happy with them, and it goes a long way towards explaining why buxx said this to me in chat:
checker: "so, you've played 43 hours or whatever, do you feel like there's a
ceiling for you yet, or still going strong?"
buxx: "Skillwise, not even close, I still feel like I have a ton of room
for improvement"
I just feel incredibly lucky to have a game this deep, this early in its development. I’m so excited to take it even farther…people in competitive game design talk about “300 hour” games, and games like Go and Poker can be played for a lifetime. I hope to take SpyParty to those levels, or as close to them as I can get.
Finally, I looked at these players versus each other at a finer grain. I’ll just post some rough numbers for buxx and ardonite, because they’re an interesting contrast:
| Player | Games as Spy | Wins as Spy | Win% as Spy | Games as Sniper | Wins as Sniper | Win% as Sniper |
| --- | --- | --- | --- | --- | --- | --- |
| buxx | 32 | 23 | 71.88% | 34 | 20 | 58.82% |
| ardonite | 24 | 12 | 50.00% | 22 | 14 | 63.64% |
buxx is the top player right now, and he wins most of the time even against the other elite players, but especially as Spy. ardonite also wins over half the time against the elites, but more so as Sniper.
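The per-side and combined win rates can be recomputed from the raw game counts in the table. This is only a small sketch of the arithmetic; the `records` layout and the function names are assumptions for illustration, and the sample sizes are small enough that the percentages should be read as rough.

```python
# (games, wins) per side, copied from the table above.
records = {
    "buxx": {"spy": (32, 23), "sniper": (34, 20)},
    "ardonite": {"spy": (24, 12), "sniper": (22, 14)},
}


def win_pct(games, wins):
    """Win percentage for a given number of games and wins."""
    return 100.0 * wins / games


def summarize(record):
    """Per-side and combined win percentages for one player."""
    spy_games, spy_wins = record["spy"]
    sniper_games, sniper_wins = record["sniper"]
    return {
        "spy": win_pct(spy_games, spy_wins),
        "sniper": win_pct(sniper_games, sniper_wins),
        "overall": win_pct(spy_games + sniper_games, spy_wins + sniper_wins),
    }


for name, record in records.items():
    stats = summarize(record)
    print(name, {side: format(pct, ".2f") for side, pct in stats.items()})
```

Combining both sides, buxx won 43 of 66 games and ardonite won 26 of 46, so both are above 50% overall against the other elite players.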
I can’t wait to delve into the metrics more in the future.
Due to the Depth Jam and other distractions, the todo list has been building for weeks. I’m finally cranking through it, but I’ve dubbed this next build the “megabuild” because of all the stuff I’m shoving into it. This is not the right way to develop software. You should do more small releases, so you can test and isolate bugs, but I’m kind of on a roll and don’t want to slow down to make a build until I get some more stuff fixed. I won’t do this entire giant list before releasing, but I will do a couple more days’ worth of work. I’m afraid there are going to be bugs due to the number of changes, so it’s really the build after the megabuild that is going to be totally awesome!
