Blog: Voice Racer

Voice Racer - Fall 2022

Every once in a while I get an idea for something that makes me think "yeah, that could be cool/funny". I then write down those ideas in a text document, and look at them when I want something to do. I had this idea while just hanging out with friends, and thought "I could figure out how to make that". So here we go.

The Plan.

The idea behind this project is incredibly simple: a racing simulator that you control entirely with voice commands. That lets us break the project down into three core components: a voice recognition system, a racing simulator, and an interface to link the two.


With the idea and components settled, it was time to decide on a platform. I have experience using the Unity game engine for 2D games, and decided that its powerful 3D capabilities would be worthwhile to learn for this project. Since I would be using Unity, this meant that my language of choice would be C#.

Recognizing Voices.

Going into this project, I was under the impression that actually doing the voice recognition would be by far the most challenging part of the process. In my experience as a user, when it comes to voice recognition, Google is the best in the business, so I began reading up on their Speech-to-Text API. However, their API is paid, and while I likely would have been able to get by with the free trial version, I opted to begin exploring free or open source alternatives.


This is when I stumbled upon the single most convenient thing I have ever experienced in software development. It turns out that a while ago, Microsoft added a speech recognition API to Windows, presumably for accessibility features. Also, because it's Microsoft, it's usable from C#, and it turns out that you can literally just import it and use it. I skimmed the documentation, and in (I kid you not) less than twenty minutes I had a fully working keyword recognizer. All I had to do was add the keywords to a dictionary and assign their actions. This solution was so hassle-free that if I ever work on any project in the future involving voice recognition, I will definitely use it again. It literally just works.
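For a sense of how little code this takes, here is a minimal sketch of a keyword recognizer. It assumes Unity's Windows-only `UnityEngine.Windows.Speech.KeywordRecognizer`; the class name `VoiceCommands` and the four keywords are placeholders for illustration, not the game's actual vocabulary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;
using UnityEngine.Windows.Speech; // Windows-only Unity speech API

// Hypothetical component: listens for a fixed set of keywords and runs an action for each.
public class VoiceCommands : MonoBehaviour
{
    // Maps each spoken keyword to the action it should trigger.
    private readonly Dictionary<string, Action> actions = new Dictionary<string, Action>();
    private KeywordRecognizer recognizer;

    private void Start()
    {
        // Placeholder commands that only log; a real game would drive the car here.
        actions.Add("go", () => Debug.Log("Throttle on"));
        actions.Add("stop", () => Debug.Log("Brakes on"));
        actions.Add("left", () => Debug.Log("Steer left"));
        actions.Add("right", () => Debug.Log("Steer right"));

        // The recognizer only listens for the exact phrases it is given.
        recognizer = new KeywordRecognizer(actions.Keys.ToArray());
        recognizer.OnPhraseRecognized += OnPhraseRecognized;
        recognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // args.text holds the keyword that was heard.
        if (actions.TryGetValue(args.text, out Action action))
        {
            action();
        }
    }

    private void OnDestroy()
    {
        // Release the underlying speech resources when the object goes away.
        if (recognizer != null)
        {
            if (recognizer.IsRunning)
            {
                recognizer.Stop();
            }
            recognizer.Dispose();
        }
    }
}
```

Attached to any object in the scene, a component like this starts listening as soon as the scene loads.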

Simulating Racing.

With what I thought would be the hard part out of the way, I began on the part that I thought would be easy: creating the racing simulator. Now, I had absolutely no intention of making this an incredibly accurate simulator--I was not going to be designing models to simulate tire wear or anything remotely of the sort. Nonetheless, I still needed to program a car. I created four wheels, and opted to let Unity's built-in physics engine handle most of the heavy lifting, using my code to adjust the torque, steering angle, and braking force applied to each wheel based on player input. This was enough to get me a car that I could sort of drive around the level, but it didn't handle very well. At this point I had something that felt like a minivan, and I wanted a sports car. This meant that it was time for me to get my hands dirty and fine-tune the physics of the car.
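As a rough illustration of that setup, the sketch below feeds three normalized inputs into Unity's `WheelCollider` components. The class name, the torque and angle limits, and the choice of front-wheel steering with rear-wheel drive are all assumptions made for the example; the post does not say how the real car is configured.

```csharp
using UnityEngine;

// Hypothetical car controller: turns normalized inputs into wheel forces.
public class SimpleCar : MonoBehaviour
{
    public WheelCollider frontLeft;
    public WheelCollider frontRight;
    public WheelCollider rearLeft;
    public WheelCollider rearRight;

    // Illustrative limits, not values from the actual game.
    public float maxMotorTorque = 1500f; // newton-metres
    public float maxSteerAngle = 30f;    // degrees
    public float maxBrakeTorque = 3000f; // newton-metres

    // Inputs set by whatever is driving the car.
    [Range(-1f, 1f)] public float throttle;
    [Range(-1f, 1f)] public float steering;
    [Range(0f, 1f)] public float brake;

    private void FixedUpdate()
    {
        // Steer with the front wheels.
        float steerAngle = steering * maxSteerAngle;
        frontLeft.steerAngle = steerAngle;
        frontRight.steerAngle = steerAngle;

        // Drive the rear wheels.
        float motorTorque = throttle * maxMotorTorque;
        rearLeft.motorTorque = motorTorque;
        rearRight.motorTorque = motorTorque;

        // Brake on all four wheels.
        float brakeTorque = brake * maxBrakeTorque;
        frontLeft.brakeTorque = brakeTorque;
        frontRight.brakeTorque = brakeTorque;
        rearLeft.brakeTorque = brakeTorque;
        rearRight.brakeTorque = brakeTorque;
    }
}
```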


Did you know that vehicle handling dynamics are super complicated? I am going to be completely honest with you: physics was one of my least favorite subjects. So I just lowered the car's center of gravity, and spent about an hour tweaking friction curves on the tires until the car felt better. I was okay with the handling not being supercar precise; at the end of the day, people are going to be controlling it by shouting at it. So I got it to where it was "close enough" to what I wanted and carried on.
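Both tweaks are exposed directly in Unity, so they can be sketched in a few lines. This assumes the car uses a `Rigidbody` with `WheelCollider` wheels; the class name and the numbers are placeholders to tune by feel, not the values used in the game.

```csharp
using UnityEngine;

// Hypothetical tuning component: lowers the center of mass and stiffens tire grip.
public class HandlingTuning : MonoBehaviour
{
    public Rigidbody body;
    public WheelCollider[] wheels;

    // Illustrative values only.
    public Vector3 centerOfMassOffset = new Vector3(0f, -0.4f, 0f);
    public float sidewaysStiffness = 1.5f;

    private void Start()
    {
        // Shift the center of mass downward to reduce body roll in corners.
        body.centerOfMass += centerOfMassOffset;

        foreach (WheelCollider wheel in wheels)
        {
            // WheelFrictionCurve is a struct, so edit a copy and assign it back.
            WheelFrictionCurve curve = wheel.sidewaysFriction;
            curve.stiffness = sidewaysStiffness;
            wheel.sidewaysFriction = curve;
        }
    }
}
```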

Driving With Your Voice.

With a functional car and voice recognition engine, it was now time to get to work hooking the two together. For this, I decided that the cleanest method of implementation would be to create an intermediary "driver" object. The keyword recognizer could then call methods on the driver object, which in turn would adjust the throttle and steering inputs on the car. Much to my surprise, this worked super well, and I now had a functional car that you could drive on the temporary plane I had placed in the scene. I then went on the asset store, found a free pack somebody made of race track components, and half an hour later I had a fully working prototype of my game! I then entered...
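The intermediary pattern can be sketched in plain C# without any engine code. Everything here is hypothetical: `CarInputs` stands in for whatever input surface the car exposes, and the keywords are examples rather than the game's real commands.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the inputs the car reads each physics step.
public class CarInputs
{
    public float Throttle; // 0 to 1
    public float Steering; // -1 (left) to 1 (right)
    public float Brake;    // 0 to 1
}

// The "driver" translates high-level commands into input changes,
// so the voice recognizer never touches the car directly.
public class Driver
{
    private readonly CarInputs car;

    public Driver(CarInputs car)
    {
        this.car = car;
    }

    public void Accelerate() { car.Brake = 0f; car.Throttle = 1f; }
    public void Brake() { car.Throttle = 0f; car.Brake = 1f; }
    public void SteerLeft() { car.Steering = -1f; }
    public void SteerRight() { car.Steering = 1f; }
    public void Straighten() { car.Steering = 0f; }

    // Keyword-to-action table that a recognizer could look phrases up in.
    public Dictionary<string, Action> BuildCommands()
    {
        return new Dictionary<string, Action>
        {
            { "go", Accelerate },
            { "stop", Brake },
            { "left", SteerLeft },
            { "right", SteerRight },
            { "straight", Straighten },
        };
    }
}
```

One benefit of this split is that the driver can be exercised without a microphone: calling `commands["left"]()` has the same effect as saying the word.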

Mid-Development Hell.

Mid-development hell is what I like to call the often overlooked "middle" part of game development. After you have a working prototype, next comes fine tuning and polish. This always takes no less than ten times longer than you expect it to, and it is imperative that you create a proper plan to prevent scope creep. For Voice Racer, my plan for polish was the following:

  • Decorate and create a good looking environment around the track.

  • Improve the shading and lighting.

  • Implement a heads-up display keeping track of the car's current speed and lap time.

  • Implement an easy to use way of keeping track of your lap times -- this is a racing simulator, after all.

  • Allow the user to customize the color of the car.

Building an environment.

When I picture a racetrack, two images immediately spring to mind: The Nürburgring and Circuit de Spa-Francorchamps. The beautiful, picturesque way the track ebbs and flows through the countryside is something I aspired to capture in my game. As such, I decided to set the track in the mountains, surrounded on all sides by trees.

The simple solution was to line the track with tree models, and overlay some 2D images of mountains, but this felt cheap. I wanted the skybox for my game to be 3D. There are countless ways of accomplishing this, but one that I have always liked is the one used by Valve Software for their games. Valve takes an approach that involves utilizing two cameras, one of them rendering a small-scale model of what they want to be in the skybox. The reason Valve chooses to do this is to get around limitations with the Source Engine that powers their games, and theoretically Unity would allow you to simply have large models placed far in the distance in your level. However, I argue that implementing Valve's approach within Unity allows for the scene to be more organized and easier to work with in the editor. This approach also enables you to utilize a low culling distance on your main camera and thus achieve better in-game performance without sacrificing visual quality.

For these reasons, I opted to create my own implementation of Valve's method. I utilize two cameras, one in the main level, the other in the skybox. The skybox camera takes the movement vectors from the main camera, scales them down according to the skybox size, and applies them to itself. The main camera then renders everything except the sky, and the skybox camera renders only the sky. The end result is a seamless transition with good visual quality. In the image on the left, the mountains in the distance, as well as the clouds, are all rendered using the skybox camera.
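The camera-following part of this technique might look something like the sketch below. It is an illustration under assumptions rather than the game's code: the class name, the `skyboxScale` of 64, and the use of `LateUpdate` are all choices made for the example.

```csharp
using UnityEngine;

// Hypothetical component for the skybox camera: mirrors the main camera's
// motion at a reduced scale inside a miniature scene.
public class SkyboxCamera : MonoBehaviour
{
    public Transform mainCamera;

    // How many times smaller the miniature is than the real level (assumed value).
    public float skyboxScale = 64f;

    private Vector3 lastMainPosition;

    private void Start()
    {
        lastMainPosition = mainCamera.position;
    }

    private void LateUpdate()
    {
        // Movement of the main camera since the previous frame.
        Vector3 movement = mainCamera.position - lastMainPosition;

        // Apply the same movement, scaled down, so distant scenery shifts only slightly.
        transform.position += movement / skyboxScale;

        // Rotation is copied unscaled so both cameras always look the same way.
        transform.rotation = mainCamera.rotation;

        lastMainPosition = mainCamera.position;
    }
}
```

In Unity's built-in render pipeline, one way to layer the two views is to give the skybox camera a lower depth so it draws first, set the main camera's clear flags to depth only, and use culling masks to split the layers between the cameras.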

Keeping Time.

With the environment completed, it was time to implement a robust timing system. The goal here was for the timing system to keep track of four things: your current lap time, your best time of the session, all of your times in the session, and your all-time best.

To keep track of your current lap time, I implemented a checkpoint system. The timer would simply keep track of how long it took the car to pass each checkpoint in order, and that would be your current lap time. Upon completion of the lap, your time is checked against both your best time of the session, as well as your all time personal best, and both are updated accordingly.
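The checkpoint logic can be sketched in plain C#, independent of the engine. This is a minimal version under assumptions: checkpoints are numbered from zero, the last one doubles as the finish line, and times are passed in as seconds. The `LapTimer` name and its members are invented for the example.

```csharp
using System;

// Hypothetical lap timer: a lap only counts if every checkpoint is passed in order.
public class LapTimer
{
    private readonly int checkpointCount;
    private int nextCheckpoint;  // index of the checkpoint the car must hit next
    private double lapStartTime; // time (seconds) at which the current lap began

    public double? SessionBest { get; private set; }
    public double? AllTimeBest { get; private set; }

    public LapTimer(int checkpointCount, double startTime, double? allTimeBest = null)
    {
        if (checkpointCount < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(checkpointCount));
        }
        this.checkpointCount = checkpointCount;
        lapStartTime = startTime;
        AllTimeBest = allTimeBest;
    }

    // Elapsed time on the lap in progress.
    public double CurrentLapTime(double now)
    {
        return now - lapStartTime;
    }

    // Returns the lap time if this checkpoint completes a lap, otherwise null.
    public double? PassCheckpoint(int index, double now)
    {
        if (index != nextCheckpoint)
        {
            return null; // skipped or repeated checkpoint: ignore it
        }

        nextCheckpoint++;
        if (nextCheckpoint < checkpointCount)
        {
            return null; // lap still in progress
        }

        // Final checkpoint reached: record the lap and start the next one.
        double lapTime = now - lapStartTime;
        nextCheckpoint = 0;
        lapStartTime = now;

        if (SessionBest == null || lapTime < SessionBest.Value)
        {
            SessionBest = lapTime;
        }
        if (AllTimeBest == null || lapTime < AllTimeBest.Value)
        {
            AllTimeBest = lapTime;
        }
        return lapTime;
    }
}
```

Ignoring out-of-order checkpoints is what stops a shortcut across the infield from registering as a record lap.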

I opted to utilize an Array List data structure. Although the constant-time adding and removing of a Linked List would have been nice, I decided that with the operations that I was performing on the list, an Array List would be both easier to implement and yield marginally better performance thanks to constant-time random access. Each time the car successfully completes a lap, the time for that lap is added to the list. I then created a timings screen to display all the times on the list, in addition to the player's current time, session best time, and all-time best.
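As a small sketch of that storage, the example below keeps lap times in a `List<double>`, which is C#'s array-backed list, and formats them for display. The `SessionTimes` class and the `m:ss.mmm` display format are assumptions for illustration; the post does not specify either.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for every completed lap in the current session.
public class SessionTimes
{
    // List<T> is array-backed: appending is amortized constant time
    // and reading by index is constant time.
    private readonly List<double> laps = new List<double>();

    public IReadOnlyList<double> Laps
    {
        get { return laps; }
    }

    public void AddLap(double seconds)
    {
        laps.Add(seconds);
    }

    // Formats a time in seconds as minutes:seconds.milliseconds.
    public static string Format(double seconds)
    {
        // Work in whole milliseconds to avoid floating-point display glitches.
        long totalMs = (long)Math.Round(seconds * 1000.0);
        long minutes = totalMs / 60000;
        long secs = (totalMs / 1000) % 60;
        long millis = totalMs % 1000;
        return $"{minutes}:{secs:00}.{millis:000}";
    }
}
```

A timings screen could then loop over `Laps` by index and print `SessionTimes.Format(Laps[i])` for each row.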


Coloring the Car.

When it came to coloring the car, I had two options: either tint the car model to the user's preference, or allow the user to choose from preset colors. I opted for the latter, primarily because I was never happy with the way tinting the car looked, although I may go back and change this in the future. To change the color of the car, I simply read the user's color preference and swap the material used on the car accordingly. Between sessions, all user preferences are saved in a 'settings.json' file. This way the user does not have to reconfigure their preferences each time, and the file is easy to edit in any text editor if, for instance, the user wants to input an exact value for the game volume.
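Here is one way the settings file and the material swap could fit together, sketched with Unity's `JsonUtility`. The field names, the default values, the class names, and the choice of `Application.persistentDataPath` as the save location are all assumptions; the post only states that preferences live in 'settings.json'.

```csharp
using System;
using System.IO;
using UnityEngine;

// Hypothetical settings layout; JsonUtility serializes public fields.
[Serializable]
public class Settings
{
    public string carColor = "red";
    public float volume = 0.8f;
}

public static class SettingsFile
{
    // Assumed location: Unity's per-user data folder.
    private static string FilePath
    {
        get { return Path.Combine(Application.persistentDataPath, "settings.json"); }
    }

    public static Settings Load()
    {
        // Fall back to defaults on first launch.
        if (!File.Exists(FilePath))
        {
            return new Settings();
        }
        return JsonUtility.FromJson<Settings>(File.ReadAllText(FilePath));
    }

    public static void Save(Settings settings)
    {
        // Pretty-print so the file stays easy to edit by hand.
        File.WriteAllText(FilePath, JsonUtility.ToJson(settings, true));
    }
}

// Hypothetical component that swaps the car body's material by color name.
public class CarPaint : MonoBehaviour
{
    public Renderer bodyRenderer;
    public string[] colorNames;       // e.g. "red", "blue"
    public Material[] colorMaterials; // one preset material per name, same order

    public void Apply(string colorName)
    {
        int index = Array.IndexOf(colorNames, colorName);
        if (index >= 0 && index < colorMaterials.Length)
        {
            bodyRenderer.sharedMaterial = colorMaterials[index];
        }
    }
}
```

On startup, something like `carPaint.Apply(SettingsFile.Load().carColor)` would restore the player's last choice.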

The Dark Side of Polish and What I Learned.

When I first embarked on this project, I naively assumed that it would take two weeks. It ended up taking twelve. Part of that was my fault: I was only able to allocate an hour or so each day due to school. But more importantly, I completely underestimated the time game development takes. Even a relatively simple game such as this one has so many moving parts, all of which you want to polish to the extreme. Eventually, you just have to throw your hands up and say "I'm done, this is as good as it's getting, time to publish". I almost certainly could have published this game seven weeks ago, but the voice recognizer was inconsistent so I decided to refine it. I almost certainly could have published this game six weeks ago, but the graphics were terrible so I decided to work on them. I could have published this game four weeks ago, but the main menu was hideous. You get the point. The work never ends. Even now I could point out a dozen things in this game I'm not fully happy with, but at some point you just need to declare something "done".

Development setbacks aside, I want to pat myself on the back for not allowing scope creep to take hold. It was so tempting to implement things like online leaderboards, or a track editor to allow users to create their own circuit. But no, these features were not in the initial plan, and implementing them would serve only to balloon the project. Additionally, I learned a lot about structuring game development projects like this "correctly". I have made games in the past, but the scenes and code have all been spaghetti by the end of the project. For Voice Racer, I think I did a good job keeping things organized and modular.

Download.

If you want to check Voice Racer out for yourself, you can download it by clicking the button below.