Lessons From Creating and Maintaining a Basketball Stats Website


When I started building pbpstats.com I didn’t really have a plan for it. My main goal was to learn something new while building a site that made it easy to look up stats that weren’t easily accessible. There was a chance it would never be more than a local web app used only by me. As I added more and more to the site, I realized that I should probably get a domain name and deploy it so others could use it. I had experience with back end work and data cleaning, but a lot of this was new to me, so I learned a lot along the way.

The code now is very different from when the site first went live. I have completely redone the front end and refactored most of the back end code. In general I think I now have a maintainable codebase, but it wasn’t like that from the start. This summer I made some updates in places I hadn’t touched in 6+ months and was able to jump right in and easily make the changes. That definitely wouldn’t have been the case with some of my previous projects, or even when the site first went live. I figured I would write down some thoughts on things I have learned. These are mostly for my own record, but maybe they will be useful for someone else.

  • Any corner you try to cut in the present will cost you more time in the future trying to clean up your laziness.
  • Writing tests is very valuable. Sometimes writing tests can feel like a waste of time, but they have saved me so much time when updating the site. It’s nice to be able to make a change, have all tests pass and be (relatively) confident that I didn’t break something else. If a bug turns up that isn’t covered by the current tests, I just add a new test for it.
  • Make good comments. Comment on the reasoning behind the code: why you decided to write it that way. Had I not done this, I probably would have looked at some code a while later and spent time trying to rewrite it, only to rediscover the reason I originally wrote it that way.
  • Document everything. I use Bitbucket, which has some basic built-in issue tracking, and I create a new issue and issue branch for each new feature or bug fix. This has really helped me keep everything organized and leaves a trail of all the changes I have made and why I made them.
  • Spend some time thinking about how future features would fit in and build accordingly. When I first built the site I built the team, player and lineup stats separately. This often meant when making updates I would have to add the same thing in multiple places since they had so much shared functionality. I have now combined all of the shared logic which makes things easier to maintain and test.
  • Not everything needs to be perfect. Worry about getting things working first, then refactor the areas that are causing issues. There are some areas where I think my code could be better, but they work just fine. Rather than going in and refactoring them now, I will wait until I’m working on something that uses the code that could be improved. The code that gets used the most should be optimized first.
  • When doing something you haven’t done before there will be some growing pains. If you aren’t experiencing some pain while trying to learn something new, you probably aren’t learning much.
  • When dealing with messy data, you aren’t going to be able to automate all data processing. Even after spending tens of hours trying to deal with different rare events that need to be accounted for, I still have to manually fix the play-by-play for some games to fix missing events or events that are out of order. Logging games with issues is important to quickly make these fixes.
  • Release a feature when it is ready, then build the next feature. Don’t try to build all the features you want before launching or updating the site. Things may not work the same way on the live site as they do when testing locally. For example, AWS API Gateway has a 30-second timeout that you may not find out about until you try a query that takes longer than 30 seconds. Also, query latency may be different on your production database than on your local database. It’s better to find out about these things early and adjust accordingly than to have to rewrite a bunch of code.
  • It may be intimidating to make your work public but it’s worth it.
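
To make the point about logging games with issues a bit more concrete, here is a minimal sketch of what such a check could look like. It is not the code the site actually uses: the `Event` fields, the function names and the game IDs are all made up for illustration, and it only looks for one kind of problem (events whose game clock is out of order).

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("pbp_processing")  # hypothetical logger name


@dataclass
class Event:
    # Hypothetical, simplified play-by-play event.
    period: int
    seconds_remaining: float  # time left in the period
    description: str


def find_order_issues(events):
    """Return indices of events that appear out of order.

    An event is flagged if it is in an earlier period than the previous
    event, or in the same period but with more time left on the clock.
    """
    issues = []
    for i in range(1, len(events)):
        prev, cur = events[i - 1], events[i]
        if cur.period < prev.period:
            issues.append(i)
        elif cur.period == prev.period and cur.seconds_remaining > prev.seconds_remaining:
            issues.append(i)
    return issues


def process_games(games):
    """Check every game and log the ones that need a manual look.

    `games` maps a game ID to its list of events. Returns a dict of
    game ID -> flagged event indices, containing only problem games.
    """
    flagged = {}
    for game_id, events in games.items():
        issues = find_order_issues(events)
        if issues:
            logger.warning(
                "game %s: %d event(s) out of order at indices %s",
                game_id, len(issues), issues,
            )
            flagged[game_id] = issues
    return flagged
```

The idea is that the automated processing handles whatever it can, and anything it can't be sure about ends up in a log with enough detail (game ID and event position) to go straight to the spot that needs a manual fix. A check like this is also easy to cover with a small test, so a new kind of data problem can get its own test case once it has been found.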
