Dan Luu

What to learn?

Added 2021-03-01 11:27:54 +0000 UTC

Steve Yegge has a set of blog posts where he recommends reading compiler books and learning about compilers. His reasoning is basically that, if you understand compilers, you'll see compiler problems everywhere and will recognize all of the cases where people are solving a compiler problem without using compiler knowledge. Instead of hacking together some half-baked solution that will never work, you can apply a bit of computer science knowledge to solve the problem in a better way with less effort. That's not untrue, but it's also not a reason to study compilers in particular, because you can say that about many different areas of computer science and math: queuing theory, computer architecture, mathematical optimization, operations research, etc.

One response to that kind of objection is to say that [one should study everything](https://twitter.com/danluu/status/899141882760110081). I have no objection to taking that approach if that's your preference, but I think *should* is too strong. Another approach that can also work, one that's more to my taste, is to, [as Gian Carlo Rota put it](https://alumni.media.mit.edu/~cahn/life/gian-carlo-rota-10-lessons.html), learn a few tricks:

> A long time ago an older and well known number theorist made some disparaging remarks about Paul Erdos' work. You admire contributions to mathematics as much as I do, and I felt annoyed when the older mathematician flatly and definitively stated that all of Erdos' work could be reduced to a few tricks which Erdos repeatedly relied on in his proofs. What the number theorist did not realize is that other mathematicians, even the very best, also rely on a few tricks which they use over and over. Take Hilbert. The second volume of Hilbert's collected papers contains Hilbert's papers in invariant theory. I have made a point of reading some of these papers with care. It is sad to note that some of Hilbert's beautiful results have been completely forgotten. But on reading the proofs of Hilbert's striking and deep theorems in invariant theory, it was surprising to verify that Hilbert's proofs relied on the same few tricks. Even Hilbert had only a few tricks!

If you look at how people succeed in various fields, you'll see that this is a common approach. For example, [this analysis of world-class judo players found that most rely on a small handful of throws](https://judoinfo.com/weers1/), concluding[^J]

> Judo is a game of specialization. You have to use the skills that work best for you. You have to stick to what works and practice your skills until they become automatic responses.

[Joy Ebertz makes an analogous argument for programmers](https://staffeng.com/stories/joy-ebertz):

> One piece of advice I got at some point was to amplify my strengths. All of us have strengths and weaknesses and we spend a lot of time talking about ‘areas of improvement.’ It can be easy to feel like the best way to advance is to eliminate all of those. However, it can require a lot of work and energy to barely move the needle if it’s truly an area we’re weak in. Obviously, you still want to make sure you don’t have any truly bad areas, but assuming you’ve gotten that, instead focus on amplifying your strengths. How can you turn something you’re good at into your superpower?

It's really difficult to measure programmer effectiveness in anything resembling an objective manner, but back when I played video games competitively, a very long time ago at this point (back before there was "real money" in competitive gaming), the thing that took me from being a pretty decent player (a regular player for a team that was in contention for the championships every season) to a very good player (in one season, I was reliably dominant enough that my team didn't lose a single game I played and I would almost always play in the most difficult matches, against other top teams) was abandoning practicing things I wasn't particularly good at and focusing on increasing the edge I had over everybody else at the few things I was the best at.

I think this is actually more effective at work than it is in sports or gaming since, unlike in competitive endeavors, you don't have an opponent who will try to expose your weaknesses and force you into positions where your strengths are irrelevant. If I study queuing theory instead of compilers, a rival co-worker isn't going to shoot down projects where queuing theory knowledge is helpful and leave me facing a field full of projects that require compiler knowledge.

Unlike Matt or Steve, I'm not going to say that you should take a particular approach, but I'll say that learning a few things and not being particularly well rounded has worked for me in multiple disparate fields and it appears to work for a lot of other folks as well.

If you want to take this approach, this still leaves the question of what skills to learn. This is one of the most common questions I get and I think my answer is probably not really what people are looking for and not very satisfying, but I'm going to give it anyway. By the way, in this post I'm going to discuss technical skills, but I don't see why this shouldn't generalize to non-technical skills.

One of the two main methods I've used to decide what to learn is to opportunistically learn whatever skills folks around me are really good at. For example, at one point, I happened to end up sitting next to a group of former Cray engineers. They'd previously built some of the highest performance and most sophisticated interconnect on the planet, so I asked them how I should learn the basics. One person recommended [Dally & Towles](https://amzn.to/303rkCS), so I read it and asked them follow-up questions to fill in the gaps. More recently, I've been sitting in a group with world-class expertise on, among other things, [tracing](https://danluu.com/tracing-analytics/) and [caches](https://twitter.com/danluu/status/1324416895013986305). It would be an incredible waste not to learn from these folks, so I'm picking up what I can.

Earlier in my career, when I worked at Centaur, relative to other hardware companies their size (and even compared to most much larger companies, although not IBM, which had world-class tooling for this), they were very good at randomized testing, so I learned how to do randomized testing. I never got particularly good by Centaur standards, but if I look at where software companies are today, I'd be surprised if any major software company is as good at this in a decade as Centaur was in 2005, so [I don't actually need to be very good at randomized testing to be "pretty good for a programmer"](/p95-skill/).

One reason I think this approach is reasonable is that [there's a lot of knowledge that isn't written down and can't be learned from books, which leaves extremely time-consuming trial and error or learning from experts as the only options for learning it](/hardware-unforgiving/). Dan Wang has talked about this extensively with respect to semiconductors and manufacturing, but software infrastructure isn't fundamentally different. If random skills are often valuable, then when you have access to experts, it makes sense to learn from them.

In some ways, this is a long-term approach. While I'm sitting next to a world-class cache expert, there's not much need for my middling cache expertise. But a lot of my middling expertise that I picked up in the past is now very useful!

My other approach is much more short term. Instead of (or really, in addition to) learning skills from the experts around me, I try to find skills that are lacking in my organization that would obviously be useful and supply a very poor version of them. An example of this is that, until recently, my org didn't have data science expertise, so I've been using my very basic stats knowledge (acquired by reading the first half of Statistical Rethinking, an introductory stats textbook) because going from having zero people apply stats to one person apply stats can generate huge wins, even if that one person has less stats knowledge than the median data science intern. The impact of applying stats is too large for my organization to ignore, which basically makes my stats knowledge obsolete, since it looks like we're going to hire people with decades of experience instead of relying on some guy who's read half of a book, but the value I got out of learning and applying stats was still worth the time I put into it, even if it was transient.

Although these two techniques are, in some sense, opposites (finding the deepest experts near you and learning from them versus finding a critical lack of expertise near you and learning that material), there's also a sense in which they're similar. In both cases, there's an opportunity. It's just that in one case, the opportunity is rapid growth and in the other case the opportunity is rapidly being able to create a lot of value.

This is a draft post I wrote in roughly an hour (long in terms of wall clock time since I took a break to eat, another to catch up on e-mail, and another to listen to some music). I think I'll probably publish some variant of this on my blog at some point, but I think it's still missing something and, of course, this is totally unedited and it at least needs an editing pass. I'm posting this here in case people are curious what my totally unedited drafts look like. I often remove large sections from a draft when editing but still increase the length significantly, usually by 50% to 100%. Since I haven't read what I've written, I can't even guess at what will happen here. We'll see.

I'm leaving the raw markdown in and not converting it to Patreon markup since that gives a better idea of what my drafts look like. Apologies if that makes it a bit annoying to read.

[^J]: This is an old analysis. If you were to do one today, you'd see a different mix of throws, but it's still the case that you see specialists having a lot of success, e.g., Shohei Ono, arguably the most dominant Judo player this century, is heavily specialized.