This writing represents a conversation with myself through time.
Essays marked with an asterisk (*) are technical. If unsure what to read, try War and Business, Falsifiability, or Welcome to the Future.
In Silicon Valley, it is common to hear people deliberate over life and career decisions through the framework of leverage. People decide to found company X or work at company Y because "it's the highest leverage choice." That is to say, it's the position from which they can exert the biggest raw impact on the world from their current place in life. High leverage choices have many multipliers between actors and the effects of their actions. Being the engineering lead for Google's search team is high leverage work; being a lone engineer on that team of 100 engineers is not. Leverage is purely about the scale of impact a single person can exert as a result of intentional choices that multiply the effects of their actions (thereby acting as a "lever" on those actions).
One thing that I've found disturbing about this philosophy is that it completely omits the notion of uniqueness in one's work. Let's caveat this with the fact that everyone has a different worldview, and this is merely my own, etc etc. When thinking about what you could do that has the highest leverage, you forget to think about what only you could do. To see how different uniqueness and leverage can be, consider being an engineering lead at Facebook. Your team manages ad placement, which is business critical on the order of billions of dollars. This is enormous leverage. However, if you weren't doing it, some other commoditized MIT-CS-PhD type (call them Dr Nurd) would be doing it instead of you. Now consider the scope of work for an engineering lead: choosing a technology stack, architecting an app or feature, recruiting new engineers onto the team, amongst other things. At most times there exists a clear state of the art and standard practice for doing the first two things, and individual junior engineers you recruit onto the team are unlikely to utterly transform its direction and atmosphere. That is to say, Dr Nurd and yourself have significant decision overlap -- the ad placements on Facebook would look mostly the same under both of you, especially since the major product decisions are made by executives.
And so on some level, there is very little fundamentally new (where new means doing things very differently to how someone similar to you would do them) you add to the world by being in it. Conversely, a mediocre writer with an audience of even a few thousand people is doing something that could not have existed without them. The books pumped out by two similar writers can be vastly different. A good heuristic: would the world have lost something, subtle but fundamental and substantive, if you, in particular, hadn't existed?
And so we must consider both uniqueness and leverage: the number of "forks in the road" -- that is, opportunities to make a unique decision very different to what someone else would have made in your position -- presented to you per unit time (say, per day), and how different the product/service you provide looks to the end consumer as a consequence of those decisions. For instance, if you have an eclectic taste in coffee, and use your clout at work to get an obscure brand of coffee maker brought into work, you've contributed something new to the world, but it has no meaningful impact for the consumer of the end product of the company (uniqueness but not leverage). In contrast, if you're choosing between two recently published sublinear network routing algorithms for the streaming service on Netflix as an engineering lead there and you choose the one that is more amenable to systems optimizations, you're not making any decisions unique to you (Dr Nurd would have done the same) but that choice (the packet transmission speed induced by the resulting systems optimizations) is having a huge effect on the user experience (lag, loading times) for the consumer (thus, leverage but not uniqueness).
In general, work that has high uniqueness is that which presents many forks, as described above. And so intensely creative work -- from theoretical physics to playwriting to architecture -- has high uniqueness because it represents the will and vision of its creator, manifest in the flesh. And two otherwise similar people can produce vastly different pieces of prose, since the possibility space for different types of prose or plays is much, much bigger than the possible decisions two engineering leads could make about choosing a technology stack (where there exist best practices for the most part). But there is another type of work that has high uniqueness: work that you think is important but that almost everyone else doesn't. For a pedestrian example, consider a YouTube channel on Russian stamp collection. Because nothing like this exists, if you were to start one and get even a few thousand subscribers, you would have brought into the universe something fundamentally fresh and new, something that wouldn't exist without you. The universe would be a little worse off without you. The same cannot be said for, say, college admission reaction videos that go viral on YouTube (if it wasn't yours, it'd be someone else's). For something a little grander, consider entrepreneurship; because starting a company involves thousands of little decisions that two similar people may disagree on (eg. product features), two similar founders can create two enormously different companies. Consider Brian Chesky and AirBnB -- one of the few examples of a truly contrarian bet. Everybody thought the idea was moronic, including the hundreds of investors the founders first pitched to, and, importantly, no-one else was working on anything like it. Therefore, AirBnB (or anything like it) would likely not exist if Brian Chesky didn't, and we'd all be worse off because of it.
Before people accuse me of criticising "conventional" jobs, I should note that the thing that matters in the end is not just uniqueness or just leverage, but the product of the two. And so if I had to choose between an obscure stamp collecting channel and being an engineering lead at a big tech company, I'd probably choose the latter (begrudgingly). But if I had to choose between being a niche writer with a few thousand (dedicated) readers, barely making a frugal living but getting by, and an engineering lead, I'd choose the former in a heartbeat. As caveated at the beginning, this is predicated on the fact that something resembling uniqueness matters to you at all. I think it intuitively does to most people, but I hadn't seen anyone quite articulate it in the way that made sense to me.
I recently had a bit of a chat over email with someone I've looked up to for a long time, and here's a tidbit from one of their responses.
I'm particularly interested in his first point about opacity, as a few examples of such an "opacity bias" come to mind from my own experiences. You think something is trivial (unimportant) in the naive sense as an outsider to the field, but when you learn a little bit about the field you realize how important it is that that thing is done well.
There is a Harvard professor I got to know who works in behavioral economics and macroeconomics. In behavioral economics, there is a notion of "choice architecture," where framing matters for human decision making. For example, changing 401k plans at companies from default opt-in, where you had to fill out an annoying form to start your 401k retirement plan, to a default opt-out, has done a huge amount to increase savings in the United States (empirically). It's not a profound idea, but it's important. Even more, this economist spent a great deal of time using ideas from psychology and behavioral economics to think about how the form for opt-out should be designed, distributed, and presented, to make sure anyone who actually wanted to opt out, even a little bit, had no bureaucratic burden stopping them from doing so. I was flabbergasted to see that someone so intelligent could be spending their time and intellectual energy on something so pedestrian as form design. It seemed to me that the work of this prodigious economist was as unimportant as that of a lone accountant at an obscure small company.
Then I took a course on basic macroeconomics, and learned how the banking system actually works. It turns out (this was new to me; or, rather, it's obvious but I had never actually given the topic serious thought) that the reason banking, as an institution, is so profound as a lever for progress, is that it moves capital from useless places (under one's mattress) to useful places (as the up-front cost needed to kickstart a new construction project). Banks are responsible for correctly (efficiently) allocating this capital (and the associated risk), and the fundamental idea is this: all of this capital comes from savings. A bunch of it comes from corporate savings (residual profit companies have), but a lot of it also comes from savings on the level of individual households. And so, any money an individual or household saves (such as in a 401k) is really a contribution to, and a bet on, the future progress of society (capital being earmarked for capital-intensive projects like construction or loans to businesses). And there are robust (ie. reproducible) examples of how savings rate has much to do with future economic output, and, transitively, future progress and prosperity (especially when it comes to technology). And most of all, when you look at the effect of form design on the choices people make by examining huge datasets of millions of responses, you see that even small changes have big effects on how much people save (this is part of the reason that big tech companies A/B test their UIs so aggressively -- these things make a big difference). And so being an economist who is part of the reason that opt-in 401ks became opt-out 401ks is something to be immensely proud of: it directly contributed to billions of dollars of excess capital being invested into the future, as opposed to being spent on jewellery or stored under a mattress. It's just that this contribution is opaque to most outsiders to the field, as it was to me.
That's a lot of leverage this economist has had; certainly one that Monsieur Lambda at Ye Olde Accountancy Firme doesn't have to his name.
And so it was really only by learning the foundational/basic modes of thought in a field (macroeconomics) that I gained the tools I needed to properly appreciate the motivations and importance of certain work being done in that field. I think this is one example, among many others in my own experience and that of others, of how something seemingly trivial to outsiders can turn out to be of profound importance once you think and learn a little about it, ridding yourself of "opacity bias."
[TODO: more examples]
There seem to be two types of mathematical models. There are those of convenience, which use heuristic arguments and simple reasoning to create and justify equations that vaguely seem to fit a system, and often yield surprisingly good predictions for their relative simplicity, but don't say anything about the nature of reality that underlies the systems being described. The other type of mathematical model is one that makes strong assertions about underlying mechanisms, and, in doing so, posits something concrete and deep about the underlying nature of the system it's trying to explain. These are often found in physics.
Here's an example of the first type of model: a system of differential equations we create to describe the growth and decline of certain populations. We justify them based on our intuition that, initially, a population will grow exponentially in times of plenty, and will eventually plateau due to constraints in the environment around it. This gives rise to the logistic curve we know so well, and it's that curve that we use to make predictions. Amongst other things, these equations predict that growth per capita (or per individual) will decline asymptotically to zero as we reach the "carrying capacity" of the environment. Of course, in practice, we see that this doesn't hold. The model provides a decent approximation some of the time, capturing the fact that growth is initially fast and then slows down, but it is empirically quite wrong for most population systems. And we explain these discrepancies away by mentioning that real systems are more complicated and we are missing variables, and it is a simple model, and so on. The key point is, these are models of convenience that arise from "economy of thought," as physicist and philosopher Ernst Mach put it. We do not believe that this model captures some deep truth about the universe; instead, it is more of a mental crutch that helps us make sense of why population grows in the general way it does.
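To make the logistic model concrete, here is a minimal sketch. It assumes the standard single-population form dN/dt = r·N·(1 − N/K); the names `r` (growth rate), `k` (carrying capacity), and all numeric values are illustrative assumptions, not anything taken from a real population.

```python
def per_capita_growth(n, r, k):
    """Per-capita growth rate (1/N) dN/dt = r * (1 - N/K)."""
    return r * (1.0 - n / k)


def simulate_logistic(n0, r, k, dt=0.01, steps=5000):
    """Integrate dN/dt = r*N*(1 - N/K) with forward Euler; returns the trajectory."""
    trajectory = [n0]
    n = n0
    for _ in range(steps):
        n += dt * n * per_capita_growth(n, r, k)
        trajectory.append(n)
    return trajectory


if __name__ == "__main__":
    # Illustrative values only: r, K and the starting population are assumptions.
    r, k = 0.5, 1000.0
    path = simulate_logistic(n0=10.0, r=r, k=k)
    for step in (0, 500, 1000, 2000, 5000):
        n = path[step]
        print(f"t={step * 0.01:5.1f}  N={n:9.2f}  "
              f"per-capita growth={per_capita_growth(n, r, k):.4f}")
```

Early on, per-capita growth sits near `r` (the exponential phase); as N approaches K it falls toward zero, which is exactly the prediction the paragraph above says real populations often violate.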
Here's an example of the second type of model. Consider Einstein's general theory of relativity, which goes beyond giving some equations we can use to make predictions, and actually asserts that reality itself is a certain way. Specifically, it asserts that mass and energy, as described by the energy-momentum tensor, change the curvature of spacetime, and that these changes in geometry manifest as the visible effects of gravitation. It goes beyond Newton's equations, which are closer to the first type of modelling, in explaining exactly why gravitation exists in the first place. Another example is the ideal gas laws. They actually assume the existence of atoms in order to reason about the behavior of individual atoms on a mechanical level. The laws are not entirely correct because of the assumptions they make, but they do reflect some underlying reality (atoms exist, or so we believe).
Here's an example of mathematical modelling that is not obviously a model of "convenience," nor one that reflects some "deep natural truth." Consider the humble uranium atom. In high school, a typical setting in which differential equations are taught is radioactive decay, since it is obvious that the rate of change of the number of atoms depends on that number. The more atoms there are, for some fixed probability of decay per atom, the more atoms there are that decay. But what quantum mechanics makes difficult is reasoning about whether probability is deeply embedded in reality, or whether there's an underlying structure we simply don't understand. The fact that physicists showed that there is no local "hidden variable" governing quantum mechanical behavior (which at this point we all agree is inherently probabilistic, whatever that means) doesn't mean that there isn't some deeper mathematical structure (eg. M-theory) at play, causing the randomness we observe in quantum mechanical behavior (as well as atomic decay, etc). This has always been troublesome for my intuition because it's not clear whether I should think of QM as a model of convenience like the Lotka-Volterra equations, or a model that reflects the underlying nature of reality, like GR.
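The decay argument can be sketched in a few lines. The code below is an illustration under stated assumptions: time is discretized into steps, each surviving atom decays independently with a fixed probability `decay_prob` per step, and the atom count and probability are made-up values. The deterministic curve N₀·(1 − p)ᵗ is the discrete analogue of the textbook solution N(t) = N₀·e^(−λt) to dN/dt = −λN.

```python
import random


def expected_remaining(n0, decay_prob, steps):
    """Deterministic prediction: each step keeps a fraction (1 - p) of the atoms."""
    return n0 * (1.0 - decay_prob) ** steps


def simulate_decay(n0, decay_prob, steps, seed=0):
    """Each surviving atom decays independently with probability p per step."""
    rng = random.Random(seed)
    remaining = n0
    history = [remaining]
    for _ in range(steps):
        decayed = sum(1 for _ in range(remaining) if rng.random() < decay_prob)
        remaining -= decayed
        history.append(remaining)
    return history


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    n0, p, steps = 10_000, 0.05, 50
    history = simulate_decay(n0, p, steps)
    for t in (0, 10, 25, 50):
        print(f"step {t:2d}: simulated={history[t]:5d}  "
              f"predicted={expected_remaining(n0, p, t):8.1f}")
```

The simulation is random at the level of each atom, yet the aggregate tracks the smooth differential-equation prediction closely. That is the tension in the paragraph above: the equation works whether or not the underlying randomness is fundamental.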
The reason I mentioned the ideal gas laws earlier is because in Asimov's classic series, Foundation, the protagonist develops a statistical theory of macroeconomics called "Psychohistory." In it, he uses advanced mathematical statistics alongside psychology to make predictions about societal behavior, in a time when the population of all humans exceeds one quintillion and is spread across the "galactic empire." After reading some Taleb and perhaps even after taking some psychology courses and seeing how crude our methods for understanding the human mind are, it's easy to believe that something so complex as the human psyche and its behavioral consequences will never be understood well, let alone its emergent behavior when many such minds interact.
On the other hand, cognitive psychology -- and particularly the work of Kahneman and Tversky -- has yielded lots of experimental and theoretical results that are exceedingly reproducible. The reason I'm not a fan of psychology at large is that most results don't replicate. That is, most things published in papers are wrong; mere figments of chance and statistics abused by people (psychologists) who have near-zero understanding of statistics. And so the fact that this work (mostly taken from the book Thinking, Fast and Slow) has been so thoroughly reproduced and shown to be robust to replication is, to me, astonishing. This was emphasized to me in my introductory economics course last year, where the professors ran several live experiments with our class of a few hundred people, such as running a real-time prediction market, auctions, and asking people to make savings and investing decisions with real money. Again and again, the results were almost exactly in line with those obtained from an entirely different group of participants at a different place and time!
This list of common behaviors and biases makes little use of grand theories from biology or neuroscience, and instead humbly seeks to make predictions based on empirical evidence, without philosophizing. Much in the style of Taleb's "convex tinkering." This list reminds me of the ideal gas assumptions we make when deriving the famous pV = nRT equation and ideal gas laws. We use these assumptions (which reflect some deep truth about individual atoms' behavior) and statistical reasoning to determine probability distributions of outcomes of fluid systems, and are approximately right, much of the time! If we have solid results on the level of small groups, results that reproduce, it doesn't matter that the bodies in question are sentient. We've already done much of the hard work in finding the laws that determine the behavior of the unit (the small group of people, for example in an auction or prediction market). Now all that is left is to apply probabilistic reasoning to make predictions about many such units (eg. a country or religion) in the framework of statistics. Perhaps the reason we can't do this yet is because 7 billion people is not enough for such behavior to manifest robustly, and indeed we do need closer to a quintillion. Or perhaps there are behaviors that emerge with scale (eg. differ greatly between 100 and 1 million people) and so these psychological results from Kahneman and Tversky are rendered useless. Or perhaps we need a new kind of statistics to reason about such a thing, in the same way that Boltzmann had to make strides in probability theory itself to formalize his laws of thermodynamics (the greatest scientific result ever published, I should add -- alongside evolution, also a theory that is statistical in nature).
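The statistical-mechanics analogy leans on a standard fact: when N independent units each behave randomly, the relative fluctuation of the aggregate shrinks like 1/√N. A minimal sketch, assuming each unit makes an independent yes/no choice with the same probability `p` (a hypothetical value). Independence is a strong assumption, and it is precisely what "behaviors that emerge with scale" would break.

```python
import math


def relative_fluctuation(n_units, p):
    """Std / mean of the number of 'yes' choices among n independent units.

    The count is binomial(n, p): mean = n*p, std = sqrt(n*p*(1-p)).
    """
    mean = n_units * p
    std = math.sqrt(n_units * p * (1.0 - p))
    return std / mean


if __name__ == "__main__":
    p = 0.3  # hypothetical probability that a single unit takes some action
    for n in (100, 10**6, 7 * 10**9, 10**18):
        print(f"N={n:.0e}  relative fluctuation={relative_fluctuation(n, p):.2e}")
```

Under these assumptions, going from 7 billion units to a quintillion tightens the aggregate prediction by roughly four orders of magnitude, which is one way to read the remark that 7 billion people may simply not be enough.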
Therefore I hold two contradictory beliefs. One is that much of the work of the economist is doomed to failure because human behavior is too complex to predict, even probabilistically. The other is that we do have empirical results that predict such behavior, and this means that better statisticians than those currently living may be able to generalize these to larger groups, and then use them to make accurate predictions about the behavior of a few million people, in response to some policy or conflict. I think there is more to be fleshed out in trying to ascertain what a real theory of psychohistory would look like, and how it would be notably less "theoretical" and much more statistical than current approaches to economics or social science. It would probably have to come from someone trained as a mathematical statistician, who understands human behavior on a practical (think bar fights and comedy clubs) as well as a formal (Kahneman's work) level.
[TODO: Think about what such a theory might look like more specifically. Or try to build it yourself and see what roadblocks there are. ]
The paper Controlling the False Discovery Rate via Knockoffs by Barber and Candès in 2014 is, in my opinion, a great example of statistical theory at its best. In it, they present a new way to identify causal covariates in high dimensional settings. Put more simply, if you have 10,000 genes potentially contributing to a disease, how do you identify which ones are most causally important, given all sorts of correlations between the genes, known and otherwise? Controlling the false discovery rate (FDR) of causal factors is crucial in fields ranging from the obvious (natural sciences like genetics) to the subtle (A/B testing web designs at big tech companies). Their new method for doing so is intuitive, and in some sense, obvious (in hindsight). That's how you know it's the real deal; a graduate student in statistics tells me it's the sort of thing that will be a standard part of inference courses in 30 years' time. Here, we take this paper apart, studying what the core contribution is, how the authors might have come up with it, why it's important, and the new questions posed in the field (high dimensional inference) as a result. With that, let's begin.
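For reference, the quantity being controlled has a standard definition. The symbols below are a notational choice for this note, not necessarily the paper's: R is the number of covariates a procedure selects, and V is the number of those selections that are false (truly null covariates).

```latex
\mathrm{FDR} \;=\; \mathbb{E}\!\left[\frac{V}{\max(R,\,1)}\right]
```

A procedure "controls the FDR at level q" if this expectation is at most q. In the gene example, with q = 0.1, no more than 10% of the genes flagged would be false leads on average.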
[TODO]
When living in the Bay Area and seeing what people need to know to build interesting things in various fields, I started to believe that most of what you learn in college is (for most people) useless, and that if you are purely optimizing for vocational expertise you're better off taking the foundational courses in your field and then leaving school. My attitude on this has shifted slightly. I maintain that the above stance is, as stated, probably accurate, but I also now appreciate where this thesis is incomplete.
Where I Was Wrong
The initial premise of my argument was that learning new material is only useful insofar as it does one or more of these things.
- Stretches your cognitive limits (eg. Galois theory)
- Teaches you material you will directly use in the future (eg. intro CS)
- Is fascinating enough that you would regret not studying it (eg. Quantum Mechanics)
This means understanding the foundational modes of thought of the field well enough that you could easily teach yourself adjacent material in the field. The "in a specific field" caveat is important because I do not think that taking a hard graduate course on, say, randomized algorithms will improve your ability to learn financial economics or solid-state chemistry by making you "generally smarter." I think humans are too bad at applying knowledge across domains (even related ones) for that to happen. Instead, what I mean is that taking a grad course on randomized algorithms will help you learn computational complexity theory, algorithmic game theory, and probability theory. In other words, fields that have substantial overlap in content/keywords, and not just "modes of thinking." I elaborate on this, analogizing it with graph theory and giving examples, below.
Knowledge as Graph Theory
To understand what I mean by "technical maturity," think of knowledge as comprised of nodes in a graph. Each tidbit of knowledge constitutes a node, and the connections between them are the edges. For example, a class about line and surface integrals over vector fields would create a new node on the graph, linking it to, say, electromagnetism from one's physics class, which is full of prime examples of line integrals that have physical meaning. That would be an edge between the two nodes.
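One way to picture this graph is as a plain adjacency structure. The sketch below is only an illustration: it assumes two nodes are joined by an edge whenever they share at least one keyword, and the topic names and keyword sets are hypothetical.

```python
from itertools import combinations


def build_knowledge_graph(topics):
    """Connect two topics with an edge whenever they share at least one keyword.

    `topics` maps a topic name (node) to a set of keywords.
    Returns an adjacency dict: node -> set of neighbouring nodes.
    """
    graph = {name: set() for name in topics}
    for a, b in combinations(topics, 2):
        if topics[a] & topics[b]:  # non-empty keyword overlap
            graph[a].add(b)
            graph[b].add(a)
    return graph


if __name__ == "__main__":
    # Hypothetical keyword sets, chosen only to illustrate the idea.
    topics = {
        "vector calculus": {"line integral", "vector field"},
        "electromagnetism": {"line integral", "vector field", "charge"},
        "corporate taxes": {"deduction", "depreciation"},
    }
    for node, neighbours in build_knowledge_graph(topics).items():
        print(node, "->", sorted(neighbours))
```

Here the vector calculus and electromagnetism nodes end up linked, while the corporate taxes node stays isolated because it shares no keywords with the others.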
Why is this relevant? The contention that many champions of the liberal arts have is that studying challenging topics in broad contexts makes one "generally smarter" and improves one's "ability to learn," in generality. This is NOT what I mean by technical maturity. Their claim, in the context of our knowledge graph, is that a node about commutative groups from abstract algebra, or the portrayal of feminism in Austen's Pride and Prejudice, has edges going to things that people do on their jobs; say, writing code or hiring employees. I think there is a wealth of evidence that humans are very poor at transferring knowledge across domains, even closely related ones. Therefore, this "liberal arts as building critical thinking skills" school of thought, I see as pretty much baloney.
But they're on to something. Instead, I claim, knowing about, say, commutative groups is useful for learning molecular representations in chemistry (think of these topics as knowledge nodes; I'm claiming most people would be able to connect the two). Indeed, the latter is quite different from abstract algebra, but has clear and direct applications of group theory and so is similar enough for meaningful transfer to happen. In other words, not as different to abstract algebra as, say, doing corporate taxes in Excel. Just different enough. Similarly, knowing about the portrayal of feminism in Austen (again, think of this as a node) really could conceivably be directly useful when studying the Suffragette movement because, again, they're different yet similar enough.
Therefore what I mean by technical maturity is comfort around a set of keywords. In abstract algebra, one keyword might be "symmetric structure" and in feminist studies one might be "intersectional theory." Having heard these words in other contexts -- even very different ones -- adds value and builds "technical maturity." However, groups and feminist theory are too far from writing code or drafting speeches to plausibly have any keyword overlap.
A Concrete Example
In the fall of my freshman year, I took my first rigorous math class. It studied linear algebra from an abstract perspective. One of the things we learned about was unitary operators, which are linear transformations that preserve inner product structure. During the winter break after that semester, I perused a textbook on quantum computation. And in quantum computing, the unit of logic/computation is the idea of a quantum gate, itself a linear transformation that takes some input and maps it to some output. Being a linear transformation, such a logic gate is represented as a matrix, and it operates on vector inputs.
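As a small illustration of a gate acting on a vector, the sketch below applies the Hadamard gate (a standard single-qubit unitary) to a state and checks that the sum of squared amplitude magnitudes, which is what quantum mechanics reads as total probability, is unchanged. The particular state and the non-unitary contrast matrix are made up for the example.

```python
import math


def apply_gate(gate, state):
    """Multiply a square matrix (list of rows) by a state vector."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]


def squared_norm(state):
    """Sum of squared magnitudes of the amplitudes (total probability)."""
    return sum(abs(a) ** 2 for a in state)


# Hadamard gate: a standard single-qubit unitary matrix.
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

# A non-unitary matrix for contrast (hypothetical; not a valid quantum gate).
NOT_UNITARY = [[1.0, 1.0],
               [0.0, 1.0]]

if __name__ == "__main__":
    state = [0.6, 0.8j]  # squared magnitudes 0.36 + 0.64 sum to 1
    print("before:          ", squared_norm(state))
    print("after Hadamard:  ", squared_norm(apply_gate(H, state)))
    print("after non-unitary:", squared_norm(apply_gate(NOT_UNITARY, state)))
```

The unitary gate leaves the total at 1 (up to floating-point error), while the non-unitary matrix inflates it, so its output could no longer be read as a set of probabilities.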
One thing that the author of the textbook really emphasized and spent a good amount of time explaining was why we use unitary operators for our logic gates. Evidently, he thought most people reading the book would find that unmotivated. But because I was lucky to take a relatively abstract linear algebra course and had taken a class on probability beforehand, it didn't need any explanation: unitary operators preserve inner products, and random variables are vectors. If you want to ensure that the probabilities in the vectors sum to one both coming in AND leaving the linear transformation (quantum logic gate), using a unitary operator is the obvious choice, since it preserves the inner product (and hence the total probability) of its input. This is the key point: taking the class on abstract linear algebra and one on probability firmly planted nodes (unitary operators, random variables as vectors), and the new topic I was learning (quantum logic gates) had enough keywords in common that I could reasonably create new edges between these knowledge nodes, on the fly. This is the crux of technical maturity: increasing your threshold for what is "obvious".
What Might This Look Like In Practice?
If you see college mostly as I do -- a 4-year course on "learning to think like an X" for many different X (mathematician, programmer, physicist, economist, statistician, historian, psychologist), you should optimize for taking difficult, foundational courses in many fields. It's important to emphasize that by "foundational" I don't mean introductory. For example, I do not think taking an introductory CS course or two suffices to "learn to think like a computer scientist." Instead, take intermediate/advanced courses in algorithm design and operating systems to get those gains.
At Harvard, you need to take 12-16 courses in a field and 32 courses overall to graduate with a degree in that field. What if, instead, you took the hardest/meatiest 6/12 in several related fields? In math, perhaps that includes group theory, rings and fields, real and complex analysis, differential geometry. In physics, perhaps classical and quantum mechanics, electrodynamics and thermodynamics. In economics, maybe the intermediate micro/macro sequence alongside game theory, economic history, political economy. In CS, algorithms and operating systems, programming paradigms, complexity theory and compilers. You'd get 80% of the value of 3-4 different undergraduate degrees, and more importantly be equipped to at least understand parts of the research frontier in each field. Teaching yourself would become much easier, and you wouldn't really be an outsider to any of the fields.
The obvious caveat here is if you're set on spending your life exploring and deepening a specific field. If you want to be a mathematical physicist, ignore my advice. Go ahead and take quantum field theory and Lie algebra at the expense of understanding how economists or computer scientists think. That is the price you pay for mastery. And that's fine. But most people are not quite as pointed in their goals, and so my thoughts are probably more relevant for them (and myself).
To summarize, I think that building technical maturity in multiple fields equips you with the ability to learn new things in those specific fields (unlike the claims of the pure-liberal-arts charlatans, who claim it builds critical thinking in general). And this manifests itself as deep comfort in novel situations (again, only in the fields you've cultivated maturity in), which leads to very fast learning and growth of knowledge graphs. Ultimately, the point of college is to set down as many nodes as you can in as many diverse fields as you can, constrained by the fact that you want node density within any specific field (ie. part of your knowledge graph) to be high enough that the nodes roughly approximate/sample the entire space, so you can learn new things in new fields and slot them into the context of existing knowledge.
This mostly explains the big difference I saw between technical college graduates and high-school dropouts in the Valley -- the college graduates felt comfortable in novel situations because they knew their graphs were wide and dense. The high-school dropouts mostly had no nodes to use as reference points from which to grow their graph, and so struggled to keep up with new material. Of course, this does not always hold; you can create these graphs outside of the academy and most people in the academy fail to do a good job at creating these graphs at all. But for me, school seems to work well for this, so I'm happy I'm here, for now.
The 2012 paper by Alex Krizhevsky et al., in which he implements a deep convolutional neural network for image recognition, is perhaps the most seminal academic paper that has been written in the last few decades, full stop. It started the "deep learning revolution." Despite this, it's interesting to note that most of its contributions are not novel.
It was not the first paper to show how deep networks can significantly outperform traditional neural nets for tasks like image recognition. A paper by Ciresan et al. (2011) had already done that. It was not the first paper to show how GPUs can be used to train nets much faster than CPUs. Ciresan's implementation was also written in CUDA to exploit GPU support. And of course neither has a claim to originality as far as the concept of convolution in image recognition is concerned; Yann LeCun (2018 Turing) came up with the idea of a CNN in 1989! AlexNet's success is an example of "incrementalism as discontinuity" innovations. By this, I mean that it has no substantive novel contribution as far as theory is concerned, but makes incremental improvements on existing ideas, combining several important existing ideas (depth, GPUs, dropout regularization) in a way that makes it pass an important threshold. The fact that AlexNet correctly classified over 10 percentage points more images in the ImageNet challenge than the runner-up (by top-5 accuracy) is key because that meant it passed the "threshold" for computer vision being practicable. Being the first to cross that threshold (via a well placed incremental improvement), its architecture became ubiquitous, instantly. Alex Krizhevsky is now immortal because of this.
In this piece, we're going to pore over every page of the paper and see how the authors might
have come up with the ideas, and what goes into leaps of technical innovation that are so great that other innovations in the future
become analogized to them. For example, when DeepMind's AlphaFold 2 surpassed the 90-point accuracy threshold in protein structure prediction in 2020, it was called the
"ImageNet/AlexNet moment of biology." That's high praise. With that, we begin.
As I become more interested in economics and ponder the trade-offs associated with a career in academic economics and, say, the high-technology industry, one thing that keeps coming to mind is the notion of "falsifiability" in a job. I define it as the ability to see clear and immediate cause-and-effect between the work you do and associated outcomes on the world that are a direct consequence. For example, medicine has falsifiability; people trust doctors because if a doctor is not good at their job, people will die and it will be clear to everyone. Similarly, business owners have falsifiability: there is an easy way to tell if you're doing your job right--see if you're making a profit. A business owner always runs the risk of going out of business. And they own that risk.
This isn't building up to a criticism of the academy. On the contrary: mathematicians and physicists have falsifiability, too. If your proof of a proposition is incorrect, it will be clear to everyone and unambiguous that you have failed at proving the proposition. If your theory of gravitation yields bad predictions for how a projectile or planet will behave, one can simply conduct an experiment and see that your theory is manifestly wrong. It's not just the presence of clear right and wrong or lack of ambiguity, it is also the ownership of risk and coupling of cause and effect in one's job.
One of the reasons I've become entranced with the study of economics is that it's the rare subject that requires a deep understanding of a variety of fields. A successful economist must have a near-professional working understanding of mathematics, statistics, and programming, but also of history, politics, and philosophy. It's the rare field where studying measure theory and reading Hamlet can both be reasonably argued to be relevant to your job. And it's really impressive to see someone go from proving a difficult theorem on a whiteboard to making a moving speech about the future of a society. Finally, I'm convinced it's possible to have an enormous scope of impact when doing your job right. For example, watching Michael Kremer (2019 Nobel) talk about his work on de-worming showcased to me how his experiments in developing countries gave strong evidence that de-worming policies in primary schools can have huge returns in enabling students to be healthy enough to stay in school for the long haul. As a direct consequence of his work and that of his collaborators, hundreds of millions of students are benefiting from a policy that might have made the difference between them getting a high school education, and not. That is falsifiability.
My problem with economics is that such examples of falsifiability are few and far between. It often feels like economists are commentators on the side-lines of a football match making strong comments about the performance of players, where they themselves would struggle to play against a talented middle-schooler. When an innovation economist makes strong claims about the conditions required for innovation and about the traits a successful innovator should have, it makes one think: if you truly are correct, why don't you act on this knowledge and start a successful research lab yourself? In other words, because the study of societies is so complicated, with so many moving parts, it's easy to conjure up an explanation for why your hypothesis was incorrect, and difficult to really know if you were, unambiguously, right or wrong.
In Good Economics For Hard Times, the authors point out that medical professionals have very high trust ratings by the public, and economists very low trust ratings. I think this is because of falsifiability. A doctor owns the risk they run from the predictions they make. If they predict a tumor is benign, and it's malignant, someone dies. If an economist predicts a causal mechanism between a certain tax policy and patent rates, they are never held accountable because they can always hide behind the shroud of "further study is needed", even if policies based on that research might have cost their country decades in patent-years worth of innovation. In some sense, it's as if the "doctors of the economy", who have potentially the most leverage, are held the least accountable and taken the least seriously.
As a direct consequence, surprisingly little economics research makes its way into policy because, as Banerjee and Duflo point out, "economists are not in the business of futurology". This is my problem. If economics isn't meant to make any predictions about the future, what use is it? The entire point of science is to understand underlying mechanisms using the scientific method, and then use that understanding to make falsifiable predictions about the future, and be held accountable for those predictions. Anything else is, as my father puts it, "intellectual masturbation". In short, futurology is the very point of any scientific discipline. It is the raison d'être, it is the holy grail, it is the single source of truth.
And so if I opt not to pursue economics in the academy, you'll know why. I want to be held accountable for the work I do.
Smil is a prolific author on energy economics and history, and his work is remarkably well evidenced and broad in its scope. His books are some of the best works of nonfiction I've read, and Energy and Civilisation literally changed how I look at the world. In this short piece, I'll try to articulate some areas of disagreement, where I think he's wrong, without reducing my stance into the sort of blind techno-optimism that is pervasive in 2020 Silicon Valley. Given below are some broad and strong claims he makes in either EaC or Creating the Twentieth Century. Each claim is either mostly unsubstantiated, or just flat-out opinion disguised as fact. NOTE: CTTC is the title of the book, but I use it to refer to the group of innovations he discusses in the book, innovations made from 1867-1914 (Haber process, x-rays, automobiles, etc) that he believes are far more impactful and epochal than the late computer industry.
- The late computational revolution is less impactful on human life than that of CTTC. My central argument against his belittling of modern innovation is this: it’s true that CTTC innovations were wide-ranging and also deep: energy, agriculture, mechanical/structural engineering, chemistry, electronics, and more, and that modern innovations have been largely limited to software/internet, but I think that’s a reductionist argument/stance for him to take, that just because there were transformative innovations across more fields, it was a better time. I think a fairer lens would be to look at how people’s lives have been changed. I’d argue that since most people spend most of their days in front of screens (like many would spend on the farm centuries ago), the seemingly pedestrian innovations like Google, FB, Uber, AirBnB, Amazon, have an enormous impact in how people live their lives on a day to day level, just as much as the automobile and electric lightbulb put together. I think just as how the liquefaction of air and mass production of electricity are seen as innovations in different domains, the widespread use of technology for transport and technology for e-commerce is also quite different in how users end up using them, despite both being classified as “software” innovations.
- Consumerism and high-energy society is bad and we should not use any more than we need. Instead of trusting energy innovation, we should live within our means.
- Major artistic leaps might not have happened if not for this singularity of technology innovation.
- Human ability to harness increasing amounts of energy more easily is the most important proxy for technological progress. I think this was true throughout history, but starts to break down in times of energy excess (which had never existed before a few decades ago). Even the poorest in the world have access to gas to cook and heat with, and lights to light their houses. If we have enough energy to live above a certain comfortable baseline (much like how Smil draws that baseline in nutrition as having enough calories/diversity to be healthy), then of course you shouldn’t expect huge leaps forward, because it’s a solved problem. I’m not saying energy is “solved” certainly, but as with any system offering diminishing returns, you can’t expect huge leaps in energy if it’s not the bottleneck for human prosperity and survival anymore. And if progress, for Smil, is definitionally tied to the ability to harness increasingly powerful prime movers, you definitionally won’t see progress, which is kind of the case he puts forth.
- Using "how surprised a contemporary scientist would be if they were magically transported to more modern times" as a metric/proxy for technological innovation. For all my cynicism that Smil undervalues modern computational technology, I think he’s probably right on this front: going from 1850 to 1920 would be more surprising to competent scientists than 1920 to 1970 or 1970 to today. But two points in defense of modernity:
Sure, some innovations are lower down the abstraction chain and necessary to literally sustain life (Haber process) but I think that those innovations can tautologically only come about once: you can’t revolutionize agriculture from shortage to insane surplus more than once. If it’s largely a solved problem (like food/urban lighting/etc) then there’s no incentive for lots more innovation. I think innovations higher up the abstraction chain are important and revolutionary in the same way as long as they impact the daily life of billions of people in a nontrivial way, which they certainly do.
I also think the idea of invention vs improvement is more subtle than Smil lets on. Take the Hall aluminium process--it was new/step function insofar as that particular manufacturing process hadn’t been seen before, but we could manufacture aluminium before; the Hall process was really just a (massively) improved way. It’s not like manufacturing aluminium was impossible beforehand. Yet Smil counts this as a step function improvement (doing something that literally could not be done before), but then contradicts himself by discrediting the invention of the transistor as merely an “incremental computational improvement,” even though it’s a new technology that does something that could be done before, but massively better, just like the Hall/Haber processes. Either massive improvements on existing methods count as step functions or they don’t. You can’t selectively apply that definition and then use your mistaken assumptions to imply modern electronics is “pedestrian” in its contributions. It’s a subtle semantic game he plays, and I think, more generally, incessant use of data (sometimes somewhat irrelevant and distracting) helps him embed these logical inconsistencies with more ease.
Firstly, I also think many modern innovations are very impactful but not visible or tangible in the same way that CTTC innovations were. While illuminating a dark city or taking flight off land are very memorable and iconic, searching web pages for accurate information and allowing people to rent cheaper holiday homes pale in comparison, but I don’t think they're much less valuable just because they're higher up the abstraction ladder: billions of people get access to exactly the right information they need to navigate every question or decision they face in life at the tap of a finger, for free. Millions of people who would never otherwise choose or be able to afford a memorable and thrilling holiday experience can now do so. From an economic perspective, tons of resources that were left empty before are now being utilised (empty homes via AirBnB, for example). I don’t think these are “pedestrian” just because they’re less sexy.
Secondly, I think tech innovation comes with diminishing returns. String theory is objectively harder than Newtonian physics. Going from starving on a dark farm to eating meat in air conditioning illuminated by mysterious electric light (a jump that happened in probably 20 years) is a bigger leap than going from eating meat in aircon to driving to an office job. Modern society has to work much harder to sustain the same amount of improvement in quality of life, just because most of the big improvements (getting people off dangerous, painful farm work and into houses) have already been made! In this sense, we’re fighting an uphill battle: and this will continue to become harder over the next few centuries, so any singularities of tech innovations from 2020-2500 will be that much more impressive (discontinuities despite most of the low hanging fruit having been picked).
He also slips in somewhat of a straw man: who said that the CTTC era innovations will be gone soon? For the world to move forward, older inventions don’t necessarily have to be supplanted, just built upon. If we go with his line of reasoning--that modern innovation is less than CTTC because we still use many things from that era--I could say we haven’t innovated at all from Han China--we still use wheels, after all! Of course we can expect to use the Haber and Hall processes for the foreseeable future--and I wouldn’t be surprised if they remain the backbone of high-energy civilisation for centuries to come--because they do their job so well there really isn’t much need for additional innovation. Incentives for innovation constantly move to where there’s a bottleneck: once feeding the world has been largely solved (as it has), it is no longer profitable to try and improve on existing solutions. Just because these fields (where basic innovations have been sufficient for human life to not warrant significant improvements) exist doesn’t mean that innovation is dead. That would be silly, yet Smil makes it sound like it’s a legitimate line of reasoning.
For a long time I, like most people, thought that many of the great thinkers of eons past were so great because they were brilliant and had stunning insights that revealed something fundamentally new about the world. This is true, but it presents a deceptive, reductionist picture of how they became famous. Most of the time, their work did not get the time of day on its own merits, as if they simply published it and were lauded as geniuses who'd cemented their place in history henceforth. Many of them displayed startling and shocking amounts of what today we call hustle or resourcefulness. These entrepreneurial qualities are not typically associated with great thinkers, but I believe they are central to why these people, and not other equally smart people, went down in history. In short, these people are so great not just because of their ability to grapple with the profound, but also because of their ability to navigate the mundane and pedestrian obstacles that encumber modern "entrepreneurs" trying to enact change in the world. This is a list of such examples, added to as I come across them.
Nuclear Physics & the Manhattan Project
- Einstein and Szilard patented an EM-based fridge; experimental physics is to real physics what a rails/django app is to computer science: success depends more on creativity and energy than knowing advanced material.
- When Szilard mentions being “above” joining wood pieces together “like a painter” on the experimental front in Berkeley, Fermi stops collaborating with him (importance of being very good at both pedestrian and profound).
- Similar with Bohr spending a long time coordinating architectural plans for his new Institute in Copenhagen and haggling prices & recruiting people etc.
- Also similar with Leo Szilard doing radioactivity experiments at Bart’s hospital--think of the persuasiveness/networking that must have gone on behind the scenes to convince a hospital to let a rando do experiments with radium (and this non-science-related-skill was crucial as it gave him the initial publications that led him to Oxford).
- Alex Sachs (banker) delivering the fission letter to the President was crucial in kickstarting the whole process, and it came down to the scientists knowing the right banker to make something happen (analogous to technical founders seeking funding).
- When the scientists weren’t sure if Sachs had delivered the letter, Szilard arranged an intro to Compton (MIT), contacted businessmen he knew from earlier refrigerator deals, found people to cold reach out to through newspapers, etc, all corresponding to the hustle associated with getting intros to VCs in SV today, which is seen as such a big and important skill but to the masterful physicists was not something they cared about thinking formally about, but was just something natural on their missionary quest to discover the fundamental reality of the universe.
- Technical innovation in a resource constrained system was the theme during atom bomb wartime as well as SV today: Manhattan scientists had to ask for money from gov, Germany tried to source heavy water in context of invading Norway, all of the eng challenges associated with creating the first plant seem similar to the resource-constrained, people-intensive innovation that happens in software startups today--except of course, death was an option in the 20th century.
- France persuading Norway to lend heavy water but Germany failing to, setting them behind in research.
- Frisch planning the dragon experiment, with such a creative setup--allowing a ball of U235 to fall and be supercritical for a subsecond (reaction slowed down by making it uranium hydride, not pure U235), not scientifically advanced but so inventive to think up. Re: clever prototypes that test the fit between theory and experiment in a fast and cheap manner.
- When Szilard wanted to make a final push at talking to the President about deploying nukes, and FDR died, he got in touch with a mathematician who had worked in the Kansas City political circles, got him to intro Szilard to the people there, stunned them with grandiose physical ideas and they arranged a meeting with the president at the White House (since Truman came from Missouri).
Renaissance Florence
- Michelangelo managing a team of artists and apprentices through creative competition, negotiating affordable prices and paying enormous attention to detail in sourcing quality materials, recruiting talented up-and-coming artists, marketing and branding his ability, pandering to wealthy Medici patrons. He knew his crew and their relationships with each other, choosing project teams to optimise chemistry.
- Leonardo would loiter around public courtyards looking for ugly people with distorted facial features, and then would chat them up, befriend them, invite them to his home so he could spend dinner silently studying their faces and ingraining their subtle details in his mind, all the while entertaining them. As soon as they left, he'd use this newfound inspiration to draw his grotesques. An astonishingly creative and charismatic way of overcoming the inspiration problem. I can imagine if some entrepreneur had done stuff like this en route to starting a successful company, VCs would be hailing them as the epitome of "resourceful" and "inventive", but things like this were throwaway actions taken for granted amongst the great Italian masters of antiquity.
- Michelangelo would make a sublime drawing of an ornate part of a cathedral to be built on one side of a paper, and then draft a letter ordering bushels of grain for his oxen on the other side.
Modern, high-energy, high-tech society really is something to marvel at.
Today, we can use artificially intelligent machines to manufacture convincing
DeepFakes of politicians and celebrities doing or saying things they didn't.
Social media apps are used by billions of people--a level of access that puts them in the august company of institutions like the Catholic
Church. What's more, these apps can influence--and have influenced--political action by pulling psychological strings in users' minds,
thereby potentially changing the outcome of international relations and elections.
All of human knowledge is accessible to even the poorest villagers, at the tap of a fingertip.
Software that started as a web crawler became a search engine and then the planet's de-facto
arbiter of truth.
The difficult moral quandaries philosophers toyed with decades ago as mere hypotheticals are now coming to pass. We are living in the future.
We send goods created locally 10,000 miles across the world because it is cheaper to have them refined there and transported all the way back than to process them locally. And we still get those goods on our doorstep, exactly when we want them. It has been many decades since a conflict on the scale of those seen in the early 20th century, with a few lines of code now able to wreak more havoc on nations than armies of men in ages past. Despite living in a vastly more connected and globalized world than ever before, a terrifyingly infectious pandemic has killed fewer people in many months than cigarettes or car crashes. We have robots stacking shelves in titanic warehouses, welding parts, and helping construct buildings.
For such an advanced civilisation, we are curiously blind in places. In the most advanced nation on the planet, public infrastructure is crumbling. Up to a quarter of roads and bridges are rated unsafe because there isn't enough money to maintain them. Improvements in life expectancy are plateauing, and our grandchildren will suffer under the weight of our environmental negligence. The democracies that have brought unprecedented human literacy and social equality are the very same systems that incentivise politicians to please swing voters in the short term, at the expense of significant, irreversible environmental damage in the long term. The average member of the American public struggles to multiply single-digit numbers, tell the difference between a country and a continent, or paraphrase a simple argument. And they're overweight. Almost half the American public doesn't "believe" in evolution, and most people who do can't explain what it is!
We are living in the future, yes, but we have not yet fully escaped our past. I wonder what 2120 will look like. I suspect less different than many think.
Despite technology making it easier than ever before to source materials, inputs, employees from across the
world, the most innovative, iconic, and productive groups flock to central hubs. Examples of clusters are
Hollywood, Silicon Valley, Wall Street. But other, more niche clusters exist too, and very much influence the
trajectory of industries and progress: Boston for biotech, Shenzhen for hardware & electronics, South Germany for
automobiles, Southern California for wine, and more. And this is nothing new: historical clusters include
20th century Göttingen and Berlin for theoretical physics, Renaissance Florence for fine art, ancient
Athens for philosophy, industrial London for engineering, and more.
This piece is based on an HBR article about the economics of clusters, and I'll draw on some of its content while highlighting the factors I think contribute most to giving clusters a disproportionate edge. I'm slightly biased towards clusters because I moved to SF thinking that the internet makes it as easy to start a successful company in London or Abu Dhabi as in the Valley, only to see firsthand how wrong I was, and how many decades ahead of the rest of the world the Valley was.
Historically, competitive advantages came from sourcing better inputs for a lower cost than your competitors. Because the differences in input costs could be such a stark advantage, improvements in knowledge, management and technique weren't as valued, and being close to a port or railway line was an immense advantage, and so companies would cluster around them. As freight and shipping make procuring quality goods from across the world cheap, reliable and quick, the advantages in knowledge, management and technique have become decisive. Clusters have a huge advantage on this front, too, as a consequence of in-person meetings being significantly more effective at inciting progress and action than virtual ones--a fact I conjecture, but one that seems to be empirically true. At a high level, I think the advantages clusters hold in a modern knowledge economy include: lower barriers to entry, process knowledge, peer pressure, agglomeration economies, sophisticated markets, and public investment.
- Barriers To Entry: Vertical integration; that is, doing everything yourself--like Apple did from software to hardware--is hard, and usually only enormous companies have the resources to pull it off. In other words, if you're starting a trading business based on your innovative algorithm, you don't want to spend time doing perfunctory paperwork--you pay a lawyer to do that. Clusters have abstractions available for common use cases that you can't easily get access to outside them, and this means people starting new ventures in those clusters get them off the ground significantly faster. In the context of a technology startup: when you're living in South Park in San Francisco, your friends tell you about another tech startup that was created to automate accounting for companies. It hasn't grown outside the Bay Area yet, but because you're headquartered next door, you grab a bite with the founder--you're his third ever customer--and arrange an informal deal to have them automate your accounting for significantly less expense and time than if you were to go through the motions and paperwork yourself. These abstractions exist far beyond legal and accounting: there are obscure but powerful Python packages you might only hear about from someone in the Bay Area who made them, or a novel technique for rendering more realistic shadows in paintings invented by your neighbour when you're living in 15th century Florence that painters in Rome or Milan competing with you simply do not know exists.
- Process Knowledge: I actually think this is the most important reason clusters succeed. Process knowledge is the knowledge taken for granted by experts, and the type of knowledge that can't be written down. If you had a master chef write out a recipe in painstaking detail, giving ample detail on exactly how to do everything, would an amateur chef's output using that recipe be just as good as that of a Michelin-starred virtuoso? No, and that's because of process knowledge. Dan Wang wrote an excellent piece on it here.
- Peer Pressure: This isn't very complicated. To be great at something, you must live, eat, breathe and shit it. When you see the guy next door's lights on at 2am and you know he's working on the script for his movie while you just got back from doing coke at a party, you can't help but feel bad and become more disciplined as a result.
- Agglomeration and Economies of Scale: By being in a cluster, small startups and creatives can reap some benefits as if they were huge companies without sacrificing the agility that makes them able to be so innovative in the first place. They can have their cake, and eat it too. Startups need to do less training and in-house education of their own when the big companies next door do it for them; the startups just need to poach employees, which they often have no trouble doing. Another example is easier access to talent because smart people reduce risk by choosing to work in a cluster: if they lose their job, it'll be easy enough to find another one. Such is not the case if you're a biochemistry PhD laid off from your pharma research job in Anchorage, Alaska. Startups also have access to cheaper inputs more easily obtainable because suppliers are competing for market share in a huge market. If they want to try something new, materials and expertise can be on their doorstep the next day, significantly reducing the barrier to experimentation.
- Sophisticated Customers: Markets for products in clusters are often both more demanding and more tolerant than markets for the same thing elsewhere. Just as the accounting automation startup found its first customer in another fledgling startup next door in San Francisco, Los Angeles movie-goers are more willing to experiment with and explore unconventional and potentially disruptive new media formats, and New Yorkers are more educated about personal finance than the average American and more willing to put their money into new and innovative instruments. The customers in a cluster have preferences that are often 5-10 years ahead of the world at large, such that often listening to the pulse of consumer demand in a cluster can tell you the future direction of your industry: a remarkable advantage, and certainly one that cannot be "digitalized" in any meaningful way.
- Public Investment: Clusters have large masses of people, and are therefore powerful political forces. And since clusters need a functioning ecosystem to survive (a tourism cluster can't survive without functioning public transport), governments are happy to support them in any way possible, since clusters often contribute a disproportionate amount to national economies. Startups also benefit from workers educated on state money in the case of public universities, like Berkeley in Silicon Valley or UCLA in Los Angeles. It is also telling that many developing economies don't have clusters because they lack government support or have governments that actively work against them.
This access to abstractions, coupled with the availability of talent & capital, the high risk tolerance of experts in the area, and the general atmosphere of "it's okay to fail" that is as pervasive in Hollywood as in Shenzhen, makes it very easy for talented employees at big companies or ambitious graduates of top universities--both of which are present in copious amounts in hubs--to leave and recruit some friends to start the next big thing, whether that's making Rocky, writing A Midsummer Night's Dream, or founding Tencent.
In another context: in Silicon Valley cafés, in casual conversation, you never have to explain what "SaaS" or "B2B" means; it's as much part of the vernacular as "lmao" is to teenagers on their phones. Before I moved to the Valley, people told me the Valley was ahead because it's easier to source capital and talent there than it is anywhere else. This may be true, but I don't think that's the main reason it's a better place to start a high-tech company than anywhere else. I think the real reason is that starting new ventures--whether it's a company, artistic movement, religion, hedge fund, or film idea--is difficult. In the case of SV, if you're starting a payments company, and Max Levchin lives down the road and is willing to grab coffee to talk about the problems you're having, you'll have your questions about, say, early user acquisition in payments, answered by perhaps one of the only 10 people in the world equipped to answer them. A pow-wow over coffee or drinks with movers and shakers who have struggled through what you're struggling with reveals insights and paths that just are not possible to get when you're teleconferencing from the comfort of your Colorado home.
This sense of informality, the ability to move fast and break things, is something you can never get through "formal partnerships" of the kind mediated through email and contracts and video call. This is related to my earlier point about low barriers to entry, as well as to my later point about agglomeration economies.
To conclude, I think clusters are one of humanity's most powerful engines for progress. As globalization further increases, I think their importance will only grow: social media led to more college parties, not fewer, and teleconferencing compounded the importance of in-person meetings instead of obviating them. Information technology makes the world more dynamic and knowledge/service based, which in turn gives outsized advantages to those that can easily identify and adapt to new trends and access 'insider' knowledge. It's a positive feedback loop. I'd welcome suggestions on how both legislation and technology can make large cities in developing and developed countries alike potential breeding grounds for
Tentative thoughts on cities I've lived in or visited multiple times.
It's interesting to note how similar the relentless resourcefulness that matters in war is to the kind that matters in business.
The reason the French Republic had enough artillery to fire at the Allies at Toulon was that Napoleon personally pulled strings to get more supplies, had new forges built on-site, and trained people who could themselves train troops to use artillery.
An example of the circumstance/skill dichotomy is that the geography of Toulon and the layout of the Allies meant that any fighting was done by artillery, which is what Napoleon happened to be trained in (right person at right place at right time).
And on the importance of implementation/distribution channels, I would argue that profits are to modern business what war was to nations: mechanisms for propagating/disseminating some radical innovation. Something as revolutionary as the French Republic and its progressive ideals could easily have been stamped out at various points in its infancy, changing the face of the world forever (setting us back 50-100 years!). It succeeded not because of the "correctness" of its ideals in any absolute sense (ideals which philosophers had come up with long before they were enacted, just like Xerox came up with the graphical computer interface long before it was implemented/distributed by Apple), but because of the enterprise and relentlessness of a few men (like Napoleon) in changing the incumbent landscape enough to allow for the introduction, distribution and implementation of these radical new ideas, which had already been fully fleshed out in theory.
Importantly, this was done from within the existing infrastructure (fighting wars with other nations that opposed Republicanism). To spread successfully, the new innovation (Republicanism) did not just have to be a superior method of running a state, but also had to be better at winning wars. I see the same fragility at birth in businesses: every household, office, and more might look different if Jobs hadn’t recognized the potential of the GUI on the day he visited PARC, just as the whole world’s methods of government might still lean aristocratic if Napoleon hadn’t been enterprising enough to requisition extra artillery at Toulon or food for his soldiers at the bridge of Lodi (victories which led directly to him gaining the power to implement civic innovations like his Code Napoleon).
You also see that attempting to scale any major innovation (GUI, Haber process, bataillon carré) requires lots of related, ancillary innovations as well as a receptive landscape: think of Apple's redefinition of technology in retail via Apple Stores, Carl Bosch at Oppau/Leuna practically inventing the fields of high-pressure and catalytic chemistry, and the countless small but crucial bills passed under Napoleon. It’s a philosophical question which you believe to be higher impact work: the invention of a concept (Wozniak, Haber) or its distribution/enactment at scale (Jobs, Bosch). An important caveat when weighing the two is that we usually know very well who was first to implement an innovation at scale, but the question of who was first to conceive of the theory is usually murky and debatable.
In short, I've changed my opinion on what makes a startup succeed (and its competitors fail). A year ago, I'd tell you it was the novelty of the idea, insightfulness of the founders, and receptiveness of the geography and market at the time. Now, I think that successful founders are measured by very different yardsticks. Their test isn't how novel or unique their idea is, their test isn't how insightful or intelligent they are, and their test isn't whether the market or geography is ready for their cool new idea at that time. They are, instead, tested every hour of every day. They are tested when they hire their first employee: are they charismatic and inspirational enough to motivate and persuade a phenomenal engineer to risk it all to come work for them? They are tested when they are running out of money and don't know any VCs: can they find a way to get a warm intro to a VC or dazzle investors through some creative stunt? They are tested when their initial customers tell them their product is shit: do they know when to persevere, and when to quit? Do they make the right decision at that point, and at every other pivotal moment that marks the early days of starting a company (or anything, for that matter)?
In this sense, I no longer see startups as outlets for scientists to out-innovate each other, but as what they are: businesses. And the ability to run a startup is just a long way of saying you know how to do business. In other words, you're a good businessman.
This is an important realization for me because this word, "businessman", in particular, was one that would have had me see someone as "braindead" when first meeting them. I think many scientists/engineers share this sentiment. But now I find that 99% of what comprises entrepreneurial success is, unsurprisingly, entrepreneurial qualities. It seems trivially obvious in hindsight, but from what I've seen in the Bay Area, most technical founders, even those aged 30 and older, do not viscerally understand this until after their first or second company fails.
In other words, I see now that innovation is not the cost of startup success, but the reward. Now that Facebook has tens of billions in cash reserves, it can pump money into moonshots like Oculus and CTRL-Labs knowing that it can pay for all that research and afford those risks, to a degree that even the biggest universities cannot. I believe Google outputs more computer science research than Stanford and Berkeley combined. Just ONE of Google's moonshot projects, Waymo, has been given more R&D funding in one year ($3B) than MIT gets for ALL its departments in several years ($1B/y). That is what Sergey Brin and Larry Page feel proud of when they put the kettle on to boil every morning of every day. That is high leverage work.
The requirements for getting to a point where the company is well enough endowed to innovate at all are simple to define: be an anomalously resourceful, charismatic person who can identify what is essential and what isn't, and act decisively and correctly in very high-pressure, resource-constrained systems. These are not necessarily, or indeed at all, the traits associated with "innovation" or "science", which have more to do with the creation of knowledge and tools, and require very different skill sets. Instead, these traits describe a good businessman. And, I would argue, they also describe the legendary soldier-statesmen of eons past. Jobs and Gates have more in common with Caesar and Napoleon than it might seem.
Put differently, business is modern warfare, and innovation is the prize.
On Silicon Valley, Startups, and Software Engineering
- Operations, partnerships, distribution channels, and other “business” buzzwords are just as important to understand as the actual theory behind your “revolutionary” tech innovation. Six months ago, I’d dismiss anyone using these words as a braindead MBA monkey, but it turns out most companies and products have been built before; new ones simply have an edge when it comes to partnerships, distribution channels, talent, or something else of the like. Practical implementation and distribution of a cool product is the bottleneck for great ideas more often than the technical innovation, which is a sad reality.
- The most money isn’t necessarily made in consumer products, even though that’s all we college students have ever really been exposed to. Everyone I know who is my age is thinking about how they’d make the next Snapchat or TikTok. The really successful founders and Valley veterans are instead thinking about the niche, unsexy enterprise problems that generate the most value for businesses.
- Being able to integrate your product with existing infrastructure and not screw over important people in the supply chain, at any point, is crucial. It’s very important to have respect for why the problem hasn’t been solved so far. If you can’t conjure a compelling answer to that, you haven’t looked hard enough. For example, the invention of DVR by TiVo was revolutionary, but they didn’t own the content distribution channels (cable TV) that their product would be used on, Comcast did. And so they failed, despite their breakthrough technology.
- A lot of the work involved in software engineering sounds complicated and world-changing, but it’s pretty mindless. Much is simply rebuilding or refactoring pedestrian products that already exist in some flavor or capacity, but with a small new change or addition to address some novel market need. For example, companies will invest thousands of engineering hours into copying each other’s products from scratch because no open-source equivalent exists. The work can be technically challenging, absolutely, but much of it is not creating anything fundamentally new in the universe. Front-end engineers often spend time working to make a certain animation marginally more memory-efficient or rearranging how the edges of a certain box look, which is quite a far cry from studying circuit architecture and red-black trees at college.
- Most brilliant people in the Valley are excellent at solving problems but spend almost no time thinking about what problem they are solving, why they’re solving it, the impact it’s likely to have, and how profitable it’s really likely to be. They just get giddy about solving something difficult. In math speak, I believe that our species has come to be very good at optimizing narrow objective functions without having any meaningful understanding of how these narrow functions combine into a wider, global solution to our problems.
- Technology is almost never the inherent solution to a problem. It is just the vector through which the solution can be implemented cheaply at scale and in a clean, simple-to-use way. Uber’s breakthrough was with its business model, implemented through some neat software. Building the app was not a breakthrough. Being a 10x engineer doesn’t really give you any meaningful advantage in coming up with clever solutions, merely with implementing them. Being a technical co-founder is only useful insofar as it lets you prototype and test ideas, as well as know the limitations of the technology you’re using to build solutions (which is still very useful, I might add).
- The importance of projects is often under-appreciated. Most companies, research ideas, and works of art and literature and music came from great people tinkering and playing with things in their free time. The ability and desire to build web apps on a whim, write pieces of short fiction because of a dream, or investigate a new research topic over the summer because of a book you read, are vastly important. Tinkering and building should become routine and ubiquitous for our world to embed innovation into our culture.
- I’ve come to appreciate the importance of legislation and individual behavior in enacting macroscopic change. I started the year as a techno-optimist, believing that more money and talent would lead to endless innovation that would solve all of humanity’s problems. After talking to academics and entrepreneurs, and reading widely about various industries, I’m now pretty convinced that a lot of our biggest problems (climate, public education) are best solved by informed electorates who act the right way and elect responsible and intelligent leaders rather than by brute force investment into technology.
- An ancillary point is that I’ve come to appreciate the complexity, beauty, and efficiency of the institutions of modernity, like democratic governments, international law, financial services, energy generation, waste management, and more world-changing machinery that hums along under the hood of daily life, hitherto unnoticed. From how VisaNet allows for instant electronic card payment between banks to how landfill pipes capture methane from trash, I’ve come to understand that a lot of smart people have experimented many times and gone through a lot of pain and effort to come up with tools and institutions many of us don’t even know exist.
- I’ve come to see the understated difference between how large, bureaucratic institutions (government/finance) are ideally supposed to work and the twisted, complicated, corrupt and underhanded way in which they often do. This might seem to contradict my previous point, but I think that it’s consistent to hold the beliefs that democracies are beautiful, contrived and deliberate structures that have done a lot of good, but that they also creak and crack in places due to the inherent weakness of individual humans.
- College is only useful because of the people you meet — both peers and professors. Even vocational majors like mechanical engineering have limited utility in the real world, even if you’re working as a full-time mechanical engineer. The rat race to join prestigious clubs and groups and be atop the social ladder is almost laughable when you look at it as an outsider and realize how useless those races really are (including chasing grades).
- Reading widely and often is not a foolproof way to become an intellectual; it makes little sense to read blindly for the sake of reading. I don’t know why so many people focus on chasing “number of books per year” rather than deeply internalizing new concepts through whichever medium can best teach them.
- Learning languages has little to no professional utility, but the way it deepens and enriches relationships between you and people who speak that language gives you uniquely rich insight into the way they live and think. This could be gratifying and enriching for you on a personal level and is the only good reason to learn a new language in my opinion. As someone who is trilingual, I do not think this is generally a good investment of time.
- Conspiracy theories and stereotypes are often grounded in some truth. I heard that some teams at Google and Twitter had doctored some search results leading up to the election to sway voters one way, and immediately dismissed it as fiction. Then I met some people with a background in the industry who revealed that elements of it were not too far off.
- There are no deadlines in the real world, and life after college can easily lack a sense of urgency. You don’t meet people unless you go out of your way to convert casual interactions into meaningful relationships, and this is disconcerting and deeply uncomfortable at the beginning. It is easy to fall into a routine where you’re losing awareness of where you’re going in life since there are no clear deadlines or people to hold you to them when you’re in the real world.
- Great art, film, literature, and music do have an important place in society. For all the money, ideas, and opportunity gushing into Silicon Valley, people, including techies themselves, hate living here. With collapsing civil infrastructure and almost nothing to do outside of work and good ethnic food, there’s a palpable difference between San Francisco and cities like NYC, London, Paris, Tokyo. Even if you don’t go to art exhibitions or attend rock concerts, the fact that they happen all around you affects the pulse of the city in a subtle but noticeable way, and the lack thereof makes life just a little less vibrant, as if people go home to take a short break before going back to work again, instead of the other way around.
- I’ve become disillusioned with top universities. Coming into the year, I would instinctively associate anyone with a brand-name school on their resume as world-class. Very quickly, I saw how most students at top schools are a product of circumstance, and only about 5% of the population, even at Harvard, is “elite” by Silicon Valley standards (in terms of intelligence and output). Similarly, most of the Valley “elite” did not attend Harvard (or Stanford or MIT and so forth). I see these credentials now as signifying moderate intelligence and good fortune and look to gauge whether people are smart by actually talking to them about ideas and looking at projects they’ve worked on.
On Living in, and Understanding How “The Real World” Works
On Silicon Valley, Startups, and Software Engineering
- I spent a big chunk of my gap year studying the future of various industries to see if I'd like to work in any. A potential reason this approach is suboptimal: it assumes you're motivated purely by a desire to enact the greatest good in the world. Studying across industries reveals high-leverage, important work, and makes you question why you still don't want to do it. For example, after studying three industries, I think working in any of climate, education, or developer tools is really impactful, important work. But I still don't plan on doing it. That's because I, like almost everyone, am not purely motivated by the leverage of the work. I also care about how intellectually interesting it is, how much bureaucratic bullshit I'll have to wade through, what kind of people work in the industry, and more. I'm lucky that doing this industrial survey has made me explicitly confront these difficult truths, because now I have a lot more clarity about what I am looking for in the immediate future.
- Similarly, I understand viscerally that successful companies are not novel ideas or problems, but old ideas and problems executed flawlessly, or in a novel way. The bottleneck for startup growth in many of these industries is not product, but industry connections, and I don’t have any. How can I expect my execution to be better? More importantly, I've found that the initial problems and ideas you spot are not what will form the basis of your company. The problem you end up finding product-market fit with comes out of deeply understanding the customer; it is not the one you spotted initially. Any problem list I compile is therefore merely a list of starting points, each one having to be explored through industry contacts for months to see if there’s anything there at all. In VC-speak, I’m not the right founder for almost all those problems, and that’s really what makes companies succeed.
- Most people working on startups are smart people who want to get rich, not people with any "passion" for the problem. People are not "passionate" about enterprise software or dating apps. They are, however, passionate about making bank. And there's nothing wrong with that. These smart people do this by creating a lot of value for customers fast (they don’t really care which customers). And this is fine, too.
- I finally understand where the culture of studying CS to become a tech entrepreneur came from. I think that today, with most web and mobile software, CS courses are not very useful, as most of what you learn is completely unrelated to the programming work you do. Many people (who have done CS degrees) counter that it teaches you "general principles" that help you pick up more specific, concrete frameworks and trends faster. I have been generally unconvinced. But I now see that this culture is just a remnant of the past. CS really was important for founders in the 1990s and early 2000s because all the abstractions that people build atop today, like web frameworks and mobile SDKs, had to be invented from the operating system and physical network level up. For example, Paul Graham's Viaweb had to pioneer the act of sending information via HTTP to a remote server, and that required knowledge of servers, operating systems, as well as their customers’ pain points. The literal idea of a web application did not exist when it was invented (tautologically), and so the inventors had to build it from the fundamentals of computing. This is why they actually needed to know computer science, and why we don't today (if you're building pedestrian web/mobile software, which most Silicon Valley startups are).
- Many people in the Valley are very well read, and thoughtful about their opinions in a scholarly way. Perhaps my fierce opposition to the can-do & move-fast-and-break-things culture of the Valley arises more from general shock at how quickly things can get done in the Bay Area than from anything else.
- I always thought people pursuing startups for the sake of startups fail, and that some poetic justice in the world would mean that those who succeed had a genuine love for the problem they were trying to solve. This is just fiction. Almost all founders are mercenary, and there's nothing wrong with that. Information technology has meant that smart, determined people can almost predictably get rich if they try enough times, and I can't, and shouldn't, blame people for trying to get in on this new status quo. Repeat founders are in fact the most likely to succeed, because they are persistently looking for problems people want solved, and are trying to solve them across diverse fields, learning tons in the process. Given that most of the difficulty in making a company succeed actually has to do with distribution and business, not product, this makes sense, even if it isn't the way I wished it was.
- An important (and scary) one: to be remembered for creating, discovering or inventing anything outside of academia, you have to own distribution to get credit. Philo Farnsworth invented the television. Have you heard of him? No, because he was not part of the company that ended up distributing it at scale, and so there was no way for people to know that he, in fact, was the one who invented it. Whereas with Apple, there were teams of PR people using various channels to tout how innovative their engineers (and Steve Jobs) were in coming up with the iPod, or iPhone. People don't become famous if no-one talks about their work. Anytime you associate an invention with someone (another example: Haber and Bosch with the fixation of nitrogen into ammonia), it's because their company worked hard to have you make that association (as BASF, the chemical company they worked at, did). In the real world, sales, marketing and PR matter. Often much more than quality of product and technical ability.
- I sometimes feel bad about being interested in startups because if I had been born 50 years ago, I might have aspired to be an executive at a big company, or 500 years ago, a painter. I kept telling myself I should do something I would've done regardless of when I was born, as that is what you "truly" want to do, but now I see there is no such thing. One has certain overarching philosophical goals in life (wealth, or posterity, or interesting experiences, or fame, etc), and these are the invariants. The way you go about working towards those will of course vary with when and where you're born. I'm only now learning that instead of shying away from this reality, I should accept it as the natural state of things. The methods you should use to get to those invariants vary a lot with time (founding startups was not the highest way to exert leverage 50 or 100 years ago, and may not be in 50 or 100 years' time). And you should act appropriately. It’s not “wrong” or “mimetic”, but sensible.
- You can broadly classify all great people into two bins: creatives and businessmen. Einstein, Bach, Michelangelo, Plato, all fall into the former. Caesar, Napoleon, Churchill, Rockefeller, all fall into the latter. Jobs and Edison are more ambiguous. I use the word "businessman" to describe anyone whose success depends more on external events and their ability to navigate them. Creative types are more at war with their own intellect, constantly asked the question: are you inventive enough to spot that breakthrough idea? Businessmen are more at war with external circumstances: can you motivate employees not to go on strike, can you secure more military funding as your army is about to fall apart? Once you put it this way, you have to ask yourself: am I good enough to produce as much as the Renaissance masters or Greek philosophers of antiquity? If not, the future path is obvious: work hard at becoming great at selling things, motivating people, negotiating, and envisioning the future, more than at any one skill or craft like a creative might.