This writing represents a conversation with myself through time.

Essays marked with an asterisk (*) are technical. If unsure what to read, try War and Business, Falsifiability, or Welcome to the Future.

In Silicon Valley, it is common to hear people deliberate over life and career decisions through the framework of leverage. People decide to found company X or work at company Y because "it's the highest leverage choice." That is to say, it's the position from which they can exert the biggest raw impact on the world from their current place in life. High leverage choices have many multipliers between actors and the effects of their actions. Being the engineering lead for Google's search team is high leverage work; being a lone engineer on that team of 100 engineers is not. Leverage is purely about the scale of impact a single person can exert as a result of intentional choices that multiply the effects of their actions (thereby acting as a "lever" on those actions).

One thing that I've found disturbing about this philosophy is that it completely omits the notion of uniqueness in one's work. Let's caveat this with the fact that everyone has a different worldview, and this is merely my own, etc etc. When thinking about what you could do that has the highest leverage, you forget to think about what only you could do. To see how different uniqueness and leverage can be, consider being an engineering lead at Facebook. Your team manages ad placement, which is business critical on the order of billions of dollars. This is enormous leverage. However, if you weren't doing it, some other commoditized MIT-CS-PhD type (call them Dr Nurd) would be doing it instead of you. Now consider the scope of work for an engineering lead: choosing a technology stack, architecting an app or feature, recruiting new engineers onto the team, amongst other things. At most times there exists a clear state of the art and standard practice for doing the first two things, and individual junior engineers you recruit onto the team are unlikely to utterly transform its direction and atmosphere. That is to say, you and Dr Nurd have significant decision overlap -- the ad placements on Facebook would look mostly the same under both of you, especially since the major product decisions are made by executives.

And so on some level, there is very little fundamentally new (where new means doing things very differently to how someone similar to you would do them) you add to the world by being in it. Conversely, a mediocre writer with an audience of even a few thousand people is doing something that could not have existed without them. The books pumped out by two similar writers can be vastly different. A good heuristic: would the world have lost something, subtle but fundamental and substantive, if you, in particular, hadn't existed?

And so we must consider both uniqueness and leverage: the number of "forks in the road" -- that is, opportunities to make a unique decision very different to what someone else would have made in your position -- presented to you per unit time (say, per day), and how different the product/service you provide looks to the end consumer as a consequence of those decisions. For instance, if you have an eclectic taste in coffee, and use your clout at work to get an obscure brand of coffee maker brought into work, you've contributed something new to the world, but it has no meaningful impact for the consumer of the end product of the company (uniqueness but not leverage). In contrast, if you're choosing between two recently published sublinear network routing algorithms for the streaming service on Netflix as an engineering lead there and you choose the one that is more amenable to systems optimizations, you're not making any decisions unique to you (Dr Nurd would have done the same) but that choice (the packet transmission speed induced by the resulting systems optimizations) is having a huge effect on the user experience (lag, loading times) for the consumer (thus, leverage but not uniqueness).

In general, work that has high uniqueness is that which presents many forks, as described above. And so intensely creative work -- from theoretical physics to playwriting to architecture -- has high uniqueness because it represents the will and vision of its creator, manifest in the flesh. And two otherwise similar people can produce vastly different pieces of prose, since the possibility space for different types of prose or plays is much, much bigger than the possible decisions two engineering leads could make about choosing a technology stack (where there exist best practices for the most part). But there are other types of work that have high uniqueness: work that you think is important but that almost no one else does. For a pedestrian example, consider a YouTube channel on Russian stamp collection. Because nothing like this exists, if you were to start one and get even a few thousand subscribers, you would have brought into the universe something fundamentally fresh and new, something that wouldn't exist without you. The universe would be a little worse off without you. The same cannot be said for, say, college admission reaction videos that go viral on YouTube (if it wasn't yours, it'd be someone else's). For something a little grander, consider entrepreneurship; because starting a company involves thousands of little decisions that two similar people may disagree on (eg. product features), two similar founders can create two enormously different companies. Consider Brian Chesky and Airbnb -- one of the few examples of a truly contrarian bet. Almost everybody thought the idea was moronic, including the hundreds of investors the founders first pitched to, and, importantly, no one else was working on anything like it. Therefore, Airbnb (or anything like it) would likely not exist if Brian Chesky didn't, and we'd all be worse off because of it.

Before people accuse me of criticising "conventional" jobs, I should note that the thing that matters in the end is not just uniqueness or just leverage, but the product of the two. And so if I had to choose between an obscure stamp collecting channel and being an engineering lead at a big tech company, I'd probably choose the latter (begrudgingly). But if I had to choose between being a niche writer with a few thousand (dedicated) readers, barely making a frugal living but getting by, and an engineering lead, I'd choose the former in a heartbeat. As caveated at the beginning, this is predicated on the fact that something resembling uniqueness matters to you at all. I think it intuitively does to most people, but I hadn't seen anyone quite articulate it in the way that made sense to me.

I recently had a bit of a chat over email with someone I've looked up to for a long time, and here's a tidbit from one of their responses.

Much more goes into building a successful company than you're aware of. Yes, many people can build prototypes of Uber, or Fortnite, or Airbnb, or Twitter. But why do only certain companies survive? It must be either due to 1) nuances of the product that are opaque to you as an outsider, 2) other hard work in invisible areas that you don't see as a user, 3) just random luck (which is IMO less of a factor than laymen think).
--Some Wise Tech Entrepreneur

I'm particularly interested in his first point about opacity, as a few examples of such an "opacity bias" come to mind from my own experiences. You think something is trivial (unimportant) in the naive sense as an outsider to the field, but when you learn a little bit about the field you realize how important it is that that thing is done well.

Household Savings

There is a Harvard professor I got to know who works in behavioral economics and macroeconomics. In behavioral economics, there is a notion of "choice architecture," where framing matters for human decision making. For example, changing 401k plans at companies from default opt-in, where you had to fill out an annoying form to start your 401k retirement plan, to a default opt-out, has done a huge amount to increase savings in the United States (empirically). It's not a profound idea, but it's important. Even more, this economist spent a great deal of time using ideas from psychology and behavioral economics to think about how the form for opt-out should be designed, distributed, and presented, to make sure anyone who actually wanted to opt out, even a little bit, had no bureaucratic burden stopping them from doing so. I was flabbergasted to see that someone so intelligent could be spending their time and intellectual energy on something so pedestrian as form design. It seemed to me that the work of this prodigious economist was as unimportant as that of a lone accountant at an obscure small company.

Then I took a course on basic macroeconomics, and learned how the banking system actually works. It turns out (this was new to me; or, rather, it's obvious but I had never actually given the topic serious thought) that the reason banking, as an institution, is so profound as a lever for progress, is that it moves capital from useless places (under one's mattress) to useful places (as the up-front cost needed to kickstart a new construction project). Banks are responsible for correctly (efficiently) allocating this capital (and the associated risk), and the fundamental idea is that all of this capital comes from savings. A bunch of it comes from corporate savings (residual profit companies have), but a lot of it also comes from savings on the level of individual households. And so, any money an individual or household saves (such as in a 401k) is really a contribution to, and bet on, the future progress of society (capital being earmarked for capital-intensive projects like construction or loans to businesses). And there are robust (ie. reproducible) examples of how savings rate has much to do with future economic output, and, transitively, future progress and prosperity (especially when it comes to technology). And most of all, when you look at the effect of form design on the choices people make by examining huge datasets of millions of responses, you see that even small changes have big effects (this is part of the reason that big tech companies A/B test their UIs so aggressively -- these things make a big difference) on how much people save. And so being an economist who is part of the reason that opt-in 401ks became opt-out 401ks is something to be immensely proud of: it directly contributed to billions of dollars of excess capital being invested into the future, as opposed to being spent on jewellery or stored under a mattress. It's just that this contribution is opaque to most outsiders to the field, as it was to me.
That's a lot of leverage this economist has had; certainly one that Monsieur Lambda at Ye Olde Accountancy Firme doesn't have to his name.

And so it was really only by learning the foundational/basic modes of thought in a field (macroeconomics) that I gained the tools I needed to properly appreciate the motivations and importance of certain work being done in that field. I think this is one example, among many others in my own experience and that of others, of how something seemingly trivial to outsiders can turn out to be of profound importance once you think and learn a little about it, ridding yourself of "opacity bias."

[TODO: more examples]

There seem to be two types of mathematical models. There are those of convenience, which use heuristic arguments and simple reasoning to create and justify equations that vaguely seem to fit a system, and often yield surprisingly good predictions for their relative simplicity, but don't say anything about the nature of reality that underlies the systems being described. The other type of mathematical model is one that makes strong assertions about underlying mechanisms, and, in doing so, posits something concrete and deep about the underlying nature of the system it's trying to explain. These are often found in physics.

Lotka-Volterra Equations

Here's an example of the first type of model. These are a system of differential equations we create to describe the growth and decline of interacting populations; in their competitive form, the single-species case reduces to the logistic equation. We justify them based on our intuition that, initially, a population will grow exponentially in times of plenty, and will eventually plateau due to constraints in the environment around it. This gives rise to the logistic curve we know so well, and it's that curve that we use to make predictions. Amongst other things, these equations predict that growth per capita (or per individual) will decline asymptotically to zero as we reach the "carrying capacity" of the environment. Of course, in practice, we see that this doesn't hold. The model provides a decent approximation some of the time, capturing the fact that growth is initially fast and then slows down, but it is empirically quite wrong for most population systems. And we explain these discrepancies away by mentioning that real systems are more complicated and we are missing variables, and it is a simple model, and so on. The key point is, these are models of convenience that arise from "economy of thought," as physicist and philosopher Ernst Mach put it. We do not believe that this model captures some deep truth about the universe; instead, it is more a mental crutch that helps us make sense of why population grows in the general way it does.
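To make the logistic picture concrete, here is a minimal sketch, assuming the textbook form dN/dt = rN(1 - N/K). The growth rate, carrying capacity, starting population, and the crude forward-Euler step are all illustrative choices of mine, not fitted to any real population.

```python
def logistic_trajectory(n0, r, k, dt, steps):
    """Integrate dN/dt = r * N * (1 - N / K) with forward Euler."""
    populations = [n0]
    for _ in range(steps):
        n = populations[-1]
        populations.append(n + dt * r * n * (1 - n / k))
    return populations


def per_capita_growth(n, r, k):
    """Per-capita growth rate (1/N) dN/dt = r * (1 - N / K)."""
    return r * (1 - n / k)


# Illustrative (made-up) parameters.
R, K = 0.5, 1000.0
trajectory = logistic_trajectory(n0=10.0, r=R, k=K, dt=0.1, steps=400)

for step in (0, 50, 100, 200, 400):
    n = trajectory[step]
    rate = per_capita_growth(n, R, K)
    print(f"t={step * 0.1:5.1f}  N={n:8.2f}  per-capita growth={rate:.4f}")
```

The printed per-capita growth starts near r and shrinks toward zero as N approaches K, which is exactly the prediction that real populations tend to violate.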

General Relativity

Here's an example of the second type of model. Consider Einstein's general theory of relativity, which goes beyond giving some equations we can use to make predictions, and actually asserts that reality itself is a certain way. Specifically, it asserts that mass and energy, described by the energy-momentum tensor, change the curvature of spacetime, and these changes in the geometric nature of spacetime manifest as the visible effects of gravitation. It goes beyond Newton's equations, which are closer to the first type of modelling, in explaining exactly why gravitation exists in the first place. Another example is the ideal gas laws. They actually assume the existence of atoms to reason about the behavior of individual atoms on a mechanical level. The laws are not entirely correct because of the assumptions they make, but they do reflect some underlying reality (atoms exist, or so we believe).

Radioactive Decay

Here's an example of mathematical modelling that is neither obviously a model of "convenience" nor one that reflects some "deep natural truth." Consider the humble uranium atom. In high school, a typical setting in which differential equations are taught is radioactive decay, since it is obvious that the rate of change of the number of atoms depends on the number thereof. The more atoms there are, for some fixed probability of decay per atom, the more atoms there are that decay. But what quantum mechanics makes difficult is reasoning about whether probability is deeply embedded in reality, or whether there's an underlying structure we simply don't understand. The fact that physicists showed that there is no local "hidden variable" governing quantum mechanical behavior (which at this point we all agree is inherently probabilistic, whatever that means) doesn't mean that there isn't some deeper mathematical structure (eg. M-theory) at play, causing the randomness we observe in quantum mechanical behavior (as well as atomic decay, etc). This has always been troublesome for my intuition because it's not clear whether I should think of QM as a model of convenience like the Lotka-Volterra equations, or a model that reflects the underlying nature of reality, like GR.
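The high-school argument can be written out explicitly. Assuming each atom decays independently with a fixed probability per unit time (written here as \lambda, a symbol I'm introducing for illustration), the expected number of remaining atoms N(t), starting from N_0, satisfies:

```latex
\frac{dN}{dt} = -\lambda N
\quad\Longrightarrow\quad
N(t) = N_0 \, e^{-\lambda t},
\qquad
t_{1/2} = \frac{\ln 2}{\lambda}.
```

The equation only describes the average; it is silent on whether the per-atom probability is fundamental or a stand-in for deeper structure, which is the ambiguity in question.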


The reason I mentioned the ideal gas laws earlier is because in Asimov's classic series, Foundation, the protagonist develops a statistical theory of macroeconomics called "Psychohistory." In it, he uses advanced mathematical statistics alongside psychology to make predictions about societal behavior, in a time when the population of all humans exceeds one quintillion and is spread across the "galactic empire." After reading some Taleb and perhaps even after taking some psychology courses and seeing how crude our methods for understanding the human mind are, it's easy to believe that something so complex as the human psyche and its behavioral consequences will never be understood well, let alone its emergent behavior when many such minds interact.

On the other hand, cognitive psychology -- and particularly the work of Kahneman and Tversky -- has yielded lots of experimental and theoretical results that are exceedingly reproducible. The reason I'm not a fan of psychology at large is because most results don't replicate. That is, most things published in papers are wrong; mere figments of chance and statistics abused by people (psychologists) who have near-zero understanding of statistics. And so the fact that this work (mostly taken from the book Thinking, Fast and Slow) has been so thoroughly reproduced and shown to be robust to replication is, to me, astonishing. This was emphasized to me in my introductory economics course last year, where the professors ran several live experiments with our class of a few hundred people, such as running a real-time prediction market, auctions, and asking people to make savings and investing decisions with real money. Again and again, the results were almost exactly in line with those made by an entirely different group of participants at a different place and time!

This list of common behaviors and biases makes little use of grand theories from biology or neuroscience, and instead humbly seeks to make predictions based on empirical evidence, without philosophizing. Much in the style of Taleb's "convex tinkering." This list reminds me of the ideal gas assumptions we make when deriving the famous pV = nRT equation and ideal gas laws. We use these assumptions (which reflect some deep truth about individual atoms' behavior) and statistical reasoning to determine probability distributions of outcomes of fluid systems, and are approximately right, much of the time! If we have solid results on the level of small groups, results that reproduce, it doesn't matter that the bodies in question are sentient. We've already done much of the hard work in finding the laws that determine the behavior of the unit (the small group of people, for example in an auction or prediction market). Now all that is left is to apply probabilistic reasoning to make predictions about many such units (eg. a country or religion) in the framework of statistics. Perhaps the reason we can't do this yet is because 7 billion people is not enough for such behavior to manifest robustly, and indeed we do need closer to a quintillion. Or perhaps there are behaviors that emerge with scale (eg. differ greatly between 100 and 1 million people) and so these psychological results from Kahneman and Tversky are rendered useless. Or perhaps we need a new kind of statistics to reason about such a thing, in the same way that Boltzmann had to make strides in probability theory itself to formalize his laws of thermodynamics (the greatest scientific result ever published, I should add -- alongside evolution, also a theory that is statistical in nature).
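As a toy version of the "apply statistics to many units" step, here is a minimal sketch. It assumes every unit independently takes some action with the same fixed probability, which is precisely the assumption that emergent, scale-dependent behavior would break. The probability 0.3 and the group sizes are made up.

```python
import math
import random
import statistics


def observed_fraction(n_units, p, rng):
    """Fraction of n_units independent units that take the action."""
    return sum(rng.random() < p for _ in range(n_units)) / n_units


def spread_of_fraction(n_units, p, trials, rng):
    """Standard deviation of the observed fraction across repeated trials."""
    fractions = [observed_fraction(n_units, p, rng) for _ in range(trials)]
    return statistics.stdev(fractions)


rng = random.Random(0)  # fixed seed so the run is repeatable
P = 0.3                 # hypothetical per-unit probability of the behavior

spreads = {}
for n in (100, 10_000):
    spreads[n] = spread_of_fraction(n, P, trials=100, rng=rng)
    predicted = math.sqrt(P * (1 - P) / n)  # binomial standard error
    print(f"n={n:6d}  simulated spread={spreads[n]:.4f}  predicted={predicted:.4f}")
```

Under independence the aggregate becomes more predictable as the group grows (the spread shrinks like one over the square root of the group size), which is the same reason the ideal gas law works so well for enormous numbers of atoms.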

Therefore I hold two contradictory beliefs. One is that much of the work of the economist is doomed to failure because human behavior is too complex to predict, even probabilistically. The other is that we do have empirical results that predict such behavior, and this means that better statisticians than those currently living may be able to generalize these to larger groups, and then use them to make accurate predictions about the behavior of a few million people, in response to some policy or conflict. I think there is more to be fleshed out in trying to ascertain what a real theory of psychohistory would look like, and how it would be notably less "theoretical" and much more statistical than current approaches to economics or social science. It would probably have to come from someone trained as a mathematical statistician, who understands human behavior on a practical (think bar fights and comedy clubs) as well as formal (Kahneman's work) level.

[TODO: Think about what such a theory might look like more specifically. Or try to build it yourself and see what roadblocks there are. ]

The paper Controlling the False Discovery Rate via Knockoffs by Barber and Candès in 2014 is, in my opinion, a great example of statistical theory at its best. In it, they present a new way to identify causal covariates in high dimensional settings. Put more simply, if you have 10,000 genes potentially contributing to a disease, how do you identify which ones are most causally important, given all sorts of correlations between the genes, known and otherwise? Controlling the false discovery rate (FDR) of causal factors is crucial in fields ranging from the obvious (natural sciences like genetics) to the subtle (A/B testing web designs at big tech companies). Their new method for doing so is intuitive, and in some sense, obvious (in hindsight). That's how you know it's the real deal; a graduate student in statistics tells me it's the sort of thing that will be a standard part of inference courses in 30 years' time. Here, we take this paper apart, studying what the core contribution is, how the authors might have come up with it, why it's important, and the new questions posed in the field (high dimensional inference) as a result. With that, let's begin.
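As a taste of what's coming, here is a minimal sketch of the final selection step of the knockoff filter as I understand it: given one statistic W_j per variable (large and positive suggests a real effect, while null variables are equally likely to be positive or negative), pick a data-dependent threshold using the "knockoff+" rule. Constructing the knockoff variables and computing W is the hard part and is omitted; the W values below are made up.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q.
    Returns infinity if no such t exists, so nothing gets selected."""
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # estimates false positives
        positives = sum(1 for x in w if x >= t)   # would-be discoveries
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select_variables(w, q):
    """Indices of variables whose statistic clears the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Made-up statistics for 10 hypothetical variables.
W = [5.1, 4.3, -0.4, 3.8, 0.2, 2.9, -0.1, 3.3, 0.0, 4.7]
selected = select_variables(W, q=0.2)
print(selected)  # [0, 1, 3, 5, 7, 9]
```

The count of large negative statistics stands in for the number of false positives among the large positive ones, which is why the sign symmetry of the nulls matters so much.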


[TODO: This isn't very clear.]

When living in the Bay Area and seeing what people need to know to build interesting things in various fields, I started to believe that most of what you learn in college is (for most people) useless, and that if you are purely optimizing for vocational expertise you're better off taking the foundational courses in your field and then leaving school. My attitude on this has shifted slightly. I maintain that the above stance is, as stated, probably accurate, but I also now appreciate where this thesis is incomplete.

Where I Was Wrong

The initial premise of my argument was that learning new material is only useful insofar as it does one or more of these things.

  1. Stretches your cognitive limits (eg. Galois theory)
  2. Teaches you material you will directly use in the future (eg. intro CS)
  3. Is fascinating enough that you would regret not studying it (eg. Quantum Mechanics)

In fact, there's a fourth major reason you'd benefit from a class: to build "technical maturity" in a specific field.

Technical Maturity, Defined

This means understanding the foundational modes of thought of the field well enough that you could easily teach yourself adjacent material in the field. The "in a specific field" caveat is important because I do not think that taking a hard graduate course on, say, randomized algorithms will improve your ability to learn financial economics or solid-state chemistry by making you "generally smarter." I think humans are too bad at applying knowledge across domains (even related ones) for that to happen. Instead, what I mean is that taking a grad course on randomized algorithms will help you learn computational complexity theory, algorithmic game theory, and probability theory. In other words, fields that have substantial overlap in content/keywords, and not just "modes of thinking." I elaborate on this, analogizing it with graph theory and giving examples, below.

Knowledge as Graph Theory

To understand what I mean by "technical maturity," think of knowledge as comprised of nodes in a graph. Each tidbit of knowledge constitutes a node, and the connections between them are the edges. For example, a class about line and surface integrals over vector fields would create a new node on the graph, linking it to, say, electromagnetism from one's physics class, which is full of line integrals that have physical meaning. That would be an edge between the two nodes.

Why is this relevant? The contention that many champions of the liberal arts have is that studying challenging topics in broad contexts makes one "generally smarter" and improves one's "ability to learn," in general. This is NOT what I mean by technical maturity. This claim of theirs, in the context of our knowledge graph, amounts to claiming that a node about commutative groups from abstract algebra, or the portrayal of feminism in Austen's Pride and Prejudice, has edges going to things that people do on their jobs; say, writing code or hiring employees. I think there is a wealth of evidence supporting the fact that humans are very poor at transferring knowledge across domains, even closely related ones. Therefore, this "liberal arts as building critical thinking skills" school of thought, I see as pretty much baloney.

But they're on to something. Instead, I claim, knowing about, say, commutative groups is useful for learning molecular representations in chemistry (think of these topics as knowledge nodes; I'm claiming most people would be able to connect the two). Indeed, it's quite different from abstract algebra, but has clear and direct applications of group theory and so is similar enough for meaningful transfer to happen. In other words, not as different to abstract algebra as, say, doing corporate taxes in Excel. Just different enough. Similarly, knowing about the portrayal of feminism in Austen (again, think of this as a node) really could conceivably be directly useful when studying the Suffragette movement because, again, they're different yet similar enough.

Therefore what I mean by technical maturity is comfort around a set of keywords. In abstract algebra, one keyword might be "symmetric structure" and in feminist studies one might be "intersectional theory." Having heard these words in other contexts -- even very different ones -- adds value and builds "technical maturity." However, groups and feminist theory are too far from writing code or drafting speeches to plausibly have any keyword overlap.
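The keyword picture can be sketched as a tiny data structure. This is only an illustration: the node names echo the examples in this essay, but the keyword sets and the "at least one shared keyword" rule are assumptions I made up.

```python
from itertools import combinations

# Hypothetical knowledge nodes, each tagged with a few keywords.
NODES = {
    "abstract algebra": {"group", "symmetry", "structure"},
    "molecular symmetry": {"group", "symmetry", "molecule"},
    "corporate taxes in Excel": {"spreadsheet", "deduction", "filing"},
}


def build_edges(nodes, min_shared=1):
    """Connect two nodes when they share at least `min_shared` keywords."""
    edges = {}
    for a, b in combinations(sorted(nodes), 2):
        shared = nodes[a] & nodes[b]
        if len(shared) >= min_shared:
            edges[(a, b)] = shared
    return edges


edges = build_edges(NODES)
for (a, b), shared in edges.items():
    print(f"{a} <-> {b}: {sorted(shared)}")
```

Abstract algebra links to molecular symmetry through shared keywords, while the Excel node stays isolated: transfer happens only where keywords overlap.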

A Concrete Example

In the fall of my freshman year, I took my first rigorous math class. It studied linear algebra from an abstract perspective. One of the things we learned about was unitary operators, which are linear transformations that preserve inner product structure. During the winter break after that semester, I perused a textbook on quantum computation. And in quantum computing, the unit of logic/computation is the idea of a quantum gate, itself a linear transformation that takes some input and maps it to some output. Being a linear transformation, we represent such logic gates as matrices, and they operate on vector inputs.

One thing that the author of the textbook really emphasized and spent a good amount of time explaining was why we use unitary operators for our logic gates. Evidently, he thought most people reading the book would find that unmotivated. But because I was lucky to take a relatively abstract linear algebra course and had taken a class on probability beforehand, it didn't need any explanation: unitary operators preserve inner products, and random variables are vectors. If you want to ensure that the probabilities encoded in the vectors coming in AND leaving the linear transformation (quantum logic gate) sum to one, using a unitary operator is the obvious choice, since it preserves the inner product of its input with itself (the squared norm, which is the sum of the probabilities). This is the key point: taking the class on abstract linear algebra and one on probability firmly planted nodes (unitary operators, random variables as vectors), and the new topic I was learning (quantum logic gates) had enough keywords in common that I could reasonably create new edges between these knowledge nodes, on the fly. This is the crux of technical maturity: increasing your threshold for what is "obvious".
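Here is a minimal numerical check of that claim, assuming the standard convention that a qubit state is a vector of complex amplitudes whose squared magnitudes are the outcome probabilities. The Hadamard gate is a standard single-qubit unitary; the two input states are arbitrary choices for illustration.

```python
import math


def apply_gate(gate, state):
    """Multiply a square matrix (list of rows) by a state vector."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]


def inner(u, v):
    """Complex inner product <u, v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


# Hadamard gate, a standard single-qubit unitary.
S = 1 / math.sqrt(2)
H = [[S, S],
     [S, -S]]

# Two arbitrary normalized states (amplitudes, not probabilities).
psi = [complex(0.6, 0.0), complex(0.0, 0.8)]
phi = [complex(1.0, 0.0), complex(0.0, 0.0)]

out_psi = apply_gate(H, psi)
out_phi = apply_gate(H, phi)

probs_in = [abs(a) ** 2 for a in psi]
probs_out = [abs(a) ** 2 for a in out_psi]
print(sum(probs_in), sum(probs_out))               # both approximately 1
print(inner(psi, phi), inner(out_psi, out_phi))    # approximately equal
```

The individual probabilities change under the gate, but their total does not, and the inner product between any two states is preserved as well.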

What Might This Look Like In Practice?

If you see college mostly as I do -- a 4-year course on "learning to think like an X" for many different X (mathematician, programmer, physicist, economist, statistician, historian, psychologist), you should optimize for taking difficult, foundational courses in many fields. It's important to emphasize that by "foundational" I don't mean introductory. For example, I do not think taking an introductory CS course or two suffices to "learn to think like a computer scientist." Instead, take intermediate/advanced courses in algorithm design and operating systems to get those gains.

At Harvard, you need to take 12-16 courses in a field and 32 courses overall to graduate with a degree in that field. What if, instead, you took the hardest/meatiest 6/12 in several related fields? In math, perhaps that includes group theory, rings and fields, real and complex analysis, differential geometry. In physics, perhaps classical and quantum mechanics, electrodynamics and thermodynamics. In economics, maybe the intermediate micro/macro sequence alongside game theory, economic history, political economy. In CS, algorithms and operating systems, programming paradigms, complexity theory and compilers. You'd get 80% of the value of 3-4 different undergraduate degrees, and more importantly be equipped to at least understand parts of the research frontier in each field. Teaching yourself would become much easier, and you wouldn't really be an outsider to any of the fields.

The obvious caveat here is if you're set on spending your life exploring and deepening a specific field. If you want to be a mathematical physicist, ignore my advice. Go ahead and take quantum field theory and Lie algebras at the expense of understanding how economists or computer scientists think. That is the price you pay for mastery. And that's fine. But most people are not quite as pointed in their goals, and so my thoughts are probably more relevant for them (and myself).

To summarize, I think that building technical maturity in multiple fields equips you with the ability to learn new things in those specific fields (unlike the claims of the pure-liberal-arts charlatans, who claim it builds critical thinking in general). And this manifests itself as deep comfort in novel situations (again, only in the fields you've cultivated maturity in), which leads to very fast learning and growth of knowledge graphs. Ultimately, the point of college is to set down as many nodes as you can in as many diverse fields as you can, constrained by the fact that you want node density within any specific field (ie. part of your knowledge graph) to be high enough that the nodes roughly approximate/sample the entire space, so you can learn new things in those fields and slot them into the context of existing knowledge.

This mostly explains the big difference I saw between technical college graduates and high-school dropouts in the Valley -- the college graduates felt comfortable in novel situations because they knew their graphs were wide and dense. The high-school dropouts mostly had no nodes to use as reference points from which to grow their graph, and so struggled to keep up with new material. Of course, this does not always hold; you can create these graphs outside of the academy and most people in the academy fail to do a good job at creating these graphs at all. But for me, school seems to work well for this, so I'm happy I'm here, for now.

The 2012 paper by Alex Krizhevsky et al., in which they implement a deep convolutional neural network for image recognition, is perhaps the most seminal academic paper that has been written in the last few decades, full stop. It started the "deep learning revolution." Despite this, it's interesting to note that most of its contributions are not novel.

It was not the first paper to show how deep networks can significantly outperform traditional neural nets for tasks like image recognition. A paper by Ciresan et al. (2011) had already done that. It was not the first paper to show how GPUs can be used to train nets much faster than CPUs. Ciresan's implementation was also written in CUDA to exploit GPUs. And of course neither has a claim to originality as far as the concept of convolution in image recognition is concerned; Yann LeCun (2018 Turing) came up with the idea of a CNN in 1989! AlexNet's success is an example of "incrementalism as discontinuity" innovation. By this, I mean that it has no substantive novel contribution as far as theory is concerned, but makes incremental improvements on existing ideas, combining several important existing ideas (depth, GPUs, dropout regularization) in a way that makes it pass an important threshold. The fact that AlexNet beat the runner-up in the ImageNet challenge by more than 10 percentage points of top-5 error is key because that meant it passed the "threshold" for computer vision being practicable. Being the first to cross that threshold (via a well placed incremental improvement), its architecture became ubiquitous, instantly. Alex Krizhevsky is now immortal because of this.

In this piece, we're going to pore over every page of the paper and see how the authors might have come up with the ideas, and what goes into leaps of technical innovation so great that later innovations are compared to them. For example, when DeepMind's AlphaFold 2 surpassed the 90-point accuracy threshold for protein structure prediction in 2020, it was called the "ImageNet/AlexNet moment of biology." That's high praise. With that, we begin.


As I become more interested in economics and ponder the trade-offs between a career in academic economics and one in, say, the high-technology industry, one thing that keeps coming to mind is the notion of "falsifiability" in a job. I define it as the ability to see clear and immediate cause-and-effect between the work you do and its direct consequences in the world. For example, medicine has falsifiability; people trust doctors because if a doctor is not good at their job, people will die and it will be clear to everyone. Similarly, business owners have falsifiability: there is an easy way to tell if you're doing your job right--see if you're making a profit. A business owner always runs the risk of going out of business. And they own that risk.

This isn't building up to a criticism of the academy. On the contrary: mathematicians and physicists have falsifiability, too. If your proof of a proposition is incorrect, it will be clear to everyone and unambiguous that you have failed at proving the proposition. If your theory of gravitation yields bad predictions for how a projectile or planet will behave, one can simply conduct an experiment and see that your theory is manifestly wrong. Falsifiability is not just the presence of clear right and wrong or a lack of ambiguity; it is also the ownership of risk and the coupling of cause and effect in one's job.

One of the reasons I've become entranced with the study of economics is that it's the rare subject that requires a deep understanding of a variety of fields. A successful economist must have a near-professional working understanding of mathematics, statistics, and programming, but also of history, politics, and philosophy. It's the rare field where studying measure theory and reading Hamlet can both be reasonably argued to be relevant to your job. And it's really impressive to see someone go from proving a difficult theorem on a whiteboard to making a moving speech about the future of a society. Finally, I'm convinced it's possible to have an enormous scope of impact when doing your job right. For example, watching Michael Kremer (2019 Nobel) talk about his work on de-worming showed me how his experiments in developing countries gave strong evidence that de-worming policies in primary schools can have huge returns in enabling students to be healthy enough to stay in school for the long haul. As a direct consequence of his work and that of his collaborators, hundreds of millions of students are benefiting from a policy that might have made the difference between getting a high school education and not. That is falsifiability.

My problem with economics is that such examples of falsifiability are few and far between. It often feels like economists are commentators on the sidelines of a football match making strong comments about the performance of players, where they themselves would struggle to play against a talented middle-schooler. When an innovation economist makes strong claims about the conditions required for innovation and about the traits a successful innovator should have, it makes one think: if you truly are correct, why don't you act on this knowledge and start a successful research lab yourself? In other words, because the study of societies is so complicated, with so many moving parts, it's easy to conjure up an explanation for why your hypothesis was incorrect, and difficult to really know if you were unambiguously right or wrong.

In Good Economics For Hard Times, the authors point out that medical professionals have very high trust ratings with the public, and economists very low ones. I think this is because of falsifiability. A doctor owns the risk they run from the predictions they make. If they predict that a tumor is benign, and it's malignant, someone dies. If an economist predicts a causal mechanism between a certain tax policy and patent rates, they are never held accountable because they can always hide behind the shroud of "further study is needed", even if policies based on that research might have cost their country decades of patent-years' worth of innovation. In some sense, it's as if the "doctors of the economy", who have potentially the most leverage, are held the least accountable and taken the least seriously.

As a direct consequence, surprisingly little economics research makes its way into policy because, as Banerjee and Duflo point out, "economists are not in the business of futurology". This is my problem. If economics isn't meant to make any predictions about the future, what use is it? The entire point of science is to understand underlying mechanisms using the scientific method, and then use that understanding to make falsifiable predictions about the future, and be held accountable for those predictions. Anything else is, as my father puts it, "intellectual masturbation". In short, futurology is the very point of any scientific discipline. It is the raison d'etre, it is the holy grail, it is the single source of truth.

And so if I opt not to pursue economics in the academy, you'll know why. I want to be held accountable for the work I do.

Vaclav Smil is a prolific author on energy economics and history, and his work is remarkably well evidenced and broad in scope. His books are some of the best works of nonfiction I've read; Energy and Civilization (EaC) literally changed how I look at the world. In this short piece, I'll try to articulate some areas of disagreement, where I think he's wrong, without reducing my stance to the sort of blind techno-optimism that is pervasive in 2020 Silicon Valley. Given below are some broad and strong claims he makes in either EaC or Creating the Twentieth Century (CTTC). Each claim is either mostly unsubstantiated or just flat-out opinion disguised as fact. NOTE: CTTC is the title of the book, but I use it to refer to the group of innovations he discusses in it, those made from 1867-1914 (Haber process, X-rays, automobiles, etc.) that he believes are far more impactful and epochal than the later computer industry.

For a long time I, like most people, thought that many of the great thinkers of eons past were so great because they were brilliant and had stunning insights that revealed something fundamentally new about the world. This is true, but it presents a deceptive, reductionist picture of how they became famous. Most of the time, their work did not get the time of day on its own merits alone, as if they had simply published it and been lauded as geniuses who'd cemented their place in history henceforth. Many of them displayed startling and shocking amounts of what today we call hustle or resourcefulness. These entrepreneurial qualities are not typically associated with great thinkers, but I believe they are central to why these people, and not other equally smart people, went down in history. In short, these people are so great not just because of their ability to grapple with the profound, but also because of their ability to navigate the mundane and pedestrian obstacles that encumber modern "entrepreneurs" trying to enact change in the world. This is a list of such examples, added to as I come across them.

Modern, high-energy, high-tech society really is something to marvel at. Today, we can use artificially intelligent machines to manufacture convincing deepfakes of politicians and celebrities doing or saying things they didn't. Social media apps are used by billions of people--a level of access that puts them in the august company of institutions like the Catholic Church. What's more, these apps can influence, and have influenced, political action by pulling psychological strings in users' minds, thereby potentially changing the outcome of elections and the course of international relations. All of human knowledge is accessible to even the poorest villagers, at the tap of a fingertip. Software that started as a web crawler became a search engine and then the planet's de facto arbiter of truth.

The difficult moral quandaries philosophers toyed with decades ago as mere hypotheticals are now coming to pass. We are living in the future.

We send goods created locally 10,000 miles across the world because it is cheaper to have them refined there and transported all the way back than to process them locally. And we still get those goods on our doorstep, exactly when we want them. It has been many decades since a conflict on the scale of those seen in the early 20th century, with a few lines of code now able to wreak more havoc on nations than armies of men in ages past. Despite living in a vastly more connected and globalized world than ever before, a terrifyingly infectious pandemic has killed fewer people in many months than cigarettes or car crashes. We have robots stacking shelves in titanic warehouses, welding parts, and helping construct buildings.

For such an advanced civilisation, we are curiously blind in places. In the most advanced nation on the planet, public infrastructure is crumbling. Up to a quarter of roads and bridges are rated unsafe because there isn't enough money to maintain them. Improvements in life expectancy are plateauing, and our grandchildren will suffer under the weight of our environmental negligence. The democracies that have brought unprecedented human literacy and social equality are the very same systems that incentivise politicians to please swing voters in the short term, at the expense of significant, irreversible environmental damage in the long term. The average member of the American public struggles to multiply single-digit numbers, tell the difference between a country and a continent, or paraphrase a simple argument. And they're overweight. Almost half the American public doesn't "believe" in evolution, and most people who do can't explain what it is!

We are living in the future, yes, but we have not yet fully escaped our past. I wonder what 2120 will look like. I suspect less different than many think.

Despite technology making it easier than ever before to source materials, inputs, and employees from across the world, the most innovative, iconic, and productive groups flock to central hubs. Examples of clusters are Hollywood, Silicon Valley, and Wall Street. But other, more niche clusters exist too, and very much influence the trajectory of industries and progress: Boston for biotech, Shenzhen for hardware & electronics, southern Germany for automobiles, Northern California for wine, and more. And this is nothing new: historical clusters include 20th century Göttingen and Berlin for theoretical physics, Renaissance Florence for fine art, ancient Athens for philosophy, industrial London for engineering, and more.

This piece is based on an HBR article about the economics of clusters, and I'll draw on some of its content while highlighting the factors I think contribute most to giving clusters a disproportionate edge. I'm slightly biased towards clusters because I moved to SF thinking that the internet makes it as easy to start a successful company in London or Abu Dhabi as in the Valley, only to see firsthand how wrong I was, and how many decades ahead of the rest of the world the Valley was.

Historically, competitive advantages came from sourcing better inputs at a lower cost than your competitors. Because differences in input costs could be such a stark advantage, improvements in knowledge, management and technique weren't as valued. Being close to a port or railway line was an immense advantage, and so companies would cluster around them. As freight and shipping make procuring quality goods from across the world cheap, reliable and quick, the advantages in knowledge, management and technique have become decisive. Clusters have a huge advantage on this front, too, as a consequence of in-person meetings being significantly more effective at inciting progress and action than virtual ones--a claim I can only conjecture, but one that seems to be empirically true. At a high level, I think the advantages clusters hold in a modern knowledge economy include: lower barriers to entry, process knowledge, peer pressure, agglomeration economies, sophisticated markets, and public investment.

To conclude, I think clusters are one of humanity's most powerful engines for progress. As globalization further increases, I think their importance will only grow--social media led to more college parties, not fewer, and teleconferencing compounded the importance of in-person meetings instead of obviating them. Information technology makes the world more dynamic and more knowledge- and service-based, which in turn gives outsized advantages to those who can easily identify and adapt to new trends and access 'insider' knowledge. It's a positive feedback loop. I'd welcome suggestions on how both legislation and technology can make large cities in developing and developed countries alike potential breeding grounds for clusters.

Tentative thoughts on cities I've lived in or visited multiple times.

  • Cleveland, OH
  • Quintessential American suburbia. Sprawling yet cozy, uneventful without a small-town feel. Contented.

  • New York City, NY
  • Something for everyone. Brash, no-bullshit attitude in stark contrast to west-coast political correctness. Absolute cultural mêlée, and fully worth its reputation.

  • Abu Dhabi, UAE
  • City built from sand dunes, great for children and families but lacking cultural, experiential and intellectual diversity due to its youth.

  • London, UK
  • Perfect. Innovative, but with room for good banter. People take their work seriously, but don't take themselves too seriously. History and character.

  • Delhi, India
  • Slow, lazy, yet immensely hectic in an iconically Indian way. Enormous estates are juxtaposed with mothers begging for their starving children. People think in days, not years.

  • San Francisco, CA
  • On paper, utopic. In reality, not quite. Forward-thinking, open-minded people, fast moving culture. Closest to a meritocracy I've seen any city get, but with a subtle lack of intellectual vitality and a unique strain of Silicon Valley pretentiousness.

  • Tokyo, Japan
  • Gigantic, and utterly, utterly civilised. People value community and courtesy without the groupthink of Chinese cities. Meticulous attention to detail, but sadly culturally homogeneous.

  • Dubai, UAE
  • A more intense, hedonistic, artificial version of Abu Dhabi. I really respect the government's effort to diversify the economy into tourism. Beautiful, but lifeless.

  • Mumbai, India
  • A more intense, dense version of Delhi, its people chase fortune rather than let it come to them. Even the crime is more extreme. Wall Street and Hollywood meet chai and auto-rickshaws.

  • Boston, MA
  • TBD

    It's interesting to note the similarity between the type of relentless resourcefulness important in war, and in business. The reason the French Republic got enough artillery to fire at the Allies at Toulon was that Napoleon personally pulled strings to get more supplies, had new forges built on-site, and trained people who could in turn train troops to use artillery. An example of the circumstance/skill dichotomy is that the geography of Toulon and the layout of the Allies meant that any fighting was done by artillery, which is what Napoleon happened to be trained in (right person at right place at right time).

    And on the importance of implementation/distribution channels: I would argue that profits are to modern business what war was to nations, namely mechanisms for propagating/disseminating some radical innovation. Something as revolutionary as the French Republic and its progressive ideals could easily have been stamped out at various points in its infancy, changing the face of the world forever (setting us back 50-100 years!). It succeeded not because of the "correctness" of its ideals, in any absolute sense (ideals which philosophers had come up with long before they were enacted, just like Xerox came up with the graphical computer interface long before it was implemented/distributed by Apple), but because of the enterprise and relentlessness of a few men (like Napoleon) in changing the incumbent landscape enough to allow for the introduction, distribution and implementation of these radical new ideas that had been fully fleshed out in theory.

    Importantly, this was done from within the existing infrastructure (fighting wars with other nations that opposed Republicanism). To spread successfully, the new innovation (Republicanism) did not just have to be a superior method of running a state, but also had to be better at winning wars. I see the same fragility at birth in businesses: every household, office, and more might look different if Jobs hadn't recognized the potential of the GUI on the day he visited PARC, just as the whole world's methods of government might still be leaning aristocratic if Napoleon hadn't been enterprising enough to requisition extra artillery at Toulon or food for his soldiers at the bridge of Lodi (victories which led directly to him gaining the power to implement civic innovations like his Code Napoléon).

    You also see that attempting to scale any major innovation (GUI, Haber process, bataillon carré) requires lots of related, ancillary innovations as well as a receptive landscape: think of Apple's redefinition of technology in retail via Apple Stores, Carl Bosch at Oppau/Leuna practically inventing the fields of high-pressure and catalytic chemistry, or the countless small but crucial bills passed under Napoleon. Which you believe to be higher-impact work is a philosophical question: the invention of a concept (Wozniak, Haber) or its distribution/enactment at scale (Jobs, Bosch). An important caveat when weighing the two is that we usually know very well who was first to implement an innovation at scale, but the question of who was first to conceive of the theory is usually murky and debatable.

    In short, I've changed my opinion on what makes a startup succeed (and its competitors fail). A year ago, I'd tell you it was the novelty of the idea, insightfulness of the founders, and receptiveness of the geography and market at the time. Now, I think that successful founders are measured by very different yardsticks. Their test isn't how novel or unique their idea is, their test isn't how insightful or intelligent they are, and their test isn't whether the market or geography is ready for their cool new idea at that time. They are, instead, tested every hour of every day. They are tested when they hire their first employee--are they charismatic and inspirational enough to motivate and persuade a phenomenal engineer to risk it all to come work for them? They are tested when they are running out of money, and don't know any VCs--can they find a way to get a warm intro to a VC or dazzle investors through some creative stunt? They are tested when their initial customers tell them their product is shit--do they know when to persevere, and when to quit? Do they make the right decision at that point, and at every other pivotal moment that marks the early days of starting a company (or anything, for that matter)?

    In this sense, I no longer see startups as outlets for scientists to out-innovate each other, but as what they are: businesses. And the ability to run a startup is just a long way of saying you know how to do business. In other words, you're a good businessman.

    This is an important realization for me because this word, "businessman", in particular, was one that would have had me see someone as "braindead" when first meeting them. I think many scientists/engineers share this sentiment. But now I find that 99% of what comprises entrepreneurial success is, unsurprisingly, entrepreneurial qualities. It seems trivially obvious in hindsight, but from what I've seen in the Bay Area, most technical founders, even those aged 30 and older, do not viscerally understand this until after their first or second company fails.

    In other words, I see now that innovation is not the cost of startup success, but the reward. Now that Facebook has tens of billions in cash reserves, it can pump money into moonshots like Oculus and CTRL-Labs knowing that it can pay for all that research and afford those risks, to a degree that even the biggest universities cannot. I believe Google outputs more computer science research than Stanford and Berkeley combined. Just ONE of Google's moonshot projects, Waymo, has been given more R&D funding in one year ($3B) than MIT gets for ALL its departments in several years ($1B/y). That is what Sergey Brin and Larry Page feel proud of when they put the kettle on to boil every morning of every day. That is high leverage work.

    The requirements for getting to a point where the company is well enough endowed to innovate at all are simple to define: be an anomalously resourceful, charismatic person who can identify what is essential and what isn't, and act decisively and correctly in very high-pressure, resource-constrained systems. These are not necessarily, or indeed at all, the traits associated with "innovation" or "science", which are more to do with the creation of knowledge and tools, and require very different skill sets. Instead, these traits describe a good businessman. And, I would argue, they also describe the legendary soldier-statesmen of eons past. Jobs and Gates have more in common with Caesar and Napoleon than it might seem.

    Put differently, business is modern warfare, and innovation is the prize.
