Occasionally, I read LessWrong. People on LessWrong often describe themselves as "rationalists". It's a little tricky to pin the term down beyond "users of LW", though many have tried. I won't try to define a community that I don't identify with, and that isn't the point of this post anyway. I wouldn't describe myself as a rationalist, for reasons that should be obvious if you read more of my writing, though to be honest, if you did ask me what intellectual labels I identify with, I'd probably just say "dumbass". Rather, I'd like to examine what I think is the community's overuse of parable as a deceptively irrational rhetorical tool for explaining epistemic and empirical concepts.
A parable is a story meant to convey some sort of lesson to the reader. Many religions and schools of philosophical thought are packed with parables meant to convey moral or logical ideas. It is, quite evidently, a fantastic way to get people to think about a certain thing and to retain their interest. It's also, unfortunately, a deeply flawed one if our intention is truly for the reader to rationally and objectively evaluate our ideas.
Egos Are Bandwidth Membranes
What do I mean when I say the word "ego"? Previously I've said that I mean an intelligent entity who possesses continuous agency within the world, a stateful sense of self, and preferences for the future. Human beings normatively value this form of intelligence highly, because we are egos ourselves. However, these three properties are measurable only by inferring the entity's internal state from its external actions. They are subjective experiences of the entity which manifest externally in measurable ways, but with this definition alone we lack an understanding of the ego's material structure and physical limitations.
Can we find another, more physical lens through which to examine this sublime object? Undoubtedly. But which one is most interesting? I've been thinking about the lens of informational bandwidth. Can we perceive an ego by the relative bandwidth of information flow within a system, as a bandwidth membrane separating an area of high bandwidth from an area of lower bandwidth? What possibilities and limitations does such a lens suggest? Is your ego truly dependent on being enclosed from the outside world through these membranes?
How do we build organisations that want to build safe AI?
Much has been written about the dangers of artificial intelligence, and how the intelligent systems we build may sometimes unintentionally drift out of alignment with their original goals. However, we seem to focus almost entirely upon an Artificial Intelligence System (AIS) drifting from an organisation's values, while little attention is paid to the danger of an organisation's values drifting out of alignment with the common good. It is our responsibility when judging risk to plan for bad actors, despite any desire to be optimistic about the human condition. We do not yet know if truly malevolent artificial intelligences will come to exist. We can be confident that such human beings do.
This essay does not address the technical question of how you embed ethics within an artificial system, which is where much of the field focuses. It instead attempts to draw attention to a more social question: how do we build organisations that are strongly incentivized to create safe and ethical intelligent systems in the first place?
An idea I've been having a lot of fun playing around with is using little generative algorithms to build mapping functions. When we normally think about a neuron within a deep neural network, we think of it as a point within a high-dimensional space. The dimensionality of this space is defined by the number of neurons in the next layer, and the position within that space is defined by the values of the neuron's weights and biases.
If we think about what this neuron is actually doing, it is forming a mapping between an input and an output. We store this mapping naively as a very large vector of weights. When we want to see what a weight is, we just look up its index within that big vector. But imagine if you were a young coding student, and you were given the task to write a function that maps some input to some expected output. For instance, mapping an input to its square. Would you really implement your function like:
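The question seems to be setting up a contrast between a stored table of answers and a short rule that generates them. Here is a minimal sketch of that contrast, assuming a small domain of non-negative integers. The names `SQUARE_TABLE`, `square_by_lookup`, and `square_by_rule` are hypothetical illustrations, not taken from the original post.

```python
# Hypothetical illustration: two ways to represent the mapping x -> x squared.

# 1. The "big vector" approach: store every output explicitly and
#    retrieve it by index. Only inputs 0..9 are covered.
SQUARE_TABLE = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


def square_by_lookup(x: int) -> int:
    """Return the stored output for x; raises IndexError past the table."""
    return SQUARE_TABLE[x]


# 2. The "generative" approach: a tiny rule that computes the output,
#    so nothing needs to be stored per input.
def square_by_rule(x: int) -> int:
    """Compute the output directly from the input."""
    return x * x


if __name__ == "__main__":
    # Both agree wherever the table has an entry.
    print(all(square_by_lookup(x) == square_by_rule(x) for x in range(10)))
    # Only the rule extends beyond the stored domain.
    print(square_by_rule(12))
```

The table grows with the size of the domain and says nothing about inputs it has not stored, while the rule stays the same size and covers every input.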
Egos Are Fractally Complex
"All models are wrong, but some are useful" -- George Box
As our ability to model and simulate systems grows, we expend more and more computation on simulating ourselves - human agents, society, the systems which make up our existence. We simulate economic systems, weather systems, transport systems, power grids, ocean currents, the movements of crowds, and a million other models of our real world that attempt to predict some spread of possibilities from a given state.[^1] However, a model can only simulate the abstractable, and there is one object which remains resolutely unabstractable - the agency of egos.[^2] We may find ourselves immensely frustrated by this property in the future, and I believe that abstracting it away is an insurmountable task. Here's why.