Whitepaper: Why Attention Functions make ideas flourish
Status: Draft. This is a first version of a whitepaper proposing a new file format: .aimm (Attention Instructing Memes). Aimm combines simple Markdown text blocks with a way to program in a time-varying probability of their importance.

Introduction

Right now, you the reader could be spending your attention on many different things. But you are reading this, engaging with its ideas, reflecting on them, criticizing them, or just passively skimming. In this paper I lay out a vision for how this process could become more structured, allowing any agent to program their attention.

Objects of Attention

Let’s take any given moment $t$ in the life of an agent. What will be the object of attention $OoA_t$ at that time? We can abstract this down to a set of experiences ${\rm I\!E}_{\text{Agent}}$ (also called the environment henceforth), which contains both its external stimuli and its internal representations of older stimuli or combinations thereof (memories, thoughts, imaginations). Experience here refers to a subset of all possible inputs (also called $\text{Observations}$ in the machine learning literature) and their representation (whether or not that is qualia).

$$OoA_t \subset {\rm I\!E}_{\text{Agent}} \\ {\rm I\!E}_{\text{Agent}} = {\rm I\!E}_{\text{Internal}} + {\rm I\!E}_{\text{External}} \subset {\rm I\!E}_{\text{All possible}}$$
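As a very rough illustration of this abstraction (the class, names, and example experiences below are invented for this sketch and are not part of any proposed format), the agent's environment can be modelled as two sets of experiences whose union contains the object of attention:

```python
from dataclasses import dataclass, field


@dataclass
class AgentEnvironment:
    """Toy model of IE_Agent: internal plus external experiences."""
    internal: set[str] = field(default_factory=set)  # memories, thoughts, imaginations
    external: set[str] = field(default_factory=set)  # current external stimuli

    @property
    def experiences(self) -> set[str]:
        # IE_Agent = IE_Internal + IE_External
        return self.internal | self.external


env = AgentEnvironment(
    internal={"memory of last meeting", "plan for tomorrow"},
    external={"this whitepaper", "phone notification"},
)

# The object of attention at time t is a subset of the agent's experiences.
ooa_t = {"this whitepaper"}
assert ooa_t <= env.experiences  # OoA_t ⊂ IE_Agent
```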

Now internal experiences are often triggered by external events, and it is unclear to me what one would experience without any inputs. This paper therefore focuses mostly on how the environment of an agent changes its set of experiences.

This definition allows us to ask multiple fundamental questions. First, one can compare the richness $R_E$ of one environment to another. $R_E$ is simply the size of the set ${\rm I\!E}_{\text{Agent}}$. A white room offers relatively few things one can spend one’s attention on, while the biggest increase in $R_E$ has come through the invention of the internet. There is an interesting question here of whether the increase in $R_E$ over the course of history has been merely correlated with civilizational progress or whether it could serve as a measure of it. But for this paper it suffices to say that $R_E$ is extremely large:

$$|{\rm I\!E}_{\text{Agent}}| \gg 1$$

Now, one fact that has been consistently reported by meditation practitioners is that $|OoA_t| = 1$, which means humans can only focus on one thing at a time. While I personally believe it is more complicated than this, for an arbitrarily large definition of “experience”, this is most likely true, possibly even for all agents. It definitely suffices for the current argument.

This leaves agents with a core problem though. If

$$(|OoA_t| = 1) \land (|{\rm I\!E}_{\text{Agent}}| \gg 1) \land (OoA_t \subset {\rm I\!E}_{\text{Agent}})$$

which experience should the agent focus on?
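Stated as code, the problem looks something like this (the numbers and labels are arbitrary; the point is only the mismatch between the size of the set and the size of the choice):

```python
import random

# |IE_Agent| >> 1: a huge set of possible experiences.
experiences = {f"experience_{i}" for i in range(10_000)}

# |OoA_t| = 1: the agent can attend to exactly one of them at a time.
# Without any further structure, the choice is essentially arbitrary.
ooa_t = {random.choice(sorted(experiences))}

assert len(ooa_t) == 1 and ooa_t <= experiences
```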

Attention is a weighting over experiences

Let’s suppose our agent has a goal $G$, which is a state of the environment $S_G$ it prefers over all other states $S_\text{all}$. It doesn’t have direct access to these states; all it can do is observe the environment. So a goal is the question: which actions should I take at point $t$, given that I want to observe $OoA_{G, t+n}$? Given infinite compute the agent could just do backcasting and ask what $OoA_{G, t+n-1}$ must be, and so on until $t + n = t$. But a more tractable approach would be to use probabilities.
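Before turning to probabilities, here is a hedged sketch of that backcasting idea, under the (strong) assumption that a `predecessor` function exists which, given a desired observation, returns the observation required one step earlier:

```python
def backcast(goal_observation, predecessor, n):
    """Walk backwards from the observation wanted at t+n to the present t.

    `predecessor` is a hypothetical function: given an observation, it returns
    the observation the agent would need one time step earlier. With infinite
    compute this yields a full chain OoA_{G,t}, ..., OoA_{G,t+n}.
    """
    chain = [goal_observation]
    for _ in range(n):
        chain.append(predecessor(chain[-1]))
    chain.reverse()  # ordered from t to t+n
    return chain


# Toy usage with a placeholder predecessor function.
plan = backcast("meeting my friend", lambda obs: f"whatever leads to {obs!r}", n=3)
for step, obs in enumerate(plan):
    print(f"t+{step}: {obs}")
```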

The core variable I will introduce in this paper is the Attention Probability $æ$:

$$æ_{i,t} = P(OoA_{i,t} \mid OoA_{G, t+n}) = P(\{{\rm I\!E}_{\text{Agent}}\}_i \mid S_G)$$

with $\sum æ_i = 1$.

Or in plain language: $æ$ is the attention you should put on any possible experience, given what you care about. This reduces the hard question of what to pay attention to to three questions:

  1. What is my goal? (At what time in the future do I want to experience what?)
  2. How rich is my environment? Can I change it so that it includes only experiences that would lead me to my goal?
  3. How will the attention functions of different experiences change over time? Do I need to do some things in order?

All three of these questions are still hard, but I will propose a solution that could make each a little more tractable.
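As a concrete (and deliberately simplistic) reading of the definition of $æ$ above, here is a sketch in which the conditional probabilities are replaced by hand-picked relevance scores that merely get normalised; the scores and experience names are invented for illustration:

```python
def attention_probabilities(relevance_to_goal: dict[str, float]) -> dict[str, float]:
    """Return æ_i for every experience i, normalised so that sum(æ_i) == 1."""
    total = sum(relevance_to_goal.values())
    return {exp: score / total for exp, score in relevance_to_goal.items()}


# Toy stand-ins for P(experience | goal) with the goal "meet my friend this evening".
relevance = {
    "phone notification": 0.1,
    "this whitepaper": 0.2,
    "message from my friend about tonight": 0.9,
}

ae = attention_probabilities(relevance)
print(ae)  # a weighting over experiences, given what the agent cares about
assert abs(sum(ae.values()) - 1.0) < 1e-9  # Σ æ_i = 1
```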

Attention functions

In the last chapter we saw that what one should pay attention to depends on three variables: the goal $G$, the environment ${\rm I\!E}_{\text{Agent}}$, and time $t$.

Let’s look at the simplest case $t$ first.

Time dependent attention functions in the wild

Let’s suppose our agent is a human with a simple goal: the human wants to meet with his friend in the evening. They also live on the same planet, so the environment contains that experience. What would the attention function for this context look like?

[Figure: sketch of the time-dependent attention function $æ(t)$ for meeting a friend in the evening]

Or more explicitly: it is mostly 0 during the day (the friend isn’t there), should stay at 1 the whole time they are together, and then goes back to 0 once the friend leaves.
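A minimal sketch of such a time-dependent attention function (the hours are invented; the shape is the point):

```python
def ae_meet_friend(hour: float, start: float = 19.0, end: float = 22.0) -> float:
    """Attention weight for 'being with my friend' at a given hour of the day."""
    return 1.0 if start <= hour < end else 0.0


for hour in (9, 14, 19, 21, 23):
    print(f"{hour:02d}:00 -> æ = {ae_meet_friend(hour)}")
```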

To be continued …

Memes that instruct attention replicate better

Attention Management as a convergent instrumental goal

Introducing an Attention Instructing file format

The motivating need for a universal attention instructing file format

The problem with recommendation silos

.aimm and .aiml

Summary