The Platonic Representation Hypothesis
Minyoung Huh*1   Brian Cheung*1   Tongzhou Wang*1   Phillip Isola*1
Abstract
We argue that representations in AI models, par-
ticularly deep networks, are converging. First, we
survey many examples of convergence in the lit-
erature: over time and across multiple domains,
the ways by which different neural networks rep-
resent data are becoming more aligned. Next, we
demonstrate convergence across data modalities:
as vision models and language models get larger,
they measure distance between datapoints in a
more and more alike way. We hypothesize that
this convergence is driving toward a shared sta-
tistical model of reality, akin to Plato’s concept
of an ideal reality. We term such a representation
the platonic representation and discuss several
possible selective pressures toward it. Finally,
we discuss the implications of these trends, their
limitations, and counterexamples to our analysis.
Project Page: phillipi.github.io/prh
Code: github.com/minyoungg/platonic-rep
1. Introduction
AI systems are rapidly evolving into highly multifunctional
entities. For example, whereas in the past we had special-
purpose solutions for different language processing tasks
(e.g., sentiment analysis, parsing, dialogue), modern large
language models (LLMs) are competent at all these tasks us-
ing a single set of weights (Srivastava et al., 2022). Unified
systems are also being built across data modalities: instead
of using a different architecture for processing images ver-
sus text, recent models, such as GPT4-V (Achiam et al.,
2023), Gemini (Anil et al., 2023), and LLaVA (Liu et al.,
2023), handle both modalities with a combined architecture.
More and more systems are built off of general-purpose
pretrained backbones, sometimes called foundation mod-
els (Bommasani et al., 2021), that support a large range of tasks, including robotics (Driess et al., 2023), bioinformatics (Ma et al.), and health-
*Equal contribution. 1MIT. Correspondence to: Minyoung Huh <minhuh@mit.edu>.
Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024. Copyright 2024 by the author(s).
The Platonic Representation Hypothesis
Neural networks, trained with different objectives
on different data and modalities, are converging to a
shared statistical model of reality in their representa-
tion spaces.
Figure 1. The Platonic Representation Hypothesis: Images (X) and text (Y) are projections of a common underlying reality (Z). We conjecture that representation learning algorithms will converge on a shared representation of Z, and scaling model size, as well as data and task diversity, drives this convergence.
care (Steinberg et al.). In short, AI systems are becom-
ing increasingly homogeneous in both their architectures
and their capabilities.
This paper explores one aspect of this trend: representational
convergence. We argue that there is a growing similarity
in how datapoints are represented in different neural net-
work models. This similarity spans across different model
architectures, training objectives, and even data modalities.
What has led to this convergence? Will it continue? And
ultimately, where does it end?
Our central hypothesis, stated above in Figure 1, is that there
is indeed an endpoint to this convergence and a principle
that drives it: different models are all trying to arrive at a
representation of reality, meaning a representation of the
joint distribution over events in the world that generate the
data we observe. Figure 1 illustrates this hypothesis: there exists a real world (labeled Z), which we measure with various sensors, such as the camera shown to the left (X). Other projections of these measurements, such as the textual description shown, can be produced from the first set of measurements or mediated by some other set of measurements, e.g., touch or other camera views (dotted arrow from X to Y)¹. Representation learning algorithms find vector
embeddings that statistically model the various measure-
ments and projections. The resulting vector embeddings
are all derived from the underlying reality inZand thereby
become aligned. As models are trained on more data and for
more tasks, they require representations that capture more
and more information about Z, and hence alignment toward Z increases toward a convergent point as a function of scale.
We call this converged hypothetical representation the “pla-
tonic representation” in reference to Plato’s Allegory of the
Cave (Plato, c. 375 BC), and his idea of an ideal reality
that underlies our sensations. The training data for our algo-
rithms are shadows on the cave wall, yet, we hypothesize,
models are recovering ever better representations of the ac-
tual world outside the cave. This idea is not unique to Plato;
our hypothesis is also related to the notion of “convergent re-
alism” (Newton-Smith; Hardin & Rosenberg) in the philosophy of science (i.e., that science is converging on truth), and to many arguments that have been put forth in the representation learning literature.
Also closely related to our hypothesis is the “Anna Karenina scenario”² described by Bansal et al. (2021), referring to the possibility that all well-performing neural nets represent the world in the same way. We discuss the evidence they give for this possibility in Section 2. The platonic representation hypothesis refers to the situation where we are in an Anna Karenina scenario and the “happy representation” that is converged upon is one that reflects a statistical model of the underlying reality. We discuss the potential nature of this statistical model in more detail later in the paper.
¹ Touch could convey the shapes in this example but not the colors. This is an important limitation to our hypothesis that we discuss at several points in the paper: different sensors and views might capture different information, which may limit their potential to converge to identical representations.

² Borrowed from Tolstoy (1877); similar analogies have been made in other domains, such as the “Anna Karenina principle” popularized by Diamond (1998) to explain animal domestication.

2. Representations are converging

Preliminaries. We restrict our attention to representations that are vector embeddings. We characterize such a representation by the similarity structure it induces, referred to
as its kernel. Kernels are commonly used to assess representations (Kornblith et al., 2019); this can be justified by the fact that they capture the relative structures among data samples, which are also the learning signal for many machine learning algorithms (Aronszajn, 1950; Schölkopf & Smola, 2002). Following prior literature, we define representational alignment as a measure of the similarity of the similarity structures induced by two representations, i.e., a similarity metric over kernels. We give the mathematical definition of these concepts below:

• A representation is a function $f: \mathcal{X} \to \mathbb{R}^n$ that assigns a feature vector to each input in some data domain $\mathcal{X}$.

• A kernel, $K: \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, characterizes how a representation measures distance/similarity between datapoints: $K(x_i, x_j) = \langle f(x_i), f(x_j) \rangle$, where $\langle \cdot, \cdot \rangle$ denotes inner product, $x_i, x_j \in \mathcal{X}$ and $K \in \mathcal{K}$.

• A kernel-alignment metric, $m: \mathcal{K} \times \mathcal{K} \to \mathbb{R}$, measures the similarity between two kernels, i.e., how similar the distance measure induced by one representation is to the distance measure induced by another. Examples include Centered Kernel Alignment (CKA) (Kornblith et al., 2019), SVCCA (Raghu et al., 2017), and nearest-neighbor metrics (Klabunde et al., 2023).
In our experiments, we use a mutual nearest-neighbor metric that measures the mean intersection of the $k$-nearest neighbor sets induced by two kernels, $K_1$ and $K_2$, normalized by $k$. This metric is a variant of previously proposed nearest-neighbor metrics. See the Appendix for the precise definition and comparisons with alternative alignment metrics.
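To make this concrete, here is a minimal NumPy sketch of such a mutual nearest-neighbor score (the function and variable names are ours, not the released implementation; it assumes row-wise feature matrices from two models over the same inputs):

```python
import numpy as np

def mutual_knn_alignment(feats_a, feats_b, k=10):
    """Mean intersection of the k-nearest-neighbor sets induced by two
    representations of the same inputs, normalized by k (illustrative sketch)."""
    def knn_indices(feats):
        K = feats @ feats.T                   # kernel K(x_i, x_j) = <f(x_i), f(x_j)>
        np.fill_diagonal(K, -np.inf)          # exclude self-matches
        return np.argsort(-K, axis=1)[:, :k]  # indices of the k nearest neighbors per row

    nn_a, nn_b = knn_indices(feats_a), knn_indices(feats_b)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))

# Example: two hypothetical models embedding the same 100 inputs.
rng = np.random.default_rng(0)
za, zb = rng.normal(size=(100, 32)), rng.normal(size=(100, 64))
print(mutual_knn_alignment(za, zb, k=10))
```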
Next, we explore several ways in which representations are
converging. First, we argue that different neural networks
are converging to aligned representations. Then, we show
that this continues to hold across modalities, where image
embeddings in vision models align with text embeddings in
language models.
2.1. Different models, with different architectures and
objectives, can have aligned representations
One indication of representational convergence is the rising
number of systems built on top of pre-trained foundation
models. These models are becoming standard backbones
across a growing spectrum of tasks. Their versatility across
numerous applications implies a level of universality in the
way they represent data.
While this trend implies convergence toward a relatively
small set of foundation models, it does not imply that different foundation models will arrive at the same representation.
Yet that is what has been observed by several recent papers.
Figure 2. VISION models converge as COMPETENCE increases: We measure alignment among 78 models using mutual nearest-neighbors on Places-365 (Zhou et al., 2017), and evaluate their performance on downstream tasks from the Visual Task Adaptation Benchmark (VTAB; Zhai et al. (2019)). LEFT: Models that solve more VTAB tasks tend to be more aligned with each other. Error bars show standard error. RIGHT: We use UMAP to embed models into a 2D space, based on distance ≜ −log(alignment). More competent and general models (blue) have more similar representations.

Lenc & Vedaldi (2015) conducted one such study, in which
they measured representational similarity through a technique called model stitching. Given two models, $f$ and $g$, each composed of multiple layers ($f = f_1 \circ \cdots \circ f_n$, $g = g_1 \circ \cdots \circ g_m$), an intermediate representation from $f$ is integrated into $g$ via a learned affine stitching layer $h$, resulting in a new stitched model $F = f_1 \circ \cdots \circ f_k \circ h \circ g_{k+1} \circ \cdots \circ g_m$. If $F$ has good performance, it indicates that $f$ and $g$ have compatible representations at layer $k$, up to the transform $h$.
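As a rough illustration of the stitching construction, here is a sketch under the assumption that both networks are nn.Sequential models with flat activations at the split point (convolutional features would use a 1x1 convolution instead); the names and split index are illustrative, not the cited papers' exact setup.

```python
import torch.nn as nn

def stitch(f: nn.Sequential, g: nn.Sequential, k: int, f_dim: int, g_dim: int) -> nn.Sequential:
    """Build F = f_1∘...∘f_k ∘ h ∘ g_{k+1}∘...∘g_m with a learned affine layer h.
    Only h is trained; good performance of F suggests compatible representations."""
    h = nn.Linear(f_dim, g_dim)              # the affine stitching layer
    return nn.Sequential(*f[:k], h, *g[k:])
```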
In their study, Lenc & Vedaldi (2015) made two notable findings: (1) A vision model trained on ImageNet (Russakovsky et al., 2015) can be aligned with a model trained on the Places-365 dataset (Zhou et al., 2017) while maintaining good performance; (2) The early layers of these convolutional networks are more interchangeable than later layers. The first finding illustrates a level of data independence, where distinct image datasets lead to similar representations. The second finding agrees with extensive research showing that oriented Gabor-like filters are common in both artificial and biological vision systems. This suggests a convergence to a similar initial layer of representation across various neural network architectures (Olshausen & Field, 1996). Bansal et al. (2021) expanded on the idea of
model stitching, uncovering that models trained using self-
supervised objectives align closely with their supervised
counterparts.
Moschella et al. (2022) further demonstrated the feasibility
of “zero-shot” model stitching without learning a stitching
layer. Despite the fact that different text models were trained
on different modalities, they found that the models often
embed data in remarkably similar ways. In particular, they
considered the kernel K defined by learned representations and showed that K serves as a bridge between models,
allowing an encoder trained in one language, like English,
to work effectively with a decoder in another, like French.
Dravid et al. (2023) extended this idea to individual neurons,
and found “Rosetta Neurons” that are activated by the same
pattern across a range of vision models. Such neurons form a
common dictionary independently discovered by all models.
2.2. Alignment increases with scale and performance
Kornblith et al. (2019) observed that model alignment not only exists but also increases with model scale. On CIFAR-10 classification (Krizhevsky et al., 2009), they found that larger models exhibit greater alignment with each other compared to smaller ones. Theoretical work has likewise shown that models with similar outputs (e.g., as a result of having high performance) also have similar internal
activations. With the continuing trend of models scaling
up, this suggests model alignment will increase over time
– we might expect that the next generation of bigger, better
models will be even more aligned with each other.
We expand upon this observation by evaluating the transfer performance of 78 vision models. These models were trained with varying architectures, training objectives, and datasets (detailed in the Appendix). In Figure 2, we bin these models based on their average transfer performance on the VTAB dataset (Zhai et al., 2019), and then
measure the average kernel alignment of the models within
each bin. The results indicate that models with high transfer
performance form a tightly clustered set of representations,
while models with weak performance have more variable
representations. We further visualize this structure with UMAP (McInnes et al., 2018) over model representations in Figure 2 (right): the models that are most competent all represent data in a similar way. Echoing Bansal et al. (2021) and Tolstoy (1877), we might say: all strong models are alike, each weak model is weak in its own way.
The discussion so far indicates that various models are align-
ing toward a unified representation. But does the conver-
gence extend to model weights? While models with differ-
ent architectures might not have compatible weight spaces,
there exists ample evidence that models with the same architecture will often converge to the same basin of weights (Nagarajan & Kolter). This holds even for models with different initializations, up to permutations over weight space (Ainsworth et al., 2022). Because of this, it is possible to merge separately trained models of the same architecture, and achieve some of the capabilities of all models in the mixture (Stoica et al., 2023).

Figure 3. LANGUAGE and VISION models align: We measure alignment using mutual nearest-neighbors on the Wikipedia caption dataset (WIT) (Srinivasan et al., 2021). The x-axis is the language model performance measured over 4M tokens from the OpenWebText dataset (Gokaslan & Cohen, 2019) (see the Appendix), reported as 1 − bits-per-byte, where bits-per-byte normalizes the cross-entropy by the total bytes in the input text string. The results show a linear relationship between language-vision alignment and language modeling score, where a general trend is that more capable language models align better with more capable vision models. We find that CLIP models, which are trained with explicit language supervision, exhibit a higher level of alignment. However, this alignment decreases after being fine-tuned on ImageNet classification (labeled CLIP (I12K ft)).
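The language-modeling score in the caption above is 1 − bits-per-byte; a minimal sketch of that normalization, assuming a hypothetical total of summed cross-entropy (in nats) over the evaluated text:

```python
import math

def bits_per_byte(total_nats: float, text: str) -> float:
    """Cross-entropy summed over the sample (in nats), normalized by its byte length."""
    total_bits = total_nats / math.log(2)          # nats -> bits
    return total_bits / len(text.encode("utf-8"))  # normalize by total bytes

def language_score(total_nats: float, text: str) -> float:
    return 1.0 - bits_per_byte(total_nats, text)
```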
2.3. Representations are converging across modalities
Do models trained on different data modalities also con-
verge? Several works indicate that the answer is yes.
Merullo et al. (2022) extended model stitching to the cross-
modal setting, finding that a single linear projection is suffi-
cient to stitch a vision model to an LLM and achieve good
performance on visual question answering and image cap-
tioning. Subsequent work (2023) showed that linear stitching can
also work in the opposite direction, aligning text inputs to
visual outputs. In fact, many recent language-vision models
stitch pre-trained language and vision models together. For
example, LLaVA (Liu et al., 2023) demonstrated state-of-
the-art results by projecting visual features into a language
model with a 2-layer MLP.
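A minimal sketch of this kind of projection (dimensions and module names are illustrative, not LLaVA's released code): a 2-layer MLP maps frozen vision features into the language model's embedding space so they can be consumed as soft tokens.

```python
import torch.nn as nn

vision_dim, llm_dim = 1024, 4096  # hypothetical feature sizes
projector = nn.Sequential(
    nn.Linear(vision_dim, llm_dim),
    nn.GELU(),
    nn.Linear(llm_dim, llm_dim),
)
# visual_tokens = projector(vision_features)   # (num_patches, llm_dim)
# The language model then attends over [visual_tokens; text_embeddings].
```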
Other works show further kinds of evidence of cross-modal
synergy. One study (2023) found that jointly training a language model with a vision model improves performance on language tasks, compared to training the language model on its own. Another (2024) probed the visual knowledge of LLMs trained only on language data, by converting
images into code that an LLM can process. They found that
LLMs have rich knowledge of visual structures, to the extent
that decent visual representations can be trained on images
generated solely by querying an LLM to produce code and
rendering the response. In visual generation, LLMs show
abilities to augment captions with visual structures (e.g.,
bounding boxes and locations) and improve generation qual-
ity (Betker et al., 2023).
In other modalities, recent work (2024) showed that auditory models are also roughly aligned with language models up to a linear transformation, and another study (2023) demonstrated
the effectiveness of using pre-trained language models for
facial motion prediction.
We set out to address these claims in a broader scope to de-
termine whether models are indeed learning an increasingly
modality-agnostic representation of the world. We sampled
a variety of models trained either solely on vision or solely
on language, and compared their representations as they
became larger and more competent over many tasks.
In Figure 3, we assess alignment between a suite of language
models and vision models. So far we have only defined
alignment for two kernels defined over the same input space.
To measure cross-modal alignment, we use paired datasets
to bridge the two modalities. For vision and text, we use
the Wikipedia captions dataset $\{(x_i, y_i)\}_i$ (Srinivasan et al., 2021), composed of images from Wikipedia ($x_i$) and their corresponding captions ($y_i$). We then measure alignment between a language model $f_{\text{text}}$ and a vision model $f_{\text{img}}$ as the alignment of the two following kernels:

$$K_{\text{img}}(i, j) = \langle f_{\text{img}}(x_i), f_{\text{img}}(x_j) \rangle \tag{1}$$
$$K_{\text{text}}(i, j) = \langle f_{\text{text}}(y_i), f_{\text{text}}(y_j) \rangle \tag{2}$$
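In code, the paired dataset lets the mutual nearest-neighbor metric sketched earlier be applied across modalities; a self-contained sketch follows (vision_model and language_model are hypothetical callables returning one embedding per input):

```python
import numpy as np

def cross_modal_alignment(vision_model, language_model, images, captions, k=10):
    Z_img = np.stack([vision_model(x) for x in images])     # rows define K_img via inner products
    Z_txt = np.stack([language_model(y) for y in captions]) # rows define K_text

    def knn(Z):
        K = Z @ Z.T
        np.fill_diagonal(K, -np.inf)
        return np.argsort(-K, axis=1)[:, :k]

    nn_img, nn_txt = knn(Z_img), knn(Z_txt)
    return float(np.mean([len(set(a) & set(b)) / k for a, b in zip(nn_img, nn_txt)]))
```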
Using this analysis, we find that the better an LLM is at language modeling, the more it tends to align with vision models, as shown in Figure 3. The converse effect also holds: the better a vision model is, the more it tends to align with LLMs. See the Appendix for additional results.

Figure 4. Alignment predicts downstream performance: We visualize the correlation between LLM alignment score to DINOv2 (Oquab et al., 2023) and downstream task performance on Hellaswag (common-sense) (Zellers et al., 2019) and GSM8K (math) (Cobbe et al., 2021). LLMs are plotted with radii proportional to the size of the model, and color-coded by their rank order in language modeling scores (1 − bits-per-byte). We observe that models aligned more closely with vision also show better performance on downstream language tasks. For Hellaswag, there is a linear relationship with alignment score, while GSM8K exhibits an “emergence”-esque trend.
2.4. Models are increasingly aligning to brains
Neural networks also show substantial alignment with bi-
ological representations in the brain (Yamins et al., 2014).
This commonality may be due to similarities in the task and
data constraints both systems are confronted with. Even
though the mediums may differ – silicon transistors ver-
sus biological neurons – the fundamental problem faced
by brains and machines is the same: efficiently extracting
and understanding the underlying structure in images, text,
sounds, etc. (Barlow et al., 1961).
The tasks that the human visual system has been honed to
perform through evolution – like segmentation, detection,
and whole-image classification – are also the ones that we
train our neural nets to perform. Yamins et al. (2014) went as far as to title their work in the spirit that performance over such tasks implies brain alignment. Further, subsequent work (2022) showed that training data plays a large role in alignment. Psychophysical studies have also shown agreement
between how humans perceive visual similarity and how
models do, even when the models are trained on tasks, such
as self-supervised prediction, that are seemingly unrelated
to mimicking human perception (Zhang et al., 2018).
2.5. Does alignment predict downstream performance?
If models are converging towards a more accurate represen-
tation of reality, we expect that alignment should correspond
to improved performance on downstream tasks. Figure 4 supports this hypothesis by demonstrating improved performance on commonsense reasoning (Hellaswag; Zellers et al. (2019)) and mathematical problem solving (GSM8K; Cobbe et al. (2021)) as alignment improves.
3. Why are representations converging?
Modern machine learning models are generally trained to
minimize the empirical risk with possible implicit and/or
explicit regularization:
$$f^{*} \;=\; \underset{f \in \mathcal{F}}{\arg\min}\;\; \mathbb{E}_{x \sim \text{dataset}}\big[\mathcal{L}(f, x)\big] \;+\; \mathcal{R}(f)$$

where $f^{*}$ is the trained model, $\mathcal{F}$ is the function class, $\mathcal{L}$ is the training objective evaluated on data $x$ drawn from the dataset, and $\mathcal{R}$ is the regularization.

In the following sections, we lay out how each component in this optimization process (the function class, the training data, the training objective, and the regularization) potentially plays a role in facilitating representational convergence.
3.1. Convergence via Task Generality

Each training datapoint and objective (task) places an additional constraint on the model. As data and tasks scale, the volume of representations that satisfy these constraints must proportionately grow smaller, as stated below:
The Multitask Scaling Hypothesis
There are fewer representations that are competent for N tasks than there are for M < N tasks. As we
train more general models that solve more tasks at
once, we should expect fewer possible solutions.
This has been previously termed the Contravariance principle by Cao & Yamins (2024), which states that the set of solutions to an easy goal is large, while the set of solutions to a challenging goal is comparatively smaller. Moreover, we argue that this narrower solution set also generalizes better. As data scales, models that optimize the empirical risk $\mathbb{E}_{x \sim \text{dataset}}[\mathcal{L}(f, x)]$ also improve on the population risk $\mathbb{E}_{x \sim \text{reality}}[\mathcal{L}(f, x)]$, and become better at capturing statistical structures of the true data generating process (reality).
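A toy illustration of this counting argument (purely illustrative, not from the paper's experiments): over the 256 boolean functions on 3-bit inputs, each added task constraint shrinks the set of consistent functions.

```python
import itertools

inputs = list(itertools.product([0, 1], repeat=3))
all_functions = list(itertools.product([0, 1], repeat=len(inputs)))  # 2^8 = 256 candidate functions

# Each "task" pins the output on one input; more tasks => fewer consistent functions.
tasks = [((0, 0, 0), 0), ((1, 1, 1), 1), ((0, 1, 0), 1), ((1, 0, 0), 0)]
consistent = all_functions
for n, (x, y) in enumerate(tasks, start=1):
    consistent = [f for f in consistent if f[inputs.index(x)] == y]
    print(f"after {n} task constraint(s): {len(consistent)} consistent functions")  # 128, 64, 32, 16
```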
Figure 5. The Capacity Hypothesis: If an optimal representation exists in function space, larger hypothesis spaces are more likely to cover it. LEFT: Two small models might not cover the optimum and thus find different solutions (marked by outlined ☆). RIGHT: As the models become larger, they cover the optimum and converge to the same solution (marked by filled ⋆).
<latexit sha1_base64="qY5BReh17p74a3n3nJbos5lx5Kk=">AAAB/XicbZDLSgMxFIYz9VbrrV52boJFcFVmutFlURCXFewF2qFk0kwbmkmG5EyhDsVXceNCEbe+hzvfxkw7C239IfDxn3M4J38QC27Adb+dwtr6xuZWcbu0s7u3f1A+PGoZlWjKmlQJpTsBMUxwyZrAQbBOrBmJAsHawfgmq7cnTBuu5ANMY+ZHZCh5yCkBa/XLJ7eJpBkaDCMC2CgxYf1yxa26c+FV8HKooFyNfvmrN1A0iZgEKogxXc+NwU+JBk4Fm5V6iWExoWMyZF2LkkTM+On8+hk+t84Ah0rbJwHP3d8TKYmMmUaB7YwIjMxyLTP/q3UTCK/8lMs4ASbpYlGYCAwKZ1HgAdeMgphaIFRzeyumI6IJBRtYyYbgLX95FVq1qudWvftapX6dx1FEp+gMXSAPXaI6ukMN1EQUPaJn9IrenCfnxXl3PhatBSefOUZ/5Hz+AI0ilUQ=</latexit><latexit sha1_base64="IdZuo/QJSVV7D8Pplh4p+drN71E=">AAACInicjVDLSgMxFM3UV62v8bFzEyyCqzLTjS6LgrhUsA9oh5JJM21oJhmSO4Va+i8u3PgrbkRdCX6MmXYW2rrwQOBwzrnc3BMmghvwvE+nsLK6tr5R3Cxtbe/s7rn7Bw2jUk1ZnSqhdCskhgkuWR04CNZKNCNxKFgzHF5lfnPEtOFK3sM4YUFM+pJHnBKwUtc9uk4lzajBMCCAjRIj1nXLXsWbAS8TPydllON/8a773ukpmsZMAhXEmLbvJRBMiAZOBZuWOqlhCaFD0mdtSyWJmQkmsxOn+NQqPRwpbZ8EPFN/TkxIbMw4Dm0yJjAwi14m/uW1U4guggmXSQpM0vmiKBUYFM76wj2uGQUxtoRQze1fMR0QTSjYVkv2dH/x0GXSqFZ8r+LfVcu1y7yzIjpGJ+gM+egc1dANukV1RNEDekTP6NV5cl6cN+djHi04+cwh+gXn6xsLXpy9</latexit><latexit sha1_base64="IdZuo/QJSVV7D8Pplh4p+drN71E=">AAACInicjVDLSgMxFM3UV62v8bFzEyyCqzLTjS6LgrhUsA9oh5JJM21oJhmSO4Va+i8u3PgrbkRdCX6MmXYW2rrwQOBwzrnc3BMmghvwvE+nsLK6tr5R3Cxtbe/s7rn7Bw2jUk1ZnSqhdCskhgkuWR04CNZKNCNxKFgzHF5lfnPEtOFK3sM4YUFM+pJHnBKwUtc9uk4lzajBMCCAjRIj1nXLXsWbAS8TPydllON/8a773ukpmsZMAhXEmLbvJRBMiAZOBZuWOqlhCaFD0mdtSyWJmQkmsxOn+NQqPRwpbZ8EPFN/TkxIbMw4Dm0yJjAwi14m/uW1U4guggmXSQpM0vmiKBUYFM76wj2uGQUxtoRQze1fMR0QTSjYVkv2dH/x0GXSqFZ8r+LfVcu1y7yzIjpGJ+gM+egc1dANukV1RNEDekTP6NV5cl6cN+djHi04+cwh+gXn6xsLXpy9</latexit><latexit sha1_base64="IdZuo/QJSVV7D8Pplh4p+drN71E=">AAACInicjVDLSgMxFM3UV62v8bFzEyyCqzLTjS6LgrhUsA9oh5JJM21oJhmSO4Va+i8u3PgrbkRdCX6MmXYW2rrwQOBwzrnc3BMmghvwvE+nsLK6tr5R3Cxtbe/s7rn7Bw2jUk1ZnSqhdCskhgkuWR04CNZKNCNxKFgzHF5lfnPEtOFK3sM4YUFM+pJHnBKwUtc9uk4lzajBMCCAjRIj1nXLXsWbAS8TPydllON/8a773ukpmsZMAhXEmLbvJRBMiAZOBZuWOqlhCaFD0mdtSyWJmQkmsxOn+NQqPRwpbZ8EPFN/TkxIbMw4Dm0yJjAwi14m/uW1U4guggmXSQpM0vmiKBUYFM76wj2uGQUxtoRQze1fMR0QTSjYVkv2dH/x0GXSqFZ8r+LfVcu1y7yzIjpGJ+gM+egc1dANukV1RNEDekTP6NV5cl6cN+djHi04+cwh+gXn6xsLXpy9</latexit>
the tasks
<latexit sha1_base64="kHNBGIL9Xh/D+2bTV1fyNlwdkTY=">AAAB8HicbVA9SwNBEJ2LXzF+RS1tFoNgFe7SaBm0sYxgPiQ5wt5mkyzZ3Tt254Rw5FfYWChi68+x89+4Sa7QxAcDj/dmmJkXJVJY9P1vr7CxubW9U9wt7e0fHB6Vj09aNk4N400Wy9h0Imq5FJo3UaDkncRwqiLJ29Hkdu63n7ixItYPOE14qOhIi6FgFJ30iGNOkNqJ7ZcrftVfgKyTICcVyNHol796g5ilimtkklrbDfwEw4waFEzyWamXWp5QNqEj3nVUU8VtmC0OnpELpwzIMDauNJKF+nsio8raqYpcp6I4tqveXPzP66Y4vA4zoZMUuWbLRcNUEozJ/HsyEIYzlFNHKDPC3UrYmBrK0GVUciEEqy+vk1atGvjV4L5Wqd/kcRThDM7hEgK4gjrcQQOawEDBM7zCm2e8F+/d+1i2Frx85hT+wPv8Ab3BkFk=</latexit><latexit sha1_base64="HQzqcZXPyQIUEQ3C41blEr90t9E=">AAACFXicjVC7SgNBFL3rM8ZX1NJmMAhWYTeNlkEbSwXzkGQJs5PZZMjM7DJzVwghX2Fh46/YiNgKdv6Nk2QLTSw8MHA451zu3BOlUlj0/S9vZXVtfWOzsFXc3tnd2y8dHDZskhnG6yyRiWlF1HIpNK+jQMlbqeFURZI3o+HV1G8+cGNFou9wlPJQ0b4WsWAUnXSPA06Q2qHtlsp+xZ+BLJMgJ2XI8b94t/TZ6SUsU1wjk9TaduCnGI6pQcEknxQ7meUpZUPa521HNVXchuPZVRNy6pQeiRPjnkYyU39OjKmydqQil1QUB3bRm4p/ee0M44twLHSaIddsvijOJMGETCsiPWE4QzlyhDIj3F8JG1BDGboii+70YPHQZdKoVgK/EtxWy7XLvLMCHMMJnEEA51CDa7iBOjBQ8AjP8Oo9eS/em/c+j654+cwR/IL38Q2W4JfS</latexit><latexit sha1_base64="HQzqcZXPyQIUEQ3C41blEr90t9E=">AAACFXicjVC7SgNBFL3rM8ZX1NJmMAhWYTeNlkEbSwXzkGQJs5PZZMjM7DJzVwghX2Fh46/YiNgKdv6Nk2QLTSw8MHA451zu3BOlUlj0/S9vZXVtfWOzsFXc3tnd2y8dHDZskhnG6yyRiWlF1HIpNK+jQMlbqeFURZI3o+HV1G8+cGNFou9wlPJQ0b4WsWAUnXSPA06Q2qHtlsp+xZ+BLJMgJ2XI8b94t/TZ6SUsU1wjk9TaduCnGI6pQcEknxQ7meUpZUPa521HNVXchuPZVRNy6pQeiRPjnkYyU39OjKmydqQil1QUB3bRm4p/ee0M44twLHSaIddsvijOJMGETCsiPWE4QzlyhDIj3F8JG1BDGboii+70YPHQZdKoVgK/EtxWy7XLvLMCHMMJnEEA51CDa7iBOjBQ8AjP8Oo9eS/em/c+j654+cwR/IL38Q2W4JfS</latexit><latexit sha1_base64="HQzqcZXPyQIUEQ3C41blEr90t9E=">AAACFXicjVC7SgNBFL3rM8ZX1NJmMAhWYTeNlkEbSwXzkGQJs5PZZMjM7DJzVwghX2Fh46/YiNgKdv6Nk2QLTSw8MHA451zu3BOlUlj0/S9vZXVtfWOzsFXc3tnd2y8dHDZskhnG6yyRiWlF1HIpNK+jQMlbqeFURZI3o+HV1G8+cGNFou9wlPJQ0b4WsWAUnXSPA06Q2qHtlsp+xZ+BLJMgJ2XI8b94t/TZ6SUsU1wjk9TaduCnGI6pQcEknxQ7meUpZUPa521HNVXchuPZVRNy6pQeiRPjnkYyU39OjKmydqQil1QUB3bRm4p/ee0M44twLHSaIddsvijOJMGETCsiPWE4QzlyhDIj3F8JG1BDGboii+70YPHQZdKoVgK/EtxWy7XLvLMCHMMJnEEA51CDa7iBOjBQ8AjP8Oo9eS/em/c+j654+cwR/IL38Q2W4JfS</latexit>
Hypothesis space
<latexit sha1_base64="tLmIljFWhXs/Bu7+r0Wbg/j8TVU=">AAACoXicfVFNb9NAEN2YrxK+UjhysbCQEEKR3QscK+BQDoiAmrSS14rGm3Gy6n5Yu+OCZfmfcIX/xL9hnQaJtoiRVnr75s3um5myVtJTmv4aRTdu3rp9Z+/u+N79Bw8fTfYfL7xtnMC5sMq60xI8KmlwTpIUntYOQZcKT8qzd0P+5Bydl9YcU1tjoWFtZCUFUKCWk8lRW1vaoJc+9jUIXE6SdJpuI74Osh1I2C5my/1Ry1dWNBoNCQXe51laU9GBIykU9mPeeAwvn8Ea8wANaPRFt7Xex88Ds4or68IxFG/Zvys60N63ugxKDbTxV3MD+a9c3lD1puikqRtCIy4+qhoVk42HOcQr6VCQagMA4WTwGosNOBAUpjUec4NfhdUazKrjxjrd51nRcYUVcbVAR0nGnVxviLvhFrp8j6F7hx+Dk081OiDrXnYc3FpL04dprPmrAf1PCN/+CAO6bIEcGF9bj33Ht81W3XHfh2VlV1dzHSwOplk6zT4fJIdvd2vbY0/ZM/aCZew1O2RHbMbmTLBz9p39YD+jJPoQzaIvF9JotKt5wi5FlP8GkZ7TfQ==</latexit><latexit sha1_base64="DEUQt5Kk9QbDYQMJ1vZ14+xLeyc=">AAACxnicjVHbattAEF2rt8S9Oe1jX0RNoZRipLy0j6Htg19KW4idgFeY0XokL9mL2B0lFULQD+lrviZf0L/pynGgSUrpwMLZM2eYOTN5paSnJPk1iO7cvXf/wc7u8OGjx0+ejvaezb2tncCZsMq64xw8KmlwRpIUHlcOQecKj/KTj33+6BSdl9YcUlNhpqE0spACKFDL0WjaVJbW6KWPfQUCl6NxMkk2Ed8G6RaM2Tb+T77cGzR8ZUWt0ZBQ4P0iTSrKWnAkhcJuyGuPof0JlLgI0IBGn7Ubf138KjCruLAuPEPxhv2zogXtfaPzoNRAa38z15N/yy1qKt5nrTRVTWjEZaOiVjHZuF9WvJIOBakmABBOhlljsQYHgsJKh0Nu8ExYrcGsWm6s090izVqusCCu5uhonHInyzVx1/+Cy08Y3Dv8HCb5UqEDsu5Ny8GVWpoubKPkb3v0LyF8vxIGdH0EcmB8ZT12Ld+YLdrDrgsnSm8e5DaY70/SZJJ+2x8ffNjedoe9YC/Za5ayd+yATdlXNmOCnbKf7JxdRNPIRHV0dimNBtua5+xaRD9+Ay8H2vY=</latexit><latexit sha1_base64="DEUQt5Kk9QbDYQMJ1vZ14+xLeyc=">AAACxnicjVHbattAEF2rt8S9Oe1jX0RNoZRipLy0j6Htg19KW4idgFeY0XokL9mL2B0lFULQD+lrviZf0L/pynGgSUrpwMLZM2eYOTN5paSnJPk1iO7cvXf/wc7u8OGjx0+ejvaezb2tncCZsMq64xw8KmlwRpIUHlcOQecKj/KTj33+6BSdl9YcUlNhpqE0spACKFDL0WjaVJbW6KWPfQUCl6NxMkk2Ed8G6RaM2Tb+T77cGzR8ZUWt0ZBQ4P0iTSrKWnAkhcJuyGuPof0JlLgI0IBGn7Ubf138KjCruLAuPEPxhv2zogXtfaPzoNRAa38z15N/yy1qKt5nrTRVTWjEZaOiVjHZuF9WvJIOBakmABBOhlljsQYHgsJKh0Nu8ExYrcGsWm6s090izVqusCCu5uhonHInyzVx1/+Cy08Y3Dv8HCb5UqEDsu5Ny8GVWpoubKPkb3v0LyF8vxIGdH0EcmB8ZT12Ld+YLdrDrgsnSm8e5DaY70/SZJJ+2x8ffNjedoe9YC/Za5ayd+yATdlXNmOCnbKf7JxdRNPIRHV0dimNBtua5+xaRD9+Ay8H2vY=</latexit><latexit sha1_base64="DEUQt5Kk9QbDYQMJ1vZ14+xLeyc=">AAACxnicjVHbattAEF2rt8S9Oe1jX0RNoZRipLy0j6Htg19KW4idgFeY0XokL9mL2B0lFULQD+lrviZf0L/pynGgSUrpwMLZM2eYOTN5paSnJPk1iO7cvXf/wc7u8OGjx0+ejvaezb2tncCZsMq64xw8KmlwRpIUHlcOQecKj/KTj33+6BSdl9YcUlNhpqE0spACKFDL0WjaVJbW6KWPfQUCl6NxMkk2Ed8G6RaM2Tb+T77cGzR8ZUWt0ZBQ4P0iTSrKWnAkhcJuyGuPof0JlLgI0IBGn7Ubf138KjCruLAuPEPxhv2zogXtfaPzoNRAa38z15N/yy1qKt5nrTRVTWjEZaOiVjHZuF9WvJIOBakmABBOhlljsQYHgsJKh0Nu8ExYrcGsWm6s090izVqusCCu5uhonHInyzVx1/+Cy08Y3Dv8HCb5UqEDsu5Ny8GVWpoubKPkb3v0LyF8vxIGdH0EcmB8ZT12Ld+YLdrDrgsnSm8e5DaY70/SZJJ+2x8ffNjedoe9YC/Za5ayd+yATdlXNmOCnbKf7JxdRNPIRHV0dimNBtua5+xaRD9+Ay8H2vY=</latexit>
simplicity bias
<latexit sha1_base64="9hYX6Wjy0VHKyxplovop/5+4+qY=">AAAB+HicbVBNS8NAEJ34WetHox69LBbBU0l60WPRi8cK9gPaUDbbTbt0Nwm7EyGW/hIvHhTx6k/x5r9x2+agrQ8GHu/NMDMvTKUw6Hnfzsbm1vbObmmvvH9weFRxj0/aJsk04y2WyER3Q2q4FDFvoUDJu6nmVIWSd8LJ7dzvPHJtRBI/YJ7yQNFRLCLBKFpp4FaMUHYPE5iTUFAzcKtezVuArBO/IFUo0By4X/1hwjLFY2SSGtPzvRSDKdUomOSzcj8zPKVsQke8Z2lMFTfBdHH4jFxYZUiiRNuKkSzU3xNTqozJVWg7FcWxWfXm4n9eL8PoOpiKOM2Qx2y5KMokwYTMUyBDoTlDmVtCmRb2VsLGVFOGNquyDcFffXmdtOs136v59/Vq46aIowRncA6X4MMVNOAOmtACBhk8wyu8OU/Oi/PufCxbN5xi5hT+wPn8Afmyk0U=</latexit><latexit sha1_base64="2c5HZptIFuLSIg5wRrHLe0x+A4o=">AAACHXicjVC7SgNBFL3rM8ZHVi1tBoNgFXbTaBm0sVQwD0iWMDuZTYbM7C4zd4Ul5EssbPwVGxELG/FvnCRbaGLhgYHDOfdy55wwlcKg5305a+sbm1vbpZ3y7t7+QcU9PGqZJNOMN1kiE90JqeFSxLyJAiXvpJpTFUreDsfXM7/9wLURSXyPecoDRYexiASjaKW+WzFC2TtMYE5CQU3frXo1bw6ySvyCVKHA/8b77kdvkLBM8RiZpMZ0fS/FYEI1Cib5tNzLDE8pG9Mh71oaU8VNMJmnm5IzqwxIlGj7YiRz9efGhCpjchXaSUVxZJa9mfiX180wugwmIk4z5DFbHIoySTAhs6rIQGjOUOaWUKaF/SthI6opQ1to2Ub3l4Oukla95ns1/65ebVwVnZXgBE7hHHy4gAbcwC00gUEGj/AMr86T8+K8Oe+L0TWn2DmGX3A+vwE0lJq+</latexit><latexit sha1_base64="2c5HZptIFuLSIg5wRrHLe0x+A4o=">AAACHXicjVC7SgNBFL3rM8ZHVi1tBoNgFXbTaBm0sVQwD0iWMDuZTYbM7C4zd4Ul5EssbPwVGxELG/FvnCRbaGLhgYHDOfdy55wwlcKg5305a+sbm1vbpZ3y7t7+QcU9PGqZJNOMN1kiE90JqeFSxLyJAiXvpJpTFUreDsfXM7/9wLURSXyPecoDRYexiASjaKW+WzFC2TtMYE5CQU3frXo1bw6ySvyCVKHA/8b77kdvkLBM8RiZpMZ0fS/FYEI1Cib5tNzLDE8pG9Mh71oaU8VNMJmnm5IzqwxIlGj7YiRz9efGhCpjchXaSUVxZJa9mfiX180wugwmIk4z5DFbHIoySTAhs6rIQGjOUOaWUKaF/SthI6opQ1to2Ub3l4Oukla95ns1/65ebVwVnZXgBE7hHHy4gAbcwC00gUEGj/AMr86T8+K8Oe+L0TWn2DmGX3A+vwE0lJq+</latexit><latexit sha1_base64="2c5HZptIFuLSIg5wRrHLe0x+A4o=">AAACHXicjVC7SgNBFL3rM8ZHVi1tBoNgFXbTaBm0sVQwD0iWMDuZTYbM7C4zd4Ul5EssbPwVGxELG/FvnCRbaGLhgYHDOfdy55wwlcKg5305a+sbm1vbpZ3y7t7+QcU9PGqZJNOMN1kiE90JqeFSxLyJAiXvpJpTFUreDsfXM7/9wLURSXyPecoDRYexiASjaKW+WzFC2TtMYE5CQU3frXo1bw6ySvyCVKHA/8b77kdvkLBM8RiZpMZ0fS/FYEI1Cib5tNzLDE8pG9Mh71oaU8VNMJmnm5IzqwxIlGj7YiRz9efGhCpjchXaSUVxZJa9mfiX180wugwmIk4z5DFbHIoySTAhs6rIQGjOUOaWUKaF/SthI6opQ1to2Ub3l4Oukla95ns1/65ebVwVnZXgBE7hHHy4gAbcwC00gUEGj/AMr86T8+K8Oe+L0TWn2DmGX3A+vwE0lJq+</latexit>
simplicity bias
<latexit sha1_base64="9hYX6Wjy0VHKyxplovop/5+4+qY=">AAAB+HicbVBNS8NAEJ34WetHox69LBbBU0l60WPRi8cK9gPaUDbbTbt0Nwm7EyGW/hIvHhTx6k/x5r9x2+agrQ8GHu/NMDMvTKUw6Hnfzsbm1vbObmmvvH9weFRxj0/aJsk04y2WyER3Q2q4FDFvoUDJu6nmVIWSd8LJ7dzvPHJtRBI/YJ7yQNFRLCLBKFpp4FaMUHYPE5iTUFAzcKtezVuArBO/IFUo0By4X/1hwjLFY2SSGtPzvRSDKdUomOSzcj8zPKVsQke8Z2lMFTfBdHH4jFxYZUiiRNuKkSzU3xNTqozJVWg7FcWxWfXm4n9eL8PoOpiKOM2Qx2y5KMokwYTMUyBDoTlDmVtCmRb2VsLGVFOGNquyDcFffXmdtOs136v59/Vq46aIowRncA6X4MMVNOAOmtACBhk8wyu8OU/Oi/PufCxbN5xi5hT+wPn8Afmyk0U=</latexit><latexit sha1_base64="2c5HZptIFuLSIg5wRrHLe0x+A4o=">AAACHXicjVC7SgNBFL3rM8ZHVi1tBoNgFXbTaBm0sVQwD0iWMDuZTYbM7C4zd4Ul5EssbPwVGxELG/FvnCRbaGLhgYHDOfdy55wwlcKg5305a+sbm1vbpZ3y7t7+QcU9PGqZJNOMN1kiE90JqeFSxLyJAiXvpJpTFUreDsfXM7/9wLURSXyPecoDRYexiASjaKW+WzFC2TtMYE5CQU3frXo1bw6ySvyCVKHA/8b77kdvkLBM8RiZpMZ0fS/FYEI1Cib5tNzLDE8pG9Mh71oaU8VNMJmnm5IzqwxIlGj7YiRz9efGhCpjchXaSUVxZJa9mfiX180wugwmIk4z5DFbHIoySTAhs6rIQGjOUOaWUKaF/SthI6opQ1to2Ub3l4Oukla95ns1/65ebVwVnZXgBE7hHHy4gAbcwC00gUEGj/AMr86T8+K8Oe+L0TWn2DmGX3A+vwE0lJq+</latexit><latexit sha1_base64="2c5HZptIFuLSIg5wRrHLe0x+A4o=">AAACHXicjVC7SgNBFL3rM8ZHVi1tBoNgFXbTaBm0sVQwD0iWMDuZTYbM7C4zd4Ul5EssbPwVGxELG/FvnCRbaGLhgYHDOfdy55wwlcKg5305a+sbm1vbpZ3y7t7+QcU9PGqZJNOMN1kiE90JqeFSxLyJAiXvpJpTFUreDsfXM7/9wLURSXyPecoDRYexiASjaKW+WzFC2TtMYE5CQU3frXo1bw6ySvyCVKHA/8b77kdvkLBM8RiZpMZ0fS/FYEI1Cib5tNzLDE8pG9Mh71oaU8VNMJmnm5IzqwxIlGj7YiRz9efGhCpjchXaSUVxZJa9mfiX180wugwmIk4z5DFbHIoySTAhs6rIQGjOUOaWUKaF/SthI6opQ1to2Ub3l4Oukla95ns1/65ebVwVnZXgBE7hHHy4gAbcwC00gUEGj/AMr86T8+K8Oe+L0TWn2DmGX3A+vwE0lJq+</latexit><latexit sha1_base64="2c5HZptIFuLSIg5wRrHLe0x+A4o=">AAACHXicjVC7SgNBFL3rM8ZHVi1tBoNgFXbTaBm0sVQwD0iWMDuZTYbM7C4zd4Ul5EssbPwVGxELG/FvnCRbaGLhgYHDOfdy55wwlcKg5305a+sbm1vbpZ3y7t7+QcU9PGqZJNOMN1kiE90JqeFSxLyJAiXvpJpTFUreDsfXM7/9wLURSXyPecoDRYexiASjaKW+WzFC2TtMYE5CQU3frXo1bw6ySvyCVKHA/8b77kdvkLBM8RiZpMZ0fS/FYEI1Cib5tNzLDE8pG9Mh71oaU8VNMJmnm5IzqwxIlGj7YiRz9efGhCpjchXaSUVxZJa9mfiX180wugwmIk4z5DFbHIoySTAhs6rIQGjOUOaWUKaF/SthI6opQ1to2Ub3l4Oukla95ns1/65ebVwVnZXgBE7hHHy4gAbcwC00gUEGj/AMr86T8+K8Oe+L0TWn2DmGX3A+vwE0lJq+</latexit>
task gradient
<latexit sha1_base64="5gXgYddUQbJOgb2Vbxiy2PQ+7dI=">AAAB9XicbVC7SgNBFL3jM8ZX1NJmMAhWYTeNlkEbywjmAcka7s7OJkNmH8zMKmHJf9hYKGLrv9j5N06SLTTxwMDhnHu4d46fSqGN43yTtfWNza3t0k55d2//4LBydNzWSaYYb7FEJqrro+ZSxLxlhJG8myqOkS95xx/fzPzOI1daJPG9maTci3AYi1AwNFZ6MKjHdKgwEDw2dFCpOjVnDrpK3IJUoUBzUPnqBwnLIhtmErXuuU5qvByVEUzyabmfaZ4iG+OQ9yyNMeLay+dXT+m5VQIaJso+u3yu/k7kGGk9iXw7GaEZ6WVvJv7n9TITXnm5iNPM8JgtFoWZpCahswpoIBRnRk4sQaaEvZWyESpkxhZVtiW4y19eJe16zXVq7l292rgu6ijBKZzBBbhwCQ24hSa0gIGCZ3iFN/JEXsg7+ViMrpEicwJ/QD5/ACplkkU=</latexit><latexit sha1_base64="1lG3sUBkK4ERYdL3hQH7d+rY7jA=">AAACGnicjVA9SwNBEJ3zM8avqKXNYhCswl0aLYM2lgrmA5IzzO3tJUt2747dPSEc+R8WNv4VGxE7sfHfuEmu0MTCBwOP92aYeROkgmvjul/Oyura+sZmaau8vbO7t185OGzpJFOUNWkiEtUJUDPBY9Y03AjWSRVDGQjWDkZXU7/9wJTmSXxnxinzJQ5iHnGKxkr3BvWIDBSGnMWG9CtVt+bOQJaJV5AqFPhfe7/y0QsTmkm7gQrUuuu5qfFzVIZTwSblXqZZinSEA9a1NEbJtJ/Pok3IqVVCEiXKlr1wpv6cyFFqPZaB7ZRohnrRm4p/ed3MRBd+zuM0Myym80VRJohJyPRPJOSKUSPGliBV3N5K6BAVUmO/WbbRvcWgy6RVr3luzbutVxuXxc9KcAwncAYenEMDruEGmkBBwSM8w6vz5Lw4b877vHXFKWaO4Becz29EH5m+</latexit><latexit sha1_base64="1lG3sUBkK4ERYdL3hQH7d+rY7jA=">AAACGnicjVA9SwNBEJ3zM8avqKXNYhCswl0aLYM2lgrmA5IzzO3tJUt2747dPSEc+R8WNv4VGxE7sfHfuEmu0MTCBwOP92aYeROkgmvjul/Oyura+sZmaau8vbO7t185OGzpJFOUNWkiEtUJUDPBY9Y03AjWSRVDGQjWDkZXU7/9wJTmSXxnxinzJQ5iHnGKxkr3BvWIDBSGnMWG9CtVt+bOQJaJV5AqFPhfe7/y0QsTmkm7gQrUuuu5qfFzVIZTwSblXqZZinSEA9a1NEbJtJ/Pok3IqVVCEiXKlr1wpv6cyFFqPZaB7ZRohnrRm4p/ed3MRBd+zuM0Myym80VRJohJyPRPJOSKUSPGliBV3N5K6BAVUmO/WbbRvcWgy6RVr3luzbutVxuXxc9KcAwncAYenEMDruEGmkBBwSM8w6vz5Lw4b877vHXFKWaO4Becz29EH5m+</latexit><latexit sha1_base64="1lG3sUBkK4ERYdL3hQH7d+rY7jA=">AAACGnicjVA9SwNBEJ3zM8avqKXNYhCswl0aLYM2lgrmA5IzzO3tJUt2747dPSEc+R8WNv4VGxE7sfHfuEmu0MTCBwOP92aYeROkgmvjul/Oyura+sZmaau8vbO7t185OGzpJFOUNWkiEtUJUDPBY9Y03AjWSRVDGQjWDkZXU7/9wJTmSXxnxinzJQ5iHnGKxkr3BvWIDBSGnMWG9CtVt+bOQJaJV5AqFPhfe7/y0QsTmkm7gQrUuuu5qfFzVIZTwSblXqZZinSEA9a1NEbJtJ/Pok3IqVVCEiXKlr1wpv6cyFFqPZaB7ZRohnrRm4p/ed3MRBd+zuM0Myym80VRJohJyPRPJOSKUSPGliBV3N5K6BAVUmO/WbbRvcWgy6RVr3luzbutVxuXxc9KcAwncAYenEMDruEGmkBBwSM8w6vz5Lw4b877vHXFKWaO4Becz29EH5m+</latexit>
task gradient
<latexit sha1_base64="5gXgYddUQbJOgb2Vbxiy2PQ+7dI=">AAAB9XicbVC7SgNBFL3jM8ZX1NJmMAhWYTeNlkEbywjmAcka7s7OJkNmH8zMKmHJf9hYKGLrv9j5N06SLTTxwMDhnHu4d46fSqGN43yTtfWNza3t0k55d2//4LBydNzWSaYYb7FEJqrro+ZSxLxlhJG8myqOkS95xx/fzPzOI1daJPG9maTci3AYi1AwNFZ6MKjHdKgwEDw2dFCpOjVnDrpK3IJUoUBzUPnqBwnLIhtmErXuuU5qvByVEUzyabmfaZ4iG+OQ9yyNMeLay+dXT+m5VQIaJso+u3yu/k7kGGk9iXw7GaEZ6WVvJv7n9TITXnm5iNPM8JgtFoWZpCahswpoIBRnRk4sQaaEvZWyESpkxhZVtiW4y19eJe16zXVq7l292rgu6ijBKZzBBbhwCQ24hSa0gIGCZ3iFN/JEXsg7+ViMrpEicwJ/QD5/ACplkkU=</latexit><latexit sha1_base64="1lG3sUBkK4ERYdL3hQH7d+rY7jA=">AAACGnicjVA9SwNBEJ3zM8avqKXNYhCswl0aLYM2lgrmA5IzzO3tJUt2747dPSEc+R8WNv4VGxE7sfHfuEmu0MTCBwOP92aYeROkgmvjul/Oyura+sZmaau8vbO7t185OGzpJFOUNWkiEtUJUDPBY9Y03AjWSRVDGQjWDkZXU7/9wJTmSXxnxinzJQ5iHnGKxkr3BvWIDBSGnMWG9CtVt+bOQJaJV5AqFPhfe7/y0QsTmkm7gQrUuuu5qfFzVIZTwSblXqZZinSEA9a1NEbJtJ/Pok3IqVVCEiXKlr1wpv6cyFFqPZaB7ZRohnrRm4p/ed3MRBd+zuM0Myym80VRJohJyPRPJOSKUSPGliBV3N5K6BAVUmO/WbbRvcWgy6RVr3luzbutVxuXxc9KcAwncAYenEMDruEGmkBBwSM8w6vz5Lw4b877vHXFKWaO4Becz29EH5m+</latexit><latexit sha1_base64="1lG3sUBkK4ERYdL3hQH7d+rY7jA=">AAACGnicjVA9SwNBEJ3zM8avqKXNYhCswl0aLYM2lgrmA5IzzO3tJUt2747dPSEc+R8WNv4VGxE7sfHfuEmu0MTCBwOP92aYeROkgmvjul/Oyura+sZmaau8vbO7t185OGzpJFOUNWkiEtUJUDPBY9Y03AjWSRVDGQjWDkZXU7/9wJTmSXxnxinzJQ5iHnGKxkr3BvWIDBSGnMWG9CtVt+bOQJaJV5AqFPhfe7/y0QsTmkm7gQrUuuu5qfFzVIZTwSblXqZZinSEA9a1NEbJtJ/Pok3IqVVCEiXKlr1wpv6cyFFqPZaB7ZRohnrRm4p/ed3MRBd+zuM0Myym80VRJohJyPRPJOSKUSPGliBV3N5K6BAVUmO/WbbRvcWgy6RVr3luzbutVxuXxc9KcAwncAYenEMDruEGmkBBwSM8w6vz5Lw4b877vHXFKWaO4Becz29EH5m+</latexit><latexit sha1_base64="1lG3sUBkK4ERYdL3hQH7d+rY7jA=">AAACGnicjVA9SwNBEJ3zM8avqKXNYhCswl0aLYM2lgrmA5IzzO3tJUt2747dPSEc+R8WNv4VGxE7sfHfuEmu0MTCBwOP92aYeROkgmvjul/Oyura+sZmaau8vbO7t185OGzpJFOUNWkiEtUJUDPBY9Y03AjWSRVDGQjWDkZXU7/9wJTmSXxnxinzJQ5iHnGKxkr3BvWIDBSGnMWG9CtVt+bOQJaJV5AqFPhfe7/y0QsTmkm7gQrUuuu5qfFzVIZTwSblXqZZinSEA9a1NEbJtJ/Pok3IqVVCEiXKlr1wpv6cyFFqPZaB7ZRohnrRm4p/ed3MRBd+zuM0Myym80VRJohJyPRPJOSKUSPGliBV3N5K6BAVUmO/WbbRvcWgy6RVr3luzbutVxuXxc9KcAwncAYenEMDruEGmkBBwSM8w6vz5Lw4b877vHXFKWaO4Becz29EH5m+</latexit>
Figure 6. The Multitask Scaling Hypothesis: Models trained with an increasing number of tasks are subjected to pressure to learn a representation that can solve all the tasks.
Recent work has demonstrated a power law relationship
between data scale and model performance (Hestness et al.,
2017). This implies that with enough data (e.g., consisting
of the entire internet and all offline scientific measurements)
one ought to converge to a very small solution set with
irreducible error – the inherent epistemic uncertainty of the
world. As more models are trained on internet-scale data,
the set of solutions that satisfies all data constraints must
become relatively small.
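To make the shape of this argument concrete, the following minimal Python sketch fits a saturating power law, err(N) ≈ a · N^(−b) + c, to hypothetical (dataset size, error) measurements; the fitted asymptote c plays the role of the irreducible error. The numbers and names below are illustrative placeholders, not values from the cited work.

# A minimal sketch (not from the paper): fit err(N) ~ a * N**(-b) + c,
# where the asymptote c plays the role of the irreducible error.
import numpy as np
from scipy.optimize import curve_fit

def saturating_power_law(n, a, b, c):
    return a * n ** (-b) + c

# Hypothetical (dataset size, validation error) measurements.
n_samples = np.array([1e3, 1e4, 1e5, 1e6, 1e7])
val_error = np.array([0.62, 0.41, 0.29, 0.24, 0.22])

(a, b, c), _ = curve_fit(saturating_power_law, n_samples, val_error,
                         p0=[1.0, 0.3, 0.1], maxfev=10000)
print(f"power-law exponent b = {b:.2f}, estimated irreducible error c = {c:.3f}")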
In addition to data scaling, many modern representation learning objectives L(f, x) directly optimize for multi-task solving. Contrastive learning finds a distance structure over data samples that optimizes many classification tasks (e.g., Arora et al., 2019). Masked Autoencoders (He et al., 2022) optimize randomly sampled reconstruction tasks. In fact, autoregressive language modeling can also be seen as optimizing a diverse set of tasks (Radford et al., 2019). Such multi-task objectives may be more effective than single-task ones (e.g., ImageNet classification) because they impose more task constraints on the representation, leading to a smaller and higher-quality solution space (e.g., Chen et al., 2020).
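As a concrete instance of such a multi-task objective, the sketch below implements a generic InfoNCE-style contrastive loss in which every row of a batch defines its own classification task (pick the true positive among the in-batch negatives). It is a minimal illustration of the family of objectives discussed above, not the exact loss of any particular cited method.

# Minimal InfoNCE-style contrastive loss (a generic sketch, not any specific
# paper's implementation). Each row i defines a classification task:
# pick the positive z_pos[i] out of all candidates in the batch.
import numpy as np

def info_nce_loss(z_anchor, z_pos, temperature=0.1):
    """z_anchor, z_pos: (batch, dim) L2-normalized embeddings of positive pairs."""
    logits = z_anchor @ z_pos.T / temperature          # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The correct "class" for anchor i is its own positive, i.e. the diagonal.
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
z /= np.linalg.norm(z, axis=1, keepdims=True)
noisy = z + 0.05 * rng.normal(size=z.shape)
z_pos = noisy / np.linalg.norm(noisy, axis=1, keepdims=True)
print(f"loss on nearly-aligned pairs: {info_nce_loss(z, z_pos):.3f}")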
3.2. Convergence via Model Capacity
Suppose there is a globally optimal representation for standard learning objectives. Then, under sufficient data, scaling a model (i.e., using larger function classes F), as well as improved optimization, should be more effective at finding better approximations to this optimum, as illustrated in the accompanying figure. With the same training objective, larger models, even of different architectures, will thus tend to converge toward this optimum. When different training objectives share similar minimizers, larger models are better at finding these minimizers, and will train to similar solutions over the training tasks. We summarize this hypothesis as follows:
The Capacity Hypothesis
Bigger models are more likely to converge to a shared
representation than smaller models.
3.3. Convergence via Simplicity Bias
Arriving at the same mapping on the training data does not prohibit the models from developing distinct internal representations. It is not unreasonable to posit that the representations used to detect a dog in a 1M-parameter model could be quite different from those used by a 1B-parameter model. What would stop a billion-parameter (and counting) model from learning an overly complicated and distinct representation? One key factor might be simplicity bias:
[Figure 7 diagram; extracted labels: "Solves task 1," "Solves task 2," "Hypothesis space," "Simple functions," "Functions that solve the tasks," "task gradient," "simplicity bias."]
Figure 7. The Simplicity Bias Hypothesis: Larger models have larger coverage of all possible ways to fit the same data. However, the implicit simplicity biases of deep networks encourage larger models to find the simplest of these solutions.
The Simplicity Bias Hypothesis
Deep networks are biased toward finding simple fits
to the data, and the bigger the model, the stronger
the bias. Therefore, as models get bigger, we should
expect convergence to a smaller solution space.
Such simplicity bias could be coming from explicit regularization R(f) commonly used in deep learning (e.g., weight decay and dropout). However, even in the absence of external influences, deep networks naturally adhere to Occam's razor, implicitly favoring simple solutions that fit the data (e.g., Solomonoff, 1964). Figure 7 illustrates how simplicity bias can drive convergence.
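The explicit component of this bias is straightforward to write down. Below is a minimal sketch of adding an L2 penalty (weight decay), R(f) = λ‖θ‖², to a task loss; the model, data, and λ are illustrative placeholders.

# A sketch of explicit regularization R(f): an L2 (weight decay) penalty added
# to a task loss. Model, data, and lambda here are illustrative placeholders.
import numpy as np

def task_loss(weights, X, y):
    # Simple least-squares fit as a stand-in for "the task".
    return np.mean((X @ weights - y) ** 2)

def regularized_loss(weights, X, y, lam=1e-2):
    r_f = lam * np.sum(weights ** 2)        # R(f): penalizes complex solutions
    return task_loss(weights, X, y) + r_f

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = np.array([1.0, 0.0, 0.0, 0.0, 0.0])   # a "simple" underlying function
y = X @ true_w + 0.1 * rng.normal(size=100)

w = rng.normal(size=5)
print(f"unregularized: {task_loss(w, X, y):.3f}  regularized: {regularized_loss(w, X, y):.3f}")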
4. What representation are we converging to?
By now, we hope to have convinced the reader that task and data pressures, combined with increasing model capacity, can lead to convergence. We next turn our attention to what exactly is the endpoint of all this convergence.
Our central hypothesis, stated in Figure 1, is that the representation we are converging toward is a statistical model of the underlying reality that generates our observations. Consistent with the multitask scaling hypothesis, such a representation would naturally be useful toward many tasks (or at least toward any task grounded in reality). Additionally, this representation might be relatively simple, assuming that scientists are correct in suggesting that the fundamental laws of nature are indeed simple functions (Gell-Mann), in line with the simplicity bias hypothesis.
But what exactly do we mean by "a statistical model of the underlying reality"? In this section, we formalize one definition with concrete mathematical statements. Importantly, this section should be read as just one concrete candidate for the form of the platonic representation; other candidates could be arrived at from other modeling assumptions.
4.1. An idealized world
We consider a world that works as follows, consistent with the cartoon in Figure 1. The world consists of a sequence of T discrete events, denoted as Z ≜ [z_1, . . . , z_T], sampled from some unknown distribution P(Z). Each event can be observed in various ways. An observation is a bijective, deterministic function obs : Z → · that maps events to an arbitrary measurement space, such as pixels, sounds, mass, force, torque, words, etc. Later, in Section 6, we discuss limitations and potential extensions to continuous and unbounded worlds, and stochastic observations, that could yield a model that better reflects real learning scenarios.
One can think of an event as corresponding to the state of the world at some point in time³, but it is also fine to simply consider an event as any variable that indexes observations, with no further physical meaning⁴.
In this idealized world, knowing P(Z) would be useful for many kinds of predictions; this would constitute a world model over the events that cause our observations (e.g., Werbos, 1987).
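To make this setup concrete, the following toy instantiation (ours, with an arbitrary transition matrix) samples a sequence of discrete events from a Markovian stand-in for P(Z) and observes it through two bijective "modalities" that simply relabel the event alphabet.

# A toy instantiation of the idealized world (illustrative, not from the paper):
# discrete events z_1..z_T from a Markov chain standing in for P(Z), observed
# through two bijective observation functions ("modalities") that relabel events.
import numpy as np

rng = np.random.default_rng(0)
num_events, T = 4, 10_000

# An arbitrary Markov transition matrix standing in for P(Z).
P = rng.dirichlet(alpha=np.ones(num_events), size=num_events)

z = np.zeros(T, dtype=int)
for t in range(1, T):
    z[t] = rng.choice(num_events, p=P[z[t - 1]])

# Two bijective, deterministic observation functions obs: Z -> X and obs: Z -> Y.
obs_x = np.array([2, 0, 3, 1])      # a permutation of event labels ("pixels")
obs_y = np.array([1, 3, 0, 2])      # a different permutation ("words")
x, y = obs_x[z], obs_y[z]

print("first events:       ", z[:8])
print("modality X (pixels):", x[:8])
print("modality Y (words): ", y[:8])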
We will next show that a particular representation of P(Z) is recovered by certain contrastive learners.
4.2. A family of contrastive learners converge to a representation of P(Z)
Consider a contrastive learner that models observations that cooccur together. For simplicity, we ground our discussion with the following definition of the cooccurrence probability, P_coor, of two observations x_a and x_b both occurring within some window T_window:
P_coor(x_a, x_b) ∝ Σ_{(t, t′) : |t − t′| ≤ T_window} P(X_t = x_a, X_{t′} = x_b).
Analogously, we can define P_coor for Z and other observation modalities. Note that P_coor is symmetric.
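This definition can be estimated directly from a sampled observation sequence by counting symbol pairs that fall within the window; a minimal sketch with placeholder data and our own variable names:

# Empirical estimate of P_coor from a discrete observation sequence:
# count pairs (x_t, x_t') with |t - t'| <= T_window (ignoring the t = t' terms),
# then normalize.
import numpy as np

def empirical_p_coor(seq, num_symbols, t_window):
    counts = np.zeros((num_symbols, num_symbols))
    for offset in range(1, t_window + 1):
        a, b = seq[:-offset], seq[offset:]
        np.add.at(counts, (a, b), 1.0)
        np.add.at(counts, (b, a), 1.0)   # include both orderings: P_coor is symmetric
    return counts / counts.sum()

rng = np.random.default_rng(0)
seq = rng.integers(0, 4, size=50_000)     # placeholder observation sequence
p_coor = empirical_p_coor(seq, num_symbols=4, t_window=3)
assert np.allclose(p_coor, p_coor.T)      # symmetry, as noted above
print(p_coor.round(3))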
³ Here we only analyze temporal sequences, but note that the same could be done with respect to events laid out in space instead.
⁴ This latter interpretation may be more consistent with Plato's intent. Scholars have argued that his allegory of the cave rejects any notion of a true world state (Nettleship). Instead, we could say that the joint distribution of observation indices is itself the platonic reality.
[Figure 8 panels, from LEFT to RIGHT: PERCEPTION (From Human Perception), VISION (From Pixel Pointwise Mutual Information), LANGUAGE (From Masked Language Contrastive Learning, SimCSE), LANGUAGE (From Masked Language Predictive Learning, RoBERTa).]
Figure 8. Color cooccurrence in VISION and LANGUAGE yields perceptual organization: Similar representations of color are obtained via, from LEFT to RIGHT, the perceptual layout from CIELAB color space, cooccurrence in CIFAR-10 images, and language cooccurrence modeling (Gao et al., 2021; Liu et al., 2019), computed roughly following Abdou et al. (2021). Details in Appendix.
Consider positive pairs as two observations nearby in time (sampled from P_coor) and negative pairs as observations drawn from any point in time (sampled independently from the marginal). Our contrastive learner tries to classify whether a pair is positive or negative by learning a representation f_X : X → R^d such that the dot-product kernel approximates the log odds ratio up to some offset:
⟨f_X(x_a), f_X(x_b)⟩ ≈ log [ P(pos | x_a, x_b) / P(neg | x_a, x_b) ] + c̃_X(x_a)   (3)
= log [ P_coor(x_a | x_b) / P_coor(x_a) ] + c_X(x_a)   (4)
= K_PMI(x_a, x_b) + c_X(x_a),   (5)
where K_PMI is the pointwise mutual information (PMI) kernel, and c_X(x_a) is constant in x_b. We note that this is a common setting for self-supervised contrastive learners with NCE objectives (Gutmann & Hyvärinen, 2010), such as SimCLR (Chen et al., 2020) and SimCSE (Gao et al., 2021). (See the Appendix for details.)
Under mild conditions that the world is smooth enough (see Appendix), a choice of f_X can exactly represent K_PMI:
⟨f_X(x_a), f_X(x_b)⟩ = K_PMI(x_a, x_b) + c_X,   (6)
where we observed that c_X(x_a) from Equation (5) must be a constant since both sides are symmetric.
Therefore, the contrastive learners we consider are minimized by a representation f_X whose kernel is K_PMI (up to a constant offset). With sufficient data and optimization, we will observe convergence to this point.
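The identity behind Equations (3) to (5) can be checked numerically on a toy discrete world: with equally many positive pairs (drawn from P_coor) and negative pairs (drawn from the product of marginals), the empirical log odds of a pair being positive matches K_PMI up to an additive constant (zero here, since P(pos) = P(neg)). The cooccurrence table below is synthetic and purely illustrative.

# Monte-Carlo check of Eqs. (3)-(5): estimate the log odds that a pair is
# "positive" from samples and compare it to the PMI kernel K_PMI.
import numpy as np

rng = np.random.default_rng(0)
n, num_pairs = 4, 1_000_000

C = rng.random((n, n)); p_coor = (C + C.T) / (C + C.T).sum()   # symmetric toy P_coor
p_marg = p_coor.sum(axis=1)
k_pmi = np.log(p_coor / np.outer(p_marg, p_marg))

# Positive pairs ~ P_coor; negative pairs ~ product of marginals.
flat = rng.choice(n * n, size=num_pairs, p=p_coor.ravel())
pos = np.stack(np.unravel_index(flat, (n, n)), axis=1)
neg = np.stack([rng.choice(n, size=num_pairs, p=p_marg),
                rng.choice(n, size=num_pairs, p=p_marg)], axis=1)

pos_counts = np.zeros((n, n)); np.add.at(pos_counts, (pos[:, 0], pos[:, 1]), 1)
neg_counts = np.zeros((n, n)); np.add.at(neg_counts, (neg[:, 0], neg[:, 1]), 1)
log_odds = np.log(pos_counts / neg_counts)    # empirical log P(pos|pair)/P(neg|pair)

print("max |log-odds - K_PMI| =", np.abs(log_odds - k_pmi).max())   # small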
Thus we have convergence to a representation of the statistics of X, but what about Z? Recall that our idealized world consists of bijective observation functions, which, over discrete random variables, preserve probabilities. So we have:
P_coor(x_a, x_b) = P_coor(z_a, z_b)
K_PMI(x_a, x_b) = K_PMI(z_a, z_b),
where we use P_coor and K_PMI in a modality-agnostic way to emphasize that different modalities share the same such quantities.
All these arguments hold not just for X but also for Y (or any other bijective, discrete modality), implying:
K_PMI(z_a, z_b) = ⟨f_X(x_a), f_X(x_b)⟩ − c_X   (7)
= ⟨f_Y(y_a), f_Y(y_b)⟩ − c_Y.   (8)
Therefore, for any modality in our idealized world, we observe representational convergence to the same kernel, which represents certain pairwise statistics of P(Z).
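The modality-agnostic claim in Equations (7) and (8) can likewise be illustrated on a toy alphabet: a bijective relabeling of the observations (a different "modality") permutes the cooccurrence table but leaves the PMI kernel between corresponding events unchanged. The permutations below are arbitrary stand-ins for the observation functions.

# Bijective observation functions preserve the PMI kernel: relabeling the
# discrete alphabet permutes K_PMI's rows/columns, but the kernel values between
# corresponding events are identical across modalities (Eqs. 7-8).
import numpy as np

def pmi_kernel(p_coor):
    p_marg = p_coor.sum(axis=1)
    return np.log(p_coor / np.outer(p_marg, p_marg))

rng = np.random.default_rng(0)
n = 5
C = rng.random((n, n)); p_coor_z = (C + C.T) / (C + C.T).sum()  # toy P_coor over events Z

perm_x = rng.permutation(n)       # obs: Z -> X, a bijective relabeling
perm_y = rng.permutation(n)       # obs: Z -> Y, another modality
inv_x, inv_y = np.argsort(perm_x), np.argsort(perm_y)

# Cooccurrence tables in observation space: P_coor(x_a, x_b) = P_coor(z_a, z_b).
p_coor_x = p_coor_z[np.ix_(inv_x, inv_x)]
p_coor_y = p_coor_z[np.ix_(inv_y, inv_y)]

k_z, k_x, k_y = pmi_kernel(p_coor_z), pmi_kernel(p_coor_x), pmi_kernel(p_coor_y)

# Reading the kernels back in event coordinates recovers the same K_PMI(z_a, z_b).
assert np.allclose(k_x[np.ix_(perm_x, perm_x)], k_z)
assert np.allclose(k_y[np.ix_(perm_y, perm_y)], k_z)
print("K_PMI is identical across modalities (up to relabeling).")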
This analysis suggests that certain representation learning algorithms may boil down to a simple rule: find an embedding in which similarity equals PMI. We note that this idea is consistent with prior works that have used PMI as a similarity measure for clustering in vision and language (e.g., Chambers & Jurafsky, 2008).
A study in color
We conduct a case study to verify that convergence does happen on real data. Abdou et al. (2021) discovered that color distances in learned language representations, when trained to predict cooccurrences in text (Devlin et al., 2019), closely mirror human perception of these distances, which we reproduce in Figure 8 for contrastive and predictive models. Interestingly, they noted an increasing similarity as models scale larger and become better at modeling text cooccurrences. In Figure 8, we also learn representations of color based on K_PMI from cooccurrences in images. Indeed, learning cooccurrence statistics in either domain recovers roughly the same perceptual representation. Details of this experiment are described in the Appendix.
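For readers who want the flavor of the image-side computation, the rough sketch below quantizes pixel colors, counts spatial cooccurrences within images, forms K_PMI, and embeds the colors in 2D with classical MDS. The input array, bin count, and window are placeholders; the authors' exact procedure is the one described in their Appendix.

# Rough sketch of the image-side color experiment (our own, with placeholder
# data): quantize pixel colors, count spatial cooccurrences, form K_PMI, and
# embed colors in 2D with classical MDS for visual comparison to perception.
import numpy as np

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(32, 16, 16, 3), dtype=np.uint8)  # stand-in for CIFAR-10

def quantize(imgs, bins=4):
    q = (imgs // (256 // bins)).astype(int)                        # per-channel bins
    return q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]  # single color index

codes = quantize(images)                                           # (N, H, W) color codes
num_colors = codes.max() + 1

# Count cooccurrences of horizontally adjacent pixels (a crude spatial window).
counts = np.zeros((num_colors, num_colors))
np.add.at(counts, (codes[:, :, :-1].ravel(), codes[:, :, 1:].ravel()), 1.0)
counts = counts + counts.T + 1e-3                                  # symmetrize + smooth
p_coor = counts / counts.sum()
p_marg = p_coor.sum(axis=1)
k_pmi = np.log(p_coor / np.outer(p_marg, p_marg))

# Classical MDS on the centered kernel: top-2 eigenvectors give a 2D color layout.
J = np.eye(num_colors) - 1.0 / num_colors
K = J @ k_pmi @ J
vals, vecs = np.linalg.eigh(K)
layout_2d = vecs[:, -2:] * np.sqrt(np.maximum(vals[-2:], 0))
print("2D embedding of colors:", layout_2d.shape)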
We believe that our simple model encapsulates essential aspects of complex real-world systems, and offers a path toward understanding the representation that models are converging to: a unified model that is proficient across various domains and modalities, grounded in the statistical properties of the underlying world. Section 6 discusses some limitations.
5. What are the implications of convergence?
Scaling is sufficient, but not necessarily efficient
Our arguments are roughly in line with the claim that "scale is all you need" to reach high levels of intelligence. We have argued that as resources are scaled (# parameters, # datapoints, # flops), representations are converging, regardless of other modeling choices and even data modality. Does this mean that scale is all that matters? Not quite: different methods can scale with different levels of efficiency (Hestness et al., 2017), and successful methods must still satisfy some general requirements (e.g., be a consistent estimator, model pairwise statistics of P(Z)).
Training data can be shared across modalities
Suppose you have access to N images and M sentences, and want to learn the best representation. If there is indeed a modality-agnostic platonic representation, then the image data should help find it, and so should the language data. The implication is that if you want to train the best vision model, you should train not just on N images but also on M sentences. This is already becoming common practice (Achiam et al., 2023). Many vision models are fine-tuned from pre-trained LLMs. The other direction is less common, but is also implied by our hypothesis: if you want to build the best LLM, you should also train it on image data. Indeed, recent work (2023) claims evidence that this is true, where training on images improved performance on text. In theory, there should be some conversion ratio: a pixel is worth a words for training LLMs, and a word is worth b pixels for training vision models.
Ease of translation and adaptation across modalities. When two representations are aligned, transitioning from one to the other should be a simple function that's easily obtained. Our hypothesis could explain the phenomenon that conditional generation is easier than unconditional (Mirza & Osindero, 2014), as the data we condition on may have the same platonic structure as the data we are generating. In line with this, recent work has found that representation-conditioning is even easier (Li et al., 2023). Similarly, representational convergence could act as a bridge that lets us find mappings between domains even without paired data; this may underlie the success of unpaired translation in vision (Zhu et al., 2017; Xie et al., 2022; Shi et al., 2024) and language (Tran et al., 2017; Lample et al., 2018). We emphasize that this doesn't mean that models trained on a single modality (e.g., language) can immediately process raw data from another (e.g., vision). What makes them adaptable to the new modalities is that they share a common modality-agnostic representation, and can readily process representations of new modalities. Furthermore, this implies that language models would achieve some notion of grounding in the visual domain even in the absence of cross-modal data (see footnote 5). The primary advantage of cross-modal data could then simply be sample efficiency.
Scaling may reduce hallucination and bias. A prominent shortcoming of current LLMs is their propensity to hallucinate, or output false statements. If models are indeed converging toward an accurate model of reality, and scale powers this convergence, then we may expect hallucinations to decrease with scale. Of course, our hypothesis is conditioned on the training data for future models constituting a sufficiently lossless and diverse set of measurements. This may not come to pass, but it is an implication of our hypothesis worth pointing out. A similar argument can be made about certain kinds of bias. It has been shown that large models can exacerbate existing biases present in their training data (Hall et al., 2022). Our hypothesis implies that, while this may be true, we should expect larger models to amplify bias less. This does not mean bias will be removed; rather, the model's biases will more accurately reflect the data's biases, rather than exacerbating them.
6. Counterexamples and limitations
Different modalities may contain different information. One immediate objection to our hypothesis is: what about the information that is unique to a given modality? Can language really describe the ineffable experience of watching a total solar eclipse? Or, how could an image convey a concept like "I believe in the freedom of speech," which is easy to write in English? Two different models cannot converge to the same representation if they have access to fundamentally different information.
More precisely, our mathematical argument above only strictly holds for bijective projections of Z, so that the information in all the projections is equivalent to the information in the underlying world. This will not hold true for either lossy or stochastic observation functions. Nonetheless, similar arguments have been made theoretically and empirically that cooccurrence relations are learned by practical contrastive (Wang & Isola, 2020; Zimmermann et al., 2021) and predictive learners (Papyan et al., 2020).
Footnote 5: In 1688, William Molyneux posed the question: could someone born blind, upon being given sight, be able to distinguish shapes by vision alone? (Locke, 1690). Our arguments suggest an answer: not immediately, but after a bit of visual experience (to form a visual representation) it should be easy (by mapping to prior touch-based representations). Empirical data shows that indeed congenitally blind children given sight can quickly learn such abilities (Held et al., 2011).
Figure 9. Increasing caption density improves alignment: We vary caption length using the Densely-Captioned-Images (DCI) dataset (Urbanek et al., 2023). Starting from a dense caption, we used LLaMA3-8B-Instruct (Meta, 2024) to summarize and generate coarse-grained captions. We compute the average alignment score across all vision and language models, with standard deviation measured over the language models we evaluated. With denser captions, the mapping may become more bijective, leading to improved language-vision alignment scores. (Plot: alignment to vision versus DCI caption density at 5, 10, 20, and 30 words, for ImageNet21K, MAE, DINOv2, CLIP, and CLIP (I12K ft).)
Lu et al. (2021) and Mirchandani et al. (2023) also showed that models trained to autoregressively generate text also capture statistical relations in many other modalities, including symbolic reasoning, vision, protein folding, and robotics.
A more nuanced version of our hypothesis will need to be developed to handle the case of non-bijective observations and abstract concepts. A starting point could be: different models will converge to the same representation when the input signals are sufficiently high information and the models are sufficiently high capacity; when they are not, the lower-information representation will only align with the higher-information one up to a level capped by the mutual information between the input signals and by the capacity of each model. This cap might or might not be practically important. Popular representations like CLIP are explicitly optimized to only capture the shared information between vision and language, yet are highly successful on many pure vision tasks. We perform a preliminary test of the effect of information level (Figure 9), and find that the more descriptive (higher information) a caption is, the better its LLM representation aligns with the visual representation of the corresponding image.
Not all representations are presently converging. Our argument has mainly focused on two modalities: vision
and language. While we do expect other modalities will
follow similar trends, we have yet to see the same level of
convergence across all domains. For example, in robotics
there is not yet a standardized approach to representing
world states in the same way as there is for representing
images and text. One limitation lies in the hardware used in
robotics, which is often expensive and slow. This creates a
bottleneck in the quantity and diversity of training data.
Sociological bias in producing AI models. Researcher bias and collective preferences within the AI community
have shaped the trajectory of model development. There
is often an explicit or implicit goal of designing AI sys-
tems that mimic human reasoning and performance, and
this could lead to convergence toward human-like represen-
tations even if other kinds of intelligence are in fact possible.
Additionally, the “hardware lottery” (Hooker, 2021) sug-
gests that the success of AI models can also depend on the
compatibility of their design with available computational
architectures, further contributing to convergent trends.
Special-purpose intelligences might not converge. Different intelligent systems can be designed to accomplish different tasks. For instance: a bioinformatics system might predict protein structure; an autonomous vehicle might follow lanes on highways. It's possible that not much is shared between these two narrow tasks. Our argument only holds for intelligences that are optimized to perform well on many tasks. We have argued that a representation of reality is
a structure that is useful across many tasks, but for any
special purpose there may be shortcuts, or even effective
representations detached from reality. Such shortcuts may
be more efficient and necessary for continued improvements
in specific domains. This will become more relevant if con-
tinued scaling comes up against boundary conditions around
resources like energy and compute.
How do we measure alignment? We focused on one particular alignment measure, mutual nearest-neighbor, in our experiments, and cited experiments using several others. However, there is active debate on the merits and deficiencies of all these ways of measuring alignment (Bansal et al., 2021). We discuss our choice and show results for other alignment metrics in Appendices A and B.
Lots left to explain. We have shown results where different models arrive at similar but not the same representations. For example, alignment clearly increases but only reaches a score of 0.16, according to our mutual nearest-neighbor metric. The maximum theoretical value for this metric is 1. Is a score of 0.16 indicative of strong alignment with the remaining gap being "noise," or does it signify poor alignment with major differences left to explain? We leave this as an open question.
Acknowledgements
We thank those who helped with the experiments shown in our figures. We thank the anonymous reviewers for helpful feedback, and for providing the counterexample on how to visually convey "I believe in the freedom of speech." Thanks to Yonglong Tian, Dilip Kr-
ishnan, Anna Decker, Yoon Kim, Jyo Pari, Ani Nrusimha,
Dave Epstein, Victor Butoi, and Seungwook Han for help-
ful discussions and suggestions. This work was supported
by a Packard Fellowship and a Sloan Research Fellowship
to P.I., by the MIT-IBM Watson AI Lab, by ONR MURI
grant N00014-22-1-2740, by the Center for Brains, Minds,
and Machines, the MIT Quest for Intelligence, NSF STC
award CCF-1231216, the DARPA Knowledge Management
at Scale and Speed (KMASS) program, and the DARPA
Machine Common Sense (MCS) program.
References
Abdou, M., Kulmizev, A., Hershcovich, D., Frank, S.,
Pavlick, E., and Søgaard, A. Can language models encode
perceptual structure without grounding? a case study in
color.arXiv preprint arXiv:2109.06129, 2021.
Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I.,
Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S.,
Anadkat, S., et al. Gpt-4 technical report.arXiv preprint
arXiv:2303.08774, 2023.
Ainsworth, S. K., Hayase, J., and Srinivasa, S. Git re-basin:
Merging models modulo permutation symmetries.arXiv
preprint arXiv:2209.04836, 2022.
Anil, R., Borgeaud, S., Wu, Y., Alayrac, J.-B., Yu, J., Sori-
cut, R., Schalkwyk, J., Dai, A. M., Hauth, A., et al. Gem-
ini: a family of highly capable multimodal models.arXiv
preprint arXiv:2312.11805, 2023.
Aronszajn, N. Theory of reproducing kernels.Transactions
of the American mathematical society, 68(3):337–404,
1950.
Arora, S., Cohen, N., Hu, W., and Luo, Y. Implicit regular-
ization in deep matrix factorization.Advances in Neural
Information Processing Systems, 32, 2019a.
Arora, S., Khandeparkar, H., Khodak, M., Plevrakis, O.,
and Saunshi, N. A theoretical analysis of contrastive
unsupervised representation learning.arXiv preprint
arXiv:1902.09229, 2019b.
Balestriero, R. and Baraniuk, R. G. A spline theory of
deep learning. InInternational Conference on Machine
Learning, pp. 374–383. PMLR, 2018.
Bansal, Y., Nakkiran, P., and Barak, B. Revisiting model
stitching to compare neural representations.Advances
in neural information processing systems, 34:225–236,
2021.
Baradad, M., Wulff, J., Wang, T., Isola, P., and Torralba,
A. Learning to see by looking at noise. InAdvances in
Neural Information Processing Systems, 2021.
Baradad, M., Chen, R., Wulff, J., Wang, T., Feris, R., Tor-
ralba, A., and Isola, P. Procedural image programs for
representation learning.Advances in Neural Information
Processing Systems, 35:6450–6462, 2022.
Barlow, H. B. et al. Possible principles underlying the trans-
formation of sensory messages.Sensory communication,
1(01):217–233, 1961.
Betker, J., Goh, G., Jing, L., Brooks, T., Wang, J., Li, L.,
Ouyang, L., Zhuang, J., Lee, J., Guo, Y., et al. Improving
image generation with better captions. Computer Science. https://cdn.openai.com/papers/dall-e-3.pdf, 2(3):
8, 2023.
BigScience, Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., Castagné, R., Luccioni, A. S., Yvon, F.,
et al. Bloom: A 176b-parameter open-access multilingual
language model.arXiv preprint arXiv:2211.05100, 2022.
Bommasani, R., Hudson, D. A., Adeli, E., Altman, R.,
Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosse-
lut, A., Brunskill, E., et al. On the opportunities and risks
of foundation models.arXiv preprint arXiv:2108.07258,
2021.
Brohan, A., Brown, N., Carbajal, J., Chebotar, Y., Chen,
X., Choromanski, K., Ding, T., Driess, D., Dubey, A.,
Finn, C., et al. Rt-2: Vision-language-action models
transfer web knowledge to robotic control.arXiv preprint
arXiv:2307.15818, 2023.
Cao, R. and Yamins, D. Explanatory models in neuro-
science: Part 2–constraint-based intelligibility.Cognitive
Systems Research, 85, 2024.
Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J.,
Bojanowski, P., and Joulin, A. Emerging properties in
self-supervised vision transformers. InProceedings of the
IEEE/CVF international conference on computer vision,
pp. 9650–9660, 2021.
Chambers, N. and Jurafsky, D. Unsupervised learning of
narrative event chains. InProceedings of ACL-08: HLT,
pp. 789–797, 2008.
Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. A
simple framework for contrastive learning of visual rep-
resentations. InInternational conference on machine
learning, pp. 1597–1607. PMLR, 2020.
Cobbe, K., Kosaraju, V., Bavarian, M., Chen, M., Jun, H.,
Kaiser, L., Plappert, M., Tworek, J., Hilton, J., Nakano,
R., Hesse, C., and Schulman, J. Training verifiers to solve
math word problems.arXiv preprint arXiv:2110.14168,
2021.
Conwell, C., Prince, J. S., Kay, K. N., Alvarez, G. A., and
Konkle, T. What can 1.8 billion regressions tell us about
the pressures shaping high-level visual representation in
brains and machines?BioRxiv, pp. 2022–03, 2022.
Dettmers, T., Lewis, M., Belkada, Y., and Zettlemoyer, L.
Gpt3.int8(): 8-bit matrix multiplication for transformers
at scale.Advances in Neural Information Processing
Systems, 35:30318–30332, 2022.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. Bert:
Pre-training of deep bidirectional transformers for lan-
guage understanding.arXiv preprint arXiv:1810.04805,
2018.
Diamond, J. M.Guns, germs and steel: a short history of
everybody for the last 13,000 years. Vintage London,
1998.
Dingle, K., Camargo, C. Q., and Louis, A. A. Input–output
maps are strongly biased towards simple outputs.Nature
communications, 9(1):761, 2018.
Doppelt, G. Reconstructing scientific realism to rebut the
pessimistic meta-induction.Philosophy of Science, 74(1):
96–118, 2007.
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn,
D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M.,
Heigold, G., Gelly, S., et al. An image is worth 16x16
words: Transformers for image recognition at scale.arXiv
preprint arXiv:2010.11929, 2020.
Dravid, A., Gandelsman, Y., Efros, A. A., and Shocher, A.
Rosetta neurons: Mining the common units in a model
zoo. InProceedings of the IEEE/CVF International Con-
ference on Computer Vision, pp. 1934–1943, 2023.
Driess, D., Xia, F., Sajjadi, M. S., Lynch, C., Chowdhery,
A., Ichter, B., Wahid, A., Tompson, J., Vuong, Q., Yu, T.,
et al. Palm-e: An embodied multimodal language model.
arXiv preprint arXiv:2303.03378, 2023.
Gao, T., Yao, X., and Chen, D. SimCSE: Simple contrastive
learning of sentence embeddings. InEmpirical Methods
in Natural Language Processing (EMNLP), 2021.
Garipov, T., Izmailov, P., Podoprikhin, D., Vetrov, D. P.,
and Wilson, A. G. Loss surfaces, mode connectivity, and
fast ensembling of dnns.Advances in neural information
processing systems, 31, 2018.
Gell-Mann, M.The Quark and the Jaguar: Adventures in
the Simple and the Complex. Macmillan, 1995.
Geng, X. and Liu, H. OpenLLaMA: An open reproduction
of LLaMA, May 2023. URLhttps://github.com/
openlm-research/openllama.
Gokaslan, A. and Cohen, V. Openwebtext corpus.http://
Skylion007.github.io/OpenWebTextCorpus, 2019.
Goldblum, M., Finzi, M., Rowan, K., and Wilson, A. G.
The no free lunch theorem, Kolmogorov complexity, and
the role of inductive biases in machine learning.arXiv
preprint arXiv:2304.05366, 2023.
Gretton, A., Bousquet, O., Smola, A., and Schölkopf, B.
Measuring statistical dependence with hilbert-schmidt
norms. InInternational conference on algorithmic learn-
ing theory, pp. 63–77. Springer, 2005.
Groeneveld, D., Beltagy, I., Walsh, P., Bhagia, A., Kinney,
R., Tafjord, O., Jha, A. H., Ivison, H., Magnusson, I.,
Wang, Y., et al. Olmo: Accelerating the science of lan-
guage models.arXiv preprint arXiv:2402.00838, 2024.
Gunasekar, S., Lee, J. D., Soudry, D., and Srebro, N. Im-
plicit bias of gradient descent on linear convolutional
networks. InAdvances in Neural Information Processing
Systems, pp. 9461–9471, 2018.
Gutmann, M. and Hyvärinen, A. Noise-contrastive estima-
tion: A new estimation principle for unnormalized statisti-
cal models. InProceedings of the thirteenth international
conference on artificial intelligence and statistics, pp.
297–304. JMLR Workshop and Conference Proceedings,
2010.
Ha, D. and Schmidhuber, J. World models.arXiv preprint
arXiv:1803.10122, 2018.
Hall, M., van der Maaten, L., Gustafson, L., Jones, M., and
Adcock, A. A systematic study of bias amplification.
arXiv preprint arXiv:2201.11706, 2022.
Hardin, C. L. and Rosenberg, A. In defense of convergent
realism.Philosophy of Science, 49(4):604–615, 1982.
He, K., Fan, H., Wu, Y., Xie, S., and Girshick, R. Mo-
mentum contrast for unsupervised visual representation
learning. InProceedings of the IEEE/CVF conference on
computer vision and pattern recognition, pp. 9729–9738,
2020.
He, K., Chen, X., Xie, S., Li, Y., Dollár, P., and Girshick,
R. B. Masked autoencoders are scalable vision learners.
2022 ieee. InCVF Conference on Computer Vision and
Pattern Recognition (CVPR), pp. 15979–15988, 2021.
Held, R., Ostrovsky, Y., de Gelder, B., Gandhi, T., Ganesh,
S., Mathur, U., and Sinha, P. The newly sighted fail to
match seen with felt.Nature neuroscience, 14(5):551–
553, 2011.
Hestness, J., Narang, S., Ardalani, N., Diamos, G., Jun, H.,
Kianinejad, H., Patwary, M. M. A., Yang, Y., and Zhou, Y.
Deep learning scaling is predictable, empirically.arXiv
preprint arXiv:1712.00409, 2017.
Hooker, S. The hardware lottery.Communications of the
ACM, 64(12):58–65, 2021.
Huh, M., Mobahi, H., Zhang, R., Cheung, B., Agrawal,
P., and Isola, P. The low-rank simplicity bias in deep
networks.Transactions on Machine Learning Research,
2023. ISSN 2835-8856. URLhttps://openreview.
net/forum?id=bCiNWDmlY2.
Isola, P. The discovery of perceptual structure from visual
co-occurrences in space and time. InMIT Ph.D. Thesis,
2015.
Isola, P., Zoran, D., Krishnan, D., and Adelson, E. H. Crisp
boundary detection using pointwise mutual information.
InECCV, 2014.
Isola, P., Zoran, D., Krishnan, D., and Adelson, E. H. Learn-
ing visual groups from co-occurrences in space and time.
InICLR, Workshop paper, 2016.
Jiang, A. Q., Sablayrolles, A., Mensch, A., Bamford, C.,
Chaplot, D. S., Casas, D. d. l., Bressand, F., Lengyel, G.,
Lample, G., Saulnier, L., et al. Mistral 7b.arXiv preprint
arXiv:2310.06825, 2023.
Jiang, A. Q., Sablayrolles, A., Roux, A., Mensch, A., Savary,
B., Bamford, C., Chaplot, D. S., Casas, D. d. l., Hanna,
E. B., Bressand, F., et al. Mixtral of experts.arXiv
preprint arXiv:2401.04088, 2024.
Jordan, K., Sedghi, H., Saukh, O., Entezari, R., and
Neyshabur, B. Repair: Renormalizing permuted
activations for interpolation repair.arXiv preprint
arXiv:2211.08403, 2022.
Kabsch, W. A solution for the best rotation to relate two sets
of vectors.Acta Crystallographica Section A: Crystal
Physics, Diffraction, Theoretical and General Crystallog-
raphy, 32(5):922–923, 1976.
Kabsch, W. A discussion of the solution for the best rotation
to relate two sets of vectors.Acta Crystallographica
Section A: Crystal Physics, Diffraction, Theoretical and
General Crystallography, 34(5):827–828, 1978.
Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B.,
Chess, B., Child, R., Gray, S., Radford, A., Wu, J., and
Amodei, D. Scaling laws for neural language models.
arXiv preprint arXiv:2001.08361, 2020.
Klabunde, M., Schumacher, T., Strohmaier, M., and Lem-
merich, F. Similarity of neural network models: A sur-
vey of functional and representational measures.arXiv
preprint arXiv:2305.06329, 2023.
Koh, J. Y., Salakhutdinov, R., and Fried, D. Grounding
language models to images for multimodal inputs and
outputs. InInternational Conference on Machine Learn-
ing, pp. 17283–17300. PMLR, 2023.
Kornblith, S., Norouzi, M., Lee, H., and Hinton, G. Sim-
ilarity of neural network representations revisited. In
International conference on machine learning, pp. 3519–
3529. PMLR, 2019.
Krizhevsky, A., Hinton, G., et al. Learning multiple layers
of features from tiny images. 2009.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. Imagenet
classification with deep convolutional neural networks.
Communications of the ACM, 60(6):84–90, 2017.
Lample, G., Ott, M., Conneau, A., Denoyer, L., and Ran-
zato, M. Phrase-based & neural unsupervised machine
translation. In Riloff, E., Chiang, D., Hockenmaier, J.,
and Tsujii, J. (eds.),Proceedings of the 2018 Conference
on Empirical Methods in Natural Language Processing,
Brussels, Belgium, October-November 2018. Association
for Computational Linguistics.
Lenc, K. and Vedaldi, A. Understanding image representa-
tions by measuring their equivariance and equivalence. In
Proceedings of the IEEE conference on computer vision
and pattern recognition, pp. 991–999, 2015.
Li, T., Katabi, D., and He, K. Return of unconditional
generation: A self-supervised representation generation
method.arXiv:2312.03701, 2023.
Lian, L., Li, B., Yala, A., and Darrell, T. LLM-grounded dif-
fusion: Enhancing prompt understanding of text-to-image
diffusion models with large language models.arXiv
preprint arXiv:2305.13655, 2023a.
Lian, L., Shi, B., Yala, A., Darrell, T., and Li, B.
LLM-grounded video diffusion models.arXiv preprint
arXiv:2309.17444, 2023b.
Lindsey, D. T. and Brown, A. M. The color lexicon of
american english.Journal of vision, 14(2):17–17, 2014.
Liu, H., Li, C., Wu, Q., and Lee, Y. J. Visual instruction
tuning. InNeurIPS, 2023.
Liu, S., Wang, T., Bau, D., Zhu, J.-Y., and Torralba, A.
Diverse image generation via self-conditioned GANs. In
Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR), 2020.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D.,
Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V.
Roberta: A robustly optimized bert pretraining approach.
arXiv preprint arXiv:1907.11692, 2019.
Locke, J.An Essay Concerning Human Understanding.
1690.
López-Cifuentes, A., Escudero-Vinolo, M., Bescós, J., and García-Martín, Á. Semantic-aware scene recognition.
Pattern Recognition, 102:107256, 2020.
Lu, K., Grover, A., Abbeel, P., and Mordatch, I. Pretrained
transformers as universal computation engines.arXiv
preprint arXiv:2103.05247, 1, 2021.
Lubana, E. S., Bigelow, E. J., Dick, R. P., Krueger, D., and
Tanaka, H. Mechanistic mode connectivity. InInter-
national Conference on Machine Learning, pp. 22965–
23004. PMLR, 2023.
Ma, J., He, Y., Li, F., Han, L., You, C., and Wang, B. Seg-
ment anything in medical images.Nature Communica-
tions, 15(1):654, 2024.
McInnes, L., Healy, J., and Melville, J. Umap: Uniform
manifold approximation and projection for dimension
reduction.arXiv preprint arXiv:1802.03426, 2018.
Merullo, J., Castricato, L., Eickhoff, C., and Pavlick, E. Lin-
early mapping from image to text space.arXiv preprint
arXiv:2209.15162, 2022.
Meta. Meta LLaMA 3, 2024. URLhttps://ai.meta.
com/blog/meta-llama-3/.
Mirchandani, S., Xia, F., Florence, P., Ichter, B., Driess, D.,
Arenas, M. G., Rao, K., Sadigh, D., and Zeng, A. Large
language models as general pattern machines.arXiv
preprint arXiv:2307.04721, 2023.
Mirza, M. and Osindero, S. Conditional generative adver-
sarial nets.arXiv preprint arXiv:1411.1784, 2014.
Moschella, L., Maiorca, V., Fumero, M., Norelli, A., Lo-
catello, F., and Rodolà, E. Relative representations enable
zero-shot latent space communication.arXiv preprint
arXiv:2209.15430, 2022.
Nagarajan, V. and Kolter, J. Z. Uniform convergence may
be unable to explain generalization in deep learning.Ad-
vances in Neural Information Processing Systems, 32,
2019.
Nettleship, R. L.Lectures on the ‘Republic’ of Plato, vol-
ume 2. Macmillan, 1897.
Newton-Smith, W.The Rationality of Science. Interna-
tional Library of Philosophy, Psychology, and Scien-
tific Method. Routledge & Kegan Paul, 1981. ISBN
9780710009135.
Ng, E., Subramanian, S., Klein, D., Kanazawa, A., Dar-
rell, T., and Ginosar, S. Can language models learn to
listen? InProceedings of the IEEE/CVF International
Conference on Computer Vision, pp. 10083–10093, 2023.
Ngo, J. and Kim, Y. What do language models hear?
probing for auditory representations in language mod-
els, 2024.
Olshausen, B. A. and Field, D. J. Emergence of simple-cell
receptive field properties by learning a sparse code for
natural images.Nature, 381(6583):607–609, 1996.
Olshausen, B. A. and Field, D. J. Sparse coding with an
overcomplete basis set: A strategy employed by v1?Vi-
sion research, 37(23):3311–3325, 1997.
Oord, A. v. d., Li, Y., and Vinyals, O. Representation learn-
ing with contrastive predictive coding.arXiv preprint
arXiv:1807.03748, 2018.
Oquab, M., Darcet, T., Moutakanni, T., Vo, H. V.,
Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D.,
Massa, F., El-Nouby, A., Howes, R., Huang, P.-Y., Xu,
H., Sharma, V., Li, S.-W., Galuba, W., Rabbat, M., As-
sran, M., Ballas, N., Synnaeve, G., Misra, I., Jegou, H.,
Mairal, J., Labatut, P., Joulin, A., and Bojanowski, P.
Dinov2: Learning robust visual features without supervi-
sion, 2023.
Oron, S., Dekel, T., Xue, T., Freeman, W. T., and Avidan,
S. Best-buddies similarity—robust template matching
using mutual nearest neighbors.IEEE transactions on
pattern analysis and machine intelligence, 40(8):1799–
1813, 2017.
Papyan, V., Han, X., and Donoho, D. L. Prevalence of
neural collapse during the terminal phase of deep learn-
ing training.Proceedings of the National Academy of
Sciences, 117(40):24652–24663, 2020.
Plato. Republic. c. 375 BC.
Putnam, H. Three kinds of scientific realism.The Philo-
sophical Quarterly (1950-), 32(128):195–200, 1982.
Radford, A., Jozefowicz, R., and Sutskever, I. Learning
to generate reviews and discovering sentiment.arXiv
preprint arXiv:1704.01444, 2017.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D.,
Sutskever, I., et al. Language models are unsupervised
multitask learners.OpenAI blog, 1(8):9, 2019.
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G.,
Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J.,
et al. Learning transferable visual models from natural
language supervision. InInternational conference on
machine learning, pp. 8748–8763. PMLR, 2021.
Raghu, M., Gilmer, J., Yosinski, J., and Sohl-Dickstein, J.
Svcca: Singular vector canonical correlation analysis for
deep learning dynamics and interpretability.Advances in
neural information processing systems, 30, 2017.
Richens, J. and Everitt, T. Robust agents learn causal world
models.ICLR, 2024.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh,
S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bern-
stein, M., et al. Imagenet large scale visual recognition
challenge.International journal of computer vision, 115:
211–252, 2015.
Sauer, A., Schwarz, K., and Geiger, A. StyleGAN-XL:
Scaling StyleGAN to large diverse datasets. InACM SIG-
GRAPH 2022 conference proceedings, pp. 1–10, 2022.
Schrimpf, M., Kubilius, J., Hong, H., Majaj, N. J., Rajaling-
ham, R., Issa, E. B., Kar, K., Bashivan, P., Prescott-Roy,
J., Geiger, F., et al. Brain-score: Which artificial neu-
ral network for object recognition is most brain-like?
BioRxiv, pp. 407007, 2018.
Sharma, P., Rott Shaham, T., Baradad, M., Fu, S.,
Rodriguez-Munoz, A., Duggal, S., Isola, P., and Torralba,
A. A vision check-up for language models. InarXiv
preprint, 2024.
Shepard, R. N. Multidimensional scaling, tree-fitting, and
clustering.Science, 210(4468):390–398, 1980.
Shi, Y., De Bortoli, V., Campbell, A., and Doucet, A. Diffu-
sion Schrödinger bridge matching. Advances in Neural
Information Processing Systems, 36, 2024.
Smola, A. J. and Schölkopf, B. Learning with kernels,
volume 4. Citeseer, 1998.
Solomonoff, R. J. A formal theory of inductive inference.
part i.Information and control, 7(1):1–22, 1964.
Song, L., Smola, A., Gretton, A., Bedo, J., and Borgwardt,
K. Feature selection via dependence maximization.Jour-
nal of Machine Learning Research, 13(5), 2012.
Srinivasan, K., Raman, K., Chen, J., Bendersky, M., and
Najork, M. Wit: Wikipedia-based image text dataset for
multimodal multilingual machine learning. InProceed-
ings of the 44th International ACM SIGIR Conference on
Research and Development in Information Retrieval, pp.
2443–2449, 2021.
Srivastava, A., Rastogi, A., Rao, A., Shoeb, A. A. M., Abid,
A., Fisch, A., Brown, A. R., Santoro, A., Gupta, A.,
Garriga-Alonso, A., et al. Beyond the imitation game:
Quantifying and extrapolating the capabilities of language
models.arXiv preprint arXiv:2206.04615, 2022.
Steinberg, E., Jung, K., Fries, J. A., Corbin, C. K., Pfohl,
S. R., and Shah, N. H. Language models are an effective
representation learning technique for electronic health
record data.Journal of biomedical informatics, 113:
103637, 2021.
Stoica, G., Bolya, D., Bjorner, J., Hearn, T., and Hoffman,
J. Zipit! merging models from different tasks without
training.arXiv preprint arXiv:2305.03053, 2023.
Sucholutsky, I., Muttenthaler, L., Weller, A., Peng, A., Bobu,
A., Kim, B., Love, B. C., Grant, E., Groen, I., Achterberg,
J., Tenenbaum, J. B., Collins, K. M., Hermann, K. L.,
Oktar, K., Greff, K., Hebart, M. N., Jacoby, N., Zhang, Q.,
Marjieh, R., Geirhos, R., Chen, S., Kornblith, S., Rane, S.,
Konkle, T., O’Connell, T. P., Unterthiner, T., Lampinen,
A. K., Müller, K.-R., Toneva, M., and Griffiths, T. L.
Getting aligned on representational alignment, 2023.
Team, G., Mesnard, T., Hardin, C., Dadashi, R., Bhupatiraju,
S., Pathak, S., Sifre, L., Rivière, M., Kale, M. S., Love,
J., et al. Gemma: Open models based on gemini research
and technology.arXiv preprint arXiv:2403.08295, 2024.
Tian, Y., Krishnan, D., and Isola, P. Contrastive multiview
coding. InComputer Vision–ECCV 2020: 16th European
Conference, Glasgow, UK, August 23–28, 2020, Proceed-
ings, Part XI 16, pp. 776–794. Springer, 2020a.
Tian, Y., Wang, Y., Krishnan, D., Tenenbaum, J. B., and
Isola, P. Rethinking few-shot image classification: a good
embedding is all you need? InComputer Vision–ECCV
2020: 16th European Conference, Glasgow, UK, August
23–28, 2020, Proceedings, Part XIV 16, pp. 266–282.
Springer, 2020b.
Tolstoy, L.Anna Karenina. The Russian Messenger, 1877.
Torralba, A., Fergus, R., and Freeman, W. T. 80 million tiny
images: A large data set for nonparametric object and
scene recognition.IEEE transactions on pattern analysis
and machine intelligence, 30(11):1958–1970, 2008.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi,
A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P.,
Bhosale, S., et al. LLaMA 2: Open foundation and fine-
tuned chat models.arXiv preprint arXiv:2307.09288,
2023.
Tran, D., Burda, Y., and Sutskever, I. Feature-matching
auto-encoders. 2017.
Umeyama, S. Least-squares estimation of transformation
parameters between two point patterns.IEEE Transac-
tions on Pattern Analysis & Machine Intelligence, 13(04):
376–380, 1991.
Urbanek, J., Bordes, F., Astolfi, P., Williamson, M., Sharma,
V., and Romero-Soriano, A. A picture is worth more than
77 text tokens: Evaluating CLIP-style models on dense
captions, 2023.
Valle-Perez, G., Camargo, C. Q., and Louis, A. A. Deep
learning generalizes because the parameter-function map
is biased towards simple functions. InInternational Con-
ference on Learning Representations, 2019.
Wang, T. and Isola, P. Understanding contrastive represen-
tation learning through alignment and uniformity on the
hypersphere. InInternational Conference on Machine
Learning, pp. 9929–9939. PMLR, 2020.
Werbos, P. J. Learning how the world works: Specifications
for predictive networks in robots and brains. InProceed-
ings of IEEE International Conference on Systems, Man
and Cybernetics, NY, 1987.
Wightman, R. PyTorch image models.https://github.
com/rwightman/pytorch-image-models, 2021.
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C.,
Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M.,
et al. Huggingface’s transformers: State-of-the-art natural
language processing.arXiv preprint arXiv:1910.03771,
2019.
Wortsman, M., Ilharco, G., Gadre, S. Y., Roelofs, R.,
Gontijo-Lopes, R., Morcos, A. S., Namkoong, H.,
Farhadi, A., Carmon, Y., Kornblith, S., et al. Model
soups: averaging weights of multiple fine-tuned mod-
els improves accuracy without increasing inference time.
InInternational Conference on Machine Learning, pp.
23965–23998. PMLR, 2022.
Wu, T.-H., Lian, L., Gonzalez, J. E., Li, B., and Darrell, T.
Self-correcting LLM-controlled diffusion models.arXiv
preprint arXiv:2311.16090, 2023.
Xie, S., Ho, Q., and Zhang, K. Unsupervised image-to-
image translation with density changing regularization.
Advances in Neural Information Processing Systems, 35:
28545–28558, 2022.
Yamins, D. L., Hong, H., Cadieu, C. F., Solomon, E. A.,
Seibert, D., and DiCarlo, J. J. Performance-optimized
hierarchical models predict neural responses in higher
visual cortex.Proceedings of the national academy of
sciences, 111(23):8619–8624, 2014.
Zellers, R., Holtzman, A., Bisk, Y., Farhadi, A., and Choi,
Y. HellaSwag: Can a machine really finish your sen-
tence? In Korhonen, A., Traum, D., and Màrquez,
L. (eds.),Proceedings of the 57th Annual Meeting of
the Association for Computational Linguistics, pp. 4791–
4800, Florence, Italy, July 2019. Association for Compu-
tational Linguistics. doi: 10.18653/v1/P19-1472. URL
https://aclanthology.org/P19-1472.
Zhai, X., Puigcerver, J., Kolesnikov, A., Ruyssen, P.,
Riquelme, C., Lucic, M., Djolonga, J., Pinto, A. S., Neu-
mann, M., Dosovitskiy, A., et al. The visual task adapta-
tion benchmark. 2019.
Zhang, R., Isola, P., Efros, A. A., Shechtman, E., and Wang,
O. The unreasonable effectiveness of deep features as a
perceptual metric. InProceedings of the IEEE conference
on computer vision and pattern recognition, pp. 586–595,
2018.
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., and Tor-
ralba, A. Places: A 10 million image database for scene
recognition.IEEE transactions on pattern analysis and
machine intelligence, 40(6):1452–1464, 2017.
Zhu, J.-Y., Park, T., Isola, P., and Efros, A. A. Unpaired
image-to-image translation using cycle-consistent adver-
sarial networks. InComputer Vision (ICCV), 2017 IEEE
International Conference on, 2017.
Zimmermann, R. S., Sharma, Y., Schneider, S., Bethge,
M., and Brendel, W. Contrastive learning inverts the
data generating process. InInternational Conference on
Machine Learning, pp. 12979–12990. PMLR, 2021.
A. Mutual k-Nearest Neighbor Alignment Metric
For two models with representations f, g, the mutual k-nearest neighbor metric measures the average overlap of their respective nearest neighbor sets. In this section, we refer to this metric as m_NN, which we formally define below.
For cross-modal domains, define (x_i, y_i) ∈ X as a sample from the data distribution X (e.g., an image-caption dataset). For the single-domain alignment measurements, the samples are equivalent, x_i = y_i (e.g., images for vision, and text for language). Let {x_i, y_i}_{i=1}^b be the corresponding mini-batch sampled from this data distribution. Then, given two model representations f and g, the corresponding features are ϕ_i = f(x_i) and ψ_i = g(y_i), where the collections of these features are denoted Φ = {ϕ_1, ..., ϕ_b} and Ψ = {ψ_1, ..., ψ_b}. Then for each feature pair (ϕ_i, ψ_i), we compute the respective nearest neighbor sets S(ϕ_i) and S(ψ_i).
$$d_{\mathrm{knn}}(\phi_i, \Phi \setminus \phi_i) = S(\phi_i) \qquad (9)$$
$$d_{\mathrm{knn}}(\psi_i, \Psi \setminus \psi_i) = S(\psi_i) \qquad (10)$$
where d_knn returns the set of indices of its k-nearest neighbors. Then we measure the average intersection via
$$m_{\mathrm{NN}}(\phi_i, \psi_i) = \frac{1}{k}\,\big|S(\phi_i) \cap S(\psi_i)\big| \qquad (11)$$
where |·| is the size of the intersection.
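A minimal NumPy sketch of this metric (our own paraphrase of Eqns. (9)–(11), not the released implementation) is given below; phi and psi are assumed to be the paired b × d feature matrices, and neighbors are taken under inner-product similarity, with any l2 normalization assumed to happen upstream:

```python
import numpy as np

def mutual_knn(phi: np.ndarray, psi: np.ndarray, k: int = 10) -> float:
    """Mutual k-NN alignment m_NN: average overlap of k-NN index sets (Eqns. 9-11)."""
    def knn_indices(feats):
        sim = feats @ feats.T                     # inner-product similarity
        np.fill_diagonal(sim, -np.inf)            # exclude the sample itself
        return np.argsort(-sim, axis=1)[:, :k]    # indices of the k nearest neighbors

    s_phi = knn_indices(phi)                      # S(phi_i)
    s_psi = knn_indices(psi)                      # S(psi_i)
    overlap = [len(set(s_phi[i]) & set(s_psi[i])) / k for i in range(len(phi))]
    return float(np.mean(overlap))                # (1/k) |S(phi_i) ∩ S(psi_i)|, averaged

# Unrelated random features give a chance-level score of roughly k / b.
phi = np.random.randn(1000, 768)
psi = np.random.randn(1000, 4096)
print(mutual_knn(phi, psi, k=10))
```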
The choice to use mutual nearest-neighbors. Our initial efforts to measure alignment with CKA revealed a very weak trend of alignment between models, even when comparing models within their own modality. This has also been observed by Bansal et al. (2021), who relied on alternative metrics such as model-stitching as it "reveals aspects of representations that measures such as centered kernel alignment (CKA) cannot" (Bansal et al., 2021).
We chose to use nearest-neighbors as a metric because methods like CKA have a very strict definition of alignment, which may not fit our current needs. For instance, understanding the precise similarity between unrelated items, such as an orange and Bill Gates, may not be critical.
Relationship between CKA and Mutual Nearest-Neighbors. Let ϕ_i ∈ R^n and ψ_i ∈ R^m be vectorized features of two models (e.g., language and vision models). Let K_ij = κ(ϕ_i, ϕ_j) and L_ij = κ(ψ_i, ψ_j) be the kernel matrices computed from a dataset using some kernel function κ. Using an inner-product kernel, the ij-th entry of the centered counterparts of these kernel matrices is:
$$\bar K_{ij} = \langle \phi_i, \phi_j \rangle - \mathbb{E}_l[\langle \phi_i, \phi_l \rangle], \qquad \bar L_{ij} = \langle \psi_i, \psi_j \rangle - \mathbb{E}_l[\langle \psi_i, \psi_l \rangle] \qquad (12)$$
Then, the cross-covariance of K and L is given by:
$$\mathrm{HSIC}(K, L) = \frac{1}{(n-1)^2}\,\mathrm{Trace}(\bar K \bar L) \qquad (13)$$
which serves as an empirical estimator of the Hilbert-Schmidt Independence Criterion (Gretton et al., 2005). The Centered Kernel Alignment (CKA) (Kornblith et al., 2019) is then its normalized counterpart:
$$\mathrm{CKA}(K, L) = \frac{\mathrm{HSIC}(K, L)}{\sqrt{\mathrm{HSIC}(K, K)\,\mathrm{HSIC}(L, L)}} \qquad (14)$$
CKA measures the congruence between two random variables, with a maximum alignment of 1 and a minimum of 0. It is invariant to isotropic scaling and offers a strict notion of alignment, measuring alignment across all samples. Hence, the CKA score reflects the global similarities of the models. This can be illustrated by expanding the trace term in HSIC:
$$\mathrm{Trace}(\bar K \bar L) = \sum_i \sum_j \big(\langle \phi_i, \phi_j \rangle - \mathbb{E}_l[\langle \phi_i, \phi_l \rangle]\big)\big(\langle \psi_i, \psi_j \rangle - \mathbb{E}_l[\langle \psi_i, \psi_l \rangle]\big) \qquad (15)$$
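A compact NumPy sketch of Eqns. (12)–(14) under an inner-product kernel (our own illustration, not the paper's code); note that row-centering both Gram matrices before the trace is equivalent to the usual double-centering:

```python
import numpy as np

def centered_gram(feats: np.ndarray) -> np.ndarray:
    """K_bar_ij = <phi_i, phi_j> - E_l[<phi_i, phi_l>]   (Eqn. 12)."""
    K = feats @ feats.T
    return K - K.mean(axis=1, keepdims=True)

def hsic(K_bar: np.ndarray, L_bar: np.ndarray) -> float:
    """Empirical HSIC estimator Trace(K_bar L_bar) / (n - 1)^2   (Eqn. 13)."""
    n = K_bar.shape[0]
    return float(np.trace(K_bar @ L_bar) / (n - 1) ** 2)

def cka(phi: np.ndarray, psi: np.ndarray) -> float:
    """CKA(K, L) = HSIC(K, L) / sqrt(HSIC(K, K) HSIC(L, L))   (Eqn. 14)."""
    K_bar, L_bar = centered_gram(phi), centered_gram(psi)
    return hsic(K_bar, L_bar) / np.sqrt(hsic(K_bar, K_bar) * hsic(L_bar, L_bar))

# Example usage on independent random features
# (the value is small but nonzero due to finite-sample bias).
print(cka(np.random.randn(200, 64), np.random.randn(200, 32)))
```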
One can modify the definition of alignment to restrict the cross-covariance measurement to samples considered to be nearest neighbors of the current sample i. This emphasizes similarity over dissimilarity, biasing the measure toward local alignment:
$$\mathrm{Align}_{\mathrm{knn}}(K, L) = \sum_i \sum_j \alpha(i, j)\cdot\big(\langle \phi_i, \phi_j \rangle - \mathbb{E}_l[\langle \phi_i, \phi_l \rangle]\big)\big(\langle \psi_i, \psi_j \rangle - \mathbb{E}_l[\langle \psi_i, \psi_l \rangle]\big) \qquad (16)$$
$$\text{where } \alpha(i, j) = \mathbb{1}\big[\phi_j \in \mathrm{knn}(\phi_i) \wedge \psi_j \in \mathrm{knn}(\psi_i) \wedge i \neq j\big] \qquad (17)$$
where α(i, j) is a scalar weighting that assigns 1 if j is a mutual nearest neighbor of both ϕ_i and ψ_i, and 0 otherwise. We refer to this metric as the Centered Kernel Nearest-Neighbor Alignment (CKNNA) metric. As the number of nearest neighbors k → dim(K), we recover the original CKA metric.
$$\mathrm{CKNNA}(K, L) = \frac{\mathrm{Align}_{\mathrm{knn}}(K, L)}{\sqrt{\mathrm{Align}_{\mathrm{knn}}(K, K)\,\mathrm{Align}_{\mathrm{knn}}(L, L)}} \qquad (18)$$
Figure 10. Cross-modal alignment increases locally: Alignment trend when varying the top-k nearest neighbors in the CKNNA metric (Eqn. (18)). We center the alignment score to the smallest language model and divide the total trend by the standard deviation. When k = 1024, we recover the original CKA metric, and when k < |X| it closely resembles the mutual nearest-neighbor metric m_NN. Each line represents the average of all LLM models for a specific k. As we decrease k, the alignment becomes more pronounced. Certain visual tasks, such as CLIP, show global alignment, while methods like ImageNet1k classification only exhibit local alignment. (Panels: ImageNet21K, MAE, DINOv2, CLIP, and CLIP (I12K ft); x-axis: language model perplexity (log-scale); one curve per K ∈ {10, 20, 50, 100, 200, 500, 800, 1000}.)
We can further relax the metric to treat the cross-covariance term identically across all nearest-neighbor samples. This is equivalent to the assumption that all nearby samples have the same distance. This simplification leads us back to the mutual nearest-neighbor metric:
$$\sum_i \sum_j \alpha(i, j)\cdot 1 = n \cdot k \cdot m_{\mathrm{NN}}(\phi_i, \psi_i) \qquad (19)$$
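The sketch below (again our own illustration, not the released code) implements CKNNA following Eqns. (16)–(18); the mask α(i, j) is built explicitly, and neighbors are taken under the inner-product kernel, which is an assumption of this sketch. As k approaches the batch size the score approaches CKA (only the i = j terms are excluded), and replacing the centered-kernel terms by 1 gives back the mutual k-NN score, as in Eqn. (19):

```python
import numpy as np

def cknna(phi: np.ndarray, psi: np.ndarray, k: int = 10) -> float:
    """Centered Kernel Nearest-Neighbor Alignment (Eqns. 16-18), inner-product kernel."""
    def gram_and_centered(feats):
        K = feats @ feats.T
        return K, K - K.mean(axis=1, keepdims=True)   # K and K_bar (Eqn. 12)

    K, K_bar = gram_and_centered(phi)
    L, L_bar = gram_and_centered(psi)

    def knn_mask(sim):
        sim = sim.copy()
        np.fill_diagonal(sim, -np.inf)                # enforce i != j
        idx = np.argsort(-sim, axis=1)[:, :k]
        mask = np.zeros_like(sim, dtype=bool)
        np.put_along_axis(mask, idx, True, axis=1)
        return mask

    alpha = knn_mask(K) & knn_mask(L)                 # alpha(i, j), Eqn. 17

    def align(A_bar, B_bar):
        return float((alpha * A_bar * B_bar).sum())   # Align_knn, Eqn. 16

    return align(K_bar, L_bar) / np.sqrt(align(K_bar, K_bar) * align(L_bar, L_bar))

print(cknna(np.random.randn(500, 64), np.random.randn(500, 32), k=10))
```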
By equating these metrics, we analyze the changes in alignment between language and vision models as we vary the number of neighbors k. In Figure 10, we compute the average alignment score across all LLM models. For each k, we center the scores to the smallest vision model and divide by the standard deviation of the scores. We find that high values of k show less conclusive alignment across tasks, while decreasing k shows a coherent trend across both models and tasks. We find that certain visual tasks, such as CLIP, exhibit global alignment, whereas methods like ImageNet21k classification show only local alignment. This observation suggests that cross-modal alignment occurs locally across most common visual tasks, and global alignment may require additional language grounding, as done in the CLIP objective.
B. Consistency across various metrics
We describe the metrics in Figure 11. The symmetric property implies that the metric is symmetric with respect to the data points, d(x, y) = d(y, x). The global property means all samples are used to compute the distance with respect to every sample. The ordinal property is when the ordering of the distance is taken into consideration. For example, mutual nearest neighbor is not ordinal since the nearest-neighbor sets {a, b, c} and {c, a, b} are treated equally. The batchable property is a computational property that makes it feasible to compute in a reasonable time frame.
Vision-vision comparison. In Figure 12, we evaluate Spearman's rank correlation among different metrics and hyperparameters over 78 vision models (details in Appendix C.1). We find most metrics are highly correlated with each other.
Cross-modal comparison. We measure vision-language alignment using a range of alternative metrics. We visualize the corresponding alignment results in Figures 13 and 14. Our findings indicate that alignment sensitivity not only depends on the metric used to compute it but also varies according to the specific tasks on which the vision models are trained.
Figure 11. Comparative analysis of neural network similarity metrics. Properties considered: symmetric, global, ordinal, batchable. A "global*" marker indicates the metric is global and still meaningful when the nearest neighbor k is set to the maximum batch size k = |X|.
• CKA (symmetric, global, ordinal, batchable): Centered Kernel Alignment (CKA; Kornblith et al. (2019)) measures the similarity of neural networks by comparing the alignment of the kernels induced by their feature spaces.
• Unbiased CKA (symmetric, global, ordinal, batchable): Unbiased estimator of CKA that corrects for sample bias in HSIC (Song et al., 2012).
• SVCCA (symmetric, global, ordinal, batchable): Singular Value Canonical Correlation Analysis (SVCCA; Raghu et al. (2017)) compares neural networks by decomposing their activities into singular vectors and measuring correlation.
• Mutual k-NN (symmetric, batchable): Measures the intersection over union (IoU) of nearest neighbors between two models.
• CKNNA (symmetric, global*, ordinal, batchable): Modified CKA measure that computes the kernel alignment only for its nearest neighbors. See Appendix A.
• Cycle k-NN (batchable): Measures whether the nearest neighbor in one domain also considers the original sample as its nearest neighbor in the other domain.
• Edit k-NN (symmetric, global*, ordinal): Computes the edit distance required to match the nearest neighbors between two datasets. The score is normalized by the maximum edit distance.
• LCS k-NN (symmetric, global*, ordinal): Calculates the longest common subsequence of nearest neighbors, normalized by the sequence length.
Figure 12. Vision-vision alignment measured with various metrics. Spearman's rank correlation among different metrics and batch sizes (bsz) when used to measure alignment among 78 vision models (see Appendix C.1); p-values are below 2.24 × 10^{-105}. Our vision-vision analysis in the main text uses mutual k-NN with k = 10 and bsz = 1000. Metrics compared: Mutual k-NN (k = 10; bsz = 128, 256, 512, 1000), LCS k-NN and Edit k-NN (k = 10), CKNNA (k = 5 to 1000), CKA, Unbiased CKA, SVCCA, and Cycle k-NN (k = 10), all at bsz = 1000 unless noted.
Figure 13. Cross-modal alignment for various metrics. Panels: (a) CKA, (b) Unbiased CKA, (c) SVCCA, (d) Mutual k-NN (k = 10). Each panel plots alignment to ImageNet21K (ViT tiny/small/base/large), MAE (base/large/huge), DINOv2 (small/base/large/giant), CLIP (base/large/huge), and CLIP (I12K ft) vision models against language models from the BLOOM (0.56B–7B), OpenLLaMA (3B–13B), and LLaMA (7B–65B) families.
Figure 14. Cross-modal alignment measured with various metrics. Panels: (a) CKNNA (k = 10), (b) Cycle k-NN (k = 10), (c) Edit-distance k-NN (k = 10), (d) Longest-Common-Subsequence k-NN (k = 10). Each panel plots alignment to ImageNet21K, MAE, DINOv2, CLIP, and CLIP (I12K ft) vision models against language models from the BLOOM, OpenLLaMA, and LLaMA families.
C. Experiments on Evaluating Alignment and Convergence
To demonstrate representational convergence, we take off-the-shelf models at multiple scales and multiple modalities and
measure their representational alignment.
C.1. Vision-Vision Alignment and Representation Quality
We consider 78 vision models in total:
• 17 ViT models ranging from ViT-tiny to ViT-giant, trained on tasks including ImageNet-21k classification (Dosovitskiy et al., 2021), Masked Autoencoders (He et al., 2021), DINO (Caron et al., 2021), and CLIP (Radford et al., 2021), including some finetuned on ImageNet-12k.
• 1 randomly initialized ResNet-50.
• 11 ResNet-50 models trained with contrastive learning on ImageNet-1k, Places-365 (Zhou et al., 2017; López-Cifuentes et al., 2020), and 9 synthetic image datasets used in Baradad et al. (2022).
• 49 ResNet-18 models trained with the Alignment and Uniformity contrastive loss (Wang & Isola, 2020) on ImageNet-100, Places-365, and 47 realistic and synthetic image datasets from Baradad et al. (2021).
To test representation quality, we evaluate linear probing performance on all 19 VTAB classification tasks (Zhai et al., 2019), a standard multi-task transfer learning benchmark containing structured, specialized, and natural datasets covering diverse domains. To reduce compute requirements, we subsample the training and validation datasets to at most 10,000 samples. We consider a representation to have solved a task if its performance is ≥ 80% of the best performance on that task across all 78 models.
To compute the alignment metric, we use k = 10 nearest neighbors over 1,000 image representations computed on the Places-365 validation set (Zhou et al., 2017). This dataset is disjoint from the VTAB datasets, although both contain natural images.
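For concreteness, a minimal NumPy sketch of a mutual k-nearest-neighbor alignment score of this kind is shown below (our simplified illustration, not the released implementation; it assumes the two feature matrices are row-aligned over the same 1,000 images and measures the average overlap of their k-NN sets under cosine similarity):

import numpy as np

def knn_indices(feats, k=10):
    # feats: (n, d) row-wise l2-normalized features; cosine-similarity k-NN, self excluded
    sim = feats @ feats.T
    np.fill_diagonal(sim, -np.inf)
    return np.argsort(-sim, axis=1)[:, :k]

def mutual_knn_alignment(feats_a, feats_b, k=10):
    # average fraction of nearest neighbors shared between the two representations
    idx_a, idx_b = knn_indices(feats_a, k), knn_indices(feats_b, k)
    return float(np.mean([len(set(a) & set(b)) / k for a, b in zip(idx_a, idx_b)]))

Here feats_a and feats_b would be the (1000, d) feature matrices of two vision models evaluated on the same Places-365 validation images.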
C.2. Cross-Modal Alignment
We compare the representation of an image in a vision model to the representation of a caption describing that image in a language model. The language model families we consider are BLOOM (BigScience et al., 2022), OpenLLaMA (Geng & Liu, 2023), and LLaMA (Touvron et al., 2023). For the corresponding alignment figure, we additionally included more recent model families such as OLMo (Groeneveld et al., 2024), LLaMA 3 (Meta, 2024), Gemma (Team et al., 2024), and Mistral/Mixtral (Jiang et al., 2023; 2024). These models were downloaded from Hugging Face (Wolf et al.).
For vision models, we consider ViT models (Dosovitskiy et al., 2021) of various sizes trained on various data and objectives. We mainly consider popular vision models: classification on ImageNet-21k (Russakovsky et al., 2015), MAE (He et al., 2021), DINOv2 (Oquab et al., 2023), CLIP (Radford et al., 2021), and CLIP finetuned on ImageNet-12k. These models were downloaded from PyTorch Image Models (timm) (Wightman, 2019).
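As an illustration of how per-layer features might be extracted from such models, here is a sketch (the timm and Hugging Face checkpoint identifiers below are examples we chose for illustration; preprocessing, batching, and device placement are omitted):

import timm
import torch
from transformers import AutoModel, AutoTokenizer

# Example checkpoints (our choice); any timm ViT / Hugging Face language model works similarly.
vit = timm.create_model("vit_base_patch16_224.augreg_in21k", pretrained=True).eval()
tok = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
lm = AutoModel.from_pretrained("bigscience/bloom-560m").eval()

@torch.no_grad()
def vision_layer_features(images):
    # images: (n, 3, 224, 224) preprocessed batch; collect the class token after every block
    feats = []
    hooks = [blk.register_forward_hook(lambda m, inp, out: feats.append(out[:, 0].cpu()))
             for blk in vit.blocks]
    vit(images)
    for h in hooks:
        h.remove()
    return feats                      # list of (n, d) tensors, one per transformer block

@torch.no_grad()
def language_layer_features(captions):
    # average-pool every hidden layer over non-padding tokens
    batch = tok(captions, return_tensors="pt", padding=True)
    hidden = lm(**batch, output_hidden_states=True).hidden_states
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return [(h * mask).sum(dim=1) / mask.sum(dim=1) for h in hidden]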
To compute the alignment metric, we use k = 10 nearest neighbors over 1,024 samples from WIT (Wikipedia-based Image Text) (Srinivasan et al., 2021). For the vision models we use the class token at each layer, and for the language models we average-pool each layer into a single token. Since it is not trivial to determine at which layers alignment might occur, we draw inspiration from Brain-Score (Schrimpf et al., 2018) and compute pairwise alignment scores between all layer pairs, then take the maximum. One of these pairwise comparisons also includes concatenated features. We apply l2 normalization to the features before measuring distances. Since transformer architectures exhibit "emergent outliers" (Dettmers et al., 2022), we truncate feature elements above the 95th percentile.
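Putting these pieces together, the layer-pair search could look roughly like the following sketch (ours, not the released code; here the percentile threshold is applied per feature matrix, and the concatenated-feature comparison mentioned above is omitted):

import numpy as np

def preprocess(feats, pct=95):
    # truncate "emergent outlier" values above the 95th percentile, then l2-normalize rows
    feats = np.minimum(feats, np.percentile(feats, pct))
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)

def knn_overlap(fa, fb, k=10):
    # mutual k-NN overlap under cosine similarity, self excluded
    sa, sb = fa @ fa.T, fb @ fb.T
    np.fill_diagonal(sa, -np.inf)
    np.fill_diagonal(sb, -np.inf)
    ia, ib = np.argsort(-sa, axis=1)[:, :k], np.argsort(-sb, axis=1)[:, :k]
    return float(np.mean([len(set(a) & set(b)) / k for a, b in zip(ia, ib)]))

def cross_modal_alignment(vision_layers, language_layers, k=10):
    # vision_layers: per-layer class-token features; language_layers: per-layer mean-pooled features
    vs = [preprocess(np.asarray(v, dtype=np.float64)) for v in vision_layers]
    ls = [preprocess(np.asarray(l, dtype=np.float64)) for l in language_layers]
    return max(knn_overlap(v, l, k) for v in vs for l in ls)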
Simply taking the last token did not show a strong alignment signal. We also experimented with prompting the language model and taking the last-token representation. The prompt we used was
An image with the caption ‘<caption>’. This is an image of a <fill>
Using prompting showed similar trends to average pooling but had slightly lower alignment scores.
D. Color Cooccurrence Experiment
Here we describe the details of how we created the four color representations visualized in the corresponding figure in the main text, from left to right.
Perceptual representation from CIELAB color space. We embed pixels taken from the CIFAR-10 image dataset (Krizhevsky et al., 2009) in the CIELAB color space, which is designed as a perceptually uniform space: equal changes in numerical values correspond to similar perceived changes in color.
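For reference, this conversion can be done with scikit-image (an assumed tooling choice on our part; any RGB-to-CIELAB conversion would do):

import numpy as np
from skimage.color import rgb2lab

pixels = np.random.rand(1000, 3)                        # stand-in for sampled RGB pixels in [0, 1]
lab = rgb2lab(pixels.reshape(1, -1, 3)).reshape(-1, 3)  # (n, 3) perceptual CIELAB embedding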
Three representations from cooccurrence in VISION and LANGUAGE. For these three representations, we first obtain a dissimilarity matrix over colors (in ways detailed below), then use multidimensional scaling (Shepard) to find a 3-dimensional embedding in which the Euclidean distance between the embeddings zA and zB for colors A and B best matches this dissimilarity matrix. We use 1,000 fits and take the best match. Afterward, we visually align the embedding with the CIELAB space by finding the best rotation, translation, scaling, and flipping: we run the Kabsch-Umeyama algorithm (Kabsch, 1976; 1978; Umeyama, 1991) twice, once on z and once on −z, to account for flipping. The dissimilarity matrix used in each case is described as follows:
• VISION: Pixel cooccurrence. We collect color cooccurrence statistics from the CIFAR-10 dataset and estimate a joint distribution p(A, B) over 300,000 randomly sampled pixel colors A and B that occur within a radius of at most 4 pixels of one another. Colors are quantized on a grid in RGB space and represented as discrete variables, and p(A, B) is modeled as a table of normalized counts, from which we compute the empirical pointwise mutual information matrix KPMI(A, B). Quantization ensures that there is no bias from how color distances are represented in RGB space. The dissimilarity matrix is defined as −KPMI(A, B) + c, where c = max_{A,B} KPMI(A, B) is an offset that ensures non-negativity (similar to the constant offsetting KPMI in Appendix F). A code sketch of this pipeline appears after this list.
• LANGUAGE. We used an approach similar to Abdou et al. (2021).
– We take the 20 (color, word) pairs appearing in the dataset collected by Lindsey & Brown (2014), where 51 participants were asked to freely name each of the 330 colors from the Munsell Color Chart. We filtered out words that appeared fewer than 100 times and computed each word's associated color by taking the centroid in CIELAB space. Our filtering process followed Abdou et al. (2021) exactly but resulted in 20 colors, a slightly different set from the 18 colors they report.
– For each of the 20 color words <col>, we construct three sentences:
The color <col>.
This color is <col>.
The color of this thing is <col>.
and obtain the average sentence embedding from the language encoder as the embedding for <col> (details below). We find this approach more effective than that of Abdou et al. (2021), which uses object names that potentially have color biases, even though the objects may appear in multiple colors.
– Unlike Abdou et al. (2021), we did not perform linear regression from the language embedding to CIELAB space, which distorts distances and easily overfits with only 20 samples. Instead, we used multidimensional scaling to best preserve distances, as described above.
– Masked language contrastive learning (SimCSE) embedding: We used sentence embeddings from unsupervised SimCSE RoBERTa-L (Gao et al., 2021) to encode the above sentences into 1024-dimensional embeddings, and used the pairwise Euclidean distances among the <col> embeddings as the dissimilarity matrix.
– Masked language predictive learning (RoBERTa) embedding: We concatenated the hidden states of the last four layers of RoBERTa-L (Liu et al., 2019), following Devlin et al. (2019). We averaged across the token dimension to obtain a 4096-dimensional embedding for each of the above sentences, and used the pairwise Euclidean distances among the <col> embeddings as the dissimilarity matrix.
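Below is a minimal sketch of the VISION pipeline referenced above (our illustration, not the released code): it turns quantized cooccurrence pairs into a PMI-based dissimilarity matrix, embeds it with multidimensional scaling, and aligns the embedding to a reference (e.g., CIELAB) with a simplified Kabsch-style fit that also tries the flipped embedding. Color quantization and pair sampling are assumed to have been done already, and the scaling/translation handling is simplified relative to the full Kabsch-Umeyama procedure described above.

import numpy as np
from sklearn.manifold import MDS
from scipy.spatial.transform import Rotation

def pmi_dissimilarity(pairs, n_colors):
    # pairs: (m, 2) integer array of quantized color indices for nearby pixel pairs
    counts = np.zeros((n_colors, n_colors))
    np.add.at(counts, (pairs[:, 0], pairs[:, 1]), 1.0)
    counts = counts + counts.T                       # cooccurrence is unordered
    joint = counts / counts.sum()
    marg = joint.sum(axis=1)
    pmi = np.log(joint + 1e-12) - np.log(np.outer(marg, marg) + 1e-12)
    dissim = -pmi + pmi.max()                        # non-negative dissimilarities
    np.fill_diagonal(dissim, 0.0)
    return dissim

def embed_mds(dissim, dim=3, n_init=1000, seed=0):
    # 1,000 fits as in the text; fewer may suffice in practice
    mds = MDS(n_components=dim, dissimilarity="precomputed", n_init=n_init, random_state=seed)
    return mds.fit_transform(dissim)

def align_to_reference(z, ref):
    # try both z and -z to account for flipping; rotation found via the Kabsch solution
    best, best_err = None, np.inf
    for cand in (z, -z):
        a, b = cand - cand.mean(0), ref - ref.mean(0)
        a = a * (np.linalg.norm(b) / np.linalg.norm(a))   # crude global rescaling
        rot, _ = Rotation.align_vectors(b, a)
        fit = rot.apply(a) + ref.mean(0)
        err = np.linalg.norm(fit - ref)
        if err < best_err:
            best, best_err = fit, err
    return best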
E. Caption Density Experiments
We use LLaMA3-8B-Instruct (Meta, 2024) to generate summary captions at various densities for images from the train split of the Densely Captioned Images dataset (Urbanek et al., 2023). Following prior work, we prompt the language model with the following instructions to generate captions at differing granularities:
system: You are given a full-text description of an image. You should summarize it into about
<num_words> words, being sure to include as much salient visual information as possible given the
<num_words> word constraint, especially information from the start of the original description. The
new description should apply for the original image. Respond with only the summary, in one line.
user: <original_caption>
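Concretely, caption summaries at a target word budget could be generated along the following lines (a sketch only: the Hugging Face checkpoint identifier and generation settings are our assumptions, and the prompt string simply mirrors the instructions above rather than reproducing the exact pipeline):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"   # assumed checkpoint id; access is gated
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16, device_map="auto")

SYSTEM = ("You are given a full-text description of an image. You should summarize it into about "
          "{n} words, being sure to include as much salient visual information as possible given the "
          "{n} word constraint, especially information from the start of the original description. The "
          "new description should apply for the original image. Respond with only the summary, in one line.")

def summarize(caption, num_words):
    messages = [{"role": "system", "content": SYSTEM.format(n=num_words)},
                {"role": "user", "content": caption}]
    ids = tok.apply_chat_template(messages, add_generation_prompt=True,
                                  return_tensors="pt").to(model.device)
    out = model.generate(ids, max_new_tokens=3 * num_words,   # rough budget for ~num_words words
                         do_sample=False, pad_token_id=tok.eos_token_id)
    return tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True).strip()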
F. Analysis of Contrastive Learners
F.1. Contrastive objectives learn pointwise mutual information
There are two widely used forms of contrastive objectives. We now discuss each form in detail and show that both are minimized by the pointwise mutual information (PMI), as stated in Equation (5). To simplify notation, we consider learning the bivariate model $g(x_a, x_b)\in\mathbb{R}$. In the main text, such a $g$ is optimized within the family $\{g=\langle f_X, f_X\rangle : f_X\in\mathcal{F}_X\}$. Recall that positive pairs are sampled as $(x, x^+)\sim P_{\text{coor}}$, and that negative pairs are sampled independently from its marginal, $(x, x^-)\overset{\text{i.i.d.}}{\sim} P$, where $P(x) = \sum_{x^+} P_{\text{coor}}(x, x^+)$.
1. The binary NCE loss (Gutmann & Hyvärinen, 2010) is defined with a prior over sampling positive vs. negative pairs. Let $p_{\text{pos}}$ be the probability of sampling a positive pair. Then the loss is given by
\[
\mathcal{L}_{\text{binary-NCE}}(g) \triangleq p_{\text{pos}}\cdot \mathbb{E}_{(x,x^+)\sim P_{\text{coor}}}\!\left[-\log\sigma\big(g(x,x^+)\big)\right] + (1-p_{\text{pos}})\cdot \mathbb{E}_{(x,x^-)\overset{\text{i.i.d.}}{\sim} P}\!\left[-\log\sigma\big(-g(x,x^-)\big)\right]. \tag{20}
\]
The Bayes-optimal solution is given by
\begin{align}
g(x_a, x_b) &= \log\frac{P(\text{pos}\mid x_a, x_b)}{1 - P(\text{pos}\mid x_a, x_b)} \tag{21}\\
&= \log\frac{P(\text{pos}, x_a, x_b)}{P(\text{neg}, x_a, x_b)} \tag{22}\\
&= \log\frac{p_{\text{pos}}\cdot P_{\text{coor}}(x_a, x_b)}{(1-p_{\text{pos}})\,P(x_a)\,P(x_b)} \tag{23}\\
&= \log\frac{P_{\text{coor}}(x_a, x_b)}{P(x_a)\,P(x_b)} + \log\frac{p_{\text{pos}}}{1-p_{\text{pos}}} \tag{24}\\
&= K_{\text{PMI}}(x_a, x_b) + c_X. \tag{25}
\end{align}
A small numerical sanity check of this solution appears after this list.
2. The InfoNCE loss (Oord et al., 2018) is defined by sampling one positive pair along with $K$ negatives. With a temperature hyperparameter $\tau > 0$, the loss is given by
\[
\mathcal{L}_{\text{InfoNCE}}(g) \triangleq \mathbb{E}_{\substack{(x, x^+)\sim P_{\text{coor}}\\ (x^{(1)},\,\dots,\,x^{(K)})\overset{\text{i.i.d.}}{\sim} P}}\left[-\log\frac{e^{g(x,x^+)/\tau}}{e^{g(x,x^+)/\tau} + \sum_{i=1}^{K} e^{g(x,x^{(i)})/\tau}}\right]. \tag{26}
\]
The Bayes-optimal solution is given by
\begin{align}
\frac{e^{g(x,x^+)/\tau}}{e^{g(x,x^+)/\tau} + \sum_{i=1}^{K} e^{g(x,x^{(i)})/\tau}}
&= \frac{P_{\text{coor}}(x^+\mid x)\prod_{j} P(x^{(j)}_-)}{P_{\text{coor}}(x^+\mid x)\prod_{j} P(x^{(j)}_-) + \sum_{i} P_{\text{coor}}(x^{(i)}_-\mid x)\,P(x^+)\prod_{j\neq i} P(x^{(j)}_-)} \tag{27}\\
&= \frac{P_{\text{coor}}(x^+\mid x)/P(x^+)}{P_{\text{coor}}(x^+\mid x)/P(x^+) + \sum_{i} P_{\text{coor}}(x^{(i)}_-\mid x)/P(x^{(i)}_-)}. \tag{28}
\end{align}
For $\tau = 1$, these optima correspond to choices of $g$ where
\begin{align}
g(x_a, x_b) &= \log\frac{P_{\text{coor}}(x_b\mid x_a)}{P(x_b)} + c_X(x_a) \tag{29}\\
&= K_{\text{PMI}}(x_a, x_b) + c_X(x_a). \tag{30}
\end{align}
For the general case $\tau\neq 1$, $g$ (and the corresponding $f_X$) recovers $K_{\text{PMI}}$ up to an offset and a scale, so our main argument that $f_X$ recovers $K_{\text{PMI}}$ still holds.
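As a small numerical sanity check of the binary NCE result in Equations (20)-(25) (our own toy example, not part of the paper's experiments), the snippet below minimizes the loss over a tabular critic on a discrete joint distribution and confirms that the minimizer equals K_PMI plus the constant log(p_pos / (1 − p_pos)):

import numpy as np

# Toy symmetric joint distribution over a 3-element space (rows: x_a, columns: x_b)
P_coor = np.array([[0.10, 0.05, 0.05],
                   [0.05, 0.20, 0.10],
                   [0.05, 0.10, 0.30]])
p = P_coor.sum(axis=1)                        # marginal P(x)
P_neg = np.outer(p, p)                        # independent negative-pair distribution
K_pmi = np.log(P_coor) - np.log(P_neg)        # pointwise mutual information
p_pos = 0.3
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Gradient descent on a tabular critic g, minimizing the binary NCE loss of Eq. (20)
G = np.zeros_like(P_coor)
for _ in range(5000):
    grad = p_pos * P_coor * (sigmoid(G) - 1.0) + (1.0 - p_pos) * P_neg * sigmoid(G)
    G -= 10.0 * grad

# Eq. (25): the minimizer should be K_PMI shifted by the constant c_X = log(p_pos / (1 - p_pos))
print(np.allclose(G - K_pmi, np.log(p_pos / (1.0 - p_pos)), atol=1e-3))   # True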
F.2. Contrastive learners can represent $K_{\text{PMI}}$ exactly under smoothness conditions
We want to express $K_{\text{PMI}} + C$ using some representation function $f_X:\mathcal{X}\to\mathbb{R}^n$ such that $\langle f_X(x_a), f_X(x_b)\rangle = K_{\text{PMI}}(x_a, x_b) + C$ for some constant $C$. For such an $f_X$ to exist, an equivalent criterion is that $K_{\text{PMI}} + C$ is positive semi-definite (PSD), as can be seen from its eigendecomposition.
Proposition F.1. Suppose that the off-diagonal elements of $K_{\text{PMI}}$ are bounded within $[\log\rho_{\min},\,\log\rho_{\min}+\delta]\subseteq(-\infty, 0]$. Then $K_{\text{PMI}} + C$ is positive semi-definite (PSD) for some $C$ if the joint distribution is sufficiently smooth:
\[
\frac{P_{\text{coor}}(z_i\mid z_i)}{P_{\text{coor}}(z_i)} \;\ge\; e^{N\delta}\rho_{\min}, \qquad \forall i. \tag{31}
\]
Proof. Note that $K_{\text{PMI}} + C$ still has only non-positive off-diagonal elements if
\[
-C \ge \log\rho_{\min} + \delta. \tag{32}
\]
For such $C$, the matrix is diagonally dominant (and thus PSD) if
\[
\forall i,\quad K_{\text{PMI}}(z_i, z_i) + C \;\ge\; \sum_{j\neq i}\big|K_{\text{PMI}}(z_i, z_j) + C\big| = -(N-1)C - \sum_{j\neq i} K_{\text{PMI}}(z_i, z_j), \tag{33}
\]
or equivalently,
\[
\forall i,\quad NC + \sum_{j} K_{\text{PMI}}(z_i, z_j) \ge 0. \tag{34}
\]
The following choice of $C$ readily satisfies Equation (34):
\[
C \triangleq -\min_i \frac{1}{N}\sum_{j} K_{\text{PMI}}(z_i, z_j). \tag{35}
\]
Therefore, it remains to show that Equation (32) holds. Note that
\[
-C = \min_i \frac{1}{N}\sum_{j} K_{\text{PMI}}(z_i, z_j) \;\ge\; \frac{N-1}{N}\log\rho_{\min} + \frac{1}{N}\min_i K_{\text{PMI}}(z_i, z_i). \tag{36}
\]
Therefore, it suffices to have
\[
\log\rho_{\min} + \delta \;\le\; \frac{N-1}{N}\log\rho_{\min} + \frac{1}{N}\min_i K_{\text{PMI}}(z_i, z_i). \tag{37}
\]
Rearranging terms gives the desired condition:
\[
\frac{P_{\text{coor}}(z_i\mid z_i)}{P_{\text{coor}}(z_i)} \;\ge\; e^{N\delta}\rho_{\min}, \qquad \forall i. \tag{38}
\]
Remark F.2. Proposition F.1 gives a sufficient condition for the PMI kernel $K_{\text{PMI}}$ to be exactly represented as inner products of a learned feature space (up to a scale). The condition can be satisfied, for example, if the off-diagonal terms decay linearly with respect to $N$ and stay sufficiently close to each other. While the condition is somewhat strict, it captures the essence that smoothness and continuity allow easier learning. Nonetheless, we note that exact representation is not necessary for convergence, and thus this requirement can likely be relaxed; please see the discussion in the main text.
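As a sanity check on Proposition F.1 (again our own toy example, not from the paper), the snippet below constructs a synthetic symmetric matrix whose off-diagonal entries lie in the stated band and whose diagonal satisfies Equation (31), picks C as in Equation (35), and verifies both that K_PMI + C is PSD and that it factors into exact inner products:

import numpy as np

# Synthetic check of Proposition F.1 (not an actual PMI matrix estimated from data)
N, log_rho_min, delta = 6, -2.0, 0.1

# Off-diagonals lie in [log rho_min, log rho_min + delta]; the diagonal satisfies Eq. (31),
# i.e. K_PMI(z_i, z_i) = log(P_coor(z_i|z_i)/P_coor(z_i)) >= N*delta + log rho_min.
K = np.full((N, N), log_rho_min)
odd = np.add.outer(np.arange(N), np.arange(N)) % 2 == 1
K[odd] = log_rho_min + delta
np.fill_diagonal(K, N * delta + log_rho_min + 0.1)

C = -np.min(K.mean(axis=1))                  # the offset chosen in Eq. (35)
assert -C >= log_rho_min + delta             # Eq. (32) holds under the smoothness condition

K_shift = K + C                              # constant added to every entry
eigvals, eigvecs = np.linalg.eigh(K_shift)
print("PSD:", eigvals.min() >= -1e-9)        # True

# A PSD kernel factors exactly: features F with <F_i, F_j> = K_PMI(z_i, z_j) + C
F = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
print("exact inner products:", np.allclose(F @ F.T, K_shift, atol=1e-9))   # True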