Parth Gupta, Ndiamé Ndiaye, Sergey Norin, and Louis Wei
Department of Mathematics and Statistics, McGill University, Montréal, QC, Canada.

###### Abstract.

In a recent breakthrough, Campos, Griffiths, Morris and Sahasrabudhe obtained the first exponential improvement of the upper bound on the diagonal Ramsey numbers since 1935. We shorten their proof, replacing the underlying *book algorithm* with a simple inductive statement. This modification allows us to

- •
give a very short proof of an improved upper bound on the off-diagonal Ramsey numbers, which extends to the multicolor setting,

- •
clarify the dependence of the bounds on underlying parameters and optimize these parameters, obtaining, in particular, an upper bound

$R(k,k)\leq(3.8)^{k+o(k)}$ on the diagonal Ramsey numbers.

PG, NN and SN were partially supported by NSERC Discovery Grant. PG was also supported by an ISM Undergraduate Research Scholarship.

## 1. Introduction

The *Ramsey number* $R(k,\ell)$ is the smallest positive integer $N$ such that in any red-blue coloring of the edges of the complete graph on $N$ vertices there exists either a complete subgraph on $k$ vertices with all edges colored red (*a red $K_{k}$*) or a complete subgraph on $\ell$ vertices with all edges colored blue (*a blue $K_{\ell}$*). Ramsey numbers were introduced by Ramsey [Ram30] in 1930, and the problem of estimating their value has been one of the central questions in extremal combinatorics ever since, with particular attention paid to the *diagonal* Ramsey numbers $R(k,k)$.

Erdős and Szekeres [ES35] proved that $R(k,k)\leq 4^{k}$ in 1935. Despite a series of important improvements [Con09, Sah23, Tho88] throughout the years, based on the quasirandomness properties of colorings close to the upper bound, the first exponential improvement of the Erdős-Szekeres bound was obtained only very recently in a breakthrough result by Campos, Griffiths, Morris and Sahasrabudhe [CGMS23]. They proved that there exists $\varepsilon>0$ such that

$R(k,k)\leq(4-\varepsilon)^{k}$ | (1) |

for sufficiently large $k\in\mathbb{N}$. While they mention that they have not attempted to seriously optimize the value of $\varepsilon$, focusing instead on a relatively simple proof, they show that (1) is satisfied with $\varepsilon=2^{-7}$. For the off-diagonal Ramsey numbers, Campos, Griffiths, Morris and Sahasrabudhe [CGMS23] establish

$R(k,\ell)\leq e^{-\ell/400+o(k)}\binom{k+\ell}{\ell}$ | (2) |

for all $k,\ell\in\mathbb{N}$ with $\ell\leq k$, which again improves the best previously known bound [Sah23] by an exponential factor.

The approach of [CGMS23] departs from the previous methods and does not involve any notion of quasirandomness. Instead, it centres on the *book algorithm*, which works with a pair of disjoint sets of vertices $(X,Y)$ with high density of red edges between them. Perhaps the most technical part of the algorithm is the way the changes in this density are controlled, regulated by a certain discrete logarithmic gradation of the interval $[p,1]$, where $p$ is approximately the initial density.

In an attempt to clarify the extent of applications of the approach of [CGMS23], as well as the underlying optimization involved, we present a simpler proof of an improved upper bound on Ramsey numbers, based on a reinterpretation of their method. We don’t introduce any new combinatorial ideas. (In fact, we show that for $\ell<0.69k$ one can abandon most of them and get an exponential improvement using a bare minimum of ingredients from [CGMS23].) We do, however, change the way we package and optimize applications of these ideas, replacing the CGMS book algorithm by a fairly simple induction, perhaps more closely paralleling the original Erdős-Szekeres approach. In particular, we don’t explicitly bound allowed changes in density of red edges, eliminating the technicality mentioned above.

Using these modifications we obtain the following improved bounds.

In Section 2 we give a short proof of the following explicit bound on Ramsey numbers

$R(k,\ell)\leq 4(k+\ell)\left(\frac{\left(\sqrt{5}+1\right)(k+2\ell)}{4\ell}\right)^{\ell}\left(\frac{k+2\ell}{k}\right)^{k/2}$ | (3) |

for all positive integers $\ell\leq k$. (See Corollary 6.) The main idea of the proof is to inductively maintain a lower bound on the *excess* number of red edges between the two sets of vertices $X$ and $Y$ under consideration, as compared to some fixed density $p$, i.e. a bound on $e_{R}(X,Y)-p|X||Y|$ where $e_{R}(X,Y)$ denotes the number of red edges with one end in $X$ and another in $Y$.

It is not hard to verify that (3) gives an exponential improvement of the Erdős-Szekeres bound for large enough $\ell$ and $k$, as long as $\ell\leq 0.69k$. It is meaningful even in the $\ell=o(k)$ regime, while [CGMS23] focuses on the regime $\ell=\Theta(k)$. Moreover, the results of Section 2 straightforwardly extend to the multicolor Ramsey numbers, as shown in Section 5, while extending the whole book algorithm seems to require overcoming non-trivial technical obstacles.

In Section 3 we present our interpretation of the full power of the CGMS book algorithm, optimizing some of the parameters involved. We replace the quantity maintained inductively by a more involved one, which essentially corresponds to a higher moment of the above mentioned excess number of edges. In Section 4 we optimize the initial value of density of red edges at which we can start our induction and prove our main result, the following.

###### Theorem 1.

For all positive integers $\ell\leq k$

$R(k,\ell)\leq e^{G(\ell/k)k+o(k)}\binom{k+\ell}{\ell},$ |

where $G(\lambda)=\left(-0.25\lambda+0.03\lambda^{2}+0.08\lambda^{3}\right)e^{-\lambda}.$

In particular, Theorem 1 implies that

$R(k,k)\leq e^{-0.14e^{-1}k+o(k)}\binom{2k}{k}=(4e^{-0.14e^{-1}})^{k+o(k)}=(3.7992\ldots)^{k+o(k)},$ |

and, more generally, that

$R(k,\ell)\leq e^{-0.14e^{-1}\ell+o(k)}\binom{k+\ell}{\ell}\leq e^{-\ell/20+o(k)}\binom{k+\ell}{\ell},$ |

significantly improving (1) and (2).
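These constants are easy to sanity-check numerically. The following sketch (the names `G` and `base` are ours, not from the text) evaluates $G(1)=-0.14e^{-1}$ and the resulting base of the diagonal bound:

```python
import math

# Numeric sanity check (not part of the proof): evaluate the constants in the
# corollaries of Theorem 1.
def G(lam):
    return (-0.25 * lam + 0.03 * lam ** 2 + 0.08 * lam ** 3) * math.exp(-lam)

# Diagonal case: R(k,k) <= e^{G(1)k+o(k)} * binom(2k,k) = (4 e^{G(1)})^{k+o(k)},
# where G(1) = -0.14/e.
base = 4 * math.exp(G(1.0))  # = 3.7992..., the base quoted above
```

The check $0.14e^{-1}>1/20$ confirms the second displayed inequality.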

### 1.1. Notation

As mentioned earlier we denote by $e_{R}(X,Y)$ the number of red edges with one end in a set $X$ and another in $Y$. Given a color $C$ and a vertex $v$ of our graph, we denote by $N_{C}(v)$ the set of vertices $u$ such that the edge $uv$ is colored in color $C$. (Throughout the bulk of the paper we work with two colors: red denoted by $R$, and blue denoted by $B$, but in Section 5 we introduce additional colors.)

## 2. Easy bound far from the diagonal

In this section we present a very short inductive argument yielding an exponential improvement to the Erdős-Szekeres upper bound on $R(k,\ell)$ for $\ell<0.69k.$

The main object of our investigation is the same as in [CGMS23]: a pair of disjoint sets of vertices, the density of red edges between which is the critical parameter controlled during the inductive argument. We formalize the setting in the following definitions.

Let $X,Y$ be two non-empty disjoint subsets of vertices of a complete graph with edges colored red and blue. We say that $(X,Y)$ is a *candidate*. We say that a candidate $(X,Y)$ is *$(k,\ell,t)$-good* if $X\cup Y$ contains a red $K_{k}$ or $X$ contains a blue $K_{t}$ or $Y$ contains a blue $K_{\ell}$. The *density of $(X,Y)$* is $d(X,Y)=\frac{e_{R}(X,Y)}{|X||Y|}$. Let $f_{p}(X,Y)=e_{R}(X,Y)-p|X||Y|$ denote the excess number of red edges between $X$ and $Y$ when compared to density $p$. The following lemma shows that replacing $Y$ by $N_{R}(v)\cap Y$ for $v\in X$ on average reduces $f_{p}$ by a factor of at most $p$. It is essentially a restatement of [CGMS23, Observation 5.5].

###### Lemma 2.

Let $(X,Y)$ be a candidate. Then

$\sum_{v\in X}f_{p}(X,N_{R}(v)\cap Y)\geq p|X|f_{p}(X,Y).$ | (4) |

###### Proof.

We have

$\displaystyle\sum_{v\in X}f_{p}(X,N_{R}(v)\cap Y)-p|X|f_{p}(X,Y)$

$\displaystyle=\sum_{v\in X}\left(\sum_{y\in N_{R}(v)\cap Y}\left(|N_{R}(y)\cap X|-p|X|\right)\right)-p|X|e_{R}(X,Y)+p^{2}|X|^{2}|Y|$

$\displaystyle=\sum_{y\in Y}\left(\sum_{v\in N_{R}(y)\cap X}\left(|N_{R}(y)\cap X|-p|X|\right)-p|X||N_{R}(y)\cap X|+p^{2}|X|^{2}\right)$

$\displaystyle=\sum_{y\in Y}\left(|N_{R}(y)\cap X|-p|X|\right)^{2}\geq 0.$

∎
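Since the lemma holds for every coloring, it can be stress-tested numerically. The following sketch (the helper `check_lemma2` is our naming) verifies inequality (4) on random red-blue colorings between two vertex sets:

```python
import random

def check_lemma2(nx=8, ny=10, p=0.6, trials=50, seed=0):
    """Verify (4): sum over v in X of f_p(X, N_R(v) ∩ Y) >= p |X| f_p(X, Y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        # red[v][y] is True iff the edge between v in X and y in Y is red
        red = [[rng.random() < 0.55 for _ in range(ny)] for _ in range(nx)]

        def f_p(rows, cols):
            e_r = sum(red[v][y] for v in rows for y in cols)
            return e_r - p * len(rows) * len(cols)

        lhs = sum(f_p(range(nx), [y for y in range(ny) if red[v][y]])
                  for v in range(nx))
        rhs = p * nx * f_p(range(nx), range(ny))
        if lhs < rhs - 1e-9:
            return False
    return True
```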

Next we need the following form of the Erdős-Szekeres upper bound on Ramsey numbers.

###### Observation 3.

For all $0<x<1$ and all positive integers $k$ and $\ell$

$R(k,\ell)\leq x^{-k+1}(1-x)^{-\ell+1}.$ |

###### Proof.

By induction on $k+\ell$. If $k=1$ or $\ell=1$ the statement clearly holds, and the induction step follows, as $R(k,\ell)\leq R(k,\ell-1)+R(k-1,\ell).$∎
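The induction step is in fact matched with equality: dividing $x^{-k+1}(1-x)^{-\ell+2}+x^{-k+2}(1-x)^{-\ell+1}$ by $x^{-k+1}(1-x)^{-\ell+1}$ reduces it to $(1-x)+x=1$. A minimal numeric check (the function name `B` is ours):

```python
def B(k, l, x):
    """The bound x^{-k+1} (1-x)^{-l+1} of Observation 3 (our naming)."""
    return x ** (-k + 1) * (1 - x) ** (-l + 1)

# B(k, l, x) = B(k, l-1, x) + B(k-1, l, x) exactly, matching the Ramsey
# recursion R(k,l) <= R(k,l-1) + R(k-1,l) used in the induction step.
```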

The following lemma contains the main inductive argument we use to replace the book algorithm.

###### Lemma 4.

Let $0<x<p<1$, let $k,\ell$ and $t$ be positive integers, and let $(X,Y)$ be a candidate such that

$f_{p}(X,Y)\geq(k+t)x^{-k+1}(1-x)^{-\ell+1}(p-x)^{-t+1}$ | (5) |

then $(X,Y)$ is $(k,\ell,t)$-good.

###### Proof.

The proof is by induction on $k+t$. If $k=1$ or $t=1$ then every candidate is $(k,\ell,t)$-good. This implies the base case of induction and allows us to assume $k,t\geq 2$ in the induction step.

By Lemma 2 there exists $v\in X$ such that $f_{p}(X,N_{R}(v)\cap Y)\geq p\cdot f_{p}(X,Y)$. Let $Y^{\prime}=N_{R}(v)\cap Y$, $X_{B}=N_{B}(v)\cap X$, $X_{R}=N_{R}(v)\cap X$. If

$f_{p}(X_{R},Y^{\prime})\geq\frac{k+t-1}{k+t}xf_{p}(X,Y)\geq(k+t-1)x^{-(k-1)+1}(1-x)^{-\ell+1}(p-x)^{-t+1}$ |

then by the induction hypothesis $(X_{R},Y^{\prime})$ is $(k-1,\ell,t)$-good, i.e. $X_{R}\cup Y^{\prime}$ contains a red $K_{k-1}$ or $X_{R}$ contains a blue $K_{t}$ or $Y^{\prime}$ contains a blue $K_{\ell}$. In the first case, $X_{R}\cup Y^{\prime}\cup\{v\}\subseteq X\cup Y$ contains a red $K_{k}$, as $X_{R}\cup Y^{\prime}\subseteq N_{R}(v).$ It follows that in each case $(X,Y)$ is $(k,\ell,t)$-good, as desired, and so we may assume that $f_{p}(X_{R},Y^{\prime})<\frac{k+t-1}{k+t}xf_{p}(X,Y)$.

Symmetrically, we may assume that $f_{p}(X_{B},Y^{\prime})<\frac{k+t-1}{k+t}(p-x)f_{p}(X,Y)$. It follows that

$pf_{p}(X,Y)\leq f_{p}(X,Y^{\prime})=f_{p}(X_{R},Y^{\prime})+f_{p}(X_{B},Y^{\prime})+f_{p}(\{v\},Y^{\prime})<\frac{k+t-1}{k+t}pf_{p}(X,Y)+|Y|,$

and so $\frac{p}{k+t}f_{p}(X,Y)\leq|Y|$. Thus, by (5), and as $p(p-x)^{-t+1}\geq p(p-x)^{-1}>1$ for $t\geq 2$,

$|Y|\geq p\,x^{-k+1}(1-x)^{-\ell+1}(p-x)^{-t+1}\geq x^{-k+1}(1-x)^{-\ell+1}\geq R(k,\ell),$ |

where the last inequality uses Observation 3, implying that $(X,Y)$ is $(k,\ell,t)$-good.∎

Finally, we derive from Lemma 4 the promised bound on the Ramsey numbers by induction on $\ell$, applying Lemma 4 if the density of red edges is sufficiently high, and, otherwise, applying the induction hypothesis to the blue neighborhood of an arbitrary vertex.

###### Theorem 5.

For all $\frac{\sqrt{5}-1}{\sqrt{5}+1}<p<1$ and all positive integers $k$ and $\ell$

$R(k,\ell)\leq 4(k+\ell)\left(\frac{1+\sqrt{5}}{2}p+\frac{1-\sqrt{5}}{2}\right)^{-k/2}(1-p)^{-\ell}.$ |

###### Proof.

By induction on $\ell$. Note that the theorem trivially holds if $k=1$ or $\ell=1$. In particular, the base case $\ell=1$ trivially holds and we assume $k,\ell\geq 2$ in the induction step.

Let $x=\frac{1+\sqrt{5}}{2}p+\frac{1-\sqrt{5}}{2}>0$ and note that

$(1-p)^{2}=(1-x)(p-x).$ | (6) |

Consider a red-blue coloring of the edges of a complete graph on $n\geq 4(k+\ell)x^{-k/2}(1-p)^{-\ell}$ vertices. In particular, as $x\leq p$, we have

$n\geq 4(k+\ell)x^{-1}(1-p)^{-1}\geq\frac{4(k+\ell)}{p(1-p)}.$ | (7) |

If $|N_{B}(v)|\geq\frac{k+\ell-1}{k+\ell}(1-p)n\geq 4(k+\ell-1)x^{-k/2}(1-p)^{-\ell+1}$ for some vertex $v$ then the coloring of $N_{B}(v)$ contains a red $K_{k}$ or a blue $K_{\ell-1}$ by the induction hypothesis, and so our coloring contains a red $K_{k}$ or a blue $K_{\ell}$.

Thus we may assume that $|N_{B}(v)|<\frac{k+\ell-1}{k+\ell}(1-p)n$ for every $v$, i.e.

$|N_{R}(v)|>n-1-\frac{k+\ell-1}{k+\ell}(1-p)n=\left(p+\frac{(1-p)}{k+\ell}%\right)n-1.$ |

Let $(X,Y)$ be a uniformly random partition of the vertices of our graph. Then

$\mathbb{E}[e_{R}(X,Y)]=\frac{1}{4}\sum_{v}|N_{R}(v)|>\left(p+\frac{(1-p)}{k+\ell}\right)\frac{n^{2}}{4}-\frac{n}{4}.$ |

It follows that

$\displaystyle\mathbb{E}[f_{p}(X,Y)]\geq\mathbb{E}[e_{R}(X,Y)]-\frac{pn^{2}}{4}\geq\frac{(1-p)}{k+\ell}\frac{n^{2}}{4}-\frac{n}{4}$

$\displaystyle\geq\frac{(1-p)^{2}}{k+\ell}\frac{n^{2}}{4}\geq(k+\ell)x^{-k}(1-p)^{-2\ell+2}$

$\displaystyle=(k+\ell)x^{-k}(1-x)^{-\ell+1}(p-x)^{-\ell+1},$

where the third inequality holds by (7), and the last equality uses (6). Thus some choice of $(X,Y)$ satisfies (5) with $t=\ell$, and so is $(k,\ell,\ell)$-good by Lemma 4, implying that our coloring contains a red $K_{k}$ or a blue $K_{\ell}$, as desired.∎
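The golden-ratio identity (6) at the heart of this proof is easy to verify numerically; in the sketch below, `x_of` (our naming) computes $x$ from $p$:

```python
import math

s5 = math.sqrt(5)

def x_of(p):
    """The value x = (1+sqrt 5)/2 * p + (1-sqrt 5)/2 used in Theorem 5."""
    return (1 + s5) / 2 * p + (1 - s5) / 2

# Identity (6): (1-p)^2 = (1-x)(p-x), since 1-x = (1+sqrt5)/2 * (1-p) and
# p-x = (sqrt5-1)/2 * (1-p), and the two golden-ratio factors multiply to 1.
```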

Substituting

$p=\frac{(\sqrt{5}+1)k+(2\sqrt{5}-2)\ell}{\left(\sqrt{5}+1\right)(k+2\ell)}$ |

into Theorem 5 yields the following corollary, mentioned in the introduction.

###### Corollary 6.

For all positive integers $k\geq\ell$

$R(k,\ell)\leq 4(k+\ell)\left(\frac{\left(\sqrt{5}+1\right)(k+2\ell)}{4\ell}\right)^{\ell}\left(\frac{k+2\ell}{k}\right)^{k/2}.$ |
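One can check numerically that this substitution indeed turns the bound of Theorem 5 into the bound of Corollary 6 (the logarithms of the two bounds agree exactly); the function names below are ours:

```python
import math

s5 = math.sqrt(5)

def log_thm5_bound(k, l):
    """log of the Theorem 5 bound at the substituted value of p."""
    p = ((s5 + 1) * k + (2 * s5 - 2) * l) / ((s5 + 1) * (k + 2 * l))
    x = (1 + s5) / 2 * p + (1 - s5) / 2
    return math.log(4 * (k + l)) - k / 2 * math.log(x) - l * math.log(1 - p)

def log_cor6_bound(k, l):
    """log of the Corollary 6 bound."""
    return (math.log(4 * (k + l))
            + l * math.log((s5 + 1) * (k + 2 * l) / (4 * l))
            + k / 2 * math.log((k + 2 * l) / k))
```

The agreement reflects the computation $1-p=\frac{4\ell}{(\sqrt{5}+1)(k+2\ell)}$ and $x=\frac{k}{k+2\ell}$ for this choice of $p$.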

Let $ES(k,\ell)=\binom{k+\ell-2}{k-1}$ denote the Erdős-Szekeres upper bound on the Ramsey numbers. Then $ES(k,\ell)=e^{O(\log k)}\left(\frac{k+\ell}{k}\right)^{k}\left(\frac{k+\ell}{\ell}\right)^{\ell},$ and so Corollary 6 implies

$\displaystyle\frac{R(k,\ell)}{ES(k,\ell)}\leq e^{O(\log k)}\left(\frac{(\sqrt{5}+1)(k+2\ell)}{4(k+\ell)}\right)^{\ell}\left(\frac{(k+2\ell)k}{(k+\ell)^{2}}\right)^{k/2}.$

Thus Corollary 6 yields an exponential improvement of the Erdős-Szekeres bound whenever $\ell/k<0.6989$, and for $\ell=o(k)$ the improvement is of the order $e^{O(\log k)}\left(\frac{\sqrt{5}+1}{4}\right)^{\ell}<e^{-0.21\ell+O(\log k)}$. For comparison, the strongest bound for moderately small $\ell$ given in [CGMS23] is of the form

$R(k,\ell)\leq e^{-0.05\frac{k}{k+\ell}\cdot\ell+o(k)}ES(k,\ell)$ |

for $\ell\leq k/9$ (see [CGMS23, Theorem 9.1]). Meanwhile, in the same regime ($\ell\leq k/9$) Corollary 6 implies

$R(k,\ell)\leq e^{-0.16\ell+O(\log k)}ES(k,\ell).$ |
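The threshold $0.6989$ and the constant $-0.16$ can be confirmed numerically from the displayed ratio; in the sketch below, `exponent_rate` (our naming) is the limit of $\frac{1}{k}\log\frac{R\text{-bound}}{ES}$ with $\ell=\lambda k$:

```python
import math

s5 = math.sqrt(5)

def exponent_rate(lam):
    """(1/k) * log(bound / ES) as k -> infinity with l = lam * k, read off
    from the ratio displayed after Corollary 6."""
    return (lam * math.log((s5 + 1) * (1 + 2 * lam) / (4 * (1 + lam)))
            + 0.5 * math.log((1 + 2 * lam) / (1 + lam) ** 2))
```

The rate is negative at $\lambda=0.69$ and positive at $\lambda=0.71$, bracketing the stated threshold, and at $\lambda=1/9$ the rate per unit of $\ell$ is below $-0.16$.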

In the following section, using the full power of the book algorithm, we further improve on this bound and extend the improvement to the diagonal case.

## 3. Optimizing the book algorithm

In this section we present extensions of Lemma 4 and Theorem 5, which we use to give an upper bound on $R(k,\ell)$ that improves on the bounds from [CGMS23]. Just as in Lemma 4, the main graph-theoretic ideas are borrowed from [CGMS23], but now we need to refine our inductive statement and to use every trick in the book from [CGMS23], making the argument substantially more technical. Namely:

- •
instead of $e_{R}(X,Y)-p|X||Y|$ we lower bound “higher moments” of density

$(e_{R}(X,Y)-p|X||Y|)^{r}|X|^{1-r}|Y|^{1-r}$ in the regime $r\to\infty$,

- •
we implement the arguments corresponding to

*big blue steps* and *degree regularization* steps of the [CGMS23] book algorithm, which we did not need in Lemma 4.

Instead of Lemma 2 we will need the following variant of “the convexity of density” bound.

###### Lemma 7.

Let $(X,Y)$ be a candidate. Then

$\sum_{v\in X}d(X,N_{R}(v)\cap Y)|N_{R}(v)\cap Y|\geq e_{R}(X,Y)d(X,Y).$ | (8) |

###### Proof.

We have

$\displaystyle|X|\left(\sum_{v\in X}d(X,N_{R}(v)\cap Y)|N_{R}(v)\cap Y|\right)=\sum_{v\in X}\left(\sum_{y\in N_{R}(v)\cap Y}|N_{R}(y)\cap X|\right)$

$\displaystyle=\sum_{y\in Y}|N_{R}(y)\cap X|^{2}\geq|Y|\left(\frac{\sum_{y\in Y}|N_{R}(y)\cap X|}{|Y|}\right)^{2}$

$\displaystyle=\frac{(e_{R}(X,Y))^{2}}{|Y|}=|X|e_{R}(X,Y)d(X,Y),$

where the inequality holds by convexity of the square function.∎
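As with Lemma 2, the inequality holds for every coloring, so it can be stress-tested numerically; the helper `check_lemma7` below is our naming:

```python
import random

def check_lemma7(nx=7, ny=9, trials=50, seed=1):
    """Verify (8): sum over v of d(X, N_R(v) ∩ Y) |N_R(v) ∩ Y| >= e_R(X,Y) d(X,Y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        red = [[rng.random() < 0.5 for _ in range(ny)] for _ in range(nx)]
        e_r = sum(sum(row) for row in red)
        lhs = 0.0
        for v in range(nx):
            y_v = [y for y in range(ny) if red[v][y]]
            if y_v:
                e_sub = sum(red[u][y] for u in range(nx) for y in y_v)
                # d(X, N_R(v) ∩ Y) * |N_R(v) ∩ Y| = e_sub / |X|
                lhs += e_sub / nx
        rhs = e_r * (e_r / (nx * ny))  # e_R(X,Y) * d(X,Y)
        if lhs < rhs - 1e-9:
            return False
    return True
```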

The next lemma allows us to extract a large blue book from $X$ if we find sufficiently many vertices with large blue neighborhood. It is essentially [CGMS23, Lemma 4.1] with a different choice of parameters. The proof is exactly the same, but we reproduce it for completeness.

###### Lemma 8.

Let $0<\mu<1$, and let $b,k,m$ be positive integers with $m\geq 5\mu^{-1}b^{2}$. Let $(X,Y)$ be a candidate such that $|X|\geq 5m^{2}$, and there exist at least $R(k,m)$ vertices $v\in X$ such that

$|N_{B}(v)\cap X|\geq\mu|X|.$ |

Then $X$ contains a red $K_{k}$ or a blue book $(S,T)$ with $|S|\geq b$ and $|T|\geq\frac{\mu^{b}}{2}|X|$.

###### Proof.

Let $W$ be the set of vertices $v\in X$ such that $|N_{B}(v)\cap X|\geq\mu|X|.$ As $|W|\geq R(k,m)$, $W$ contains a red $K_{k}$ or a blue $K_{m}$. In the first case the lemma holds, so we assume the second, and let $U$ be the set of vertices of the blue $K_{m}$ in $W$. Let

$\sigma=\frac{e_{B}(U,X\setminus U)}{|U||X\setminus U|}\geq\frac{\mu|X|-m}{|X|}\geq\mu\left(1-\frac{1}{5b}\right).$ |

Let $S$ be a subset of $U$ of size $b$ chosen uniformly at random, and let $T=N_{B}(S)\cap(X\setminus U)$. Then

$\displaystyle\mathbb{E}[|T|]=\binom{m}{b}^{-1}\sum_{v\in X\setminus U}\binom{|N_{B}(v)\cap U|}{b}\geq\binom{m}{b}^{-1}\binom{\sigma m}{b}|X\setminus U|$

$\displaystyle\geq\sigma^{b}\exp\left(-\frac{b^{2}}{\sigma m}\right)|X\setminus U|\geq\mu^{b}\left(1-\frac{1}{5b}\right)^{b}e^{-1/5}\left(1-\frac{m}{|X|}\right)|X|$

$\displaystyle\geq\frac{4}{5}e^{-2/5}\mu^{b}|X|\geq\frac{\mu^{b}}{2}|X|.$

As $(S,T)$ is always a blue book, there exists a choice of $S$ for which $|T|\geq\frac{\mu^{b}}{2}|X|$, as desired.∎
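The binomial estimate $\binom{\sigma m}{b}\geq\sigma^{b}\exp\left(-\frac{b^{2}}{\sigma m}\right)\binom{m}{b}$ used in the display above can be checked numerically in the relevant parameter range, where $m$ is much larger than $b^{2}$; the helpers below are our naming:

```python
import math

def binom_ratio(sigma, m, b):
    """C(sigma * m, b) / C(m, b), computed as a product of falling factors."""
    r = 1.0
    for i in range(b):
        r *= (sigma * m - i) / (m - i)
    return r

def claimed_lower(sigma, m, b):
    """The lower bound sigma^b * exp(-b^2 / (sigma * m)) used in the proof."""
    return sigma ** b * math.exp(-b * b / (sigma * m))
```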

In Section 2 we used Observation 3 to upper bound Ramsey numbers in what was essentially the base case of the book algorithm, or, more precisely, the case when we consider a candidate $(X,Y)$ with $|X|$ small. In this section we tighten all aspects of our argument and, in particular, instead of Observation 3 we would like to apply increasingly stronger upper bounds of the same form, which we iteratively obtain.

We use the following slightly technical setup to encode the bounds we can use.Let $\mathcal{R}$ be the closure of the set of all pairs $(x,y)\in(0,1)^{2}$ such that there exists $N_{0}=N_{0}(x,y)$ such that $R(k,\ell)\leq x^{-k}y^{-\ell}$ for all $k+\ell\geq N_{0}$. Let $\mathcal{R_{*}}$ be the interior of $\mathcal{R}$.

The next easy observation records the properties of $\mathcal{R}$ and $\mathcal{R_{*}}$.

###### Observation 9.

- (1)
$(x,1-x)\in\mathcal{R}$ for all $0<x<1$,

- (2)
if $(x,y)\in\mathcal{R}$, $0<x^{\prime}\leq x,0<y^{\prime}\leq y$ then $(x^{\prime},y^{\prime})\in\mathcal{R}$,

- (3)
if $(x,y)\in\mathcal{R}$, $0<x^{\prime}<x,0<y^{\prime}<y$ then $(x^{\prime},y^{\prime})\in\mathcal{R}_{*}$,

- (4)
if $R(k,\ell)\leq x^{-k+o(k)}y^{-\ell+o(\ell)}$ for all positive integers $k$ and $\ell$ then $(x,y)\in\mathcal{R}$.

###### Proof.

Observation 3 implies (1). As $x^{-k}y^{-\ell}\leq(x^{\prime})^{-k}(y^{\prime})^{-\ell}$ for $0<x^{\prime}\leq x,0<y^{\prime}\leq y$, (2) holds and implies (3).

Finally, (4) holds as the condition $R(k,\ell)\leq x^{-k+o(k)}y^{-\ell+o(\ell)}$ implies that for all $x^{\prime}<x,y^{\prime}<y$ there exists $N_{0}$ such that for all positive integers $k$ and $\ell$ with $k+\ell\geq N_{0}$ we have $R(k,\ell)\leq(x^{\prime})^{-k}(y^{\prime})^{-\ell}$.∎

Finally, we need the following straightforward limit evaluation.

###### Lemma 10.

For all $1>p>\mu>0$

$\lim_{r\to\infty}(p^{1/r}-\mu)^{r}(1-\mu)^{1-r}=p^{\frac{1}{1-\mu}}(1-\mu)$ | (9) |

###### Proof.

We have

$\displaystyle\lim_{r\to\infty}\log\left((p^{1/r}-\mu)^{r}(1-\mu)^{-r}\right)=\lim_{r\to\infty}\left(r\log\left(1+\frac{p^{1/r}-1}{1-\mu}\right)\right)=\frac{\lim_{r\to\infty}\left(r(p^{1/r}-1)\right)}{1-\mu}=\frac{\log p}{1-\mu},$

implying (9).∎

We are now ready for the main technical result of this section, which tightens Lemma 4.

###### Lemma 11.

For all $0<\mu_{0},x_{0},y_{0},p<1$ such that $x_{0}<p^{\frac{1}{1-\mu_{0}}}(1-\mu_{0})$ and $(x_{0},y_{0})\in\mathcal{R_{*}}$ there exists $L_{0}$ such that for all positive integers $k,\ell,t$ with $\ell\geq L_{0}$ the following holds. Let $(X,Y)$ be a candidate such that $d(X,Y)\geq p$ and

$|X||Y|\geq x_{0}^{-k}y_{0}^{-\ell}\mu_{0}^{-t}$ | (10) |

Then $(X,Y)$ is $(k,\ell,t)$-good.

###### Proof.

We start by quantifying the “extra room” implicit in the strict inequality $x_{0}<p^{\frac{1}{1-\mu_{0}}}(1-\mu_{0})$ and the condition $(x_{0},y_{0})\in\mathcal{R_{*}}$. More precisely, using 10 and the condition $x_{0}<p^{\frac{1}{1-\mu_{0}}}(1-\mu_{0})$, which, in particular, implies $x_{0}+\mu_{0}<1$, as well as the conditions $(x_{0},y_{0})\in\mathcal{R_{*}}$, and $\mu_{0}<1$, we deduce that there exist $\varepsilon>0,r\geq 1$ such that

$(1+\varepsilon)(\mu_{0}+\varepsilon)\leq 1,\qquad\mu_{0}+x_{0}+2\varepsilon\leq 1,\qquad(x_{0}+2\varepsilon,y_{0}+2\varepsilon)\in\mathcal{R},\qquad p\geq 2\varepsilon,$ |

$\mathrm{and}\qquad x_{0}\leq((p-\varepsilon)^{1/r}-\mu_{0}-3\varepsilon)^{r}(1-\mu_{0}-\varepsilon)^{1-r}-\varepsilon.$ | (11) |

We choose $L_{0}$ implicitly to be sufficiently large as a function of $\varepsilon$ and $r$ to satisfy the inequalities throughout the proof. Let $x=x_{0}+\varepsilon$, $y=y_{0}+\varepsilon$, $\mu=\mu_{0}+\varepsilon$, and $\delta_{n}=\frac{\varepsilon}{n}$. Note that by (11) we have $x<1$ and so $x_{0}=x-\varepsilon\leq\frac{x}{1+\varepsilon}$. Similarly, $y_{0}\leq\frac{y}{1+\varepsilon}$ and $\mu_{0}\leq\frac{\mu}{1+\varepsilon}$. Note that

$\displaystyle(d(X,Y)+\delta_{k+t}-p)^{r}|X||Y|\stackrel{(10)}{\geq}\frac{\varepsilon^{r}}{(k+t)^{r}}x_{0}^{-k}y_{0}^{-\ell}\mu_{0}^{-t}\geq\frac{\varepsilon^{r}}{(k+t)^{r}}(1+\varepsilon)^{k+\ell+t}x^{-k}y^{-\ell}\mu^{-t}\geq x^{-k}y^{-\ell}\mu^{-t},$ | (12) |

where the last inequality holds as long as $L_{0}$ is sufficiently large as a function of $\varepsilon$ and $r$.

We will prove by induction on $k+t$ for fixed $\ell\geq L_{0}$ that if $(X,Y)$ is a candidate with $d(X,Y)\geq p-\delta_{k+t}$ such that

$(d(X,Y)+\delta_{k+t}-p)^{r}|X||Y|\geq x^{-k}y^{-\ell}\mu^{-t}$ | (13) |

then $(X,Y)$ is $(k,\ell,t)$-good. By (12) this implies the lemma.

If $k=1$ or $t=1$ then $(X,Y)$ is trivially $(k,\ell,t)$-good, implying, in particular the base case of our statement. Thus we move on to the induction step and assume $k,t\geq 2$.

We assume without loss of generality that $|N_{R}(v)\cap Y|\geq(p-\delta_{k+t})|Y|$ for every $v\in X$, as otherwise we can replace $X$ by $X\setminus\{v\}$, increasing the left side of (13), as

$(d(X,Y)+\delta_{k+t}-p)^{r}|X||Y|=(d(X,Y)+\delta_{k+t}-p)^{r-1}(e_{R}(X,Y)-(p-\delta_{k+t})|X||Y|),$ |

and both terms on the right side of this identity increase after the replacement. Thus

$d(X^{\prime},Y)\geq p-\delta_{k+t}$ | (14) |

for every $X^{\prime}\subseteq X$, $X^{\prime}\neq\emptyset$. For readers familiar with [CGMS23] we will indicate how the steps in our proof correspond to steps of their book algorithm. The observation in this paragraph corresponds to the degree regularization step of the book algorithm.

Next we use (13) to obtain a lower bound on $|X|$ by first upper bounding $|Y|$. If $|Y|\geq(x+\varepsilon)^{-k}(y+\varepsilon)^{-\ell}$ then $Y$ contains a red $K_{k}$ or a blue $K_{\ell}$, as $(x+\varepsilon,y+\varepsilon)\in\mathcal{R_{*}}$ by Observation 9(3), and so $(X,Y)$ is $(k,\ell,t)$-good. Thus we may assume that

$\displaystyle|X|\stackrel{(13)}{\geq}\frac{x^{-k}y^{-\ell}\mu^{-t}}{(d(X,Y)+\delta_{k+t}-p)^{r}|Y|}\geq\frac{x^{-k}y^{-\ell}\mu^{-t}}{(x+\varepsilon)^{-k}(y+\varepsilon)^{-\ell}}$

$\displaystyle\geq\left(\frac{x+\varepsilon}{x}\right)^{k}\left(\frac{y+\varepsilon}{y}\right)^{\ell}\mu^{-t}\geq(1+\varepsilon)^{k+\ell+t}.$ | (15) |

Our next goal is to show that most of the vertices of $X$ have blue degree at most $(\mu+\varepsilon)|X|$, as otherwise, using Lemma 8, we can apply the induction hypothesis to the candidate $(T,Y)$ where $(S,T)$ is a large blue book in $X$ guaranteed by Lemma 8 and (15). This part of the proof corresponds to the big blue step.

Let

$b=\left\lceil\frac{2r\log(k+t)-r\log{\varepsilon}+\log{2}}{\log(1+\varepsilon)}\right\rceil,\qquad m=\lceil 5\mu^{-1}b^{2}\rceil,\qquad\mathrm{and}\qquad w=R(k,m)$ |

with the first two being the parameters we will use in Lemma 8. Note that $m\leq C\log^{2}(k+t)$, where $C$ is a constant depending only on $r$ and $\varepsilon$. As $R(k,m)\leq k^{m}\leq\exp(C\log^{3}(k+t))$, by (15) we can choose $L_{0}$ large enough so that

$|X|\geq 5m^{2}\qquad\mathrm{and}\qquad w\leq\frac{\varepsilon(p-\varepsilon)}{(k+t)^{3}}|X|.$ | (16) |

Let $W=\{x\in X\>|\>|N_{B}(x)\cap X|\geq(\mu+\varepsilon)|X|\}$. Suppose first that $|W|\geq w$. Then by Lemma 8, $X$ contains a red $K_{k}$ or a blue book $(S,T)$ with $|S|\geq b$ and $|T|\geq\frac{(\mu+\varepsilon)^{b}}{2}|X|$. In the first case $(X,Y)$ is $(k,\ell,t)$-good, and thus we may assume that the second case holds. As $d(T,Y)\geq p-\delta_{k+t}$ by (14), we have $d(T,Y)+\delta_{k+t-b}-p\geq\delta_{k+t-1}-\delta_{k+t}\geq\frac{\varepsilon}{(k+t)^{2}}.$ Thus

$\displaystyle(d(T,Y)+\delta_{k+t-b}-p)^{r}|T||Y|\geq\left(\frac{\varepsilon}{(k+t)^{2}}\right)^{r}\frac{(\mu+\varepsilon)^{b}}{2}|X||Y|\stackrel{(13)}{\geq}\left(\frac{\varepsilon}{(k+t)^{2}}\right)^{r}\frac{(\mu+\varepsilon)^{b}}{2}x^{-k}y^{-\ell}\mu^{-t}$

$\displaystyle=\frac{1}{2}\left(\frac{\varepsilon}{(k+t)^{2}}\right)^{r}\left(\frac{\mu+\varepsilon}{\mu}\right)^{b}x^{-k}y^{-\ell}\mu^{-t+b}\geq x^{-k}y^{-\ell}\mu^{-t+b},$

where the last inequality holds by the choice of $b$ as

$\left(\frac{\mu+\varepsilon}{\mu}\right)^{b}\geq(1+\varepsilon)^{b}\geq 2\left(\frac{(k+t)^{2}}{\varepsilon}\right)^{r}.$ |

Thus $(T,Y)$ is $(k,\ell,t-b)$-good by the induction hypothesis, implying that $(X,Y)$ is $(k,\ell,t)$-good.

Thus we may assume that $|W|\leq w$. We now move on to the last part of the argument, corresponding to either the red step or the density increment step of the book algorithm, which are treated in the same way in our implementation.

Let $p^{\prime}=p-\delta_{k+t-1}\geq p-\varepsilon.$ By 7, we have

$\sum_{v\in X}d(X,N_{R}(v)\cap Y)|N_{R}(v)\cap Y|\geq e_{R}(X,Y)d(X,Y).$ | (17) |

As

$\sum_{v\in W}d(X,N_{R}(v)\cap Y)|N_{R}(v)\cap Y|\leq w|Y|\stackrel{(16)}{\leq}\frac{\varepsilon}{(k+t)^{3}}(p-\varepsilon)|X||Y|\leq\frac{\varepsilon}{(k+t)^{3}}e_{R}(X,Y)$ |

we have

$\sum_{v\in X\setminus W}d(X,N_{R}(v)\cap Y)|N_{R}(v)\cap Y|\geq\left(d(X,Y)-\frac{\varepsilon}{(k+t)^{3}}\right)e_{R}(X,Y).$ |

Thus there exists $v\in X$ such that $|N_{B}(v)\cap X|\leq(\mu+\varepsilon)|X|$ and

$d(X,N_{R}(v)\cap Y)\geq d(X,Y)-\frac{\varepsilon}{(k+t)^{3}}.$ | (18) |

Let

$X_{R}=N_{R}(v)\cap X,\qquad X_{B}=N_{B}(v)\cap X,\qquad Y^{\prime}=N_{R}(v)\cap Y,$

$\alpha=d(X,Y^{\prime})-p^{\prime},\qquad\alpha_{R}=d(X_{R},Y^{\prime})-p^{\prime},\qquad\alpha_{B}=d(X_{B},Y^{\prime})-p^{\prime}.$

Note that $|Y^{\prime}|\geq(p-\varepsilon)|Y|$ by (14), and

$\displaystyle(\alpha_{R}|X_{R}|+\alpha_{B}|X_{B}|+1)|Y^{\prime}|=(e_{R}(X_{R},Y^{\prime})+e_{R}(X_{B},Y^{\prime})+e_{R}(\{v\},Y^{\prime}))-p^{\prime}(|X_{R}|+|X_{B}|)|Y^{\prime}|$

$\displaystyle\geq e_{R}(X,Y^{\prime})-p^{\prime}|X||Y^{\prime}|=\alpha|X||Y^{\prime}|,$

implying

$\frac{\alpha_{R}}{\alpha}\frac{|X_{R}|}{|X|}+\frac{\alpha_{B}}{\alpha}\frac{|X_{B}|}{|X|}+\frac{1}{\alpha|X|}\geq 1.$ | (19) |

If $\alpha_{R}\geq 0$ and $\alpha_{R}^{r}|X_{R}|\geq\frac{x\alpha^{r}|X|}{p-\varepsilon}$ then

$\displaystyle(d(X_{R},Y^{\prime})-p^{\prime})^{r}|X_{R}||Y^{\prime}|=\alpha_{R}^{r}|X_{R}||Y^{\prime}|\geq x(d(X,Y^{\prime})-p^{\prime})^{r}|X|\frac{|Y^{\prime}|}{p-\varepsilon}$

$\displaystyle\stackrel{(18)}{\geq}x\left(d(X,Y)-\frac{\varepsilon}{(k+t)^{3}}+\delta_{k+t-1}-p\right)^{r}|X||Y|\geq x(d(X,Y)+\delta_{k+t}-p)^{r}|X||Y|\stackrel{(13)}{\geq}x^{-k+1}y^{-\ell}\mu^{-t}.$

This implies that $(X_{R},Y^{\prime})$ is $(k-1,\ell,t)$-good by the induction hypothesis, and so $(X,Y)$ is $(k,\ell,t)$-good.

Thus we may assume that either $\alpha_{R}<0$ or $\alpha_{R}^{r}|X_{R}|<\frac{x\alpha^{r}|X|}{p-\varepsilon}$, i.e.

$\frac{\alpha_{R}}{\alpha}<x^{1/r}(p-\varepsilon)^{-1/r}\left(\frac{|X|}{|X_{R}|}\right)^{1/r}.$ |

Symmetrically, we also have

$\frac{\alpha_{B}}{\alpha}<\mu^{1/r}(p-\varepsilon)^{-1/r}\left(\frac{|X|}{|X_{B}|}\right)^{1/r}.$ |

It follows from (19) that

$x^{1/r}\left(\frac{|X_{R}|}{|X|}\right)^{1-1/r}+\mu^{1/r}\left(\frac{|X_{B}|}{|X|}\right)^{1-1/r}+\frac{(p-\varepsilon)^{1/r}}{\alpha|X|}>(p-\varepsilon)^{1/r}.$ | (20) |

The remainder of the proof is occupied with showing that (20) contradicts (11). As $\frac{|X_{R}|}{|X|}+\frac{|X_{B}|}{|X|}\leq 1$ and $\frac{|X_{B}|}{|X|}\leq\mu+\varepsilon$, and since the function $\lambda\to x^{1/r}(1-\lambda)^{1-1/r}+\mu^{1/r}\lambda^{1-1/r}$ increases for $0\leq\lambda\leq\frac{\mu}{\mu+x}$ with $\mu+\varepsilon\leq\frac{\mu}{\mu+x}$, we can upper bound the left side of (20) by replacing $\frac{|X_{B}|}{|X|}$ with $\mu+\varepsilon$ and $\frac{|X_{R}|}{|X|}$ with $1-\mu$, implying

$x^{1/r}(1-\mu)^{1-1/r}+\mu+\varepsilon+\frac{(p-\varepsilon)^{1/r}}{\alpha|X|}\geq(p-\varepsilon)^{1/r}.$ | (21) |

Note that

$\displaystyle\alpha=d(X,Y^{\prime})-p^{\prime}\stackrel{(18)}{\geq}d(X,Y)-\frac{\varepsilon}{(k+t)^{3}}+\delta_{k+t-1}-p$

$\displaystyle\geq\delta_{k+t-1}-\delta_{k+t}-\frac{\varepsilon}{(k+t)^{3}}=\frac{\varepsilon}{(k+t)(k+t-1)}-\frac{\varepsilon}{(k+t)^{3}}\geq\frac{\varepsilon}{(k+t)^{2}},$

and so

$\frac{(p-\varepsilon)^{1/r}}{\alpha|X|}\stackrel{(15)}{\leq}\frac{(k+t)^{2}}{\varepsilon(1+\varepsilon)^{k+\ell+t}}\leq\varepsilon,$ |

where the last inequality holds whenever $L_{0}$ (and thus $\ell$) is large enough as a function of $\varepsilon$. It follows that (21) in turn implies $x^{1/r}(1-\mu)^{1-1/r}+\mu+2\varepsilon\geq(p-\varepsilon)^{1/r}$, i.e.

$x\geq((p-\varepsilon)^{1/r}-\mu-2\varepsilon)^{r}(1-\mu)^{1-r}$ |

contradicting (11).∎
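The monotonicity claim used to pass from (20) to (21), namely that $\lambda\to x^{1/r}(1-\lambda)^{1-1/r}+\mu^{1/r}\lambda^{1-1/r}$ increases on $[0,\frac{\mu}{\mu+x}]$, can be checked numerically; its derivative $(1-\frac{1}{r})(\mu^{1/r}\lambda^{-1/r}-x^{1/r}(1-\lambda)^{-1/r})$ vanishes exactly at $\lambda=\frac{\mu}{\mu+x}$. The helpers below are our naming:

```python
def g(lam, x, mu, r):
    """The function lambda -> x^{1/r}(1-lambda)^{1-1/r} + mu^{1/r} lambda^{1-1/r}."""
    return (x ** (1 / r) * (1 - lam) ** (1 - 1 / r)
            + mu ** (1 / r) * lam ** (1 - 1 / r))

def increasing_up_to_critical(x, mu, r, n=10):
    """Check monotonicity of g on a grid in (0, mu/(mu+x)]."""
    crit = mu / (mu + x)
    vals = [g(crit * i / n, x, mu, r) for i in range(1, n + 1)]
    return all(a < b for a, b in zip(vals, vals[1:]))
```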

The main result of this section follows immediately from Lemma 11.

###### Theorem 12.

For all $0<\mu,x,y,p<1$ such that $x<p^{\frac{1}{1-\mu}}(1-\mu)$ and $(x,y)\in\mathcal{R_{*}}$ there exists $L_{0}$ such that for all positive integers $k,\ell$ with $\ell\geq L_{0}$ the following holds. Every red-blue coloring of the edges of the complete graph on $N\geq x^{-k/2}(\mu y)^{-\ell/2}$ vertices with density of red edges at least $p$ contains a red $K_{k}$ or a blue $K_{\ell}$.

###### Proof.

Let $y_{0}>y$ be such that $(x,y_{0})\in\mathcal{R_{*}}$ and let $L_{0}$ be chosen sufficiently large to satisfy the requirements below, including the condition $(y_{0}/y)^{\ell/2}\geq 3$ for $\ell\geq L_{0}$. Then $N\geq x^{-k/2}(\mu y)^{-\ell/2}\geq 2x^{-k/2}(\mu y_{0})^{-\ell/2}+1$, and so $\lfloor N/2\rfloor\geq x^{-k/2}(\mu y_{0})^{-\ell/2}$. Let $X$ and $Y$ be two disjoint subsets of vertices of our graph, each of size $\lfloor N/2\rfloor$, chosen to maximize $d(X,Y)$. Then $d(X,Y)\geq p$ and $|X||Y|=(\lfloor N/2\rfloor)^{2}\geq x^{-k}y_{0}^{-\ell}\mu^{-\ell}$. The theorem follows by applying Lemma 11 to the candidate $(X,Y)$ with $x_{0}=x$ and $\mu_{0}=\mu$.∎

## 4. Optimizing descent to a candidate

In this section we derive our main result from Theorem 12 in the same manner that Theorem 5 was derived from Lemma 4 in Section 2: by using induction on $\ell$, with the induction hypothesis applied to the blue neighborhood of a vertex if the density of red edges is insufficiently high to apply Theorem 12.

We express our upper bound on the Ramsey numbers $R(k,\ell)$ with $\ell\leq k$ in the form $e^{F(\ell/k)k+o(k)}$ and seek to minimize $F$. As $e^{F((\ell-1)/k)k+o(k)}/e^{F(\ell/k)k+o(k)}\approx e^{-F^{\prime}(\ell/k)},$ we can use the induction hypothesis if the density of blue edges is about $e^{-F^{\prime}(\ell/k)}$. Thus we will be applying Theorem 12 when the density of red edges is at least $1-e^{-F^{\prime}(\ell/k)}$. The parameters $x,y$ and $\mu$ are then chosen as functions of $\ell/k$ to satisfy the requirements of Theorem 12. We formalize this strategy in the following theorem.

###### Theorem 13.

Let $F:(0,1]\to\mathbb{R}_{+}$ be smooth and let $M,X,Y:(0,1]\to(0,1)$ be such that

$\displaystyle F^{\prime}(\lambda)>0,\qquad X(\lambda)=(1-e^{-F^{\prime}(\lambda)})^{\frac{1}{1-M(\lambda)}}(1-M(\lambda)),\qquad(X(\lambda),Y(\lambda))\in\mathcal{R},$ | |

$and\phantom{\rule{0ex}{0ex}}F(\lambda )>-{\displaystyle \frac{}{}}$ |