Warning: this is an htmlized version!
The original is here, and the conversion rules are here. |
% (find-angg "LATEX/2010diags.tex") % (find-angg ".emacs" "idct") % (find-dn4ex "edrx08.sty") % (find-angg ".emacs.templates" "s2008a") % (find-angg ".zshrc" "makebbl") % (defun b () (interactive) (find-zsh "makebbl 2010diags.bbl catsem,filters")) % (defun b () (interactive) (find-zsh "bibtex 2010diags")) % (defun b () (interactive) (find-zsh "bibtex 2010diags; makeindex 2010diags")) % (defun c () (interactive) (find-zsh "cd ~/LATEX/ && ~/dednat4/dednat41 2010diags.tex && latex 2010diags.tex")) % (defun c () (interactive) (find-zsh "cd ~/LATEX/ && ~/dednat4/dednat41 2010diags.tex && latex 2010diags.tex" 1 '(eek "M->"))) % (defun c () (interactive) (find-zsh "cd ~/LATEX/ && ~/dednat4/dednat41 2010diags.tex && pdflatex 2010diags.tex")) % (eev "cd ~/LATEX/ && Scp 2010diags.{dvi,pdf} [email protected]:slow_html/LATEX/") % (defun d () (interactive) (find-xdvipage "~/LATEX/2010diags.dvi")) % (find-xdvipage "~/LATEX/2010diags.dvi") % % (setq b-cmd "bibtex 2010diags") % (setq c-cmd "~/dednat4/dednat41 2010diags.tex && latex 2010diags.tex") % (setq cbcbc-cmd (concat c-cmd " && " b-cmd " && " c-cmd " && " b-cmd " && " c-cmd)) % (defun bc () (interactive) (find-zsh cbcbc-cmd 1 '(eek "M->"))) % % (find-pspage "~/LATEX/2010diags.ps") % (find-pspage "~/LATEX/2010diags.pdf") % (find-xpdfpage "~/LATEX/2010diags.pdf") % (find-man "dvipdf") % (find-zsh0 "cd ~/LATEX/ && dvipdf 2010diags.dvi 2010diags.pdf") % (find-zsh0 "cd ~/LATEX/ && dvips -D 300 -o 2010diags.ps 2010diags.dvi") % (find-zsh0 "cd ~/LATEX/ && dvips -D 600 -P pk -o 2010diags.ps 2010diags.dvi && ps2pdf 2010diags.ps 2010diags.pdf") % (find-zsh0 "cd ~/LATEX/ && dvips -D 300 -o tmp.ps tmp.dvi") % (find-pspage "~/LATEX/tmp.ps") % (ee-cp "~/LATEX/2010diags.pdf" (ee-twupfile "LATEX/2010diags.pdf") 'over) % (ee-cp "~/LATEX/2010diags.pdf" (ee-twusfile "LATEX/2010diags.pdf") 'over) % (find-twusfile "LATEX/" "2010diags") % http://angg.twu.net/LATEX/2010diags.pdf % «.mental-space» (to "mental-space") % «.projs-and-lifts» (to "projs-and-lifts") % «.dn-types» (to "dn-types") % «.dictionary» (to "dictionary") % «.internal» (to "internal") % «.parallel-notations» (to "parallel-notations") % «.functors» (to "functors") % «.nts» (to "nts") % «.adjunctions» (to "adjunctions") % «.fnta» (to "fnta") % «.transmission» (to "transmission") % «.intuition» (to "intuition") % «.hyps» (to "hyps") % «.pres-frob-bcc» (to "pres-frob-bcc") % «.eq-elim» (to "eq-elim") % «.archetypal» (to "archetypal") % «.CCCs» (to "CCCs") % «.olts» (to "olts") % «.database» (to "database") % «.epilogue» (to "epilogue") % «.placement» (to "placement") % «.obvious-and-readable» (to "obvious-and-readable") % «.generalization» (to "generalization") % «.cat-diags-as-trees» (to "cat-diags-as-trees") % «.cat-sem-diag» (to "cat-sem-diag") % «.reading» (to "reading") % \documentclass[oneside]{book} \documentclass[oneside]{article} \usepackage[latin1]{inputenc} \usepackage[colorlinks]{hyperref} % (find-es "tex" "hyperref") \usepackage{breakurl} \usepackage{edrx08} % (find-dn4ex "edrx08.sty") %L process "edrx08.sty" -- (find-dn4ex "edrx08.sty") \input edrxheadfoot.tex % (find-dn4ex "edrxheadfoot.tex") \begin{document} \input 2010diags.dnt %* % (eedn4-51-bounded) %Index of the slides: %\msk % To update the list of slides uncomment this line: %\makelos{tmp.los} % then rerun LaTeX on this file, and insert the contents of "tmp.los" % below, by hand (i.e., with "insert-file"): % (find-fline "tmp.los") % (insert-file "tmp.los") % For "book": \def\mychapter #1#2{\chapter{#1}\label{#2}} \def\mysection #1#2{\section{#1}\label{#2}} \def\mysubsection#1#2{\subsection{#1}\label{#2}} % For "article": \def\mychapter #1#2{\section{#1}\label{#2}} \def\mysection #1#2{\subsection{#1}\label{#2}} \def\mysubsection#1#2{\subsubsection{#1}\label{#2}} \def\mysection #1#2{\section{#1}\label{#2}} \def\Setito{\Set^{\ito}} \def\Setmto{\Set^{\monicto}} \def\Pred{�{Pred}} \def\Pred{�{Pred}} \def\cob{�{c.o.b.}} \def\EqE{{=}E} \def\TB{�\!_{B}} \def\TAB{�\!_{A×B}} \def\SDTB{Æ_\DD\!\TB} \def\pip { \pi'} \def\opip {\ov\pi'} \def\pipstar { \pi^{\prime*}} \def\opipstar{\ov\pi^{\prime*}} \def\pistar { \pi^{*}} \def\opistar {\ov\pi^{*}} \def\opi {\ov\pi} \def\EqEdomain{\opipstarÆ_\DD\TB \land \opistar\dd^*Q} \def\EqEL {\opipstarÆ_\DD\TB} \def\EqER {\opistar\dd^*Q} \def\EqEdomthin{\EqEL{∧}\EqER} \def\EqEdomwide{\EqEL\land\EqER} \def\fdiag#1{\fbox{\!\!\!$\diag{#1}$}} \def\fdiag#1{\fbox{$\diag{#1}$}} \def\ffdiag#1#2{\begin{tabular}{l}#2\\\fbox{$\diag{#1}$}\end{tabular}} \def\fdiag#1{\ffdiag{#1}{foo}} \def\ctabular#1{\begin{tabular}{c}#1\end{tabular}} \def\ltabular#1{\begin{tabular}{l}#1\end{tabular}} % \def\fdiagwithboxest#1#2{\ltabular{#2 \\ \fdiagwithboxes{#1}}} \def\fdiagest#1#2{\ltabular{#2 \\ \fdiag{#1}}} \def\fdiag#1{\fbox{$\diag{#1}$}} \def\sqrtdot{\sqrt{\,�\,}} \def\tondot{\ton{.\;}} \def\corn#1{\ulcorner#1\urcorner} %:*×*{×}* %:*=*{=}* %:*&*{∧}* %:*∧*{∧}* %:*⊃*{⊃}* % :*g^*f^**g^*\!f^** % Internal Diagrams in Category Theory % % Eduardo Ochs, 2010 % % Version: 2010oct21 \title{Internal Diagrams in Category Theory} \author{Eduardo Ochs (LLaRC, PURO, UFF) \\ {\tt [email protected]} \\ {\url{http://angg.twu.net/}}} \date{2010nov01} \maketitle \begin{abstract} We can regard operations that discard information, like specializing to a particular case or dropping the intermediate steps of a proof, as {\sl projections}, and operations that reconstruct information as {\sl liftings}. By working with several projections in parallel we can make sense of statements like ``$\Set$ is the archetypal Cartesian Closed Category'', which means that proofs about CCCs can be done in the ``archetypal language'' and then lifted to proofs in the general setting. The method works even when our archetypal language is diagrammatical, has potential ambiguities, is not completely formalized, and does not have semantics for all terms. We illustrate the method with an example from hyperdoctrines and another from synthetic differential geometry. \end{abstract} \bsk {\footnotesize (Submitted to the Special Issue on Categorical Logic of {\sl Logica Universalis}. The text may be considered in final form, modulo a missing ``thanks'' section and corrections and suggestions from the referees, but there are still many formatting adjustments to be made... This paper is not in the {\tt birkjour} format yet!)} % http://article.gmane.org/gmane.science.mathematics.categories/5991 \bsk % -------------------- % «mental-space» (to ".mental-space") % (sec "Mental Space and Diagrams" "mental-space") \mysection {Mental Space and Diagrams} {mental-space} {\sl My memory is limited, and not very dependable: I often have to rededuce results to be sure of them, and I have to make them fit in as little ``mental space'' as possible...} Different people have different measures for ``mental space''; someone with a good algebraic memory may feel that an expression like % $\Frobnat: Æ_f(P∧f^*Q) \xton{\cong} Æ_fP∧Q$ % % % $$Æ_f(P∧f^*Q) \xton{\cong} Æ_fP∧Q$$ % is easy to remember, while I % % --- let's construct a semi-fictional ``I'' on the course of this % --- paper, and analyze how this ``I'' thinks --- again: while I % always think diagramatically, and so what I do is that I remember this diagram, $$\includegraphics[scale=0.5]{frob-sketch.eps}$$ % {\myttchars % \footnotesize % \begin{verbatim} % =========> % | / | % /====> - O % || \ | % \========= % % ---------> % \end{verbatim} % } \noindent and I reconstruct the formula from it. \msk I cannot yet define precisely what it is to ``think diagramatically'', but for the purposes of this paper a loose definition --- or rather, a set of concepts and techniques for diagrammatical thinking, plus non-trivial consequences of ``thinking'' in that way --- shall do. I will resort to a narrative device: {\sl I will speak as from a semi-fictional ``I'' who thinks as diagramatically as possible, and who always uses diagrams as a help for his bad algebraic thinking.} % -------------------- % «projs-and-lifts» (to ".projs-and-lifts") % (sec "Projections and Liftings" "projs-and-lifts") \mysection {Projections and Liftings} {projs-and-lifts} Take the concept of ``projection'', from the realm of covering spaces or from Linear Algebra. A projection map discards information --- coordinates, maybe --- and may collapse objects that were originally distinct. % When I represent projections diagrammatically I try to organize my diagrams to make the projection arrows --- most of them, at least --- go down. Figure [2-100-99] is an example. {\sl Specialization} acts like a projection. Look at the top arrow in Fig.\ 1: except for the choice of a particular value for $n$, we have only lost information. {\sl Discarding intermediate steps}, as in the bottom arrow, is a kind of {\sl erasing}, and erasings are evidently projections. % (find-854 "" "generalization") % (find-854page 14 "generalization") \def\twoninenyninelemma#1#2{ \fbox{ $\begin{array}{rcl} 2^{#2}-2^{#1} &=& 2^{1+#1}-2^{#1} \\ % &=& 2^{1}�2^{#1}-2^{#1} \\ &=& 2�2^{#1}-1�2^{#1} \\ &=& (2-1)�2^{#1} \\ % &=& 1�2^{#1} \\ &=& 2^{#1} \\ \end{array} $} } %:*&*&* %D diagram 2^100-2^99 %D 2Dx 100 %D 2D 100 proof-n %D 2D %D 2D +50 proof-99 %D 2D %D 2D +40 statement-99 %D 2D %D (( %D proof-n .tex= \twoninenyninelemma{n}{n+1} %D proof-99 .tex= \twoninenyninelemma{99}{100} %D statement-99 .tex= \fbox{$\begin{array}{rcl}2^{100}-2^{99}&=&2^{99}\end{array}$} %D @ 0 @ 1 -> %D @ 1 @ 2 -> %D )) %D enddiagram %D $$\diag{2^100-2^99}$$ $$\text{Figure [2-100-99]}$$ %:*&*{∧}* The opposite of {\sl to project} is {\sl to lift}; projecting is easy, lifting is hard. A projection map, $p: E \to B$, may have any number of preimages for a point $b$ in the base space --- one, many, none ---, and, to make things worse, we will often be interested in very specific cases where we are somehow able to recognize whether a given lifting is ``good'' or not --- for example, we may be looking for the ``right'' generalization for $2^{100}-2^{99} = 2^{99}$ ---, but where we are unable to define exactly what a ``good lifting'' is. % -------------------- % «dn-types» (to ".dn-types") % (sec "Downcased Types" "dn-types") \mysection {Downcased Types} {dn-types} Suppose that we have a point $p�A×B$ and a function $f:B \to C$. There is a single ``obvious'' way to build a point of $C$ starting from this data. It is ``obvious'' because we have a search method that finds it; it is related to proof search ({\sl via} the Curry-Howard isomorphism). Here's how it works, briefly and informally. \widemtos A point $p�A×B$ is a pair made of an `$a$' and a `$b$'. If we allow ``long names'' for variables we can replace the `$p$' by another name, that reflects its type more closely; so let's rename `$p$' to `$a,b$'. Similarly, an $f:B \to C$ is an operation that takes each `$b$' to a `$c$', so let's rename `$f$' to `$b \mto c$'. Now, with this new notation, we are looking for an operation that takes an `$a,b$' and a `$b \mto c$' and produces a term that ``deserves the name'' `$c$'. We start at the tree in the lower right rectangle in Figure [Two], and we want to arrive at the tree in the lower left; both trees have the same shape, and by relating them we will see that the term corresponding to $c$ is $f('p)$. Note that in this new language --- ``Downcased Types'' --- we have no syntactical distinction between variables and non-atomic terms. % (find-854 "" "standard-erasings") % (find-854page 13 "standard-erasings") % (find-854 "" "generalization") % (find-854page 14 "generalization") %L forths["<.>"] = function () pusharrow("<.>") end %L forths["<-->"] = function () pusharrow("<-->") end %L forths["|-->"] = function () pusharrow("|-->") end % This gives us a way to draw all the "standard % erasings" of a tree from a single tree definition. % Low-level words: % \def\archfull#1#2#3{#1\equiv #2:#3} \def\ARCHFULL {\let\arch=\archfull} \def\archdnc#1#2#3{#1} \def\ARCHDNC {\let\arch=\archdnc} \def\archtermtype#1#2#3{#2:#3} \def\ARCHTERMTYPE{\let\arch=\archtermtype} \def\archterm#1#2#3{#2} \def\ARCHTERM {\let\arch=\archterm} \def\archtype#1#2#3{#3} \def\ARCHTYPE {\let\arch=\archtype} \def\rY#1{#1} \def\RY {\let\r=\rY} \def\rN#1{} \def\RN {\let\r=\rN} \def\ptypeY#1{#1} \def\PTYPEY {\let\ptype=\ptypeY} \def\pytpeN#1{} \def\PTYPEN {\let\ptype=\pytpeN} % % High-level words: % \def\FULLRP{\ARCHFULL\RY\PTYPEY} \def\FULLR {\ARCHFULL\RY\PTYPEN} \def\FULLP {\ARCHFULL\RN\PTYPEY} \def\FULL {\ARCHFULL\RN\PTYPEN} \def\TERMTYPERP{\ARCHTERMTYPE\RY\PTYPEY} \def\TERMTYPER {\ARCHTERMTYPE\RY\PTYPEN} \def\TERMTYPEP {\ARCHTERMTYPE\RN\PTYPEY} \def\TERMTYPE {\ARCHTERMTYPE\RN\PTYPEN} \def\TERMRP{\ARCHTERM\RY\PTYPEY} \def\TERMR {\ARCHTERM\RY\PTYPEN} \def\TERMP {\ARCHTERM\RN\PTYPEY} \def\TERM {\ARCHTERM\RN\PTYPEN} \def\TYPER {\ARCHTYPE\RY\PTYPEN} \def\TYPE {\ARCHTYPE\RN\PTYPEN} \def\DNCR {\ARCHDNC\RY\PTYPEN} \def\DNC {\ARCHDNC\RN\PTYPEN} %: %: \arch{a,b}{p}{A×B} %: ------------------\r{'} %: \arch{b}{'p}{B} \arch{b|->c}{f}{B->C} %: -------------------------------------\r{\app} %: \arch{c}{f('p)}{C} %: %: ^archetypal-deriv %: %: %: \arch{a,b}{p}{A×B} \arch{b|->c}{f}{B->C} %: ========================================== %: \arch{c}{f('p)}{C} %: %: ^archetypal=deriv %: %D diagram archderivs2 %D 2Dx 100 +70 +05 +45 +35 %D 2D 100 \foo{\FULLRP} %D 2D %D 2D +50 \foo{\DNCR} \foo{\TERMTYPERP} %D 2D %D 2D +50 \foo{\DNC} \foo{\TERMR} \foo{\TYPE} %D 2D %D 2D +45 \fooo{\DNC} \fooo{\TERMR} \fooo{\TYPE} %D 2D %D (( \foo{\DNCR} x+= 15 %D # \foo{\TERMR} y+= -10 %D # \fooo{\TERMR} y+= -10 %D %D \foo{\FULLRP} \foo{\TERMTYPERP} -> %D \foo{\FULLRP} \foo{\DNCR} -> %D \foo{\TERMTYPERP} \foo{\TERMR} -> %D \foo{\TERMTYPERP} \foo{\TYPE} -> %D \foo{\DNCR} \foo{\DNC} -> %D \foo{\TYPE} \foo{\DNC} <--> .slide= 30pt %D %D \foo{\TERMR} \fooo{\TERMR} -> %D \foo{\TYPE} \fooo{\TYPE} -> %D \foo{\DNC} \fooo{\DNC} -> %D \fooo{\TYPE} \fooo{\DNC} <--> .slide= 25pt %D )) %D enddiagram %D $$\def\foo#1{#1\fcded{archetypal-deriv}} \def\fooo#1{#1\fcded{archetypal=deriv}} \diag{archderivs2} $$ $$\text{Figure [Two]}$$ All the solid arrows in Figure [Two] are erasings. The dashed arrows are ``uppercasings'' when we go rightwards, ``downcasings'' when we go to the left; note that `$×$' is downcased to `$,$', and `$\to$' is downcased to `$\mto$'. If we start at the lower left and we move right through the dashed arrow, we get the types of the (three) objects involved; what we still need to do from there is a kind of ``term inference''... ``Type inference'' is very well-known, and polymorphic languages like Haskell and ML implement algorithms for it; ``term inference'', on the other hand, is rarely mentioned in the literature, but techniques like parametricity (\cite{WadlerTfF}, \cite{Bernardy10}) provide useful meta-theorems about properties that all possible inferrable terms must obey. It would be lovely if these techniques could do term inference for us, algorithmically --- but they can't, so when we need to infer terms we usually do the work by hand. As term inference is hard, let's turn our attention to something easier, and looser. ``Inference'' carries the connotation of something that can be done algorithmically, without any previous knowledge of the result. We will focus on the process of ``reconstructing'' the desired term, $c \equiv f('p)$, from the downcased tree in the lower left of Figure [Two]. A ``reconstruction'' may need hints to be completed, and may depend on making the right choices at some points, motivated by unformalizable bits of common sense, or by intuition, hindsight, or by vague rememberances, by a sense of good style, or whatever else. A ``reconstruction'' is the result of an {\sl incomplete} algorithm (or a ``method'', rather than an ``algorithm''); when we perfect a method for reconstruction, and finish filling up all its gaps, it becomes an ``inference algorithm''. For our purposes, ``reconstruction'' will be enough --- and real ``inference'' will be close to impossible. % Amazingly, for our purposes, ``reconstruction'' will be enough --- % and real ``inference'' seems to be impossible. % -------------------- % «dictionary» (to ".dictionary") % (sec "The Dictionary" "dictionary") \mysection {The Dictionary} {dictionary} The downcased notation has to be used with care, as it doesn't come with any built-in devices to protect us from ambiguities. For example, we could have tried to find a term ``deserving the name'' `$a,a \mto a,a$' --- and there are four different ones!... One way to avoid this problem is to consider that the downcased ``names'' are just that, {\sl names}, and that we have a {\sl dictionary} that associates to each name its {\sl meaning}. In the case of $�$-calculus, a dictionary relating each downcased name to its meaning can be extracted from a derivation tree (if all the rule names are present), and the derivation tree can be reconstructed from its associated dictionary. For example, %: %: a,b p %: ---' ---' %: b b|->c 'p f %: ----------\app -------\app %: c f('p) %: %: ^dict-1-dnc ^dict-1-std %: %D diagram dict-1 %D 2Dx 100 +90 %D 2D 100 dnc <----> dict %D 2D %D 2D +20 %D 2D %D (( dnc .tex= \fcded{dict-1-dnc} BOX %D dict .tex= \dict BOX %D @ 0 @ 1 -> sl^ %D @ 0 @ 1 <- sl_ %D )) %D enddiagram %D $$\def\dict{\fbox{$\begin{array}{rcl} \angg{b} &:=& '\angg{a,b} \\ \angg{c} &:=& \angg{b \mto c}\angg{b} \\ \end{array}$}} \diag{dict-1} $$ % and if we add to that dictionary the entries % $$\begin{array}{rcl} \angg{a,b} &:=& p \\ \angg{b \mto c} &:=& f \\ \end{array} $$ % then we can reconstruct the tree % $$\ded{dict-1-std}$$ % from that. Note that we use double angle brackets, $\angg{�}$, to separate names from one another and to distinguish them from standard notation, and that we use `$\equiv$' to indicate change of notation --- usually between downcased and standard. % -------------------- % «internal» (to ".internal") % (sec "Internal Diagrams" "internal") \mysection {Internal Diagrams} {internal} Several of the initial ideas for the system of downcased types came from attempts to formalize something that I will call ``physicist's notation'', and that should be familiar to most readers; I have never seen any detailed formalizations of it, though. Suppose that we have a function $f:\R \to \R$, and we draw the graph of $y=f(x)$. Then, given points $x_0$, $x_1$, $x_2$, $x'$ on the ``$x$-axis'' we have default meanings for the names $y_0$, $y_1$, $y_2$, $y'$: namely, $y_0 := f(x_0)$, $y' := f(x')$, and so on. It makes sense to write $X$ for the domain of $f$ and $Y$ for its codomain, and if we draw the ``internal diagram'' (as in \cite{LawvereSchanuel}, p.14) of the map $f$, we get this: % (find-LATEX "2010unilog-current.tex" "internal-diagram") % (find-854file "") % (find-854 "" "phys-notation") % (find-854page 9) % (find-854text 11) \def\ooo(#1,#2){\begin{picture}(0,0)\put(0,0){\oval(#1,#2)}\end{picture}} \def\oooo(#1,#2){{\setlength{\unitlength}{1ex}\ooo(#1,#2)}} \def\ctabular#1{\begin{tabular}{c}#1\end{tabular}} \def\ltabular#1{\begin{tabular}{l}#1\end{tabular}} \def\rtabular#1{\begin{tabular}{r}#1\end{tabular}} %D diagram x-to-y %D 2Dx 100 +35 +20 +20 +30 +15 +15 %D 2D 100 External \N ---> \R %D 2D +08 n |--> sn %D 2D %D 2D +20 internal n0 |--> r0 a0 |--> b0 %D 2D +08 internal n1 |--> r1 a1 |--> b1 %D 2D +08 Internal n2 |--> r2 a2 |--> b2 %D 2D +10 n. |--> r. a. |--> b. %D 2D +14 nn |--> rn an |--> bn %D 2D +08 ns |--> rs as |--> bs %D 2D %D 2D +20 stan down %D 2D %D (( External .tex= \rtabular{external\\view} y+= 4 place %D Internal .tex= \rtabular{internal\\view} place %D stan .tex= \ctabular{standard\\notation} place %D down .tex= \ctabular{downcased\\notation} place %D \N .tex= X \R .tex= Y -> .plabel= a f %D # \N .tex= \R \R -> .plabel= a f %D n .tex= x sn .tex= f(x) |-> %D n0 .tex= x_0 r0 .tex= y_0 |-> %D n1 .tex= x_1 r1 .tex= y_1 |-> %D # n2 .tex= x_2 r2 .tex= y_2 |-> %D n2 .tex= x' r2 .tex= y' |-> %D n. .tex= \vdots place r. .tex= \vdots place %D nn .tex= x_n rn .tex= y_n |-> %D n0 nn midpoint .TeX= \oooo(7,21) place %D r0 rn midpoint .TeX= \oooo(8,21) place %D %D a2 .tex= x_2 b2 .tex= y_2 |-> %D an .tex= x_n bn .tex= y_n |-> %D as .tex= x bs .tex= y |-> %D %D )) %D enddiagram %D $$\diag{x-to-y}$$ Note that for each name like $x_2$ of a point in the $x$-axis (the ``space of `$x$'s'', $X$) we have chosen a similar name for a point in the $y$-axis (the ``space of `$y$'s'', $Y$). We will call the implicit operation on names the {\sl syntactical action} of the function $f \equiv x \mto y$; in the case above, the syntactical action replaced the `$x$' of each original name into a `$y$', and kept all the ``decorations'' unchanged. The syntactical action does not need to be defined for all possible names of points in $X$ --- in fact, we restrict ourself to ``good'' names of points in $X$ exactly to make the syntactical action easier to describe. Now let's take a function less abstract: $\sqrtdot: \N \to \R$. Its action is implicit in its name (``$\sqrtdot$''), and if we examine its internal diagram, % %D diagram N-to-R %D 2Dx 100 +35 +20 +20 +30 +15 +15 %D 2D 100 External \N ---> \R %D 2D +08 n |--> sn %D 2D %D 2D +20 internal n0 |--> r0 a0 |--> b0 %D 2D +08 internal n1 |--> r1 a1 |--> b1 %D 2D +08 Internal n2 |--> r2 a2 |--> b2 %D 2D +10 n. |--> r. a. |--> b. %D 2D +14 nn |--> rn an |--> bn %D 2D +08 ns |--> rs as |--> bs %D 2D %D 2D +20 stan down %D 2D %D (( External .tex= \rtabular{external\\view} y+= 4 place %D Internal .tex= \rtabular{internal\\view} place %D stan .tex= \ctabular{standard\\notation} place %D down .tex= \ctabular{downcased\\notation} place %D \N \R -> .plabel= a \sqrt{\,�\,} %D n sn .tex= \sqrt{n} |-> %D # n0 .tex= 0 r0 .tex= \sqrt{0} |-> %D n0 .tex= 0 r0 .tex= 0 |-> %D # n1 .tex= 1 r1 .tex= \sqrt{1} |-> %D n1 .tex= 1 r1 .tex= 1 |-> %D n2 .tex= 2 r2 .tex= \sqrt{2} |-> %D n. .tex= \vdots place r. .tex= \vdots place %D nn .tex= n rn .tex= \sqrt{n} |-> %D n0 nn midpoint .TeX= \oooo(7,21) place %D r0 rn midpoint .TeX= \oooo(8,21) place %D %D a2 .tex= 2 b2 .tex= \sqrt{2} |-> %D an .tex= n bn .tex= \sqrt{n} |-> %D as .tex= n bs .tex= r |-> %D %D )) %D enddiagram %D $$\diag{N-to-R}$$ % we may conclude that it should be possible to use `$n \mto \sqrt{n}$' as its name; or even `$2 \mto \sqrt{2}$' --- ``the function that takes 2 to $\sqrt{2}$, for every value of `2'\,''... \bsk As the reader may have guessed, there is no clear separation between what are ``good names'' and ``bad names'' for an object; instead we have a murky line. We have just seen in the first example that `$x_0$', `$x'$', etc, can all be downcasings for `$X$', and maybe the `$1$' in second example, if taken as a name, should be uppercaseable both as $\N$ and as $\R$... The trick is all in our use of the dictionary: we {\sl can} define the meaning of `$n \mto \sqrt{n}$' to be $\sqrtdot:\N \to \R$, and, if we feel that $\#_\$^!$ is a good name for the square root, we can define it to stand for the square root too... % -------------------- % «parallel-notations» (to ".parallel-notations") % (sec "Parallel Notations" "parallel-notations") \mysection {Parallel Notations} {parallel-notations} We are using two notations --- downcased and standard --- in parallel; it is possible to use more. One way to visualize what is going on is to think in terms of computer interfaces. Imagine a user inspecting huge data structures by displaying them on a computer screen in several forms --- he can toggle switches that control what gets shown and what is omitted. The downcased notation of the previous sections is {\sl my} compromise between {\sl my} intuition and a formalization. Suppose that someone has created a third notation, that he claims that is much closer to {\sl his} intuitions. He can set the controls to display only his favorite notation, as in the left side of the diagram below; or he can display all together at the same time --- as at the top. % than his that is closer % % If that user has another notation, besides standard and downcased, % that is closer to his intuition than downcased types (that, after % all, are {\sl my} compromise between {\sl my} intuition and a % formalization), then he can set the controls to display only his % favorite notation, as in the left side of the diagram below; or he % can display all together at the same time --- as at the top. %D diagram int-down-std %D 2Dx 100 +40 +20 +50 +30 %D 2D 100 all %D 2D %D 2D +35 int dstd %D 2D %D 2D +45 _nt down std %D 2D %D (( all .tex= \ctabular{all"that\\together} %D int .tex= \ctabular{ineffable\\purely\\intuitive\\thought} y+= 5 x+= 10 %D dstd .tex= \foo{\FULLRP} %D down .tex= \foo{\DNC} %D std .tex= \foo{\TERMTYPERP} %D all int -> all dstd -> %D dstd down -> dstd std -> %D )) %D enddiagram %D $$\def\foo#1{#1\fcded{archetypal-deriv}} \def\fooo#1{#1\fcded{archetypal=deriv}} \diag{int-down-std} $$ $$\text{Figure [screens]}$$ Working with three parallel notations is similar to working with two. \msk At this point our two notations, downcased and standard, generate trees with exactly the same shapes; in the next section we will begin to compare diagrams done in different notations, but the same shapes; and starting on section \ref{hyps} the parallelism will be looser --- there will be correspondences between trees and dictionaries, on one side, and ``strictly 2-dimensional'' categorical diagrams, on another. In all cases it is convenient to work in a ``projected view'', yet pretend that we are in the situation with total information --- that the rest of the information ``is there'', but hidden. % -------------------- % «functors» (to ".functors") % (sec "Functors" "functors") \mysection {Functors} {functors} Fix a set $A$. It induces a functor $(A×)$, that takes each set $B$ to $A×B$ and each function $f:B \to C$ to a function $(A×)f:A×B \to A×C$. Let's use the subscripts `0' and `1' to distinguish the two actions of a functor: $(A×)_0$ is its action on objects (sets, in this case), $(A×)_1$ is its action on morphisms (i.e., on functions). It is quite common in the literature of Category Theory to see things like: ``let $(A×):\Set \to \Set$ the functor that takes each set $B$ to $A×B$''. The action on morphisms is not described --- it is ``obvious'', and it is left to the reader to discover. The reader should apply a kind of search algorithm to find it. Which algorithm is this? Let's downcase this problem. The action on objects takes each ``space of `$b$'s'' to a ``space of `$a,b$'s''; its syntactical action is to prepend an `$a,$'. The action on morphisms takes each $b \mto c$ to an $a,b \mto a,c$ --- i.e., the syntactical action on morphisms consists of applying the syntactical action on objects to both the domain and the codomain. {\sl This is a general pattern:} in all functors the two syntactical actions will be related in this way. % (find-854 "" "dn-functors") % (find-854page 27 "dn-functors") %D diagram functor-int-ext %D 2Dx 100 +30 +30 +30 +30 %D 2D 100 ext E0 ----> E1 %D 2D +08 E2 ----> E3 %D 2D %D 2D %D 2D +22 int I0 ----> I1 D0 ===> D1 %D 2D | | - - %D 2D | |-> | | |-> | %D 2D v v v v %D 2D +30 I2 ----> I3 D2 ===> D3 %D 2D %D 2D +20 std dnc %D 2D %D (( E0 .tex= \Set E1 .tex= \Set -> .plabel= a (A×) %D E2 .tex= B E3 .tex= A×B |-> %D ext .tex= \rtabular{external\\view} y+= 4 place %D )) %D (( I0 .tex= B I1 .tex= A×B %D I2 .tex= C I3 .tex= A×C %D int .tex= \rtabular{internal\\view} y+= 15 place %D I0 I1 |-> .plabel= a (A×)_0 %D I2 I3 |-> %D I0 I2 -> .plabel= l f %D I0 I3 harrownodes nil 20 nil |-> .plabel= a (A×)_1 %D I1 I3 -> .plabel= r (A×)f %D )) %D (( D0 .tex= b D1 .tex= a,b => %D D2 .tex= c D3 .tex= a,c => %D D0 D2 |-> D1 D3 |-> %D D0 D3 harrownodes nil 20 nil |-> %D )) %D (( std .tex= \ctabular{standard\\notation} x+= 15 place %D dnc .tex= \ctabular{downcased\\notation} x+= 15 place %D )) %D enddiagram %D $$\diag{functor-int-ext}$$ The diagram is very similar to the ones in the previous section, but there the blobs were sets and they had points; now the blobs --- which we are no longer drawing --- are categories, that have objects, and between these objects we may have {\sl morphisms}. Also, the downcasing is changing the notation more than before, and we are downcasing the `$\mto$'s that were functor actions as `$\funto$'... the `$\funto$' is to remind us that: 1) something non-obvious is going on --- points of $B$ are not being taken to points of $A×B$, instead the whole set $B$ was fed into $(A×)_0$, and we got back $A×B$ ---, and 2) that `$b \funto a,b$' is {\sl two} actions. \msk So, we know just the ``name'' of the action on morphisms of this functor. How do we reconstruct its ``meaning''? The answer is obtained by liftings: %: %: b|->c %: =========(A×) %: a,b|->a,c %: %: ^smashed-Axf %: %: [\arch{a,b}{p}{A×B}]^1 %: ------------------\r{'} %: [\arch{a,b}{p}{A×B}]^1 \arch{b}{'p}{B} \arch{b|->c}{f}{B->C} %: ------------------\r{} -------------------------------------\r{\app} %: \arch{a}{p}{A} \arch{c}{f('p)}{C} %: ------------------------------------------------\r{\ang,} %: \arch{a,c}{\ang{p,f('p)}}{A×C} %: --------------------------------------------------\r{�;}1 %: \arch{a,b|->a,c}{�p\ptype{�(A×B)}.\ang{p,f('p)}}{A×B->A×C} %: %: ^archetypal-deriv-big %: %D diagram lifting-functor-1 %D 2Dx 100 +100 %D 2D 100 fulldnc fullterms %D 2D %D 2D +55 projected %D 2D %D (( fulldnc .tex= \DNC\fcded{archetypal-deriv-big} %D fullterms .tex= \TERMP\fcded{archetypal-deriv-big} %D projected .tex= \fcded{smashed-Axf} %D projected fulldnc --> %D fulldnc fullterms --> %D )) %D enddiagram %D $$\diag{lifting-functor-1}$$ We find that $(A×)f := �p�A×B.\ang{p,f('p)}$. % -------------------- % «nts» (to ".nts") % (sec "Natural Transformations" "nts") \mysection {Natural Transformations} {nts} Now fix two sets, $A$ and $A'$, and a map $\aa: A \to A'$. We have natural constructions for functors $(A×)$ and $(A'×)$, and, besides that, a natural transformation (``NT'', from now on), $(\aa×)$, going from $(A×)$ to $(A'×)$. An NT has a {\sl single} action, that takes {\sl objects} to {\sl morphisms}. We will need a convention. if $X$ and $Y$ are objects of a category $\catC$, then both $f:X \to Y$ and $X \ton{f} Y$ will denote a morphism, but $X \to Y$ denotes the full hom-set $\Hom_\catC(X, Y)$. With that convention, if $F,G: \catA \to \catB$ are functors and $T:F \tondot G$ is a NT, we can represent the internal view of $T$ as: % $$A \mapsto (FA \ton{TA} GA).$$ Let's take the convention up one level: $A \tondot (FA \ton{TA} GA)$ will denote a specific NT, but $A \tondot (FA \to GA)$ will mean the class of all NTs from $F$ to $G$; and we downcase `$\tondot$' as `$\tnto$'. Diagrammatically: % %D diagram NT-alphax-int-ext %D 2Dx 100 +30 +30 +30 +30 %D 2D 100 E0 ----> E1 %D 2D +08 ext E2 E3 %D 2D +08 E4 ----> E5 %D 2D +08 E6 ----> E7 %D 2D %D 2D %D 2D +22 I1 D1 %D 2D |---> | ===> - %D 2D +20 int IL |-> I2 DL |-> D2 %D 2D |---> v ===> v %D 2D +20 I3 D3 %D 2D %D 2D +20 std dnc %D 2D %D (( E0 .tex= \phantom{\Set} E1 .tex= \phantom{\Set} %D E2 .tex= \Set E3 .tex= \Set %D E4 .tex= \phantom{\Set} E5 .tex= \phantom{\Set} %D E0 E1 -> sl_ .plabel= a A× %D E2 place E3 place %D E4 E5 -> sl^^ .plabel= b A'× %D E0 E1 midpoint E4 E5 midpoint -> sl_ .plabel= r \aa× %D E6 .tex= B E7 .tex= \aa×B |-> %D )) %D (( I1 .tex= A×B %D IL .tex= B I2 .tex= \phantom{O} %D I3 .tex= A'×B %D IL I1 |-> .plabel= a A× %D IL I3 |-> .plabel= b A'× %D IL I2 -> .PLABEL= ^(0.55) . %D IL I2 -> .PLABEL= _(0.55) \aa× %D I1 I3 -> .plabel= r \aa×B %D )) %D (( D1 .tex= a,b %D DL .tex= b D2 .tex= \phantom{O} %D D3 .tex= a',b %D DL D1 => %D DL D3 => %D DL D2 -> .PLABEL= ^(0.55) * %D D1 D3 |-> %D )) %D (( ext .tex= \rtabular{external\\view} y+= 5 place %D int .tex= \rtabular{internal\\view} y+= 0 place %D std .tex= \ctabular{standard\\notation} x+= 15 place %D dnc .tex= \ctabular{downcased\\notation} x+= 15 place %D )) %D enddiagram %D $$\diag{NT-alphax-int-ext}$$ We know the ``syntactical action'' of $(\aa×)$: it is $B \mto (A{×}B \ton{\aa{×}B} A'{×}B)$ in standard notation, $b \tnto (a,b \mto a',b)$ after downcasing. To obtain a ``meaning'' for that we can apply the same procedure as in the previous section; we discover that $\aa{×}B \; := \; �p:A{×}B.\ang{f(p),'p}$. Now let's look at the notations for the general case. Let $F,G: \catA \to \catB$ be functors, and $T: F \tondot G$ be an NT between them. Everything is similar, but we need a way to downcase the objects $FA$ and $GA$... a good choice is as `$a^F$' and `$a^G$' --- because the idea of ``an $a^F$'' {\sl suggests nothing at all}, and so it reminds us that the functors $F$ and $G$ may be abstract. %D diagram NT-T-int-ext %D 2Dx 100 +30 +30 +30 +30 %D 2D 100 E0 ----> E1 %D 2D +08 ext E2 E3 %D 2D +08 E4 ----> E5 %D 2D +08 E6 ----> E7 %D 2D %D 2D %D 2D +22 I1 D1 %D 2D |---> | ===> - %D 2D +20 int IL |-> I2 DL |-> D2 %D 2D |---> v ===> v %D 2D +20 I3 D3 %D 2D %D 2D +20 std dnc %D 2D %D (( E0 .tex= \phantom{\catA} E1 .tex= \phantom{\catB} %D E2 .tex= \catA E3 .tex= \catB %D E4 .tex= \phantom{\catA} E5 .tex= \phantom{\catB} %D E0 E1 -> sl_ .plabel= a F %D E2 place E3 place %D E4 E5 -> sl^^ .plabel= b G %D E0 E1 midpoint E4 E5 midpoint -> sl_ .plabel= r T %D E6 .tex= A E7 .tex= TA |-> %D )) %D (( I1 .tex= FA %D IL .tex= A I2 .tex= \phantom{O} %D I3 .tex= GA %D IL I1 |-> .plabel= a F %D IL I3 |-> .plabel= b G %D IL I2 -> .PLABEL= ^(0.55) . %D IL I2 -> .PLABEL= _(0.55) T %D I1 I3 -> .plabel= r TA %D )) %D (( D1 .tex= a^F %D DL .tex= a D2 .tex= \phantom{O} %D D3 .tex= a^G %D DL D1 => %D DL D3 => %D DL D2 -> .PLABEL= ^(0.55) * %D D1 D3 |-> %D )) %D (( ext .tex= \rtabular{external\\view} y+= 5 place %D int .tex= \rtabular{internal\\view} y+= 0 place %D std .tex= \ctabular{standard\\notation} x+= 15 place %D dnc .tex= \ctabular{downcased\\notation} x+= 15 place %D )) %D enddiagram %D $$\diag{NT-T-int-ext}$$ We often have to deal with categories whose objects don't have any reasonable notion of ``elements''. If our $\catB$ is a category like that, then $a^F \mto a^G$ will be a name for a morphism, but the names `$a^F$' and `$a^G$' ``will not have semantics'' --- our dictionary will not attribute any meanings to them. See \cite{Kromer}, especially its sections 3.3.4.4 and 5.3.2.1, for a discussion; our downcased notation is, in a sense, ``a language for diagram chasing''. % -------------------- % «adjunctions» (to ".adjunctions") % (sec "Adjunctions" "adjunctions") \mysection {Adjunctions} {adjunctions} Fix an object $B$ of $\Set$, and let's write $B{\to}C$ for $C^B$. Then we have a functor $(B{\to}): \Set \to \Set$, whose syntactical action is $C \mapsto (B{\to}C)$, in standard notation, and $c \funto b \mto c$ after downcasing. This functor is right adjoint to $(×B): \Set \to \Set$, and we will represent the adjunction diagrammatically as: \def\cur {�{cur}} \def\uncur{�{uncur}} %D diagram adj-int-ext %D 2Dx 100 +35 +30 +30 +30 %D 2D 100 ext E0 ----> E1 %D # +08 E2 ----> E3 %D 2D %D 2D %D 2D +22 int I0 <---| I1 D0 <=== D1 %D 2D | | - - %D 2D | <-> | | <-> | %D 2D v v v v %D 2D +30 I2 |---> I3 D2 ===> D3 %D 2D %D 2D +20 std dnc %D 2D %D (( E0 .tex= \Set E1 .tex= \Set %D E0 E1 <- sl^ .plabel= a (×B) %D E0 E1 -> sl_ .plabel= b (B{\to}) %D )) %D (( I0 .tex= A×B I1 .tex= A %D I2 .tex= C I3 .tex= B{->}C %D I0 I1 <-| .plabel= a (×B) %D I2 I3 |-> .plabel= b (B{->}) %D I0 I2 -> .PLABEL= _(0.43) \uncur\;g %D I0 I2 -> .PLABEL= _(0.57) f %D I0 I3 harrownodes nil 20 nil <-| sl^ .plabel= a \uncur %D I0 I3 harrownodes nil 20 nil |-> sl_ .plabel= b \cur %D I1 I3 -> .PLABEL= ^(0.43) g %D I1 I3 -> .PLABEL= ^(0.57) \cur\;f %D )) %D (( D0 .tex= a,b D1 .tex= a <= %D D2 .tex= c D3 .tex= b|->c => %D D0 D2 |-> D1 D3 |-> %D D0 D3 harrownodes nil 20 nil <-> %D )) %D (( ext .tex= \rtabular{external\\view} y+= 4 place %D int .tex= \rtabular{internal\\view} y+= 15 place %D std .tex= \ctabular{standard\\notation} x+= 15 place %D dnc .tex= \ctabular{downcased\\notation} x+= 15 place %D )) %D enddiagram %D $$\diag{adj-int-ext}$$ The two transpositions, $\cur$ and $\uncur$, are natural transformations with actions: % $$\begin{array}{lcl} (A^\op,C) \mapsto ((A{×}B \ton{f} C) \mapsto (A \ton{\cur\;f} (B{\to}C))) \\ (A^\op,C) \mapsto ((A \ton{g} (B{\to}C)) \mapsto (A{×}B \ton{\uncur\;g} C)) \\ \end{array} $$ % where $A^\op$ is an object of $\Set^\op$, and $(A^\op,C)$ is an object of $\Set^\op×\Set$. \msk The notation in the general case is similar. An adjunction $L \dashv R$ is represented diagrammatically as: % %D diagram adj-gen-int-ext %D 2Dx 100 +35 +30 +30 +30 %D 2D 100 ext E0 ----> E1 %D # +08 E2 ----> E3 %D 2D %D 2D %D 2D +22 int I0 <---| I1 D0 <=== D1 %D 2D | | - - %D 2D | <-> | | <-> | %D 2D v v v v %D 2D +30 I2 |---> I3 D2 ===> D3 %D 2D %D 2D +20 std dnc %D 2D %D (( E0 .tex= \catB E1 .tex= \catA %D E0 E1 <- sl^ .plabel= a L %D E0 E1 -> sl_ .plabel= b R %D )) %D (( I0 .tex= LA I1 .tex= A %D I2 .tex= B I3 .tex= RB %D I0 I1 <-| .plabel= a L %D I2 I3 |-> .plabel= b R %D I0 I2 -> .PLABEL= _(0.43) g^\flat %D I0 I2 -> .PLABEL= _(0.57) f %D I0 I3 harrownodes nil 20 nil <-| sl^ .plabel= a \flat %D I0 I3 harrownodes nil 20 nil |-> sl_ .plabel= b \sharp %D I1 I3 -> .PLABEL= ^(0.43) g %D I1 I3 -> .PLABEL= ^(0.57) f^\sharp %D )) %D (( D0 .tex= a^L D1 .tex= a <= %D D2 .tex= b D3 .tex= b^R => %D D0 D2 |-> D1 D3 |-> %D D0 D3 harrownodes nil 20 nil <-> %D )) %D (( ext .tex= \rtabular{external\\view} y+= 4 place %D int .tex= \rtabular{internal\\view} y+= 15 place %D std .tex= \ctabular{standard\\notation} x+= 15 place %D dnc .tex= \ctabular{downcased\\notation} x+= 15 place %D )) %D enddiagram %D $$\diag{adj-gen-int-ext}$$ We will usually draw $L$ as going left, $R$ as going right, and call the transpositions `$\flat$' and `$\sharp$'. \msk Just as a functor is two actions plus two properties --- namely, that the action on morphisms respects identities and composition ---, a natural isomorphism can be thought as two NTs going in opposite directions, plus the assurance that their composites are identities; the downcased squares for the $(×B) \dashv (B{\to})$ and $L \dashv R$ adjunctions in the diagrams above can be considered as being (two-dimensional) downcased names themselves, having the same meaning as their ``linearized'' versions, % $$\begin{array}{lcl} (a^\op;c) \tnto ((a,b \mto c) \bij (a \mto (b \mto c))) \\ (a^\op;b) \tnto ((a^L \mto c) \bij (a \mto b^R)) \\ \end{array} $$ % and those meanings can either be the 6-uples % $$\begin{array}{lcl} (\Set, \Set, (×B), (B{\to}), \uncur, \cur) \\ (\catA, \catB, L, R, \flat, \sharp) \\ \end{array} $$ % or longer tuples including, say, the properties and the unit and the counit of the adjunction. The possibility of changing some definitions while keeping the notations the same will be very important in \cite{OchsDownHyp}, where that will be used to ``project out'' from the definitions all components which involve equalities of morphisms, keeping only the constructions. % -------------------- % «transmission» (to ".transmission") % (sec "Transmission" "transmission") \mysection {Transmission} {transmission} A good way to understand how {\sl reconstruction} works is to think on a simpler, more extreme case. When I reconstruct something that I have half-forgotten, I do have vague memories about it... how do they act? On the other hand, if I am reconstructing something that I have received from someone else in an {\sl incomplete}, but {\sl reconstructible}, form, then the vague memories are out of the picture. \msk Consider the following diagram, which describes, in a simplistic way, how a theorem $T$, discovered by an author $A$ and published in a paper $P$, is read and understood by a reader $R$. In the diagram the time flows to the left, and knowledge flows (roughly) downwards. %D diagram reconstruction %D 2Dx 100 +20 +20 +30 %D 2D 100 A_0 --------> A_2 %D 2D +08 T_1 --> T_2 %D 2D +08 T_3 %D 2D +08 P %D 2D +08 T'_0 --> T'_1 %D 2D +08 R_0 ---> R_1 %D 2D +08 %D 2D %D (( A_0 A_2 -> %D T_1 T_2 -> %D T_3 place %D P place %D T'_0 T'_1 -> %D R_0 R_1 -> %D )) %D enddiagram %D $$\diag{reconstruction}$$ The theorem was discovered in the form $T_1$, but in order to make it transmissible the author changed it a bit, and it became $T_2$, then $T_3$ in written form, which was what got published. The author was also slightly transformed in the process, and at the time of the publication he had become $A_2$. That particular theorem was difficult to state, so the reader $R_0$ started by understanding just parts of its statement; let's call that preliminary object understood by our reader a ``statement with holes''. Then, through weeks of study, more and more of the theorem's statement, and of the proof of the theorem (let's use the term ``theorem'' to refer to both the statement and the proof), became clear to $R$, up to the point where the object in $R$'s mind has become $T'_1$, which no longer has any holes; all gaps have been filled, the statement is now clearly a consequence of its proof, and our reader $R$, who at this point has become a slightly different person, $R_1$, now has even some {\it intuition} about the theorem (whatever that means)... Our reader $R$ has {\sl reconstructed}, in his mind, the author $A$'s theorem, from what was published in the paper $P$. $R$'s task was harder than the task I face when I try to reconstruct a theorem that at some point I knew --- if I review all that I have studied before, I {\sl should} be able to fill up all my gaps, but $R$ has no guarantee that just by looking in the literature he will be able to find all the missing knowledge he needs... even though we {\sl expect} published papers to be clear enough, and complete enough. The process of reconstruction performed by $R$ {\sl ought} to be considered an algorithm --- even though we know that what $R$ did to understand the paper $P$ was, at best, a ``method''. % Let's see why, by examining an ideal case. % Let's treat an ideal case. Suppose that the $A$ wanted to be clear, that he was lucid in his choice of exposition, and that the paper $P$ was submitted to a journal with no space constraints. So $A$, and also his editors and referees, {\sl want} the paper $P$ to be as readable as possible. The reader $R_0$ receives the paper $P$ being aware that the theorem $T_3$ in it should be readable, and starts to devote his time to understand $T_3$. After a few weeks $R$ succeeds. {\sl What did the reader $R$ do?} And what did $A$ and the journal's editors do to be sure that $A$ would succeed in his task? When $R$ checks the paper's details and fills up the gaps his process is akin to {\sl proof search}, and thus a kind of {\sl lifting}... What $R$ does is too complex to be formalizable as an ``algorithm''; yet the author and the editors are aware of what their intended readers are expected to be able to do, and they added to the paper enough ``hints'' to let the readers succeed --- so the reader performs a ``proof search with hints''. There are algorithms that do proof searches with hints, so let's commit an abuse of language here and take the analogy with algorithms seriously: the reader $R$ is performing a proof search with hints, and $A$ provided enough hints in the paper $P$ to be sure that, given $P$ as input, the reader $R$ will check its theorems completing all the missing details, then stop. % There's more, though. Let's now suppose that both $A$ and $R$ are people whose thinking is mostly diagrammatical, like the semi-fictional ``me''. How does does $R$ (re)construct in his mind some ``intuition'' about the theorems? The theorem $T_3$ was published in algebraic form, so part of $R$'s task --- and he needs to do this to be able to fill the logical gaps --- is to find the diagrams to let him reason about the paper; another part of his task is to work out the details in the paper's examples; for that he needs to take his ``general'' diagrams and apply them to particular cases; this is {\sl specialization}, as in Section \ref{projs-and-lifts}. What we call ``intuition'' should comprise the ability to specialize, plus a lot more. % (find-kopkadaly4page (+ 12 607) "Index") % (find-kopkadaly4page (+ 12 637) "\\ref") % (find-kopkadaly4page (+ 12 615) "cross-reference") % (find-kopkadaly4page (+ 12 56) "cross-reference") % (find-kopkadaly4page (+ 12 213) "cross-reference") % (find-kopkadaly4text) % synthetic form % -------------------- % «intuition» (to ".intuition") % (sec "Intuition" "intuition") \mysection {Intuition} {intuition} When $R$ finally understands the paper $P$ he would have developed some ``intuitions'' about the theorem $T$. His intuitions may or may not be the same as $A$'s, so let's name them differently: $I_A$ for the author's intuitions, $I_R$ for the reader's. Instead of taking the easy way and saying that ``it is impossible to talk about what mathematical intuition is'', we will improvise a simple model that will let us talk about {\sl some aspects} of having intuition about a theorem. Let's suppose that that theorem $T$ says that when the hypotheses $H_1$ and $H_2$ hold, then the conclusion $C_2$ also does; and the proof of $T$ is made of two lemmas, $L_1$ and $L_2$, and structured like this: % $$\ded{int-full}$$ % so the stronger theorem $H_1 \land H_2 \vdash C_1 \land C_2$ is also true. Note that the tree above represents $T'_1$, not $T_3$; $T_3$ was written in a ``linear'', ``algebraic'' way, to comply with the usual mathematical practice, even though both $A$ and $R$ tend to think diagrammatically. The reader $R_1$ is able to do several things with $T'_1$; most of them can be understood diagrammatically, as movements through parts of the diagram below. Let's use the following convention (that will hold for this diagram only): a double bar with the name of a lemma at its right will stand for {\sl all the steps in the lemma} --- they are known, but are not visible ---, while a double bar without a label will mean a situation where the intermediate steps are not known at that moment, and will have to be reconstructed. %: %: %: H_1 H_1 H_1\subs %: ===L_1 === ========L_1 %: H_1 C_1 H_2 C_1 H_2 C_1\subs H_2\subs %: ===L_1 ===========L_2 =========== =====================L_2 %: C_1 C_2 C_2 C_2\subs %: %: ^int-1p ^int-full ^int-struct ^int-subs %: %D diagram intuition %D 2Dx 100 +45 +90 %D 2D 100 full %D 2D %D 2D +50 firstpart struct subs %D 2D %D 2D +40 H1|-C1 H1H2|-C2 H1H2|-C2[Set] %D 2D %D (( full .tex= \fcded{int-full} BOX %D firstpart .tex= \fcded{int-1p} BOX %D struct .tex= \fcded{int-struct} BOX %D subs .tex= \fcded{int-subs} BOX %D H1|-C1 .tex= \fbox{$H_1|-C_1$} BOX %D H1H2|-C2 .tex= \fbox{$H_1,H_2|-C_2$} BOX %D H1H2|-C2[Set] .tex= \fbox{$H_1[\catC{:}{=}\Set],H_2[\catC{:}{=}\Set]|-C_2[\catC{:}{=}\Set]$} BOX %D full firstpart -> .plabel= l (1) %D full struct -> .plabel= r (2) %D full subs -> .plabel= r (3) %D firstpart H1|-C1 -> .plabel= r (4) %D struct H1H2|-C2 -> .plabel= r (5) %D subs H1H2|-C2[Set] -> .plabel= r (6) %D )) %D enddiagram %D $$\def\subs{[\catC:=\Set]} \diag{intuition} $$ The center box is a sketch of the full proof, and the small box below it is the statement of the full theorem. If the reader $R_1$ remembers any of those, he can reconstruct the other one by projecting or lifting through (5). The box at the top is the full proof, and $R_1$ can reconstruct it by completing the sketch of the proof, lifting through (2). One day $R_1$ decides to present the theorem to some colleagues. He assigns a temporal order to the lemmas: ``Lemma $L_1$ has to be presented first''. Its statement and its proof are obtained by projecting through (1) and (4). Note that (1) is a case of {\sl zooming in} into a part of the proof. $R_1$ is also able to {\sl use} the theorem. He can apply it to a particular case, projecting through (3) and (6) or just through (3) (compare that with the arrow (3) in Figure [Fib]). He can also reuse the {\sl structure} of parts of the proof, but changing the theorem in deeper ways (e.g., in the Curry-Howard isomorphism); this is not shown above. We will pretend that ``having intuition about a theorem'' means having the abilities discussed above: remembering parts, reconstructing, zooming out, zooming it, temporal order, transmission, specializing, reusing the structure. % -------------------- % «hyps» (to ".hyps") % (sec "Hyperdoctrines" "hyps") \mysection {Hyperdoctrines} {hyps} Let's take the {\sl beginning} of the definition of a hyperdoctrine (\cite{LawvereAdjFo}, \cite{LawvereEqHyp}, \cite{SeelyBeck}, \cite{TaylorPhDThesis}, \cite{Jacobs}). A hyperdoctrine is a cloven fibration $p:\bbE \to \bbB$, where the ``base category'' $\bbB$ has finite products and each fiber $\bbE_B$ is cartesian closed, plus for each morphism $f:A \to B$ in $\bbB$ adjoints $Æ_f \dashv f^* \dashv å_f$ for the change-of-base functor $f^*$, {\sl plus more structure}; but let's skip the ``plus more structure'' part for the moment --- we will describe that extra structure briefly at the end of this section, and the full details can be found at \cite{OchsDownHyp} and \cite{OchsUniLog}. It turns out that this definition {\sl generalizes} a familiar object: the ``codomain fibration'', $\Cod: \Setmto \to \Set$ --- that we will abbreviate as $\Pred$ --- is a hyperdoctrine. Actually $\Pred$ can be considered to be the {\sl archetypical} hyperdoctrine, in a sense that can be made precise; we will return to ``generalizations'' and ``archetypes'' in Section \ref{archetypal}. An object $P$ of $\Setmto$ is a monic: $P \equiv (A' \monicto A)$. A morphism $P \to Q$ in $\Setmto$, where $Q \equiv (B' \monicto B)$, is a pair of arrows $(f':A' \to B', f:A \to B)$ making the obvious square commute. The projection functor $\Cod$ takes $P \equiv (A' \monicto A)$ to $A$ and $(f',f):P \to Q$ to $f$. A monic with codomain $A$, $(A' \monicto A)$, is said to be a {\sl subobject} of $A$. In $\Set$ we have {\sl canonical subobjects} --- the ones whose maps are inclusions --- and we can think of them as being {\sl predicates} over sets. We will have a special shorthand for predicates: instead of writing, for example, % $$\sst{(x,y)�X×Y}{P(x,y)} \ito X×Y$$ % we will write just: % $$\ssst{x,y}{Pxy}$$ % The double bar in `$\ssst{x,y}{Pxy}$' is to remind us that this is two objects, plus a map; and we write the `$\monicto$' as `$\ito$' when we want to stress that the map is an inclusion. \msk We will draw objects of $\bbE$ above their projections, and we'll usually omit the projection arrows. We will downcase a predicate like `$\ssst{x,y}{Pxy}$' in diagrams as just `$P(x,y)$' --- the `$x,y$' part can be recovered by looking down. Let's focus on what happens in $\Pred$. For a map $f: A \to B$ in $\bbB$, the change-of-base functor $f^*$ takes each predicate $\ssst{b}{P(b)}$ over $B$ to a predicate $\ssst{a}{P(f(a))}$ over $A$. When $f$ is a projection map, like $:X×Y \to X$, the adjoints $Æ_$ and $å_$ ``are'' $�y$ and $�y$; when $f$ is the ``diagonal'' map $\DD:B \to B×B$ or the ``dependent diagonal'' $\dd:A×B \to A×B×B$, the adjoints ``are'' the operations `$b{=}b'∧$' and `$b{=}b'⊃$'. This is not hard to believe if we start with the right examples, and we check first the particular cases and then generalize. Take $X=\{0,1,2,3,4\}$, $Y=\{3,4\}$, $A=\{0\}$, $B=\{0,1,2,3\}$, and let's use a positional notation for predicates: we will write $\ssst{x,y}{x \ge y}$ as $\sm{00001\\00011}$, i.e.: $$\def\pr#1{#1} \def\ph#1{\phantom{#1}} \def\dmat#1#2{ \sm{ \pr{\{}#1\pr{,}\ph{\}}\\ \ph{\{}#2\pr{\}}\ph{,}\\ }} % \dmat{\ph{(0,4),(1,4),(2,4),(3,4),}\pr{(4,4)}} {\ph{(0,3),(1,3),(2,3),}\pr{(3,3),(4,3)}} % \;\ito\;\, % \dmat{\pr{(0,4),(1,4),(2,4),(3,4),}\pr{(4,4)}} {\pr{(0,3),(1,3),(2,3),}\pr{(3,3),(4,3)}} $$ In the two diagrams below the left side shows the abstract view, the right side shows a very particular case, and in the middle, in downcased notation, we see how these change-of-base functors and their adjoints act on arbitrary predicates. % (find-angg ".emacs.papers" "jacobs") % (find-jacobspage (+ 19 43) "fibration of monos") % (find-books "__cats/__cats.el" "lambek-scott") % (find-lambekscottpage (+ 8 200) "Toposes with canonical subobjects") % (find-angg ".emacs.papers" "taylor") % (find-paultaylorthesispage (+ 11 84) "hyperdoctrine") %D diagram 3-pi %D 2Dx 100 +30 +30 +30 +30 +30 %D 2D 100 A0 A1 B0 B1 C0 C1 %D 2D %D 2D +20 A2 A3 B2 B3 C2 C3 %D 2D %D 2D +20 A4 A5 B4 B5 C4 C5 %D 2D %D 2D +20 A6 A7 B6 B7 C6 C7 %D 2D %D (( A0 .tex= P A1 .tex= Æ_P %D A2 .tex= ^*Q A3 .tex= Q %D A4 .tex= R A5 .tex= å_R %D A6 .tex= X×Y A7 .tex= X %D )) %D (( B0 .tex= Pxy B1 .tex= �y.Pxy %D B2 .tex= Qx B3 .tex= Qx %D B4 .tex= Rxy B5 .tex= �y.Rxy %D B6 .tex= x,y B7 .tex= x %D )) %D (( C0 .tex= \sm{00001\\00011} C1 .tex= \sm{00011} %D C2 .tex= \sm{00111\\00111} C3 .tex= \sm{00111} %D C4 .tex= \sm{01111\\11111} C5 .tex= \sm{01111} %D C6 .tex= X×Y C7 .tex= X %D )) %D (( A0 A1 A2 A3 A4 A5 A6 A7 %D @ 0 @ 1 |-> @ 0 @ 2 -> @ 1 @ 3 -> @ 0 @ 3 harrownodes nil 20 nil <-> %D @ 2 @ 3 <-| @ 2 @ 4 -> @ 3 @ 5 -> @ 2 @ 5 harrownodes nil 20 nil <-> %D @ 4 @ 5 |-> @ 6 @ 7 -> .plabel= a \pi %D )) %D (( B0 B1 B2 B3 B4 B5 B6 B7 %D @ 0 @ 1 => @ 0 @ 2 |-> @ 1 @ 3 |-> @ 0 @ 3 harrownodes nil 20 nil <-> %D @ 2 @ 3 <= @ 2 @ 4 |-> @ 3 @ 5 |-> @ 2 @ 5 harrownodes nil 20 nil <-> %D @ 4 @ 5 => @ 6 @ 7 |-> .plabel= a \pi %D )) %D (( C0 C1 C2 C3 C4 C5 C6 C7 %D @ 0 @ 1 |-> @ 0 @ 2 -> @ 1 @ 3 -> @ 0 @ 3 harrownodes nil 20 nil <-> %D @ 2 @ 3 <-| @ 2 @ 4 -> @ 3 @ 5 -> @ 2 @ 5 harrownodes nil 20 nil <-> %D @ 4 @ 5 |-> @ 6 @ 7 -> .plabel= a \pi %D )) %D enddiagram %D $$\diag{3-pi}$$ %D diagram 3-dd %D 2Dx 100 +30 +30 +30 +30 +30 %D 2D 100 A0 A1 B0 B1 C0 C1 %D 2D %D 2D +30 A2 A3 B2 B3 C2 C3 %D 2D %D 2D +30 A4 A5 B4 B5 C4 C5 %D 2D %D 2D +20 A6 A7 B6 B7 C6 C7 %D 2D %D (( A0 .tex= P A1 .tex= Æ_P %D A2 .tex= ^*Q A3 .tex= Q %D A4 .tex= R A5 .tex= å_R %D A6 .tex= A×B A7 .tex= A×B×B -> .plabel= a \dd %D )) %D (( B0 .tex= Pab B1 .tex= b=b'∧Pab %D B2 .tex= Qabb B3 .tex= Qabb' %D B4 .tex= Rab B5 .tex= b=b'⊃Rab %D B6 .tex= a,b B7 .tex= a,b,b' |-> .plabel= a b':=b %D )) %D (( C0 .tex= \dsm1000 C1 .tex= \sm{0001\\0000\\0000\\0000} %D C2 .tex= \dsm1100 C3 .tex= \sm{0011\\0011\\0011\\0000} %D C4 .tex= \dsm1110 C5 .tex= \sm{1111\\1111\\1111\\0111} %D C6 .tex= A×B C7 .tex= A×B×B -> .plabel= a \dd %D )) %D (( A0 A1 A2 A3 A4 A5 A6 A7 %D @ 0 @ 1 |-> @ 0 @ 2 -> @ 1 @ 3 -> @ 0 @ 3 harrownodes nil 20 nil <-> %D @ 2 @ 3 <-| @ 2 @ 4 -> @ 3 @ 5 -> @ 2 @ 5 harrownodes nil 20 nil <-> %D @ 4 @ 5 |-> %D )) %D (( B0 B1 B2 B3 B4 B5 B6 B7 %D @ 0 @ 1 => @ 0 @ 2 |-> @ 1 @ 3 |-> @ 0 @ 3 harrownodes nil 20 nil <-> %D @ 2 @ 3 <= @ 2 @ 4 |-> @ 3 @ 5 |-> @ 2 @ 5 harrownodes nil 20 nil <-> %D @ 4 @ 5 => %D )) %D (( C0 C1 C2 C3 C4 C5 C6 C7 %D @ 0 @ 1 |-> @ 0 @ 2 -> @ 1 @ 3 -> @ 0 @ 3 harrownodes nil 20 nil <-> %D @ 2 @ 3 <-| @ 2 @ 4 -> @ 3 @ 5 -> @ 2 @ 5 harrownodes nil 20 nil <-> %D @ 4 @ 5 |-> %D )) %D enddiagram %D $$\def�{\phantom{0}} \def\dsm#1#2#3#4{\sm{���#1\\��#2�\\�#3��\\#4���}} \diag{3-dd} $$ Each arrow `$\bij$' above stands for two transpositions, one in each direction. Let's establish another convention: $P \to f^*Q$ is an hom-set, but $P \vdash f^*Q$ is (the name of) a morphism --- we have $P \vdash f^*Q : P \to f^*Q$. With that convention we can write the transpositions rules (cf.\ the ``mate rules'' in \cite{Jacobs}) as: %: %: Q|-å_R x;Qx|-�y.Rxy %: -------�^\flat ------------�^\flat %: ^*Q|-R x,y;Qx|-Rxy %: %: ^Fa-flat-std ^Fa-flat-dnc %: %: ^*Q|-R x,y;Qx|-Rxy %: -------�^\sharp ------------�^\sharp %: Q|-å_R x;Qx|-�y.Rxy %: %: ^Fa-sharp-std ^Fa-sharp-dnc %: $$\ded{Fa-flat-std} \qquad \ded{Fa-flat-dnc}$$ $$\ded{Fa-sharp-std} \qquad \ded{Fa-sharp-dnc}$$ Note that `$\vdash$' has different meanings in standard and in downcased notations. We will only use `$\vdash$' to refer to vertical morphisms. \msk It turns out that it is possible to {\sl define} the adjoints $Æ_f \dashv f^* \dashv å_f$, for an arbitrary $f: A \to B$, from $f^*$, $Æ_$, $å_$, $Æ_$, $å_$: in $\Pred$, % % :*=*=* %:*∧*∧* %:*⊃*⊃* % %D diagram adjs-to-DD*-sst %D 2Dx 100 +65 %D 2D 100 P ====> ÆP %D 2D - - %D 2D | <-> | %D 2D v v %D 2D +25 Q* <=== Q %D 2D - - %D 2D | <-> | %D 2D v v %D 2D +25 R ====> åR %D 2D %D 2D +20 A |---> B %D 2D %D (( P .tex= \ssst{a,b}{Pab} ÆP .tex= \ssst{a,b,b'}{b=b'∧Pab} %D Q* .tex= \ssst{a,b}{Qabb} Q .tex= \ssst{a,b,b'}{Qabb'} %D R .tex= \ssst{a,b}{Qab} åR .tex= \ssst{a,b,b'}{b=b'⊃Pab} %D A .tex= A×B B .tex= A×B×B %D )) %D (( P .tex= \ssst{a}{P(a)} ÆP .tex= \ssst{b}{�a.f(a)=b∧P(a)} %D Q* .tex= \ssst{a}{Q(f(a))} Q .tex= \ssst{b}{Q(b)} %D R .tex= \ssst{a}{R(a)} åR .tex= \ssst{b}{�a.f(a)=b⊃R(a)} %D A .tex= A B .tex= B %D )) %D (( P ÆP |-> %D P Q* -> ÆP Q -> P Q harrownodes nil 20 nil <-> %D Q* Q <-| %D Q* R -> Q åR -> Q* åR harrownodes nil 20 nil <-> %D R åR |-> %D A B -> .plabel= a f %D )) %D enddiagram %D $$\diag{adjs-to-DD*-sst} $$ % %:*=*{=}* %:*∧*{∧}* %:*⊃*{⊃}* % The transpositions and the functor actions implicit in the adjunction diagram above can be expressed as derived rules. The proof is hard (see \cite{SeelyBeck}); we will see how to understand it diagrammatically in \cite{OchsDownHyp}. \bsk What really matters here is: how were the diagrams above obtained? Look at the figure below; it shows some of the projections involved. We start with the definition in English, at the top. By representing it diagrammatically (projection (1)) we lose the subtleties of the English language --- see \cite{Freyd76} and \cite{FreydScedrov} for a diagrammatic notation in which at least the quantifiers are preserved --- but we add positional notation. The projections (3) and (4) are specializations: we take $\bbE := \Setito$, $p := \Cod$, $A:=X×Y$, and so on. The arrows (2) and (5) produce internal diagrams, as in the section \ref{adjunctions}. The downcased diagrams are not shown; Figure [Fib] is a map of interesting projections, and it could have been far larger than it is. %D diagram 5-algebraic-internal %D 2Dx 100 +35 %D 2D 100 P ====> ÆP %D 2D - - %D 2D | <-> | %D 2D v v %D 2D +20 Q* <=== Q %D 2D - - %D 2D | <-> | %D 2D v v %D 2D +20 R ====> åR %D 2D %D 2D +15 A |---> B %D 2D %D (( P .tex= P ÆP .tex= Æ_fP %D Q* .tex= f^*Q Q .tex= Q %D R .tex= R åR .tex= å_fR %D A .tex= A B .tex= B %D )) %D (( P ÆP |-> %D P Q* -> ÆP Q -> P Q harrownodes nil 20 nil <-> %D Q* Q <-| %D Q* R -> Q åR -> Q* åR harrownodes nil 20 nil <-> %D R åR |-> %D A B -> .plabel= a f %D )) %D enddiagram %D % $$\diag{5-algebraic-internal}$$ % %D diagram 5-set-binary %D 2Dx 100 +35 %D 2D 100 P ====> ÆP %D 2D - - %D 2D | <-> | %D 2D v v %D 2D +20 Q* <=== Q %D 2D - - %D 2D | <-> | %D 2D v v %D 2D +20 R ====> åR %D 2D %D 2D +15 A |---> B %D 2D %D (( P .tex= \sm{00001\\00011} ÆP .tex= \sm{00011} %D Q* .tex= \sm{00111\\00111} Q .tex= \sm{00111} %D R .tex= \sm{01111\\11111} åR .tex= \sm{01111} %D A .tex= X×Y B .tex= X %D )) %D (( P ÆP |-> %D P Q* -> ÆP Q -> P Q harrownodes nil 20 nil <-> %D Q* Q <-| %D Q* R -> Q åR -> Q* åR harrownodes nil 20 nil <-> %D R åR |-> %D A B -> .plabel= a \pi %D )) %D enddiagram %D % $$\diag{5-set-binary}$$ % %D diagram 5-algebraic-external %D 2Dx 100 +25 +40 %D 2D 100 \bbE \bbE_A \bbE_B %D 2D %D 2D +35 \bbB A B %D 2D %D (( \bbE_A \bbE_B %D @ 0 @ 1 -> .slide= 13pt .plabel= a Æ_f %D @ 0 @ 1 <- .plabel= m f^* %D @ 0 @ 1 -> .slide= -13pt .plabel= b å_f %D @ 0 @ 1 midpoint y+= -5 .TeX= \text{\tiny$�$} place %D @ 0 @ 1 midpoint y+= 5 .TeX= \text{\tiny$�$} place %D )) %D (( \bbE \bbB -> .plabel= l p %D A B -> .plabel= a f %D )) %D enddiagram %D % $$\diag{5-algebraic-external}$$ % %D diagram 5-set-external %D 2Dx 100 +25 +40 %D 2D 100 \bbE \bbE_A \bbE_B %D 2D %D 2D +35 \bbB A B %D 2D %D (( \bbE_A .tex= \Setito_{X×Y} \bbE_B .tex= \Setito_X %D @ 0 @ 1 -> .slide= 13pt .plabel= a Æ_\pi %D @ 0 @ 1 <- .plabel= m \pi^* %D @ 0 @ 1 -> .slide= -13pt .plabel= b å_\pi %D @ 0 @ 1 midpoint y+= -5 .TeX= \text{\tiny$�$} place %D @ 0 @ 1 midpoint y+= 5 .TeX= \text{\tiny$�$} place %D )) %D (( \bbE .tex= \Setito \bbB .tex= \Set -> .plabel= l \Cod %D A .tex= X×Y B .tex= X -> .plabel= a \pi %D )) %D enddiagram %D % $$\diag{5-set-external}$$ % %D diagram adjoints-to-cob %D 2Dx 100 +95 %D 2D 100 text %D 2D %D 2D +60 EAEB PQR %D 2D %D 2D +85 EXYEX binary %D 2D %D (( %D text .tex= \fibrationbox BOX place %D EAEB .tex= \fdiag{5-algebraic-external} BOX place %D EXYEX .tex= \fdiag{5-set-external} BOX place %D PQR .tex= \fdiag{5-algebraic-internal} BOX place %D binary .tex= \fdiag{5-set-binary} BOX place %D text EAEB -> .plabel= l (1) %D EAEB PQR -> .plabel= a (2) %D EAEB EXYEX -> .plabel= l (3) %D PQR binary -> .plabel= r (4) %D EXYEX binary -> .plabel= a (5) %D )) %D enddiagram %D $$\def\fibrationbox{\fbox{\ltabular{ A fibration $p:\bbE \to \bbB$ \\ plus for each $f:A \to B$ in $\bbB$ \\ adjoints $Æ_f \dashv f^* \dashv å_f$ \\ }}} \diag{adjoints-to-cob} $$ $$\text{Figure [Fib]}$$ % -------------------- % «pres-frob-bcc» (to ".pres-frob-bcc") % (sec "Preservations, Frobenius, Beck-Chevalley" "pres-frob-bcc") \mysection {Preservations, Frobenius, Beck-Chevalley} {pres-frob-bcc} The complete definition of a hyperdoctrine is what we have seen in the last section, plus these {\sl properties}: * each $f^*$ preserves, modulo iso, the $�$, the $∧$, and the $⊃$ of each fiber; * the Frobenius Property holds, * the Beck-Chevalley Condition holds. We can regard these properties as {\sl structure}. They can all be stated in the same way: for $f:A \to B$ in $\bbB$ and for predicates $P$ and $Q$ over $B$, we have natural constructions $\Ptruenat$, $\Pandnat$, $\Pimpnat$, $\Frobnat$, that produce arrows that we want to be isos; so we establish rules $\Ptrue$, $\Pand$, $\Pimp$, $\Frob$, that produce arrows going in the opposite directions of the previous ones. %: %: f:A->B f:A->B %: ===========\Ptruenat -----------\Ptrue %: f^*�_B|-�_A �_A|-f^*�_B %: %: ^Ptruenat-short-std ^Ptrue-std %: %: f P Q f P Q %: ===================\Pandnat -------------------\Pand %: f^*(P∧Q)|-f^*P∧f^*Q f^*P∧f^*Q|-f^*(P∧Q) %: %: ^Pandnat-short-std ^Pand-std %: %: f Q R f Q R %: ===================\Pimpnat -------------------\Pimp %: f^*P∧f^*Q|-f^*(Q⊃R) f^*(Q⊃R)|-f^*P⊃f^*Q %: %: ^Pimpnat-short-std ^Pimp-std %: %: %: f P Q f P Q %: =====================\Frobnat ---------------------\Frob %: Æ_f(P∧f^*Q)|-(Æ_fP)∧Q Æ_f(P∧f^*Q)|-(Æ_fP)∧Q %: %: ^Frobnat-short-std ^Frob-std %: $$\ded{Ptruenat-short-std} \qquad \ded{Ptrue-std}$$ $$\ded{Pandnat-short-std} \qquad \ded{Pand-std}$$ $$\ded{Pimpnat-short-std} \qquad \ded{Pimp-std}$$ $$\ded{Frobnat-short-std} \qquad \ded{Frob-std}$$ \msk The natural construction for the $\Frobnat$ arrow is given by the diagram below. Note that it shows both $\Frobnat$ (going rightwards, shortened to just `$\nat$') and $\Frob$: % % (find-854 "" "frobenius") % (find-854 "" "frobenius" "diagram Frob-std") % (find-854page 80 "Frobenius") % %D diagram Frob-std %D 2Dx 100 +45 +35 +10 +30 %D 2D 100 B0 ================> B1 %D 2D ^ ^ ^ %D 2D | / | %D 2D - \ - %D 2D +20 B2 ========> B3 <--> B3' %D 2D - / - %D 2D | \ | %D 2D v v v %D 2D +20 B4 <================ B5 %D 2D %D 2D +15 b0 |---------------> b1 %D 2D %D (( B0 .tex= P B1 .tex= Æ_fP %D B2 .tex= P&f^*Q B3 .tex= Æ_f(P&f^*Q) B3' .tex= (Æ_fP)&Q %D B4 .tex= f^*Q B5 .tex= Q %D b0 .tex= A b1 .tex= B %D )) %D (( %D B0 B1 |-> B2 B0 -> B3 B1 -> B3' B1 -> %D B4 B5 <-| B2 B4 -> B3 B5 -> B3' B5 -> %D B2 B3 |-> B3 B3' -> sl^ .plabel= a \nat B3 B3' <- sl_ .plabel= b \Frob %D B0 B2 midpoint B1 B3 midpoint harrownodes nil 20 nil |-> %D B2 B4 midpoint B3 B5 midpoint harrownodes nil 20 nil |-> %D )) %D (( b0 b1 -> .plabel= a f %D )) %D enddiagram % $$\diag{Frob-std}$$ Similarly to what we have done in the section \ref{dictionary}, we can extract from the diagram above a dictionary defining some names in terms of the others; note that due to our conventions $P{∧}f^*Q \to P$ is a hom-set, but $P{∧}f^*Q \vdash P$ is a particular morphism, and we can treat it as a name. By expanding the definitions in $Æ_f(P{\land}f^*Q) \vdash (Æ_fP){\land}Q$, we obtain a definition for the $\Frobnat$ rule: %: %: f Q f Q %: ---\cob -----\cob %: P f^*Q P f^*Q %: --------- ------------' %: P∧f^*Q|-P f P∧f^*Q|-f^*Q %: -----------------Æ_f --------------{Æ_f}^\flat %: Æ_f(P∧f^*Q)|-Æ_fP Æ_f(P∧f^*Q)|-Q %: -------------------------------------\ang{,} %: Æ_f(P∧f^*Q)|-(Æ_fP)∧Q %: %: ^Frobnat-std %: $$\begin{array}{l} \ded{Frobnat-short-std} \quad := \\ \\ \phantom{OO} \ded{Frobnat-std} \\ \end{array} $$ The diagram also says that the arrow generated by the $\Frob$ rule is inverse to the arrow generated by the (derived) rule $\Frobnat$. This is by a convention: in a `$\leftrightarrows$' the two arrows are the directions of an iso. % (find-854 "" "frobenius") % (find-854 "" "frobenius" "diagram Frob-std") % (find-854page 80 "Frobenius") \msk The statement of the Beck-Chevalley Condition (``BCC'' from now on) is slightly more complex: it requires four arrows in $\bbB$. For any commutative square in $\bbB$ as below, % %D diagram BCCL-std %D 2Dx 100 +45 +55 +45 %D 2D 100 B0 <====================== B1 %D 2D -\\ -\\ %D 2D | \\ | \\ %D 2D v \\ v \\ %D 2D +20 B2 <\\> B2' ============== B3 \\ %D 2D /\ \/ /\ \/ %D 2D +15 \\ B4 \\ B5 %D 2D \\ - \\ - %D 2D \\ | \\| %D 2D \\v \v %D 2D +20 B6 <===================== B7 %D 2D %D 2D +10 b0 |---------------------> b1 %D 2D |-> |-> %D 2D +35 b2 |--------------------> b3 %D 2D %D (( %D B0 .tex= f^{\prime*}P B1 .tex= P %D B2 .tex= z^{\prime*}f^*Æ_zP B2' .tex= f^{\prime*}z^*Æ_zP B3 .tex= z^*Æ_zP %D B4 .tex= Æ_{z'}f^{\prime*}P B5 .tex= Æ_zP %D B6 .tex= f^*Æ_zP B7 .tex= Æ_zP %D B0 B1 <-| B0 B2 -> B0 B2' -> B1 B3 -> B2 B2' <-> B2' B3 <-| %D B0 B4 |-> B1 B5 |-> %D B2 B6 <-| B3 B7 <-| %D B6 B7 <-| B5 B7 -> .plabel= r \id %D B4 B6 -> sl_ .plabel= l î B4 B6 <- sl^ .plabel= r \BCCL %D B0 B2' midpoint B1 B3 midpoint harrownodes nil 20 nil <-| %D B0 B2 midpoint B4 B6 midpoint dharrownodes nil 20 nil |-> %D B1 B3 midpoint B5 B7 midpoint dharrownodes nil 20 nil <-| %D )) %D (( b0 .tex= X×_{Y}Z b1 .tex= Z b2 .tex= X b3 .tex= Y %D b0 b1 -> .plabel= b f' %D b0 b2 -> .plabel= l z' %D b1 b3 -> .plabel= r z %D b2 b3 -> .plabel= a f %D b0 relplace 20 7 \pbsymbol{7} %D )) %D enddiagram % $$\diag{BCCL-std}$$ % and any object $P$ over $Z$, we have a natural construction for an arrow: % $$\BCCLnat: Æ_{z'} f^{\prime*} P \to f^* Æ_z P$$ The rule $\BCCL$ says that when $f$, $f'$, $z$, $z'$ form a pullback the arrow $\BCCLnat$ has an inverse, $\BCCL$: %: %: f z f' z' P %: ---------------\BCCL %: f^*Æ_zP|-Æ_{z'}f^{\prime*}P %: %: ^BCCL-rule %: $$\ded{BCCL-rule}$$ We will sometimes write $\BCCLnat$ and $\BCCL$ together as an iso, and call that iso $\BCCL$ (by a slight abuse of language); and we simplify the diagram for Beck-Chevalley by drawing only its square in $\bbB$ and on top of it the two faces of the cube in $\bbE$ that border on the missing edge. We will sometimes draw the arrows generated by $\Ptrue$, $\Pand$, $\Pimp$ and $\Frob$ as isos, too. % Beck-Chevalley (for the left adjoint to $f^*$): % -------------------- % «eq-elim» (to ".eq-elim") % (sec "The Eq-Elim rule" "eq-elim") \mysection {The Eq-Elim rule} {eq-elim} % (find-angg ".emacs.papers" "jacobs") % (find-jacobspage (+ 19 179) "replacement") The main interest of hyperdoctrines is that they are exactly the categories in which we can interpret (a certain system of typed, intuitionistic) first-order logic. The proof of this includes a way to interpret each deduction rule of first-order logic categorically, and another ``translation'' going in the opposite direction. Let's look at a tiny part --- possibly the hardest one --- of these translations. \msk The rule for equality-elimination in Natural Deduction can be formulated as this (\cite{SeelyBeck}, p.507): %: %: b=b' Qabb %: ---=I -----------=E %: b=b Qabb' %: %: ^=I ^=E %: $$\ded{=E}$$ Categorically, it will become this morphism (we will call it $\EqE$): % $$\ssst{a,b,b'}{b{=}b'{∧}Qabb} \vdash \ssst{a,b,b'}{Qabb'}$$ There are three different natural constructions in a hyperdoctrine for objects deserving the name $\ssst{a,b,b'}{b{=}b'{∧}Qabb}$. We will use this one in $\EqE$: %: %: \DD \ssst{b}{�} \dd \ssst{a,b,b'}{Qabb'} %: ----------------Æ_0 -------------------------\cob %: \opip \ssst{b,b'}{b=b'} \opi \ssst{a,b}{Qabb} %: ------------------------\cob -----------------------\cob %: \ssst{a,b,b'}{b=b'} \ssst{a,b,b'}{Qabb} %: ---------------------------------------------------{∧} %: \ssst{a,b,b'}{b=b'∧Qabb} %: %: ^=E-domain-pred %: %: %: b|->b,b' \ssst{b}{�} a,b|->a,b,b' \ssst{a,b,b'}{Qabb'} %: ----------------Æ_0 -------------------------\cob %: a,b,b'|->b,b' \ssst{b,b'}{b=b'} a,b,b'|->a,b \ssst{a,b}{Qabb} %: --------------------------\cob -----------------------\cob %: \ssst{a,b,b'}{b=b'} \ssst{a,b,b'}{Qabb'} %: ---------------------------------------------------{∧} %: \ssst{a,b,b'}{b=b'∧Qabb'} %: %: ^=E-domain-pred %: %: %: \DD \TB \dd Q %: --------Æ_0 ------\cob %: \opip Æ_\DD\TB \opi \dd^*Q %: ---------------\cob --------------\cob %: \opipstar"Æ_\DD\TB \opistar\dd^*Q %: -------------------------------------{∧} %: \opipstar"Æ_\DD\TB∧\opistar\dd^*Q %: %: ^=E-domain-std %: $$\thinmtos \ded{=E-domain-pred}$$ $$\ded{=E-domain-std}$$ Note that $\opi:A{×}B{×}B \to A{×}B$ is the projection on the first two coordinates, and $\opip:A{×}B{×}B \to B{×}B$ the projection on the last two coordinates. To show that the rule $\EqE$ can be interpreted in a hyperdoctrine we need a way to construct, for arbitrary objects $A$, $B$ in the base category and an arbitrary predicate $Q$ over $A{×}B{×}B$, a morphism $\EqEdomwide \vdash Q$. We will construct it as a composite of three maps. % Let $f := \opip:A×B×B \to B×B$ be the projection on the two last % coordinates, and Take $P := \TB = \ssst{b}{�}$, $f:=\opistar$, $z:=\DD$, $z':=$, $f':=\pi'$ in the diagram for Beck-Chevalley. Then the $\BCCL$ iso says that {\sl two} different categorical constructions for $\ssst{a,b,b'}{b=b'}$ are isomorphic, and by drawing also the $\Ptrue$ iso and its image by $Æ_\dd$ we get isos between the {\sl three} different categorical constructions for $\ssst{a,b,b'}{b=b'}$: %D diagram eq-simpl-std %D 2Dx 100 +40 +50 %D 2D PT %D 2D 100 E0 <-> E1 <--| E2 %D 2D - - - - %D 2D | | | | %D 2D v v v | %D 2D +30 E3 <-> E4 | %D 2D ^ ^ | %D 2D \ |BCC | %D 2D v v v %D 2D +20 E5 <--| E6 %D 2D %D 2D +15 B0 ---> B1 %D 2D | _| | %D 2D | | %D 2D v v %D 2D +30 B2 ---> B3 %D 2D %D (( E0 .tex= \TAB E1 .tex= \pipstar\TB E2 .tex= \TB %D E3 .tex= Æ_\TAB E4 .tex= Æ_\pipstar\TB %D E5 .tex= \opipstarÆ_\DD\TB E6 .tex= Æ_\DD\TB %D E0 E1 <-> .plabel= a \Ptrue E1 E2 <-| %D E0 E3 |-> E1 E4 |-> E2 E6 |-> E0 E4 varrownodes nil 17 nil |-> %D E3 E4 <-> %D E3 E5 <-> .plabel= l i E4 E5 <-> .plabel= r \BCCL %D E5 E6 <-| %D )) %D (( B0 .tex= A×B B1 .tex= B %D B2 .tex= A×B×B B3 .tex= B×B %D @ 0 @ 1 -> .plabel= b \pip %D @ 0 @ 2 -> .plabel= l \dd @ 1 @ 3 -> .plabel= r \DD %D @ 2 @ 3 -> .plabel= b \opip %D @ 0 relplace 7 7 \pbsymbol{7} %D )) %D enddiagram %D %D diagram eq-simpl-dnc %D 2Dx 100 +30 +45 %D 2D PT %D 2D 100 E0 <-> E1 <--| E2 %D 2D - - - - %D 2D | | | | %D 2D v v v | %D 2D +30 E3 <-> E4 | %D 2D ^ ^ | %D 2D \ |BCCL | %D 2D v v v %D 2D +20 E5 <--| E6 %D 2D %D 2D +15 B0 ---> B1 %D 2D | _| | %D 2D | | %D 2D v v %D 2D +30 B2 ---> B3 %D 2D %D (( E0 .tex= � E1 .tex= � E2 .tex= � %D E3 .tex= b=b' E4 .tex= b=b' %D E5 .tex= b=b' E6 .tex= b=b' %D E0 E1 <-> .plabel= a \Ptrue E1 E2 <= %D E0 E3 => E1 E4 => E2 E6 => E0 E4 varrownodes nil 17 nil |-> %D E3 E4 <-> %D E3 E5 <-> .plabel= l i E4 E5 <-> .plabel= r \BCCL %D E5 E6 <= %D )) %D (( B0 .tex= a,b B1 .tex= b %D B2 .tex= a,b,b' B3 .tex= b,b' %D @ 0 @ 1 |-> .plabel= b \pip %D @ 0 @ 2 |-> .plabel= l \dd @ 1 @ 3 |-> .plabel= r \DD %D @ 2 @ 3 |-> .plabel= b \opip %D @ 0 relplace 7 7 \pbsymbol{7} %D )) %D enddiagram %D $$\diag{eq-simpl-std} \qquad \diag{eq-simpl-dnc} $$ Now look at the diagram below. The arrow $\Frob^{\prime\nat}$ (going rightwards, shortened to $\nat$) is built in the same way as $\Frobnat$, but using the maps $!$ and $\cong$ in place of $$ and $'$. In any fibration we have natural isos $f^*g^*P \bij (f;g)^*P$ and $\id^*P \bij P$ (for {\sl diagrammatic} proofs for that, see \cite{OchsDownHyp}), and as $;\opi=\id$, this gives us the map $\cong$. The arrow $\Frob'$ is the inverse of $\Frob^{\prime\nat}$; its construction (not shown) is an easy consequence of Frobenius. The map $_Q$ is a counit for the adjunction $Æ_ \dashv ^*$. %D diagram Frob-5 %D 2Dx 100 +30 +50 +40 %D 2D 100 E0 ================> E1 %D 2D ^ ^ ^ %D 2D | / | %D 2D - \ - %D 2D +25 E2 ========> E3 <--> E3' %D 2D - / - %D 2D | \ | %D 2D v v v %D 2D +25 E4 <================ E5 <=== E6 %D 2D %D 2D +15 S2 ========> S3 <--> S3' %D 2D - / - %D 2D | \ | %D 2D v v v %D 2D +25 S4 <================ S5 %D 2D %D 2D +15 B0 |---------------> B1 <--- B2 %D 2D %D (( E0 .tex= \TAB E1 .tex= Æ_\f\TAB %D E2 .tex= \f^*Q E3 .tex= Æ_\f\f^*Q E3' .tex= (Æ_\f\TAB)&\g^*\f^*Q %D E4 .tex= \f^*\g^*(\f^*Q) E5 .tex= \g^*\f^*Q E6 .tex= \f^*Q %D %D S2 .tex= \f^*Q S3 .tex= Æ_\f\f^*Q S3' .tex= (Æ_\f\TAB)&\g^*\f^*Q %D S4 .tex= \f^*Q S5 .tex= Q %D %D B0 .tex= A×B B1 .tex= A×B×B B2 .tex= A×B %D )) %D (( E0 E1 |-> %D E2 E0 -> E3 E1 -> E3' E1 -> %D # E2 E3 |-> E3 E3' <-> %D E2 E3 |-> E3 E3' -> sl^ .PLABEL= ^(0.40) \nat %D E3 E3' <- sl_ .PLABEL= _(0.43) \Frob' %D E2 E4 -> .plabel= l \cong E3 E5 -> E3' E5 -> %D E4 E5 <-| E5 E6 <-| %D E0 E2 midpoint E1 E3 midpoint harrownodes nil 20 nil |-> %D E2 E4 midpoint E3 E5 midpoint harrownodes nil 20 nil |-> %D )) %D (( # S0 S1 |-> %D # S2 S0 -> S3 S1 -> S3' S1 -> %D # S2 S3 |-> S3 S3' <-> %D S2 S3 |-> # S3 S3' -> sl^ .PLABEL= ^(0.40) \nat %D # S3 S3' <- sl_ .PLABEL= _(0.43) \Frob_5 %D S2 S4 -> .plabel= l \id S3 S5 -> .plabel= a \ee_Q # S3' S5 -> %D S4 S5 <-| # S5 S6 <-| %D # S0 S2 midpoint S1 S3 midpoint harrownodes nil 20 nil |-> %D S2 S4 midpoint S3 S5 midpoint harrownodes nil 20 nil |-> %D %D # S3 S3' <- .plabel= a \Frob' S3' S5 --> .plabel= r =E'_Q %D )) %D (( B0 B1 -> .plabel= a \f %D B1 B2 -> .plabel= a \g %D B0 B2 -> sl__ .plabel= b \id %D )) %D enddiagram % $$\def\TAB{�\!_{A×B}} \def\f{} \def\g{{\ov}} \diag{Frob-5} $$ If $Q \equiv \ssst{a,b,b'}{Qabb'}$ then we can downcase the diagram above as: %D diagram Frob-5-dnc %D 2Dx 100 +40 +45 +40 %D 2D 100 E0 ================> E1 %D 2D ^ ^ ^ %D 2D | / | %D 2D - \ - %D 2D +25 E2 ========> E3 <--> E3' %D 2D - / - %D 2D | \ | %D 2D v v v %D 2D +25 E4 <================ E5 <=== E6 %D 2D %D 2D +15 S2 ========> S3 <--> S3' %D 2D - / - %D 2D | \ | %D 2D v v v %D 2D +25 S4 <================ S5 %D 2D %D 2D +15 B0 |---------------> B1 <--- B2 %D 2D %D (( E0 .tex= � E1 .tex= b=b' %D E2 .tex= Qabb E3 .tex= b=b'∧Qabb E3' .tex= b=b'∧Qabb %D E4 .tex= Qabb E5 .tex= Qabb E6 .tex= Qabb %D %D S2 .tex= Qabb S3 .tex= b=b'∧Qabb S3' .tex= b=b'∧Qabb %D S4 .tex= Qabb S5 .tex= Qabb' %D %D B0 .tex= a,b B1 .tex= a,b,b' B2 .tex= a,b %D )) %D (( E0 E1 => %D E2 E0 |-> E3 E1 |-> E3' E1 |-> %D # E2 E3 => E3 E3' <-> %D E2 E3 => E3 E3' |-> sl^ .PLABEL= ^(0.50) \nat %D E3 E3' <-| sl_ .PLABEL= _(0.50) \Frob' %D E2 E4 <-> .plabel= l \cong E3 E5 |-> E3' E5 |-> %D E4 E5 <= E5 E6 <= %D E0 E2 midpoint E1 E3 midpoint harrownodes nil 20 nil |-> %D E2 E4 midpoint E3 E5 midpoint harrownodes nil 20 nil |-> %D )) %D (( S2 S3 => # S3 S3' |-> sl^ .PLABEL= ^(0.40) \nat %D # S3 S3' <-| sl_ .PLABEL= _(0.43) \Frob_5 %D S2 S4 |-> .plabel= l \id S3 S5 |-> .plabel= a \ee_Q # S3' S5 |-> %D S4 S5 <= # S5 S6 <= %D S2 S4 midpoint S3 S5 midpoint harrownodes nil 20 nil |-> %D %D # S3 S3' <-| .plabel= a \Frob' S3' S5 |--> .plabel= r =E'_Q %D )) %D (( B0 B1 |-> .plabel= a b':=b %D B1 B2 |-> # .plabel= a \g %D B0 B2 |-> sl__ .plabel= b \id %D )) %D enddiagram % $$\def\TAB{�\!_{A×B}} \def\f{} \def\g{{\ov}} \diag{Frob-5-dnc} $$ The arrow $\EqE$ is the composite below. % %D diagram EqE-as-composite %D 2Dx 100 +55 +55 +50 %D 2D 100 A0 A1 B0 B1 %D 2D %D 2D +30 A2 A3 B2 B3 %D 2D %D (( A2 x+= 30 %D A3 x+= 30 %D B2 x+= 30 %D B3 x+= 30 %D )) %D (( A0 .tex= Æ_^*Q A1 .tex= (Æ_\TAB)∧\opistar^*Q %D A2 .tex= Q A3 .tex= \EqEdomthin %D A0 A1 <-> .plabel= a \Frob' %D A0 A2 -> .PLABEL= _(0.43) _Q A1 A3 <-> .PLABEL= ^(0.60) i×\opistar^*Q %D A2 A3 <-- .plabel= b \EqE %D )) %D (( B0 .tex= b=b'∧Qabb %D B1 .tex= b=b'∧Qabb %D B2 .tex= Qabb' %D B3 .tex= b=b'∧Qabb %D B0 B1 <-> %D B0 B2 |-> B1 B3 <-> %D B2 B3 <--| .plabel= b \EqE %D )) %D enddiagram %D $$\diag{EqE-as-composite}$$ % -------------------- % «archetypal» (to ".archetypal") % (sec "Archetypal Models" "archetypal") \mysection {Archetypal Models} {archetypal} Imagine that we have placed side to side the downcased and the standard diagrams of the last section. When we go from downcased to standard --- e.g, from $\ssst{a,b}{b{=}b'∧Qabb}$ to $Æ_^*Q$ --- we are attributing a precise meaning to a (potentially ambiguous) ``name''; however, when we go in the opposite direction, from standard to downcased, we are ``attributing meaning'' in another sense: we are giving an ``intuitive meaning'' (or better: ``intuitive content'') to something that, if we had received it in an article, as in the section \ref{transmission}, could have felt initially as something purely abstract... I had to make some ``informal definitions'' in the course of this paper, using terms that I could not define precisely, like {\sl diagrammatical reasoning}, {\sl reconstruction}, and {\sl intuition}... I will have to make a few more. \begin{quote} % (Informal) definition. The {\sl archetypal model} for a structure --- for example, the archetypal model for a hyperdoctrine, which is going to be $\Pred$ --- is {\sl a particular case of that structure that ``suggests'' a certain ``language'' for working on that structure} --- i.e., for doing constructions and proofs on it. \end{quote} That ``archetypical language'' does not need to be unambiguous --- think in $\ssst{a,b}{b{=}b'∧Qabb}$ and its several different precise meanings ---, does not need to have a downcased version --- in section \ref{hyps} the internal diagrams were what mattered, the downcasings were mentioned just in passing ---, and does not need to be convenient for expressing {\sl all} possible constructions. What is relevant is that the archetypical language, {\sl when used side-to-side with the ``algebraic'' language}, should give us a way to reason, both {\sl intuitively} and {\sl precisely}, about the structure we're working on; in particular, it should let us formulate reasonable conjectures quickly, and check them with reasonable ease... but what are ``reasonable conjectures'', and where do they come from? If we want to be able to reconstruct a theory from minimal information we need to have ways to: 1) generate ``reasonable conjectures'', 2) filter out those which are either impossible or too hard to prove, 3) prove the others. {\sl Bounded} proof search takes care of points (1) and (2); to give a partial answer to (1) we will concentrate our attention on Category Theory --- more specifically, on situations where we are trying to generalize a ``base case''. There we have two (non-disjoint) sources of ``reasonable conjectures'': 1a) constructions and proofs that make sense and hold in the base case, that we may expect to generalize; 1b) {\sl language}. Our choice of names for objects in a hyperdoctrine gave us several different formal meanings for $\ssst{a,b}{b{=}b'∧Qabb}$ --- we expect them to be at least equivalent modulo isomorphism. This is similar to writing $a+b+c$ instead of $a+(b+c)$ or $(a+b)+c$: our choice of language ``suggests'' that $a+(b+c) = (a+b)+c$. An ``archetypal example'', in which all the main ideas appear, is: \begin{quote} $\Set$ is the archetypical CCC. \end{quote} % -------------------- % «CCCs» (to ".CCCs") % (sec "Cartesian Closed Categories" "CCCs") \mysection {Cartesian Closed Categories} {CCCs} % (find-854 "" "CCCs") % (find-854 "" "CCCs") % (find-854page 43 "CCCs") % (find-854 "" "lambda-calc-in-a-hyp") % (find-854page 87 "lambda-calc-in-a-hyp") % Here is a well-know example --- maybe even the ``archetypical % example'' for our method. A {\sl Cartesian Closed Category} $(\catC, ×, 1, \to)$ is a category $\catC$ plus a ``cartesian closed structure'' $(×, 1, \to)$ on it. The following diagram fixes the (categorical) notation that we will use for the operations induced by $(×, 1, \to)$: %D diagram CCC-1 %D 2Dx 100 +30 +30 +25 +30 +30 %D 2D 100 P0 T0 E0 E1 %D 2D %D 2D +30 P1 P2 P3 T1 E2 E3 %D 2D %D (( P0 .tex= A P1 .tex= B P2 .tex= B×C P3 .tex= C %D P0 P1 -> .plabel= a f %D P0 P2 -> .plabel= m \ang{f,g} %D P0 P3 -> .plabel= a g %D P1 P2 <- .plabel= b \pi %D P2 P3 -> .plabel= b \pi' %D )) %D (( T0 .tex= A T1 .tex= 1 -> .plabel= r ! %D )) %D (( E0 .tex= A×B E1 .tex= A %D E2 .tex= C E3 .tex= B{->}C %D E0 E1 <-| %D E0 E2 -> .PLABEL= _(0.43) \uncur"g %D E0 E2 -> .PLABEL= _(0.57) f %D E1 E3 -> .PLABEL= ^(0.43) g %D E1 E3 -> .PLABEL= ^(0.57) \cur"f %D E0 E3 harrownodes nil 20 nil <-| sl^ %D E0 E3 harrownodes nil 20 nil |-> sl_ %D E2 E3 |-> %D )) %D enddiagram %D $$\diag{CCC-1}$$ Each of its three parts can be attributed a precise meaning, as we did with the adjunction square in section \ref{adjunctions}; and if we regard each of them as a different adjunction (see \cite{Awodey}, pages 182 and 188, for the three adjunctions), then we get the equations that these operations have to obey. \msk The Big Theorem is: {\sl CCCs are exactly the categorical models for the simply-typed $�$-calculus with pairs and unit}. \msk The precise, ``standard'' statement for that theorem, including definitions, statements, and proofs, is quite long. Amazingly, it can be reconstructed from just this downcased diagram, % (find-854 "" "lambda-calc-in-a-hyp") %D diagram CCC-1-dnc %D 2Dx 100 +30 +30 +25 +30 +30 %D 2D 100 P0 T0 E0 E1 %D 2D %D 2D +30 P1 P2 P3 T1 E2 E3 %D 2D %D (( P0 .tex= a P1 .tex= b P2 .tex= b,c P3 .tex= c %D P0 P1 |-> %D P0 P2 |-> %D P0 P3 |-> %D P1 P2 <-| %D P2 P3 |-> %D )) %D (( T0 .tex= a T1 .tex= * |-> %D )) %D (( E0 .tex= a,b E1 .tex= a %D E2 .tex= c E3 .tex= b|->c %D E0 E1 <= %D E0 E2 |-> %D E1 E3 |-> %D E0 E3 harrownodes nil 20 nil <-> %D E2 E3 => %D )) %D enddiagram %D $$\diag{CCC-1-dnc}$$ % plus this downcasing of the Natural Deduction rules for $∧$, $�$, and $⊃$: %: %: a a %: : : %: b c b,c b,c %: ----\ang{,} --- ---' %: b,c b c %: %: ^HLD-1 ^HLD-2 ^HLD-3 %: %: a %: -! %: * %: %: ^HLD-4 %: %: a a a [b]^1 %: : : :::: %: b b|->c c %: ---------\app -----�;1 %: c b|->c %: %: ^HLD-5 ^HLD-6 %: $$\begin{array}{cccc} \ded{HLD-1} & \ded{HLD-2} \quad \ded{HLD-3} \\ \\ \ded{HLD-4} \\ \\ \ded{HLD-5} & \ded{HLD-6} \\ \end{array} $$ Starting from that, we uppercase the downcased diagram in the following way, that makes clear how an intermediate, ``sequent-like'' form (as in \cite{LambekScott}, pp.47--49), should behave: % % (find-books "__cats/__cats.el" "lambek-scott") % (find-lambekscottpage (+ 8 47) "1. Propositional calculus as a deductive system") % %D diagram CCC-HL2 %D 2Dx 100 +30 +30 +25 +30 +30 +35 %D 2D 100 P0 T0 E0 E1 E0' %D 2D %D 2D +30 P1 P2 P3 T1 E2 E3 E2' %D 2D %D (( P0 .tex= A P1 .tex= B P2 .tex= B{×}C P3 .tex= C %D P0 P1 -> .plabel= a a|-b %D P0 P2 -> .plabel= m a|-\ang{b,c} %D P0 P3 -> .plabel= a a|-c %D P1 P2 <- .plabel= b p|-p %D P2 P3 -> .plabel= b p|-p' %D )) %D (( T0 .tex= A T1 .tex= 1 -> .plabel= l a|-* %D )) % %D (( E0' .tex= (B{->}C){×}B E2' .tex= C % %D E0' E2' -> .plabel= c \ev_{BC} % %D )) %D (( E0 .tex= A×B E1 .tex= A %D E2 .tex= C E3 .tex= B{->}C %D E0 E1 <-| %D E0 E2 -> .PLABEL= _(0.43) a,b|-fb %D E0 E2 -> .PLABEL= _(0.57) a,b|-c %D E1 E3 -> .PLABEL= ^(0.43) a|-f %D E1 E3 -> .PLABEL= ^(0.57) a|-�b�B.c %D E0 E3 harrownodes nil 20 nil <-| sl^ %D E0 E3 harrownodes nil 20 nil |-> sl_ %D E2 E3 |-> %D )) %D enddiagram %D $$\diag{CCC-HL2}$$ % Then the next steps are to state precisely: * the syntax and the rules of the type system, * the operations of a CCC, * the translation of each of the rules of the type system, * the translation of each of the operations of the CCC, * the reductions and conversions of the type system, * the equations obeyed by the operations of a CCC, \noindent and then we have to prove that the CCC equations are respected by the translation to $�$-calculus, and that the $�$-calculus equations are respected by the translation to categories. We also need to allow ``impurities'' in both the $�$-calculus and in the CCCs. Take a look at the CCC of the next section: it has extra {\sl constants} --- an object $A$ and morphisms $\corn0$, $\corn1$, $+$, $�$ --- that do not exist in all CCCs, and these constants obey certain {\sl equations}. In a {\sl Pure Type System} (\cite{GeuversPhD}) extra constants and equations are not allowed, but in theorems like the Big Theorem above (\cite{Jacobs} has many more like it) we allow in the type system exactly the kinds of constants and equations that are easy to interpret in the categorical models. The kinds of allowed ``impurities'' are usually defined in the beginning, in the definition of the type system and of the categorical structure, but this can be delayed until after the translations are presented. % (find-books "__cats/__cats.el" "awodey") % -------------------- % «olts» (to ".olts") % (sec "Objects of Line Type" "olts") \mysection {Objects of Line Type} {olts} Our approach also works in cases where we do have an archetypical language with clear intuitive content, but where we have no concrete archetypical model besides a term model built purely syntactically. Through this section let's pretend that we do not know Synthetic Differential Geometry (\cite{Kock81}, \cite{BellSIA}, \cite{MoeRey}). \msk Let $(A, \corn0, \corn1, +, �)$ be a commutative ring in a CCC. That means: we have a diagram % %D diagram ring-object %D 2Dx 100 +35 +35 %D 2D 100 A0 ====> A1 <============== A2 %D 2D %D 2D +20 a0 |---> b0 %D 2D +6 a1 |---> b1 %D 2D +6 b2 <-------------| c2 %D 2D +6 b3 <-------------| c3 %D 2D %D (( A0 .tex= 1 A1 .tex= A A2 .tex= A×A %D a0 .tex= * b0 .tex= 0 %D a1 .tex= * b1 .tex= 1 %D b2 .tex= a+b c2 .tex= a,b %D b3 .tex= ab c3 .tex= a,b %D )) %D (( A0 A1 -> sl^ .plabel= a \corn0 %D A0 A1 -> sl_ .plabel= b \corn1 %D A1 A2 <- sl^ .plabel= a + %D A1 A2 <- sl_ .plabel= b � %D a0 b0 |-> %D a1 b1 |-> %D b2 c2 <-| %D b3 c3 <-| %D )) %D enddiagram %D $$\diag{ring-object}$$ % and the morphisms $\corn0$, $\corn1$, $+$, $�$ behave as expected. Let $D$ be the set of zero-square infinitesimals of $A$, i.e., $\sst{�A}{^2=0}$; $D$ can be defined categorically as an equalizer. If we take $A:=\R$, then $D=\{0\}$; but if we let $A$ be a ring with nilpotent infinitesimals, then $\{0\} \subsetneq A$. % Our notation will suggest that we are in $\R$, though. \msk The main theorem of \cite{Kock77} says that if the map % $$\begin{array}{rrcl} \aa: & A×A & \to & (D{\to}A) \\ & (a,b) & \mto & ��D.(a+b) \\ \end{array} $$ % is invertible, then we can use $\aa$ and $\aa�$ to {\sl define} the derivative of maps from $A$ to $A$ --- {\sl every} morphism $f: A \to A$ in the category $\catC$ will be ``differentiable'' ---, and the resulting differentiation operation $f \mapsto f'$ behaves as expected: we have, for example, $(fg)'=f'g+fg'$ and $(f�g)' = (f'�g)g'$. Commutative rings with the property that their map $\aa$ is invertible are called {\sl ring objects of line type}. ROLTs are hard to construct, so most of the proofs about them have to be done in a very abstract setting. However, if we can use the following downcasings for $\aa$ and $\aa�$ --- note that $\bb=(\aa�;)$, that $\cc=(\aa�;')$, and that these notations do not make immediately obvious that $\aa$ and $\aa�$ are inverses ---, % %D diagram aa-and-aa-inverse %D 2Dx 100 +30 +30 +15 +40 +40 %D 2D 100 A0 <-- A1 --> A2 b0 <-- b1 --> b2 %D 2D ^ |^ ^ ^ |^ ^ %D 2D \ || / \ || / %D 2D \ v| / \ v| / %D 2D +30 A3 b3 %D 2D %D 2D +15 B0 <-- B1 --> B2 C0 <-- C1 --> C2 %D 2D ^ |^ ^ ^ |^ ^ %D 2D \ || / \ || / %D 2D \ v| / \ v| / %D 2D +30 B3 C3 %D 2D %D (( A0 .tex= A A1 .tex= A×A A2 .tex= A %D A3 .tex= (D{->}A) %D @ 0 @ 1 <- .plabel= a %D @ 1 @ 2 -> .plabel= a ' %D @ 0 @ 3 <- .plabel= l \bb # \sm{\bb\;:=\\\aa�;} %D @ 1 @ 3 -> sl_ .PLABEL= _(0.42) \aa %D @ 1 @ 3 <- sl^ .PLABEL= ^(0.38) \aa� %D @ 2 @ 3 <- .plabel= r \cc %D )) %D (( B0 .tex= a B1 .tex= a,b B2 .tex= b %D B3 .tex= (\mapsto"a+b) %D @ 0 @ 1 <-| .plabel= a %D @ 1 @ 2 |-> .plabel= a ' %D @ 0 @ 3 <-| .plabel= l \bb %D @ 1 @ 3 |-> .PLABEL= _(0.43) \aa %D @ 2 @ 3 <-| .plabel= r \cc %D )) %D (( C0 .tex= f(0) C1 .tex= (f(0),f'(0)) C2 .tex= f'(0) %D C3 .tex= (\mapsto"f()) %D @ 0 @ 1 <-| .plabel= a %D @ 1 @ 2 |-> .plabel= a ' %D @ 0 @ 3 <-| .plabel= l \bb %D @ 1 @ 3 <-| .PLABEL= ^(0.43) \aa� %D @ 2 @ 3 <-| .plabel= r \cc %D )) %D enddiagram %D $$\diag{aa-and-aa-inverse}$$ % then all the proofs in the first two sections of \cite{Kock77} can be reconstructed from half-diagrammatic, half-$�$-calculus-style proofs, done in the archetypal language, where the intuitive content is clear. This will be shown in a sequel to \cite{OchsDownHyp}. % -------------------- % «database» (to ".database") % (sec "A Database of Categorical Theorems" "database") \mysection {A Database of Categorical Theorems} {database} {\sl We, the people who think mostly diagrammatically, would like to have a database of theorems from Category Theory, all presented in a diagrammatical form.} This database would include the theorems I know, the theorems I have half-forgotten, and lots of theorems that I never knew, that were added to the database by other people. \msk That database already exists, but in a non-ideal, not-very-diagrammatical form: it is the published literature on Category Theory. Each paper has its own notion of ``obviousness'': when an author skips over details and claims that something is obvious, he supposes --- in the spirit of section \ref{transmission} --- that the reader will be able to fill up the gaps. I started to work on this subject because my notion of ``obvious'' seemed to be too different from the ones that I could find in the literature. The ``archetypal models'' being generalized are almost always mentioned in the papers, but only in passing, and after all the axioms having been laid down; all the formal arguments are made in the abstract notation only; and informal notations can only be presented when they are used to prove new nontrivial results, and when their formalization has been completed --- but then they are not ``informal'' anymore, and they have been promoted to ``internal languages'', ``term calculi'', ``proof nets'', ``circuit diagrams'', etc (\cite{JoyalStreet} is an early example of an informal notation made formal, with a nice discussion). As each notion of ``obvious'' is backed by a method for proof search, it ought to be possible to formalize --- even if roughly --- how each such proof-search method would work; and each different notion of what is ``obvious'' leads to a different way to present, and store, theorems. The same theorem $T$ is remembered by a reader $R_1$ as $T_1$ and by a reader $R_2$ as $T_2$; as the two readers reconstruct the missing details the theorems grow to $T_1'$, $T_2'$, $T_1''$, $T_2''$. If we could put these objects side to side we could clarify how these different kinds of reasonings work. {\sl It is possible to reason coherently with infinitesimals.} This became clear when Non-Standard Analysis was invented, and then even clearer when Synthetical Differential Geometry came up. One way to prove that a way of reasoning is coherent is to model it mathematically --- and to show that the model has good properties. NSA and SDG ``validate'' reasonings with infinitesimals. Similarly, one way to validate reasonings based on internal diagrams, archetypal languages and informal notations is to formalize how these reasonings work --- and to build bridges between them and the standard ways. \msk Diagrams like the ones in sections \ref{pres-frob-bcc} and \ref{eq-elim} can be interpreted formally: a diagram induces a dictionary, which induces trees, which then let us interpret each object and arrow from the diagram as a $�$-term. A full set of rules for interpreting diagrams can be developed --- including ways to deal with lists of exceptions and hints --- and so it would be conceivable to have a database of categorical definitions and theorems in which the entries would appear as diagrams (plus their associated hints), but at the same time they would have precise meanings, understood by a proof assistant... Note that diagrams are becoming more and more meaningful mathematically in the last decades. Compare these ideas with the discussion in \cite{Kromer}, p.83, where he quotes this from \cite{EilenbergSteenrod}, p.xi: \begin{quote} The diagrams incorporate a large amount of information. Their use provides extensive savings in space and in mental effort. In the case of many theorems, the setting up of the correct diagram is the major part of the proof. We therefore urge that the reader stop at the end of each theorem and attempt to construct for himself the relevant diagram before examining the one which is given in the text. Once this is done, the subsequent demonstration can be followed more readily; in fact, the reader can usually supply it himself. \end{quote} % (find-kromerpage (+ 34 83) "The diagrams incorporate a large amount of information.") % (find-kromertext "The diagrams incorporate a large amount of information") % -------------------- % «epilogue» (to ".epilogue") % (sec "Epilogue" "epilogue") \mysection {Epilogue} {epilogue} This is an atypical paper. It has no theorems, and it doesn't prove any new mathematical truths --- instead, we have shown several ways to represent constructions, and to structure proofs in several layers. ``Theorems'' usually involve {\sl equalities} between {\sl constructions}, and hence they do not belong do the lowest layers; there are even ways --- not detailed here --- to ``project out'' the parts of a theorem that involve equalities, and keep only the constructions... The result is a sort of a ``syntactical skeleton'' of the theorem. Our ``structuring'' puts a situation with total information at the top, and below that, in different, parallel ``legs'', we put different partial views. Theorems do not belong to the {\sl structuring}, either. At some point we will have ``meta-theorems'' about properties of this structuring. But finding these meta-theorems should not be top priority right now: these meta-theorems will use known theorems, and at this point it is more important to put more known theorems in this format. \msk The above may look too vague, too philosophical (in contraposition to ``mathematical''), too abstract. Category Theory used to be called ``abstract nonsense''; the ideas of this paper, then, could look like ``meta-abstract nonsense''... However, we are not pointing only towards the more abstract: we are showing ways --- and reasons --- to do the {\sl general} in parallel with the {\sl particular}, and to arrange the results of doing mathematics like that in forms that are better for {\sl reconstruction} and {\sl transmission}. Our structuring allows the controlled use of {\sl informal notations}. Several different informal notations can be used in parallel at the same time. This can be useful for comparing different people's notations, and for changing details on a notation and letting it evolve over time. The database of categorical knowledge proposed in the last section does not need a unified notation, and one of its functions could be to serve as a dictionary between notations --- and as a way to make parts of the existing literature more accessible. % (find-books "__phil/__phil.el" "corfield") % (find-books "__analysis/__analysis.el" "needham") % (find-books "__analysis/__analysis.el" "nelsen") % (find-eface-links 'default) % (find-eface-links 'tex-verbatim) % (find-efacedescr 'tex-verbatim) % (find-efunction 'set-face-attribute) % (face-attribute 'tex-verbatim :family) % (set-face-attribute 'tex-verbatim nil :family "courier") % (set-face-attribute 'tex-verbatim nil :family nil) % (find-854 "" "trees-to-dictionaries") % (find-854page 21 "trees-to-dictionaries") % (find-LATEX "2009-planodetrabalho.tex" "bibliography") % (find-angg ".zshrc" "makebbl") \bibliography{catsem,filters} \bibliographystyle{alpha} \end{document} %* To do: labelling of the figures % Local Variables: % coding: raw-text-unix % ee-anchor-format: "«%s»" % End: