Jekyll2019-11-01T13:30:30+00:00http://blog.russelldmatt.com/feed.xmlBlogThe Metric Tensor2019-10-29T00:00:00+00:002019-10-29T00:00:00+00:00http://blog.russelldmatt.com/2019/10/29/the-metric-tensor<div style="display: none;"> <p><script type="math/tex">% <![CDATA[ \newcommand{\vec}[2]{\left[\begin{matrix}#1\\#2\end{matrix}\right]} \newcommand{\vv}[1]{\overrightarrow{#1}} \newcommand{\norm}[1]{\lVert#1\rVert} \newcommand{\mat}[4]{\left[\begin{matrix}#1 & #3\\#2 & #4\end{matrix}\right]} %]]></script></p> </div> <p>In the last post, I tried to explain what a tensor is. It’s complicated; it’s a long post. But what I didn’t tackle is the why. Why do we care about this generalization of vectors and matrices?</p> <p>To be honest, I mostly don’t know yet. My hope is to actually learn the math behind general relativity at some point, and my current understanding is that tensors are part of that math. However, I do have one interesting point to make.</p> <p>What is the dot product of a vector with itself? It’s the length squared, right?</p> <p>Take, for instance, the vector <script type="math/tex">\vv{v} = [3, 4]</script> (with length 5):</p> <script type="math/tex; mode=display">\vec{3}{4} \cdot \vec{3}{4} = 3 \cdot 3 + 4 \cdot 4 = 25</script> <p>Right, of course this works. We’ve just reformulated the Pythagorean theorem in a linear-algebra sort of way.</p> <p>But wait, something is odd here. In the last post, we made a big deal about how <em>covectors</em> were different from <em>vectors</em>. <em>Covectors</em> were functions from vectors to scalars, not vectors. What does it even mean, then, to multiply two vectors together? In programming terms, it’s like we’ve made a type error.</p> <p>If we wanted to construct a (multi-linear) function from 2 vectors to a scalar, as we seem to want when taking the dot product of 2 vectors, we’d need a (0, 2)-tensor. 
Recall, that an (n, m)-tensor is a multi-linear function from m vectors and n covectors to a scalar.</p> <p>That’s actually correct, and the (0, 2)-tensor that we want is called <em>the metric tensor</em>. To see why, let’s change our basis from the standard orthonormal basis to something else.</p> <p>Let’s use a new basis of <script type="math/tex">\vv{e_1} = [4, 4]</script> and <script type="math/tex">\vv{e_2} = [-1, 0]</script>. What are the coordinates of the vector <script type="math/tex">\vv{v}</script> in the new basis? It looks like <script type="math/tex">[1, 1]</script> will do the trick. How convenient.</p> <p>Ok, so what’s the length of <script type="math/tex">\vv{v}</script> now? It’s the same! The length of a vector does not depend on the coordinate system.</p> <p>Right, right, what I meant was, how do we compute the length of the vector now? Dot product right?</p> <script type="math/tex; mode=display">\vec{1}{1} \cdot \vec{1}{1} = 1 \cdot 1 + 1 \cdot 1 = 2</script> <p>Uh… that’s not right. No, of course that doesn’t work. The length of the vector has to depend on the length of the basis vectors. What I meant was to first scale each coordinate by the length of the appropriate basis vector before doing the multiplication. Something like this:</p> <script type="math/tex; mode=display">\vec{1}{1} \cdot \vec{1}{1} = (1 \cdot \norm{\vv{e_1}}) \cdot (1 \cdot \norm{\vv{e_1}}) + (1 \cdot \norm{\vv{e_2}}) \cdot (1 \cdot \norm{\vv{e_2}}) = 1 \cdot 32 + 1 \cdot 1 = 33</script> <p>Hmm, yea not that either. I guess I’m still trying to use the Pythagorean theorem, but my triangle is not a right triangle anymore. 
I’m making a triangle with one basis vector <script type="math/tex">\vv{e_1} = [4, 4]</script> and one basis vector <script type="math/tex">\vv{e_2} = [-1, 0]</script>, but those vectors aren’t orthogonal.</p> <p>All this would be much clearer with a picture:</p> <div style="text-align: center;"> <img src=" /assets/by-post/the-metric-tensor/v.jpg" style="width: 400px; margin-bottom: 20px;" /> </div> <p>So maybe the law of cosines? <script type="math/tex">c^2 = a^2 + b^2 - 2ab\cos{C}</script>? Actually yes, that’s exactly right, but let me show you another way.</p> <p>Like I said before, what we want is called <em>the metric tensor</em>.</p> <script type="math/tex; mode=display">[[ {\vv{e_1}\cdot\vv{e_1}}, {\vv{e_2}\cdot\vv{e_1}} ], [ {\vv{e_1}\cdot\vv{e_2}}, {\vv{e_2}\cdot\vv{e_2}} ]]</script> <p>I wrote it out that way, as a row of row-vectors, on purpose. The metric tensor is a (0, 2)-tensor, meaning it’s a function from two vectors to a scalar, and a row of row vectors has the right dimensionality for that multiplication. Let’s try it out with our new basis:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} \vv{e_1} \cdot \vv{e_1} &= 32 \\ \vv{e_1} \cdot \vv{e_2} &= -4 \\ \vv{e_2} \cdot \vv{e_2} &= 1 \\ \end{align*} %]]></script> <p>So, our metric tensor is:</p> <script type="math/tex; mode=display">[[32, -4], [-4, 1]]</script> <p>Let’s multiply it by our vector <script type="math/tex">v = [1, 1]</script>:</p> <script type="math/tex; mode=display">[[32, -4], [-4, 1]] \vec{1}{1} = [28, -3]</script> <p>And again?</p> <script type="math/tex; mode=display">[28, -3] \vec{1}{1} = 25</script> <p>It works! So, why haven’t we ever heard of this thing before? 
Well, let’s write out the metric tensor in the standard, orthonormal basis:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} \vv{b_1} &= [1, 0] \\ \vv{b_2} &= [0, 1] \\ \vv{b_1} \cdot \vv{b_1} &= 1 \\ \vv{b_1} \cdot \vv{b_2} &= 0 \\ \vv{b_2} \cdot \vv{b_2} &= 1 \\ \end{align*} %]]></script> <p>So, the metric tensor, in an orthonormal basis, is represented by the identity matrix:</p> <script type="math/tex; mode=display">[[1, 0], [0, 1]]</script> <p>which is why ignoring it, and treating vectors and covectors interchangeably, is usually fine.</p>What is a Tensor?2019-10-28T00:00:00+00:002019-10-28T00:00:00+00:00http://blog.russelldmatt.com/2019/10/28/what-is-a-tensor<p>I just completed the very good YouTube playlist <a href="https://www.youtube.com/playlist?list=PLJHszsWbB6hrkmmq57lX8BV-o-YIOFsiG">Tensors for Beginners</a> by eigenchris and I want to jot down some notes before I forget everything.</p> <p><em>An (n, m)-tensor is a multi-linear function from m vectors and n covectors to a scalar.</em></p> <p>A tensor is a “geometrical object” in the same way that a vector is a “geometrical object” (and a vector is a tensor, so it really is in the same way). We often deal with the coordinates of a vector, which assumes a particular basis. But the exact same vector will have different coordinates if we change the basis. So, the vector itself is “invariant” under a change of basis, but the coordinates are not. However, the coordinates change in a predictable way under a change of basis. All the same is true for tensors (again, vectors <em>are</em> tensors).</p> <p><em>Covectors</em> are a new “type of thing”. They’re functions from a vector to a scalar. One concrete way to think about them is that they’re “row vectors”. If you multiply a row vector by a vector, you get a scalar.</p> <p><em>Tensor product</em>: So, a covector * vector = scalar. But a vector * covector = matrix. The latter is an example of a tensor product. 
More generally, a tensor product takes the Cartesian product of the inputs, and for each ordered pair, you multiply the elements. So in the simple case of an n-dimensional vector v and an m-dimensional covector c, the tensor product v ⊗ c would have (n x m) dimensions, i.e. it can be represented by an (n x m) matrix! Think about each element of that matrix; the (i, j)th element is the product of the ith element of v and the jth element of c. So, you can see concretely what I mean by “the tensor product takes the Cartesian product of the inputs, and for each ordered pair, you multiply the elements”.</p> <p>Back to “what is a tensor”. A simple (n, m)-tensor can be constructed by the tensor product of n vectors and m covectors. Again, let’s think about a matrix. We just said that a matrix can be constructed via the tensor product of a vector and a covector. So, I guess that means a matrix is a (1, 1)-tensor! So, why did I say “simple” in “A <em>simple</em> (n, m)-tensor …”? Think about the set of matrices you can construct by multiplying a vector v * a row vector c. What’s their rank? Rank 1, of course! Every column is a scaled version of every other column, since all the columns are just scaled versions of v (the jth column is v * c[j]). Same goes for rows; each row is a scaled version of c (the ith row is v[i] * c). A rank 1 matrix is a very boring matrix indeed. If you think about a matrix as a function from vector -&gt; vector (since, when you multiply a matrix by a vector you get a vector), all the output vectors lie on the same line (and that line points in the same direction as v). So, if these are 2-dimensional vectors, the rank 1 matrix will project all 2-dimensional vectors onto a line. Slight tangent, but this corresponds to having a zero determinant, having a zero eigenvalue, and being non-invertible.</p> <p>So, are all tensors simple and uninteresting in the same way? 
No, tensors form a vector space, meaning that they can be scaled and added to each other, and the output will be another tensor. To create more interesting tensors, you can take linear combinations of simple tensors. Again, let’s make an analogy to something familiar: vectors. Any vector can be thought of as a linear combination of a set of “basis vectors” (and that’s how we get the vector’s coordinates). In 2-d space, using the standard basis, the two basis vectors are [0,1] and [1,0]. Every other vector is a linear combination of those two “simple” vectors. Tensors work the same way. In fact, if you start with an n-dimensional vector space (with n basis vectors) and an m-dimensional covector space (with m basis covectors), you can construct (n x m) basis (1, 1)-tensors by taking the tensor product of each of the n basis vectors with each of the m basis covectors.</p> <p>To make that more concrete, let’s say n = 2 and m = 3 and let’s use the standard basis. You can construct the following 6 basis (1, 1)-tensors:</p> <script type="math/tex; mode=display">% <![CDATA[ \newcommand{\vec}[2]{\left[\begin{matrix}#1\\#2\end{matrix}\right]} \newcommand{\covec}[3]{\left[\begin{matrix}#1 & #2 & #3\end{matrix}\right]} \newcommand{\mat}[6]{\left[\begin{matrix}#1 & #3 & #5 \\ #2 & #4 & #6\end{matrix}\right]} \newcommand{\VS}{V^*} \newcommand{\reals}{\mathbb{R}} \vec{1}{0} \otimes \covec{1}{0}{0} = \mat{1}{0}{0}{0}{0}{0} \\ \vec{1}{0} \otimes \covec{0}{1}{0} = \mat{0}{0}{1}{0}{0}{0} \\ \vec{1}{0} \otimes \covec{0}{0}{1} = \mat{0}{0}{0}{0}{1}{0} \\ \vec{0}{1} \otimes \covec{1}{0}{0} = \mat{0}{1}{0}{0}{0}{0} \\ \vec{0}{1} \otimes \covec{0}{1}{0} = \mat{0}{0}{0}{1}{0}{0} \\ \vec{0}{1} \otimes \covec{0}{0}{1} = \mat{0}{0}{0}{0}{0}{1} \\ %]]></script> <p>Now it’s easy to see how those 6 “simple” (1, 1)-tensors form a basis for any (2 x 3)-dimensional (1, 1)-tensor. 
Another thing that this example makes clear is that (1, 1) does not describe the dimensions of the matrix, it describes the number of vectors and covectors that were combined (via the tensor product) to create the tensor. What is the dimension of the (1, 1)-tensor? In this case it’s (2 x 3), but more generally if we take <script type="math/tex">dim(x)</script> to be the dimension of <script type="math/tex">x</script>, an (n, m)-tensor has dimension <script type="math/tex">dim(v_1) dim(v_2) \cdots dim(v_n) dim(c_1) dim(c_2) \cdots dim(c_m)</script>. These things can get big, fast!</p> <p>So what about these linear functions? I started the post by saying: <em>An (n, m)-tensor is a multi-linear function from m vectors and n covectors to a scalar</em>, and yet we’ve barely mentioned functions at all. Well, remember when I said that covectors were <em>functions from a vector to a scalar</em>? We were on to something there.</p> <p>Let’s denote the vector space of vectors as <script type="math/tex">V</script>. Let’s denote the vector space of covectors (called the dual vector space) with the symbol <script type="math/tex">\VS</script>. Another way to write this would be <script type="math/tex">V \rightarrow \reals</script>, since covectors are functions from a vector to a scalar (in my examples, I’ll use the reals as an example of a scalar, but it could be any field, i.e. rational, algebraic, reals, complex, etc.). So, what do we get when we take the tensor product of a vector and a covector? We already know this: a matrix, i.e. a (1, 1)-tensor. But what <em>is</em> a matrix? As I mentioned above, you can think about a matrix as a (linear) function from vectors to vectors, i.e. <script type="math/tex">V \rightarrow V</script>. What if we rewrote that as <script type="math/tex">V \rightarrow (\VS \rightarrow \reals)</script>? 
Kind of weird at first, but if you can think about a covector as a function from a vector to a scalar, can’t we similarly think about a vector as a function from a covector to a scalar? In other words, a covector * vector is a scalar. If we have one argument (either the covector or the vector), then we can treat that argument as fixed and we’re left with a function from the other argument to a scalar. So, to summarize: <script type="math/tex">(V \times \VS) \rightarrow \reals</script>, <script type="math/tex">V \rightarrow V</script>, <script type="math/tex">V \rightarrow (\VS \rightarrow \reals)</script>, and <script type="math/tex">\VS \rightarrow (V \rightarrow \reals)</script> are all ways of saying the same thing.</p> <p>What do those statements mean in the familiar context of a matrix?</p> <ul> <li><script type="math/tex">(V \times \VS) \rightarrow \reals</script> is saying a matrix is: A function from a row vector and a vector to a scalar. Well, a row vector * a matrix * a vector = a scalar, so yea that checks out.</li> <li><script type="math/tex">V \rightarrow V</script> is saying a matrix is: A function from a vector to a vector. Yes, a matrix * a vector = a vector.</li> <li><script type="math/tex">V \rightarrow (\VS \rightarrow \reals)</script> is saying a matrix is: A function from a vector to (a function from a row vector to a scalar). A little weird, but ok, since a matrix * a vector = a vector, and vectors <em>are</em> functions from row vectors to scalars.</li> <li><script type="math/tex">\VS \rightarrow (V \rightarrow \reals)</script> is saying a matrix is: A function from a row vector to (a function from a vector to a scalar). Huh, this one is a little new. What’s a (1 x n) row vector * an (n x m) matrix? Well, it’s a (1 x m) row vector. And what’s a (1 x m) row vector? We can think of it like a function from an (m x 1) vector to a scalar. 
Ok, checks out!</li> </ul> <p>So, our (1, 1)-tensor is like a function from a vector and a covector to a scalar, i.e. <script type="math/tex">(V \times \VS) \rightarrow \reals</script>. Furthermore, that function can be “partially applied”, i.e. if you pass in just the vector, you get a function from a covector to a scalar: <script type="math/tex">V \rightarrow (\VS \rightarrow \reals)</script>. Likewise, if you pass just the covector, you get a function from a vector to a scalar: <script type="math/tex">\VS \rightarrow (V \rightarrow \reals)</script>.</p> <p>I think we’re ready to level up from (1, 1)-tensors. What about a (2, 1)-tensor? A (2, 1)-tensor is a (linear) function from 2 covectors and 1 vector to a scalar: <script type="math/tex">(V \times \VS \times \VS) \rightarrow \reals</script>. If you provide one covector, you’re left with a (1, 1)-tensor, i.e. <script type="math/tex">\VS \rightarrow ((V \times \VS) \rightarrow \reals)</script>. So, with this recursive viewpoint, we can build up an understanding of an (n, m)-tensor. An (n, m)-tensor is a function from n covectors and m vectors to a scalar, i.e. <script type="math/tex">(\VS_1 \times \VS_2 \times \cdots \times \VS_n \times V_1 \times V_2 \times \cdots \times V_m) \rightarrow \reals</script>.</p> <!-- Dimensionality, revisited: Remember when we previously said that the dimension of an (n, m)-tensor is $$dim(v_1) dim(v_2) \cdots dim(v_n) dim(c_1) dim(c_2) \cdots dim(c_m)$$? Let's revisit that with our new understanding of tensors as linear functions. To keep things manageable, let's say we have a one dimensional vector which repesents the size of a house, and the size can only be one of three values {small, medium, large}. In addition, we have two (linear) functions that take our one-dimensional size "vector" and produce a scalar. To keep things concrete, function A estimates the value of the house from the size, and function B estimates the number of bedrooms. 
How many different --> <!-- say we have a one dimensional vector space, maybe our one dimension is the number of square feet of a house. And we have a linear function from that vector space to a real number (our covector). Maybe it represents the average price of a house with that many square feet. If we take the tensor product of our vector space and covector space, we have a (1, 1)-tensor, a function from a square footage and -->Pythagorean Proof2019-10-17T00:00:00+00:002019-10-17T00:00:00+00:00http://blog.russelldmatt.com/2019/10/17/pythagorean-proof<p>A particularly beautiful proof of the Pythagorean Theorem:</p> <video controls="" style="min-width: 300px; max-width: 100%; max-height: 800px; border: 2px solid gray;"> <source src=" /assets/by-post/pythagorean-proof/pythag-proof.mp4" type="video/mp4" /> Your browser does not support the video tag. 
</video>A particularly beautiful proof of the Pythagorean Theorem:Using the Simulation Hypothesis Against Itself2019-07-12T00:00:00+00:002019-07-12T00:00:00+00:00http://blog.russelldmatt.com/2019/07/12/simulation-hypothesis-against-itself<p>Let’s formulate the <a href="https://en.wikipedia.org/wiki/Simulation_hypothesis#Simulation_hypothesis">simulation hypothesis</a>, which we will call H:</p> <ol> <li> <p>Conscious beings will eventually figure out how to simulate other conscious beings.</p> </li> <li> <p>When they do so, they will simulate <em>many</em> more of them than ever existed in their universe.</p> </li> <li> <p>Therefore, if all you know is that you are a conscious being, the probability that you exist in the first, top-level, non-simulated universe is extraordinarily small given the fact that the vast majority of conscious beings live in the lower levels of the simulations.</p> </li> </ol> <p>One interesting implication of this line of reasoning is that there are likely to be many levels of this simulation. Conscious beings in the first, top-level, non-simulated universe will simulate a universe of conscious beings in the level below them, who in turn simulate a universe of conscious beings in the level below them, and so on.</p> <p>Let’s formulate a similar hypothesis, H’, along those lines:</p> <ol> <li> <p>Conscious beings will eventually figure out how to simulate other conscious beings.</p> </li> <li> <p>Every level of the simulation will have fewer resources than the level above it.</p> </li> <li> <p>Our universe has a finite amount of resources. This is arguably a fact, not a hypothesis, but what is a fact other than a hypothesis which is extraordinarily likely to be correct, so let’s include it.</p> </li> <li> <p>Therefore, the number of levels is not infinite. 
There exists a “bottom” level, which will never successfully simulate another level below it.</p> </li> <li> <p>Each level (other than the bottom) will simulate <em>many</em> more conscious beings than ever existed in their level.</p> </li> <li> <p>Therefore, the vast majority of conscious beings will exist in the “bottom” level.</p> </li> </ol> <p>Hypothesis H’ sounds perfectly in line with the hypothesis H, since H’s main conclusion was that it’s unlikely you live in the top, non-simulated, level, which H’ agrees with. H’ just goes a bit further and states that, not only do you not live in the top level, but there is a high probability that you live in the bottom level of the simulation for mostly the same set of reasons. It’s important to note that, by definition, the bottom level will never figure out how to simulate a level below it.</p> <p>Whether or not we will ever simulate consciousness is highly disputed (in fact, we’re disputing it right now). But if you think that there is a high probability that we will eventually simulate consciousness, as many smart people do, then you must think that P(H’) is relatively low, since H’ implies that the probability that we live in the bottom level of the simulation (which will not be able to simulate a level below it) is high.</p> <p>In addition, to my eye, P(H) and P(H’) seem to be highly correlated. They mostly rely on the same argument; they just emphasize slightly different aspects of the same conclusion, which is that there is this pyramid of simulated universes in which each level has drastically more conscious beings than the level above it.</p> <p>So, it seems to me (note the hedging, as this is quite provisional), that the higher you think the probability that we will eventually simulate consciousness is, the lower you should think P(H) is (since H’ is evidence against our ability to simulate consciousness and P(H’) and P(H) are highly correlated). 
However, if we do simulate consciousness, then point 1 of hypothesis H (H.1) is true (H.1 says that conscious beings will eventually figure out how to simulate other conscious beings). And, at least in my opinion, H.1 is the point that, a priori, had the lowest probability of being true! In fact, conditional on point 1 being true, I’d say that P(H) is almost 1 because all the other points in H seem so obviously correct.</p> <p>To summarize, <br /></p> <hr /> <p>Proof by contradiction:</p> <p>Assume statement SC: It is very likely that we will be able to simulate consciousness.</p> <ul> <li>P(H’) ≈ 0 because… hand-wavy math? Ok fine: <ol> <li>P(SC) = P(H’ &amp; SC) + P(¬ H’ &amp; SC) ≈ 1 (by assumption)</li> <li>∴ P(¬ SC) ≈ 0 (from 1)</li> <li>∴ P(H’ &amp; ¬ SC) ≈ 0 (from 2)</li> <li>P(H’ &amp; SC) « P(H’ &amp; ¬ SC) (H’ implies that you’re likely at bottom and can’t simulate)</li> <li>∴ P(H’ &amp; SC) ≈ 0 (from 3, 4)</li> <li>∴ P(H’) = P(H’ &amp; SC) + P(H’ &amp; ¬SC) ≈ 0 (from 3, 5)</li> </ol> </li> <li> <p>P(H) is very small because P(H) is highly correlated with P(H’).</p> </li> <li>Also, P(H) is very close to 1 because <ul> <li>P(H.1) ≈ 1 due to P(SC) ≈ 1 (our assumption)</li> <li>P(H) = P(H.3 | H.1 &amp; H.2) * P(H.2 | H.1) * P(H.1)</li> <li>P(H.2 | H.1) ≈ 1 and P(H.3 | H.1 &amp; H.2) ≈ 1 because they’re “obvious”</li> <li>∴ P(H) ≈ 1</li> </ul> </li> </ul> <p>Contradiction!</p> <p><strong>Therefore, it is very unlikely that we will be able to simulate consciousness.</strong></p> <hr /> <p><br /></p> <p>Obviously, the above “proof” isn’t really a proof, and we haven’t found a literal contradiction, but it does feel hand-wavily correct to me…</p> <p>“But wait!”, you may be thinking, “I thought this article was going to use the simulation hypothesis to argue against itself, which means arguing that we don’t live in a simulation - not that we won’t be able to simulate consciousness”. Ok, you got me. 
This doesn’t <em>directly</em> argue against the simulation hypothesis. But, I did say that P(H) is close to one if we can simulate consciousness (SC), i.e. P(H | SC) ≈ 1. And P(H) = P(H | SC) * P(SC) + P(H | ¬ SC) * P(¬ SC). So if you previously thought that P(SC) was higher than you do now, then presumably your new P(H) is also lower, since the weight you’re putting on P(H | SC) - a number close to 1 - is now lower.</p> <p>One last thought. Although I don’t yet see a critical flaw in the argument, I’m not particularly convinced by it either, which is kind of odd. And I’m not really sure why either. I think it’s partly that all these arguments are so counter-intuitive and abstract that I don’t really trust myself to be able to spot the logical flaw. But I’ll do my best - so with that in mind…</p> <h3 id="potential-holes-work-in-progress">Potential holes: work in progress</h3> <p>H’ states that because our universe is finite, and each level has fewer resources than the level above it, there are not infinite levels - but this isn’t strictly true. It means that any pyramid of simulations which includes us is finite. The level above us could be infinite. An infinite universe could, for whatever reason, choose to run a simulation with finite resources. It could also choose to run some simulations with finite resources and others with infinite. So you could imagine a branching tree of simulations, some of which are finitely long, while others are infinite. In the infinite chains, there would be no bottom, since infinity doesn’t end. So the conclusion that “most conscious beings will exist in the bottom simulation” would be patently untrue in that case, since <em>the vast majority</em> (or all? infinities are hard…) would live in the infinitely long chains. 
All that said, if that picture were true, it’d be completely improbable that we’d live in one of the few universes with finite resources, which we do, so I guess it’s probably not true.</p> <p>I’m sure there are many more holes… will add to this as I think of them.</p>Let’s formulate the simulation hypothesis, which we will call H:Deriving the Taylor Series2019-06-27T00:00:00+00:002019-06-27T00:00:00+00:00http://blog.russelldmatt.com/2019/06/27/Taylor-Series<div style="display:flex;align-items:center"> <div style="margin-right: 15px; margin-bottom:15px;"> <svg height="64" class="octicon octicon-info" viewBox="0 0 14 16" version="1.1" width="56" aria-hidden="true"><path fill-rule="evenodd" d="M6.3 5.69a.942.942 0 01-.28-.7c0-.28.09-.52.28-.7.19-.18.42-.28.7-.28.28 0 .52.09.7.28.18.19.28.42.28.7 0 .28-.09.52-.28.7a1 1 0 01-.7.3c-.28 0-.52-.11-.7-.3zM8 7.99c-.02-.25-.11-.48-.31-.69-.2-.19-.42-.3-.69-.31H6c-.27.02-.48.13-.69.31-.2.2-.3.44-.31.69h1v3c.02.27.11.5.31.69.2.2.42.31.69.31h1c.27 0 .48-.11.69-.31.2-.19.3-.42.31-.69H8V7.98v.01zM7 2.3c-3.14 0-5.7 2.54-5.7 5.68 0 3.14 2.56 5.7 5.7 5.7s5.7-2.55 5.7-5.7c0-3.15-2.56-5.69-5.7-5.69v.01zM7 .98c3.86 0 7 3.14 7 7s-3.14 7-7 7-7-3.12-7-7 3.14-7 7-7z" /></svg> </div> <p>My previous post <a href="/2019/06/26/Maclaurin-Series.html">Deriving the Maclaurin Series</a> is a prerequisite for this post.</p> </div> <hr /> <p><br /></p> <p>We’ve just learned how to approximate a function <script type="math/tex">f(x)</script> using the Maclaurin Series. It’s a pretty sweet hammer, so let’s start hitting some nails. How about <script type="math/tex">e^x</script>?</p> <p><img src=" /assets/by-post/Taylor-Series/exp.png" style="max-width: 100%;" /></p> <p>If you care about the region around 0, you’re golden. This is a great approximation. But what if you care about <script type="math/tex">x = 6</script>? It’s… not great. One option is to add more terms. We used the first 5 terms of the Maclaurin Series. We could use 100. 
That would definitely work. However, if you know beforehand that you care about a particular region that is not the region around 0, there’s a better way.</p> <div class="aside"> <p>When might you know beforehand which x-region you care about in real life? Actually, quite often. Let’s say the function you’re approximating is the percentage humidity in the air as a function of temperature. Are you particularly interested in a temperature of 0? Probably not. You’re probably most interested in temperatures that are close to your current temperature. If you come up with real-life examples, there’s often a natural “default” x value and you probably care most about the x-region around that default value.</p> </div> <p>So what’s the better way? It’s called the Taylor Series, and it’s just two small steps away from the Maclaurin Series.</p> <h4 id="step-1">Step 1</h4> <p>We want the polynomial <script type="math/tex">p</script> that best approximates a function <script type="math/tex">f</script> around the point <script type="math/tex">x = a</script>. However, as a first step, we’re going to solve for a polynomial <script type="math/tex">p'</script> that is almost the polynomial we want. It’s going to be exactly the Maclaurin Series, except in place of <script type="math/tex">f(0)</script>, <script type="math/tex">f'(0)</script>, etc., we’re going to use <script type="math/tex">f(a)</script>, <script type="math/tex">f'(a)</script>, etc. I.e.</p> <script type="math/tex; mode=display">p'(x) = f(a) + f'(a)x + \frac{f''(a)}{2!}x^2 + \frac{f'''(a)}{3!}x^3 + \dots + \frac{f^{(n)}(a)}{n!}x^n + \ldots</script> <p>Because we’re now using the function value (and derivatives) around <script type="math/tex">x = a</script>, the resulting polynomial will look very similar to <script type="math/tex">f</script> around <script type="math/tex">a</script>. 
The only problem is that it’s still centered around the origin.</p> <p>A picture is going to be much easier to understand:</p> <p><img src=" /assets/by-post/Taylor-Series/exp-p-prime.png" style="max-width: 100%;" /></p> <p>Notice how the red line, <script type="math/tex">p'(x)</script> has the same value and shape around <script type="math/tex">x = 0</script> as <script type="math/tex">f(x)</script> does around <script type="math/tex">x = 6</script>?</p> <h4 id="step-2">Step 2</h4> <p>This is the easy step, we need to shift our function to the right by <script type="math/tex">a</script>. You may just remember this back from algebra, but if not, we want <script type="math/tex">p(x) = p'(x - a)</script>.</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} p(x) &= p'(x - a) \\ &= f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \ldots \end{align*} %]]></script> <p>And, not that it will surprise you, but here’s the final picture (re-centered around <script type="math/tex">x = 6</script>):</p> <p><img src=" /assets/by-post/Taylor-Series/exp-p.png" style="max-width: 100%;" /></p> <div class="aside"> <p>Notice that <script type="math/tex">p(x)</script> is a <em>much</em> worse approximation of <script type="math/tex">e^x</script> around <script type="math/tex">x = 0</script>, similar to how the Maclaurin Series was a terrible approximation around <script type="math/tex">x = 6</script>. There’s no magic going on here. We traded off accuracy around <script type="math/tex">x=0</script> for accuracy around <script type="math/tex">x=6</script>. 
If you need more accuracy everywhere, you need more terms.</p> </div>Deriving the Maclaurin Series2019-06-26T00:00:00+00:002019-06-26T00:00:00+00:00http://blog.russelldmatt.com/2019/06/26/Maclaurin-Series<p>Do you remember that formula from calculus that states</p> <script type="math/tex; mode=display">f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \dots + \frac{f^{(n)}(0)}{n!}x^n + \ldots</script> <p>Two questions: why is it true and why is it useful?</p> <h2 id="why-its-true">Why it’s true</h2> <p>What are we trying to accomplish with Maclaurin Series? We are trying to find a <em>polynomial</em> which equals the function <script type="math/tex">f(x)</script> (e.g. <script type="math/tex">e^x</script> or <script type="math/tex">sin^2(x)</script>).</p> <p>When are two functions equal? Two functions are equal if they have the same value for all inputs (all values of <script type="math/tex">x</script>, in this case). One way that could be true is that the functions have the same value at <script type="math/tex">x = 0</script>, as well as the same derivative, the same second derivative, the same third derivative, etc. This isn’t air tight (and indeed, <a href="https://en.wikipedia.org/wiki/Non-analytic_smooth_function">it isn’t always true</a>), but it holds for many functions<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> and should seem somewhat intuitive. If a function is the same at a certain point, and the amount it changes around that point (its first derivative) is the same, and the amount that changes around that point (its second derivative) is the same, <em>all</em> the way down, then how can these functions ever diverge? 
Well, often they don’t.</p> <p>So, to recap, we’re going to look for a polynomial, <script type="math/tex">p</script>, that has the same value as <script type="math/tex">f</script> at <script type="math/tex">x=0</script>, as well as the same derivative, for <em>all</em> derivatives.</p> <p>What’s the form of a polynomial? It looks something like this:</p> <script type="math/tex; mode=display">p(x) = a + bx + cx^2 + dx^3 + ex^4 + \ldots</script> <h4 id="0th-derivative">0th derivative</h4> <p>We need <script type="math/tex">p(0) = f(0)</script>. When <script type="math/tex">x = 0</script> all terms of the polynomial <script type="math/tex">p</script> go to zero, other than the first. In other words:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} f(0) &= p(0) \\ &= a + bx + cx^2 + dx^3 + ex^4 + \ldots |_{x=0} \\ &= a \end{align*} %]]></script> <p>We’ve discovered our first coefficient in <script type="math/tex">p</script>: <script type="math/tex">p(x) = f(0) + bx + cx^2 + dx^3 + ex^4 + \ldots</script></p> <h4 id="1st-derivative">1st derivative</h4> <p>We need <script type="math/tex">p'(0) = f'(0)</script>.</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} f'(0) &= p'(0) \\ &= 0 + b + 2cx + 3dx^2 + 4ex^3 + \ldots |_{x=0} \\ &= b \end{align*} %]]></script> <p>We’ve discovered our second coefficient in <script type="math/tex">p</script>: <script type="math/tex">p(x) = f(0) + f'(0)x + cx^2 + dx^3 + ex^4 + \ldots</script></p> <h4 id="2nd-derivative">2nd derivative</h4> <p>We need <script type="math/tex">p''(0) = f''(0)</script>.</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} f''(0) &= p''(0) \\ &= 0 + 0 + 2c + 3 \cdot 2dx + 4 \cdot 3ex^2 + \ldots |_{x=0} \\ &= 2c \end{align*} %]]></script> <p>We’ve discovered our third coefficient in <script type="math/tex">p</script>: <script type="math/tex">p(x) = f(0) + f'(0)x + \frac{f''(0)}{2}x^2 + dx^3 + ex^4 + \ldots</script></p> <h4 id="3rd-derivative">3rd 
derivative</h4> <p>We need <script type="math/tex">p'''(0) = f'''(0)</script>.</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} f'''(0) &= p'''(0) \\ &= 0 + 0 + 0 + 3 \cdot 2d + 4 \cdot 3 \cdot 2 ex + \ldots |_{x=0} \\ &= 3!d \end{align*} %]]></script> <p>We’ve discovered our fourth coefficient in <script type="math/tex">p</script>: <script type="math/tex">p(x) = f(0) + f'(0)x + \frac{f''(0)}{2}x^2 + \frac{f'''(0)}{3!}x^3 + ex^4 + \ldots</script></p> <h4 id="and-beyond">And beyond</h4> <p>We could continue this process (seriously, try a few), but at this point you’re probably seeing a pattern emerge. The <script type="math/tex">n</script>th term of the polynomial seems to be</p> <script type="math/tex; mode=display">\frac{f^{(n)}(0)}{n!}x^n</script> <p>making the entire function</p> <script type="math/tex; mode=display">p(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}x^n</script> <h2 id="why-its-useful">Why it’s useful</h2> <p>Take this section with a grain of salt. It’s very possible that I don’t know the most important or useful practical applications of the Maclaurin Series. But here is my answer: polynomials are easy! They’re way nicer to deal with than arbitrary functions. In addition, derivatives are (pretty) easy, and that’s all we need to turn an arbitrary function into a polynomial.</p> <p>For example, what’s the integral of <script type="math/tex">ln(x + e^{a + x^2})</script>? Uhh…. 
Wolfram Alpha, anyone?</p> <p>How about the integral of:</p> <script type="math/tex; mode=display">p(x) = a + e^{-a}x + \frac{2 - e^{-2a}}{2}x^2 + \frac{e^{-3a}(2 - 6e^{2a})}{6}x^3 + \ldots</script> <p>Sure, it’s a big equation, but it’s completely trivial to take that integral (remember, <script type="math/tex">a</script> is just a constant).</p> <script type="math/tex; mode=display">\int p(x)\,dx = c + ax + \frac{e^{-a}}{2}x^2 + \frac{2 - e^{-2a}}{6}x^3 + \frac{e^{-3a}(2 - 6e^{2a})}{24}x^4 + \ldots</script> <p>And (big surprise) <script type="math/tex">p(x)</script> is the first few terms of the Maclaurin Series of <script type="math/tex">ln(x + e^{a + x^2})</script>.</p> <p>They’re also fast. Let’s say you have a function which is expensive to compute, and whose input is changing relatively quickly. Maybe instead of recomputing your expensive function every time your input changes, you could approximate it with some large, but finite number of terms of its Maclaurin Series. Then, re-evaluating it will take almost no time at all!</p> <h2 id="bonus-section">Bonus section!</h2> <p>Hopefully you’ve followed the sections above, but if not, maybe a concrete example can help.</p> <p>Let’s consider the function <script type="math/tex">f(x) = \sin(x) + 1</script> in the domain of <script type="math/tex">[-10, 10]</script>. We can approximate this function using the first <script type="math/tex">n</script> terms of the Maclaurin Series. 
As <script type="math/tex">n</script> increases, our approximation looks closer and closer to the original <script type="math/tex">f(x)</script>.</p> <script src="/assets/js/p5/0.8.0/p5.js"></script> <script src=" /assets/by-post/Maclaurin-Series/sketch.js"></script> <style> #plot { box-shadow: 5px 5px 5px grey; border: 1px solid grey; width: 500px; max-width: 100%; display: block; margin: 0 auto 30px; position: relative; } /* https://stackoverflow.com/questions/5445491/height-equal-to-dynamic-width-css-fluid-layout#answer-6615994 */ /* Get a 1:1 aspect ratio */ #plot #dummy { margin-top: 100%; } #plot canvas { position: absolute; top: 0; left: 0; } </style> <div id="plot"> <div id="dummy"></div> </div> <div class="aside"> <p>Notice how the red line (the Maclaurin Series approximation of <script type="math/tex">f(x)</script>) only seems to change when going from an odd number of terms to an even number of terms. Can you figure out why that is?</p> </div> <div class="footnotes"> <ol> <li id="fn:1"> <p>Analytic functions is the technical term for the class of functions for which the Maclaurin Series (and the more general Taylor Series) holds. <a href="#fnref:1" class="reversefootnote">&#8617;</a></p> </li> </ol> </div>Do you remember that formula from calculus that statesAyumu2019-04-26T00:00:00+00:002019-04-26T00:00:00+00:00http://blog.russelldmatt.com/2019/04/26/ayumu<p>Check out the 7-year old chimp Ayumu play this memory game:</p> <iframe width="560" height="315" src="https://www.youtube.com/embed/qyJomdyjyvM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe> <div class="aside"> <p>(sorry, I think playback on other sites is disabled for the embedded clip - so you’ll have to go to youtube to watch it)</p> </div> <p>Then, try your hand at the same game! 
The rules are simple:</p> <ul> <li>Click the white circle</li> <li>The numbers 1-5 will flash for 200 ms before becoming hidden behind white squares <ul> <li>both the number of numbers (5) and the amount of time they will show for (200 ms) are configurable via the sliders below the game</li> </ul> </li> <li>Click on the squares corresponding to the numbers in ascending order, i.e. click on the square hiding 1 first, then 2, and so on</li> <li>See how many sequences you can get right in a row!</li> </ul> <p>I highly recommend playing with all 9 numbers a few times and then re-watching the youtube video to fully appreciate Ayumu’s inhuman abilities.</p> <script src="/assets/js/p5/0.8.0/p5.js"></script> <script src="/assets/js/p5/0.8.0/p5.dom.js"></script> <script src="/assets/js/p5/0.8.0/p5.sound.js"></script> <script src=" /assets/by-post/ayumu/sketch.js"></script> <div id="game"></div> <p><br /></p> <hr /> <p><br /> <a href="https://io9.gizmodo.com/this-chimp-will-kick-your-ass-at-memory-games-but-how-5883579">Here’s another article about Ayumu</a>.</p>Check out the 7-year old chimp Ayumu play this memory game:Eigenvectors II: Steady state2019-03-23T00:00:00+00:002019-03-23T00:00:00+00:00http://blog.russelldmatt.com/2019/03/23/eigenvectors-part-two<p>Try your hand at this problem:</p> <p>Let’s say there are only three families of operating systems in the world: Windows, Linux, and Mac. Let’s also say that, every year:</p> <ul> <li>20% of Windows users switch to Mac, and 1% switch to Linux.</li> <li>3% of Mac users switch to Windows, and 10% switch to Linux.</li> <li>2% of Linux users switch to Windows, and 5% switch to Mac.</li> </ul> <p>In the far future, which OS will have the most users? 
And what percentage of users will they have?</p> <p>As always, you’ll learn a lot more if you actively participate so if you want to give it a shot, now would be the time.</p> <p>Given the title of this post, this may not come as a shock, but this is a perfect problem for linear algebra. However, like in the last post, let’s not jump right into matrices because I want the connection to plain ol’ algebra to be clear.</p> <h3 id="plain-ol-algebra">Plain ol’ Algebra</h3> <p>So, let’s set off to solve this word problem with normal algebra. We have three variables we need to keep track of:</p> <ul> <li>the number of windows users (as a percentage of total users) - let’s call that <script type="math/tex">W</script></li> <li>the number of mac users (as a percentage) - let’s call that <script type="math/tex">M</script></li> <li>the number of linux users (as a percentage) - <script type="math/tex">L</script></li> </ul> <p>Now let’s convert the yearly user transition information into a system of three equations. I’m going to use a subscript <script type="math/tex">X_n</script> to represent the value of the variable <script type="math/tex">X</script> in year <script type="math/tex">n</script>.</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} W_{n+1} &= 0.79 W_n + 0.03 M_n + 0.02 L_n \\ M_{n+1} &= 0.20 W_n + 0.87 M_n + 0.05 L_n \\ L_{n+1} &= 0.01 W_n + 0.10 M_n + 0.93 L_n \\ \end{align*} %]]></script> <p>So far, all we’ve done is rote translation of our word problem into a set of equations. Now, we need to make a conceptual leap. Let’s assume there exist some values of <script type="math/tex">W</script>, <script type="math/tex">M</script>, and <script type="math/tex">L</script> for which <script type="math/tex">W_{n+1} = W_n</script>, <script type="math/tex">M_{n+1} = M_n</script>, and <script type="math/tex">L_{n+1} = L_n</script>. 
<em>If</em> that were true, then we could say that <script type="math/tex">W_{n+2} = W_{n+1}</script> (to see why, replace <script type="math/tex">X_n</script> with <script type="math/tex">X_{n+1}</script> in the system of equations above). In fact, we could say that <script type="math/tex">W_x = W_n</script> for any <script type="math/tex">x</script>. In other words, the values of <script type="math/tex">W</script>, <script type="math/tex">M</script>, and <script type="math/tex">L</script> would not change from year to year. Let’s call these hypothetical values of <script type="math/tex">W</script>, <script type="math/tex">M</script>, and <script type="math/tex">L</script>: <script type="math/tex">W_s</script>, <script type="math/tex">M_s</script>, and <script type="math/tex">L_s</script>, respectively (<script type="math/tex">s</script> for <em>steady state</em>). Now, we can re-write our system of equations as follows:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} W_s &= 0.79 W_s + 0.03 M_s + 0.02 L_s \tag{1} \\ M_s &= 0.20 W_s + 0.87 M_s + 0.05 L_s \tag{2} \\ L_s &= 0.01 W_s + 0.10 M_s + 0.93 L_s \tag{3} \\ \end{align*} %]]></script> <p>Three equations, three unknowns… you know the drill! Feel free to skip this next section of pure algebra. There is nothing interesting or fancy going on here.</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} W_s &= 0.79 W_s + 0.03 M_s + 0.02 L_s \tag{1} \\ 0.21 W_s &= 0.03 M_s + 0.02 L_s \\ W_s &= \frac{0.03 M_s + 0.02 L_s}{0.21} \\ M_s &= 0.20 \frac{0.03 M_s + 0.02 L_s}{0.21} + 0.87 M_s + 0.05 L_s \tag{sub $W_s$ into 2} \\ M_s &\approx 0.8986 M_s + 0.069 L_s \\ 0.1014 M_s &\approx 0.069 L_s \\ M_s &\approx 0.6808 L_s \\ W_s &\approx \frac{0.03 \cdot 0.6808 L_s + 0.02 L_s}{0.21} \\ W_s &\approx 0.1925 L_s \\ L_s &\approx 0.01(0.1925 L_s) + 0.1(0.6808 L_s) + 0.93 L_s \tag{sub $W_s$ and $M_s$ into 3} \\ L_s &\approx 1 L_s \\ \end{align*} %]]></script> <p>Huh… well that’s not super helpful. 
Oh yeah, since they represent fractions of the total user base, they all need to add up to one!</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} W_s + M_s + L_s &= 1 \\ 0.1925 L_s + 0.6808 L_s + L_s &\approx 1 \\ 1.8733 L_s &\approx 1 \\ L_s &\approx 0.5338 \\ M_s &\approx 0.3634 \\ W_s &\approx 0.1028 \\ \end{align*} %]]></script> <p>QED! So, the distribution of users that is <em>stable</em>, i.e. will not change over time, is ~54% Linux, ~36% Mac, and ~10% Windows. We haven’t really proven that we’ll actually ever get to that distribution (we won’t, but we’ll approach it), just that <em>if</em> we get there, we will stay there. But, I’m a bit antsy to get to the linear algebra section, given that’s the whole point of this post, so let’s move on.</p> <h3 id="linear-algebra">Linear algebra</h3> <p>Now that we know how to solve the problem without linear algebra, let’s set off to solve it with linear algebra.</p> <p>First things first, let’s translate our system of three equations into a matrix, since that’s one way to think about what matrices are:</p> <script type="math/tex; mode=display">% <![CDATA[ \newcommand{\T}{\left[\begin{matrix}0.79 & 0.03 & 0.02\\0.2 & 0.87 & 0.05\\0.01 & 0.1 & 0.93\end{matrix}\right]} \newcommand{\vec}[3]{\left[\begin{matrix}#1\\#2\\#3\end{matrix}\right]} \newcommand{\vv}[1]{\overrightarrow{#1}} %]]></script> <script type="math/tex; mode=display">\T \vec{W_n}{M_n}{L_n} = \vec{W_{n+1}}{M_{n+1}}{L_{n+1}}</script> <div class="aside"> <p>As an aside, this matrix is called a <em>transition</em> matrix, since it represents the transition from one year to the next.</p> </div> <p>Right off the bat, this formulation (along with a numerical library like python’s numpy or sympy, which can exponentiate matrices) gives us the ability to quickly come up with a good guess of the answer. Let’s just say that, right now, everybody is a Windows user. 
What will the distribution of users be in 1000 years?</p> <script type="math/tex; mode=display">\T^{1000} \vec{1}{0}{0} \approx \vec{0.10275689}{0.36340852}{0.53383459}</script> <p>Look familiar? Yep, that’s basically the same solution that we came up with in the previous section (after a lot of annoying algebra). So, already linear algebra has proven useful.</p> <p>However, I promised you eigenvectors, so let’s deliver on that. Note that I’m assuming that you’ve already read my <a href="/2019/03/09/golden-fibonacci.html">previous post</a>. If you haven’t, this will probably go much too quickly.</p> <p>Unlike the last post, I’m not going to go over how to compute all the eigenvectors by hand (since the idea is the same). Instead, I’ll just use python’s sympy library to compute them for me and I’ll write the three (rounded) eigenvectors - along with their corresponding eigenvalues - below:</p> <script type="math/tex; mode=display">\left [ \left ( 1.0, \quad \left[\begin{matrix}0.1925\\0.6808\\1.0\end{matrix}\right]\right ), \quad \left ( 0.7489, \quad \left[\begin{matrix}0.9011\\-1.9011\\1.0\end{matrix}\right]\right ), \quad \left ( 0.8411, \quad \left[\begin{matrix}-0.1233\\-0.8767\\1.0\end{matrix}\right]\right ) \right ]</script> <p>The first thing to notice is that the eigenvalues of the second and third eigenvectors are less than one, which - like in the fibonacci example - means that any amount of those vectors will decay to zero over time. The eigenvalue of the first eigenvector is 1, which means it will neither decay nor grow over time. It will stay exactly the same as you continually multiply by the matrix. So, with this information, we can essentially predict what will happen for any starting distribution of users.</p> <p>Take an initial vector <script type="math/tex">\vv{v}</script> that represents the starting distribution of users across the three families of operating systems. 
Decompose <script type="math/tex">\vv{v}</script> into a linear combination of the three eigenvectors, with coefficients <script type="math/tex">a</script>, <script type="math/tex">b</script>, and <script type="math/tex">c</script>, respectively. Over time, <script type="math/tex">b</script> and <script type="math/tex">c</script> will decay to zero as they are multiplied over and over by an eigenvalue less than one. <script type="math/tex">a</script> will stay unchanged (since the eigenvalue corresponding to the first eigenvector is one). So, eventually, you just end up with <script type="math/tex">a</script> times the first eigenvector (which we will call <script type="math/tex">\vv{e_1}</script>).</p> <p>But what is <script type="math/tex">a</script>? Well, there are a few ways to answer this, but one clever way is to notice that our distribution of users must always sum to one (since… that’s what distributions do). And we know that eventually the distribution will be <script type="math/tex">a \vv{e_1}</script>. So, to ensure that the elements of that long-term distribution sum to one, we can set <script type="math/tex">a = 1 / sum(\vv{e_1})</script>. The actual elements of <script type="math/tex">\vv{e_1}</script> are (approximately) 0.1925, 0.6808, and 1.0, which sum to about 1.8733, so <script type="math/tex">a \approx 0.5338</script>.</p> <p>And with that, we have our solution:</p> <script type="math/tex; mode=display">0.5338 \left[\begin{matrix}0.1925\\0.6808\\1.0\end{matrix}\right] \approx \vec{0.10275689}{0.36340852}{0.53383459}</script> <p>And (again), we have our answer! By inspecting the eigenvectors and eigenvalues of the transition matrix, we can quickly understand the long term dynamics of our system.</p> <p>One last thing to point out. 
You might be wondering why python’s sympy said that the first eigenvector was <script type="math/tex">\approx [0.1925, 0.6808, 1]</script> instead of <script type="math/tex">\approx [0.1, 0.36, 0.54]</script>, which was the solution that made sense in our context. Given that what’s special about eigenvectors is that they don’t change direction, while they normally <em>do</em> change length (unless it’s a very special case, like in this example, where the corresponding eigenvalue is 1), the length isn’t really important. So, many numerical libraries will just normalize the length of the eigenvector to be one. sympy seems to just set the value of the third element to be one and scale the other elements accordingly. The point is, it doesn’t really matter.</p>Try your hand at this problem:Understanding eigenvectors with Fibonacci2019-03-09T00:00:00+00:002019-03-09T00:00:00+00:00http://blog.russelldmatt.com/2019/03/09/golden-fibonacci<style> p { margin-bottom: 30px; } </style> <p>Here’s the punchline: the ratio between successive terms of the Fibonacci sequence converges to the golden ratio (<script type="math/tex">\varphi</script>, 1.618…).</p> <p>Why give away the punchline so early? Because my goal with this article is to give you an intuitive understanding of eigenvectors (and their corresponding eigenvalues) through explaining this phenomenon, and not the other way around. 
It’s a tall order, and this is a long post, so let’s get started.</p> <p>Before delving into the why and how, let’s just demonstrate that it’s true.</p> <table> <tbody> <tr> <td>n</td> <td>0</td> <td>1</td> <td>2</td> <td>3</td> <td>4</td> <td>5</td> <td>6</td> <td>7</td> <td>8</td> </tr> <tr> <td>nth term of Fib</td> <td>0</td> <td>1</td> <td>1</td> <td>2</td> <td>3</td> <td>5</td> <td>8</td> <td>13</td> <td>21</td> </tr> <tr> <td>nth term / (n-1)th term</td> <td> </td> <td> </td> <td>1</td> <td>2</td> <td>1.5</td> <td>1.667</td> <td>1.6</td> <td>1.625</td> <td>1.615</td> </tr> </tbody> </table> <p>A graph might be better at illustrating just how quickly the ratio approaches <script type="math/tex">\varphi</script>.</p> <p><img src=" /assets/by-post/golden-fibonacci/images/ratios.png" /></p> <h3 id="the-fib-function">The fib function</h3> <p>Let’s imagine a function which takes two (consecutive) elements of the Fibonacci sequence, and returns the next two (with one overlapping). It’s a simple function - here it is:</p> <script type="math/tex; mode=display">% <![CDATA[ \newcommand{\vv}[1]{\overrightarrow{#1}} \newcommand{\vec}[2]{\left[\begin{matrix}#1\\#2\end{matrix}\right]} \newcommand{\fib}[3]{fib(\vec{#1}{#2}) &= \vec{#2}{#3}} %]]></script> <script type="math/tex; mode=display">\begin{align*} \fib{x}{y}{x+y} \end{align*}</script> <p>For example:</p> <script type="math/tex; mode=display">\begin{align*} \fib{0}{1}{1} \\ \fib{1}{1}{2} \\ \fib{1}{2}{3} \\ \fib{2}{3}{5} \\ \fib{3}{5}{8} \\ \end{align*}</script> <p>You get the idea.</p> <h3 id="eigenvectors">Eigenvectors</h3> <p>Ok, so how does this all relate to eigenvectors? Well, what even is an eigenvector? 
Google says it’s</p> <blockquote> <p>a vector which when operated on by a given operator gives a scalar multiple of itself.</p> </blockquote> <p>That’s a bit abstract (as math often is), so let’s put it in the context of our fib function.</p> <blockquote> <p>an input vector which, when given to the fib function, produces an output vector which is a scaled version of itself.</p> </blockquote> <p>Better, but let’s round it out with an example.</p> <p>Is <script type="math/tex">\vec{0}{1}</script> an eigenvector of the fib function? <script type="math/tex">fib(\vec{0}{1}) = \vec{1}{1}</script>. Is <script type="math/tex">\vec{1}{1}</script> just a scaled version of <script type="math/tex">\vec{0}{1}</script>? Nope! So, <script type="math/tex">\vec{0}{1}</script> is not an eigenvector of the fib function.</p> <p>Finding an eigenvector of the fib function with guess and check could take a while, so let’s try to solve for one. Remember, we want to find an input vector <script type="math/tex">\vec{x}{y}</script> such that <script type="math/tex">fib(\vec{x}{y})</script> is a scaled version of <script type="math/tex">\vec{x}{y}</script>, i.e. <script type="math/tex">fib(\vec{x}{y}) = \alpha \vec{x}{y}</script>. We also know that <script type="math/tex">fib(\vec{x}{y}) = \vec{y}{x + y}</script>. 
So, we need to solve <script type="math/tex">\alpha \vec{x}{y} = \vec{y}{x + y}</script>.</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} \alpha \vec{x}{y} &= \vec{y}{x + y} \\ \alpha x &= y \tag{1} \\ \alpha y &= x + y \\ \alpha^2 x &= x + \alpha x \tag{substitute $\alpha x$ for $y$ from 1} \\ \alpha^2 x - \alpha x - x &= 0 \\ \alpha^2 - \alpha - 1 &= 0 \tag{divide by $x$, assuming $x \neq 0$} \\ \alpha &= \frac{1 \pm \sqrt{5}}{2} \tag{by the quadratic formula}\\ \end{align*} %]]></script> <p>This gives us two values of <script type="math/tex">\alpha</script>, which we’ll call:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} \alpha_1 &= \frac{1 + \sqrt{5}}{2} \tag{$\varphi$, the golden ratio!} \\ \alpha_2 &= \frac{1 - \sqrt{5}}{2} \tag{$-\varphi + 1$} \end{align*} %]]></script> <p>We wanted to find an eigenvector <script type="math/tex">[x, y]</script>. This is easy now that we know <script type="math/tex">\alpha</script>. Let’s arbitrarily choose <script type="math/tex">x = 1</script> and <script type="math/tex">\alpha = \alpha_1</script>. From <script type="math/tex">(1)</script>, we know that <script type="math/tex">y = \alpha_1 x</script>, so <script type="math/tex">y = \alpha_1</script>.</p> <p>Let’s check our work. We’re claiming that <script type="math/tex">\vec{1}{\alpha_1}</script> is an eigenvector. <script type="math/tex">fib(\vec{1}{\alpha_1}) = \vec{\alpha_1}{1 + \alpha_1}</script>. 
Is <script type="math/tex">\vec{\alpha_1}{1 + \alpha_1}</script> a scaled version of <script type="math/tex">\vec{1}{\alpha_1}</script>?</p> <p>The ratio of first elements is easy to compute: <script type="math/tex">\frac{\alpha_1}{1} = \alpha_1</script></p> <p>The ratio of second elements will take a bit of algebra:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} \frac{1 + \alpha_1}{\alpha_1} &= \frac{1}{\alpha_1} + 1 \\ &= \frac{2}{1 + \sqrt{5}} + 1 \\ &= \frac{3 + \sqrt{5}}{1 + \sqrt{5}} \\ &= \frac{(3 + \sqrt{5})(1 - \sqrt{5})}{(1 + \sqrt{5})(1 - \sqrt{5})} \\ &= \frac{3 -2 \sqrt{5} - 5}{1 - 5} \\ &= \frac{-2 -2 \sqrt{5}}{-4} \\ &= \frac{1 + \sqrt{5}}{2} \\ &= \alpha_1 \end{align*} %]]></script> <p>So, to summarize, yes! The output vector is a scaled version of the input vector, and it’s scaled by <script type="math/tex">\alpha_1</script>. In other words:</p> <script type="math/tex; mode=display">fib(\vec{1}{\alpha_1}) = \alpha_1 \vec{1}{\alpha_1}</script> <p>What is the ratio by which the vector is scaled (<script type="math/tex">\alpha_1</script>) in this context? That’s the <strong>eigenvalue</strong>. Each eigenvector has a corresponding eigenvalue.</p> <h3 id="ok-but-why">Ok… but why?</h3> <p>Right. Why is this helpful?</p> <p>Well, for one, it’s really easy to understand what will happen when you call your function on an eigenvector. It will just be multiplied by the eigenvalue.</p> <p>What happens when you call your function twice on an eigenvector? Still easy, it will be multiplied by the eigenvalue squared.</p> <p>How about if you call your function 100 times? It will be multiplied by the eigenvalue raised to the 100th power. QED.</p> <p>It’s worth stressing that this is a <em>huge</em> shortcut! Not only in computation cost, but in intuitive understanding. 
If I asked you, what’s the 100th term of the Fibonacci series, if the Fibonacci series started with <script type="math/tex">1</script> and <script type="math/tex">\alpha_1</script>, you’d probably have no idea off the top of your head. But if you know that <script type="math/tex">\vec{1}{\alpha_1}</script> is an eigenvector of the Fibonacci function with an eigenvalue of <script type="math/tex">\alpha_1</script>, then you can calculate this very quickly: <script type="math/tex">fib^{100}(\vec{1}{\alpha_1}) = \alpha_1^{100} \vec{1}{\alpha_1}</script>. The 100th term is the first term of the output. That’s <script type="math/tex">\alpha_1^{100} \approx 1.618^{100} \approx 7.9 \cdot 10^{20}</script>.</p> <h3 id="but-what-about-non-eigenvectors">But what about non-eigenvectors?</h3> <p>Yes - the elephant in the room. You’re almost surely not lucky enough to be handed an eigenvector. Here’s where the <em>linear</em> in <em>linear algebra</em> comes in.</p> <p>What does it mean for a function to be linear? It means the following two properties hold:</p> <ol> <li> <script type="math/tex; mode=display">f(\vv{u} + \vv{v}) = f(\vv{u}) + f(\vv{v})</script> </li> <li> <script type="math/tex; mode=display">f(\alpha \vv{u}) = \alpha f(\vv{u}) \tag{where $\alpha$ is a scalar}</script> </li> </ol> <p>If those properties hold, then we can do something quite magical. We can decompose any input vector into a linear combination of eigenvectors, and then deal with them separately. This may be a bit abstract, but bear with me for a second.</p> <p>Assume <script type="math/tex">\vv{e_1}</script> and <script type="math/tex">\vv{e_2}</script> are two eigenvectors of the <em>linear</em> function <script type="math/tex">f</script> with eigenvalues <script type="math/tex">e_1</script> and <script type="math/tex">e_2</script> respectively. We are trying to understand what happens when <script type="math/tex">f</script> is applied to an input vector <script type="math/tex">\vv{v}</script>. 
Let’s say we can find two constants <script type="math/tex">a</script> and <script type="math/tex">b</script>, such that <script type="math/tex">a \vv{e_1} + b \vv{e_2} = \vv{v}</script>. Then, due to the two properties of linear functions above, we can say the following:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} f(\vv{v}) &= f(a \vv{e_1} + b \vv{e_2}) \\ &= f(a \vv{e_1}) + f(b \vv{e_2}) \tag{by property 1} \\ &= a f(\vv{e_1}) + b f(\vv{e_2}) \tag{by property 2} \\ &= a e_1 \vv{e_1} + b e_2 \vv{e_2} \tag{due to the fact that $\vv{e_1}$ and $\vv{e_2}$ are eigenvectors} \\ \end{align*} %]]></script> <p>Wow! So we just turned our function application into something much easier to understand (and compute) - namely scalar multiplication and vector addition. Note that in order to do this, we had to know <script type="math/tex">a</script> and <script type="math/tex">b</script> up front, which is not completely trivial.</p> <p>Let’s push this a bit further. Let’s represent the vector <script type="math/tex">\vv{v}</script> as a linear combination of <script type="math/tex">\vv{e_1}</script> and <script type="math/tex">\vv{e_2}</script> using the following notation: <script type="math/tex">\vv{v} = \vec{a}{b}</script>. Note that this is just shorthand for <script type="math/tex">\vv{v} = a \vv{e_1} + b \vv{e_2}</script>.</p> <p>Using that notation, we can say <script type="math/tex">f(\vv{v}) = \vec{a e_1}{b e_2}</script>, right? And <script type="math/tex">f(f(\vv{v})) = \vec{a e_1^2}{b e_2^2}</script>. And using my made-up notation for repeated function application: <script type="math/tex">f^n(\vv{v}) = \vec{a e_1^n}{b e_2^n}</script>.</p> <p>There’s a name for what we’ve just done. It’s called a <em>change of basis</em>. We took our vector <script type="math/tex">\vv{v} = \vec{x}{y}</script>, where <script type="math/tex">x</script> and <script type="math/tex">y</script> were coordinates in the standard basis, i.e. 
<script type="math/tex">\vv{v} = \vec{x}{y} = x \vec{1}{0} + y \vec{0}{1}</script>, and turned it into <script type="math/tex">\vv{v} = \vec{a}{b}</script> where <script type="math/tex">a</script> and <script type="math/tex">b</script> are coordinates in a new basis - the <em>eigenbasis</em> - i.e. <script type="math/tex">\vv{v} = \vec{a}{b} = a \vv{e_1} + b \vv{e_2}</script>. And why did we do it? Because applying the function <script type="math/tex">f</script> (any number of times) is <em>really easy</em> in the eigenbasis.</p> <h3 id="back-to-fib">Back to fib</h3> <p>For the love of god, give me an example! You got it. Let’s go back to our fib function. Exercise for the reader: show that the fib function is <em>linear</em>.</p> <p>We solved for the two eigenvalues of the fib function above: <script type="math/tex">\alpha_1</script> and <script type="math/tex">\alpha_2</script>. Now that we know that they are eigenvalues, I’m going to rename them to be <script type="math/tex">e_1</script> and <script type="math/tex">e_2</script> to conform to our notation in the last section:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align*} e_1 &= \frac{1 + \sqrt{5}}{2} \approx 1.618 \\ e_2 &= \frac{1 - \sqrt{5}}{2} \approx -0.618 \end{align*} %]]></script> <p>We also solved for an eigenvector that corresponded to the first eigenvalue, which we will now call <script type="math/tex">\vv{e_1} = \vec{1}{e_1}</script>.</p> <p>We need an eigenvector that corresponds to our second eigenvalue <script type="math/tex">e_2</script> as well. Here is one: <script type="math/tex">\vv{e_2} = \vec{1}{e_2}</script>. I’ll leave it as an exercise to the reader to check that this is correct.</p> <p>Now, let’s try to represent our (non-eigenvector) starting input for the real Fibonacci series, namely <script type="math/tex">\vec{0}{1}</script>, in the eigenbasis. 
That equates to finding an <script type="math/tex">a</script> and <script type="math/tex">b</script>, such that <script type="math/tex">\vec{0}{1} = a \vv{e_1} + b \vv{e_2}</script>. And, tada, here they are: <script type="math/tex">a \approx 0.447</script> and <script type="math/tex">b = -a \approx -0.447</script>.</p> <p>I’m not going to explain how to compute <script type="math/tex">a</script> and <script type="math/tex">b</script> because, well, this is going to be long enough already. But if you want to understand how to compute the coordinates of a vector in a new basis, <a href="https://www.khanacademy.org/math/linear-algebra/alternate-bases/change-of-basis/v/linear-algebra-change-of-basis-matrix">here’s a link</a>.</p> <p>To make it more clear why these values of <script type="math/tex">a</script> and <script type="math/tex">b</script> do the trick, let’s write it out in full:</p> <script type="math/tex; mode=display">\vec{0}{1} = a \vv{e_1} + b \vv{e_2} \approx 0.447 \vec{1}{1.618} - 0.447 \vec{1}{-0.618}</script> <p>And even better? A picture:</p> <p><img src=" /assets/by-post/golden-fibonacci/images/0_1_in_eigenbasis.png" /></p> <p>And now, we will repeatedly apply the <script type="math/tex">fib</script> function. 
Remember that on each application of <script type="math/tex">fib</script>, the <script type="math/tex">a</script> coordinate gets multiplied by <script type="math/tex">e_1 \approx 1.618</script> and the <script type="math/tex">b</script> coordinate gets multiplied by <script type="math/tex">e_2 \approx -0.618</script>.</p> <table> <tbody> <tr> <td>n</td> <td>0</td> <td>1</td> <td>2</td> <td>3</td> <td>4</td> <td>5</td> <td>6</td> <td>7</td> <td>8</td> <td>9</td> </tr> <tr> <td><script type="math/tex">a e_1^n</script></td> <td>0.447</td> <td>0.724</td> <td>1.171</td> <td>1.894</td> <td>3.065</td> <td>4.96</td> <td>8.025</td> <td>12.985</td> <td>21.01</td> <td>33.994</td> </tr> <tr> <td><script type="math/tex">b e_2^n</script></td> <td>-0.447</td> <td>0.276</td> <td>-0.171</td> <td>0.106</td> <td>-0.065</td> <td>0.04</td> <td>-0.025</td> <td>0.015</td> <td>-0.01</td> <td>0.006</td> </tr> <tr> <td><script type="math/tex">a e_1^n\vv{e_1} + b e_2^n\vv{e_2}</script></td> <td><script type="math/tex">\vec{0}{1}</script></td> <td><script type="math/tex">\vec{1}{1}</script></td> <td><script type="math/tex">\vec{1}{2}</script></td> <td><script type="math/tex">\vec{2}{3}</script></td> <td><script type="math/tex">\vec{3}{5}</script></td> <td><script type="math/tex">\vec{5}{8}</script></td> <td><script type="math/tex">\vec{8}{13}</script></td> <td><script type="math/tex">\vec{13}{21}</script></td> <td><script type="math/tex">\vec{21}{34}</script></td> <td><script type="math/tex">\vec{34}{55}</script></td> </tr> <tr> <td>ratio</td> <td> </td> <td>1.0</td> <td>2.0</td> <td>1.5</td> <td>1.667</td> <td>1.6</td> <td>1.625</td> <td>1.615</td> <td>1.619</td> <td>1.618</td> </tr> </tbody> </table> <p>Notice that the <script type="math/tex">a</script> coordinate grows exponentially, with a base of <script type="math/tex">e_1 = \varphi \approx 1.618</script>, while the <script type="math/tex">b</script> coordinate shrinks exponentially in magnitude (flipping sign at every step), with a base of <script type="math/tex">e_2 \approx -0.618</script>. 
As the <script type="math/tex">b</script> coordinate (<script type="math/tex">b e_2^n</script>) approaches zero, all you’re left with is some constant (<script type="math/tex">a e_1^n</script>) times the eigenvector <script type="math/tex">\vv{e_1} = \vec{1}{e_1}</script>. Since the ratio of the second to the first element in the eigenvector <script type="math/tex">\vv{e_1}</script> is <script type="math/tex">\frac{e_1}{1} = e_1 = \varphi</script>, and the output of this sequence approaches a constant (<script type="math/tex">a e_1^n</script>) times <script type="math/tex">\vv{e_1}</script> as <script type="math/tex">b e_2^n \to 0</script>, it makes sense that the ratio of the second to the first element of each output vector (and therefore the ratio of consecutive elements of the Fibonacci sequence) also converges to <script type="math/tex">\varphi</script>.</p> <!-- CR mrussell: explain again why this change of basis was useful in this context --> <h3 id="dont-eigenvectors-have-something-to-do-with-matrices-though">Don’t eigenvectors have something to do with matrices, though?</h3> <p>Yep! Matrices are the language of linear algebra because linear functions (often called <em>linear maps</em>) can be represented by a matrix. Why is that true? And is it always true? To be honest, I’m not the best person to explain that. 
<a href="https://en.wikibooks.org/wiki/Linear_Algebra/Representing_Linear_Maps_with_Matrices">Here’s a wiki article on representing linear maps with matrices</a>, although fair warning, wiki articles on math tend to be overly precise and technical at the cost of understandability.</p> <p>Although I won’t prove the general case, I can show you how this particular linear map (<script type="math/tex">fib</script>) can be represented as a matrix.</p> <p>Consider the matrix <script type="math/tex">M</script>:</p> <script type="math/tex; mode=display">% <![CDATA[ \newcommand{\M}{\left[\begin{matrix}0 & 1\\1 & 1\end{matrix}\right]} %]]></script> <script type="math/tex; mode=display">M = \M</script> <p>Now, let’s see the effect of multiplying a vector, I dunno… say <script type="math/tex">\vec{0}{1}</script>, by <script type="math/tex">M</script>:</p> <script type="math/tex; mode=display">M \vec{0}{1} = \M \vec{0}{1} = \vec{1}{1}</script> <p>And if we multiply again?</p> <script type="math/tex; mode=display">M^2 \vec{0}{1} = \M \M \vec{0}{1} = \vec{1}{2}</script> <p>How about 9 times?</p> <script type="math/tex; mode=display">M^9 \vec{0}{1} = \vec{34}{55}</script> <p>Multiplication by <script type="math/tex">M</script> is the <script type="math/tex">fib</script> function! What are the eigenvectors and eigenvalues of the matrix? You guessed it. The same as the <script type="math/tex">fib</script> function.</p> <h3 id="questions-to-test-your-understanding">Questions to test your understanding:</h3> <!-- CR mrussell: add clickable answers --> <p>From above:</p> <ol> <li> <p>Show that the <script type="math/tex">fib</script> function is linear.</p> </li> <li> <p>Show that <script type="math/tex">\vv{e_2}</script> is an eigenvector of the <script type="math/tex">fib</script> function.</p> </li> </ol> <p>In addition:</p> <ol> <li> <p>Let’s say I have a linear function which stretches vectors in the x-direction. More specifically, it doubles the x values. 
What are the eigenvectors and eigenvalues of that function?</p> </li> <li> <p>Let’s say I have a linear function which rotates vectors by 90°. What are the eigenvectors and eigenvalues of that function?</p> </li> <li> <p>What is a fast way to compute <script type="math/tex">M^n</script> for large values of <script type="math/tex">n</script>? How expensive is it if you just repeatedly multiply by <script type="math/tex">M</script>? How expensive is the fast version?</p> </li> <li> <p>Are there any two starting values for the Fibonacci sequence that would not eventually converge to some scaled version of <script type="math/tex">\vv{e_1}</script>? What would the sequence converge to instead?</p> </li> <li> <p>How many eigenvectors does the matrix <script type="math/tex">M</script> have? How many eigenvalues?</p> </li> </ol>Sierpinski Triangle2019-01-05T00:00:00+00:002019-01-05T00:00:00+00:00http://blog.russelldmatt.com/2019/01/05/sierpinski<script src=" /assets/by-post/sierpinski/main.js" defer=""></script> <style> #canvas { box-shadow: 5px 5px 5px grey; border: 1px solid grey; width: 600px; max-width: 100%; height: 600px; display: block; margin: 0 auto 30px; } </style> <canvas id="canvas"></canvas> <p>Inspired by <a href="https://www.youtube.com/channel/UCvjgXvBlbQiydffZU7m1_aw">The Coding Train</a><sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>, I’d like to write more “creative” programs. I know it’s simple, but what better way to start than by writing a program to draw the Sierpinski triangle? 
Baby steps.</p> <div class="footnotes"> <ol> <li id="fn:1"> <p>If you haven’t yet come across it, <a href="https://www.youtube.com/channel/UCvjgXvBlbQiydffZU7m1_aw">The Coding Train</a> is a YouTube channel where Daniel Shiffman<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup> creates incredibly cool visualizations using the <a href="https://p5js.org/">p5.js</a> JavaScript library, or <a href="https://en.wikipedia.org/wiki/Processing_(programming_language)">Processing</a>, the programming language that inspired p5.js. I highly recommend it. On a related note, I used <a href="https://github.com/Schmavery/reprocessing">reprocessing</a>, a <a href="https://reasonml.github.io/">ReasonML</a> library that was also inspired by Processing. <a href="#fnref:1" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:2"> <p>A pretty smart dude. BA in Mathematics and Philosophy from Yale. Now he teaches at NYU’s Tisch. <a href="#fnref:2" class="reversefootnote">&#8617;</a></p> </li> </ol> </div>