Back to the drawing board — Month #1 (and a bit more)

July 17, 2013

So this is the first post after my blog (and life) transition! Just wish to record some progress in my concept art studies after not drawing and painting for 6 years. Feels so good that sometimes I can’t believe this is going to be what I do full-time! (Why didn’t I start this 1.5 years ago?!)

In any case, the goal of this blog is to have bad paintings and drawings at the beginning and hopefully show improvements over time. My ultimate hope is for this to evolve into a log showing that dreams can be achieved no matter how distant they are from where you are.

Anyways, here are some selected pages of what I’ve been doing over the last month or so =P

First, some animal thumbnails~ (went to the zoo last week, but these are mostly from photos as I’m still not very comfortable drawing while standing in the crowd >.<)


Gorillas are super fun to draw!

Bears (and some polar bears)


African collection




This is me trying to figure out how animal legs bend…


Copied animal poses before the zoo, to get a better idea of how to capture the gesture.

Now onto human figure invention~


For these we were given a background and had to invent a character to put into the scene.


This was super fun to draw~ I am happy to see that now I can draw figures without finding a photo reference ^^


Take a pose and change the character posing. (we actually had a model posing and were asked to change him into a ‘superhero’, ‘sexy woman’ etc =P)


I think I really like drawing fat guys…


Hand expressions, roughly half of them copied from master Disney animators, the other half drawn from my own (left) hand =P


Designing a ‘space pirate’.

Okay, below are some absolutely uncategorized random stuff I just decided to throw in:

A little layout. (obviously inspired by Ratatouille)


Some plants (drawn on the spot at Caltech)


Balls, painted in Photoshop

Okay…onto…oil painting! I have to say that I have never done oils in my life…(and I only ever painted a couple of times back in high school >.<) But it’s so exciting to start!!!

I started off by going to those life model sessions where there is a model posing and everyone drops in and paints:






First-time still life.


Some thumbnail landscapes from my travel photos.

Anyways, painting is what I am most looking forward to improving! I just started learning it systematically last week, so stay tuned!

A sketch of a piece I really want to paint once I get better at it:


The beginning of a long voyage — from Princeton to Pixar

July 16, 2013

Dear blog readers:

I am both amazed and touched to see that the blog kept getting traffic long after I stopped using it; hopefully you have found something here interesting or useful. The intention of this post is to announce that I have decided to start blogging here again, but in a completely different context, to go along with my new life beyond mathematics. Along the way I’d like to give my choice a little explanation (mainly towards my dear mathematical readers) and perhaps give my two cents on life and dreams.

As mentioned in this post 1.5 years ago, I decided to restart my career with the end goal of working at Pixar or Disney as a concept artist.

Definition: Concept art is a form of illustration where the main goal is to convey a visual representation of a design, idea, and/or mood for use in films, video games, animation, or comic books before it is put into the final product.

Roughly speaking, for live action movies one reads the script and selects locations, props and actors; in animation, however, most of the time the world in which the story takes place doesn’t exist! This means that every single detail needs to be designed from imagination: from a chair or a lamp to whole islands, cities and characters. A concept artist is the person who reads the story and designs the world to stage it!

To illustrate the idea let’s see a couple of examples:

As we know, Cars 2 was a movie that takes place in an imaginary universe whose residents are cars instead of people and animals; here are some of Pixar’s brilliant designs:

This was one of the paintings hanging in the palace; note the UK coat of arms on the floor~


Landmark architecture from around the world is re-designed to have car elements in it.


And here’s what a busy London street looks like (I just love the way the buses and taxis look so… British! Note that London double-decker buses actually have that ‘single lens glass’ on them =P)


Anyways, hopefully I have given some ideas on what animation concept design is about~ Now onto some properties of a concept designer:

Propositions: A concept artist typically:
– is super creative
– has a childlike imagination
– thinks and communicates visually
– draws and paints well

Hmm…I am totally convinced that this is what I am made for more than anything else! In fact, this is what I wanted way before I got into mathematics, but as with many people, childhood dreams sometimes get buried and forgotten in the attic. Luckily I found it again that winter in Disney World and will never, ever, lose it again.

Once I figured out what I wanted, the rest is very simple: just do everything possible to get there! In this case I am perhaps at the antipodal point from where I want to be: there is just no obvious route from mathematics to animation design, which means I get to create my own path! So I started by making some route maps and took some steps to see how they might work–

Route #1:
Hang around the Princeton computer science department and specialize in computer graphics
–> get a PhD in computer science
–> get into the software research group at Pixar
–> try to get to know people in the art and design group
–> go from there

Current status: I started by taking the graduate computer graphics and algorithms courses at Princeton in spring 2012; apparently I’m not bad at them (especially algorithms), and the Princeton CS department actually has good connections to Pixar. However, after chasing down Tony DeRose at Stony Brook when he was giving a public lecture, I found that unfortunately I do need to first have a PhD in CS, and that the software group does not really talk to the design group that much…Hence I estimated this route is not the most efficient.

Route #2:
Get my PhD in mathematics
–> leave academia and work in finance for a few years
–> save enough money
–> go to Art Center, major in entertainment design.
–> graduate and get into Pixar

Current status: The major problem with this plan is of course that it involves doing relatively uninteresting and unrelated stuff for a few years. Plus, although I’m sure I would have loved to attend art school, getting another bachelor’s degree is time consuming. Hence after some research on how I might go about getting a job in finance, I decided to move on to the third plan, and perhaps come back to this if nothing seems more efficient.

Route #3: (and this is what I am doing right now!)
Move to California
–> take courses from independent studios (such as concept design academy)
–> become technically at least as good as people who went through art school
–> build a portfolio
–> apply to Pixar directly

Current status: Deep down I have always known this is actually the best and fastest way to go, but I didn’t go for it till this summer because in this case the last six years I spent in mathematics are officially absolutely useless. But now I figure that trying to utilize them would only make the process take longer. Looked at this way, the choice is actually very simple: I should have no qualms about completely starting over and never looking back. ‘Being good at something should only work to one’s advantage’ sounds like a tautology, but in reality it’s striking to see how often abilities and past accomplishments become burdens that prevent people from chasing dreams and, eventually, prevent them from getting to where they want to be.

So I have been in Pasadena since June, and I’m thrilled to say that I have never felt more alive since I finished undergrad! Not only do I get to draw and paint all the time, seeing improvements on a daily basis, but I have also finally found the group I belong to by being around truly creative people! It will take some time, but I know this is what I want to do and I will get there!

In any case, if Mike Wazowski ended up as a scarer, what am I worried about? Inspired by the ending of Monsters University, the current plan: I’ll work on getting as close as I can to Pixar till 2015; if it hasn’t worked out by then, I’m going back to Shanghai to start by sweeping the floor at Shanghai Disneyland.

From now on I will record my progress as a concept artist in training here. Hence if you are here to read mathematics, please unsubscribe…and wish me luck! ^^

With hope, onward

On Tao’s talk and the 3-dimensional Hilbert-Smith conjecture

May 6, 2012

Last Wednesday Terry Tao briefly dropped by our little town and gave a colloquium. Surprisingly this is only the second time I’ve heard him talk (the first goes back to my undergrad years in Toronto, when he talked about arithmetic progressions of primes; unfortunately it came before I learned anything [such as those posts] about Szemeredi’s theorem). Thanks to the existence of blogs, it feels like I know him much better than that!

This time he talked about Hilbert’s 5th problem, Gromov’s polynomial growth theorem for discrete groups, and their (Breuillard-Green-Tao) recently proved more general analogue of Gromov’s theorem for approximate groups. Since there’s no point in me writing a second-hand blog post when people can just read his own posts on this, I’ll just record a few points I personally found interesting (as a complete outsider) and move on to state the more general Hilbert-Smith conjecture, very recently solved for 3-manifolds by John Pardon (who has now graduated from Princeton and become a first-year grad student at Stanford; he also appeared in this earlier post when he gave a solution to Gromov’s knot distortion problem).

Warning: As many of you know I never take notes during talks, hence this is almost purely based on my vague recollection of a talk from half a week ago; inaccuracies and mistakes are more than possible.

All topological groups in this post are locally compact.

Let’s get to math~ As we all know, a Lie group is a smooth manifold with a group structure where the multiplication and inversion maps are smooth. i.e. the object has:

1. a topological structure
2. a smooth structure
3. a group structure

It’s not too hard to observe that given a Lie group, if we ‘forget’ the smooth structure and just see it as a topological group which is a (topological) manifold, then we can uniquely re-construct the smooth structure from the group structure. From my understanding, this is mainly because given any element in the topological group we can find a unique homomorphism of the group \mathbb{R} into the manifold, sending 0 to the identity and 1 to the element, resulting in a class of curves through the identity, a.k.a. the tangent space. Since the smooth structure is determined by the tangent space at the identity, all we need to know is how to ‘multiply’ two such parametrized curves.

The way to do that is to ‘zig-zag’:

Pick a small \varepsilon, take the images of \varepsilon under the two homomorphisms, and alternately multiply them to obtain a sequence of points in the topological group. As \varepsilon \rightarrow 0 the sequence becomes denser and converges to a curve.
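In symbols (my paraphrase of the standard construction, not something from the talk): if \phi, \psi: \mathbb{R} \rightarrow G are the one-parameter subgroups through the two given elements, the zig-zag limit defines their product curve as

(\phi \ast \psi)(t) = \lim_{n \rightarrow \infty} \left( \phi(t/n)\, \psi(t/n) \right)^n

i.e. exactly the alternating products with \varepsilon = t/n, letting n \rightarrow \infty.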

The above shows that given a Lie group to start with, the smooth structure is uniquely determined by the topological group structure. Knowing this leads to the natural question:

Hilbert’s fifth problem: Is it true that any topological group which is a (topological) manifold admits a smooth structure compatible with the group operations?

Side note: I had a little post-colloquium discussion with our fellow grad student Sam Lewallen, he asked:

Question: Is it possible for the same topological manifold to have two different Lie group structures where the induced smooth structures are different?

Note that neither the above nor Hilbert’s fifth problem shows such a thing is impossible, since they both start with the phrase ‘given a topological group’. My *guess* is that this should be possible (so please let me know if you know the answer!) A first attempt might be trying to generate an exotic \mathbb{R}^4 from a Lie group. Since the 3-dimensional Heisenberg group induces the standard (and unique) smooth structure on \mathbb{R}^3, I guess the 4-dimensional Heisenberg group won’t be exotic.

Anyways, so the Hilbert 5th problem was famously solved in the 50s by Montgomery-Zippin and Gleason, using set-theoretical methods (i.e. ultrafilters).

Gromov came in later and made the brilliant connection between (infinite) discrete groups and Lie groups. i.e. one sees a discrete group as a metric space with the word metric, ‘zooms out’ the space to produce a sequence of metric spaces, takes the limit (the Gromov-Hausdorff limit) and obtains a ‘continuous’ space (which is ‘almost’ a Lie group in the sense that it’s an inverse limit of Lie groups.)

Hence he was able to adapt the machinery of Montgomery-Zippin to prove things about discrete groups:

Theorem: (Gromov) Any finitely generated group with polynomial growth is virtually nilpotent.

Side note: I learned about this through the very detailed and well-presented course by Dave Gabai. (I thought I must have blogged about this, turns out I haven’t…)

The beauty of the theorem is (in my opinion) that we are given an arbitrary discrete group, and all that’s known is how large the balls are (in fact, not even that: we only know how fast the large balls grow), yet the conclusion is all about the algebraic structure of the group. To learn more about Gromov’s work, see his paper. Although unrelated to the rest of this post, I shall also mention Bruce Kleiner’s paper where he proved Gromov’s theorem without using Hilbert’s 5th problem; instead he used spaces of harmonic functions on graphs.

Now we finally come to the point of briefly mentioning the work of Tao! They adopted Gromov’s methods of limiting and ‘ultra-filtering’ to apply to objects that aren’t even whole discrete groups: Gromov’s technique was to take the limit of a sequence of metric spaces which are zoomed-out versions of balls in a group, but the Gromov-Hausdorff limit actually doesn’t care about the fact that those spaces are zoomed out from the same group; they may as well be just a family of subsets of groups with ‘bounded geometry’ of a certain kind.

Definition: A K-approximate group S is a (finite) symmetric subset of a group G containing the identity, where S\cdot S = \{ s_1 s_2 \ | \ s_1, s_2 \in S \} can be covered by K translates of S. i.e. there exist p_1, \cdots, p_K \in G where S \cdot S \subseteq \cup_{i=1}^K p_i \cdot S.

We shall be particularly interested in sequences of larger and larger sets (in cardinality) that are K-approximate groups for a fixed K.

Intervals [-N, N] \subseteq \mathbb{Z} are 2-approximate groups.

Balls of arbitrarily large radius in \mathbb{Z}^n are C \times 2^n approximate groups.

Balls of arbitrarily large radius in the 3-dimensional Heisenberg group are C \times 2^4 approximate groups. (For more about metric space properties of the Heisenberg group, see this post)
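As a quick sanity check of the first example (a throwaway computation of mine, nothing from the papers): S = [-N, N] has S + S = [-2N, 2N], which is covered by the two translates -N + S and N + S, so S is indeed a 2-approximate group:

```python
N = 50
S = set(range(-N, N + 1))            # the interval [-N, N] in Z
SS = {a + b for a in S for b in S}   # S + S = [-2N, 2N]

# two translates of S cover S + S: (-N) + S = [-2N, 0] and N + S = [0, 2N]
cover = {-N + s for s in S} | {N + s for s in S}
assert SS <= cover
```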

Just as in Gromov’s theorem, they start with an arbitrary approximate group (a special case being a sequence of balls in a group of polynomial growth), and conclude that they are in fact always essentially balls in nilpotent groups. More precisely:

Theorem: (Breuillard-Green-Tao) Any K-approximate group S in G is covered by C(K) many translates of a subgroup G_0 < G, where G_0 has a nilpotent normal subgroup N of finite index (depending only on K).

With this theorem they were able to re-prove (see p71 of their paper) Cheeger-Colding’s result that

Theorem: Any closed n-dimensional manifold with diameter 1 and Ricci curvature bounded below by a small negative number depending on n must have virtually nilpotent fundamental group.

(Gromov’s theorem yields the same conclusion only for non-negative Ricci curvature.)

Random thoughts:

1. Can Kleiner’s property T and harmonic maps machinery also be used to prove things about approximate groups?

2. The covering definition we gave above in fact does not require the approximate group S to be finite. Is there a Lie group version of approximate groups? (i.e. we may take compact subsets of a Lie group whose self-product can be covered by K many translates of the set.) I wonder what conclusions we can expect for a family of non-discrete approximate groups.

As promised, I shall say a few words about the Hilbert-Smith conjecture and drop a note on the recent proof of its 3-dimensional case by Pardon.

From the solution of Hilbert’s fifth problem we know that any topological group that is an n-manifold is automatically equipped with a smooth structure compatible with the group operations. What if we don’t know it’s a manifold? Well, of course then it doesn’t have to be a Lie group; for example the group of p-adic integers \mathbb{Z}_p is homeomorphic to a Cantor set and hence is not a Lie group. Hence it makes more sense to ask:

Hilbert-Smith conjecture: Any topological group acting faithfully on a connected n-manifold is a Lie group.

Recall that an action is faithful if the homomorphism \varphi: G \rightarrow \mbox{Homeo}(M) is injective.

As mentioned in Tao’s post, in fact \mathbb{Z}_p is the only possible bad case! i.e. it is sufficient to prove

Conjecture: \mathbb{Z}_p cannot act faithfully on a finite dimensional connected manifold.

The exciting new result of Pardon is that by adapting 3-manifold techniques (finding incompressible surfaces and inducing homomorphisms to mapping class groups) he was able to show:

Theorem: (Pardon ’12) There is no faithful action of \mathbb{Z}_p on any connected 3-manifold.

This settles the Hilbert-Smith conjecture in dimension 3.

Discovering this result a few days ago was quite exciting; I hope to find time to read it and blog about it in more detail soon.

A train track on twice punctured torus

April 22, 2012

This is a non-technical post about how I started off trying to prove a lemma and ended up painting this:

One of my favorite books of all time is Thurston‘s ‘Geometry and Topology of 3-manifolds‘ (and I just can’t resist adding here that Thurston, who happens to be my academic grandfather, is in my taste simply the coolest mathematician on earth!) Anyways, for those of you who aren’t topologists, the book is online and I have also blogged about bits and pieces of it in some old posts such as this one.

I still vividly remember the time I got my hands on that book for the first time (in fact I had the rare privilege of reading it from an original physical copy of this never-actually-published book; it was a copy on Amie‘s bookshelf, which she ‘robbed’ from Benson Farb, who got it from being a student of Thurston’s here at Princeton years ago). Anyways, the book was darn exciting and inspiring; not only in its wonderfully rich mathematical content but also in its humorous, unserious attitude — the book is, in my opinion, not a general-audience expository book, and yet it reads as if one is playing around just to find out how things work, much like what kids do.

To give a taste of what I’m talking about, one of the tiny details that totally caught my heart is this page (I can’t help smiling every time I flip through the book and see it, and oh, it still haunts me >.<):

This was from the chapter about Kleinian groups, where the term ‘train track’ was first defined; he drew this image of a train(!) moving on the train tracks, even with smoke steaming out of the engine:

To me such things are simply hilarious (in the most delightful way).

Many years passed and I actually got a bit more into this lamination and train track business. When Dave asked me to ‘draw your favorite maximal train track and test your tube lemma for non-uniquely ergodic laminations’ last week, I ended up drawing:

Here it is, a picture of my favorite maximal train track, on the twice punctured torus~! (Click for larger image)

Indeed, the train is coming with steam~

Since we are at it, let me say a few words about what train tracks are and what they are good for:

A train track (on a surface) is, just as one might expect, a bunch of branches (line segments) with ‘switches’, i.e. whenever multiple branches meet, they must all be tangent at the intersection point, with at least one branch in each of the two directions. By slightly moving the switches along the track it’s easy to see that a generic train track has only switches with one branch on one side and two branches on the other.

On a hyperbolic surface S_{g,p}, a train track is maximal if its complementary region is a disjoint union of triangles and once-punctured monogons. i.e. if we try to add more branches to a maximal track, the new branch will be redundant in the sense that it’s merely a translate of some existing branch.

As briefly mentioned in this post, train tracks give a natural coordinate system for laminations, just like counting how many times a closed geodesic intersects a pair of pants decomposition. To be slightly more precise, any lamination can be pushed into some maximal train track (although not a unique one); once it’s in the track, any lamination that’s Hausdorff close to it can be pushed into the same track. Hence given a maximal train track, the set of all measured laminations carried by it forms an open set in the lamination space, and (with some work) we can see that as measured laminations they are uniquely determined by the transversal measure at each branch of the track. This gives a coordinate system on \mathcal{ML}(S).

The charts from different maximal tracks are of course pasted together along non-maximal tracks, which parametrize a subspace of \mathcal{ML}(S) of lower dimension.

To know more about train tracks and laminations, I highly recommend going through the second part of Chapter 8 of Thurston’s book. I also mentioned them for giving coordinate system on the measured lamination space in the last post.

In any case I shall stop getting into the topology now, otherwise it may seem like the post is here to give an exposition of the subject, while it’s actually here to remind myself never to lose the Thurston-type childlike wonder and imagination (which I find strikingly lacking in the contemporary practice of mathematics).

Filling and unfilling measured laminations

April 10, 2012

(images are gradually being inserted ~)

I’m temporarily back in mathematics to (try to) finish up some stuff about laminations. While I’m at it, I figured sorting out some very basic (and cool) things in a little post here would be a good idea. Browsing through the blog I also realized that as a student of Dave’s I have been writing surprisingly few posts related to what we do. (Don’t worry, like all other posts in this blog, I’ll only put in stuff anyone can read and hopefully won’t be bored reading :-P)

Here we go. As mentioned in this previous post, my wonderful advisor has proved that the ending lamination space is connected and locally connected (see Gabai’08).

Definition: Let S_{g,p} be a hyperbolic surface of genus g with p punctures. A (geodesic) lamination L \subseteq S is a closed set that can be written as a disjoint union of geodesics, i.e. L = \sqcup_{\alpha \in I} \gamma_\alpha where each \gamma_\alpha is a (not necessarily closed) geodesic; each \gamma_\alpha is called a leaf of L.

Let’s try to think of some examples:

i) One simple closed geodesic

ii) A set of disjoint simple closed geodesics

iii) A non-closed geodesic spiraling onto two closed ones

iv) The closure of a single simple geodesic whose transversal cross-sections are Cantor sets

An ending lamination is a lamination where
a) the complement S \backslash L is a disjoint union of discs and once-punctured discs (filling)
b) all leaves are dense in L. (minimal)

Exercise: Example i) satisfies b), and example iv) as shown satisfies both a) and b), hence is the only ending lamination among the examples.

It’s often more natural to look at measured laminations, for example as we have seen in the older post, measured laminations are natural generalizations of multi-curves and the space \mathcal{ML}(S) is homeomorphic to \mathbb{R}^{6g-6+2p} (Thurston) with very natural coordinate charts (given by train-tracks).

Obviously not all measured laminations are supported on ending laminations (e.g. examples i) and ii) with atomic measure on the closed curves). It is well known that if a lamination fully supports an invariant measure, then as long as the base lamination satisfies a), it automatically satisfies b) and hence is an ending lamination. This essentially follows from the fact that having a fully supported invariant measure while not being minimal implies the lamination is not connected and hence can’t be filling.

Exercise: Example iii) does not fully support an invariant measure.

Scaling the measure doesn’t affect the base lamination, hence we may eliminate a dimension by quotienting that out and considering the space of projective measured laminations \mathcal{PML}(S) \approx \mathbb{S}^{6g-7+2p}. Hence we may decompose projective measured laminations into filling and unfilling ones. i.e.

\mathcal{PML}(S) = \mathcal{FPML}(S) \sqcup \mathcal{UPML}(S)

where \mathcal{FPML}(S) projects to the ending laminations via the forgetting measure map \pi.

This decomposition of the standard sphere \mathbb{S}^{6g-7+2p} is mysterious and very curious in my opinion. To get a sense of this, let’s take a look at the following facts:

Fact 1: \mathcal{UPML} is a union of countably many disjoint hyper-discs (i.e. discs of co-dimension 1).

Well, if a measured lamination is unfilling, it must contain some simple closed geodesic as a leaf (or miss some simple closed geodesic). For each such geodesic C, there are two possible cases:

Case 1: C is non-separating. The set of measured laminations that miss C is precisely the set of projective measured laminations supported on S_{g-1, p+2}, hence homeomorphic to \mathbb{S}^{6(g-1)-7+2(p+2)} = \mathbb{S}^{(6g-7+2p)-2}. We may take any such measured lamination, disjoint union with C, and assign any ratio of weights to C and the lamination. This corresponds to taking the cone over \mathbb{S}^{(6g-7+2p)-2} with vertex the atomic measure on C, which yields a disc of dimension (6g-7+2p)-1.

Case 2: C is separating. Similarly, the set of measured laminations missing C is supported on two connected surfaces with total genus g and p+2 total punctures.

To describe the set of projective measured laminations missing C, we first determine the ratio of measure between the two connected components and then the set of projective laminations supported in each component. i.e. it’s homeomorphic to [0,1] \times \mathbb{S}^{d_1} \times \mathbb{S}^{d_2}/\sim where d_1+d_2 = 6g-2\cdot 7+2(p+2) = 6g-10+2p and (0, x_1, y) \sim (0, x_2, y) and (1, x, y_1) \sim (1, x, y_2).

Exercise: Check this is a sphere. Hint: if d_1 = d_2 = 1, we have:

Again we cone with respect to the atomic measure corresponding to C and get a hyper-disc.

At this point you may think ‘AH! \mathcal{UPML} is only a countable union of hyper-discs! How complicated can it be?!’ Turns out it could be, and (unfortunately?) is, quite messy:

Fact 2: \mathcal{UPML} is dense in \mathcal{PML}.

This is easy to see: since any filling lamination is minimal, all leaves are dense; we may simply take a long segment of some leaf whose beginning and end points are close together on some transversal, close up the segment by adding a small arc along the transversal, and get a simple closed geodesic that’s arbitrarily close to the filling lamination in \mathcal{PML}. Hence the set of simple closed curves with atomic measure is dense, obviously implying \mathcal{UPML} is dense.

So what exactly does this decomposition look like? I found it very mysterious indeed. One way to look at it is: we know two \mathcal{UPML} discs can intersect if and only if their corresponding curves are disjoint. Hence in some sense the configuration captures the structure of the curve complex. Since we know the curve complex is connected, we may start from any disc, take all discs which intersect it, then take all discs intersecting one of the discs already in the set, etc.

We shall also note that all discs intersecting a given disc must pass through the point corresponding to the curve at its center. Hence the result will be some kind of fractal-ish collection of intersecting discs:


Yet somehow it manages to ‘fill’ the whole sphere!

Hopefully I have convinced you via the above that countably many discs in a sphere can be complicated, not only in pathological examples but also in ‘real’ life! Anyways, with Dave’s wonderful guidance I’ve been looking into proving some stuff about this (in particular, the topology of \mathcal{FPML}). Hopefully the mysteries will become a little clearer over time~!

The travelling salesman problem

March 19, 2012

This is one of those items I should have written about long ago: I first heard about it over a lunch chat with professor Guth; then I was at not one, but two different talks on it, both by Peter Jones; and now, finally, after it appeared in this algorithms lecture by Sanjeev Arora I happen to be in, I decided to actually write the post. Anyways, it seems to live everywhere around my world, hence it’s probably a good idea for me to look more into it.

Has everyone experienced those annoying salesmen who keep knocking on your and your neighbors’ doors? One of their wonderful properties is that they won’t stop before they have reached every single household in the area. When you think about it, this is in fact not so straightforward to do; i.e. one might need to travel a long way to make sure each house is reached.

Problem: Given N points in \mathbb{R}^2, what’s the shortest path that goes through each point?

Since this started as a computational complexity problem (although in fact I learned the analyst’s version first), I will mainly focus on the CS version.

Trivial observations:

In total there are about N! paths, hence the naive approach of computing all their lengths and finding the minimum takes more than N! \sim (N/e)^N time. (which is a long time)

This can be easily improved by standard dynamic programming over subsets:

Let V \subseteq \mathbb{R}^2 be our set of points. For each subset S \subseteq V, for each p_1, p_2 \in S, let F(S, p_1, p_2) = length of shortest path from p_1 to p_2 going through all points in S.

Now the value of this F can be computed recursively:

\forall |S| = 2, F(S, p_1, p_2) = d(p_1, p_2);

Otherwise F(S, p_1, p_2) =

\min \{ F(S \backslash \{p_2\}, p_1, q) + d(q, p_2) \ | \ q \in S, q \neq p_1, p_2 \}

What we need is the minimum of F(V, p_1, p_2) over p_1, p_2 \in V. There are 2^N subsets of V, and for any subset there are \leq N choices for each of p_1, p_2. Each F(S, p_1, p_2) is a minimum of at most N numbers, hence takes O(N) time (as was mentioned in this previous post on selection algorithms). Summing up, the running time is 2^N \times N^2 \times N = O(2^N N^3): still exponential, but much better than N!.
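The subset recursion above can be sketched in code. This is only an illustration of mine (the function name is made up, and I leave both endpoints free, which amounts to minimizing F(S, p_1, p_2) over both p_1 and p_2):

```python
import math

def shortest_path_through_all(points):
    """Subset dynamic program for the shortest path visiting all points.

    F[(mask, p)] = length of the shortest path that visits exactly the
    points indexed by the bitmask `mask` and ends at point p; the start
    point is left free. Roughly 2^N * N^2 work: exponential, but far
    better than trying all N! orderings.
    """
    n = len(points)
    d = [[math.dist(a, b) for b in points] for a in points]
    F = {(1 << i, i): 0.0 for i in range(n)}  # |S| = 1: a single point
    for mask in range(1, 1 << n):
        if mask & (mask - 1) == 0:
            continue  # singletons already initialized
        for p in range(n):
            if not mask & (1 << p):
                continue
            prev = mask ^ (1 << p)  # drop the endpoint p and recurse
            F[(mask, p)] = min(F[(prev, q)] + d[q][p]
                               for q in range(n) if prev & (1 << q))
    full = (1 << n) - 1
    return min(F[(full, p)] for p in range(n))
```

Since masks are processed in increasing order and `prev < mask`, every subproblem is ready when needed.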

Can we make it polynomial time? Probably not: it’s well known that this problem is NP-hard, as is explained well in the Wikipedia page for the problem.

Well, what can we do now? Thanks to Arora (2003), we can do an approximate version in polynomial time. I will try to point out a few interesting ideas from that paper. The process reminded me of the earlier post on the nonlinear Dvoretzky problem (it’s a little embarrassing that I didn’t realize Sanjeev Arora was one of the co-authors of the Dvoretzky paper until I checked back on that post today! >.< ); it turns out they have this whole program of ‘softening’ classic problems and producing approximate versions.

Approximate version: Given N points in \mathbb{R}^2 and \varepsilon > 0, find a path \gamma through each point such that its length l(\gamma) < (1+\varepsilon)l(\mbox{Opt}).

Of course we should expect the running time T to be a function of both \varepsilon and N; as \varepsilon \rightarrow 0 it must blow up (to at least exponential in N, and in fact, as we shall see below, it blows up to infinity).

The above is what I would hope is proved to be polynomial. In reality, what Arora did was one step more relaxed, namely a polynomial time randomized approximate algorithm, i.e. given V and \varepsilon, the algorithm produces a path \gamma such that \mathbb{E}(l(\gamma)-l(\mbox{Opt})) < \varepsilon \, l(\mbox{Opt}). In particular, by Markov’s inequality, more than half the time the route is within a (1+2\varepsilon) factor of the optimum.

Theorem (Arora ’03): T(N, \varepsilon) \sim N^{O(1/\varepsilon)} for the randomized approximate algorithm.

Later in that paper he improved the bound to O(N \varepsilon^{C/\varepsilon}+N\log{N}), which remains the best known bound to date.

Selected highlights of proof:

One of the great features of the approximating world is that we don’t care if there are a million points that are extremely close together: we can simply merge them into one point!

More precisely, since we are allowing a multiplicative error of \varepsilon, and we have the trivial bound l(\mbox{Opt}) \geq \mbox{diam}(V), we can afford an additive error of \varepsilon \, \mbox{diam}(V). This means that if we move each point by a distance no more than \varepsilon \, \mbox{diam}(V) / (4N) and produce a path \gamma' connecting the new points with l(\gamma')< (1+\varepsilon/2)l(\mbox{Opt}), then we can simply recover our desired \gamma from \gamma', as shown:

i.e. the problem is “pixelated”: we may bound V in a square box with side length \mbox{diam}(V), divide each side into 8N/\varepsilon equal pieces and assume all points sit at the center of the grid cell they lie in. (For convenience later in the proof we will assume 8N/\varepsilon = 2^k is a power of 2, and rescale so that each cell has side length 1; the side length of the box is then 8N/\varepsilon = 2^k.)
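The snapping step can be sketched as follows (a hypothetical helper of my own; k plays the role of \log_2(8N/\varepsilon), and each point moves by at most half a cell in each coordinate):

```python
def pixelate(points, k):
    """Snap each point to the center of its cell in a 2^k x 2^k grid
    laid over the bounding square of the point set."""
    cells = 2 ** k
    xs, ys = zip(*points)
    x0, y0 = min(xs), min(ys)
    side = max(max(xs) - x0, max(ys) - y0) or 1.0   # bounding square side
    cell = side / cells
    snapped = []
    for x, y in points:
        i = min(int((x - x0) / cell), cells - 1)    # clamp the far edge
        j = min(int((y - y0) / cell), cells - 1)
        snapped.append((x0 + (i + 0.5) * cell, y0 + (j + 0.5) * cell))
    return snapped
```

Since each of the 2N path endpoints moves by at most cell/\sqrt{2}, the total length changes by at most the additive budget we just argued we can afford.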

Now we do the so-called quadtree construction to separate the points (reminds me of Whitney’s original proof of his extension theorem, or the dyadic squares proof that open sets are countable unions of squares): i.e. bound V in a square box and keep dividing squares into four smaller ones until no cell contains more than one point.

In our case, we need to randomize the quadtree: first we bound V in a box that’s 4 times as large as our grid box (i.e. of side length 2^{k+1}), shift the larger box by a random vector (-a,-b), and then apply the quadtree construction to the larger box:
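A minimal sketch of this construction (the function names and the dictionary representation are mine; real implementations store the tree far more compactly, and the randomness enters only through the shifted bounding box):

```python
import random

def build_quadtree(points, box, depth=0, max_depth=20):
    """Recursively split box = (x, y, side) into four quadrants until
    each leaf holds at most one point."""
    if len(points) <= 1 or depth == max_depth:
        return {"box": box, "points": points, "children": None}
    x, y, s = box
    h = s / 2
    quads = [(x, y, h), (x + h, y, h), (x, y + h, h), (x + h, y + h, h)]
    children = []
    for qx, qy, qs in quads:
        inside = [(px, py) for px, py in points
                  if qx <= px < qx + qs and qy <= py < qy + qs]
        children.append(build_quadtree(inside, (qx, qy, qs), depth + 1, max_depth))
    return {"box": box, "points": points, "children": children}

def random_shifted_box(side):
    """A box of twice the side length, shifted by a random vector,
    as in the randomized construction above."""
    a, b = random.uniform(0, side), random.uniform(0, side)
    return (-a, -b, 2 * side)
```

Running `build_quadtree(points, random_shifted_box(side))` gives one sample of the random tree; the analysis below averages over the shift (a, b).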

At this point you may wonder (at least I did) why we need to pass to a larger square and randomize. From what I can see, doing this is to get the following:

Fact: now when we pick a grid line at random, the probability of it being an i-th level dividing line is 2^i/2^k = 2^{i-k}.

Keep this in mind.

Note that each site point is now uniquely determined as an intersection of no more than k nested squares, hence the total number of squares (over all levels) in this quadtree cannot exceed N \times k \sim N \log(N/\varepsilon).

Moving on, the idea for the next step is to perturb any path into one that crosses the sides of each square only at a specified finite set of “crossing points”, called portals. Let m be the unique number such that 2^m \in [(\log N)/\varepsilon, 2 (\log N)/ \varepsilon ] (we will see that this is the best m to choose). Divide the sides of each square in our quadtree into 2^m equal segments:

Note: when two squares of different sizes meet, since the number of equally spaced points is a power of 2, the portals of the larger square are also portals of the smaller one.

With some simple topology (! finally something within my comfort zone :-P) we may assume the shortest portal-respecting path crosses each portal at most twice:

In each square, we run through all possible crossing portals and evaluate the shortest possible path that passes through all sites inside the square and enters and exits at the specified portals. Since each square has 4 \times 2^m portals, each crossed at most twice, there are at most (2^{4 \times 2^m})^2 \sim N^{O(1/\varepsilon)} entering-exiting configurations, each taking time polynomial in N to evaluate.

Once all subsquares have all their paths evaluated, we may move to the next larger square and spend another \log(N/\varepsilon) \times (N/\varepsilon)^2 operations. In total we have

N \times \log{N/\varepsilon} \times (N/\varepsilon)^2 \times N^{O(1/\varepsilon)}

\sim N^{O(1/\varepsilon)}

which is indeed polynomially many operations in N (for each fixed \varepsilon).

The randomization comes in because the route produced by the above polynomial time algorithm is not always approximately optimal; it turns out that sometimes it can be a lot longer.

The expectation of the difference between our random portal-respecting minimum path \mbox{OPT}_p and the actual minimum \mbox{OPT} is bounded by the simple fact that the minimum path cannot cross the grid lines more than \mbox{OPT} times (each cell has side length 1, so each crossing contributes at least 1 to the length). At each crossing, the edge it crosses is at level i with probability 2^{i-k}. To perturb a level-i crossing into a portal-respecting one requires adding an extra length of no more than 2 \times 2^{k-i}/2^m \sim 2^{k+1-i}/(\log N / \varepsilon):

\displaystyle \mathbb{E}_{a,b}(\mbox{OPT}_p - \mbox{OPT})

\leq \mbox{OPT} \times \sum_{i=1}^k 2^{i-k} \times 2^{k+1-i} / (\log N / \varepsilon)

= \mbox{OPT} \times 2k\varepsilon / \log N \sim O(\varepsilon) \, \mbox{OPT}

(each of the k levels contributes 2\varepsilon/\log N, and k \sim \log(N/\varepsilon); running the whole algorithm with a suitable constant multiple of \varepsilon then gives the desired (1+\varepsilon) bound).

P.S. You may find the images for this post a little different from previous ones; that’s because I recently got myself a new iPad and all the images above were done using iDraw. Still getting used to it, but so far it’s quite pleasant!

Bonus: I also started to paint on iPad~

–Firestone library, Princeton. (Beautiful spring weather to sit outside and paint from life!)

Back to the drawing (painting) board

March 12, 2012

OK, so recently I made some attempts at painting on the computer (meaning: open a blank canvas in Photoshop and paint on a tablet). After spending just a few nights on it, I can’t help wondering ‘why haven’t I done this earlier?!’. Anyways, to help save fellow mathematicians from making the same mistake, I’m writing this post. (Okay, you got me, I just want to post because I think it’s cool >.< Oh well, I guess the travelling salesman has to wait a few days.)

This one was painted last night (you should blame it if you can’t wait to read about the travelling salesman problem). After reviewing previous attempts, I found that somehow I tend to use less saturated colors when shifting from canvas to digital. Hence I picked a circus scene for color training (the scene is from Disney’s Hunchback of Notre Dame album), and it seems to work~ Also, before this I had been sketching at a smaller size, hence the details are a little smoother on this one; click to enlarge.

OK, here comes my complete digital painting history to date:

The very first sketch I did the night my tablet arrived (by the way, I’m using a Wacom Intuos, small). This is a scene I saw back in Evanston, when I used to take midnight long walks along the shore of Lake Michigan. There were days when the moon rose from the lake, creating a shiny silver region that silently sparkled in the midst of the otherwise dark water. My picture certainly doesn’t do it justice, but this was one of the views I had been longing to capture and couldn’t with a camera (low light + moving water). People, if you ever visit Northwestern, I highly recommend checking the moonrise schedule!

After the first try (which is mainly black and white), I decided that I should get familiar with color mixing techniques, since the eye-dropper tool is non-existent in the real world. So I did a ‘master study’, i.e. pulled out a hard copy of a Pixar concept drawing (originally in colored chalk, I believe) and tried to get all the colors working ^_^ So here comes my version of Radiator Springs from Cars~

The next day I went out of my office, sat on the stairs for a couple of hours with a computer and tablet, and did this quick sketch of “the only interesting piece of architecture around the math department” – the Lewis Library by Frank Gehry. (I never liked the green and orange blocks on the building, hence I intentionally deleted them :-P) I couldn’t get into any details because it was past midnight and that day was fairly cold.

In conclusion, digital vs. traditional:

Digital wins:

1. Much better at anything to do with color (mixing, picking up a color from another part of the picture, etc.);
2. Faster (it used to take me weeks to finish a painting);
3. Unlimited resources (custom brushes, color that won’t run out).

Traditional wins:

1. More personally satisfying (well, I guess nothing compares to holding the canvas in your hands);
2. You can actually see the brush and the stroke at the same time (it took me some time to get used to not seeing where my hands are, but in fact it’s not as bad as one might think).

Anyways, I’m about a week old in this field, so don’t believe anything I said :-P. By the way, a side effect of doing this is that I have now started to think about RGB values each time I see a colored object in real life >.<.

Hope you had fun! (I certainly did)

Adding faulted data to produce accurate results, faster

March 5, 2012

In computer algorithms, there is this thing called a ‘soft heap‘, due to Princeton computer scientist Bernard Chazelle (2000), which basically corrupts a certain percentage of the data when building the data structure. When I first heard of it, I couldn’t imagine how on earth such a thing could be anywhere useful.

Indeed it is, and the reason it works is quite interesting to me. Hence I’m writing this post about finding the (real) k-th smallest element in a set via this corrupted data structure. By the way, this was presented to me in a very interesting lecture by Robert Tarjan. For you computer scientists: be warned, I’m completely ignorant of the subject of algorithms.

Part 1: Heap

Naively sorting a set (i.e. compare the first two elements, put the smaller first; compare the 2nd and 3rd, move the 3rd past the 2nd if it’s smaller; then compare the 3rd and the 1st, etc.) could take \sum_{i=1}^N i \sim O(N^2) many comparisons. One can of course do better: the fastest known comparison sorts take O(N \log N) operations.

One way to achieve O(N \log N) is the so-called heap sort:

A heap is a binary tree with a number (key) in each node, where the child of a node always has a key at least as large as that of the node (i.e. the minimum element is at the root).

Now the problem of sorting breaks into two parts:

i) build a heap with our given keys
ii) sort elements in a heap

As we can see, if we manage to build a heap with all branches having roughly the same length (computer scientists call that a balanced binary tree), there will be 2^n elements at level n, and the height of the heap is \log_2 (N+1) \sim O(\log N). The main idea of the sort is to go through all elements, aiming to do at most a constant times the height many operations for each element (then we would get our N \log N).

Building the heap: suppose we already have the first n elements in a heap, occupying all nodes in the first k levels and some nodes in the (k+1)-th level. We put the (n+1)-th element at the first empty node on the (k+1)-th level; compare it to its parent, swap if it’s smaller than its parent; compare it to its new parent again… Do this until it’s larger than its parent. Note that 1) swapping does not change the shape of the tree, and 2) when the process stops, we end up with a new heap. Hence inserting this new element takes at most twice the height many operations (one compare and one swap at each level).

In total, building the heap takes no more than N \times 2 \log N \sim O(N \log N) operations. (In fact one can build a heap in O(N) time, but for our purpose that’s not needed.)

Sorting from the heap: take the root as the minimum element and delete it from the heap; now compare the two elements in level 1 and put the smaller into the root; then compare the two children of the now-empty node, etc. (If at some stage the empty node has only one child, just move it; stop when the empty node has no child.) At the end, delete the leaf node that’s empty. Again this process takes no more than twice the height of the tree many operations per deletion.

Combining the above two steps, we have \sim N \log N operations.
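The two steps above can be sketched in Python, storing the tree in a flat array where node i has children 2i+1 and 2i+2 (a from-scratch illustration; in practice one would use a library heap such as Python’s heapq):

```python
def heap_push(heap, key):
    """Insert at the next free leaf and sift up: compare/swap with the
    parent until the heap property is restored."""
    heap.append(key)
    i = len(heap) - 1
    while i > 0 and heap[i] < heap[(i - 1) // 2]:
        heap[i], heap[(i - 1) // 2] = heap[(i - 1) // 2], heap[i]
        i = (i - 1) // 2

def heap_pop_min(heap):
    """Remove the root, move the last leaf there, and sift down."""
    root = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        i = 0
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < len(heap) and heap[c] < heap[smallest]:
                    smallest = c
            if smallest == i:
                break
            heap[i], heap[smallest] = heap[smallest], heap[i]
            i = smallest
    return root

def heap_sort(keys):
    """O(N log N): build the heap, then repeatedly delete the minimum."""
    heap = []
    for k in keys:
        heap_push(heap, k)
    return [heap_pop_min(heap) for _ in keys]
```

Both the sift-up and the sift-down walk a single root-to-leaf path, which is exactly the “twice the height” bound in the text.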

As one can guess, heaps do much more than sorting: once we’ve spent the time to build the data into a heap, we can save the structure and update it as the data set changes:

Inserting element: \sim O(\log N) (put the element at the next available leaf and move upwards)
Find minimum: \sim O(1) (well, the min is at the root)
Delete minimum: \sim O(\log N) (delete the root, move up the smaller of its children, etc.)

OK~ so much for that, let’s move to the more interesting business:

Part 2: Error

The idea is: given a set of data with keys, instead of putting them into a heap, we put them into another structure (the ‘soft heap’) which assigns a fake key to every element and treats it as the real key. It does so in a way that:

i) all elements have fake keys larger than or equal to their real keys;
ii) only a certain percentage (\varepsilon) of elements have a fake key different from (larger than) their real key.

The advantage is that, unlike a heap, it now takes only \sim O(1) (amortized) time to delete the minimum element, while building it still takes O(N) time. (Explaining exactly how this is done requires another post, which I might write… for now, check out Chazelle ’00.)

Meanwhile, I guess the most curious thing is: how can we get any uncorrupted results out of a corrupted data structure? As an example, let’s see how to find the (genuine) median with a data structure having an \varepsilon fraction of corrupted keys:

The main observation is that, since all fake keys are at least the real keys, after deleting the smallest (1/2-\varepsilon) fraction of elements (according to fake keys), the genuine median is larger (in real key) than every deleted element, and hence surely remains in the set.

Now we have reduced the problem to finding the (N\varepsilon/2)-th smallest element in a set of cardinality (1/2+\varepsilon)N. Of course we can solve this by putting the elements in a reversed soft heap (i.e. all real keys changed to their negatives), deleting the smallest (by fake key) fraction of elements as before, etc.

At each step, we shrink the data set to at most a (1/2+\varepsilon) fraction of its previous size, and the amortized running time of each step is O(n) (building the soft heap) + O(n) (deleting about half of the elements), which is in total \sim O(n).

Since the size shrinks by a constant factor each time, the total running time is a geometric series, hence \sim O(N).


Remarks:

1. In the above argument, \varepsilon can be anything less than 1/2, hence in fact quite a lot of the data is allowed to be corrupted!

2. It doesn’t matter whether we are finding the median or the k-th smallest element; indeed, after the first step in the above argument we are literally finding the k-th smallest element. The general rule is simply to negate the keys before building the soft heap whenever k<n/2.

3. The problem of finding the k-th smallest element is very well studied, and algorithms that do this are called selection algorithms. Obviously our O(N) running time is optimal, since it takes that much time just to read the data set! However, there are many other ways to achieve this linear running time, some not even amortized, but as far as I know, one cannot do this with a real heap.

Simulating paintings on computer

February 27, 2012

Recently I’ve been playing around in our CS department and discovered there are actually people working in this field called computer graphics. Given that I was playing with Photoshop even before knowing what complex numbers are, this is quite exciting!

I guess I got a little too carried away with this homework item about coding your own non-photorealistic filter.

Anyways, I was never quite satisfied with any of those ‘artistic filters’ that come with commercial software. For example, here’s Photoshop’s ‘colored pencil’:



and dry brush:

Well, just believe me, I have tried to adjust the parameters to produce reasonable results…

I headed down to MoMA and the Metropolitan Museum over the weekend and looked at some classical paintings with the intention of figuring out how one might go about programming them. (Since color mixes quite differently on canvas and on screen, there are actually some quite interesting problems to solve in order to make the computer “paint”.)

Anyways, here’s what I ended up with. (I would really like to make a Van Gogh filter… although I haven’t gotten anywhere convincing yet…)

Pointillism (I found this to be the obvious one to start with):

real painting (Georges Seurat):

my pointillism filter:

(applied to a fairly ordinary photo:)


classical oil painting portrait:


Just for fun, I also did one for cubism:
(obviously inspired by Picasso’s violin)


my cubism filter:

Anyways, that was quite fun to program (especially for someone who hasn’t touched code since high school).

In fact, I even found it rewarding for improving one’s painting skills. For example, one thing I figured out was that the distribution of luminance in a painting is normally quite different from that of the real world; each artist has their own ‘luminance curve’ and shifts the hue/saturation in a particular way. Umm… maybe coding filters should be added to all fine art curricula :-P

A Chinese new year greeting~

January 23, 2012

So this is the first day of the Chinese new year; classes/seminars haven’t resumed since December 17th, and there are still two weeks to go before anything starts…

Anyways, I’m borrowing this occasion to thank everyone for continuously supporting this blog! We’ve had a great year, check the annual report at the end of the post for details.

As some of you may already know, I have decided that mathematics is not going to be my life-long career. So far it’s not clear which direction this blog will go in; it’s simply too important to abandon. I guess only time will tell. However, expect changes.

Finally, a sketch I doodled up last night~ Happy Chinese new year!

(btw, for those who have seen me in winter: yes, I’m wearing that Klein bottle hat in the picture!)

The stats helper monkeys prepared a 2011 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 15,000 times in 2011. If it were a concert at Sydney Opera House, it would take about 6 sold-out performances for that many people to see it.

Click here to see the complete report.