Pages

Tuesday, 26 November 2013

A graph labelling problem

I've been doing some work using the shapefiles for the current Canadian federal electoral districts.  Each shape in the shapefile corresponds to a riding (or, in a few cases, a part of a riding).  One thing I have noticed while working with them is that there is no apparent structure to the order in which the shapes appear.

For example, Shape 1 is the boundary of the Vegreville--Wainwright riding in Alberta, while Shape 2 is Bruce--Grey--Owen Sound in Ontario.  Since Alberta and Ontario do not even share a border, these two districts are far apart from each other.  Meanwhile, the ridings adjacent to Vegreville--Wainwright are numbered 5, 21, 144, 183 195, 217, and 240.  Considering there are only 312 shapes [1] in total, 5 and 21 are the only numbers remotely close to 1.  If there is any reasoning behind the numbering scheme, it's not easy to tell what it is by looking at the numbers alone.  One assumes it's arbitrary.

I use the numbers frequently, so it would be helpful if I could tell a little bit more about how close two ridings are based on their numbers.

The best numbering system would have consecutive numbers mapped to adjacent ridings, so that one riding was adjacent to another if and only if their labels differed by exactly 1.  This impossible to do for every riding, however, since it would require each riding to be adjacent to only two other ridings, but most are adjacent to three or more others.

Instead, we could number the ridings so that the numbers of  ``close'' ridings are also close.

There are a couple ways we could do this.  First, let's put the problem in terms of graph theory.  Let $G$ be the graph whose vertices are the electoral districts.  Define two vertices to be adjacent if their corresponding districts share a border (excluding the case where two districts share a point).  Suppose we label each vertex $v$ with a number $n(v)$ between 1 and the number of vertices, using each number for exactly one vertex.
  1. Ensure adjacent vertices have close numbers.  Thus, for each pair of adjacent vertices, compute the difference between their labels, then take the sum of these numbers over all adjacent pairs.  That is, compute
    $$\sum_{uv\in E}|n(u)-n(v)|$$
    Define the best labellings to be those that minimize minimize this sum.
  2. Approximate the the distances between vertices with the difference between their respective labels.  To this end, let $d(u,v)$ be the distance between two vertices in the graph and let $D(u,v)=|n(u)-n(v)|$.  The difference between the labels approximates the distance between the vertices if $|d(u,v)-D(u,v)|$ is small.  Thus, define the best labellings to be those that minimizes the sum
    $$\sum_{uv}|d(u,v)-D(u,v)|$$
The first approach bears some similarity to the graph bandwidth problem.   In the bandwidth problem, the goal is to minimize the maximum value of $|n(u)-n(v)|$.  If this value is $m$, then $m|E|$ is an upper bound on the sum that appears in the first approach.  It's possible, however, that the sum and the maximum are minimized by different labellings. 

The fact that the graph bandwidth problem is NP-hard suggests that it will also be difficult to minimize the sum.  I also suspect that the sum in the second approach will be of comparable difficulty.

Edit: One way of interpreting the graph bandwidth is not obvious from the name (the name is derived from matrices associated with the graph).  If we think of the labels $n(v)$ as locations of vertices along the line from 1 to $|V|$, then $|n(u)-n(v)|$ is the length of the edge.  So the graph bandwidth is the smallest possible value for the length of the longest edge.   We can say something similar about the first solution.  The goal in that case is to minimize the total length of the edges, or, equivalently, the average edge length.

[1] The astute reader will notice that there are 308 ridings in total for the last election, and there will be 338 in the next.  So where does this 312 come from?  Two ridings include islands that are too far from the mainland to be included in the same shape, so are included as separate shapes, and one riding includes 3 mainland components that are separated from each other by another riding, yielding 4 extra shapes.

Monday, 25 November 2013

Antiderivative vs. Indefinite integral

I spent some time today getting ready for my class for the next term.  In particular, I was reading through the sections on antiderivatives and indefinite integrals.  I had normally taken these things to be distinct concepts.  Given a function $f$, an antiderivative of $f$ is any function whose derivative is $f$.  The indefinite integral is the set of all antiderivatives.  Since there are infinitely many antiderivatives, while an antiderivative is a single function related to $f$ by differentiation, these are not the same thing.

Or at least, that's what I thought until I read the aforementioned sections. The definitions I read today did not seem to make that distinction.  I consulted with the textbook that I used in my own first year calculus class many years ago.  This textbook does make the distinction.

I suspect the difference between the two ideas is not appreciated by a lot of students.  We don't dwell on it much once we give the definitions.  After that, we usually teach them techniques for finding indefinite integrals, and as long as they remember to include the constant of integration, we're happy.

The textbook that doesn't distinguish the two is one of the most widely used introductory calculus texts.  It's quite an old version, though.  Perhaps newer versions do distinguish.

In any case, it made me wonder how important, if at all, the distinction is.  Was a serious error made by leaving it out?  Am I being too picky by insisting my students understand the difference?

One thing I do find somewhat bothersome about erasing the distinction is the abuse of notation that results.  If $F$ is an antiderivative of $f$, then it can be written
$$\int f(x) dx = F(x)$$
The constant of integration is omitted because there is no distinction between the two terms.  This is an abuse of notation, because there are actually infinitely many functions that could appear on the right, all differing form $F$ by a constant.  So the $=$ sign doesn't belong there, because it implies uniqueness.  Of course, even writing the constant of integration
$$\int f(x) dx = F(x)+C$$
can be seen as an abuse of notation, but at least it captures the idea that there are infinitely many antiderivatives of $f$ and each of them has the form on the right.  I suppose, to capture the essence of the idea, we could instead write
$$\int f(x) dx = \{F(x)|F'(x)=f(x)\}=\{G(x)+C|C\in{\mathbb R}\}$$
where $G$ is a known antiderivative of $f$, but I've never seen it written this way.

Monday, 4 November 2013

Expansion Minimization

The determinant of a square matrix $A$ is typically defined in terms of cofactor expansion along the first row of $A$.  Then cofactor expansion along other rows or along columns is defined, and shown (or stated) to be equal to the determinant.  Following that, there is usually an example of a matrix that contains at least one zero.  This example is used to illustrate that sometimes, if we choose the appropriate row or column to expand along (namely, one with zeros), we can reduce work needed to compute the determinant using cofactor expansion.

It is certainly easy to tell which row(s) or column(s) to expand along to minimize the number of cofactors it is necessary to compute at each step.  Simply expand along a row or column that has the most zeros.

But what row or column should we expand along to minimize the total number cofactors to compute overall?  Or, to put it in more practical terms, what row or column should we expand along to minimize the
 amount of writing necessary?

Something to consider is that there are two ways to get a zero in the cofactor expansion.  If $B$ is some square submatrix encountered in the cofactor expansion of $A$, then either $b_{ij}=0$ or $\det B_{ij}=0$ ($B_{ij}$ being the $(i,j)-minor of $B$) to make a cofactor.  Since determining whether or not $\det B_{ij}=0$ requires a determinant computation, we would probably want to know the answer to the question to answer the question, which leads to circular reasoning.  So we'll assume that we're basing our expansion minimization strategy purely on which entries of $A$ are 0.

Of course, the answer to the question posed here is mostly useless for practical purposes, since nobody is going to compute the determinant through cofactor expansion for all but the smallest matrices.


But since we frequently present these example and pose these problems to students, with the expectation that they'll know what the best row or column is to expand along, I wonder if there's actually some way to tell what the best path to the determinant is from the matrix itself.