Posts Tagged ‘MATLAB’

Losing Weight with … Graph Algorithms!?

June 16, 2014

weightOverTheLastYear


Problem Statement

Between the Septembers of 2009 and 2010 I participated in an incentivized diet and exercise plan. My college roommate and I had both gained weight since college, and bet each other $20 every two weeks that the slope of our weight-loss trajectory would be more negative than the other’s. We came out even, and each lost 30 lbs.

Since then, I’ve kept the weight off, but not without the same kind of tracking that helped me get there. I now use the Withings Smart Body Analyzer to upload my weight readings to the cloud and track my trends.

This didn’t exist for me in 2009, though, and I was taking down all my weigh-ins manually. I lost some of this data, and recently found its only relic: a JPG of the data plotted on appropriate axes.

How could I programmatically extract the data from the picture, recreate the raw weigh-ins it represented, and upload them to my new, snazzy Withings profile?

Just like last time, we surprisingly (or not?) get to use dmperm!

Load the file

I made the picture public back then to gloat about my progress.

I = imread('https://dl.dropboxusercontent.com/u/5355654/weightOverTheLastYear.jpg');
image(I); axis equal tight

weight_01

image(I); axis equal tight, xlim([80,180]); ylim([80,180]);

weight_02

Prepare the image

We can split out the different color channels and create a sz variable for later …

R = I(:,:,1); G = I(:,:,2); B = I(:,:,3); sz = size(R);

… as well as crop out the axes …

R(571:end,:) = 255; R(:,1:100) = 255;
G(571:end,:) = 255; G(:,1:100) = 255;
B(571:end,:) = 255; B(:,1:100) = 255;

… and check out a kind of grayscale version of the image:

D = sqrt(double(R).^2 + double(G).^2 + double(B).^2);
imagesc(D); axis equal tight; colorbar; colormap bone;

weight_03

imagesc(D); axis equal tight, xlim([80,180]); ylim([80,180])

weight_04
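(If you have the Image Processing Toolbox, rgb2gray gives the more conventional, luminance-weighted grayscale. Here’s a quick sketch just for comparison; the channel-norm D above works just as well for our purposes, and it’s what the rest of the post uses.)

% conventional grayscale, for comparison (needs the Image Processing Toolbox)
Dgray = double(rgb2gray(cat(3,R,G,B)));
imagesc(Dgray); axis equal tight; colormap bone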

Detect the blobs

You can do blob detection on your own in MATLAB in a pinch. We first make a Gaussian kernel and convolve it with our image, which helps localize packets of information that are about the size of our kernel:

k = @(x,y,t) 1 / (2*pi*t^2) * exp(- (x.^2+y.^2) / (2*t^2) );
[x,y] = meshgrid(-8:8,-8:8);
L = conv2(D,k(x,y,1),'same');
imagesc(L); axis equal tight; colorbar

weight_05

imagesc(L); axis equal tight, xlim([80,180]); ylim([80,180]);


weight_06

Then we take the Laplacian, the sum of the second derivatives in both dimensions, which helps us find the edges of these blobs. It assigns the appropriate sign to the data that is close in shape to our kernel. We do some mild trickery to keep the image the same size as the original:

zh = zeros(1,sz(2));
zv = zeros(sz(1),1);
L2 = [zh ; diff(L,2,1) ; zh] + [zv diff(L,2,2) zv];
imagesc(L2); axis equal tight; colorbar

weight_07

imagesc(L2); axis equal tight, xlim([80,180]); ylim([80,180]);

weight_08
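Incidentally, smoothing with a Gaussian and then taking the Laplacian is the classic Laplacian-of-Gaussian (LoG) blob detector, and the Image Processing Toolbox (not used in the original code above) can build that kernel in one shot with fspecial. A rough sketch of that route, with the kernel size and sigma picked to match k above; the threshold used in the next section would likely need re-tuning:

% one-step Laplacian-of-Gaussian filter (needs the Image Processing Toolbox)
h = fspecial('log',17,1);      % 17x17 window with sigma = 1, same size and scale as k above
L2log = conv2(D,h,'same');     % response comparable to the hand-rolled L2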

Adjacency matrices?

So we have “detected” our blobs, but we still need to find out where they are in the image. We do this by thresholding our Laplacian data to find the index locations of every pixel that matters to us. We can then build an adjacency matrix: every pixel we care about is “connected” to itself, and we also look at the pixels above, below, to the left, and to the right of each pixel of interest to see whether they survived the threshold too. We make the matrix, A, sparse because it is HUGE and we know that most of its entries will be zero. Why store almost half a trillion zeros?!

T = L2 > 35;
spsz = [numel(T),numel(T)];
A = logical(sparse(spsz(1),spsz(2)));

idx = find(T);
[r,c] = ind2sub(sz,idx);

A(sub2ind(spsz,idx,idx)) = 1;
A(sub2ind(spsz,idx,sub2ind(sz,r+1,c))) = T(sub2ind(sz,r+1,c));
A(sub2ind(spsz,idx,sub2ind(sz,r-1,c))) = T(sub2ind(sz,r-1,c));
A(sub2ind(spsz,idx,sub2ind(sz,r,c+1))) = T(sub2ind(sz,r,c+1));
A(sub2ind(spsz,idx,sub2ind(sz,r,c-1))) = T(sub2ind(sz,r,c-1));

Learn about the blobs

DMPERM to the rescue–we’ve made an adjacency matrix whose connected components are the blobs we care about! When we run it, we can look at each connected component, find the pixels that belong to it, and average their locations. Below, you can see each connected component and the “location” we’ve assigned to each one. It’s not perfect, but it’s really close.
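But first, to make the dmperm output concrete, here’s a tiny, self-contained sketch on a made-up five-pixel graph (the numbers are purely illustrative and aren’t from our image). In the full-size matrix below, the background pixels’ empty rows and columns should end up in the first and last blocks, which would explain why the loop skips them.

% hypothetical graph: pixels 1-2-3 form one blob, pixels 4-5 another;
% the ones on the diagonal keep dmperm's blocks lined up with the components
Atoy = sparse([1 1 0 0 0
               1 1 1 0 0
               0 1 1 0 0
               0 0 0 1 1
               0 0 0 1 1]);
[p,~,r] = dmperm(Atoy);
% numel(r)-1 blocks; p(r(k):r(k+1)-1) lists the pixels in the k-th block,
% here {1,2,3} and {4,5} in some order

Now, the real thing: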

C = zeros(size(T));
[p,q,r,s] = dmperm(A);
n = numel(r)-1;
px = nan(n-2,1);
py = nan(n-2,1);
for G = 2:n-1
    idx = p(r(G):r(G+1)-1);
    [rows,cols] = ind2sub(sz,idx);
    py(G-1) = mean(rows);
    px(G-1) = mean(cols);
    C(idx) = .25;
end
C(sub2ind(sz,round(py),round(px))) = 1;
imagesc(C); axis equal tight; colorbar

weight_09

imagesc(C); axis equal tight, xlim([80,180]); ylim([80,180]);

weight_10
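For what it’s worth, the Image Processing Toolbox (again, not used above) will do this whole blobs-to-centroids step for you. A quick sketch of that route, if you have it installed:

% toolbox route: label the thresholded pixels and average their locations per blob
cc = bwconncomp(T,4);                 % T is the thresholded Laplacian; 4-connected, like A
props = regionprops(cc,'Centroid');   % one centroid per connected blob
xy = vertcat(props.Centroid);         % rows of [x y], comparable to [px py]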

Extract the real data

With all of that work done, we can prescribe how the pixel locations relate to weights and dates, and actually collect the real data from the image:

% pixel rows 30.5 through 566.5 are the 225 lb down to 185 lb gridlines (67 px apart)
weights = interp1(30.5:67:566.5,225:-5:185,py);
% pixel columns 112 and 1097 correspond to 9-8-2009 and 12-1-2010 on the x-axis
dates = interp1([112,1097],[datenum('9-8-2009'),datenum('12-1-2010')],px);

dateStrs = datestr(dates,'yyyy-mm-dd HH:MM:SS');

f = fopen('weight.csv','w');
fprintf(f,'Date, Weight\n');
for D = 1:numel(weights)
    fprintf(f,'%s, %f\n',dateStrs(D,:),weights(D));
end
fclose(f);

plot(dates,weights,'.','MarkerSize',10), datetick

weight_11

All in all, this is a pretty gross way to get at the underlying data, but it was fun to try and get working. What have you used DMPERM for recently? What have you used MATLAB or Image Processing for recently?

Enter the Rosser matrix

January 8, 2014

I love matrices. They can encode love affairs, process images — heck, things like representation theory let us use matrices for practically anything.

Rosser Matrix

I also watch Cleve Moler‘s MathWorks blog, Cleve’s Corner, like a hawk. So when he recently posted about the Rosser Matrix I was left disappointed by what he didn’t talk about. The matrix itself is interesting because of its place in eigenvalue history. Eigenvalue: the word is just awesome. If it’s not comfortable for you, just think of the eigenvalues of a matrix like they are your…values. When you go out in the world, you make an impact and push things in the direction of the values you believe in, and certain values are more important to you than others. Matrices do the same thing with the (eigen)values they espouse.

So it’d be great if we could compute the eigenvalues of a matrix: they tell us a lot (or at least something) about who that matrix is. These days this is straightforward: there are many (even free) computational tools to do it. Back in the day, however, eigenvalues were a difficult thing to find, and some were harder than others. For example, eigenvalues that are really close to one another are hard to pin down precisely, and when an eigenvalue is repeated (that’s a thing) we’d like to find every copy of it.

So, back to Rosser. In 1950 he makes this test matrix that’s got a lot of good stuff in it, all of which they could compute exactly:

  • A double eigenvalue.
  • Three nearly equal eigenvalues.
  • Dominant eigenvalue of opposite sign.
  • A zero eigenvalue.
  • A small, nonzero eigenvalue.

Then they could benchmark proposed eigenvalue-finding algorithms (which would run for days on behemoth computers) by how close they came to the actual eigenvalues.

I love this steampunk mathematics, but the juiciest parts seemed to be left out of Cleve’s post: what algorithms were they actually using back then and (more importantly) how does one make a test matrix? It appeared that it wasn’t just Cleve leaving out the good stuff, either: MATLAB itself doesn’t tell us anything interesting about how to make the Rosser matrix:

%   Copyright 1984-2005 The MathWorks, Inc.
%   $Revision: 5.10.4.2 $  $Date: 2005/11/18 14:15:39 $

R  = [ 611.  196. -192.  407.   -8.  -52.  -49.   29.
       196.  899.  113. -192.  -71.  -43.   -8.  -44.
      -192.  113.  899.  196.   61.   49.    8.   52.
       407. -192.  196.  611.    8.   44.   59.  -23.
        -8.  -71.   61.    8.  411. -599.  208.  208.
       -52.  -43.   49.   44. -599.  411.  208.  208.
       -49.   -8.    8.   59.  208.  208.   99. -911.
        29.  -44.   52.  -23.  208.  208. -911.   99.];

After slightly more digging than expected, I found Rosser’s original paper on the subject (and an incredible bible of math I hadn’t heard of before). The first thing I noticed was that many more people were involved than just Rosser, none of whom were slouches: Lanczos has eponymous algorithms, Hestenes (with him) crushed some linear systems, and Karush killed it at nonlinear programming. Another name that deserves mention here is in the footnote below:

Miss Fannie M. Gordon, numerical analysis

There isn’t much on the internet about Miss Gordon, but it appears she was working at the INA along with Lanczos. In his paper on “his” algorithm (not yet named as such), to which the Rosser matrix paper is a direct follow-on, another footnote talks about her in much more grateful detail:

Indebted to Miss Gordon

While she didn’t go down in the record books like Lanczos and friends, it’s great to see that her work behind the scenes was appreciated and acknowledged, a part of mathematical history we don’t discuss now as much as we should. For another peek into this corner of the mathematical world, check out the list of all of the NBS/NIST staff members mentioned in A Century of Excellence in Measurements, Standards, and Technology: A Chronicle of Selected NBS/NIST Publications, 1901-2000 [Text, Google Books].

With all this information at my fingertips, I could get a much clearer picture of how to get your hands dirty and find an eigenvalue. It’s only in another appendix, however, that Rosser tells us how to actually make a test matrix, the key ingredient that was used to benchmark algorithms across decades of computational and mathematical advancement. There, at the bottom of page 293, are the 64 entries of the matrix (color coded in the image above), just as they are in rosser.m:

The original Rosser matrix

I had to see how it actually worked, so in the paste below you’ll find a MATLAB Rosser recipe, the way the sausage actually gets made (you can skip past the code to the visual explanation):

%   Created by David A. Gross. Inspired by [1].
%   Construction from [2,3].
%
%REFERENCES
%   [1] http://blogs.mathworks.com/cleve/2014/01/06/ ...
%   the-rosser-matrix/, accessed on 2014/01/07
%
%   [2] Rosser, J.B.; Lanczos, C.; Hestenes, M.R.; Karush, W.
%   Separation of close eigenvalues of a real symmetric matrix
%   (1951), J. Res. Natl. Bur. Stand., Vol. 47, No. 4, p. 291,
%   Appendix 1, https://archive.org/details/jresv47n4p291,
%   accessed on 2014/01/07
%
%   [3] T. Muir, History of Determinants III, 289 (Macmillan
%   and Co., Ltd., London, 1920), http://igm.univ-mlv.fr/ ...
%   ~al/Classiques/Muir/History_3/, accessed on 2014/01/07

% make our eigenvalues in 2x2 matrices
M1 = [102  1 ;  1 -102]; % lambda = ± sqrt(102^2 + 1)
M2 = [101  1 ;  1  101]; % lambda = 101 ± 1
M3 = [  1 10 ; 10  101]; % lambda = 51 ± sqrt(51^2-1)
M4 = [ 98 14 ; 14    2]; % lambda = 100, 0

B = zeros(8);

% explode M[1...4] into an 8x8 matrix
B([1,6],[1,6]) = M1;
B([2,8],[2,8]) = M2;
B([4,5],[4,5]) = M3;
B([3,7],[3,7]) = M4;

sylvester88_A = @(a,b,c,d) [ ...
    a  b  c  d ; ...
    b -a -d  c ; ...
    c  d -a -b ; ...
    d -c  b -a ];

sylvester44 = @(a,b,c,d) [ ...
    a  b  c  d ; ...
    b -a  d -c ; ...
    c -d -a  b ; ...
    d  c -b -a ];

% make Sylvester's "penorthogonant" of determinant 10^8
P = blkdiag(sylvester88_A(2,1,1,2),sylvester44(1,-1,-2,2));

% P'*P = 10I
R = P'*B*P;

It’s quite cool, actually. Four 2×2 symmetric matrices are constructed to have the desired eigenvalues, and those matrices are exploded into an 8×8 sparse matrix (sparse in that it’s all zeros where there aren’t any dots):

How the sausage matrix gets made

Lastly, a special matrix (magenta & yellow, above) is smashed on either side of our sparse matrix and BAM!–you’ve got yourself a full test matrix with the eigenvalues you wanted.
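If you want to convince yourself the recipe worked, here’s a quick sanity check (just a sketch, comparing against MATLAB’s built-in rosser):

% the eigenvalues should come out 10x those of the little 2x2 blocks:
% roughly -1020.05, 0, 0.098, 1000, 1000, 1019.90, 1020, and 1020.05
format long
sort(eig(R))
sort(eig(rosser))   % the built-in copy, for comparison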

There’s a lot of this that’s wonderfully clear and clever, in hindsight: how Rosser forced and hand-calculated the eigenvalues he wanted, how he kept the matrix symmetric. But there are many things that were left up to Rosser to decide, almost artistically, about how the matrix should be made. The special smashing matrix, for example, actually scales up the eigenvalues of the original four matrices by a constant factor. A guy named Sylvester said that it was easy to make it scale up by powers of 2, but that if you were careful you could make a matrix that scales up by any number you want. Rosser had to cleverly find the integer entries of that special matrix that would give him a scale that meaningfully preserved the original eigenvalues he picked (for usability and clarity) and he chose to scale them up by 10.
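Concretely, the factor of 10 falls out of the fact that P'*P = 10*eye(8): that makes P/sqrt(10) orthogonal, so P'*B*P is similar to 10*B and every eigenvalue gets multiplied by 10. A two-line check (sketch):

disp(P'*P)                                  % = 10*eye(8)
norm(sort(eig(P'*B*P)) - 10*sort(eig(B)))   % zero, up to roundoff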

Another artistic choice Rosser made was how to explode the original matrices into the sparse 8×8 matrix. What I mean is all of the following matrix explosions:

Matrix explosions

(and 2,511 other possibilities) have the same eigenvalues, but would make completely different full matrices after being smashed. Computational eigenvalue history would have looked very–well, slightly–different had Rosser picked any of these as the base for his test matrix. Maybe there’s something deeper to uncover here about his choices, but I’d like to think that Rosser loved matrices as much as I do, and that’s just one that he liked more than the rest.
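Just to drive the point home, here’s a quick sketch using one arbitrary alternative placement of the blocks (B2 is my name for it, not Rosser’s): same 2×2 blocks, different index pairs, same eigenvalues, completely different matrix after smashing.

% one alternative explosion of M1...M4 into an 8x8 matrix
B2 = zeros(8);
B2([1,2],[1,2]) = M1;
B2([3,4],[3,4]) = M2;
B2([5,6],[5,6]) = M3;
B2([7,8],[7,8]) = M4;
norm(sort(eig(B2)) - sort(eig(B)))   % same eigenvalues as B ...
norm(P'*B2*P - R,'fro')              % ... but nothing like the Rosser matrix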

If you don’t love matrices now, that’s ok. What originally started as a small coding exercise turned into a much deeper and richer look at the history of computational linear algebra (matrix algorithm stuff). I hope that some of you take the code and have fun making matrices that match up with your own values, and that others learn a little about how math was done almost 65 years ago.