source: trunk/documents/theses/dstn/review.tex @ 12282

Revision 12282, 88.0 KB checked in by dstn, 15 months ago (diff)

review: related astrometric calibration work

Line 
1
2\section{Preface}
3
4This thesis describes some of the research carried out by me and my
5colleagues in the \an group.  \an started as a collaboration between
6Sam Roweis, a Toronto computer scientist, and David W.~Hogg, a New
7York University astronomer.  An idea that began as a crazy scrawl on
8the back of a napkin---probably in a bar---eventually grew into a real
9project and began attracting collaborators and students.
10Fast-forwarding a few years, we have achieved a rare feat: a computer
11vision software system that \emph{works}, with little human
12intervention, and is being used by hundreds of real astronomers doing
13real research.  We have also branched out and explored a number of
14other promising areas where ideas in computer vision and machine
15learning can be applied to astronomical data to great benefit.
16
17
18This thesis is quite blatantly a collection of manuscripts.  I have
19done a bit more than staple them together to produce the chapters of
20this thesis, but the original tone and style of each manuscript still
21shines through.  I hope my readers will find the chapter-to-chapter
22variations a refreshing, rather than jarring, change of pace.  One
23advantage of collecting manuscripts into a thesis is that each chapter
24should largely stand alone.  Another advantage is that each manuscript
25was prepared with a list of authors, so it is simpler to identify the
26collaborators who have contributed to each chapter.  Each chapter has
27a note detailing the authors of the original manuscript and explaining
28my personal contribution to the work described.
29
30
31\section{Introduction}
32
33
34The general idea of the \an collaboration is to apply ideas in
35computer vision and machine learning to problems and data in
36astronomy.  Astronomical images are a good domain in which to explore
37ideas in computer vision, because the problems to be solved are often
38much more constrained than in general imaging, the telescopes and
39cameras are often well-understood and calibrated, large volumes of
40data and ``ground truth'' information exist, and with new telescopes
41constantly coming online and producing greater and greater volumes of
42imaging, there is a need for sophisticated computation.  I briefly
43outline astronomical imaging for computer vision researchers in
44\secref{sec:astrointro}.
45
46
47The specific idea addressed in this thesis is that pattern recognition
48approaches can be applied to astronomical images.  The goal is to
49produce a system that is able to recognize automatically the stars in
50any image of the night sky, using the information in the image pixels
51alone.  Recognizing the stars in an image is equivalent to finding the
52pointing, scale, and rotation of the image on the sky in a standard
53coordinate system.  The branch of astronomy concerned with measuring
54the positions and motions of celestial bodies is called
55\emph{astrometry}, and the task of placing a new image in a standard
56reference frame is known as ``solving the astrometry'' or
57``calibrating the astrometry'' of the image.  The \emph{blind
58astrometric calibration} task is to calibrate an image using only the
59image itself.  The broad goal of the \an project was to build a system
60that would allow us to create correct, standards-compliant astrometric
61\metadata for every useful astronomical image ever taken, past and
62future, in any state of archival disarray.  This is part of a larger
63effort to organize, annotate and make searchable all the world's
64astronomical information.
65
66
67After a brief introduction to astronomical imaging for computer
68scientists, the remainder of this chapter reviews the area of pattern
69recognition, and in particular the framework of \emph{geometric
70hashing} for object recognition in images.  One of the contributions
71of this thesis is to replace the simple hash table used in traditional
72geometric hashing with a \kdtree, so I review related work in hashing
73and other approaches for fast feature matching.  Finally, I present
74the blind astrometric calibration task as an instance of object
75recognition, and review some previous approaches to the problem.
76
77%% Our approach?
78
79
80The remaining chapters present our approach to the blind astrometric
81calibration.  \Chapref{chap:techreport} explains our approach and
82presents the results of large-scale tests of the system on real-world
83data.  Chapters \ref{chap:verify} and \ref{chap:kdtree} delve into
84details of the approach: \Chapref{chap:verify} presents the Bayesian
85decision theory problem that lies at the heart of our approach, while
86\chapref{chap:kdtree} explains the technical details of the \kdtree
87data structure implementation that is key to making our system
88performant enough to be a practical tool.  \Chapref{chap:conclusion}
89summarizes our results.
90
91
92\section{Astronomical imaging for computer scientists}
93\label{sec:astrointro}
94
95Astronomical images are very different from images produced by typical
96consumer-grade cameras.  This section is intended to give
97non-astronomers a sense of the typical parameters of contemporary
98astronomical imaging setups, and introduce some of the terminology and
99key ideas.
100
101
102The Sloan Digital Sky Survey (SDSS) \cite{sdsstechnical, sdsscamera}
103is one of the most important projects in contemporary astronomy.  The
104primary imaging survey was carried out from 2000 through 2008, so the
105hardware is no longer cutting-edge, but it is still very impressive.
106The telescope's primary mirror is $2.5$ meters in diameter.  Incoming
107light first strikes the primary mirror, then a one-meter secondary
108mirror, then passes through two corrective lenses in front of the
109camera.  The nearly distortion-free field of view is about $3$ degrees
110wide.  The camera is composed of $30$ charge-coupled devices (CCDs),
111each $2048\times2048$ pixels, arranged in six columns.  The five CCDs
112in each column have different bandpass filters, ranging from the
113near-infrared ($913~\nanometers$) through the optical to the
114near-ultraviolet ($354~\nanometers$).  The CCDs and associated
115electronics are cooled with liquid nitrogen to $-80^{\circ} \unit{C}$
116and held under vacuum.  Each CCD is about $5~\unit{cm}$ square, and
117the whole focal plane of the camera is about $65~\unit{cm}$ in
118diameter.  The camera assembly has a mass over $300~\unit{kg}$.
119
120
121The survey operates in ``drift-scanning'' mode: the telescope is
122pointed in a nearly fixed direction, and as the Earth rotates the sky
123drifts past the camera.  The camera electronics shift the electrons
124along the CCD columns at the same rate.  As an object drifts through
125the field of view, the electrons it produces in the CCD pixels march
126along with it, and are read out as it disappears from view.  In the
127SDSS camera, the object drifts past each of the five bandpass-filtered
128CCDs of the camera in turn, producing a five-color image.  Over the
129course of a night of observing, the camera sweeps out a great circle,
130creating six columns of five-band images that are $2048$ pixels wide
131and over a million pixels long: over $120$ gigabytes of pixels per
132night.  This clever drift-scanning scheme has many advantages, one
133being that the readout rate of the CCDs can be relatively slow,
134resulting in low readout noise.  Calibration of the camera is simpler
135because charge is integrated over columns of the CCD: it is only
136necessary to calibrate whole columns rather than individual pixels.
137Note that for this scheme to work, the CCDs must be very carefully
138aligned in the focal plane, and the telescope control system must be
139capable of keeping the system pointed stably at the level of a pixel
140or better.  The moving mass of the telescope is over
141$15,000~\unit{kg}$, so this is no mean feat!
142
143% 24 um pixels
144
145Astronomical imaging systems are designed so that the point-spread
146function (PSF)---the image on the CCD of a distant point source of
147light---is spread over several pixels.  This allows the position of a
148point source to be measured to sub-pixel accuracy by fitting a model
149of the PSF to the observed pixel values.  If the system focused a
150point source onto a single pixel, this would be impossible.  The goal
151is typically for the full-width at half-maximum (FWHM) of the PSF to
152be a few pixels, so that the PSF is well-sampled but the signal is not
153spread out over too many pixels.
154
155For ground-based telescopes, the moving atmosphere causes the image of
156a distant point source of light to dance around on the CCD; in typical
157exposures of many seconds this ``speckling'' is integrated into a
158point-spread function.  The size of this PSF is called the ``seeing''
159and a value of one $\unit{arcsecond}$ FWHM is considered quite good.
160The SDSS camera has pixels of size $0.396~\unit{arcseconds}$, and the
161point-spread function is dominated by the seeing: the optics of the
162system are good enough that the images are as clear as they can be
163given the atmosphere.
164
165
166The positions of astronomical objects are usually measured in an
167\emph{equatorial coordinate system}: essentially the projection of the
168Earth's latitudes and longitudes onto the celestial sphere at a
169specific instant.  The latitudinal direction is called ``declination''
170and abbreviated ``Dec'', and the longitudinal direction is called
171``right ascension'' and abbreviated ``RA''.  Declination is typically
172measured in degrees from $-90$ to $+90$, and right ascension is
173typically measured in degrees from $0$ to $360$, or hours from $0$ to
174$24$.  There are $60~\unit{arcminutes}$ per degree and
175$60~\unit{arcseconds}$ per $\unit{arcminute}$.  For reference, the
176Moon as seen from Earth is about half a degree in diameter.  The
177$0.396~\unit{arcsecond}$ pixels of the SDSS camera view an area the
178size of a penny at a distance of ten kilometers; each $2048\times2048$
179pixel CCD views an area of about one-millionth of the sky.
180
181
182The SDSS is very impressive, but the next generation of telescopes
183currently in development dwarf its specifications.  The Large Synoptic
184Survey Telescope (LSST) \cite{lsst}, for example, has an
185$8.4~\unit{meter}$ primary mirror, with a $3.5~\unit{degree}$ field of
186view almost completely covered by an array of nearly $200$ CCDs, each
187with $4096\times4096$ pixels: a total of about $3.2$ gigapixels.  The
188camera is the size of a small car.  It will take exposures of about
189$15$ seconds, yielding over $10$ terabytes of data per night.
190
191% -- basics of data reduction; catalogs
192% -- stars, galaxies, and other stuff
193
194
195
196
197\section{Pattern recognition}
198
199\emph{Pattern recognition} encompasses a huge class of problems and
200techniques in computer vision and machine learning.  Many pattern
201recognition problems are of the type ``identify this object'' or
202``find things like this'': the system is first told about a large set
203of objects, then is presented with a novel object and is asked to
204describe it or find similar objects from the set of known objects.
205
206Pattern recognition systems often take a \emph{feature matching}
207approach.  A \emph{feature} is a compact description of
208(problem-specific) relevant, invariant properties of an object, such
209that objects that are similar in the problem domain have similar
210features.  Defining how features are extracted from objects and
211defining the similarity between features are key factors in the
212success of such systems.  Often the representation of a feature (the
213\emph{feature descriptor}) is a point in a real coordinate space and a
214distance metric defines the similarity between features.
215
216Feature-based approaches typically involve two phases: an
217\emph{indexing} phase, in which the set of known objects is shown to
218the system, and a \emph{test} phase, in which novel objects are
219presented.  During indexing, features are extracted from the set of
220known objects.  At test time, features are extracted from the novel
221object and compared with the set of features from the known objects.
222In many domains, the set of known objects is very large, so it is
223critical that features be stored in a manner that allows the system to
224rapidly find known features that match a given feature.
225
226Often feature-based approaches suffer \emph{feature aliasing}
227problems: two objects that are dissimilar in the problem domain may
228yield similar features.  In this case, systems may use \emph{voting}
229or \emph{verification} schemes to ensure that aliased (false) feature
230matches are eliminated.  Voting schemes typically assume that false
231feature matches are independently randomly distributed, so the
232probability of matching several features to the same incorrect object
233is the product of the individual probabilities; if we demand a
234sufficient number of matches, the probability of a false match becomes
235small.  Verification or \emph{hypothesize-test} schemes treat a
236feature match as a hypothesis that the novel object is similar to a
237known object, and then some test to decide if the hypothesis is
238correct.  Such schemes can be seen as using feature matching as a
239search heuristic.
240
241\section{Visual pattern recognition}
242
243Pattern recognition in images is a particularly challenging and
244complex problem domain which has been studied extensively.  Objects to
245recognize include faces, fingerprints, specific physical objects, or
246even general classes of physical objects such as mugs, chairs, or
247elephants.
248
249
250Visual pattern recognition of general physical (3-dimensional) objects
251is difficult for many reasons: objects may be deformable, can appear
252different when viewed from different angles or under different
253lighting conditions, and can be occluded by other objects.  Features
254used in visual pattern recognition include edges (lines and curves),
255corners, or the color or texture of image patches.
256
257
258An important consideration when designing features for visual pattern
259recognition is the size (\ie, spatial extent in the image) of the
260feature.  \emph{Local features} gather information from a small region
261of the image, while \emph{global features} incorporate information
262from the whole image.  Local features can provide robustness to
263occlusion and deformation: if part of the object is hidden, features
264belonging to that part will be lost, but the features belonging to the
265remaining parts will still be found.  Since local features aggregate
266information from only a small portion of the image, they may be more
267noisy or less distinctive than global features.
268
269
270Since a single local feature may be insufficient to guarantee that an
271object has been correctly recognized, some systems build high-level
272features by combining multiple low-level features.  When the objects
273to be recognized are rigid or articulated (composed of multiple rigid
274parts connected by simple joints), a common approach is to build
275features that describe the relative geometry of low-level features.
276This use of \emph{geometric features} is often called \emph{geometric
277hashing}, since the result is a feature descriptor or \emph{hash code}
278which describes the geometric arrangement of the low-level features.
279
280
281
282\section{The geometric hashing framework}
283
284\begin{figure}
285\begin{center}
286% dot lamdan.dot -Tps2 -o lamdan.ps
287% ps2pdf -sPAPERSIZE=a4 lamdan.ps lamdan.pdf
288\includegraphics[height=0.8\textheight]{lamdan}
289\end{center}
290\caption{Outline of the Geometric Hashing scheme.  This diagram is a
291  reproduction of \fig 2 in Lamdan \etal \cite{lamdan1990}.
292  \label{lamdan}}
293\end{figure}
294
295
296Lamdan \etal \cite{lamdan1990} and Wolfson and Rigoutsos
297\cite{wolfson1997} give an excellent overview of the geometric hashing
298approach.  See Figure \ref{lamdan} for a block-diagram overview.
299Geometric hashing is based on the fact that when rigid objects undergo
300similarity transformations (translation, rotation, and scaling),
301angles between points on the object and relative distances between
302points do not change.  Therefore, if we describe the relative
303arrangement of points on an object in terms of these invariants, our
304description will be invariant under these transformations of the
305object.  Invariant geometric properties can be found for other types
306of transformations such as affine or perspective projection.
307
308% \cite{affineinvariant}
309
310
311Geometric hashing builds \emph{geometric features} out of lower-level
312features.  We will call the low-level features \emph{interest points}
313for clarity.
314
315Given the types of transformations the system must handle, there is
316some sufficient number of interest points required to define an
317invariant local coordinate system; these are called the \emph{basis
318  set}.  For example, for 2-D objects that can undergo similarity
319transformations, two interest points can be used to define a
320coordinate system.  For 3-D objects with similarity transformations,
321three (non-collinear) interest points are required.
322
323Geometric hashing follows an indexing approach.  During the indexing
324phase, the system preprocesses a database of objects which are to be
325recognized.  For each object, many interest points are extracted which
326are invariant under the transformations that are expected, and can be
327well localized.  The system then enumerates all basis sets by looking
328at all combinations of interest points; each set of basis features
329defines a relative coordinate system.  For each basis set, all
330remaining interest points and enumerated and their positions described
331in this coordinate system.  Each relative coordinate vector forms a
332\emph{geometric feature descriptor}.  In this way, the relative
333geometric arrangement of a set of interest points is converted into a
334point in a vector space.
335
336% --- consider adding a variant of the quad fig here.
337
338In the standard geometric hashing approach, the geometric feature
339descriptor is discretized and encoded as a hash table key. A grid is
340place over the feature descriptor space, and the hash table key of a
341geometric feature is the identifier of the grid cell containing it.
342Each entry in the hash table maps back to the basis set plus the
343feature whose position is encoded.
344
345% --- picture?
346
347Given a new image, a geometric hashing system extracts all interest
348points from the image and, as at indexing time, proceeds to enumerate
349all basis sets and computes the positions of the remaining interest
350points in each basis set.  This generates many geometric features
351whose descriptors are discretized and converted to hash table keys, as
352before.  Similar arrangements of interest points will lead to similar
353geometric feature descriptors, which--with luck---will be mapped to
354the same hash table key.  The hash table values associated with each
355key are retrieved, and each value that is found is considered a match.
356Each match votes for the correspondence between the interest points
357from the database and the interest points from the image that were
358used to define the geometric feature.
359
360
361After all features from the image have been processed, many votes will
362have accumulated.  False matches (due to aliasing or hash table
363collisions) should be uniformly distributed over the set of known
364objects, so should not generate very many votes for any particular
365object.  Objects that are actually present in the image should
366generate many votes.  Objects that receive an insufficient number of
367votes are rejected, and the remainder are subjected to a verification
368process.
369
370
371The verification process requires computing a transformation from the
372object to image coordinates, using the corresponding interest points
373from the known object and the image.  Using this transformation, all
374the interest points belonging to the known object are projected into
375image coordinates.  If the match is correct, we expect the image to
376contain features at these locations.  Some reasearchers add an extra
377verification stage by projecting the known object into the image and
378checking that the predicted edges of the object are found in the image
379(eg, \cite{lamdan1990, huttenlocher1990}).
380
381
382The geometric hashing approach is flexible and applicable to many
383problems, but there are some issues with the basic version described
384here.  The hash function described above simply places a grid over the
385relative coordinate system and returns the identifier of the grid cell
386in which the interest point lands.  This is prone to edge effects: if
387an interest point is near the edge of a grid cell, a small change in
388its position due to noise can move it across the edge into a different
389grid cell, which causes the hash code to change and a different set of
390feature matches to be found.  This problem can be overcome by
391detecting interest points that are near the edges of their cells and
392also searching the hash table using the hash codes of neighbouring
393cells.  This results in more false feature matches, and can become
394expensive as the dimensionality of the geometric feature space
395increases (there are more edges, and the proportion of the hypervolume
396of feature space that is near an edge increases).  Using a data
397structure other than the hash table may be beneficial.
398
399
400Another issue with uniform grid-based hashing is that it assigns equal
401importance (and storage space) to each grid cell in the feature space.
402Consider an interest point that is far from the basis set that defines
403its local coordinate system.  Small positional errors in the basis set
404can cause the coordinate system to be rotated or scaled, which results
405in large changes in the interest point's relative coordinates.  This
406means that edge effects become more pronounced in such regions of the
407feature space.  One response to this concern is to increases the size
408of the grid cells far from the basis set, but then if the interest
409points are uniformly distributed, these larger hash table bins will
410contain more features, so every match to these bins will be
411accompanied by more false positives.
412
413
414Finally, a hash table with equal-sized cells will only be uniformly
415utilized if interest points are uniformly distributed in feature
416space, but some domains may yield interest points that are far from
417uniformly distributed.  In other domains, it may be beneficial to
418place constraints on the geometric features that are used (for
419example, to avoid regions that are prone to large errors, or that are
420relatively undistinctive).
421
422
423There is nothing in the geometric hashing framework that requires a
424standard hash table to be used for feature matching.  Other methods of
425feature matching are described in the next section.
426
427
428\section{Related work in fast feature matching}
429
430
431For feature matching in a geometric hashing framework, the database of
432known objects is converted into a set of points in a feature
433descriptor space, which is a vector space.  To find matches to a new
434feature (extracted from an image in which we want to recognize
435objects), we need to find all points in the vector space that are
436within some tolerance.  Usually the Euclidean distance metric is
437applied, so the tolerance is a radius in the vector space.
438
439
440\subsection{Bloom filters}
441
442A Bloom filter \cite{bloom1970} is a data structure that can answer
443queries of the form ``is feature $q$ a member of the set of known
444features?''  A Bloom filter is a set of $n$ bits, initially zero.  A
445set of $k$ hash functions must be defined which take a feature as
446input and produce an integer in $[0, n)$ as output.  In the indexing
447phase, we examine each of the known features, and apply each of the
448hash functions, producing $k$ hash codes.  Each code is used to
449reference a bit and \emph{set} it (\ie, turn it on).  Each known
450feature results in $k$ bits being set; several features may request
451that a particular bit be set.
452
453
454Given a new feature to match, we run the $k$ hash functions on it and
455test whether each of the resulting bits are set.  If the feature is
456equal to a feature seen at indexing time (according to the hash
457functions), then all $k$ of the bits will have been set; thus the
458Bloom filter does not produce false negatives.  If the query feature
459is not equal to a known feature, then some of the bits may have been
460set due to \emph{collision} of the hash functions, but the feature
461will only be accepted if all $k$ hash functions result in collision;
462this results in a false positive.  The rate of false positives
463increases as the number of known features increases relative to the
464number of bits in the filter.  The Bloom filter is ideally suited to
465cases where the number of known features is small, and a verification
466step can be applied to eliminate any false positives that are
467produced.
468
469
470For feature matching, we want to know not only whether the query
471feature $q$ matches a known feature, we want to know \emph{which}
472known feature matches.  One way to answer such queries is to use a
473\emph{Bloomier filter} \cite{chazelle2004}.  The simplest Bloomier
474filter uses multiple Bloom filters to categorize a given feature into
475two different categories, and uses slightly more space than two Bloom
476filters.  A set of $n$ such Bloomier filters can be used to categorize
477a feature into $2^n$ different categories by assigning one Bloomier
478filter to each bit of the result.
479
480
481The standard Bloom filter only finds exact matches, but we want to
482find \emph{nearby} matches.  An extension of Bloom filters to allow
483proximity searches is given by Kirsch and Mitzenmacher
484\cite{kirsch2006b}.  This variant uses \emph{locality sensitive
485hashing} (LSH, discussed below) as its hashing function, so it
486inherits some limitations from LSH.  Most importantly, LSH only
487guarantees with some probability that nearby features will hash to the
488same bin.  As a result, the locality-sensitive Bloom filter can
489produce false negatives as well as false positives.
490
491
492In order to answer feature matching queries (find all known features
493near a given query feature), one could combine these Bloom filter
494extensions to build a locality-sensitive Bloomier filter.  However,
495the false negative rate of the locality-sensitive filter would be
496magnified because the Bloomier filter requires \mbox{$2
497\ceil{\log(n)}$} separate filters to represent $n$ known features, and
498all of these filters must produce the correct answer.  It seems that
499Bloom filters are not particularly well suited to this task.
500
501
502\subsection{Locality Sensitive Hashing}
503
504Locality Sensitive Hashing (LSH) \cite{datar2004, gionis1999,
505andoni2006}, is a technique for answering approximate nearest
506neighbour queries: it returns a feature whose distance is within a
507constant factor of the distance to the nearest neighbour.  The LSH
508scheme can be extended to handle exact nearest-neighbour queries
509efficiently if the set of known features satisfies some \emph{bounded
510growth} criteria.  It is intended for use in high-dimensional spaces
511(tens to thousands of dimensions), and $\ell_p$ distance metrics for
512$p \in (0, 2]$.
513
514
515The core idea of LSH is to use several hash functions with the
516property that the probability of collision---features being placed in
517the same bin---is much higher for nearby features than for distant
518ones.  In the indexing stage, we populate the hash table by running
519each hash function on each known feature.  To perform a query, we run
520the hash functions on the query feature and retrieve the features in
521the same hash table bin as the query.  One of these features is likely
522to be an approximate nearest neighbour of the query.
523
524
525In practice, in \cite{datar2004} each hash function is composed of
526several (typically $10$) simpler hash functions.  Each simple hash
527function performs a random projection in the feature space: the hash
528value is the discretized value of the dot product of the feature
529vector with the hash function's random vector.  Combining several of
530these simple hash functions is equivalent to projecting the feature
531vectors onto a grid in a $10$-dimensional, non-orthogonal, subspace.
532The hash code is simply the identifier of the grid cell in which the
533feature falls.
534
535
536This scheme suffers from edge effects: a query feature may land in a
537bin other than the bin containing its nearest neighbour due to small
538positional noise in any of the dimensions.  Attempting to reduce this
539effect by making the bin size (discretization level) larger fails
540because the number of hash functions must be increased to maintain the
541same total number of hash bins.
542
543
544In order to reduce the number of false negatives, we can choose many
545(typically $30$) different hash functions and place a reference to
546each feature in the hash table bin chosen by each function.  This
547increases the probability that at least one of the hash functions will
548select a $10$-dimensional subspace in which the query is close to its
549nearest neighbour.
550
551
552The standard LSH scheme does not seem particularly well suited for the
553purpose of feature matching in a geometric hashing system, since the
554dimensionality of the feature vectors is usually moderate ($2$ to $6$
555in the \an system). Since some feature aliasing is expected in
556geometric hashing systems, the nearest neighbour may not be the
557correct match: we would prefer to get all neighbours within a given
558search radius.
559
560
561\subsection{\Kdtrees}
562
563\begin{figure}
564  \begin{center}
565        % tree-figs.py : plot_bboxes
566    \begin{tabular}{c@{\hspace{3pt}}c@{\hspace{3pt}}c}
567      \includegraphics[width=0.31\textwidth]{kdtree-0} &
568      \includegraphics[width=0.31\textwidth]{kdtree-1} &
569      \includegraphics[width=0.31\textwidth]{kdtree-2} \\
570      \includegraphics[width=0.31\textwidth]{kdtree-3} &
571      \includegraphics[width=0.31\textwidth]{kdtree-4} &
572      \includegraphics[width=0.31\textwidth]{kdtree-5}
573    \end{tabular}
574  \end{center}
575  \caption[A \kdtree partitioning of space.]{A \kdtree partitioning of
576        space.  Each panel shows the nodes at one level of the tree.  The
577        points are 200 samples from a ring with uniformly-distributed
578        radius in $\left(0.6, 1\right)$ and uniformly-distributed angle in
579        $\left(0, \displaystyle{\frac{\pi}{2}}\right)$.  Nodes are split
580        along the dimension with the largest range, and the splitting
581        position is chosen so that the two child nodes will contain an
582        equal number of points.  Note how the tree adapts to the
583        distribution of the data: each node tightly bounds the data points
584        it owns.  This is a slightly different version of the \kdtree than
585        Bentley's original description.\label{kdtree}}
586\end{figure}
587
588The $k$-dimensional tree (\kdtree) was introduced by Bentley in 1975
589\cite{bentley1975}, and several extensions have been described
590\cite{friedman1977, sproull1991}.  The idea is to build a binary tree
591over the feature space, where each node ``owns'' a region of the space
592which is disjoint from the region owned by its sibling node.  Each
593non-leaf node has an axis-aligned splitting hyper-plane; its left
594child owns the subspace to the left of the splitting plane, and the
595right child the subspace to the right.  In this way, the \kdtree
596defines a hierarchical subdivision of the feature space into
597axis-aligned hyper-rectangles.  See \figref{kdtree}.
598
599
600For the purposes of indexing the features in a geometric hashing
601system, the set of features is known beforehand and is static, so we
602do not need to insert or delete features from the tree, and Bentley's
603\emph{optimized} \kdtree construction method can be used.  That is,
604the tree can be built to adapt to the particular set of features we
605have.
606
607
608Tree construction proceeds by first assigning all features to the root
609node, then recursively splitting nodes until each leaf node owns at
610most $L$ features, with $L$ perhaps $10$ or $20$.  Splitting is done
611by selecting a splitting plane and using it to partition the features.
612Sproull \cite{sproull1991} enumerates several strategies for choosing
613the splitting hyperplane.  Traditionally, \kdtrees use axis-aligned
614splitting hyperplanes, and the splitting dimension is chosen by simply
615iterating through the dimensions (as in Bentley's original paper), or
616by selecting the dimension with the largest \emph{range} or
617\emph{variance}.  One can also use an arbitrary splitting hyperplane,
618for example by finding the principal eigenvector of the covariance
619matrix of the feature vectors.  Typically the splitting hyperplane is
620positioned so that the two children own an equal number of data
621points.  This leads to a balanced (and therefore short) tree, but in
622some applications it can be beneficial to bias the splitting location
623\cite{macdonald1990}.
624
625
626Depending on the application, it can be beneficial to store at each
627node the tight axis-aligned bounding box (and other summary statistics
628\cite{deng1995}) of the points owned by that node (as shown in
629\figref{kdtree}).  This is particularly useful where there is
630significant correlation between the dimensions of the data, because
631splitting the data points along one dimension implies that the
632bounding box will shrink in other dimensions as well, and this can
633result in faster searches.  If the bounding boxes are stored, then it
634is no longer strictly necessary to store the splitting hyperplane,
635though it may be useful to speed up some computations.
636
637
638\comment{
639\begin{figure}
640  \begin{center}
641    \includegraphics[width=\textwidth]{rangesearch-bb-fig}
642  \end{center}
643  \caption[\Kdtree range search with bounding boxes.]{\Kdtree range
644search for a query point $q$ in \kdtree node $N$, for a \kdtree that
645stores the bounding box of each node.  We check whether the query
646hypersphere overlaps the child nodes $L$ and $R$.  In both cases $q$
647is contained in the left child node $L$ so we must recurse on $L$.  In
648the top panel, the distance \emph{in the splitting dimension} from $q$
649to the bounding box of node $R$ is less than the search radius $r_1$
650so we do not need to recurse on $R$.  In the lower panel, the distance
651in the search dimension to the query is larger than the search radius
652$r_2$ so we must recurse on $R$.}
653  \label{rangesearch-bb}
654\end{figure}
655
656
657\begin{figure}
658  \begin{center}
659    \includegraphics[width=\textwidth]{bohm2001fig7}
660  \end{center}
661  \caption[Lower and Upper Distance Bounds in a \kdtree.]{Lower and upper
662        bounds (min- and max-dist, respectively) on the distance from a query point
663        to a \kdtree bounding box.
664        This figure is copied from \cite{bohm2001} Figure 7.}
665  \label{upperlower}
666\end{figure}
667 
668
669\begin{figure}
670  \begin{center}
671    \includegraphics[width=\textwidth]{rangesearch-split-fig}
672  \end{center}
673  \caption[\kdtree range search with splitting planes.]{\kdtree range
674        search for a query point $q$ in \kdtree node $N$,
675        for a \kdtree that only stores the splitting plane (not the bounding boxes).
676        Top: since the query is to the left of $N$'s splitting plane (dashed line),
677        we must recurse on the left child $L$.  Since the distance in the splitting
678        dimension from the query to the splitting plane, $D_R$ is less than $r$,
679        we must recurse on $R$.
680
681        Bottom: For the left node ($L$): the query is above the splitting plane, so we must recurse
682        on child $T$.  The distance $D_B$ in the splitting dimension to the splitting plane
683        is less than $r$, so we must recurse on $B$ also.
684       
685        For the right node ($R$): the query is above the splitting plane, so we must recurse
686        on child $t$.  The distance $D_b$ to the splitting plane is greater than $r$ so we
687        need not recurse on node $b$.  (Note, we could also keep track of the \emph{total}
688        distance to the query point, $\sqrt{D_R^2 + D_b^2}$, but this requires more book-keeping.}
689  \label{rangesearch-sp}
690\end{figure}
691}
692
693For feature matching, we typically want to do \emph{range search}:
694find all features within radius $r$ of a given query feature $q$.
695This is easily achieved with a \kdtree using a recursive algorithm.
696Starting at the root, we must decide whether we need to recurse on
697each of the node's children.  (How we make this decision is explained
698below.)  Once we reach a leaf, we enumerate the features owned by the
699leaf and compute the distance to the query feature; any feature with
700distance less than $r$ is accepted.
701
702
703We decide whether we must recurse on a node's children based on their
704bounding boxes.  We compute the \emph{mindist}---the lower-bound of
705distance---between the query point and each child's bounding box.  The
706mindist is simply the distance to the corner or edge of the bounding
707box, and can be computed quickly.  Any node whose mindist to the query
708point is less than the query radius $r$ must be visited.  If the
709\kdtree does not explicitly store the tight bounding-boxes of the
710nodes, loose bounding-boxes can be constructed during the recursion
711based on the splitting planes of the ancestors.
712
713
714\comment{
715Nearest-neighbour search in \kdtrees is essentially the same as range
716search, except that $r$ is considered the \emph{pruning distance}; any
717nodes whose lower-bound distance is larger than the pruning distance
718need not be visited.  The pruning distance decreases monotonically as
719the search proceeds and is simply the (upper bound of) the distance to
720the nearest neighbour found so far.  Since reducing the pruning
721distance is critically important to achieving fast search, the order
722in which nodes are explored is important; one typically keeps a
723priority queue of nodes to visit, sorted by the (lower-bound) distance
724of the node to the query feature.  If the query feature falls in the
725region owned by one of the leaf nodes, then all nodes on the path from
726the root to that leaf will have lower-bound distance of zero, so this
727path will be explored first and the ``greedy'' nearest neighbour
728distance will become the first pruning distance.
729
730
731The \kdtree can also be used for approximate nearest-neighbour or
732range searches.  Beis and Lowe \cite{beis1997} describe the Best Bin
733First algorithm, which is essentially the same as the full
734nearest-neighbour search except that the search terminates after
735exploring a fixed maximum number of leaves or after a fixed period of
736time has elapsed.  The priority queue is sorted by the lower-bound
737distance of the node's bounding box to the query point, so the `most
738promising' nodes are explored first.
739}
740
741
742Chapter \ref{chap:kdtree} presents the \kdtree in more detail.
743
744
745\subsection{Other approaches}
746
747% Voronoi:
748% -de Berg 1997
749%   (Mark de Berg , Marc van Kreveld , Mark Overmars , Otfried Schwarzkopf, Computational geometry: algorithms and applications, Springer-Verlag New York, Inc., Secaucus, NJ, 1997)
750% -Edelsbrunner 1987
751%   (Herbert Edelsbrunner, Algorithms in combinatorial geometry, Springer-Verlag New York, Inc., New York, NY, 1987)
752% -Preparata and Shamon 1985
753%   (Franco P. Preparata , Michael I. Shamos, Computational geometry: an introduction, Springer-Verlag New York, Inc., New York, NY, 1985)
754
755In contrast to the \kdtree's rectangular decomposition of space, there
756is a family of space decompositions that use hyperspheres.  An example
757is the \emph{ball-tree} \cite{uhlmann1991b}, in which each non-leaf
758node is associated with one of the known features $f$ and a radius
759$r$; the left child owns all features within radius $r$ of feature
760$f$, and the right child owns the remaining features.  Similarly, in
761the \emph{anchors hierarchy} \cite{moore2000}, each non-leaf node is
762represented by a feature (called its \emph{anchor}), and a node owns
763all the features that are closer to its anchor than the anchor of its
764sibling.  These sphere-based trees are supposed to better capture the
765structure of high-dimensional data sets, though the results seem to
766depend quite strongly on the particular data distributions and search
767parameters.
768
769
770For two-dimensional feature spaces, algorithms based on planar point
771location (eg, \cite{kirkpatrick1983}) and the Voronoi tesselation of
772space yield $\mathcal{O}(n \log n)$ preprocessing time, a data
773structure of size $\mathcal{O}(n)$, and query time of
774$\mathcal{O}(\log{n})$.  Unfortunately, in higher dimensions the space
775required to store the Voronoi tesselation is
776$\mathcal{O}(n^{\floor{d/2}})$, which quickly becomes untenable
777\cite{arya1998}.
778
779% planar point location:
780% -Dobkin and Lipton in 1976 O((log n)^2) time, O(n^2) space
781% -Sarnak and Tarjan O(n) space, O(log n) time
782%    * Neil Sarnak, Robert E. Tarjan (1986). "Planar Point Location Using Persistent Search Trees". Communications of the ACM 29 (7): 669-679.
783% -Edelsbrunner, Guibas, and Stolfi, same
784%    * Herbert Edelsbrunner, Leonidas J. Guibas, Jorge Stolfi (1986). "Optimal point location in a monotone subdivision". SIAM Journal on Computing 15 (2): 317-340.
785% -Kirkpatrick, same
786%    * David G. Kirkpatrick (1983). "Optimal Search in Planar Subdivisions". SIAM J. Comput. 12: 28-35.
787% Voronoi tesselation:
788% -B. Delaunay, Sur la sphère vide, Izvestia Akademii Nauk SSSR, Otdelenie Matematicheskikh i Estestvennykh Nauk, 7:793-800, 1934
789
790The design of spatial indexing data structures and search algorithms
791is itself a large research area.  Dozens of different tree structures
792have been proposed: in two surveys B\"ohm \cite{bohm2001} and
793Hjaltason \cite{hjaltason2003} identify B-, $\textrm{B}^{+}$-, ball-,
794bisector-, BSP-, DABS-, fq-, gh-, \mbox{GNA-,} hB-, $\textrm{hB}^{\pi}$-,
795hybrid-, IQ-, kd-, kd-B-, $\textrm{LSD}^h$-, M-, mb-,
796$\textrm{mb}^{\ast}$-, mvp-, oct-, \mbox{post-office-,} pyramid-, quad-, R-,
797$\textrm{R}^{\ast}$-, $\textrm{R}^{+}$-, sa-, slim-, \mbox{sphere-,}
798SR-, SS-, TV-, vp-, $\textrm{vp}^{\ast}$-, and X-trees.  There is an
799equally mind-boggling variety of exact and approximate search
800algorithms.  Each data structure and algorithm may work well in some
801region of parameter space, but there is no clear winner for
802general-purpose use.
803
804
805\section{Astrometric calibration as a pattern recognition task}
806
807
808\comment{
809wget "http://casjobs.sdss.org/ImgCutoutDR7/getjpeg.aspx?ra=166.45&dec=-0.03&scale=1&opt=&width=2000&height=2000" -O ngc3521-orig.jpg
810jpegtopnm ngc3521-orig.jpg | pnmrotate -45 | pnmcut 600 900 1400 1000 | pnmscale -reduce 2 | pnmtojpeg > ngc3521.jpg
811#---> http://live.astrometry.net/status.php?job=alpha-200906-68444159
812jpegtopnm ngc3521.jpg | ppmtopgm | pnminvert | pnmtojpeg > ngc3521-bw.jpg
813wget "http://live.astrometry.net/status.php?job=alpha-200906-36181848&get=field.xy.fits" -O ngc3521.xy
814wget "http://live.astrometry.net/status.php?job=alpha-200906-36181848&get=index.xy.fits" -O ngc3521-index.xy
815jpegtopnm ngc3521-bw.jpg | plotxy -N 100 -i ngc3521.xy -I - -x 1 -y 1 -C black -b white > ngc3521-sources.png
816jpegtopnm ngc3521-bw.jpg | plotxy -N 100 -i ngc3521.xy -I - -x 1 -y 1 -C black -b white -P | plotxy -I - -i ngc3521-index.xy -x 1 -y 1 -C black -b white -s crosshair -P | plot-constellations -f 18 -w ngc3521.wcs -i - -N -o ngc3521-index.png
817%%% wget "http://live.astrometry.net/status.php?job=alpha-200906-36181848&get=wcs.fits" -O ngc3521.wcs
818%%% scp gmaps:/data2/test-merc/tycho.mkdt.fits .
819wget "http://explore.astrometry.net/tile/get/?layers=tycho,grid,userboundary&arcsinh&wcsfn=alpha/200906/36181848/wcs.fits&gain=-0.5&bb=0,-85,360,85&dashbox=0.1&w=500&h=500&lw=3" -O ngc3521-zoom0.png
820wget "http://explore.astrometry.net/tile/get/?layers=tycho,grid,userboundary&arcsinh&wcsfn=alpha/200906/36181848/wcs.fits&gain=-1&bb=175.533,-17.7621663832,211.533,17.6598478619&dashbox=0.01&w=500&h=500&lw=3" -O ngc3521-zoom1.png
821wget "http://explore.astrometry.net/tile/get/?layers=tycho,grid,userboundary&arcsinh&wcsfn=alpha/200906/36181848/wcs.fits&gain=0.5&bb=191.733,-1.85338140354,195.333,1.74602498613&w=500&h=500&lw=3" -O ngc3521-zoom2.png
822for x in 0 1 2; do
823 pngtopnm ngc3521-zoom${x}.png | ppmtopgm | pnminvert | pnmtopng > ngc3521-zoom${x}-bw.png;
824done
825}
826
827
828% ~/an-2/usnob-map/execs/tilerender -x 0.000000 -y -85.000000 -X 360.000000 -Y 85.000000 -w 1024 -h 1024 -l 'tycho' -l 'grid' -l 'boundary' -s -g -0.5 -W 'tor/200706/51145570/wcs.fits' -L 5 -B 0.1 -d > tile1.png
829% ~/an-2/usnob-map/execs/tilerender -x 151.724000 -y -4.784223 -X 187.724000 -Y 29.772221 -w 1024 -h 1024 -l 'tycho' -l 'grid' -l 'boundary' -s -g -0.25 -W 'tor/200706/51145570/wcs.fits' -L 5 -B 0.01 -d > tile2.png
830% ~/an-2/usnob-map/execs/tilerender -x 167.924000 -y 11.335527 -X 171.524000 -Y 14.841399 -w 1024 -h 1024 -l 'tycho' -l 'grid' -l 'boundary' -s -g 0.5 -W 'tor/200706/51145570/wcs.fits' -L 5 > tile3.png
831%
832% /home/gmaps/an-2/quads/plotxy2 -i /home/gmaps/ontheweb-data/tor/200706/51145570/index.xy.fits -S 1 -W 720 -H 503 -x 1 -y 1 -w 2 -r 6 -s s > ixy.pgm
833% /home/gmaps/an-2/quads/plotxy2 -i /home/gmaps/ontheweb-data/tor/200706/51145570/field.xy.fits -S 1 -W 720 -H 503 -r 5 -x 1 -y 1 -w 2 -s c > fxy.pgm
834% cp /home/gmaps/ontheweb-data/tor/200706/51145570/image.pnm undim.ppm
835% pgmtoppm green ixy.pgm > ixy.ppm
836% pnmcomp -alpha=ixy.pgm ixy.ppm undim.ppm > sum.ppm
837% pnmcomp -alpha=fxy.pgm fxy.ppm sum.ppm > sum2.ppm
838% pnmtopng sum2.ppm > sum.png
839% pnmcomp -alpha=fxy.pgm fxy.ppm undim.ppm | pnmtopng > sources.png
840% http://oven.cosmo.fas.nyu.edu/test/status.php?job=tor-200706-51145570
841
842
843\begin{figure}
844\begin{center}
845\framebox{\includegraphics[width=0.99\figunit]{ngc3521-bw}} \\
846\framebox{\includegraphics[width=0.99\figunit]{ngc3521-sources}} \\
847\framebox{\includegraphics[width=0.99\figunit]{ngc3521-index}}
848\end{center}
849\caption{\captionpart{Top:} Input image (credit: Sloan Digital Sky
850Survey).  \captionpart{Middle:} The brightest 100 sources extracted
851from the image.  \captionpart{Bottom:} Reference sources, transformed
852into the image coordinate system (crosshairs).  Many of the image and
853reference sources are aligned, but there are many image sources
854without reference sources.  Our system knows about the positions of
855many objects of interest on the sky, and has labelled the galaxy NGC
8563521.\label{fig:redgreen}}
857\end{figure}
858
859
860\begin{figure}
861\begin{center}
862\begin{tabular}{c@{\hspace{1pt}}c@{\hspace{1pt}}c}
863\framebox{\includegraphics[width=0.31\textwidth]{ngc3521-zoom0-bw}} &
864\framebox{\includegraphics[width=0.31\textwidth]{ngc3521-zoom1-bw}} &
865\framebox{\includegraphics[width=0.31\textwidth]{ngc3521-zoom2-bw}}
866\end{tabular}
867\end{center}
868\caption{The location of the input image on the sky.
869  \captionpart{Left:} The whole sky, in Mercator projection.  The
870  sinusoid-shaped feature is the Milky Way.  The dashed box shows the
871  zoomed-in region.  \captionpart{Middle:} Zoomed in by a factor of
872  $10$.  \captionpart{Right:} Zoomed in by a factor of $100$.  The box
873  shows the outline of the input image.\label{fig:onthesky}}
874\end{figure}
875
876
877For modern astronomers, astrometric calibration is often one of the
878first steps toward getting useful information out of an image of the
879sky.  Aligning a new image with an \emph{astrometric reference
880catalog} allows the astronomer to place the image within a standard
881coordinate frame.  This allows stars, galaxies, and other objects
882(\emph{sources}) in the new image to be identified with known sources,
883which in turn allows astronomers to calibrate other properties of the
884new image, and allows the positions of new sources to be described in
885a standard reference frame.
886
887
888The task of blind astrometric calibration---automatically finding the
889astrometric calibration of an image, using only the information in the
890image pixels---can be seen as a pattern recognition problem.  As
891Bertin \cite{bertin2005} notes, ``astrometric and photometric
892calibrations have remained the most tiresome step in the reduction of
893large imaging surveys,'' so this is not only an interesting problem to
894solve, but one with practical implications for astronomers.
895
896
897For the purposes of astrometric calibration, we can think of the sky
898as a large two-dimensional surface: the stars are very distant, so our
899viewpoint is effectively fixed.  We are moving, as are the stars, but
900these motions are small relative to the precision at which we
901typically work.  The sky contains many stars, galaxies, and other
902astronomical sources.  The stars and distant galaxies are effectively
903point sources, while closer galaxies can be resolved.  Astrometric
904reference catalogs list the positions, motions, and brightnesses of
905these sources and serve as the ``ground truth'' or database of known
906(reference) objects.  The USNO-B1 catalog \cite{usnob, nomad}, for
907example, lists over one billion objects.  As many as a few percent of
908these are false detections or other artifacts \cite{barroncleaning},
909and some objects that should be visible are missing.
910
911
912Extra sources can be due to planets, comets, satellites, or aircraft.
913Missing or poorly localized source can be due to imperfections in the
914imaging sensor, saturation, cosmic ray interference, or (rarely)
915occlusion.  Errors in the image processing that detects sources can
916lead to sources being gained or lost.  We call sources that appear in
917only the input image or reference catalog \emph{distractors} and
918\emph{dropouts}, respectively.  The existence of distractors and
919dropouts means that we can never assume that all the objects in the
920reference catalog will be contained in an image to be recognized, or
921vice versa.
922
923
924The images to be recognized are subregions of the sky.  Image sizes
925range from nearly half the celestial sphere down to $10^{-7}$ of the
926area and smaller.  The input images measure unknown bands of the
927electromagnetic spectrum, and various nonlinear functions may have
928been applied to the pixel values.  We cannot rely on absolute
929brightness or color to recognize individual stars or galaxies.  At
930best we can hope that there is some positive correlation in the
931relative brightness ordering of objects in the image and the
932corresponding objects in our catalog.
933
934
935Blind astrometric calibration is an ideal task for exploring geometric
936ideas in pattern recognition.  Most celestial objects are effectively
937point sources, and can be found and localized to sub-pixel accuracy
938using relatively simple image-processing procedures.  But since the
939individual features are characterized only by their positions and
940brightnesses, we must examine collections of features in order to
941build distinctive patterns.  In \chapref{chap:techreport} we present
942\an, which applies the geometric hashing framework to the task of
943blind astrometric calibration.  An example of our results in shown in
944\figs \ref{fig:redgreen} and \ref{fig:onthesky}.
945
946
947\comment{
948There are several useful applications for a blind astrometry solver.
949Most telescopes used by professional astronomers are
950computer-controlled and automatically record the telescope pointing in
951the image header, but these systems occasionally fail.  Feeding the
952images to a blind astrometry solver would allow an independent
953verification that the \metadata written by the telescope control
954software are correct.  Second, some observatories have large archives
955of photographic plates taken by astronomers over many decades, and a
956growing fraction of these have been digitized.  These images could be
957useful to researchers, since they provide a longer baseline for
958measuring changes over time.  However, information about where the
959telescope was pointing has often been lost, or is only available as a
960hand-written note on the edge of the photographic plate, or as an
961entry in a logbook.  A blind astrometry solver would be an invaluable
962tool for easily providing accurate \metadata about these images.
963Finally, these is a large and enthusiastic community of amateur
964astronomers who produce many images that would be useful to
965professional astronomers, but since amateurs are seldom interested in
966recording exact \metadata, these images are not readily available to
967professional astronomers.  A blind astrometry solver would unlock this
968trove of pixels.
969
970Other interesting applications include creating a customized
971star-finding chart overlayed on the input image, and controlling a
972telescope by using the output of a blind solver as feedback to a
973control system.
974
975\begin{figure}
976  \begin{center}
977        %\begin{tabular}{c@{\hspace{1pt}}c@{\hspace{1pt}}c}
978        \includegraphics[width=\figunit]{moon} \\
979        \includegraphics[width=\figunit]{saturated}
980        %\end{tabular}
981  \end{center}
982  \caption{Astronomical images with occlusion.  \captionpart{Top:} The
983moon, buildings, and mountains occlude the stars behind them (image
984copyright Johannes Schedler, \texttt{http://panther-observatory.com}).
985\captionpart{Bottom:} Saturation and diffraction spikes due to bright
986objects can obscure nearby objects.}
987\label{moon}
988\end{figure}
989
990Since most celestial objects are effectively point sources, the
991obvious image feature to use is an individual object.  Feature
992detection is relatively straightforward, and the positions of features
993can be measured with subpixel accuracy, but the feature descriptor
994essentially contains only brightness and position; individual features
995are not distinctive.
996
997Although occlusion is typically not a problem in astronomical images,
998it does occasionally occur (see Figure \ref{moon}).  In addition,
999bright objects in the image can cause diffraction spikes or saturation
1000over a large region of the image, and this can cause nearby objects to
1001be hidden.
1002}
1003
1004
1005
1006\section{Related work in astrometry}
1007
1008There seem to be two distinct groups of researchers who have worked on
1009astrometric calibration.  The first are professional astronomers,
1010whose images are typically of small angular extent, long exposure
1011time, and high quality.  They typically assume that there is a good
1012initial guess of the astrometry and the goal is to produce a very
1013accurate astrometric calibration, including image distortion.  The
1014second group of researchers are spacecraft engineers who want to use
1015the stars to estimate the attitude of a camera attached to a
1016spacecraft.  Here the images are of wide angular extent, have short
1017exposure time, and are very noisy.  Primary concerns include weight,
1018power consumption, and robustness (especially in avoiding false
1019positives), while a high degree of accuracy is neither required nor
1020possible given the hardware.
1021
1022\subsection{Non-blind astrometric calibration}
1023
1024Non-blind astrometry requires that an initial estimate of the image
1025astrometry be provided.  Groth \cite{groth1986} presents an algorithm
1026for matching two lists of coordinates (eg, image coordinates in pixels
1027and reference star coordinates on the celestial sphere), assuming that
1028the lists contain a significant proportion of objects in common.  With
1029the limited computing resources available at the time, he suggests
1030taking the brightest 20 or 30 objects in each list.
1031
1032
1033Once the two lists of objects have been compiled and the brightest
1034objects selected, all sets of three objects are enumerated and the
1035triangles they form are described by a scale-, rotation-, and
1036translation-invariant descriptor, composed of the length ratio of the
1037longest to shortest edges, plus the cosine between these edges and the
1038\emph{sense} or \emph{parity} of the triangle.  The tolerances
1039associated with these features are computed (by propagating their
1040positional errors through the feature descriptor process) and stored,
1041along with the logarithm of the triangle's perimeter.  Triangles with
1042large length ratio are rejected since they are relatively
1043undistinctive.
1044
1045
1046After triangle features are extracted from both point lists, feature
1047matching is performed by checking whether the distance between each
1048pair of features is less than their corresponding tolerances.  This
1049process is accelerated by sorting the features on one of the feature
1050dimensions.  If multiple matches are found for a particular feature,
1051only the closest is considered.  After all the features have been
1052compared, many correct matches should be found, along with some false
1053matches.  For each match, the difference of the log-perimeters of the
1054two triangles is computed.  This gives the relative scales of the two
1055triangles and hence their coordinate frames, assuming the match is
1056correct.  False matches are rejected by iterative outlier detection:
1057the mean difference of log-perimeters is computed and matches far from
1058the mean are rejected.
1059
1060
1061This approach is essentially an application of the geometric hashing
1062method, though instead of using hashing to accelerate feature
1063matching, the features are simply kept in lists that are sorted on one
1064dimension.  Much of the subsequent work in this area follows
1065essentially the same path (eg, \cite{valdes1995}, \cite{dong2006}).
1066
1067
1068P\'al and Bakos \cite{pal2006} adapt the triangle-matching approach to
1069images containing many more objects (of order $10^4$).  Since it would
1070become prohibitively expensive to enumerate all triangles in such an
1071image, they reduce the number of triangles created by using only the
1072triangles created by a Delauney triangulation.  This vastly reduces
1073the number of triangles created, but makes the algorithm more
1074sensitive to dropout and distractor stars: a single extra star causes
1075a completely disjoint set of triangles to be created in its
1076neighbourhood.  To compensate for this shortcoming, they define an
1077\emph{extended Delauney triangulation}: for each point, they select
1078all points at distance $\ell$ in the Delauney triangulation, and
1079triangulate this set.  This process is repeated for each point.  The
1080first extended triangulation ($\ell = 2$) skips over the nearest set
1081of stars and builds larger triangles by using the next-further set of
1082stars.  The intent is that if some of the nearby stars are distractors
1083that they will be ignored when the extended triangulations are used.
1084
1085\begin{figure}
1086\begin{center}
1087\includegraphics[width=\figunit]{pal-fig2}
1088\end{center}
1089\caption{The triangle parameterization used by P\'al and Bakos.  This
1090figure is a reproduction of \fig 2 in \cite{pal2006}.\label{pal}}
1091\end{figure}
1092
1093P\'al and Bakos introduce a new triangle parameterization which is
1094continuous and sensitive to chirality (parity); see \figref{pal}.
1095This two-dimensional parameter space is used as the geometric feature
1096space.  When matching triangles between the two images, they demand a
1097\emph{symmetric point match}: each triangle must be the other
1098triangle's nearest neighbour in feature space.  Matching two images is
1099done by creating the lists of triangles and attempting to find
1100symmetric point matches.  This process is accelerated by sorting each
1101list by one of the coordinates.  Each triangle match is considered to
1102be a vote for the correspondence of the three pairs of points
1103composing the triangles.  These votes are accumulated in a sparse
1104matrix where element $(i, j)$ contains the number of votes for a
1105correspondence between object $i$ in the first image and object $j$ in
1106the second image.  After all matches are considered, the top 40\% of
1107the nonzero matrix elements are assumed to contain the true
1108correspondences.  A transformation based on these correspondence is
1109computed and the unitarity of the transformation matrix is used to
1110judge whether the match is true or false.
1111
1112
1113A different approach is taken by Kaiser \etal \cite{kaiser1999}.  They
1114assume they are given two lists of source positions that differ by a
1115rigid transformation involving scaling, rotation, and translation.  In
1116each image, they iterate over each pair of points and compute the
1117vector difference of their positions.  For each list, the log-length
1118and angle of the pairwise difference vectors are accumulated in a
1119two-dimensional histogram.  Observe that if the whole list of points
1120is scaled up by a constant factor, then the log-distance between each
1121pair of points increases by a constant amount.  Similarly, if the
1122whole list is rotated then the angles shift by a constant amount.
1123Once the two histograms have been computed, their cross-correlation is
1124computed.  If the two lists contain a significant number of
1125corresponding points, the cross-correlation signal will be strongest
1126at a shift corresponding to the difference in log-scale and rotation
1127between the lists.  This process is similar to a Hough transform
1128\cite{duda1972,ballard1981}, except that instead of finding the peak
1129of a single parameter-space histogram, we are searching for a peak in
1130the similarity of two histograms as we shift them with respect to each
1131other in parameter space.
1132
1133
1134Once the scaling and rotation between the lists has been found, the
1135translation can be found by scaling and rotating one of the lists into
1136the frame of the other list, then histogramming the vector difference
1137between points across the two lists.  This is a standard (generalized)
1138Hough transform.
1139
1140
1141\subsection{Blind astrometric calibration}
1142
1143
1144The majority of previous work on blind astrometric calibration is
1145motivated by the problem of spacecraft attitude estimation.  Sometimes
1146called the ``lost in space'' or ``stellar gyroscope'' problem, the
1147task is to estimate the pose of a spacecraft by using an image of the
1148sky captured by an onboard camera.  Although similar in general
1149spirit, the requirements and limitations of this application are quite
1150different than astronomical applications.  Mass, power consumption,
1151and robustness of the system are primary concerns, and as a result the
1152optical designs are very different from science-grade astronomical
1153instruments, and the available processor and memory resources are very
1154restricted.  Typical fields of view are tens of degrees across, and
1155the exposure time is kept short to allow the system to function while
1156the spacecraft is rotating.  As a result, image quality is typically
1157quite poor: often only a handfull of the brightest stars are visible.
1158Since the field of view is large, a reference catalog of a few
1159thousand stars is sufficient to ensure that any view of the sky
1160contains many reference stars.  Since the system will only process
1161images from a single camera, the whole system can be customized and
1162calibrated to that camera.  For example, the nonlinear distortions of
1163the optical system can be measured, and the scale and bandpass of the
1164imaging system are known, so the reference catalog can be tailored to
1165match.
1166
1167
1168For example, Liebe \etal \cite{liebe2004} describe a system design
1169with a $56~\deg$ field of view and exposure time of $50~\unit{ms}$.
1170The resulting images contain tens of stars if the spacecraft is not
1171rotating, but on average only three stars will be detectable when the
1172rotation rate is $50~\deg/\unit{sec}$.  The paper does not describe a
1173particular algorithm for star identification, with the implication
1174that it is not a particularly difficult problem since absolute
1175brightness information will be available, and the total number of
1176stars that are visible to the camera is only a few hundred.
1177
1178
1179In earlier work \cite{liebe1993}, Liebe describes a star
1180identification system.  The reference catalog is composed of the 1539
1181brightest stars, with brightness calibrated to the camera used in the
1182system.  The field of view is $30~\degrees$, and the system uses a
1183feature-matching approach, using the brightest star in the field and
1184its two nearest neighbours to define a geometric feature.  The feature
1185descriptor is composed of the distance from the brightest star to its
1186nearest and second-nearest neighbours, along with the angle between
1187these neighbours.  Note that this feature is not scale-invariant,
1188because the system is only meant to solve images taken by one camera,
1189so the scale is known.  An index is constructed by enumerating all
1190such features that could possibly be detected, given the detection
1191limits of the camera.  These features are coarsely quantized and
1192stored in a table.  To account for noise in the feature descriptors,
1193all neighbouring cells in the quantized feature space are also stored
1194in the table.  This generates $185,000$ features.  At test time, the
1195features in the image are enumerated and the table of features is
1196scanned; an exact feature match in the quantized space is assumed to
1197be correct.
1198
1199
1200This approach is a fairly straightforward geometric hashing technique,
1201except that it does not use hashing as such, and there is no voting or
1202verification scheme because feature aliasing is assumed not to happen.
1203
1204\begin{figure}
1205\begin{center}
1206\includegraphics[width=\figunit]{grid}
1207\end{center}
1208\caption{A grid-based feature: two sources are used to define a local
1209  coordinate system which is discretized into a grid of cells.  Each
1210  cell becomes a bit in the feature descriptor; if a cell is occupied
1211  its bit is set.  This figure is copied from MacKay
1212  \cite{mackay2005}.}
1213\label{gridbased}
1214\end{figure}
1215
1216In contrast to systems that build features from the precise locations
1217of a small number of stars, Padgett and Kreutz-Delgado
1218\cite{padgett1997} present an approach that incorporates information
1219from a large portion of the image.  Their grid-based feature is
1220defined by first choosing one star as the reference star to define the
1221center of the grid, then selecting its nearest neighbour (outside an
1222exclusion radius) to define the orientation of the grid.  Once the
1223center and orientation are determined, a grid is defined, and each
1224remaining stars in the image is assigned to a grid cell.  The feature
1225is a bit vector, one bit per grid cell, where the bit is set only if
1226the cell contains a star.  See \figref{gridbased}.
1227
1228
1229The index is constructed by scanning the sky and selecting, for each
1230field of view, the $n$ brightest stars.  For each of these stars a
1231feature is computed and stored.  At test time, they examine the
1232brightest $c n$ stars (for some safety factor $c$), computing the
1233feature for each one and searching the index for a match.  A pair of
1234features is defined to match if their dot product is above a
1235threshold.  This is equivalent to taking the bitwise \textsc{AND} of
1236the bit vectors and counting the number of bits that are set.  This
1237search can be implemented efficiently by using lookup tables: for each
1238bit, they maintain a list of the features for which that bit is set.
1239Note that this is equivalent to using a grid-based geometric hashing
1240approach: each grid cell is equivalent to a discretized relative
1241coordinate vector, which becomes a hash key, and the ``lookup table''
1242is a hash table.
1243
1244
1245After all the features in the test image have been extracted and
1246matched to the index, the algorithm proceeds to a voting (or perhaps
1247``consensus'') phase where it rejects matches that propose a field of
1248view that does not overlap that of the majority.
1249
1250
1251The system is designed to operate on images of diameter $8~\degrees$,
1252and performs well on simulated data.  They use a grid size of $40
1253\times 40$, and find that on average $25$ grid cells are filled.
1254Their index contains $13,000$ patterns, which means that false
1255positives are quite rare.  However, misidentification of the nearest
1256neighbour, or edge effects (assigning a star to the wrong grid cell
1257due to positional noise) mean that failures to find a match are not
1258uncommon, and are more likely in regions of the sky with high stellar
1259density.
1260
1261In later work, Clouse and Padgett \cite{clouse2000} extend this
1262approach by using Bayesian decision theory instead of a simple
1263threshold on the dot product to define how well features match.  This
1264extension, along with smaller noise levels in the (simulated) imaging
1265system, allows them to extend the approach down to fields of view
1266$2~\degrees$ in diameter.
1267
1268\comment{
1269  Given two features, they compute the dot product between the bit
1270  vectors, then proceed to estimate the probabilities that the match is
1271  a true positive and false positive.  These probability distributions
1272  are quite complex, so several simplifying approximations are made, and
1273  the remaining parameters are estimated by running simulations.
1274  }
1275
1276Harvey \cite{harvey2004} presents two different approaches, one
1277grid-based and the other shape-based.  The grid-based approach is
1278similar to the Padgett--Kreutz-Delgado and Clouse--Padgett approaches
1279\cite{padgett1997, clouse2000}.  A coarser grid is used, so more stars
1280are likely to appear in each bin.  To compensate, a grid cell is only
1281considered ``occupied'' if it contains more than some threshold number
1282of stars.  The other major change is that he expects a test image to
1283be an ``overexposure'' or ``underexposure'' relative to the reference
1284catalog.  This implies that the test image should contain either a
1285subset or a superset of the stars in the index, and therefore the
1286image feature vector must be either greater than or less than an index
1287feature at each bit.  This allows simple bit operations to be used to
1288find feature matches, and feature matching is performed by a linear
1289scan through the index.
1290
1291Harvey's shape-based approach uses constrained $n$-star constellations
1292in order to aggregate information from a large number of stars without
1293allowing the number of potential features to grow combinatorially.
1294Specifically, Harvey uses a pair of stars to define a narrow wedge in
1295the image, then describes the relative positions and angles of a fixed
1296number of nearby stars within that wedge.  Unfortunately, this makes
1297the feature highly sensitive to distractor and dropout stars, since
1298the feature depends on the stars in the feature being enumerated in a
1299particular order.  As a result, all potential features (allowing any
1300combination of stars to drop out) must be checked; the number of
1301features grows exponentially as the density of stars increases.
1302
1303
1304Harvey makes the useful observation that a cascade of indices can be
1305built, where each index is designed to solve images with a particular
1306range of scales.
1307
1308
1309MacKay and Roweis \cite{mackay2005} point out that a grid-based
1310feature such as that used by Harvey leads naturally to a hashing-based
1311strategy.  Each grid cell is associated with a bit that is turned on
1312if the cell is occupied.  This value is placed in a hash table with a
1313mapping back to its position on the sky.  Since false positives can
1314occur as a result of feature aliasing or hash collision, a voting
1315scheme is employed: several feature matches must accumulate before the
1316match is accepted.  Since false negatives can occur as a result of
1317dropouts and distractors (\emph{any} missing or extra star causes the
1318hash code to change completely), many features must be extracted for
1319each region of the sky.
1320
1321
1322\subsection{Fine-tuning astrometry}
1323
1324In order to ``co-add'' or ``stack'' astronomical images (combine
1325pixels from different images of higher signal-to-noise), or to do
1326proper-motion studies (measure the movement of stars over time), it is
1327necessary to fine-tune the astrometric calibration of their images.
1328This is similar to the \emph{bundle adjustment} problem in computer
1329vision \cite{triggs2000}, in that it involves the simultaneous
1330optimization of the various camera and telescope parameters and the
1331estimated positions of objects in the world.  Fine-tuning astrometric
1332calibrations is easier because it is essentially two-dimensional, but
1333more difficult because the camera parameters include polynomial
1334distortion terms to model the image distortion introduced by telescope
1335optics.
1336
1337
1338The software package \scamp by Bertin \cite{bertin2005} is a popular
1339tool used by astronomers to fine-tune simultaneously the astrometric
1340calibrations of a large collection of images.  For each image, it is
1341assumed that the center of the image is known to about 25\% of the
1342size of the image, and the scale is known to within a factor of two.
1343The histogram-alignment method of \cite{kaiser1999} is used to find
1344the translation, scaling, and rotation between the image and a
1345reference catalog, which allows the correspondences between image and
1346reference catalog stars to be determined.
1347
1348The core of the \scamp system is a chi-squared minimization of the
1349total weighted distance between the projected positions of all star
1350correspondences among the set of images and the reference catalog.
1351The parameters to be adjusted are the center, scale, rotation, and
1352polynomial distortion coefficients of each image.  A key feature is
1353the ability to share image distortion parameters among subsets of the
1354images, since images taken with the same telescope are expected to
1355share some distortion terms due to the telescope optics.  This reduces
1356the degrees of freedom of the fitting process, resulting in more
1357robust fits.  \scamp can fine-tune thousands of images to sub-pixel
1358accuracy.
1359
1360
1361\section{Our approach}
1362\label{ourapproach}
1363
1364\subsection{Preprocessing}
1365
1366Astronomers have developed computational methods for detecting and
1367localizing stars, galaxies, and other objects in astronomical images.
1368In astronomy this process is known as \emph{source extraction}.
1369Telescopes are designed to produce a predictable point-spread function
1370which covers several pixels, and this allows stars, which are
1371effectively point sources, to be localized with subpixel accuracy.
1372Sources that have image profiles larger than the point spread function
1373must be extended objects, and this allows stars to be differentiated
1374from galaxies and other objects.  Cosmic rays (single, energetic
1375particles) deposit energy in single pixels; since no distant object
1376can produce such an image profile, this source of noise can easily be
1377filtered from the image.
1378
1379
1380The result of source extraction is a list of object positions (with
1381subpixel resolution), along with the brightness of each object.  If
1382the telescope has been properly calibrated, this brightness is
1383proportional to the electromagnetic \emph{flux} from the object, but
1384in general we simply treat the brightness as a means of ordering the
1385objects.  It is also possible to classify each source as a star or
1386galaxy, and even classify the galaxy type, but we currently do not use
1387this information.
1388
1389
1390For the purposes of this paper, we will assume that source extraction
1391has been performed as a preprocessing step, so we are given a list of
1392source positions sorted by brightness.
1393
1394
1395
1396\subsection{Quad features}
1397
1398\begin{figure}
1399  \begin{center}
1400    \includegraphics[width=0.5\textwidth]{quad-fig}
1401  \end{center}
1402  \caption{The local coordinate frame of a quad.  The most distant pair of sources, $A$ and $B$,
1403        define the origin and $(1,1)$ of the coordinate frame, respectively.
1404        The remaining two sources $C$ and $D$ are required to lie inside the circle with diameter $AB$.
1405        To resolve ambiguities, the points must be labelled such that $C_x \le \smallhalf$ and
1406        $C_x + D_x \le 1$.  The geometric feature descriptor is $(C_x, C_y, D_x, D_y)$.}
1407  \label{quad}
1408\end{figure}
1409
1410We have taken a geometric feature-matching approach to this problem.
1411Since individual sources can be well localized but have no distinctive
1412properties, we build distinctive features by describing the relative
1413geometry of small sets of objects.
1414
1415
1416To solve the blind astrometry problem, we need to index a large number
1417of features: we want to solve images that cover $10^{-6}$ or less of
1418the area of the sky, and we want to have tens of features per image in
1419order to increase our robustness to distractors, dropouts, and
1420occlusion.
1421
1422
1423We are matching two-dimensional objects, and we assume that
1424astronomical images can be well described by similarity
1425transformations: scaling, rotation and translation.  Therefore, we
1426require two sources to define a local coordinate system.
1427
1428
1429The standard geometric hashing recipe would have us use triangle
1430features.  However, the positional noise in astronomical image is
1431sufficiently large that triangles are not distinctive: with the number
1432of features we require, \emph{any} triangle will match an excessively
1433large number of triangles in our index due to feature aliasing.  In
1434order to building a more distinctive feature, we consider sets of
1435\emph{four} sources-- \emph{quads}.  Two of the sources are the basis
1436set that defines the local coordinate system, and our geometric
1437feature descriptor is composed of the positions of the remaining two
1438sources in this coordinate system.
1439
1440Using a quad feature can be seen as a way of incorporating a voting
1441scheme directly into the feature-matching phase: a quad is essentially
1442the conjunction of two triangles.  We are effectively squaring the
1443distinctiveness of our geometric features by demanding that \emph{two}
1444triangles be matched simultaneously.
1445
1446One disadvantage of using quads rather than triangles is that we
1447become more sensitive to distractors and dropouts: if the proportion
1448of distractors and dropouts is $d$, the proportion of overlapping
1449sources is $(1-d)$, and we expect about $(1-d)^3$ triangles but
1450$(1-d)^4$ quads to overlap.
1451
1452
1453In standard geometric feature-matching, one enumerates all basis sets,
1454then lists the positions of all the remaining interest points within
1455that basis.  Since we have a two-dimensional basis and we list the
1456positions of pairs of sources, this would yield $\mathcal{O}(n^4)$
1457features.  Astronomical images typically contain hundreds to thousands
1458of sources, and our reference catalog contains $10^9$ sources, so this
1459would be prohibitively expensive.  We therefore impose two constraints
1460on the properties of the quads we create.  See Figure \ref{quad}.
1461
1462Given a set of four sources, we first find the two sources most
1463distant from each other and label them $A$ and $B$.  If the distance
1464from $A$ to $B$ is outside a specified range, we reject the quad.
1465Next, we demand that the two remaining sources lie inside the circle
1466that has $AB$ as a diameter.  We place $A$ at the origin of the local
1467coordinate system, and place $B$ at position $(x, y) = (1, 1)$.  We
1468then compute the $(x,y)$ coordinates of the two remaining sources.
1469The source with the smaller $x$ coordinate is labelled $C$, and the
1470last source is labelled $D$, so we have $C_x \le D_x$.  The geometric
1471feature descriptor is $(C_x, C_y, D_x, D_y)$.
1472
1473Observe that this feature has one redundancy: we labelled $A$ and $B$
1474arbitrarily, and if we swap their labels then the coordinate system is
1475rotated $180$ degrees about $(\half, \half)$.  The feature descriptor
1476becomes $(1 - C_x, 1 - C_y, 1 - D_x, 1 - D_y)$.  We can deal with this
1477redundancy by demanding that $A$ and $B$ be labelled such that $C_x +
1478D_x \le 1$.
1479
1480
1481Recall the idea mentioned earlier in this paper of solving images of unknown
1482scale by building a cascade of indices, each covering a range of scales.
1483We use this approach with a scale factor of $2$ or $\sqrt{2}$.
1484This allows us to assume that the image scale is known to
1485within a small factor.  When deciding what range of quad sizes ($AB$ distances) we wish to
1486allow, we are faced with a tradeoff: if we use quads with large
1487diameter, then the probability that all four stars will be within the bounds of
1488the image decreases.  Meanwhile, the number of quads of diameter $d$ that
1489can be created is $\mathcal{O}(d^5)$, assuming a uniform distribution
1490of sources and ignoring edge effects.\footnote{%
1491  For a given star $A$, the number of
1492  stars that can become star $B$ goes as $d$, while the number of stars
1493  $C$ and $D$ goes as $d^2$, so the number of $BCD$ combinations goes as $d^5$.}
1494On the other hand, if we build small quads, then the relative positional error
1495will be larger, which means that when we search for matching quads in the index
1496we must use a larger search radius, which means that many more false matches
1497will be found.  We have not yet found an analytic solution to optimize this
1498tradeoff; we currently set the desired quad scale to about
1499a third of the size of the images we wish to solve.
1500
1501
1502Our quad features have good noise properties: a small change of
1503position of any one of sources $ABCD$ leads to a small change in the
1504feature descriptor.  The noise is nearly isotropic: movement of a
1505source in any direction yields an almost equal change in the feature
1506descriptor.  The feature has some edge effects: $A$ and $B$ are
1507defined to be the most distant pair, but if $C$ and $D$ are nearly
1508diametrically opposite each other then a small amount of positional
1509noise can push them further apart so that they become the most distant
1510pair.  As a result, $C$ and $D$ will be labelled $A$ and $B$, and vice
1511versa, and a completely different geometric feature will be produced.
1512This is a minor effect since it only occurs when both $C$ and $D$ have
1513extreme values; we ignore it.  Another edge effect is that $C$ and $D$
1514are required to lie inside the circle with diameter $AB$; at test time
1515we simply expand the allowed diameter of the quad enough to compensate
1516for a given level of positional noise.  Finally, if the distance $AB$
1517is near the limit of the allowed range of scales, then the quad can
1518become invalid due to noise.  This effect is equally easy to correct
1519by expanding the range slightly at test time.
1520
1521
1522Notice that our constraints on the quads we accept has the effect of
1523restricting the regions of feature space in which valid quads live.
1524Since $C$ and $D$ are required to fall inside a circle with $AB$ on
1525the diameter, we have:
1526\begin{eqnarray*}
1527\left(C_x - \smallhalf\right)^2 + \left(C_y - \smallhalf\right)^2 & \le & \smallhalf \\
1528\left(D_x - \smallhalf\right)^2 + \left(D_y - \smallhalf\right)^2 & \le & \smallhalf
1529\end{eqnarray*}
1530We also have the constraints $C_x + D_x \le 1$ and $C_x \le D_x$ to
1531eliminate the redundancy of the labelling of the source pairs $AB$ and
1532$CD$.  As a result, our feature space is rather strangely shaped, so a
1533grid-based hashing scheme would result in poor utilization of the hash
1534table storage.  Also note that since our noise is nearly isotropic and
1535our quads are restricted to a known range of scales, we can use the
1536Euclidean distance metric with a fixed search radius $r$ to do quad
1537feature matching.
1538
1539
1540
1541
1542\subsection{Indexing}
1543
1544
1545We want our index to satisfy the following constraints.  It should
1546have uniform source and quad density over the whole sky, because we
1547want to be equally likely to solve images over the whole sky.  We want
1548to build quads from bright stars, because we expect these to be more
1549likely to be visible in test images, yet because of distractors,
1550dropouts, and occlusions, we don't want to depend too heavily on any
1551one star; we want to trade off using bright stars and using a variety
1552of stars.
1553
1554
1555We achieve this by first selecting a spatially uniform subset of the
1556brightest sources in the reference catalog.  We place a grid over the
1557sky whose cells are sized so that a typical image is a few cells wide,
1558then we take the $n$ brightest sources in each cell.
1559
1560
1561Next, we place another similarly-sized grid over the sky and in each
1562cell we try to build a quad whose center is within the cell.  We try
1563building quads in order of source brightness, but we also keep track
1564of how many times each source has been used.  In our first pass
1565through the cells we only allow each star to be used once, and in each
1566subsequent pass we increase the number of times a source can be used.
1567
1568
1569Once we have built all the quads, we build a \kdtree in feature space.
1570This allows us to rapidly find all the quads in our index whose
1571geometric feature descriptor is close to a query feature.
1572
1573We also build a \kdtree over the spatially uniform subset of sources;
1574this is used for our verification scheme.
1575
1576
1577\subsection{Solving}
1578
1579At test time, we begin enumerating all valid quads in the image,
1580starting with those that can be built from the brightest sources.  For
1581each quad we perform a range search in the \kdtree in order to find
1582matching quads.  Typically each image quad results in a few matches.
1583We run a verification process on each match.
1584
1585The verification process first computes a least-squares transformation
1586between image coordinates and celestial coordinates.  We then search
1587in the \kdtree of sources for all sources that should be visible in
1588the image.  We apply the transformation to bring them into image
1589coordinates, and for each one find the nearest source in the image.
1590We use a probabilistic model of the positional noise to compute the
1591odds ratio between the match being a true positive and a false
1592positive.  If this ratio exceeds a threshold we declare the image
1593solved.  By using an odds ratio threshold, we allow the user to trade
1594off the probability of false negatives and false positives.
1595
1596
1597
1598\comment{
1599  We build an \emph{index} which maps these features
1600  back to their locations on the sky.\footnote{%
1601        In the database community (and some early computer science literature)
1602        this would be called an \emph{inverted} index, but we feel this
1603        is contrary to common usage: the index of a book maps from concepts
1604        (features) to pages (locations); the inverse mapping is called a
1605        \emph{table of contents}.}
1606 
1607  At test time, we extract features from the image, then search our
1608  index for matching features.  Each matched feature is a hypothesized
1609  astrometric solution.  We use a separate probabilistic verification
1610  stage to test these hypotheses.
1611
1612
1613  Since our input images have unknown scale, rotation, and
1614  translation, we want our features to be invariant under these
1615  transformations.  One option is to consider triangles of stars and
1616  use the interior angles (or relative edge lengths) as a
1617  two-dimensional feature descriptor, as in \cite{pal2006}.  Since
1618  both the image and reference catalog are noisy, we must allow each
1619  feature in the image to match a region in feature space.  When using
1620  geometric features, we are faced with a tradeoff: if we consider
1621  sets of stars that span a large portion of the image, then the
1622  relative positional error is smaller (and hence each feature matches
1623  a smaller region of feature space - features are more distinctive),
1624  but there are \emph{many} features to test.  If we choose smaller
1625  features, there are fewer features to test, but the relative error
1626  is larger so each feature matches a larger region in feature space;
1627  features can be confused more easily.
1628
1629
1630  Although triangles are appealing features from a theoretical
1631  perspective, in practice they are problematic due to the magnitude
1632  of positional noise in typical images we wish to solve.  Any
1633  triangle we find in an image will generate a large number of false
1634  matches to triangles in our index because triangles aren't
1635  distinctive enough.  We therefore use sets of four stars --
1636  \emph{quads} -- as our features.  The two most widely separated
1637  stars (A and B) define the corners of a coordinate system, and we
1638  list the positions of the remaining two stars (C and D) in that
1639  coordinate system.  We demand that stars C and D lie inside the
1640  circle centered at the midpoint of stars A and B and diameter AB.
1641  We impose an arbitrary order to decide which star is labelled C and
1642  which is labelled D, based on their positions in the local
1643  coordinate system.  This yields a four-dimensional feature
1644  descriptor.  If we exchange the labels of A and B, then the local
1645  coordinate system is flipped and a different feature descriptor is
1646  produced.  We store one of these descriptors in our index, and at
1647  test time we search for matches to both descriptors.
1648
1649 
1650  This choice of feature and feature descriptor (\emph{code}) has
1651  several desirable properties.  If stars are distributed uniformly in
1652  space, then codes are distributed uniformly in \emph{code space}.
1653  The error propagation properties are good: small positional errors
1654  in any of the four stars yield small errors in the code value, and
1655  isotropic errors in the positions of the stars produce nearly
1656  isotropic errors in the code value.  This allows us to find matching
1657  codes based on Euclidean distance in code space.
1658
1659  Our index is composed of a large number of code values and mappings
1660  back to locations on the sky.  When given a new image to solve, we
1661  extract many quads, compute their codes, and for each one search for
1662  matching codes in the index.  Due to the isotropic error properties
1663  of codes, searching for matching codes is simply \emph{range search}
1664  in four-dimensional code space.  We use \kdtrees to accelerate this
1665  process.
1666
1667
1668  Each time we match a feature in an image to a feature in our index,
1669  we generate a hypothesis about the location (pointing, scale, and
1670  rotation) of the image on the sky.  We are typically operating in
1671  the regime where \emph{any} feature will produce several matches, so
1672  we use a probabilistic model to determine whether a given match is a
1673  false positive or a true solution.  We do this by computing a
1674  transformation between image coordinates and the sky based on the
1675  matching quad.  We then search for stars in our reference catalog
1676  that should be visible in the image if the hypothesis is correct.
1677  By combining our knowledge of the positional errors in the reference
1678  catalog and the image with an estimate of the proportion of stars
1679  that should be visible in both, we can assign a probability to the
1680  hypothesis.  If it is above some threshold, we accept the match.
1681
1682
1683  \emph{
1684        \begin{itemize}
1685        \item Brightness ordering to speed search only.
1686        \item More about the constraints and goals of our system to differentiate it
1687          from the spacecraft attitude estimation problem.
1688        \item A quad is like the conjunction of two triangles: we're doing implicit agreement of two triangle features!
1689        \item Applications of our system?
1690        \end{itemize}
1691  }
1692
1693 
1694  To summarize, the key elements of our approach are:
1695  \begin{itemize}
1696  \item \textbf{geometric features} composed of sets of point features;
1697  \item \textbf{continuous feature descriptors} that describe the relative positions of the point features;
1698  \item \textbf{fast range-search} to match features descriptors;
1699  \item \textbf{probabilistic validation} to eliminate false positives.
1700  \end{itemize}
1701
1702
1703
1704  %rummage through the toolbox of computer vision and object
1705  %recognition techniques that are related to our approach or would seem
1706  %to be applicable to our problem.
1707}
1708
1709
1710
1711\section{Conclusion}
1712
1713This paper presents the geometric feature matching approach we are
1714using to solve the blind astrometry problem, and presents prior work
1715relating to the techniques we are using and the problem we are trying
1716to solve, as well as some different techniques that might seem to be
1717applicable.
1718
1719Geometric hashing (or more generally, geometric feature matching) is a
1720widely-applicable technique for problems where it is desirable to
1721build distinctive geometric features out of simple interest-point
1722features.  The approach is ideally suited to the blind astrometry
1723problem, since images of celestial objects can be localized to
1724sub-pixel accuracy but have few other distinguishing features.
1725Indeed, much of the previous work in blind and non-blind astrometry
1726can be placed within a geometric feature matching framework.
1727
1728
1729Although one might suppose that hashing is an intrinsic part of
1730geometric hashing, we have found that a tree structure is more
1731effective for the large-scale feature matching we want to perform; we
1732feel that deciding to use approximate feature matching should be a
1733design decision, not an intrinsic part of the method: in cases where
1734each image contains only a few features that could potentially be
1735found in the index, we do not want to risk missing them due to ``bad
1736luck'' in a hash function.
1737
1738
Note: See TracBrowser for help on using the repository browser.