% Source: trunk/documents/papers/noise-information/noise.tex @ 11885 (checked in by apw235; log message: Noise Bits: figure revisions)
\documentclass[12pt,preprint]{aastex}
\newcounter{address}
\setcounter{address}{1}
\begin{document}

\title{What bandwidth do I need for my image?}
\author{
  Adrian~M.~Price-Whelan\altaffilmark{\ref{CCPP}},
  David~W.~Hogg\altaffilmark{\ref{CCPP},\ref{email}}
}

\altaffiltext{\theaddress}{\stepcounter{address}\label{CCPP} Center
for Cosmology and Particle Physics, Department of Physics, New York
University, 4 Washington Place, New York, NY 10003}
\altaffiltext{\theaddress}{\stepcounter{address}\label{email} To whom
correspondence should be addressed: \texttt{david.hogg@nyu.edu}}

\begin{abstract}
Computer representations of real numbers are necessarily discrete,
with some finite resolution.  This resolution cannot be made
arbitrarily fine in real-time systems (such as camera read-outs) or
where transmission is expensive (such as with space telescopes).  We
demonstrate experimentally that essentially all of the scientific
information in an astronomical image can be preserved or transmitted
in a representation with discreteness only slightly finer than the
root-variance of the additive per-pixel noise.  Adopting a resolution
this coarse can preserve dynamic range for the measurement of bright
sources and save valuable bandwidth without sacrificing any
information for down-stream data analysis, including the analysis of
very faint sources far below the individual-image detection limit.
\end{abstract}

\section{Introduction}
Computers operate on bits and collections of bits; the numbers stored
by a computer are necessarily discrete, finite in both range and
resolution.  Computer-mediated measurements or quantitative
observations of the world are therefore only approximately
real-valued.  This means that choices must be made, in the design of a
computer instrument or a computational representation of data, about
the range and resolution of represented numbers.

In astronomy this limitation is keenly felt at the present day in
optical imaging systems, where the analog-to-digital conversion of
CCD or equivalent detector read-out happens in real time and is
severely limited in bandwidth.  The constraint is even tighter in space
missions, where it is not just the bandwidth of real-time electronics
but the bandwidth of telemetry of data from space to ground that is
limited.

Fortunately, the information content of any astronomical image is
limited \emph{naturally} by the fact that the image contains
\emph{noise}.  That is, tiny differences between pixel
values (differences much smaller than the amplitude of the
noise added to the signal) do not carry very much astronomical
information.  For this reason, the discreteness of computer
representations of pixel values does not have to limit the scientific
information content in a computer-recorded image.  All that is
required is that the noise in the image be \emph{resolved} by the
representation.  What this means, quantitatively, for the design of
imaging systems is the subject of this \textsl{Article}; we are asking
this question: ``What bandwidth is required to deliver the scientific
information content of a computer-recorded image?''

We answer this question, in some sense, \emph{experimentally}.  We
perform experiments with artificial data, varying the bandwidth of the
representation (the size of the smallest representable difference $\Delta$ in
pixel values) and measuring properties of scientific interest in the
image, such as the fluxes and centroids of compact sources, the mean
and variance in extended regions, and the properties of sources
fainter than the detection limit.  The higher the bandwidth, the
better these measurements become, in precision and in accuracy.  We
find, not surprisingly, that the smallest representable difference
should be on the order of the root-variance of the noise level in the
image.  More specifically, we find that about \emph{two bits should
  span the FWHM of the noise distribution} if the computer
representation is to deliver the information content of the image.
This rule of thumb is obvious in retrospect.

Of course, tiny differences in pixel values, even those much smaller
than the noise amplitude, \emph{do} contain \emph{extremely valuable}
information, as is clear when many short exposures (for example) of
one patch of the sky are co-added or analyzed simultaneously.
``Blank'' or noise-dominated parts of the individual images become
signal-dominated in the co-added image.  In what follows, we
explicitly include this ``below-the-noise'' information as part of the
information content of the image.  Perhaps surprisingly, \emph{all} of
the information about sources far fainter than the discreteness of the
computer representation can be preserved, provided that the
discreteness is finer than the amplitude of the noise.

Our results have some relationship to the study of \emph{stochastic
  resonance}, where it has been shown that signals of low dynamic
range can be better detected in the presence of noise than in the
absence of noise \cite[Dykman, Luchinsky,]{dykman-1993}.  These studies show
that if a signal is below the minimum representable difference, it is
visible in the data only when the digitization of the signal is noisy.
A crude summary of this literature is that the optimal noise amplitude
is comparable to the minimum representable difference.  We turn the
stochastic resonance problem on its head; the counterintuitive result
that weak signals become detectable only when the digitization is
noisy becomes (in our context) the relatively obvious result that so
long as the minimum representable difference is comparable to or
smaller than the noise, signals are transmitted at the maximum
fidelity possible in the data set.

\section{Experimental Programming Techniques}
The first experiment we performed was to measure the mean and variance of pure Gaussian noise in a 100 by 100 pixel image after ``snapping to integer'' at various intensity resolutions.  This ``snap to integer'' procedure, or ``SNIP,'' is used extensively in the experiments that follow, so it is important to understand exactly what it does.

We wrote a simple Python class, built specifically for this project, that creates image objects (of type NBImage).  When instantiated, an NBImage object is a blank data set of user-specified dimensions.  We chose square 100 by 100 pixel images for our experiments; the size is arbitrary, as there was no other obvious size to choose.  With some simple functions, one can add Gaussian noise of specified variance to the image, add a star of specified flux at a random or pseudorandom position, and multiply the image by each element in an array of specified factors (in units of ``bits'') before truncating from floating-point to integer data.

The SNIP method takes an array of resolution values, multiplies the image data by each of these values, stores the results in a 3-dimensional array, and then truncates the data to integer values.  From this point we can make various measurements on the SNIP-ped image data: for example, we can add a star before the SNIP and compare the measured variance to that of an image with no star, or we can centroid the star and ask how well the measured centroid matches the true location.
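
The snap-to-integer step described above can be sketched in a few lines of NumPy.  This is an illustration only, not the NBImage class itself, whose interface is not reproduced here: the function name \texttt{snip}, its arguments, and the choice of rounding to the nearest integer (\texttt{np.rint}) are assumptions.  Truncation toward zero, as the text literally describes, could be substituted with \texttt{np.trunc}.

\begin{verbatim}
```python
import numpy as np

def snip(image, factors):
    """Scale `image` by each factor and snap the result to integers.

    Hypothetical helper: returns an integer array of shape
    (len(factors), ny, nx).  Rounding to nearest is an assumption;
    swap in np.trunc to truncate instead.
    """
    image = np.asarray(image, dtype=float)
    factors = np.asarray(factors, dtype=float)
    # Broadcast: one scaled copy of the image per resolution factor.
    scaled = factors[:, None, None] * image[None, :, :]
    return np.rint(scaled).astype(np.int64)

rng = np.random.default_rng(0)
image = rng.normal(loc=0.0, scale=1.0, size=(100, 100))  # unit-variance noise
factors = 2.0 ** np.arange(-3, 5)   # bit depths 1/Delta from 2^-3 to 2^4
cube = snip(image, factors)         # 3-dimensional stack of snapped images
```
\end{verbatim}

Each slice \texttt{cube[k]} holds the image at one resolution; dividing a slice by its factor returns it to the original intensity units.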

\section{Plain Gaussian Noise}
We now return briefly to the question of how well we measure the variance in a 100 by 100 pixel image of pure Gaussian noise.  For this experiment we simply created an image, added Gaussian noise, applied the SNIP procedure, and plotted the measured variance against the bit resolution.  As expected, the accuracy of the measured variance improves as the bit resolution increases (fig.~\ref{fig:variance}).
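
A minimal sketch of this variance experiment follows, assuming round-to-nearest snapping and a zero mean level.  The helper name and the range of factors are illustrative choices, not taken from the original code.  The snapped values are divided by the factor again so that the variance is reported in the original intensity units.

\begin{verbatim}
```python
import numpy as np

def measured_variance(image, factors):
    """Variance of the snapped image, converted back to image units."""
    factors = np.asarray(factors, dtype=float)
    snapped = np.rint(factors[:, None, None] * image[None, :, :])
    rescaled = snapped / factors[:, None, None]
    return rescaled.reshape(len(factors), -1).var(axis=1)

rng = np.random.default_rng(1)
image = rng.normal(loc=0.0, scale=1.0, size=(100, 100))  # true variance 1.0
factors = 2.0 ** np.arange(-3, 5)      # bit depth 1/Delta
variances = measured_variance(image, factors)
for f, v in zip(factors, variances):
    print(f"1/Delta = {f:7.3f}   measured variance = {v:.4f}")
```
\end{verbatim}

At the coarsest setting almost every pixel snaps to zero and the variance is badly underestimated; at fine settings the measured value approaches the sample variance of the unsnapped image.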

\section{Adding a Randomly Placed Star to Gaussian Noise}
Having measured the variance after multiplying our data by various factors and snapping it to integer values, we next added a star to the image and asked how well we can centroid its location.  We used the same routine as described above, but with a ``star'' placed at random somewhere within a 25 pixel radius of the center of the image.  We chose a Gaussian profile for the star, so its intensity is given by eq.~(\ref{intensity}), where \((x_{0}, y_{0})\) is the randomly selected center of the star and \((x, y)\) are the image coordinates.  The width \(\sigma\) of the star profile is set to 1.0 for convenience, and \(A\) is a measure of the flux of the star.  We save the true location, multiply the image by various factors, snap the data to integer, and then centroid the star.
\begin{equation}\label{intensity}
I(x,y) = A \exp\left[-\frac{(x-x_0)^2+(y-y_0)^2}{2 \sigma^2}\right]
\end{equation}
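As an illustration, eq.~(\ref{intensity}) can be evaluated on a pixel grid as in the sketch below.  The function name \texttt{add\_star}, the use of $(50, 50)$ as the image center, and the uniform-in-area draw of the star position are assumptions made for this example.

\begin{verbatim}
```python
import numpy as np

def add_star(image, x0, y0, amplitude, sigma=1.0):
    """Return a copy of `image` with a circular Gaussian star added."""
    ny, nx = image.shape
    y, x = np.mgrid[0:ny, 0:nx]          # pixel coordinates
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return image + amplitude * np.exp(-r2 / (2.0 * sigma ** 2))

rng = np.random.default_rng(2)
# Random centre within a 25 pixel radius of the assumed image centre.
radius = 25.0 * np.sqrt(rng.uniform())
angle = rng.uniform(0.0, 2.0 * np.pi)
x0 = 50.0 + radius * np.cos(angle)
y0 = 50.0 + radius * np.sin(angle)

noise = rng.normal(0.0, 1.0, size=(100, 100))
image = add_star(noise, x0, y0, amplitude=8.0)
```
\end{verbatim}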
\indent Our centroiding technique does not search the image for the star.  Instead, the function takes the true position of the star as input and looks at a 3 by 3 section of the image data centered on that position.  It then finds the highest-intensity pixel in this section and re-centers the 3 by 3 array on that pixel.  We perform a simple least-squares fit to these data, using eq.~(\ref{surface}) as our surface model; the measured star center is the maximum of this surface.
\begin{equation}\label{surface}
S(x,y) = a + b x + c y + d x^2 + e x y + f y^2
\end{equation}
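For reference, the location of the maximum follows from setting both partial derivatives of eq.~(\ref{surface}) to zero, $b + 2 d x + e y = 0$ and $c + e x + 2 f y = 0$.  This is standard calculus rather than a description of the specific code used here.  Solving the resulting $2\times2$ linear system gives
\begin{equation}
x_{1} = \frac{c e - 2 b f}{4 d f - e^2}, \qquad
y_{1} = \frac{b e - 2 c d}{4 d f - e^2},
\end{equation}
which is a maximum provided $d < 0$ and $4 d f - e^2 > 0$.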
The offset is then simply given by eq.~(\ref{offset}), where \((x_{1}, y_{1})\) is the calculated position of the star.
\begin{equation}\label{offset}
\mathrm{Offset} = \sqrt{(x_{1} - x_{0})^{2} + (y_{1} - y_{0})^{2}}
\end{equation}
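The centroiding and offset steps can be sketched as follows.  The function names \texttt{centroid\_3x3} and \texttt{offset} are hypothetical, and the edge clipping and the fallback for a perfectly flat patch are choices made for this example rather than details taken from the original code.

\begin{verbatim}
```python
import numpy as np

def centroid_3x3(image, x_true, y_true):
    """Quadratic least-squares centroid on a 3x3 patch near the true position."""
    # Start at the pixel nearest the known position (clipped to the image).
    ny, nx = image.shape
    iy = int(np.clip(round(y_true), 1, ny - 2))
    ix = int(np.clip(round(x_true), 1, nx - 2))
    patch = image[iy - 1:iy + 2, ix - 1:ix + 2]
    # Re-centre the window on the brightest pixel, if one is strictly brighter.
    dy, dx = np.unravel_index(np.argmax(patch), patch.shape)
    if patch[dy, dx] > patch[1, 1]:
        iy = int(np.clip(iy + dy - 1, 1, ny - 2))
        ix = int(np.clip(ix + dx - 1, 1, nx - 2))
    patch = image[iy - 1:iy + 2, ix - 1:ix + 2].astype(float)

    # Fit S = a + b x + c y + d x^2 + e x y + f y^2 in patch coordinates.
    y, x = np.mgrid[-1:2, -1:2]
    x, y = x.ravel(), y.ravel()
    design = np.column_stack([np.ones(9), x, y, x**2, x * y, y**2])
    a, b, c, d, e, f = np.linalg.lstsq(design, patch.ravel(), rcond=None)[0]

    # Stationary point of the fitted surface (gradient equal to zero).
    det = 4.0 * d * f - e**2
    if det == 0.0:
        return float(ix), float(iy)   # degenerate fit: fall back to the pixel
    x1 = ix + (c * e - 2.0 * b * f) / det
    y1 = iy + (b * e - 2.0 * c * d) / det
    return x1, y1

def offset(x1, y1, x0, y0):
    """Distance between measured and true positions."""
    return np.hypot(x1 - x0, y1 - y0)
```
\end{verbatim}

Because a quadratic is only an approximation to a Gaussian profile, this estimator carries a small bias even for a noiseless star.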
\indent With a star of high flux compared to the noise level, we expect to measure the centroid very accurately (a small offset) even at the lowest bit truncation value \((2^{-3})\).  For lower values of the flux, we expect the offset to converge, near a bit truncation value of \(2^{1}\), to some value that depends on the flux, and to grow larger at smaller bit truncation values.  These somewhat obvious intuitions are confirmed by fig.~\ref{fig:bitsoffset256} and fig.~\ref{fig:bitsoffset8}.  For both plots, the process is repeated on 1024 images and all of the resulting data are shown on the same plot.  As expected, for a large value of the flux (256) the median values (indicated by black circles) form a virtually straight line at a very low offset, with many data points below the median and some above.  Because we modulate our noise such that it can vary from 0 to 100, it moves closer to and farther from the flux value, producing an occasional outlier at high and low noise levels.

\section{Low Value Bit Truncation and Coadding Images}
As mentioned previously, tiny variations in pixel values, even those smaller than the noise amplitude, do contain valuable information, as is demonstrated by simply coadding a number of noise-dominated images of the same region of the sky.  The test we perform is to take 1024 images, add a faint source (below the noise) to each image, apply the SNIP method for the same range of bit values, and then coadd the images and measure the star offsets.  Perhaps the most interesting result is that even for stars of flux = \(\frac{1}{8}\), significantly lower than the root-variance of the noise, we are still able to measure the centroid accurately after truncating and coadding many images containing these faint sources.
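
The coadding test can be sketched as below, under several assumptions that are ours rather than the original code's: the coadd is taken as a mean, snapping rounds to the nearest integer, the star sits exactly on a pixel, and the image is $16\times16$ pixels as in the figures.  The flux of 0.25 matches the value quoted in the figure captions.

\begin{verbatim}
```python
import numpy as np

def coadd_snapped(n_images, amplitude, factor, shape=(16, 16), seed=3):
    """Average many noisy exposures of one faint star, with and without snapping."""
    rng = np.random.default_rng(seed)
    ny, nx = shape
    y, x = np.mgrid[0:ny, 0:nx]
    x0, y0 = nx // 2, ny // 2                     # star centred on a pixel
    star = amplitude * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / 2.0)

    # Independent unit-variance noise realisation for every exposure.
    exposures = star[None, :, :] + rng.normal(0.0, 1.0, size=(n_images, ny, nx))
    snapped = np.rint(factor * exposures) / factor   # back in image units
    return exposures.mean(axis=0), snapped.mean(axis=0), (x0, y0)

raw, quantized, (x0, y0) = coadd_snapped(1024, amplitude=0.25, factor=1.0)
print("peak pixel, unsnapped coadd:", raw[y0, x0])
print("peak pixel, snapped coadd:  ", quantized[y0, x0])
print("rms difference between coadds:", np.sqrt(np.mean((raw - quantized) ** 2)))
```
\end{verbatim}

With $\Delta = 1.0$, each snapped exposure contains only integers, yet the snapped and unsnapped coadds differ by far less than the residual noise in either one.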

\section{Discussion}
The methods we propose and examine in this paper would formally be classified as ``lossy image compression techniques,'' but we have shown that no \emph{science} is lost in compressing the data this way.  Our experiment measuring the variance of plain Gaussian noise as we truncate the pixel values at different bit depths gave us confidence that our initial intuition was correct, and also indicated what we should expect from the more complicated experiments that followed.  As shown in fig.~\ref{fig:variance}, even with a minimum representable difference $\Delta = 1.0$, the measured variance is off by less than 10\% from the theoretical value.
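
As a rough consistency check on that number, assume round-to-nearest quantization and resolved noise ($\sigma \gtrsim \Delta$), so that the rounding error is approximately uniform on $[-\Delta/2, \Delta/2]$ and uncorrelated with the noise.  Under those assumptions the standard result for uniform quantization error gives
\begin{equation}
\sigma^2_{\mathrm{meas}} \approx \sigma^2 + \frac{\Delta^2}{12},
\end{equation}
which for $\sigma^2 = 1.0$ and $\Delta = 1.0$ gives $\sigma^2_{\mathrm{meas}} \approx 1.083$, an excess of about 8\%.  This is consistent with the discrepancy of less than 10\% quoted above.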

ADRIAN: Describe results, one at a time, in a big paragraph.

ADRIAN: Re-state the rule of thumb in the introduction.

HOGG: Comment on applications where bandwidth is limited and we might
have done better.  For example, what is the HST bit depth?

\begin{figure}
\includegraphics[width=\textwidth]{twelve-panel.png}
\caption{Starting from top left and moving to bottom right, we show $16\times16$ pixel images of increasing bit depth. The original images are identical but snapped to integer as described in the text. The images are labeled by the ratio of the noise root-variance $\sigma$ to the minimum representable difference $\Delta$. At bit depths $\frac{1}{\Delta} > 2^{0}$, the images become virtually indistinguishable from the high-bandwidth images.\label{fig:twelvepanel}}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{1024ImsNoStarINT_Variance.png}
\caption{Measurement of image noise variance as a function of bit depth $\frac{1}{\Delta}$ for images with a randomly chosen mean level and Gaussian noise with true variance $\sigma^{2} = 1.0$. Each data point has been dithered horizontally to make the distribution visible. Black circles show medians for each value of the multiplicative factor. The variance is well measured as long as the noise root-variance $\sigma$ is at least twice the minimum representable difference $\Delta$. \label{fig:variance}}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{BitsvsOffset_1024ims_flux8.png}
\caption{Plot of measured star offset (astrometric error in pixels) as a function of bit depth. The black circles show the median values. We set an upper bound on the offset of \(2^{0}\), reasoning that anything outside of 2 pixels is essentially infinity. The points are produced by generating 1024 images with noise variance $\sigma^{2} = 1.0$ and a randomly placed Gaussian star with flux = 8.0 and FWHM = 2.35 px. We centroid the star and measure the offset between this position and the known location of the star.\label{fig:bitsoffset8}}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{BitsvsOffset_1024ims_flux256.png}
\caption{Same as fig.~\ref{fig:bitsoffset8} except flux of star = 256.0.\label{fig:bitsoffset256}}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{four-panel.png}
\caption{Four $16\times16$ pixel images that demonstrate the coadding procedure. The top left image shows a single image with noise variance $\sigma^{2} = 1.0$ and a Gaussian star with flux = 8.0 and FWHM = 2.35 px. The top right image is the same as the top left, but with the pixel values snapped to integer at bit depth \(2^{-1}\), or minimum representable difference $\Delta = 2.0$. The bottom left image shows the result of coadding 1024 such images without snapping to integer, whereas the bottom right image shows the result of coadding 1024 such images \emph{after} snapping each individual image to integer values. The similarity of the two coadded images indicates that information has been preserved. \label{fig:fourpanel}}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{prove-coadd.png}
\caption{Four $16\times16$ pixel images showing that coadding a number of images with faint sources will produce a visible source, even for a source flux much lower than the root-variance $\sigma$ of the noise and lower than the minimum representable difference $\Delta$. Same as fig.~\ref{fig:fourpanel} except flux = 0.25. \label{fig:provecoadd}}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{BitsvsOffset_1024ims_flux025_Coadded.png}
\caption{Same as fig.~\ref{fig:bitsoffset8} except for flux = 0.25, and coadding 1024 exposures after snap-to-integer to make the source detectable. \label{fig:bitsoffsetcoadd1}}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{BitsvsOffset_1024ims_flux8_Coadded.png}
\caption{Same as fig.~\ref{fig:bitsoffsetcoadd1} except for flux = 8.0. \label{fig:bitsoffsetcoadd2}}
\end{figure}

\bibliographystyle{amsplain}
\bibliography{refs}
\end{document}