Thursday, October 25, 2012

Histogram widths: Bayesian blocks

One of the frustrating things is getting the histogram bin widths right: it has always been an arbitrary procedure, which can be misleading. Here is an astroML implementation of a rigorous procedure to determine fixed- or flexible-width histogram bins.
The utility of the Bayesian blocks approach goes beyond simple data representation, however: the bins can be shown to be optimal in a quantitative sense, meaning that the histogram becomes a powerful statistical measure.
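To make the idea concrete, here is a rough pure-Python sketch of the dynamic-programming scheme behind Bayesian blocks for a list of point measurements. It is not astroML's code: the fitness function (count * log(count / width)), the fixed penalty of 4 per block and the name `bayesian_blocks` as used here are my assumptions for illustration, and it needs at least two distinct values.

```python
import math

def bayesian_blocks(values, ncp_prior=4.0):
    """Variable-width bin edges via dynamic programming (simplified sketch).

    Assumes at least two values, all distinct. ncp_prior is an assumed
    fixed penalty per block, not a tuned value.
    """
    t = sorted(values)
    n = len(t)
    # Candidate edges: first point, midpoints between neighbours, last point.
    edges = [t[0]] + [0.5 * (a + b) for a, b in zip(t, t[1:])] + [t[-1]]

    best = [0.0] * n   # best total fitness for the first k+1 points
    last = [0] * n     # start index of the final block in that best partition
    for k in range(n):
        best_fit, best_start = -math.inf, 0
        for start in range(k + 1):
            width = edges[k + 1] - edges[start]
            count = k + 1 - start
            fit = count * (math.log(count) - math.log(width)) - ncp_prior
            if start > 0:
                fit += best[start - 1]
            if fit > best_fit:
                best_fit, best_start = fit, start
        best[k] = best_fit
        last[k] = best_start

    # Walk back through the stored block starts to recover the change points.
    change_points = [n]
    ind = n
    while ind > 0:
        ind = last[ind - 1]
        change_points.append(ind)
    change_points.reverse()
    return [edges[i] for i in change_points]

sample = [0.1, 0.2, 0.3, 0.4, 0.5, 5.0, 5.01, 5.02, 5.03, 5.04, 5.05, 9.0, 9.5]
edges = bayesian_blocks(sample)
print(edges)
```

The double loop is O(n^2), so this is only meant for small samples; for real work the library implementation is the better choice.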

Wednesday, October 24, 2012

Photometry: linear regression fitting of sky

Yesterday we agreed that the photometry procedure should have as few arbitrary procedures, constants, etc., as possible. So I'm back to the photometry measurements again... but that's a good thing, as I didn't feel entirely happy about them.
The idea I had this morning on the train was simple -- I don't know what conceptual block prevented me from going this way sooner.
Basically, having a cumulative flux curve (the growth curve) is not generally helpful, as the growth of flux outside the galaxy is non-linear (it depends on the geometry as well as on the sky value). However, if I normalise the flux profile to flux per pixel, it should theoretically flatten far away from the galaxy. The slope of a linear regression fit to this outer profile should show the quality of the sky value, i.e. the level of variations. If the slope is negative and fairly large, we are probably still within the galaxy.
If the slope is reasonably small (here go the arbitrary parameters again..), simply taking a mean of all measurements within the ring would give a reasonably good sky value.
The catch is getting the width of the elliptical ring used for fitting right. (I can get its distance by waving my hands, taking the maximum distance from my previous measurements, multiplying it by pi/2 or something. We're testing for it anyway).
However, the width of this ring is a tradeoff between accuracy and precision. Taking a wider ring would surely help to reduce the scatter due to random noise, arbitrary gradients and so on. However, the chance of catching a (poorly) masked region, some sky artifact, etc. inside this ring also increases.
I tested it a bit using scipy.linalg routines; so far the slope has been below 10^-4 counts.
The growth curve itself is useful as a sky subtraction quality check.
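A minimal sketch of the slope test described above, in plain Python rather than the scipy.linalg routines I actually used. The function names, the input layout (one radius and one flux-per-pixel value per measurement in the ring) and the `max_slope` threshold are assumptions for illustration only.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sky_from_ring(radii, flux_per_pixel, max_slope=1e-4):
    """Return (sky, slope); sky is None if the profile is not flat enough.

    max_slope is an arbitrary, hypothetical threshold.
    """
    slope, _ = fit_line(radii, flux_per_pixel)
    if abs(slope) > max_slope:
        # Probably still inside the galaxy, or a gradient in the ring.
        return None, slope
    return sum(flux_per_pixel) / len(flux_per_pixel), slope

radii = [100, 101, 102, 103, 104]
flat_sky, flat_slope = sky_from_ring(radii, [10.1, 9.9, 10.0, 9.9, 10.1])
steep_sky, steep_slope = sky_from_ring(radii, [12.0, 11.5, 11.0, 10.5, 10.0])
print(flat_sky, flat_slope, steep_sky, steep_slope)
```

The numbers in the usage lines are made up; the second profile still falls with radius, so the function refuses to return a sky value for it.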

Monday, October 22, 2012

awk one-liner

I've been using this for a long time, as most of the data I use still comes in unruly, mis-formatted csv files.
awk 'BEGIN {FS =","}; {print $1, $6, $7, $8, $9}' catalogue.csv > cat.csv
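For the cases where the one-liner is not enough, the same column selection can be sketched with Python's csv module. The function name `select_columns` and its default column list are assumptions mirroring the awk command; unlike awk with FS=",", csv.reader also copes with quoted fields that contain commas.

```python
import csv

def select_columns(in_path, out_path, columns=(1, 6, 7, 8, 9)):
    """Write the given 1-based columns of a csv file, separated by spaces."""
    with open(in_path, newline="") as src, open(out_path, "w") as dst:
        for row in csv.reader(src):
            # Missing fields become empty strings, as they do in awk.
            picked = [row[i - 1] if i <= len(row) else "" for i in columns]
            dst.write(" ".join(picked) + "\n")

# Usage, with the file names from the awk example:
# select_columns("catalogue.csv", "cat.csv")
```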

Thursday, October 18, 2012

SQLite: some average values

For copying and pasting:
SELECT avg(zpt), avg(ext_coeff), avg(airmass) FROM u_tsfieldParams
SELECT z.califa_id, z.z_mag - m.petroMag_z, z.z_mag, m.petroMag_z FROM z_test AS z, mothersample AS m WHERE m.califa_id = z.califa_id

Monday, October 15, 2012

Python: script to zip multiple files by filename, flatten structure

zip: flatten directory structure

zip -j zipfile path/to/file1 /path/to/file2
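The same flattening can be sketched in Python with the standard zipfile module. The helper name `zip_flat` is made up for this example; note that two input files with the same base name would end up under the same name in the archive.

```python
import os
import zipfile

def zip_flat(zip_path, file_paths):
    """Add files to a new zip archive, dropping their directory components."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in file_paths:
            # arcname controls the name stored inside the archive.
            archive.write(path, arcname=os.path.basename(path))

# Usage, with the paths from the zip -j example:
# zip_flat("zipfile.zip", ["path/to/file1", "/path/to/file2"])
```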

SQLITE: query for multiple quantities from several tables

Nothing fancy, just putting it out there for reuse.

SELECT g.califa_id, g.r_mag, g.el_hlma, l.hlr, m.isoA_r, g.ba, g.flags, r.mean_sky, r.gc_sky
FROM gc AS g, gc_r AS r, lucie AS l, mothersample AS m
WHERE (g.califa_id = r.califa_id) AND (g.califa_id = m.califa_id) AND (g.califa_id = l.califa_id)
AND (g.el_hlma > 25)
ORDER BY g.el_hlma DESC

Thursday, October 11, 2012

orientation angles of SDSS images wrt North

I was happy using the position angles of galaxies relative to the SDSS image's y axis, but people in the collaboration needed absolute position angles with respect to North. I used the getRotationDeg() function from astLib's astWCS module in this script. I haven't tested it yet, so I don't know whether the result makes sense.

A wrapper for a wrapper for kcorrect

A little script I cobbled together in order to feed the data to kcorrect, and save its output (absolute ugriz magnitudes and stellar masses, in this case).

SQLITE: useful query: matching two tables by ra, dec

I wanted to cross-match two tables, one of which had IDs, while the other had only ra, dec coordinates. The second one had extinction for all SDSS bands, and it's a pain to go and do an SDSS query for a list of objects again. So:

SELECT m.CALIFA_ID, round(m.ra, 5), round(m.dec, 5), round(s.ra, 5), round(s.dec, 5)
FROM mothersample AS m, sdss_match AS s
WHERE (round(m.ra, 5) = round(s.ra, 5) AND round(m.dec, 5) = round(s.dec, 5))
ORDER BY CALIFA_ID ASC

Or rather, the actually useful query:

SELECT m.rowid, m.CALIFA_ID, round(m.ra, 5), round(m.dec, 5),
       s.petroMagErr_u, s.petroMagErr_g, s.petroMagErr_r, s.petroMagErr_i, s.petroMagErr_z,
       s.extinction_u, s.extinction_g, s.extinction_r, s.extinction_i, s.extinction_z
FROM mothersample AS m, sdss_match AS s
WHERE (round(m.ra, 5) = round(s.ra, 5) AND round(m.dec, 5) = round(s.dec, 5))
ORDER BY m.rowid
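One caveat of matching on round(ra, 5): two positions that differ by far less than 1e-5 degrees can still round to different values and be missed. A possible alternative is to match within a tolerance instead. This is only a sketch run through Python's sqlite3 module; the function name, the tolerance and the reduced column list are assumptions, and the table and column names are the ones from the queries above.

```python
import sqlite3

def match_by_position(conn, tol=1e-5):
    """Match mothersample to sdss_match rows lying within tol degrees in ra and dec."""
    query = """
        SELECT m.CALIFA_ID, s.extinction_r
        FROM mothersample AS m
        JOIN sdss_match AS s
          ON abs(m.ra - s.ra) < ? AND abs(m.dec - s.dec) < ?
        ORDER BY m.CALIFA_ID
    """
    return conn.execute(query, (tol, tol)).fetchall()

# Tiny in-memory demonstration with made-up coordinates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mothersample (CALIFA_ID INTEGER, ra REAL, dec REAL)")
conn.execute("CREATE TABLE sdss_match (ra REAL, dec REAL, extinction_r REAL)")
conn.executemany("INSERT INTO mothersample VALUES (?, ?, ?)",
                 [(1, 10.123454, 20.5), (2, 50.0, 5.0)])
conn.executemany("INSERT INTO sdss_match VALUES (?, ?, ?)",
                 [(10.123456, 20.5, 0.1), (51.0, 5.0, 0.2)])
matches = match_by_position(conn)
print(matches)
```

The first pair differs by 2e-6 degrees in ra but rounds to 10.12345 and 10.12346, so the rounding query would skip it, while the tolerance match keeps it. The tolerance is applied to ra and dec separately, with no cos(dec) correction, so it is a box rather than a true angular distance.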

NED positional query -- a script to export ra, dec from database

It's a general-purpose script I quickly put together to export entries from the database to csv. For now it does this the clumsy way, as I have to use the NED Batch Jobs system. https://github.com/astrolitterbox/growth-curve-photometry/blob/noSciPy/get_califa_values.py

Wednesday, October 10, 2012

SQLite: field names and leading space

Just spent a few minutes, frustrated by
sqlite3.OperationalError: no such column: el_hlma
The column name had a leading space before it; that's why the query failed.
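A quick way to spot this kind of problem is to list the column names SQLite actually stored and flag any with stray whitespace. A small sketch using PRAGMA table_info; the helper name and the demo table are made up for illustration.

```python
import sqlite3

def suspicious_columns(conn, table):
    """Return column names of `table` with leading or trailing whitespace."""
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in rows if row[1] != row[1].strip()]

# Demo: a table whose second column name starts with a space.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE gc (califa_id INTEGER, " el_hlma" REAL)')
bad_names = suspicious_columns(conn, "gc")
print(bad_names)
```

Such a column can still be queried by quoting the name exactly, space included, but renaming it is the saner fix.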

Merging two csv files -- replacing some lines with those from the second

I had to redo the photometry for some galaxies, and afterwards I was left with two csv files. Some of the entries in the first one had to be replaced with those from the second. I did it with vim, first, because I like it, but I'm not a typist. I had it easy, because the entries started with galaxy IDs, which in a way correspond to their line numbers.
https://github.com/astrolitterbox/growth-curve-photometry/blob/noSciPy/replace_lines.py
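The core of such a replacement can be sketched in a few lines, keyed on the galaxy ID in the first column. This is not the linked script, just an illustration of the idea; the function name and the example rows are invented.

```python
import csv

def replace_rows(original_rows, new_rows, key=0):
    """Return original_rows with any row replaced by the new row sharing its key."""
    replacements = {row[key]: row for row in new_rows}
    return [replacements.get(row[key], row) for row in original_rows]

def read_rows(path):
    """Read a csv file into a list of rows (lists of strings)."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))

# Made-up example: the photometry for galaxy 2 was redone.
old = [["1", "14.2"], ["2", "15.0"], ["3", "13.7"]]
redone = [["2", "14.8"]]
merged = replace_rows(old, redone)
print(merged)
```

Rows in the second file whose IDs do not occur in the first are ignored, which is what I wanted here.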

Tuesday, October 9, 2012

NumPy: around() and a way around it

http://stackoverflow.com/questions/11975203/better-rounding-in-pythons-numpy-around-rounding-numpy-arrays

Sunday, October 7, 2012

SDSS CASjobs -- looking for objects within a given radius of a coordinate pair

There's a built-in function for that, dbo.fGetNearbyObjEq(ra, dec, radius), with the radius given in arcminutes:

SELECT p.ra, p.dec, p.b, p.l, p.objID, p.run, p.rerun, p.camcol, p.field, p.obj, p.type, p.flags,
       p.fiberMag_r, p.petroMag_u, p.petroMag_g, p.petroMag_r, p.petroMag_i, p.petroMag_z,
       p.petroRad_r, p.petroR50_r, p.petroR90_r, p.isoA_g, p.isoB_g, p.isoA_r, p.isoB_r, p.isoPhi_r, p.specObjID
INTO mydb.MiceB
FROM PhotoObjAll AS p, dbo.fGetNearbyObjEq(191.5, 30.7, 1.0) AS n
WHERE p.objID = n.objID
      AND (p.flags & (dbo.fPhotoFlags('NOPETRO') +
                      dbo.fPhotoFlags('MANYPETRO') +
                      dbo.fPhotoFlags('TOO_FEW_GOOD_DETECTIONS'))) = 0