Astronomers Need You

Today, we know what stars are, we know how they generate light, and we understand why individual stars are different from one another, largely because of the work of amateur astronomers and underpaid labourers who worked to create an extensive catalogue of stellar spectra in the late nineteenth and early twentieth centuries. Their story provides an important perspective on the ways that interested citizen scientists can contribute to our understanding of the universe through astronomical research.

Amateur astronomers Henry and Mary Draper took the first photographs of stellar spectra in 1872, and they continued photographing stellar spectra to study the stars’ chemical properties until Henry’s death in 1882. A few years later, Mary Draper donated her astronomical equipment to Harvard College Observatory, along with the funds needed to create an endowment in honour of her husband. The endowment supported the continuation of the work she and Henry had begun: photographing and classifying stellar spectra. From 1885 until his death in 1919, production of the Henry Draper Catalogue was overseen by Edward Pickering, a friend of the Drapers and Director of Harvard College Observatory.

Data production for the catalogue was so prolific that it was difficult to keep up with analysis. In part due to financial constraints and the need to hire a sufficient number of skilled workers to stay on top of the analysis, and through Mary Draper’s influence, Pickering hired several female “Computers” to do the work of measuring and calculating properties of the stars that were observed, and to develop a classification scheme to organise the catalogue.

Publication of the Henry Draper Catalogue, with spectral classifications for 225,300 stars, was completed in 1924. In the course of their analysis, the Harvard Computers made many important discoveries. One was the existence of white dwarfs, which we now understand to be the cores of dead low-mass stars and the ultimate fate of our Sun. Another was the period-luminosity relation of Cepheid variable stars. This relation was instrumental in Edwin Hubble’s determination that the “spiral nebulae” are all distant galaxies like our own, in a universe that extends far beyond our Milky Way galaxy. It was equally important to Hubble’s later discovery that the universe is in fact expanding, with the galaxies being carried ever further apart from each other.

The stellar classification system developed by the Harvard Computers remains the classification system used by astronomers today. In fact, though they didn’t know the reason at the time, their classification sequence empirically ordered stellar spectra according to the stars’ surface temperatures. This was instrumental in the work of Cecilia Payne-Gaposchkin, published in her 1925 PhD thesis, which showed that the differing strengths of the spectral lines present in stellar spectra depend primarily on surface temperature. That result in turn enabled her to demonstrate that all stars are composed of roughly three-quarters hydrogen, one-quarter helium, and at most a few percent other, heavier elements. Her work then led to the discovery that main sequence stars generate light by fusing hydrogen into helium in their cores.

Today, astronomy is in a state that’s actually quite similar to where it was in the late nineteenth century. While we think we have a grasp on broad classes of objects like open and globular star clusters, eclipsing binary star systems and pulsating variable stars, our catalogues of these objects are largely a mess, and even the broad classes assigned to individual targets are apt to change completely from one analysis to the next. Furthermore, specific phenomena like the O’Connell effect in binary systems, and the Blazhko effect and multi-modal pulsation mechanisms in RR Lyrae variables, are poorly understood, and likely will not be properly understood without more accurate catalogues to draw from for astrophysical analysis.

Our models of stellar evolution have similar gaps. Our isochrone models are fairly accurate in the upper part of the main sequence, but the red dwarfs at the dim end and the more evolved stars with helium-burning cores are not as well captured. Before we can claim victory with these models, we first need more accurate catalogues of star clusters. We are still in the early stages of eliminating asterisms that are not clusters at all, but merely appear from our perspective on Earth as groupings of stars, much as the constellations do.

Our insufficient understanding of all the relevant physical effects is becoming more apparent all the time, as we run data mining operations and automated analyses to sort and classify targets, and the results frequently fail subsequent human inspection. This problem is only getting worse. Several new survey missions will see deeper and produce time series data several orders of magnitude more abundant than the data sets we are already struggling to analyse. Applying our current data mining techniques and classification algorithms to these much larger data sets is going to produce an even bigger mess than the one we currently have.

And these are just the things we know we don’t understand very well. The likely reason these analyses are so difficult to carry out algorithmically is that the procedures involve multiple steps, and an error in an early step undermines everything that follows. Even when a real star cluster has been identified, the field stars, which typically make up the majority of stars in the field, must be removed before fitting an isochrone model, so that the analysis is performed on the cluster stars only. The shape of a periodic variable star’s phased light curve can tell us a lot about the physical properties of the system, but only once the star’s period has been accurately determined: an inaccurate determination can cause an eclipsing binary to appear as a pulsating variable star, or vice versa. Often, measurement uncertainty leads to inaccuracy in algorithmic determinations of, say, cluster membership or variable star period, so that the data sets we fit our physical models to are already incorrect.
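To make the point about periods concrete, here is a minimal sketch of phase folding, assuming a made-up eclipsing binary with a deep primary and a shallow secondary eclipse. The light curve, the 2-day period, the sampling and every function name here are hypothetical choices for illustration, not part of any particular survey pipeline. Folding the same observations at the true period and at half of it shows how a wrong period changes the apparent shape of the light curve.

```python
def phase_fold(times, period, epoch=0.0):
    """Return the phase in [0, 1) of each observation for a trial period."""
    return [((t - epoch) / period) % 1.0 for t in times]


def toy_eclipsing_binary(t, period):
    """Hypothetical light curve: flux 1.0 with one deep and one shallow eclipse."""
    phase = (t / period) % 1.0
    if phase < 0.05 or phase > 0.95:   # primary eclipse, centred on phase 0
        return 0.5
    if 0.45 < phase < 0.55:            # secondary eclipse, centred on phase 0.5
        return 0.8
    return 1.0


def binned_curve(phases, fluxes, n_bins=20):
    """Mean flux in equal-width phase bins (None for an empty bin)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for p, f in zip(phases, fluxes):
        i = min(int(p * n_bins), n_bins - 1)
        sums[i] += f
        counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def count_dips(curve, threshold=0.95):
    """Count separate runs of faint bins, treating phase as cyclic."""
    faint = [c is not None and c < threshold for c in curve]
    # A dip starts wherever a faint bin follows a non-faint one;
    # index -1 wraps around, so a dip spanning phase 1 -> 0 counts once.
    return sum(1 for i in range(len(faint)) if faint[i] and not faint[i - 1])


TRUE_PERIOD = 2.0                             # days (assumed)
times = [i * 0.137 for i in range(2000)]      # assumed observation times, in days
fluxes = [toy_eclipsing_binary(t, TRUE_PERIOD) for t in times]

dips_true = count_dips(binned_curve(phase_fold(times, TRUE_PERIOD), fluxes))
dips_half = count_dips(binned_curve(phase_fold(times, TRUE_PERIOD / 2), fluxes))

print("dips per cycle at the true period:", dips_true)
print("dips per cycle at half the period:", dips_half)
```

At the true period the folded curve shows two separate dips of different depths, the signature of an eclipsing binary. At half the period, both eclipses land on the same phase and merge into a single dip per cycle, which is the kind of shape that can be mistaken for a different class of variable star. Real data add noise and gaps on top of this, which is where human judgement helps.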

Typically, trained humans do a much better job of these multistep analyses than our current best automated algorithms. This is why astronomers need the assistance of the interested public: teachers, students and school groups, retirees, hobbyists and amateurs of all stripes, and anyone else wanting to learn how to analyse unique, uncurated, publicly available data sets and contribute to improving our catalogues of astronomical data. Better catalogues will lead to subsequent discoveries like those made by Cecilia Payne-Gaposchkin and Edwin Hubble, following the work of the Drapers and the Harvard Computers. And surely new phenomena, like white dwarfs and the Cepheid period-luminosity relation in their day, are still waiting to be discovered as well.

Please follow this link to get started!