Pulsars and neutron stars/Accessing and processing pulsar data sets

Introduction

Many pulsar data sets are now available for public access. These include raw data files from telescopes as well as processed data files. These data files are made available for general use, but do consider the following issues before relying on them:

  • For many of the raw observations from radio telescopes, information on the quality of the data has been lost. For instance, issues may have occurred during the observation that are not recorded with the data.
  • Many people have spent a long time carrying out the observations. Please ensure that, where possible, you credit the observers.
  • Old data formats may not be understood by modern software packages.
  • You may spend a long time processing a public data set only to find that somebody else publishes the work before you do.

Data sets relating to radio telescopes

Raw data

Raw data files are those produced by the backend instruments during the observations. They usually require some form of processing (such as RFI mitigation or calibration).

Parkes telescope

Apart from a few exceptions (decided by the director), data from the Parkes radio telescope become publicly available 18 months after the time of the observation. In early 2010, the ANDS-CSIRO-ATNF Pulsar Data Management Project was funded through the Australian National Data Service (ANDS) to establish a data archive for pulsar radio astronomy data. The project was officially completed in March 2011, but work is still ongoing to retrieve old data sets for inclusion in the archive. The archive also includes recent observations with the telescope.

The data are stored in the archive as collections. Each collection consists of one semester of data for a given observing project. For instance, the collection

P855-2013OCTS

contains all the data for the P855 project obtained during the 2013 October semester. The project code can be identified through the OPAL website. The following projects are of general interest:

  • P140: Precision pulsar timing of a small number of millisecond pulsars (superseded by the Parkes Pulsar Timing Array project)
  • P262: An early pulsar timing program for young pulsars
  • P282: Timing and searching for pulsars in the globular cluster 47 Tucanae
  • P417: A study of intermittent pulsars
  • P455: Timing of the double pulsar
  • P456: The Parkes Pulsar Timing Array project
  • P574: Pulsar timing and the GLAST/Fermi mission
  • P595: The PULSE@Parkes project data
  • P630: The High Time Resolution Universe survey (HTRU)

Each observation can be recorded with multiple backend instruments, and each backend instrument can record one or more files per observation. Although each backend produces files in its own native format, all files are converted to the PSRFITS format for archiving. The file extension indicates the type of observation:

  • .cf: Calibration file
  • .sf: Search-mode file
  • .rf: Fold-mode file
  • .FTp: Files that have been processed to sum in frequency, time and polarisation
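
Because the archived files follow the PSRFITS definition (a FITS-based format), their headers can be inspected with general-purpose FITS tools. The sketch below uses Python with astropy and a hypothetical file name; in the PSRFITS definition, the OBS_MODE keyword in the primary header (PSR, SEARCH or CAL) carries the same information as the file extension.

 from astropy.io import fits

 # Hypothetical file name following the archive conventions:
 # 's' = PDFB3 backend, '.rf' = fold-mode observation.
 filename = "s091110_123456.rf"

 with fits.open(filename) as hdus:
     primary = hdus[0].header
     # OBS_MODE is PSR, SEARCH or CAL, matching the .rf, .sf and .cf extensions.
     print(primary["OBS_MODE"], primary.get("SRC_NAME"), primary.get("TELESCOP"))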

The first character in the filename indicates the backend that was used to record the data:

  • a: The Parkes Digital Filterbank system 1 (PDFB1)
  • f: The analogue filterbank system
  • n: CPSR2 band 1
  • m: CPSR2 band 2
  • r: The Parkes Digital Filterbank system 2 (PDFB2)
  • s: The Parkes Digital Filterbank system 3 (PDFB3)
  • t: The Parkes Digital Filterbank system 4 (PDFB4)
  • w: The wideband correlator
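
The extension and first-character conventions above can be combined into a simple look-up. The helper below is a hypothetical sketch based only on the mappings listed here, with an invented file name used for the example.

 # Hypothetical helper based only on the archive naming conventions above.
 BACKENDS = {
     "a": "PDFB1", "f": "analogue filterbank", "n": "CPSR2 band 1",
     "m": "CPSR2 band 2", "r": "PDFB2", "s": "PDFB3", "t": "PDFB4",
     "w": "wideband correlator",
 }
 OBS_TYPES = {".cf": "calibration", ".sf": "search mode", ".rf": "fold mode"}

 def describe_file(name):
     backend = BACKENDS.get(name[0], "unknown backend")
     extension = name[name.rfind("."):]
     obs_type = OBS_TYPES.get(extension, "processed/other")
     return f"{name}: {obs_type} data recorded with {backend}"

 print(describe_file("t091110_123456.sf"))  # search mode data recorded with PDFB4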

An entire collection can be downloaded (provided the observations are out of their embargo period) by searching for that collection on the CSIRO DAP website. More commonly, a search is carried out for observations of a particular pulsar or at a particular sky position. Currently the data are downloaded via web tools, which restricts the amount of data that can viably be retrieved.

Processed data

The following processed data collections are available; each entry gives the collection name, a description and the credit:

  • Data archive from the Long Wavelength Array: Data from the Long Wavelength Array. Credit: Stovall et al. (2015)
  • PPTA pulsar data set from Reardon et al. (2015): Processing of the Parkes Pulsar Timing Array data set as published by Reardon et al. (2015). The collection contains pulsar timing models, arrival times and noise models. Credit: Reardon et al. (2015)
  • Madison et al. data set for gravitational wave memory search: The data sets produced for Madison et al. (2015), a paper describing versatile directional searches for gravitational waves with pulsar timing arrays, making use of actual Parkes Pulsar Timing Array (PPTA) data as well as simulated data sets. Credit: Madison et al. (2015)
  • NANOGrav nine-year data set: TEMPO- and TEMPO2-compatible timing data sets from NANOGrav. Credit: Arzoumanian et al. (2015)
  • PPTA pulse profiles: Pulse profiles processed using Parkes Pulsar Timing Array data (as stored in the DAP). The results are high signal-to-noise ratio profiles for 24 millisecond pulsars in three observing bands. Credit: Dai et al. (2015)
  • IPTA data challenge 1: Simulated pulsar timing data sets for the first IPTA data challenge. Credit: F. Jenet, M. Keith, K. J. Lee
  • The Parkes Pulsar Timing Array data release 1: Data from the Parkes Pulsar Timing Array (PPTA) project, which uses observations of radio pulsars with the CSIRO Parkes radio telescope. This is the first data release from the PPTA project and contains pulse arrival times and timing models for 20 millisecond pulsars. Credit: Manchester et al. (2013)
  • NANOGrav five-year data set: The NANOGrav 5-year pulsar timing data set containing TEMPO and TEMPO2 format files. Credit: Demorest et al. (2012)

Data from space-based telescopes

Catalogues

The following catalogues are of interest to pulsar astronomers:

  • The ATNF pulsar catalogue: Parameters for all published pulsars
  • ATNF glitch catalogue: Glitch database
  • Jodrell Bank glitch catalogue: Glitch database
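
One way to query the ATNF pulsar catalogue from a script is the third-party psrqpy Python package. The sketch below assumes psrqpy is installed and that the parameter names used (JNAME, F0, DM) exist in the catalogue version being queried.

 from psrqpy import QueryATNF

 # Query the ATNF pulsar catalogue for name, spin frequency and dispersion measure.
 query = QueryATNF(params=["JNAME", "F0", "DM"])
 table = query.table  # an astropy Table of the requested parameters
 print(len(table), "pulsars retrieved")
 print(table[:5])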

Using the Virtual Observatory

Virtual observatory (VO) tools provide access to catalogues and databases across many areas of astronomy. To date, very little use of the VO has been made by the pulsar community, but that may change with new large data sets, new catalogues and multi-wavelength research projects. Data sets and catalogues are advertised through virtual observatory registries; the ATNF publishing registry provides information about the Parkes data archive.

Table Access Protocol (TAP) queries

Let's assume that we wish to identify all the observations available in the Parkes data archive that could also be observed with the FAST telescope. FAST has a declination limit of -16 degrees, so we need to identify observations at higher declinations. The TOPCAT utility is ideal for such queries. After downloading and running TOPCAT, open the VO menu and select "Table Access Protocol (TAP) Query". It is then necessary to find the correct service: type "ATNF pulsar" into the Keywords box and click "Submit Query", which identifies the ATNF Publishing Registry. It is then possible to select "Enter Query". Currently ADQL is the available query language, and the relevant query is:

SELECT projid, filename, date_obs, ra_angle, dec_angle, collection_fedora_PID FROM observation WHERE dec_angle > -16 AND PUBLISHED = 1

Submitting this query returns ...
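
The same selection can also be made from a script with a generic TAP client such as the pyvo Python package. The sketch below assumes pyvo is installed; the service URL is a placeholder and should be replaced with the TAP endpoint advertised by the ATNF publishing registry (as found through TOPCAT or a VO registry search).

 from pyvo.dal import TAPService

 # Placeholder endpoint: substitute the TAP service URL advertised by the
 # ATNF publishing registry.
 service = TAPService("https://example.org/atnf/tap")

 # The same ADQL selection as above: archived observations north of
 # FAST's declination limit that are out of their embargo period.
 adql = (
     "SELECT projid, filename, date_obs, ra_angle, dec_angle, collection_fedora_PID "
     "FROM observation WHERE dec_angle > -16 AND PUBLISHED = 1"
 )

 results = service.search(adql)
 print(results.to_table()[:10])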

Processing using Virtual Machines and cloud computing

Currently each pulsar group around the world maintains its own computer systems, installs the various software packages, copies data sets of interest to its own disks and then carries out its analysis. Pulsar astronomers therefore spend significant amounts of time on data transfer, on software installation and on searching for funding to purchase more powerful machines.

Virtual machines provide a way in which software packages and pipelines can be pre-installed and then run on any system. For instance, a virtual machine running pulsar software packages under the Linux operating system can be run on a Windows laptop.

It is also possible to run pipelines and software on virtual machines hosted elsewhere. The user may not even know where the computers or data physically are.