Downloading the Data
We’ve rolled out a new feature to the site: you can now download the lightcurve data directly from Planet Hunters. Once you’ve classified a star and submitted the transits, the download data button will appear. It is available for every star on its source page (e.g. http://www.planethunters.org/sources/SPH10067557) as well as from your My Stars page (http://www.planethunters.org/profile), where you can download the data for all the stars you’ve classified. We’ve now paginated the My Stars page, so all your favorites and all the stars you’ve classified should be listed.
The file is in CSV (Comma-Separated Values) format, which can be opened directly or imported into Excel, Numbers, or the OpenOffice equivalent, where you can then plot and manipulate the data. We provide additional info about the star properties, including infrared color, specific gravity, right ascension and declination, and Kepler IDs. We also identify whether the star is a Simulation (simulated transit lightcurve), a Kepler Planet Candidate (i.e. a Kepler Favorite, a star that the Kepler team believes has a transiting planet but has not confirmed with follow-up observations), or a Source (real Kepler lightcurve). For the simulated lightcurves, the CSV file provides the planet radius in Earth radii and the orbital period in days for the injected transit signal (assuming the given radius of the star).
The CSV file also contains three columns of data labeled time (days), brightness, and error in brightness. The brightness values are the brightness of the star measured by Kepler at each observation, corrected for instrumental effects and systematic errors by the Kepler Team’s data processing pipeline. The error in brightness is the +/- uncertainty on the reported brightness measurement. For convenience, we’ve normalized the brightness values by dividing what we get from the Kepler public release data by a constant value, so it’s easier to measure relative changes in the brightness of the star. This only rescales the y-axis of our plotted lightcurves and doesn’t change the relative depths of any transits. For more specifics about the data, see the Corrected Light Curves section of http://keplergo.arc.nasa.gov/DataAnalysisProducts.shtml.
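If you’d like to convince yourself that dividing by a constant leaves the transit depths alone, here is a minimal Python sketch. The brightness numbers, the in-transit mask, and the function name relative_depth are all made up for illustration; they are not taken from any Planet Hunters file, and this is not the science team’s own code.

```python
def relative_depth(brightness, in_transit):
    """Fractional transit depth: 1 - mean(in transit) / mean(out of transit)."""
    inside = [b for b, t in zip(brightness, in_transit) if t]
    outside = [b for b, t in zip(brightness, in_transit) if not t]
    baseline = sum(outside) / len(outside)
    return 1.0 - (sum(inside) / len(inside)) / baseline

# Hypothetical raw brightness values with a dip in the middle (illustrative only).
raw = [10000.0, 10000.0, 9900.0, 9900.0, 10000.0, 10000.0]
mask = [False, False, True, True, False, False]

# Normalize by a constant, as described above (the constant here is an assumption).
constant = 10000.0
normalized = [b / constant for b in raw]

depth_raw = relative_depth(raw, mask)
depth_normalized = relative_depth(normalized, mask)
print(depth_raw, depth_normalized)
```

Both calls give the same fractional depth, because the constant cancels out when the in-transit brightness is divided by the out-of-transit baseline.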
Sometimes there’s a missing data point in the lightcurves the Kepler Team has released. These missing data points indicate a “no data” condition, where the observation has been compromised by spacecraft operations or other anomalies that affect the quality of the measurements (examples might be the spacecraft entering safe mode, or possibly a glitch in the electronics that read out the flux measurements for that star). To indicate those data points, we’ve set the brightness value to zero in the CSV file.
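If you analyze the file yourself, you’ll probably want to drop those zero-brightness rows first so they don’t look like enormous dips. Below is a rough Python sketch of one way to do that. It assumes the three data columns appear in the order time, brightness, error, and it simply skips any row that doesn’t parse as three numbers (headers, star properties, and so on). The sample text and the function name read_valid_points are invented for illustration, and the real file layout may differ.

```python
import csv
import io

def read_valid_points(csv_text):
    """Return (time, brightness, error) tuples, skipping 'no data' rows."""
    points = []
    for row in csv.reader(io.StringIO(csv_text)):
        try:
            # Assumed column order: time (days), brightness, error in brightness.
            time, brightness, error = (float(x) for x in row[:3])
        except ValueError:
            continue  # header, blank, or otherwise non-numeric row
        if brightness == 0.0:
            continue  # brightness of zero marks a missing observation
        points.append((time, brightness, error))
    return points

# Made-up sample standing in for a downloaded file (not real Kepler data).
sample = (
    "time,brightness,error\n"
    "0.02,1.0001,0.0002\n"
    "0.04,0,0\n"
    "0.06,0.9998,0.0002\n"
)

points = read_valid_points(sample)
print(points)
```

To use it on a real download, you could read the file’s contents into a string and pass that to read_valid_points, after checking that the column order matches your file.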
Is it possible to provide the data before the process of flagging (or not flagging) a star? Sometimes I would like to do some simple statistics (like a t-test) just to find out if a dip could be significant or if it’s just noise.
Hi Trenton1979, The reason we don’t offer the data download before you classify is that we want everyone to look at the lightcurves the same way. We combine the classifications of multiple users for each lightcurve, so it would change your behavior if you could run a t-test first, and it would be hard to compare your results to those of other users who don’t do that. We would have no way of knowing, since all we would know is that you downloaded the file. So to keep everyone looking at the data the same way, we made the download data feature appear after transit classification. After that you can do your own analysis and share the results with us on the science team and with other users on Talk.
Thx for the reply. I understand and accept this issue. However, I think it’s unrealistic to believe that everyone looks at the data in the same way. People take it more or less seriously. Others could have a scientific background, which might help them interpret the data better. And people are more or less visually gifted.
Hi Trenton, Yes we understand that and for the most part it evens itself out – but we try and control the variables we can control ~Meg
Is there some way we can download the “winners” so they look just like they did when we were classifying them?
Hi JG, we currently don’t have a way for you to download images of the lightcurves plotted exactly as they appear during classifying. But for any of our candidates there is a view star link that takes you to the star’s source page, which shows the lightcurve plots in the same scheme as we show during classifying. For any of those stars you can also download the data and plot it yourself.
I’ve downloaded many of the curves I’ve analyzed looking for transits, with the idea of loading the data into my own database and performing some analysis just for fun.
I’ve noticed some issues that I would like to resolve:
* First of all, the stars are sometimes named with an “S” and sometimes with an “A” at the beginning of their names. This name does not match the .csv file name. What is the naming convention?
* Some data series contain about 1600 measurements taken according to the “Kepler long cadence”, but some others contain about 4800 measurements taken at a shorter interval that is not the “Kepler short cadence”. Why are these curves sampled in that way?
* And finally, I’ve realized that some of the data files don’t match the light curves on the web; maybe there was a mistake when providing them to the users. If needed I can post a sample.
Thank you very much for your attention
This page basically explains that “error in brightness” is “error in brightness”. It’s redundant. Quote: “The error in brightness is simply +/- error in the reported brightness measurement.”
“Error in brightness” is “error in brightness” (essentially). What does that mean? If one wants to try to figure out a few things on their own, how is one to use the “Error in brightness” column? Subtract it from the “Brightness” value? Or does the given brightness value already have the “error in brightness” value subtracted from it, so that the original, raw observation is the combination of the two?
Are there any plans to allow download of the data without being logged in?
Is it possible to download the whole database?
I see the whole process has a little “shot in the dark” characteristic.
I mean, I would love to contribute, but I am left with data you chose to discriminate in some way.
Wouldn’t it be more productive to have as many people as possible have the same tools?
I understand some would prefer to have things the way they are.
I have no problem with that.
But what about the people who would like to have it all?
A good application of this would be to analyze all the stars found so far by the players and to use this knowledge to try to improve on the algorithm.
I understand you guys are good at algorithms.
But, you know what they say, the more, the merrier.
We don’t have the capability to let people download all the CSV files. All the Kepler data is publicly available for download, as FITS binary tables, from NASA’s Multimission Archive at STScI: http://archive.stsci.edu/kepler/