Assessing the Kepler Inventory of Short Period Planets

You might remember that I’ve been working on a systematic search of the Q1 light curves to examine the frequency of large planets (> 2 R⊕, where R⊕ is one Earth radius) on orbits shorter than 15 days. I’m happy to announce that my paper, titled “Planet Hunters: Assessing the Kepler Inventory of Short Period Planets”, has just been accepted to the Astrophysical Journal. The paper is available online here if you’d like to read it (warning: it’s quite long, coming in at 22 pages of single-spaced text, 13 figures, and 8 tables!), but I’ll give the highlights below.

We wanted to see how well we could find planets in the Q1 light curves, and what might be left to find there compared to the known Kepler sample of planets. I think this is important because Planet Hunters can serve as a separate estimate of the planet abundance and of the Kepler detection efficiency. I decided to concentrate first on planets with periods less than 15 days, so that I was certain there would be at least two transits visible in the Q1 light curve. I thought it might be harder for us to identify transits if there was only one dip, so it seemed like a good idea to start where there were at least two transits.

To figure out which of the light curves had transits, I developed an algorithm to combine the multiple classifications for each light curve (for Q1, on average 10 people classified each ~33 day Kepler light curve) using a weighting scheme based on the majority vote. What the weights are really doing is helping me pay a bit more attention to the classifiers who are a bit more sensitive at finding transits when I combine the results from everyone who classified that light curve. The weighting scheme makes us more sensitive to transits than if I just took the majority vote for each light curve, and it helps to decrease the false positives. Below is the distribution of user weights for Q1 classifiers.

Distribution of Q1 Planet Hunters user weights binned in 0.5 bins

Using the user weights, I am able to give each light curve a ‘transit’ score (the sum of the weights of the users who marked a transit box divided by the sum of the weights of everyone who classified the light curve). To narrow the list from 150,000 light curves, I picked those light curves that had ‘transit’ scores greater than 0.5 as my initial list of candidates. I applied several additional cuts to whittle down the list (you can read all about those details in the paper). That left about 3000 light curves and approximately 4000 simulations to go through. To identify those that had at least two transits in them, we turned to a second round of review, where light curves were presented in a separate interface and volunteers were asked whether they could see at least two transits in the light curve (regardless of whether the depths were the same), answering ‘yes’, ‘no’ or ‘maybe’. Those light curves where the majority of classifiers said ‘yes’ were moved on to review by the science team. A big thank you to everyone who helped out with the Round 2 review; your efforts are acknowledged here. As always, we acknowledge all those who contribute to Planet Hunters science on our authors page.
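
As a rough illustration of the ‘transit’ score described above (this is a minimal sketch, not the actual pipeline code from the paper), here is how the weighted fraction could be computed in Python. The function name `transit_score`, the data layout, the example weights, and the default weight of 1.0 for users without a stored weight are all assumptions made for the example.

```python
def transit_score(classifications, user_weights, default_weight=1.0):
    """Weighted fraction of classifiers who marked a transit.

    classifications: list of (user_id, marked_transit) pairs for ONE light curve.
    user_weights:    dict mapping user_id -> weight (hypothetical values).
    default_weight:  assumed weight for users missing from user_weights.
    """
    total = 0.0   # sum of weights of everyone who classified the light curve
    marked = 0.0  # sum of weights of those who drew a transit box
    for user_id, marked_transit in classifications:
        weight = user_weights.get(user_id, default_weight)
        total += weight
        if marked_transit:
            marked += weight
    return marked / total if total > 0 else 0.0


# Toy example with made-up users and weights.
weights = {"alice": 2.0, "bob": 0.5, "carol": 1.0}
votes = [("alice", True), ("bob", False), ("carol", True), ("dave", False)]

score = transit_score(votes, weights)
is_candidate = score > 0.5  # the initial cut used in the post
print(round(score, 3), is_candidate)
```

With these toy numbers a plain majority vote would be a 2-2 tie, but the weighted score comes out above 0.5 because the two users who marked a transit carry more weight.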

At the end of the search, after removing all the planet candidates and transit false positives known before February 2012, there were 7 light curves that have transit-like events but were not on the original Kepler candidate list published back in 2011, which used only Quarters 1 and 2. I show example transits from each of these 7 light curves in the figure below. One of these light curves turns out to be one of the candidates from our first paper, and another was part of our co-discoveries with the Kepler team. Even though these 7 light curves weren’t found in the first Kepler candidate releases, they have now been found in the latest iteration of the Kepler candidate list released earlier this year, for which the Kepler team used updated and improved versions of their detection and data validation pipelines. So what that shows is that the Kepler detection and validation processes have indeed gotten better, but there’s more that we can say.

Zoom-in of selected transits for each set of transits identified in the short period candidate light curves remaining after Round 2 review and visual inspection. Based on the user-drawn boxes, the science team could visually identify two separate sets of repeating transits in the multi-planet systems KIC 8240797, 9729691, and 11551692. We note that the snapshot of KIC 8240797 contains two independent transit events.

Now that we know what new things we found, and that there wasn’t anything beyond the 7 candidates that are now KOIs on the latest Kepler candidate list, we can look at what that says about the completeness of the short period planet inventory. Using the simulations that you’ve helped classify, I was able to look at how good Planet Hunters is at detecting planets of different sizes on orbits shorter than 15 days. I randomly selected about 7000 light curves that at the time weren’t known to have transiting planets and were not eclipsing binaries, and injected synthetic transits into them for a range of planet radii (2 to 15 R⊕) and periods less than 15 days. The simulations are really important because, once they were classified, I could see which of them made it to the end of my candidate pipeline and which ones didn’t. The results from those classifications really form the heart of the paper, because they let us show what we were sensitive to, independent of the Kepler planet candidates and the Kepler detection and validation processes.
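
To give a feel for what injecting a synthetic transit means, here is a deliberately simplified sketch in Python. It is not the method used in the paper: it assumes a flat-bottomed, box-shaped dip with depth (Rp/R★)², ignores limb darkening and ingress/egress, and the function name, parameters, and example values are all hypothetical.

```python
# Approximate ratio of Earth's radius to the Sun's radius.
R_EARTH_IN_R_SUN = 0.009168


def inject_box_transit(times, flux, period, epoch, duration,
                       rp_rearth, rstar_rsun=1.0):
    """Return a copy of `flux` with box-shaped transits multiplied in.

    times, flux: equal-length lists (times in days, flux normalized to ~1).
    period, epoch, duration: transit period, mid-transit time, and
        total duration, all in days.
    rp_rearth:   planet radius in Earth radii.
    rstar_rsun:  stellar radius in solar radii (assumed Sun-like by default).
    """
    # Fractional dimming for a dark planet crossing a uniform stellar disk.
    depth = (rp_rearth * R_EARTH_IN_R_SUN / rstar_rsun) ** 2
    injected = []
    for t, f in zip(times, flux):
        # Time from the nearest mid-transit, folded into [-P/2, P/2).
        phase = ((t - epoch + 0.5 * period) % period) - 0.5 * period
        in_transit = abs(phase) <= 0.5 * duration
        injected.append(f * (1.0 - depth) if in_transit else f)
    return injected


# Toy example: a flat light curve sampled every 0.1 days for ~10 days.
times = [i * 0.1 for i in range(100)]
flux = [1.0] * len(times)
sim = inject_box_transit(times, flux, period=3.0, epoch=1.0,
                         duration=0.2, rp_rearth=10.0)
print(min(sim))  # flux at the bottom of the injected dips
```

In this toy case a 10 R⊕ planet around a Sun-like star gives dips a little under 1% deep, repeating every 3 days.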

Efficiency recovery rate for simulated planet transits with orbital periods between 0.5 and 15 days and radii between 2 and 15 R⊕.

What was striking to me was that our detection efficiency is basically independent of orbital period: whether there were 2 or 15 transits in the light curve, they were just as easily identified. I think this bodes well for us being just as sensitive to single transit events (I’m starting to work on testing that now). Performance drops rapidly for smaller radii, but for planets ≥ 4 R⊕ with periods less than 15 days, Planet Hunters is ≥ 85% efficient at identifying transit signals for the Kepler sample of target stars. For 2-3 R⊕ planets, the recovery rate for < 15 day orbits drops to 40%. I compared against the Kepler planet candidates and found similar results (which is a good check).
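
The recovery rates quoted above are just the fraction of injected simulations that survive to the end of the pipeline, split by planet size. A minimal sketch of that bookkeeping in Python follows. The function name, the bin edges, and the six example simulations are made up for illustration and are not results from the paper.

```python
from collections import defaultdict


def recovery_by_radius(simulations, bin_edges):
    """Fraction of simulations recovered in each radius bin.

    simulations: iterable of (radius_in_earth_radii, recovered) pairs.
    bin_edges:   ascending list of edges; bin i covers [edge_i, edge_i+1).
    Returns a dict mapping (low_edge, high_edge) -> recovered fraction,
    containing only the bins that hold at least one simulation.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [recovered, total]
    for radius, recovered in simulations:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= radius < bin_edges[i + 1]:
                counts[i][1] += 1
                if recovered:
                    counts[i][0] += 1
                break
    return {
        (bin_edges[i], bin_edges[i + 1]): rec / tot
        for i, (rec, tot) in sorted(counts.items())
    }


# Made-up toy simulations: (planet radius in Earth radii, recovered?).
sims = [(2.5, True), (2.8, False), (3.2, True),
        (3.5, False), (4.5, True), (6.0, True)]
edges = [2, 3, 4, 15]

rates = recovery_by_radius(sims, edges)
for (low, high), rate in rates.items():
    print(f"{low}-{high} R_earth: {rate:.0%} recovered")
```

The same counting could be repeated in period bins to check whether the recovered fraction depends on orbital period.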

Our high recovery rate for both the ≥ 4 R⊕ simulations and the Kepler planet candidates, together with the lack of additional candidates beyond those recovered by the improved Kepler detection and data validation procedures, suggests that the Kepler inventory of ≥ 4 R⊕ short period planets is nearly complete!



10 responses to “Assessing the Kepler Inventory of Short Period Planets”

  1. cosmicphil says :

    Hi from France; a very, very interesting paper. Just one question about the final distribution of user weights plotted in Figure 5: could we see the table of individual scores that was used to build this curve? It might indeed be interesting for the Planet Hunters volunteers to know their individual scores …
    So happy and proud to have participated in this study.

    • Meg says :

      Hi Phil,

      Thanks! We’ve decided against releasing them because it can actually make people perform differently. The weighting scheme isn’t necessarily measuring how well you identify transits, since we compare to the majority vote. Say, for example, you classify 5 light curves that had hard-to-find transits in them, and everyone else said no transits but you marked them. We’d give you a slightly worse weighting even though you were actually finding transits, and then you might change your behavior knowing your score, when it’s not actually reflecting how well you find transits. We don’t want to do anything that might affect your classifying behavior. Also, people improve over time, so your user weight may not reflect where you are today. So for that reason I’m going to keep the scores private (also, I don’t even know my own user weight for Q1, so I’m in the same boat as you all). You can see more detailed reasoning for why we’re not going to give them out on the Galaxy Zoo blog.

      I should say that in the future I’m planning to come up with ways of improving our sensitivity to hard-to-find transits. That could, for example, include using the simulations to highlight people who are good at finding them, and weighting people by how they compare to those who do find the hard-to-see simulations. But I’ve got a few other things to get done with PH data first, before I come back to this.


      • cosmicphil says :

        So we’re both planet researchers, using the data offered in this study (and our own sagacity), and subjects of study ourselves, since our ability to get results is compared with that of more developed computer programs.

        It’s nice to make oneself useful by helping to develop techniques, already among the most sophisticated, for searching for other worlds. We are so small and alone, for the moment, in the universe…

        I also participate in SETILive, where artificial signals are hidden in the data, just as false transits are on Planet Hunters. Hopefully this will also help researchers improve procedures for the automatic detection of extraterrestrial signals in the ambient background noise.

        To return to the detection and validation of planet candidates from the Kepler team or those found by Planet Hunters, it seems complicated to confirm small candidates with the radial velocity method. I read recently that detection by the transit method, but using the near infrared instead, would be more effective, especially for small planets. What is your opinion on this possibility?

