Assessing the Kepler Inventory of Short Period Planets
You might remember that I’ve been working on a systematic search of the Q1 light curves to examine the frequencies of large planets (> 2 R⊕, that is, larger than 2 Earth radii) on orbits less than 15 days. I’m happy to announce that my paper titled “Planet Hunters: Assessing the Kepler Inventory of Short Period Planets” has just been accepted to the Astrophysical Journal. The paper is available online here if you’d like to read it (warning: it’s quite long, coming in at 22 pages of single-spaced text, 13 figures, and 8 tables!), but I’ll give the highlights below.
We wanted to see, for the Q1 light curves, how well we could find planets and what might remain to be found compared to the known Kepler sample of planets. I think this is important because Planet Hunters can serve as a separate estimate of the planet abundance and Kepler detection efficiency. I decided first to concentrate the search on planets with periods less than 15 days so that I was certain there would be at least two transits visible in the Q1 light curve. I thought it might be harder for us to identify transits if there was only one dip, so I thought it would be a good idea to start where there were at least two transits.
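To see why a 15 day cutoff guarantees at least two transits, here is a minimal sketch of the counting argument. It assumes a continuous light curve of about 33 days (the Q1 length quoted in this post), treats each transit as an instantaneous event, and ignores gaps in the data. The function name is just for illustration.

```python
import math

def min_transits(period_days: float, baseline_days: float) -> int:
    """Fewest transits that must fall in a continuous window of the given
    length, no matter where in the orbit the observations start."""
    return math.floor(baseline_days / period_days)

Q1_BASELINE_DAYS = 33.0  # approximate length of a Q1 light curve (assumed continuous)

for period in (0.5, 5.0, 15.0, 20.0):
    print(f"P = {period:4.1f} d -> at least {min_transits(period, Q1_BASELINE_DAYS)} transit(s)")
```

With a 15 day period the worst case is still two transits, while a 20 day period can leave only a single dip in the light curve.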
To figure out which of the light curves had transits, I developed an algorithm to combine the multiple classifications for each light curve (for Q1, on average 10 people classified each ~33 day Kepler light curve) by developing a weighting scheme based on the majority vote. What the weights are doing is really just helping me pay a bit more attention to those users who are a bit more sensitive at finding transits when combining the results from everyone who classified that light curve. The weighting scheme makes us more sensitive to transits than if I just took the majority vote for each light curve, and it helps to decrease the false positives. Below is the distribution of user weights for Q1 classifiers.
Using the user weights, I am able to give each light curve a ‘transit’ score (the sum of the weights of the users who marked a transit box divided by the sum of the weights of everyone who classified the light curve). To narrow the list from 150,000 light curves, I picked those light curves that had ‘transit’ scores greater than 0.5 as my initial list of candidates. I applied several additional cuts to whittle down the list (you can read all about those details in the paper). That left about 3000 light curves and approximately 4000 simulations to go through. To identify those that had at least two transits in them, we turned to a second round of review where light curves were presented in a separate interface and volunteers were asked whether they could see at least two transits in the light curve (regardless of whether the depths were the same) and to answer ‘yes’, ‘no’ or ‘maybe’. Those light curves where the majority of classifiers said ‘yes’ were moved on to review by the science team. A big thank you to everyone who helped out with the Round 2 review; your efforts are acknowledged here. As always we acknowledge all those who contribute to Planet Hunters science on our authors page.
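For the curious, the ‘transit’ score described above can be sketched in a few lines. This is only an illustration of the ratio, not the actual pipeline code: the weights in the example are made up, and how the real user weights are derived is described in the paper.

```python
def transit_score(classifications):
    """Weighted fraction of classifiers who marked a transit.

    classifications: list of (user_weight, marked_transit) pairs for one
    light curve. The weights are supplied by the caller (hypothetical here).
    """
    total_weight = sum(weight for weight, _ in classifications)
    if total_weight == 0:
        return 0.0  # nobody (or only zero-weight users) classified this light curve
    transit_weight = sum(weight for weight, marked in classifications if marked)
    return transit_weight / total_weight

# Made-up example: five classifiers, three of whom drew a transit box.
example = [(1.2, True), (0.8, False), (1.0, True), (0.5, False), (1.5, True)]
score = transit_score(example)
is_initial_candidate = score > 0.5  # the threshold used for the initial list
print(f"score = {score:.2f}, initial candidate: {is_initial_candidate}")
```

If every user had the same weight, this would reduce to the plain fraction of classifiers who marked a transit.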
At the end of the search, after removing all the planet candidates and transit false positives known before February 2012, there were 7 light curves that have transit-like events but were not on the original Kepler candidate list published back in 2011, which used only Quarters 1 and 2. I show example transits from each of these 7 light curves in the Figure below. One of these light curves turns out to be one of the candidates from our first paper and another one was part of our co-discoveries with the Kepler team. Even though these 7 light curves weren’t found in the first Kepler candidate releases, they have now been found in the latest iteration of the Kepler candidate list released earlier this year, which used updated and improved versions of the detection and data validation pipelines. So what that shows is that the Kepler detection and validation processes have indeed gotten better, but there’s more that we can say.

Zoom-in of selected transits for each set of transits identified as visible in the short period candidate light curves remaining after Round 2 review and visual inspection. Visually the science team could identify two separate sets of repeating transits in the multi-planet KIC 8240797, 9729691, and 11551692 based on the user drawn boxes. We note that the snapshot of KIC 8240797 contains two independent transit events.
Now that we know what new things we found, and that there wasn’t anything more than the 7 candidates that are now KOIs on the latest Kepler candidate list, we can look at what that says for the completeness of the short period planet inventory. Using the simulations that you’ve helped classify, I was able to look at how good Planet Hunters is at detecting planets of different sizes on orbits less than 15 days. I randomly selected about 7000 light curves that at the time weren’t known to have transiting planets or to be eclipsing binaries and injected synthetic transits into them for varying planet radii (ranging from 2 to 15 R⊕) and periods less than 15 days. The simulations are really important because once they were completed I could see which of the simulations made it to the end of my candidate pipeline and which ones didn’t. Having the results from those classifications really made the heart of the paper, because we could show, independent of the Kepler planet candidates and detection and validation processes, what we were sensitive to.

Efficiency recovery rate for simulated planet transits with orbital periods between 0.5 and 15 days and radii between 2 and 15 R⊕.
What was striking to me was that our detection efficiency is basically independent of orbital period: whether there were 2 or 15 transits in the light curve, they were just as easily identified. I think this bodes well for us being just as sensitive to single transit events (I’m starting to work on testing that now). Although performance drops rapidly for smaller radii, for planets ≥ 4 R⊕ Planet Hunters is ≥ 85% efficient at identifying transit signals for periods less than 15 days for the Kepler sample of target stars. For 2-3 R⊕ planets, the recovery rate for < 15 day orbits drops to 40%. I compared to the Kepler planet candidates and found similar results (which is a good check).
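A recovery rate like the ones quoted above is just the fraction of injected simulations in each size range that made it through to the end of the pipeline. Here is a rough sketch of that bookkeeping. The function name, the bin edges, and the toy data are invented for illustration and are not the real simulation results.

```python
from collections import defaultdict

def recovery_rate_by_radius(simulations, bin_edges):
    """Fraction of injected simulations recovered in each radius bin.

    simulations: iterable of (radius_in_earth_radii, recovered) pairs.
    bin_edges: increasing list of edges; bins are half-open [lo, hi).
    Bins with no injected simulations are left out of the result.
    """
    tallies = defaultdict(lambda: [0, 0])  # bin -> [recovered, injected]
    bins = list(zip(bin_edges[:-1], bin_edges[1:]))
    for radius, recovered in simulations:
        for lo, hi in bins:
            if lo <= radius < hi:
                tallies[(lo, hi)][0] += int(recovered)
                tallies[(lo, hi)][1] += 1
                break
    return {b: rec / inj for b, (rec, inj) in tallies.items()}

# Toy data only: (injected radius, did it reach the end of the pipeline?)
toy_sims = [
    (2.2, True), (2.5, False), (2.6, True), (2.8, False), (2.9, False),
    (4.5, True), (6.0, True), (9.0, True), (12.0, False),
]
rates = recovery_rate_by_radius(toy_sims, [2, 3, 4, 15])
for (lo, hi), rate in sorted(rates.items()):
    print(f"{lo}-{hi} R_earth: {rate:.2f}")
```

The same tally could be made in bins of orbital period instead of radius to check whether the efficiency depends on period.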
Our high recovery rate of both ≥4 R⊕ simulations and Kepler planet candidates and the lack of additional candidates not recovered by the improved Kepler detection and data validation routines and procedures suggests the Kepler inventory of ≥4 R⊕ short period planets is nearly complete!
Volunteers Needed to Finish Q1 Round 2
I’m close to submitting my paper looking for short period planets in the Quarter 1 Kepler light curves and comparing to the known Kepler sample of planets. For the past several months I’ve been developing an algorithm to combine the results from multiple users who have classified the Quarter 1 light curves and summarizing the results in a paper. The goal was to look for planets with periods less than 15 days and radii greater than 2 Earth radii (so looking for big planets) where there would be at least two transits in the light curves. To see what the project could and couldn’t detect, I made simulated light curves where I injected a planet transit signal into a real Kepler light curve for different planet radii and orbits. If you’ve seen a simulation, you’ll know because a message will pop up after you’ve classified the light curve telling you so and showing where the transit is in red. These synthetic light curves help us understand what Planet Hunters can and can’t find, which is important to know. We do have simulations for Q4 light curves that you might have classified.
The last stage of my pipeline requires human eyeballs. For my current paper, I’m interested in how well we find light curves that have at least two transits in them. So we implemented a second round of review to help us narrow down the list of potential planet candidates from my pipeline and reject some false positives. This new interface shows a light curve in black points with blue boxes where transit boxes were drawn by the users who originally classified the light curve. We ask the reviewer, based on what they see in the light curve and what’s been marked previously, whether there are at least two transits visible, though they don’t need to be the same depth.
We did an initial stage of round 2 last September right before I presented preliminary results from this work at the American Astronomical Society’s Division of Planetary Sciences Meeting in Nantes. But we were not finished classifying all the Q1 light curves and simulations at that time. Now we are officially done with Q1, and I’ve done the last run of the light curves through my code. So I have a small set of light curves and simulations requiring round 2 review that didn’t get screened in September. I need some volunteers to help me do the round 2 review for these light curves. If you are interested, you can go to http://review.planethunters.org/ where you can join in. There’s a tutorial on the front page; do read through it, and it will guide you on what you should be doing, as well as show you some examples of false positives.
Once this stage is completed, we’ll know if there are any additional new candidates or Kepler planet candidates that we’ve been able to identify from the Q1 classification. Then I can call the analysis for this paper complete, put the final numbers in, and submit to a scientific journal. I hope to submit in the next week (crossing fingers) if we can get the round 2 review completed this week.
Thanks in advance,
~Meg
Finishing the Q1 data in time for Nantes
Thanks to your hard work, in October, we will be presenting the first results from Planet Hunters at the European Planetary Science Congress and American Astronomical Society’s Division of Planetary Sciences joint EPSC-DPS meeting in Nantes, France. We will be giving a talk at the meeting, showing the results from your classifications of the first Quarter of Kepler data including the abundances of short period planets with periods less than 15 days. I’m making progress on my selection algorithms to come up with a list of planet candidates from all of your classifications.
But we still need your help! The ~6,000 light curves held back from the original Quarter 1 data release and subsequently released this February have been available on the site since May, and we need your help finishing them in time for Nantes. We’ve been showing these Q1 light curves interspersed in the Q2 light curves (at a rate of about 1 in 10), so you can still look at the latest and newest Q2 data, and we have about 3000 still in need of classifications. We need more clicks. We need your help classifying them so we can perform a final search of all the Quarter 1 data for planet candidates and present them in October. Thanks for all your help, and I plan to share more as the work and progress continues.
Many Thanks,
~Meg
The Road to Nantes
Conferences are a big part of the scientific process – researchers from your sub-field and wider field get together to share the latest interesting results with talks and poster sessions. I love going to conferences, mainly because of the idea sharing. I always leave reinvigorated from the week of science conversations, new results, seeing collaborators you haven’t seen in awhile, and catching up with the friends you’ve made along the way.
The main conference I go to as a planetary astronomer is the American Astronomical Society’s (AAS) Division of Planetary Sciences (DPS) annual meeting. The conference is usually the first or second week in October each year. This year the conference is being hosted jointly with the European Planetary Science Congress (EPSC), so EPSC-DPS will be held in Europe in Nantes, France in October.
At the end of May, Chris and I wrote and submitted an EPSC-DPS Planet Hunters abstract detailing the short period planet analysis we’ve been working on with your classifications. I’ve been working towards being able to measure the frequency of short period planets (periods less than 15 days) for different sizes and types of planets based on the Planet Hunters Q1 classifications.
I’ve been working on taking the classifications and building a pipeline combining the results from your classifications from each light curve and the classifications from the synthetic light curves to score the light curves from 0 to 1, where 1 is the highest likelihood the light curve has transits in it. I went to Chicago back in May to visit with Chris for two days at the Adler Planetarium with a preliminary version of the algorithm and code. Chris and I looked at the early results and schemed away on the white boards in his office about ways to improve the algorithm (after discussions with Michael and time to introduce me to shuffle board). I went back to Yale and have been working on implementing the game plan we came up with.
I have a preliminary pipeline that I think works, but I’m working on improving it and coming up with the final criteria to say, “yes this light curve has a transit in it”. I’ve gone through by eye and looked at ~2000 light curves selected by my code as having planet transits based on your classifications. I think I know what is my major source of false positives, and I am working on a way to reduce them in my final list of light curves that have transits. Once I have that done, I’ll have a list of planet candidates and begin the process of comparing them to the Kepler candidates, false positives, and eclipsing binaries, and then I’ll be able to use the results from the synthetics to estimate our detection efficiency for different planet sizes and orbits.
We’ve been waiting to hear back from the organizing and session committee to find out if our Planet Hunters abstract was accepted and whether we were granted a talk or a poster. We asked to present a talk at EPSC-DPS. 1698 abstracts were submitted. 1236 abstracts requested talks, and there is simply not enough time to give everyone a talk. Some abstracts will instead be presented as posters during the afternoon poster sessions. (I’ll also be presenting a poster on my KBO survey work at the conference).
We heard two weeks ago that our abstract was accepted, and even better news we were slotted to be the last talk in the CoRoT and Kepler results session. We’re very excited! We’ll be giving a 7 minute talk (titled First Results from Planet Hunters: Exploring the Inventory of Short Period Planets from Kepler) with about 3 minutes for questions – so not very much time, but long enough to share the highlights from Planet Hunters and the new results from our short period planet analysis. You can find our abstract online here. Chris and I will definitely blog and tweet about the conference.
We have a challenge for all of you – At the AAS spring meeting in May, the Planetometer™ had just reached 3 million classifications. We’ll flash the Planetometer™ during our talk, let’s have it say 4 million when we get to Nantes!
Back to work, lots to do before Nantes thanks to all your classifications.
~Meg
PS. Congratulations are in order for Chris. He is being awarded the 2011 Royal Society Kohn award for being zookeeper extraordinaire and for everything he’s done with the Zooniverse and beyond or as the Royal Society aptly put it “for his excellent engagement with society in matters of science and its societal dimension.” Congrats Chris!
Science and Progress: Short Period Planets in Q1
Chris Lintott (Zookeeper Chris) and I wanted to give an update on what the team is working on and some of the changes made to the PH site to help us answer the question we are tackling right now. We used very simple cuts and visual inspection to come up with a preliminary list of planet candidates that John has discussed in an earlier post. We’ve been brainstorming on how to combine the results from all the multiple user classifications (about 10 users looking at each lightcurve) to tease out every transit in the database of over 2.0 million classifications. We are working hard on more sophisticated algorithms and techniques to take all your Q1 classifications and transit boxes and extract transits and planet candidates.
After starting to look at your classifications and results from the simulated transits, Chris and I think an interesting question to look at is what are the abundances of planets on short period orbits (less than 15 days) in the Q1 data. The Kepler team is doing something similar and it will be very interesting to compare the two results. As an initial step we are only looking at planets bigger than 2 Earth radii, so only gas and ice giants, because their transits are more pronounced than those of the smaller rocky planets. Planets smaller than 2 Earth radii will be much harder to detect, so first we want to develop the analysis tools and then we’ll come back to the smaller planets later.
With just the transit discoveries alone we can’t answer this question. This is because we don’t know how complete the sample is. If we found 120 Neptune-sized planets for example, we can’t say anything about their abundance compared to Jupiter-sized planets, since we don’t know how many we might have missed in the data set. This is where the synthetic transits we insert into the interface play an important role. If users flag 100% of the Jupiter-sized simulations with orbital periods shorter than 15 days, but only 50% of the Neptune-sized synthetic transits, then we know that the number of transiting Neptunes in the real light curves is a factor of two larger than what we found. With this completeness estimate we can debias our sample and begin to understand the spectrum of solar systems, providing crucial context for our own solar system.
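The correction in the example above is simple arithmetic: divide the number found by the fraction of simulations recovered. Here is a small sketch using the illustrative numbers from this post (120 Neptune-sized planets, 50% and 100% recovery). The function name is made up, and the correction only accounts for transiting planets we missed, not for planets whose orbits never cross the star from our point of view.

```python
def debiased_count(n_found: float, detection_efficiency: float) -> float:
    """Estimate how many transiting planets are really in the data, given how
    many were found and the fraction of simulated transits recovered."""
    if not 0 < detection_efficiency <= 1:
        raise ValueError("detection efficiency must be in the range (0, 1]")
    return n_found / detection_efficiency

# Illustrative numbers from the text, not measured results.
print(debiased_count(120, 0.5))  # Neptune-sized: 50% of simulations flagged
print(debiased_count(120, 1.0))  # Jupiter-sized: 100% of simulations flagged
```

At 50% efficiency the 120 planets found imply about 240 transiting planets in the data, the factor of two described above.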
We find that we need higher numbers and finer resolution in period and radii for the synthetic lightcurves to do this analysis. Starting today, mixed in with the Q2 data, we will be showing newly generated synthetic Q1 lightcurves specifically made for this task. As always with the simulated transits, we will identify the simulated transit points in red after you’ve classified the star and will mark the lightcurve as simulated data in Talk. With the results from these synthetics we can better tweak our analysis tools for extracting transits from your classifications as well as get sufficient numbers to calculate the short period planet detection efficiency for Planet Hunters. The new synthetics won’t be the only non-Q2 lightcurves you see. We also have about 5800 additional lightcurves from Q1 that were released by the Kepler team on Feb 1st. Now that the Q2 data upload is complete, these have been introduced into the database and we’ll be showing them mixed in the classify interface, along with a small subset of the Q1 data previously looked at, to examine how classifications have changed over time since December.
Chris and I are aiming to have the bulk of the analysis complete before October, so we can present the results at the joint meeting of the European Planetary Science Congress (EPSC) and the American Astronomical Society Division for Planetary Sciences (DPS) being held in Nantes, France, in October. We will keep you posted on our progress and results as time goes on. Abstracts are due in May, and so we need to start work now to be able to have results for the Nantes meeting. With your help, we think this will lead to a very interesting paper.
Cheers,