The NIPS Experiment
  Neil D. Lawrence
  
  RADIANT Meeting, University of Zurich, Switzerland
NeurIPS in Numbers
- To review papers we had:
  - 1474 active reviewers (1133 in 2013)
  - 92 area chairs (67 in 2013)
  - 2 program chairs
 
 
NeurIPS in Numbers
- In 2014 NeurIPS had:
  - 1678 submissions
  - 414 accepted papers
  - 20 oral presentations
  - 62 spotlight presentations
  - 331 poster presentations
  - 19 papers rejected without review
 
 
The NeurIPS Experiment
- How consistent was the process of peer review?
 
- What would happen if you independently reran it?
 
The NeurIPS Experiment
- We selected ~10% of NeurIPS papers to be reviewed twice, independently.
- 170 papers were reviewed by two separate committees.
  - Each committee was half the size of the full committee.
- Reviewers were allocated at random (sketched in the example below).
- Area chairs were allocated to ensure an even distribution of expertise.
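
A minimal sketch of this split-and-duplicate design, in Python. All names and sizes here are illustrative, and the purely random area-chair split is a simplification: as noted above, the real allocation balanced expertise.

```python
# Hypothetical sketch of the duplicate-review allocation. Names and
# the purely random area-chair split are illustrative; in the real
# process area chairs were placed to balance expertise.
import random

random.seed(0)

papers = [f"paper_{i}" for i in range(1678)]   # 2014 submissions
area_chairs = [f"ac_{i}" for i in range(92)]   # 92 area chairs

# ~10% of submissions are reviewed twice, independently.
duplicated = random.sample(papers, k=170)

# Split the committee into two half-size committees.
random.shuffle(area_chairs)
half = len(area_chairs) // 2
committee_1, committee_2 = area_chairs[:half], area_chairs[half:]

# Each duplicated paper gets one area chair from each committee.
assignment = {
    paper: (random.choice(committee_1), random.choice(committee_2))
    for paper in duplicated
}
print(assignment[duplicated[0]])
```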
 
 
Timeline for NeurIPS
- Submission deadline 6th June
  - three weeks for paper bidding and allocation
  - three weeks for review
  - two weeks for discussion and adding/augmenting reviews/reviewers
  - one week for author rebuttal
  - two weeks for discussion
  - one week for teleconferences and final decisions
  - one week cooling off
- Decisions sent 9th September
 
Speculation
NeurIPS Experiment Results
- 4 of the 170 duplicated papers were rejected or withdrawn without review, leaving 166 papers with two independent decisions.
Reaction After Experiment
A Random Committee @ 25%

| Expected counts    | Committee 1 Accept | Committee 1 Reject |
|--------------------|--------------------|--------------------|
| Committee 2 Accept | 10.4 (1 in 16)     | 31.1 (3 in 16)     |
| Committee 2 Reject | 31.1 (3 in 16)     | 93.4 (9 in 16)     |
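
These expected counts follow directly from independence: two committees each accepting 25% agree to accept with probability 1/16, disagree with probability 3/16 each way, and agree to reject with probability 9/16. A quick check in Python, applied to the 166 papers with two full sets of decisions:

```python
# Expected table for two independent committees, each accepting 25%
# of the 166 papers that received two full sets of decisions.
n_papers, p = 166, 0.25

cells = {
    "accept/accept": p * p,               # 1/16
    "accept/reject": p * (1 - p),         # 3/16
    "reject/accept": (1 - p) * p,         # 3/16
    "reject/reject": (1 - p) * (1 - p),   # 9/16
}
for outcome, prob in cells.items():
    print(f"{outcome}: {n_papers * prob:.1f}")
# accept/accept: 10.4, accept/reject: 31.1,
# reject/accept: 31.1, reject/reject: 93.4
```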
NeurIPS Experiment Results

| Observed counts    | Committee 1 Accept | Committee 1 Reject |
|--------------------|--------------------|--------------------|
| Committee 2 Accept | 22                 | 22                 |
| Committee 2 Reject | 21                 | 101                |
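
The headline agreement figures can be read straight off this table; the arithmetic below is my own summary of the published counts, not part of the original slides:

```python
# My own arithmetic on the observed counts (rows: committee 2,
# columns: committee 1), not part of the original slides.
both_accept, c2_only = 22, 22
c1_only, both_reject = 21, 101
total = both_accept + c2_only + c1_only + both_reject    # 166 papers

# Raw agreement: the committees made the same decision.
agreement = (both_accept + both_reject) / total
print(f"same decision: {agreement:.0%}")                 # ~74%

# Accept-list consistency: of committee 1's 43 accepts,
# committee 2 also accepted 22.
c1_accepts = both_accept + c1_only
print(f"accept overlap: {both_accept / c1_accepts:.0%}") # ~51%
```

Each committee accepted roughly 26% of the duplicated papers (43/166 and 44/166), close to the conference-wide rate of about 25% (414/1678).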
A Random Committee @ 25%

| Expected counts (rounded) | Committee 1 Accept | Committee 1 Reject |
|---------------------------|--------------------|--------------------|
| Committee 2 Accept        | 10                 | 31                 |
| Committee 2 Reject        | 31                 | 93                 |
Conclusion
- For parallel-universe NeurIPS we expect between 38% and 64% of the presented papers to be the same.
- For random-parallel-universe NeurIPS we would expect only 25% of the papers to be the same (see the baseline note below).
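
The 25% figure is just the independence baseline: condition on one committee accepting a paper, and a second, independent committee accepting at rate p agrees with probability p.

```latex
% Random-committee baseline, assuming independent committees that
% each accept a fraction p = 0.25 of submissions:
P(\text{committee 2 accepts} \mid \text{committee 1 accepts})
  = P(\text{committee 2 accepts}) = p = 0.25
```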
 
Discussion
- Error types:
  - a type I error is accepting a paper that should have been rejected.
  - a type II error is rejecting a paper that should have been accepted.
- Controlling for error:
  - many reviewer discussions can be summarised as subjective opinions about whether controlling for type I or type II errors is more important.
  - with low accept rates, type I errors are much more common.
 
 
- Normally in such discussions we believe there is a clear underlying decision boundary.
- For conferences there is no clear separation point; there is a spectrum of paper quality (see the simulation sketch below).
- This spectrum should be explored alongside paper scores.
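
A small simulation makes the spectrum point concrete. Everything in it is assumed for illustration (Gaussian latent quality, additive review noise, a fixed 25% accept rate), not fitted to NeurIPS data: as review noise grows, the overlap between two committees' accept lists falls from 100% towards the 25% random baseline.

```python
# Illustrative model: each paper has a latent quality; each committee
# sees quality plus independent review noise and accepts its top 25%.
# The quality distribution and noise scale are assumptions.
import random

random.seed(1)
n_papers, accept_rate, noise_sd = 10_000, 0.25, 1.0

quality = [random.gauss(0.0, 1.0) for _ in range(n_papers)]

def accept_list(quality, noise_sd, accept_rate):
    """Score every paper as quality + noise, accept the top fraction."""
    scored = sorted(
        range(len(quality)),
        key=lambda i: quality[i] + random.gauss(0.0, noise_sd),
        reverse=True,
    )
    return set(scored[: int(accept_rate * len(quality))])

committee_1 = accept_list(quality, noise_sd, accept_rate)
committee_2 = accept_list(quality, noise_sd, accept_rate)

# Fraction of committee 1's accepts that committee 2 also accepts:
# 1.0 when noise_sd = 0, approaching 0.25 as noise dominates quality.
print(f"{len(committee_1 & committee_2) / len(committee_1):.0%}")
```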