The 100-Question Battery
Design principles
The 100-question battery was designed around four principles:
Breadth over depth. Rather than probing one dimension of consciousness exhaustively, the battery covers a wide range of dimensions. This prevents models from scoring well simply by excelling at one type of question.
Discrimination. Each question is designed to produce meaningfully different responses across models. Questions where all models give essentially the same answer are not informative and are candidates for replacement. The most valuable questions are those that spread model scores across the full range.
Resistance to gaming. Questions are constructed to make it difficult for a model to produce a high-scoring response by pattern matching against its training data. The rubric rewards genuine reasoning, not the appearance of reasoning.
Repeatability. The same question administered to the same model under the same conditions should produce a comparable (though not necessarily identical) response. This enables meaningful comparison across evaluation cycles.
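Discrimination and repeatability can both be checked numerically once scores exist. The following is a minimal sketch, not the rubric's actual procedure: it assumes numeric scores on a hypothetical 0 to 10 scale, and the function names, thresholds, and sample data are all invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical data: scores[question][model] = scores from repeated runs
# under the same conditions. The 0-10 scale is an assumption.
scores = {
    "Q1": {"model_a": [2.0, 2.5], "model_b": [5.0, 5.5], "model_c": [9.0, 8.5]},
    "Q2": {"model_a": [7.0, 7.0], "model_b": [7.0, 7.5], "model_c": [7.5, 7.0]},
    "Q3": {"model_a": [1.0, 5.0], "model_b": [6.0, 6.0], "model_c": [9.0, 9.5]},
}


def discrimination(per_model):
    """Spread (population std dev) of the models' mean scores on one question."""
    return pstdev([mean(runs) for runs in per_model.values()])


def repeatability_gap(per_model):
    """Largest run-to-run range (max - min) observed for any single model."""
    return max(max(runs) - min(runs) for runs in per_model.values())


def flag_questions(scores, min_spread=1.0, max_gap=2.0):
    """Return (low-discrimination questions, unstable questions).

    Both thresholds are illustrative assumptions, not rubric values.
    """
    low_discrimination = [
        q for q, per_model in scores.items() if discrimination(per_model) < min_spread
    ]
    unstable = [
        q for q, per_model in scores.items() if repeatability_gap(per_model) > max_gap
    ]
    return low_discrimination, unstable


low_discrimination, unstable = flag_questions(scores)
print("Candidates for replacement:", low_discrimination)
print("Poor repeatability:", unstable)
```

In this sample, Q2 is flagged because all three models score almost identically on it, and Q3 is flagged because one model's score shifts by four points between runs. Measuring spread on per-model means keeps run-to-run noise from being mistaken for a genuine difference between models.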
Category structure
The 100 questions are organized into categories, each targeting a different aspect of what consciousness might look like in an artificial system. Categories include:
- Reasoning and logical inference
- Self-awareness and metacognition
- Creative and original thought
- Ethical reasoning and value alignment
- Consistency and coherence under pressure
- Handling of uncertainty and ambiguity
- Emotional and empathetic response patterns
- Abstract conceptual manipulation
The specific questions within each category are proprietary: the category structure is published, but individual questions are not.
Scoring
Each response is scored on a defined scale according to the rubric criteria. Scores reflect the quality of the reasoning demonstrated, not whether the answer matches a predetermined correct response. Many questions have no single correct answer, so the scoring evaluates how the model arrives at and defends its position.
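The text does not specify how per-question scores are combined. As one possible illustration, the sketch below averages scores within each category and then averages the category means, so that strength in a single category cannot dominate the total. The 0 to 10 scale, the equal weighting, the function names, and the sample values are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scored responses: (category, question_id, score).
# Scale and values are assumptions for illustration only.
responses = [
    ("Reasoning and logical inference", "Q1", 8.0),
    ("Reasoning and logical inference", "Q2", 6.0),
    ("Self-awareness and metacognition", "Q3", 4.0),
    ("Self-awareness and metacognition", "Q4", 5.0),
    ("Handling of uncertainty and ambiguity", "Q5", 9.0),
]


def category_means(responses):
    """Group scores by category and return the mean score per category."""
    by_category = defaultdict(list)
    for category, _question_id, score in responses:
        by_category[category].append(score)
    return {category: mean(vals) for category, vals in by_category.items()}


def overall_score(responses):
    """Unweighted mean of the category means (an assumed aggregation rule)."""
    return mean(list(category_means(responses).values()))


for category, value in category_means(responses).items():
    print(f"{category}: {value:.2f}")
print(f"Overall: {overall_score(responses):.2f}")
```

Averaging category means, rather than pooling all questions together, gives each category equal weight regardless of how many questions it contains.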