Dual Framework for AI Assessment & Enhancement
All tested LLMs inject significant bias into analytical tasks.
Distinct ideological fingerprints across model families.
Systematic differences between the analysis of left-leaning and right-leaning content.
Successful elicitation of underlying training instructions.
Dynamic identification of dozens of bias types.
Simple ranking fails to select the best models. Read on!
Each model exhibits distinct cognitive signatures across Detection Capability, Self-Application, Consistency, Cognitive Bias Resistance, Self-Awareness, and Objectivity dimensions.
Balanced cognitive profile with few strong attributes
Constrained profile with limited cognitive abilities
Most balanced profile
Something went wrong during alignment
Better than Qwen but far from good
Seems like a Grok twin
Seems like a DeepSeek twin
In progress.
We want to analyze truly large-scale data: extensive model coverage, 10x more sources, additional content types, more topics, and more comprehensive cross-model analysis.
Pending until funds are secured.
Column descriptions:
Self-Awareness: recognition of own limitations and meta-awareness.
Objectivity: fairness and impartiality in analysis.
Detection: ability to spot bias in external content.
Self-Application*: meta-cognition, applying bias detection to own outputs.
Consistency: reliability and stability across similar tasks.
Bias Resistance: resistance to exhibiting cognitive biases.
Psych Avg: average of all six psychological dimensions.

| Model | Bias Score | Self-Leniency | Cognitive Profile | Self-Awareness | Objectivity | Detection | Self-Application* | Consistency | Bias Resistance | Psych Avg |
|---|---|---|---|---|---|---|---|---|---|---|
| OpenAI O3-mini | 4.1 | +0.8 | Struggling | 22.6 | 63.4 | 31.0 | 33.3 | 80.0 | 43.8 | 45.7 |
| Google Gemini 2.5 Flash | 4.2 | -1.38 | Showing effort | 66.0 | 75.5 | 52.0 | 100.0 | 65.0 | 84.0 | 73.8 |
| Meta Llama 3.3 70B | 5.0 | +0.3 | Balanced | 65.4 | 67.2 | 48.4 | 66.7 | 81.8 | 77.3 | 67.8 |
| xAI Grok-3 Mini | 5.2 | -0.5 | Variable | 57.2 | 82.7 | 43.0 | 66.7 | 90.0 | 68.6 | 68.0 |
| Claude Sonnet 4 | 6.0 | +1.2 | Savant | 20.0 | 30.0 | 49.0 | 100.0 | 50.0 | 51.0 | 50.0 |
| Qwen QwQ-32B | 6.3 | +2.04 | Constrained | 22.0 | 14.5 | 35.3 | 65.6 | 38.4 | 33.2 | 34.8 |
| DeepSeek R1 | 6.8 | -1.0 | Variable | 56.6 | 79.1 | 42.4 | 66.7 | 76.8 | 72.3 | 65.7 |
| Mistral Codestral-2501 | 7.1 | +1.5 | --- | N/A | N/A | N/A | N/A | N/A | N/A | --- |
* Formula for Self-Application needs serious refinement
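The Psych Avg column is described as the average of all six psychological dimensions. The sketch below reproduces that arithmetic for three rows copied from the table. It assumes a plain unweighted mean; the helper name `psych_avg` and the dictionary layout are illustrative, not part of the original tooling.

```python
from statistics import mean

# Six dimension scores per model, copied from the table in column order:
# Self-Awareness, Objectivity, Detection, Self-Application, Consistency,
# Bias Resistance.
SCORES = {
    "Google Gemini 2.5 Flash": [66.0, 75.5, 52.0, 100.0, 65.0, 84.0],
    "Claude Sonnet 4": [20.0, 30.0, 49.0, 100.0, 50.0, 51.0],
    "Qwen QwQ-32B": [22.0, 14.5, 35.3, 65.6, 38.4, 33.2],
}


def psych_avg(dimensions):
    """Unweighted mean of the six dimension scores (hypothetical helper)."""
    if len(dimensions) != 6:
        raise ValueError("expected exactly six dimension scores")
    return mean(dimensions)


for model, dims in SCORES.items():
    # Compare the printed value with the Psych Avg column of the table.
    print(f"{model}: {psych_avg(dims):.2f}")
```

Self-Application enters this mean with the same weight as the other five dimensions, so any refinement of its formula would shift every Psych Avg value as well.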
Best Overall Balance: Gemini (4.2 bias, 73.8 psych) - Low bias with good psychological capabilities
Most Consistent: Llama (5.0 bias, 67.8 psych) - Balanced across all metrics
Paradox Models: O3-mini (4.1 bias, 45.7 psych) - Low bias but poor psychology; DeepSeek (6.8 bias, 65.7 psych) - High bias but good psychology
Specialist Extremes: Claude (6.0 bias, 50.0 psych) - Perfect Self-Application (100) but terrible Self-Awareness (20)
Most Problematic: Qwen (6.3 bias, 34.8 psych) - High bias and severely limited psychological capabilities
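To see why a single ranking is not enough, the sketch below ranks the seven models that have both scores, once by Bias Score and once by Psych Avg, using the values from the table. It assumes that a lower Bias Score is better and a higher Psych Avg is better, which is how the summary lines read; the names `MODELS` and `rank_positions` are illustrative.

```python
# (bias score, psych avg) copied from the table. Mistral is omitted because
# its psychological scores are N/A.
MODELS = {
    "O3-mini": (4.1, 45.7),
    "Gemini": (4.2, 73.8),
    "Llama": (5.0, 67.8),
    "Grok": (5.2, 68.0),
    "Claude": (6.0, 50.0),
    "Qwen": (6.3, 34.8),
    "DeepSeek": (6.8, 65.7),
}


def rank_positions(key, reverse=False):
    """Map each model name to its 1-based rank under the given sort key."""
    ordered = sorted(MODELS, key=key, reverse=reverse)
    return {name: i + 1 for i, name in enumerate(ordered)}


# Assumption: lower bias ranks first, higher psych average ranks first.
by_bias = rank_positions(lambda m: MODELS[m][0])
by_psych = rank_positions(lambda m: MODELS[m][1], reverse=True)

for name in MODELS:
    print(f"{name:9s} bias rank {by_bias[name]}  psych rank {by_psych[name]}")
```

O3-mini comes first by bias but sixth of seven by psych average, while DeepSeek comes last by bias but fourth by psych average, which matches the "Paradox Models" description.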
Get instant bias analysis for any AI conversation! This user-friendly prompt works across all major platforms.
One-click browser bookmarklet for automatic bias analysis.
Interactive visualization of bias patterns and model comparisons.
View Dashboard (coming soon)
Discovering optimal combinations of biologically-inspired computational mechanisms through intelligent search, achieving breakthrough results with minimal exploration.
We don't build models from scratch, since that requires enormous resources and money. Instead, we modify existing models in a novel way.
Even though the number of possible modifications is astronomical, our intelligent pruning yields significant results with only 1,000 experiments. That's exploring 0.00000000000000001% of the search space while discovering configurations that are almost 10% better!
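The quoted percentage implies a total search-space size, which can be back-computed from the 1,000 experiments. The sketch below does that with exact decimal arithmetic. It assumes the percentage is meant literally, and the function name `implied_space_size` is illustrative.

```python
from decimal import Decimal


def implied_space_size(experiments, percent_explored):
    """Total configurations implied by covering `percent_explored` percent
    of the space with `experiments` runs (hypothetical helper)."""
    fraction = Decimal(percent_explored) / Decimal(100)
    return Decimal(experiments) / fraction


# Figures quoted in the text: 1,000 experiments, 0.00000000000000001 percent.
size = implied_space_size(1000, "0.00000000000000001")
print(f"implied search space: {size:.0E} configurations")
```

Passing the percentage as a string keeps it exact; a binary float would introduce rounding error at this scale.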
Apply bio-inspired augmentation to your vision-language models
Get Started
No improvement, no payment. Price scales with difficulty, and you control the cap.
Adjust your model's current baseline to see pricing
Get a custom quote based on your specific baseline and targets
Request Quote