I pitted the latest models against coding, medical, finance, and legal traps, then cross-checked the results with multiple AIs.