A new study finds that, once again, large language models can outperform humans, though not in every situation.