The one with Cognitive Biases

This week I had a chance to re-watch the “Bohemian Rhapsody” movie for the Nth time. I would not say I really love Queen’s music, but I find some new details every time I watch this movie. This time, the “Seven Seas of Rhye” recording scene resonated with all the buzz around experimenting.
“We need to get experimental”
“Try …”
*magic happens*
While the New York Times was filing another lawsuit against OpenAI, an interesting paper caught my eye: “Benchmarking Cognitive Biases in LLMs as Evaluators”. I do not actually track new paper abstracts in the AI field, but it seems the folks at Hugging Face do: Andrew Jardine shared this one on LinkedIn, proving to me once more that they are worth following. Getting back to the contents of the paper, it is interesting to read how the authors test LLMs for cognitive biases, analyze the results, and draw conclusions, and to see that the models are far from ideal under such tests. However, I wonder how human specialists would compare to LLMs under similar tests, especially if they were not aware of being tested.
Cognitive biases (in testing, critical thinking, decision-making, and thinking in general) are an interesting topic. Knowing them and avoiding (or minimizing) them are two very different things. However, being aware of them and using that knowledge in self-reflection can lead to interesting observations. And here is a nice infographic to start exploring some of the cognitive biases. Some say all cognitive biases are just different flavors of confirmation bias. Do you agree?
While reflecting on 2023, I noticed I had not spent as much time coding as I wanted. So, with some spare time between Christmas and New Year’s Eve, I went back to the custom visual testing helper I created some time ago. It had been a while, and it needed some care.
Updating packages, fixing some tests, and then migrating from System.Drawing to SkiaSharp… I admit my coding skills have gotten a little rusty this year since I stepped away from active coding in my career, but I still managed to give my little (visual testing) helper proper care. I know I still need to fix some more tests (so far I have fixed only the Selenium part; the Playwright portion is still to do), and I would like to make the repo more welcoming by adding a proper readme.md file. But that is for the (near?) future.
Feel free to check it out and try it on your own :)
Last point of the last post of the year. 2023 was the year of AI; even my mum was asking me about ChatGPT. So let the last point be an AI model to try out… As I recently played with the Stable Video Diffusion image-to-video model, I encourage you to try it out as well. Somehow its outputs remind me of the newspapers from the Harry Potter movies.