I have created a simple cross-examination module for job applicants.
At the same time, I tested the same task on a bare LLM like Grok, and it came back with more or less the same output and process.
I leave it to you to decide whether that is good or bad. What I mean is that as models become smarter, they can understand what we want even when our prompting is rubbish. In my case I gave a very simple prompt for a very complicated backend process of "cross-exam", and even I can't understand how Grok worked out that it should act as a barrister and run the whole thing without any coaching.
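To make "simple prompt, complicated process" concrete, here is a minimal sketch of the shape such a module can take. This is not my actual implementation: the system prompt wording, the `call_llm` wrapper, and the loop structure are all placeholders for whatever chat-completion provider you plug in.

```python
# Minimal sketch of a prompt-driven cross-examination loop.
# `call_llm` is a hypothetical stand-in for your provider's
# chat-completion client (Grok, OpenAI, etc.).

SYSTEM_PROMPT = (
    "You are cross-examining a job applicant. Ask one probing "
    "follow-up question at a time, challenging any inconsistencies "
    "in their previous answers."
)

def call_llm(messages: list[dict]) -> str:
    """Hypothetical wrapper around a chat-completion endpoint."""
    raise NotImplementedError("plug in your provider's client here")

def cross_exam(opening_answer: str, rounds: int = 3) -> list[dict]:
    # Seed the transcript with the applicant's opening statement.
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": opening_answer},
    ]
    for _ in range(rounds):
        question = call_llm(messages)      # model plays "barrister"
        messages.append({"role": "assistant", "content": question})
        answer = input(f"{question}\n> ")  # applicant replies
        messages.append({"role": "user", "content": answer})
    return messages
```

The striking part is that the one-sentence system prompt is doing all the work: the multi-turn "barrister" behaviour emerges from the model itself, not from any orchestration logic on the backend.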