Limitations of GenAI tools
In the previous step, you considered the importance of prompt engineering. In this step, you will have the opportunity to consider real and assumed limitations of generative AI text and image creators.
Generative AI limitations
Much has already been said about what generative AI can do or will soon be able to do. But where are the limits? In education, this is important to understand when we design curricula and the assessments that go with them. In higher education, much of the talk has been about how students (or indeed anyone writing or publishing anything) might be tempted to take shortcuts, or even knowingly breach institutional academic integrity policies or guidelines, by using generative tools inappropriately. It has been argued that if we can find what these tools can’t do, then maybe we can use that knowledge to design tasks and assessments that are AI-proof. Such a goal is unlikely to be achievable, however, even in contexts where attempts are made to ban or lock out certain technologies, and the speed at which the tools are improving makes it less achievable still. Nevertheless, it is important to think about what we know about how these tools work and to acknowledge limitations as they arise, whilst accepting that a limitation today may well be resolved tomorrow.
We’ll start with a look at AI text generators then consider image generators to exemplify some of the issues, tendencies and debates.
Limitations lifted?
You may have heard things like “ChatGPT can’t access information after 2021” or “It can’t read images” or “It’s not web connected so can’t search” or “It hallucinates and makes up supposed ‘facts’” or “It makes up sources and even quotes!”.
Whilst the essence of each of these remains true at the time of writing for ChatGPT itself (version 3.5, the free-to-access version), all these limitations have been resolved to an extent in other generative AI tools. Microsoft Bing Chat has an impressive image analyser, for example, which can not only describe the content of an image (a brilliant boon to alternative text writing, by the way!) but can also answer questions posed within it (such as a maths problem). Google Bard and Bing Chat are web connected and can provide real sources of information and accurate references. In other words, attempts to ‘AI-proof’ student assessments based on limitations are probably non-starters. Knowing that different tools have different capabilities and that the technologies are improving all the time is, however, essential knowledge.
Limitations linger
Despite rapid advances, I often hear from students and colleagues about frustrations when trying to generate something. Writing a prompt in a way that gets the output you want takes work and thought. The more fixed your preconception of the complete output, the more likely you are to find yourself disappointed. As tempting as it may be, the only way to guarantee text that has your voice, or is expressed the way you want it, is to produce it yourself!
Another common criticism is that text outputs are ‘bland’, ‘boring’ or ‘samey’, lacking an ingredient that is quite hard to define, given the unarguably natural and grammatically accurate language they produce. Whilst it is possible to adjust the randomness of some text-generating tools, most people leave the default settings in place, which means that there is little in the way of surprise or unexpectedness in either form or content. This may account for the feeling of blandness. In fact, this is how many AI text detectors work: they look for patterns of predictability as a key marker of AI authorship. In some tests, it was found that text written by non-native speakers of English was incorrectly flagged as AI-authored more frequently than text written by native speakers. I asked a language teacher colleague why this might be, and her convincing explanation was that those developing or learning a language into adulthood are often taught from quite a limited frame of reference, and are even advised to choose connecting phrases from a menu of possibilities, for example. Large language models are by definition predictable, much like I am, as a non-native speaker, when I speak French.
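For readers curious about what adjusting the randomness means in practice, here is a minimal sketch in Python. It assumes the common approach of a "temperature" setting that rescales a model's scores for candidate next words before one is chosen; it is an illustration and not a description of how any particular product works. The candidate words and their scores are invented, and the function names are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores into probabilities.

    Lower temperatures concentrate probability on the top-scoring word;
    higher temperatures spread it more evenly across the candidates.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits: a rough measure of how unpredictable the choice is."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented scores for four candidate next words (illustration only).
candidates = ["important", "essential", "crucial", "peculiar"]
logits = [3.0, 2.0, 1.5, 0.2]

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    summary = ", ".join(f"{w}: {p:.2f}" for w, p in zip(candidates, probs))
    print(f"temperature={temperature}: {summary} "
          f"(entropy {entropy_bits(probs):.2f} bits)")
```

At the lower temperature nearly all the probability sits on the single most likely word, which is one way of picturing the 'samey' feel of default outputs. A detector that looks for predictability is, loosely speaking, checking how often a text makes the most likely choice.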
One of the best things to do, in my view, is to ask your large language model of choice to write you an essay on something. You soon see that the initial output looks very much like a standardised, predictable and bland five-paragraph essay. If you are a schoolteacher this may worry you but may also raise questions about the ubiquity of such essays and the utility of such outputs, whether human or AI-authored.
The more personal the required output, the more reflective, the more discursive or the more specific to a given case, the less convincing the initial outputs will be. In these instances, it takes considerable working and re-working to generate a compelling response and, as some are already arguing, it actually takes engagement with the core ideas to reach a point where the text generated is fit for purpose.
Whilst it is useful to be cognisant of the limitations of text-generating AI, what’s important is that we are invested in what we write and that we are clear in our own minds why we are writing (or asking others to) and find ways to articulate that. Writing itself is very often a critical phase in learning. Rough drafts, edits, errors, blind alleys and discarded paragraphs are part of that. Perhaps we should be celebrating and focusing more on those stages than the final polished products?
Limitations in focus
Interestingly, criticisms of generative visual media outputs rarely focus on blandness and boring outputs! Some quite remarkable outputs have been created, even winning prizes, which has led to considerable debates about copyright and provenance. Limitations specifically in image generation, however, tend to cluster in three areas.
1. Bias
Without deliberate instruction in the prompt, outputs often reflect common societal prejudices. This image, for example, was the outcome of the prompt: “A biochemistry professor at a UK university”. The assumption that a biochemistry professor would be a white man is apparent and such biases are common.
Four variations of a prompt for a biochemistry professor [1]
2. Gruesome distortions
These arise where the subject has a complex geometry and the training data tends not to contain enough adequate reference points. So, while AI image generators are improving all the time and can generate images in a multitude of styles, they often struggle with things like hands and feet or keyboards and other tools, as shown below.
AI-generated image based on the prompt: “photorealistic computer keyboard hands poised to type” [2]
3. Incorporating text and numbers into images
It may seem odd, given that generative AI can separately generate astounding images and reams of text, that text incorporated into images is often unconvincing at best. This appears to happen for reasons similar to those behind the problems seen with hands and tools.
The image below was created from a very specific prompt, but the text is some way off what was requested:
AI-generated image based on the prompt: “a futuristic university classroom with students and a sign on the wall that reads ‘study at GenAI Uni’” [3]
So while there are persistent limitations, and we can acknowledge these in our approaches to assessment design, for example, we cannot rely on them remaining in place. Limitations, such as they are, might just as well challenge us to confront the ways we have done things in education for a long time, and perhaps to reconsider the weight and value we put on products (like nicely bound essays) rather than on the processes needed to build a coherent argument and evidence in-depth understanding of something.
For me, the biggest and most concerning limitations are also the ones likely to persist: the way language models work means that they are likely always to be capable of generating nonsense, and the way that they are trained means that they will always be likely to reflect, and therefore potentially reinforce, biases and stereotypes such as those so evident in the first image above.
*I wrote this entire text on my own but was struggling with the sub-headings. Each of these in this article was suggested by ChatGPT after some copy/pasting of paragraphs and some to-ing and fro-ing with prompts. In this case, the assistant bot did exactly what I wanted and needed!
Now that you have completed this step, you have seen that the limitations landscape is one that changes frequently. In the next step, you will hear some student perspectives on generative AI.
References
[1] Compton M. Four variations of a prompt for a biochemistry professor. 2023 Aug via Midjourney.
[2] Compton M. AI-generated image based on the prompt: ‘photorealistic computer keyboard hands poised to type’. 2023 Aug via Midjourney.
[3] Compton M. AI-generated image based on the prompt: ‘a futuristic university classroom with students and a sign on the wall that reads “study at GenAI Uni”’. 2023 Aug via Midjourney.
Join the conversation
What’s the biggest current (or ongoing) limitation of these tools in your view?