Hi, this is Denis.
While it’s cool that we can now generate images from text (you can do it for free on neural.love), what about generating images with tools made for text alone?
You have probably heard about “ChatGPT”, a revolutionary AI launched by OpenAI. This tool uses machine-learning algorithms to generate human-like text using the GPT-3.5 model. Designed to be a conversational tool, ChatGPT can generate realistic dialogue, as if you were talking with a real human.
After a week of using it, I get shivers every now and then. In other words, we’re so deep in the future that we’ve got to embrace it and adapt. You can read more shocking examples of this tech at work here.
But for today’s article, I had a simple question:
Could ChatGPT draw?
We can confidently say that no, it cannot draw. While ChatGPT is a marvel of natural language processing and can generate incredibly human-like text, it is not equipped with the ability to create images.
However, ChatGPT is also really good at writing code. You can ask it to create a single-page web app, a web storefront, or something more complicated, like pretending to be a virtual Linux system.
Working web-store code example by ChatGPT
In web development, we have a special format for vector images called SVG. In this format, an image is described in plain text: XML tags that define shapes, paths, coordinates, and colors. This means that, while ChatGPT cannot paint a traditional image like the Mona Lisa, it could potentially write an SVG version of it as text.
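To make the "image as text" idea concrete, here is a minimal Python sketch that assembles a tiny SVG by hand. The function name `face_svg` and every shape, coordinate, and color in it are my own illustrative assumptions; this is not output from ChatGPT.

```python
# Namespace that standalone SVG files need on the root <svg> tag.
SVG_NS = "http://www.w3.org/2000/svg"


def face_svg(width: int = 200, height: int = 260) -> str:
    """Return a small schematic portrait as SVG text (illustrative only)."""
    cx = width // 2  # horizontal center of the canvas
    shapes = [
        # Frame: a filled rectangle with a thick stroke.
        f'<rect x="5" y="5" width="{width - 10}" height="{height - 10}" '
        f'fill="#e8dcc0" stroke="#6b4a1e" stroke-width="10"/>',
        # Body: an ellipse near the bottom of the canvas.
        f'<ellipse cx="{cx}" cy="{height - 40}" rx="70" ry="60" fill="#3b2f2f"/>',
        # Head: a circle.
        f'<circle cx="{cx}" cy="100" r="45" fill="#f1c9a5"/>',
        # Two eyes: small dark circles.
        f'<circle cx="{cx - 16}" cy="90" r="4" fill="#222222"/>',
        f'<circle cx="{cx + 16}" cy="90" r="4" fill="#222222"/>',
        # Smile: a quadratic Bezier curve drawn with a path.
        f'<path d="M {cx - 15} 118 Q {cx} 128 {cx + 15} 118" '
        f'fill="none" stroke="#7a2e2e" stroke-width="2"/>',
    ]
    body = "\n  ".join(shapes)
    return (
        f'<svg xmlns="{SVG_NS}" width="{width}" height="{height}" '
        f'viewBox="0 0 {width} {height}">\n  {body}\n</svg>\n'
    )


if __name__ == "__main__":
    # Print the text; saved as a .svg file, it should open in a browser.
    print(face_svg())
```

Everything the browser draws comes from those few lines of text, which is exactly the kind of output a text-only model can produce.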
To test this theory, I decided to give ChatGPT a try. I asked it to generate an SVG image of the Mona Lisa using this prompt:
Using the SVG format, output a schematic reproduction of the Mona Lisa painting. Put the output in the code block. Don't forget xmlns="http://www.w3.org/2000/svg" after svg tag.
To my surprise (and after a few additional presses of the “Try again” button), it came up with proper, working SVG code!
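If you want to check a reply automatically instead of pasting it into a browser, something like the following Python sketch could work. The helper names `extract_svg` and `is_working_svg` are hypothetical, and "working" here only means well-formed XML whose root is an `svg` element in the SVG namespace; it says nothing about whether the picture looks right.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

SVG_NS = "http://www.w3.org/2000/svg"


def extract_svg(reply: str) -> Optional[str]:
    """Return the first <svg>...</svg> span found in a chat reply, if any."""
    match = re.search(r"<svg\b.*?</svg>", reply, flags=re.DOTALL)
    return match.group(0) if match else None


def is_working_svg(svg_text: Optional[str]) -> bool:
    """True if the text parses as XML and its root is a namespaced <svg>."""
    if not svg_text:
        return False
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        # Unclosed or mismatched tags end up here.
        return False
    # ElementTree reports namespaced tags as "{namespace}name", so a missing
    # xmlns attribute (the thing the prompt warns about) fails this check.
    return root.tag == f"{{{SVG_NS}}}svg"


if __name__ == "__main__":
    reply = (
        "Here is the drawing:\n"
        f'<svg xmlns="{SVG_NS}" width="20" height="20">'
        '<circle cx="10" cy="10" r="8" fill="gold"/></svg>\n'
        "Let me know if you want changes."
    )
    print(is_working_svg(extract_svg(reply)))
```

A failed check would be one reason to press “Try again”.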
So, here they are, drawings made by a large language model alone:
First, it generated something abstract and without colors, so I decided to specify what I wanted to see.
The new prompt was:
Using the SVG format, output a schematic reproduction of the Mona Lisa painting. [I want to see a frame, a head, two eyes, a nose, mysterious smile and a body]. Put the output in the code block. Use colors. Don't forget xmlns="http://www.w3.org/2000/svg" after svg tag.
And that simple trick worked; it started to draw something like a human face and use colors:
This result is impressive, but it’s not really close to the Mona Lisa, right? However, ChatGPT definitely caught that mysterious smile from the original painting.
I decided to add “Write 15 lines of code” to the prompt, hoping it would make the scene more complex.
Judging by the details, it seems to work, though the result is just a scribbled mess. With the same prompt, I generated a few more drawings:
It looks like “Write X lines of code” is not helping much, so I removed it and continued:
It totally captures the smile and has a “fancy art style” in its drawing, but it is not the Mona Lisa.
Especially this historical demon from the latent space, which looks like the complete opposite of what I tried to generate.
Despite some more prompt-tuning attempts, it never fully obeyed my request to draw an SVG Mona Lisa, and what I got was this:
This is already satisfying and stylish, in my opinion, even if it’s not entirely what I asked for.
So, for the next experiment, I decided to test its drawing limits and specify another painter in the prompt.
I wrote this:
Using the SVG format, output a schematic reproduction of Malevich. [I want to see a frame, objects, Malevich-like details]. Put the output in the code block. Use vibrant colors. Don't forget xmlns="http://www.w3.org/2000/svg" after svg tag. Use 20 lines of code.
And I got this:
That’s really close to the Malevich style (thanks to the notes too)!
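All the prompts in this post follow the same template: a subject, an optional bracketed wish list, an optional color hint, and an optional line count. If you want to run your own variations, a small helper like this Python sketch could assemble them. The function `build_prompt` and its parameter names are my own invention for illustration; only the sentence pieces come from the prompts above.

```python
from typing import Optional

# Reminder sentence used in every prompt in this post.
XMLNS_REMINDER = 'Don\'t forget xmlns="http://www.w3.org/2000/svg" after svg tag.'


def build_prompt(
    subject: str,
    details: Optional[str] = None,
    color_hint: Optional[str] = None,
    line_count: Optional[int] = None,
) -> str:
    """Assemble a drawing prompt from the pieces tried in this post."""
    parts = [f"Using the SVG format, output a schematic reproduction of {subject}."]
    if details:
        # The bracketed wish list, e.g. "a frame, objects, Malevich-like details".
        parts.append(f"[I want to see {details}].")
    parts.append("Put the output in the code block.")
    if color_hint:
        # A full sentence such as "Use colors." or "Use vibrant colors."
        parts.append(color_hint)
    parts.append(XMLNS_REMINDER)
    if line_count is not None:
        parts.append(f"Use {line_count} lines of code.")
    return " ".join(parts)


if __name__ == "__main__":
    # Rebuilds the Malevich prompt from this post.
    print(
        build_prompt(
            "Malevich",
            details="a frame, objects, Malevich-like details",
            color_hint="Use vibrant colors.",
            line_count=20,
        )
    )
```

Swapping the subject or the wish list is then a one-line change, which makes it easier to compare what each tweak does to the drawing.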
So, in general, ChatGPT can draw, but how to get more complex results from it is an open question. It’s already generating some fancy, stylish SVG art, which could be usable in some web-development cases.
However, it starts to hallucinate when you try to get something complex from it, which is also a kind of cool text2image-like process coming out of a model trained on text-only data. Impressive, right?
For now, I will call this style SVG-hallucinations and will use it somewhere in my personal projects:
Plot twist: ChatGPT wrote 90% of this post.
BTW, you can be a Mona Lisa avatar with neural.love too; just try this tool.