I never imagined AI writing human-like text. No, I'm not talking about text generation, but rather image generation with human handwriting. The Flux models have made it easy to infer, generate and edit images. In this article, we'll look at one such model used for generating images with handwritten text. That's not all: we'll also build a storytelling application towards the end of the article.
What are FLUX Models?
Flux models are generative models that are typically associated with producing high-quality images, videos, or other content. These models are built using advanced neural networks like Stable Diffusion or Variational Autoencoders (VAEs). We'll be focusing on one Flux model, namely the fofr/flux-handwriting model, throughout the article.
Flux Handwriting Model
“fofr/flux-handwriting” is a Flux LoRA fine-tuned to produce handwritten text. Let's look at various ways to use it to generate images with handwritten text.
Hugging Face
You can quickly go to the model page on Hugging Face and use the Diffusers library or the Inference API to generate images.
Note: Remember that you need to use HWRIT handwriting in the prompt to trigger the image generation.
I prompted it to generate: “HWRIT shaky messy handwriting stating "The sun will rise," illegible, dark green ink on old water-damaged paper with visible mildew marks.”
The generated image has the text in the same style I had mentioned in the prompt.
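For the Diffusers route, the usual pattern is to load the base FLUX.1-dev pipeline and attach the LoRA weights on top of it. The snippet below is a minimal sketch under a few assumptions: the base checkpoint is `black-forest-labs/FLUX.1-dev` (a gated repository, so its license has to be accepted first), a CUDA GPU with enough memory is available, and the step count and guidance scale are illustrative values rather than settings recommended by the model author. The helper names `with_trigger` and `generate_handwriting` are made up for this example, and depending on how the LoRA repository is laid out, `load_lora_weights` may also need a `weight_name` argument.

```python
TRIGGER = "HWRIT handwriting"  # trigger phrase mentioned in the note above


def with_trigger(prompt: str) -> str:
    """Prefix the prompt with the trigger phrase unless HWRIT is already present."""
    prompt = prompt.strip()
    return prompt if "HWRIT" in prompt else f"{TRIGGER} {prompt}"


def generate_handwriting(prompt: str, out_path: str = "handwriting.png") -> str:
    """Generate one image with the base model plus the handwriting LoRA."""
    # Heavy imports live inside the function so with_trigger works without them.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev",  # assumed base checkpoint (gated repo)
        torch_dtype=torch.bfloat16,
    )
    pipe.load_lora_weights("fofr/flux-handwriting")
    pipe.to("cuda")  # assumes a CUDA GPU is available

    image = pipe(
        with_trigger(prompt),
        num_inference_steps=28,  # illustrative value
        guidance_scale=3.5,      # illustrative value
    ).images[0]
    image.save(out_path)
    return out_path
```

Calling `generate_handwriting('saying "The sun will rise" in dark green ink')` would then write `handwriting.png` to the working directory.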
Let's try the Inference API
Get your Hugging Face access token from here: Hugging Face Tokens
from huggingface_hub import InferenceClient
# Replace "hf_token" with your own Hugging Face access token
client = InferenceClient("fofr/flux-handwriting", token="hf_token")
# Generate the image (returned as a PIL image)
image = client.text_to_image('HWRIT scrawling messy handwriting saying "I am Iron Man", illegible, written with a HB pencil on a grainy paper')
Output
This looks pretty good, and the model didn't mess up any of the characters either.
Note: Image generation takes a while.
Replicate
You can also choose to run this model on Replicate, but it will cost you roughly $1 for 90 runs, or roughly $0.011 per run. This may vary.
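As a quick sanity check on the arithmetic, $1 for 90 runs works out to about $0.011 per run. The sketch below uses only the approximate 90-runs-per-dollar figure quoted above; `estimated_cost` is a hypothetical helper, and actual pricing depends on the hardware and run time Replicate bills for.

```python
RUNS_PER_DOLLAR = 90  # approximate figure quoted above; actual pricing may vary


def estimated_cost(num_images: int, runs_per_dollar: int = RUNS_PER_DOLLAR) -> float:
    """Estimated cost in USD for generating num_images images, one run each."""
    return num_images / runs_per_dollar


print(f"per run:  ${estimated_cost(1):.3f}")   # per run:  $0.011
print(f"7 images: ${estimated_cost(7):.3f}")   # 7 images: $0.078
```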
Storytelling Application
Let's create an LLM application that first writes a story, then breaks it into 7 pieces, and then generates 7 handwritten images to support the storytelling. We'll then combine these images to complete the application.
We'll be using Gemini models to generate the story and to make the prompts that generate images from flux-handwriting. First, let's get our hands on a Gemini API key:
Simply click on Create API Key to get a new key for the Gemini models.
Installation
Install the google-generativeai package to use the Gemini models:
!pip install -q -U google-generativeai
Implementation
Configure the API key and choose the model. I'll be using the ‘gemini-1.5-flash’ model.
import google.generativeai as genai
# Replace "API-Key" with your own Gemini API key
genai.configure(api_key="API-Key")
model = genai.GenerativeModel("gemini-1.5-flash")
Generating the story:
response = model.generate_content("Write a short and clear story in about 80 words, about a day in the life of a man named Cyan turning into a superhero.")
story = response.text
print(story)
Cyan woke to a throbbing headache, a strange symbol burning into his palm.
That day, mundane tasks – grocery shopping, dog walking – felt amplified,
his senses sharper. A speeding car careened towards a child; instinctively,
Cyan reacted. He moved faster than he thought possible, a blur of motion,
saving the child. The symbol glowed. He was no longer just Cyan. He was
something more.
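Now split the story into 7 parts:
# Split on sentence boundaries and drop trailing periods so we don't end up with ".."
sentences = [s.strip().rstrip(".") for s in story.split(". ") if s.strip()]
# Keep at most the first 7 sentences, each ending with a single period
prompts = [f"{s}." for s in sentences[:7]]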
Wrap each of the 7 parts in a prompt template for the image model:
handwriting_prompts = [
    f"HWRIT handwriting style for the text: '{prompt}' in a neat cursive writing in orange Ink and red paper background"
    for prompt in prompts if prompt.strip()
]
Function to generate the handwritten images:
(Get your Hugging Face token and make sure to check all the inference boxes while creating the token.)
from huggingface_hub import InferenceClient
import time
client = InferenceClient("fofr/flux-handwriting", token="hf_token")
def handwriting_text(prompt):
    # Send the prompt to the hosted model and return the generated PIL image
    image = client.text_to_image(prompt)
    return image
Generating images with handwritten text:
handwritten_images = []
for prompt in handwriting_prompts:
    image = handwriting_text(prompt)
    handwritten_images.append(image)
    time.sleep(120)  # 2-minute delay between requests
Note: The API request may throw an error along the lines of “Max requests
total reached on image generation inference (3). Wait up to one minute
before being able to process more Diffusion requests.” Hence, we add a
120-second sleep after each request in the for loop.
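An alternative to a fixed pause after every request is to wait only when a request actually fails and then retry. The following is a rough sketch of that idea, not something the original workflow uses: `call_with_retry` is a hypothetical helper, it treats every exception as retryable (an assumption; in practice you would likely narrow this to the rate-limit error), and the 60-second base delay is simply taken from the "wait up to one minute" wording of the error message.

```python
import time


def call_with_retry(fn, *args, retries=3, base_delay=60.0, sleep=time.sleep):
    """Call fn(*args); on failure wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(retries + 1):
        try:
            return fn(*args)
        except Exception:  # assumption: any error is treated as retryable
            if attempt == retries:
                raise  # out of retries, surface the original error
            sleep(base_delay * 2 ** attempt)
```

Inside the loop, `image = call_with_retry(handwriting_text, prompt)` would replace the direct call. The `sleep` parameter exists only so the waiting can be swapped out when testing.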
Generating the video using OpenCV:
import cv2
def create_video_from_images(image_list, output_video_path, fps=1):
    # Load the first image to get the frame dimensions
    frame = cv2.imread(image_list[0])
    height, width, _ = frame.shape
    # Initialize the video writer
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    video = cv2.VideoWriter(output_video_path, fourcc, fps, (width, height))
    # Write each image to the video
    for image_path in image_list:
        frame = cv2.imread(image_path)
        video.write(frame)
    # Release the video writer
    video.release()
# Save the images to disk and create a video
image_file_paths = []
for idx, image in enumerate(handwritten_images):
    file_path = f"handwritten_image_{idx}.png"
    image.save(file_path)
    image_file_paths.append(file_path)
# Combine images into a video
create_video_from_images(image_file_paths, "handwritten_story.mp4", fps=0.25)
print("Video created: handwritten_story.mp4")
We saved the images and then combined them into a video with a frame rate of 0.25 (1 frame every 4 seconds, for readability).
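The same pacing can also be reached with a whole-number frame rate by writing each image several times. This may be worth trying if a particular player or codec handles a fractional frame rate poorly. The helper below is a small sketch of that idea; `repeat_frames` is a hypothetical name, and the defaults mirror the 4 seconds per image used above.

```python
def repeat_frames(image_paths, seconds_per_image=4, fps=1):
    """Return the paths with each one repeated so it stays on screen long enough."""
    copies = max(1, round(seconds_per_image * fps))
    return [path for path in image_paths for _ in range(copies)]


print(repeat_frames(["a.png", "b.png"], seconds_per_image=2, fps=1))
# ['a.png', 'a.png', 'b.png', 'b.png']
```

With this in place, `create_video_from_images(repeat_frames(image_file_paths), "handwritten_story.mp4", fps=1)` would show each image for about 4 seconds.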
Output
Link to the video: handwritten_story.mp4
Note: The model struggles when generating images with more than 4-5 words per image, so we need to restrict the amount of text in each one.
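One simple way to respect that limit is to cut the text into short word groups before building the prompts. This is a minimal sketch; `chunk_words` is a hypothetical helper, and the default of 5 words simply follows the limit mentioned in the note.

```python
def chunk_words(text, max_words=5):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


sentence = "Cyan woke to a throbbing headache"
print(chunk_words(sentence, max_words=4))
# ['Cyan woke to a', 'throbbing headache']
```

Chunks produced this way ignore phrase boundaries, so the result can read awkwardly; it is a stopgap rather than a replacement for writing shorter captions.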
One exercise you can try is to have an LLM write the prompts instead of splitting the story and using a common template. This would enforce the text limit, and the LLM could tune the style and background of each image to match its text.
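A possible starting point for that exercise is sketched below. It assumes the LLM is asked to return one image prompt per line and that `model` is any object with a `generate_content()` method returning something with a `.text` attribute, as the Gemini model used earlier does. The instruction wording and the names `INSTRUCTION`, `parse_prompts` and `llm_prompts` are illustrative, not part of the original workflow.

```python
INSTRUCTION = (
    "Split the story below into 7 captions of at most 5 words each. "
    "For each caption, write one image prompt on its own line that starts with "
    "'HWRIT handwriting', quotes the caption, and describes an ink and paper "
    "style that suits its mood.\n\nStory:\n{story}"
)


def parse_prompts(llm_text, limit=7):
    """Keep lines that start with the trigger word, up to `limit` of them."""
    lines = [line.strip() for line in llm_text.splitlines()]
    return [line for line in lines if line.startswith("HWRIT")][:limit]


def llm_prompts(model, story):
    """Ask the model for image prompts and return them as a list of strings."""
    response = model.generate_content(INSTRUCTION.format(story=story))
    return parse_prompts(response.text)
```

The returned list could be used in place of `handwriting_prompts`. LLM output is not guaranteed to follow the requested format, which is why the parser drops any line that does not start with the trigger word.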
Conclusion
In conclusion, using Flux models such as “fofr/flux-handwriting” opens up new opportunities for creating custom handwritten-style visuals. Whether you are writing standalone prompts or building full storytelling solutions, these tools highlight AI's ability to combine artistic creativity with practical applications. The storytelling application shows how easily AI-generated visuals can be blended into multimedia projects.
Additionally if you’re in search of a Generative AI course on-line then, discover: GenAI Pinnacle Program
Frequently Asked Questions
Ans. The “flux-handwriting” model is a LoRA (Low-Rank Adaptation) fine-tuned version of the FLUX.1-dev model, designed to generate images of handwriting in various styles based on text prompts.
Ans. First, load the base FLUX.1-dev model using the Diffusers library. Then, apply the “flux-handwriting” LoRA weights to the pipeline. Finally, generate images by providing prompts.
Ans. To activate the handwriting generation feature, include the trigger phrase HWRIT handwriting in your prompt.
Ans. You can use Replicate to run inference with the fofr/flux-handwriting model: Replicate.