On February 15th, 2024, OpenAI introduced Sora AI, a new text-to-video tool. The internet went nuts over the full HD videos the model generated. The videos on its page (https://openai.com/sora) are the most realistic ones we have seen in the AI world so far. In this article, we’ll tell you everything we know about Sora, who has access to it, and any other information you might find useful.
What did OpenAI announce?
As described by OpenAI, Sora is “an AI model that can create realistic and imaginative scenes from text instructions”. The company says it is “teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction”.
OpenAI announced that Sora can generate videos up to a minute long and stay consistent with the prompt and the rest of the generated content within the same video (which means the lady you just generated will not magically change jackets mid-way through the video).
Sora is currently available to red teamers, a specific group of people who focus on checking Sora for risks, security problems, bias, and potential ethical issues. It’s not available to the public yet. It is not clear whether this team is made up of OpenAI employees or independent contractors, but OpenAI says it is committed to getting feedback from people outside the company as well.
OpenAI explained that Sora understands not only the prompt written by the human, but also how things work and look in real life. This means that if you ask for a car, you will not see a wonky-looking vehicle with square-shaped wheels (unless you ask for it, of course).
Despite all the good news and new features, OpenAI openly admitted that Sora is not so great yet at simulating the physics of a complex scene, and it may not fully understand cause and effect. In OpenAI's own example, a person might take a bite out of a cookie, and the cookie might show no bite mark afterward.
And finally, OpenAI announced that important safety steps will be taken before Sora is available to the public. As we can all imagine, AI can be used for entertainment, work purposes, and also unethical things, like generating suggestive videos, impersonating people, creating fake news, and more.
The measures include collaborating with experts in areas such as misinformation, hateful content, and bias to test the model. Additionally, tools are being developed to detect misleading content, such as a detection classifier that identifies videos generated by Sora. OpenAI may also be working on a digital watermark to make AI-generated content easy to recognize, just like it planned to do for text-to-image tools.
What’s the technology behind it?
If we had to explain it without being too nerdy, Sora first compresses video into a lower-quality representation and breaks it into small visual patches, a format that scales across different lengths, resolutions, and aspect ratios. Sora then works on those patches. They are the equivalent of tokens in GPT-3.5 or GPT-4: the basic building blocks the model uses to start crafting a response.
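For readers who want to see the "patches as tokens" idea in action, here is a minimal Python sketch. It is only an illustration, not Sora's code (which is not public): the function name `to_spacetime_patches`, the patch sizes, and the toy video are all made up for this example. It also cuts up raw pixel values to keep things simple, whereas OpenAI describes working on a compressed version of the video.

```python
from typing import List

# A toy grayscale video: a list of frames, each frame a list of pixel rows.
Video = List[List[List[int]]]


def to_spacetime_patches(video: Video, pt: int, ph: int, pw: int) -> List[List[int]]:
    """Split a video into flat patches spanning pt frames, ph rows and pw columns.

    Hypothetical helper for illustration only; each returned patch plays
    the role that a token plays in a text model.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):          # step through time
        for y0 in range(0, height, ph):      # step down the frame
            for x0 in range(0, width, pw):   # step across the frame
                patches.append([
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ])
    return patches


# 4 frames of 4x6 pixels; each value encodes its own position (t, y, x).
video = [[[t * 100 + y * 10 + x for x in range(6)] for y in range(4)] for t in range(4)]
patches = to_spacetime_patches(video, pt=2, ph=2, pw=3)
print(len(patches), len(patches[0]))  # 8 patches of 12 values each
```

A longer or larger video simply produces more patches of the same size, which is one way to picture why this format adapts to different durations and resolutions.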
If needed, Sora can improve the quality of the images by generating any missing pixels or removing grain, digital noise, or anything else that isn’t supposed to be there. Since Sora generates and corrects the initial images itself, it can also crop the video and adapt its format to what the user wants.
Sora can also be prompted with other inputs, such as pre-existing images or videos, and can edit them completely (adding a filter, animating them differently, extending videos, rewinding them, creating transitions, etc.). So technically, Sora is a text-to-video, image-to-video, and video-to-video tool as well.
Sora is also able to simulate virtual worlds or universes based on existing video games, movies, or pictures.
Potential applications
Sora AI offers content creators a powerful tool to visualize their ideas in ways never before possible. Writers, bloggers, and journalists can now breathe life into their stories with video content that captures their audience's attention and creates deeper engagement.
In marketing, Sora AI opens up exciting possibilities for brands to connect with their audience on a whole new level. Marketers can use Sora to showcase products and services in dynamic video formats, creating immersive experiences that resonate with consumers.
Educators and e-learning platforms can use Sora AI to enrich learning experiences for students. By transforming text into interactive video simulations, educators can bring complex concepts to life and engage students in meaningful ways.
People with learning difficulties, neuroatypical people, and anyone else who might need personalized video content to get through everyday life could benefit greatly from Sora AI. Many individuals with disabilities, including dyslexia and visual impairments, benefit from visual learning support. Sora AI can generate visual content from text instructions, enabling access to information in formats that are more easily comprehensible. Sora AI can also serve as a valuable tool for individuals with autism by providing visual aids and interactive storytelling experiences that support communication and social skills development.
For filmmakers, animators, and game developers, Sora AI offers a wealth of creative possibilities. With its ability to generate lifelike scenes and characters based on textual prompts, Sora provides a cost-effective solution for bringing creative visions to life. No more spending hours building a universe that you’ll end up throwing away because it didn’t turn out how you wanted. Now you can visualize it in minutes.
Yes, but…
As technology like Sora AI gets better, it's super important to address the ethical considerations associated with its development and deployment.
The biggest ethical concern surrounding AI-generated content is the potential for misinformation and manipulation. With the ability to create realistic and convincing videos from text instructions, there is a risk that malicious actors could exploit Sora AI to spread false information, manipulate public opinion, or deceive audiences. OpenAI must implement robust measures to detect and mitigate the spread of misleading content generated by Sora.
The use of AI-generated content also raises privacy and consent issues, particularly when it comes to using real-life images or videos as input data. Imagine someone saving one of your Facebook pictures, uploading it into Sora, and turning it into a video where you say really embarrassing stuff.
Just as it does with ChatGPT and its other models, OpenAI needs to make sure that Sora AI is safe to use. For now, Sora is still being tested by a limited group of red teamers, and it is possible that OpenAI will decide that Sora isn’t safe enough yet and needs extra work before being released to the public.
In any case, this is a major announcement for the AI world and for content creators, and we’re excited to see where it goes. We will keep you updated on the latest developments about Sora and text-to-video tools in general.
Are you interested in learning how to create content with AI? We've got you covered with our brand new ChatGPT for Content Creation course. Use the code "FACEBOOKMARCH2024" for a super discount!