OmniGen: An Open Source AI Model That Lets You Edit Images Conversationally
The post OmniGen: An Open Source AI Model That Lets You Edit Images Conversationally appeared on BitcoinEthereumNews.com.
This is Decrypt’s co-founder, Josh Quittner, having a casual meeting with his friend, Vitalik Buterin. No, not really. They’ve never met, much less been in the same place at the same time. This image is a fake, which isn’t surprising. What is surprising is that it took us less than a minute to build, using two photos and a simple prompt: “The man from image 1 and the man from image 2 posing for the cameras in a bbq party.” Pretty nifty.

The model is OmniGen, and it’s a lot more than just an image generator. It focuses on image editing and context understanding, letting users tweak their generations by simply chatting with the model rather than loading standalone third-party tools. It is capable of “reasoning” and understanding commands thanks to its embedded LLM.

Researchers at the Beijing Academy of Artificial Intelligence have finally released the weights (the trained model files that users can run on their own computers) of this new type of AI model, which may be an all-in-one source for image creation.

Unlike its predecessors, which operated like single-purpose task executors (having artists load separate image generators, ControlNets, IP-Adapters, inpainting models, and so on), OmniGen functions as a comprehensive creative suite. It handles everything from basic image editing to complex visual reasoning tasks within a single, streamlined framework.

OmniGen relies on two core components: a Variational Autoencoder (the good old VAE that all AI artists are so familiar with) that deconstructs images into their fundamental building blocks, and a transformer model that processes varied inputs with remarkable flexibility. This stripped-down approach eliminates the need for the supplementary modules that often bog down other image generation systems.

Trained on X2I (anything-to-image), a dataset of one billion images, OmniGen handles tasks ranging from text-to-image generation and sophisticated photo editing to more nuanced operations…
Filed under: News - @ November 4, 2024 9:28 pm