Hands-On Generative AI with Transformers and Diffusion Models

Omar Sanseviero · Pedro Cuenca · Apolinário Passos · Jonathan Whitaker

Nov 2024 · O'Reilly Media, Inc.

Ebook · 418 pages

About this ebook

Learn to use generative AI techniques to create novel text, images, audio, and even music with this practical, hands-on book. Readers will understand how state-of-the-art generative models work, how to fine-tune and adapt them to their needs, and how to combine existing building blocks to create new models and creative applications in different domains.

This go-to book introduces theoretical concepts followed by guided practical applications, with extensive code samples and easy-to-understand illustrations. You'll learn how to use open source libraries to work with transformers and diffusion models, explore code, and study several existing projects to help guide your work.

Build and customize models that can generate text and images
Explore trade-offs between using a pretrained model and fine-tuning your own model
Create and utilize models that can generate, edit, and modify images in any style
Customize transformers and diffusion models for multiple creative purposes
Train models that can reflect your own unique style

About the authors

Omar Sanseviero was the Chief Llama Officer and Head of Platform and Community at Hugging Face, leading the developer advocacy engineering, on-device, and moonshot teams. Omar has extensive engineering experience from his time at Google, where he worked on Google Assistant and TensorFlow Graphics. Omar’s work at Hugging Face was at the intersection of open source, product, research, and technical communities.

Pedro Cuenca is a Machine Learning Engineer at Hugging Face working on diffusion software, models, and applications. He has 20+ years of software development experience in fields like Internet applications (in Spain, he helped create the first interactive educational portal, the first book store, and the first free ISP) and, more recently, iOS. As a co-founder and CTO of LateNiteSoft, he worked on the technology behind Camera+, a successful iPhone photography app. He created deep-learning models for tasks such as photography enhancement and super-resolution. He was also involved in the development and operations behind dalle-mini. He brings a practical vision of integrating AI research into real-world services and the challenges and optimizations involved.

Apolinário Passos is a Machine Learning Art Engineer at Hugging Face, working across different teams on machine learning use cases for art and creativity. Apolinário has 10+ years of professional and artistic experience, alternating between holding art exhibitions, coding, and product management, including a role as Head of Product at World Data Lab. Apolinário aims to ensure that the ML ecosystem supports and makes sense for artistic use cases.
