31. Build systems that use multiple kinds of input
Use models that combine text, images, audio, video, documents, and structured data. You will design workflows for captioning, visual question answering, document extraction, speech processing, and multimodal search.