Add support for the `image-text-to-text` task in the HF LLM pipeline. This will enable image models such as the following call. ```python model = LLM("...") result = model([{ "role": "user", "content": [ {"type": "text", "text": "What is in this image?"}, {"type": "image", "image": "books.jpg"} ]} ]) ```