Conversational text and image generation