An open-source multimodal language model that can understand and generate both text and images. Use it for computer vision, image generation, and visual reasoning tasks.