Dubbed the NVLM-D-72B, the model can interpret data presented in charts and tables, understand memes, analyse images and solve complex math equations.
According to the benchmarks shared, NVIDIA's advanced AI model shows better performance and accuracy in vision-language tasks than OpenAI's GPT-4o.
After multimodal training, the NVLM-D-72B's accuracy on key text benchmarks improves by an average of 4.3 points over its text-only backbone.
"Our NVLM-D-1.0-72B demonstrates significant improvements over its text backbone on text-only math and coding benchmarks," the researchers added.
According to the chip brand's researchers, NVLM 1.0 is a family of frontier-class multimodal large language models that achieve state-of-the-art results on vision-language tasks, rivalling the leading proprietary models (e.g., GPT-4o) and open-access models.
NVIDIA's new open-source model gives developers and researchers a new avenue to scrutinise cutting-edge technology. Access to the technology opens up new opportunities to tap into unexplored sectors, ultimately driving growth.
NVIDIA reportedly used open-source resources to develop its cutting-edge AI model, allowing it to learn from other AI models and training data. While its new model is open-source, NVIDIA has restricted its use exclusively to research purposes under its licensing terms. This means users can't leverage the model's capabilities for commercial purposes or modify it for resale.
Most tech corporations in the AI landscape keep their advanced AI models closed to limit misuse and reduce the risk of the technology being used to harm humanity. One AI researcher has predicted a 99.9% chance that AI will end humanity, arguing that the only way to avert this outcome is to stop building AI in the first place.