AI Models and Creators

  1. Nova - Amazon
  2. Gemini, Gemma - Google
  3. Granite - IBM
  4. GPT - OpenAI
  5. Phi - Microsoft
  6. Einstein - Salesforce
  7. Joule - SAP
  8. Grok - xAI
  9. Llama - Meta
  10. Qwen - Alibaba
  11. Claude - Anthropic
  12. Bard - Google
  13. PaLM - Google
  14. Mistral - Mistral AI
  15. Falcon - Technology Innovation Institute (TII), UAE
  16. Gato - DeepMind
  17. Jasper - Jasper AI
  18. Bloom - BigScience (collaborative project)
  19. Ernie - Baidu
  20. Alpaca - Stanford University (fine-tuned LLaMA model)
  21. Stable Diffusion - Stability AI
  22. HuggingChat - Hugging Face
  23. Command - Cohere
  24. AlphaFold - DeepMind

Models Developed by Microsoft

Microsoft has developed or collaborated on several AI models and frameworks, especially as part of its Azure AI ecosystem and its partnership with OpenAI. Below is a list of models and AI systems associated with Microsoft:

Models and Frameworks Developed by Microsoft

  1. Phi - Microsoft’s family of small language models, available through Azure AI.
  2. Turing Models - A series of language models developed by Microsoft:
    • Turing-NLG: A Natural Language Generation model.
    • Turing-Bletchley: A multimodal model for text and image understanding.
    • Turing-Universal Language Representation: Pretrained models for text classification and language tasks.
  3. Orca - Microsoft Research’s family of small language models designed for reasoning tasks.
  4. Z-Code - Models for multilingual and multimodal language processing.
  5. Florence - A vision model focused on image recognition and understanding.
  6. Kosmos - Multimodal large language models (Kosmos-1, Kosmos-2) for processing text and images.
  7. Guidance - An open-source library for controlling and constraining the output of language models.
  8. Project InnerEye - AI models for medical imaging analysis.
  9. Project Bonsai - A reinforcement learning-based model for autonomous system training.
  10. DeepSpeed - A deep learning optimization library developed by Microsoft that enhances model training efficiency, supporting many of their models.
  11. Microsoft Custom Neural Voice - AI models for speech synthesis using a few-shot learning approach.

Collaboration with OpenAI

While Microsoft didn’t directly develop OpenAI’s models (like GPT-4 or DALL·E), the company integrated them deeply into its ecosystem:

  • Azure OpenAI Service provides access to models like GPT, Codex, and DALL·E.
  • These models are branded within Microsoft products as part of tools like Copilot for Office, GitHub, and Azure services.

Models Developed by Meta

  1. Llama (Large Language Model Meta AI)
    • A family of large language models optimized for efficiency and open research.
    • Llama 2 - An improved version released with commercial and research use permissions.
    • Llama 3 - Base model focusing on improving context length, multilingual support, and general-purpose NLP tasks.
    • Llama 3.1 - Introduced improvements in instruction-following, safety, and accuracy in multilingual tasks.
    • Llama 3.2 - Added multimodal models (11B and 90B parameters) that handle both text and vision tasks, enabling applications such as visual grounding and document analysis. Also added lightweight text-only models (1B and 3B parameters) designed to run on edge devices and in other local environments; the larger models (up to 90B parameters) are tailored for cloud-based applications.
    • Llama 3.3 - A text-only 70B instruction-tuned model whose performance approaches that of Llama 3.1 405B.
  2. OPT (Open Pretrained Transformer) - An open-source large language model family designed for transparency and accessibility.
  3. ImageBind - Multimodal AI model linking six modalities (images and video, text, audio, depth, thermal, and IMU motion data).
  4. Segment Anything Model (SAM) - A model for general-purpose object segmentation in images.
  5. DINO (self-DIstillation with NO labels) - A self-supervised learning method for visual tasks.
  6. Mask2Former - A unified model for image and video segmentation tasks.
  7. No Language Left Behind (NLLB) - A model for high-quality translation across 200 languages.
  8. Ego4D - A dataset and model for understanding human interactions from an egocentric perspective.
  9. DeepFace - Early facial recognition model developed for identity verification.
  10. ConvNet (Convolutional Networks) - Developed for image recognition and processing.
  11. TextStyleBrush - A model for text style transfer in handwritten or digital text.
  12. CodexSwitch - Model designed to seamlessly switch between coding languages.
  13. MultiRay - A scalable system for embedding production at Meta.
  14. Meta AI Open Graph Representations (FAIR models) - Models for graph-based learning and social network analysis.
  15. Animated Drawings - AI models allowing users to animate their sketches.
  16. Sphere - A model for knowledge-intensive natural language tasks.
  17. RoBERTa - A robustly optimized BERT variant for NLP tasks.
  18. LOOP - Model designed for open-ended reinforcement learning tasks.
  19. TimeSformer - A transformer-based model for video understanding.
  20. LASER (Language-Agnostic SEntence Representations) - Language-agnostic sentence embeddings for translation and cross-lingual tasks.
  21. Pharos - Developed for AR/VR environments to understand spatial positioning and context.
  22. TorchRec - A framework for recommendation system models.
  23. PyTorch - Although technically a framework, PyTorch plays a foundational role in Meta’s AI models.
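
The Llama entry above lists model sizes from 1B to 405B parameters and notes that the small ones suit edge devices while the large ones suit the cloud. A rough way to see why is to estimate the memory needed just to hold the weights. The sketch below is back-of-the-envelope arithmetic, not a Meta tool: the function name weight_memory_gb is hypothetical, 2 bytes per parameter assumes 16-bit weights, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory (decimal GB) needed to store the model weights only.

    Assumption: every parameter uses the same number of bytes
    (2.0 for 16-bit weights, 0.5 for 4-bit quantized weights).
    """
    total_bytes = params_billions * 1e9 * bytes_per_param
    return total_bytes / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the Llama entries in this post.
    for name, size in [("Llama 3.2 1B", 1), ("Llama 3.2 3B", 3),
                       ("Llama 3.3 70B", 70), ("Llama 3.1 405B", 405)]:
        print(f"{name}: ~{weight_memory_gb(size):.0f} GB at 16-bit")
```

Under these assumptions a 3B model needs about 6 GB for its weights, which is within reach of a laptop or a high-end phone, while a 405B model needs hundreds of gigabytes spread across several accelerators.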

Models from Google

Models Developed by Google

  1. Gemini - Google’s flagship family of large language models (LLMs).
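  2. Gemma - A family of lightweight open-weight models built from the same research and technology as Gemini.
  3. Bard - Google’s chatbot, originally powered by LaMDA and later by PaLM 2; rebranded as Gemini in 2024.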
  4. PaLM (Pathways Language Model) - A large-scale LLM for various language tasks.
    • PaLM 2 - Successor to PaLM with improvements in reasoning and multilingual capabilities.
  5. MUM (Multitask Unified Model) - Designed for complex search queries and multimodal understanding.
  6. BERT (Bidirectional Encoder Representations from Transformers) - A breakthrough NLP model for understanding context in sentences.
  7. LaMDA (Language Model for Dialogue Applications) - Optimized for conversational AI.
  8. Imagen - Text-to-image generative model.
  9. MusicLM - Model for generating high-fidelity music from text descriptions.
  10. Flamingo - A multimodal model for image and text understanding.
  11. MedPaLM - A medical AI model built on PaLM for healthcare-related tasks.
  12. Pathways - A general-purpose AI framework for training a single model to handle multiple tasks.
  13. TAPAS - Table-based question-answering model.
  14. BigGAN - Generative Adversarial Network for high-quality image synthesis.
  15. Perceiver - General-purpose architecture for processing diverse data types (e.g., text, images).
  16. Sparse Transformers - An attention architecture for processing long sequences efficiently (originally introduced by OpenAI).
  17. Universal Sentence Encoder - Model for creating embeddings of sentences for semantic similarity.
  18. TensorFlow - A framework for training and deploying AI models.
  19. TensorFlow.js - A JavaScript library for using TensorFlow models in web applications.
  20. TensorFlow Serving - A service for serving TensorFlow models.
  21. TensorFlow Lite - A lightweight library for running TensorFlow models on mobile devices.

Models Developed by DeepMind (Google Subsidiary)

  1. AlphaGo - The first AI to defeat human champions in the game of Go.
  2. AlphaZero - Generalized version of AlphaGo for games like chess and shogi.
  3. AlphaFold - Revolutionary model for predicting protein folding structures.
  4. Gato - A general-purpose model for multiple tasks across modalities.
  5. Perceiver IO - Versatile model architecture for handling text, images, and more.
  6. Flamingo - Multimodal model for combining vision and language.
  7. Chinchilla - Optimized LLM designed to reduce computational costs while improving performance.
  8. Sparrow - AI chatbot with a focus on safety and factual accuracy.
  9. Dreamer - Model for reinforcement learning and planning in simulated environments.
  10. WaveNet - Neural network for generating high-quality speech synthesis.
  11. MuZero - Model combining learning and planning, capable of mastering games without knowing rules.
  12. Catalyst - Research-focused framework for exploring new AI architectures.
  13. DMRL (DeepMind Reinforcement Learning) - Applied for advanced autonomous decision-making tasks.

Models Developed by OpenAI

  1. GPT-3 Series (Generative Pre-trained Transformers) - Natural language understanding and generation. Can perform tasks like summarization and supports few-shot, one-shot, and zero-shot learning. Model sizes: available in different versions, such as davinci, curie, babbage, and ada, varying in capability and cost.

  2. GPT-4 - Advanced language understanding and generation, capable of complex reasoning and producing coherent, contextually relevant text.

  3. GPT-4o, GPT-4o mini - Multimodal models that process text, image, and audio inputs, providing versatile outputs across different media.
  4. Codex - Specializes in programming tasks. Can generate, debug, and explain code across many programming languages.

  5. DALL·E - Text-to-image generation. Support for inpainting (modifying parts of an image).

  6. Whisper - Automatic speech recognition (ASR) model. Supports multilingual transcription and translation.

  7. CLIP (Contrastive Language–Image Pretraining) - Enables tasks like image classification, zero-shot image recognition, and content-based image search.

  8. Point-E - Generates 3D models or point clouds from textual prompts.

  9. Shap-E - Text-to-3D model similar to Point-E but with enhanced texture and detailed modeling.

  10. GPT-Fine-Tuned Models - Fine-tuned GPT models tailored for specific domains like customer service, e-commerce, and education.

  11. o1-preview, o1-mini - Designed to enhance reasoning abilities through a “chain of thought” approach, excelling at complex problem-solving in mathematics, coding, and scientific domains.
    • o1-preview: The latest snapshot of the o1 model, trained on data up to October 2023, with a context window of 128,000 tokens.
    • o1-mini: A more accessible version of o1, offering a balance between performance and cost, suitable for a wider range of applications.

  12. ChatGPT, Canvas - These are user interfaces to the models, not models themselves.
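
To make the zero-shot, one-shot, and few-shot terms from the GPT-3 entry concrete, here is a minimal sketch of how the three prompt styles differ: the only change is how many worked examples precede the actual query. The function name build_prompt, the Input/Output layout, and the sample sentences are illustrative assumptions, not an official OpenAI format, and the sketch only builds the prompt text without calling any API.

```python
from typing import List, Tuple


def build_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a plain-text prompt: instruction, worked examples, then the query."""
    lines = [task, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final "Output:" is left empty for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical labelled examples for a sentiment task.
EXAMPLES = [
    ("The movie was wonderful.", "positive"),
    ("I want my money back.", "negative"),
]
TASK = "Classify the sentiment as positive or negative."
QUERY = "Great service!"

zero_shot = build_prompt(TASK, [], QUERY)            # no examples
one_shot = build_prompt(TASK, EXAMPLES[:1], QUERY)   # exactly one example
few_shot = build_prompt(TASK, EXAMPLES, QUERY)       # several examples

if __name__ == "__main__":
    print(few_shot)
```

In all three cases the model weights stay unchanged; the examples live entirely in the prompt, which is what distinguishes this from fine-tuning.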

Experimental/Research Models

  1. OpenAI Microscope
    • Visualization tool for understanding the inner workings of neural networks.
  2. Safety Gym
    • Focuses on reinforcement learning with safety constraints for real-world AI systems.
  3. Rubik’s Cube Solver
    • Model trained using reinforcement learning to solve the Rubik’s Cube with a robotic hand.

What are the Moderation Models?

Moderation models are trained to detect and classify content that might violate policies or guidelines, such as harmful, hateful, illegal, or explicit material. They focus on identifying violations across several categories.

Key Moderation Categories

  1. Hate Speech: Detects language intended to demean or incite hostility toward groups or individuals based on race, gender, religion, etc.

  2. Harassment and Threats: Identifies abusive language, including threats of violence or intimidation.

  3. Self-Harm and Suicide: Flags content promoting or discussing self-harm or suicidal ideation inappropriately.

  4. Violence: Detects descriptions or encouragement of violent acts.

  5. Sexual Content: Flags explicit or inappropriate sexual content.

  6. Child Sexual Abuse Material (CSAM): Identifies and prevents interaction with illegal content related to child exploitation.

  7. Terrorism and Extremism: Flags content promoting or glorifying terrorist activities or extremist ideologies.


Moderation Models Evaluation Parameters:

  • Capabilities: Categorizes content into broad areas like violence, hate speech, and explicit material.
  • Tuning Capabilities: Developers can tune models for specific use cases, such as applications that require basic safety filtering for user-generated content (e.g., chatbots, forums).
  • Fine-Tuned Moderation Models: Customization for domain-specific or organizational needs. Allows businesses to align moderation with their specific content policies.
  • Scalability: Designed to handle large-scale applications with real-time needs.
  • Language Support: Can analyze content in multiple languages depending on the implementation.
  • Flexibility: Can integrate with APIs for dynamic content moderation.
  • Threshold Adjustment: Developers can set sensitivity levels to match desired strictness.
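
As a rough illustration of threshold adjustment, the sketch below assumes a moderation model that returns a score between 0 and 1 for each category and shows how per-category sensitivity levels turn those scores into flags. The category keys, the threshold values, and the function name flagged_categories are all hypothetical; real moderation services define their own categories and response formats.

```python
from typing import Dict, List, Optional

# Hypothetical per-category thresholds: a lower value means stricter filtering.
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "hate": 0.5,
    "harassment": 0.5,
    "self_harm": 0.3,
    "violence": 0.6,
    "sexual": 0.5,
}


def flagged_categories(
    scores: Dict[str, float],
    thresholds: Optional[Dict[str, float]] = None,
    fallback: float = 0.5,
) -> List[str]:
    """Return the categories whose score meets or exceeds its threshold."""
    limits = DEFAULT_THRESHOLDS if thresholds is None else thresholds
    return sorted(
        category
        for category, score in scores.items()
        if score >= limits.get(category, fallback)
    )


if __name__ == "__main__":
    # Made-up scores standing in for a model's output on one message.
    scores = {"hate": 0.12, "violence": 0.65, "self_harm": 0.31}
    print(flagged_categories(scores))
    # A stricter policy, e.g. for a children's forum, lowers every threshold.
    strict = {category: 0.1 for category in DEFAULT_THRESHOLDS}
    print(flagged_categories(scores, strict))
```

Keeping the thresholds in a plain mapping lets each deployment align strictness with its own content policy without retraining the underlying model.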

Applications:

  • Chat Moderation: Prevent abusive or harmful language in conversational agents.
  • Forum Moderation: Automatically flag inappropriate posts or comments.
  • Social Media Platforms: Ensure adherence to community guidelines.
  • Corporate Compliance: Monitor sensitive communications for potential policy violations.

What Copilots and AI-Powered IDEs Are Available?

Microsoft Copilot: An AI-powered productivity tool integrated into Microsoft 365 applications like Word, Excel, PowerPoint, and Outlook. It assists in drafting content, analyzing data, creating presentations, and managing emails, enhancing user efficiency and creativity.

GitHub Copilot: An AI pair programmer developed by GitHub, integrated into various code editors. It suggests code snippets, completes lines or blocks of code, and assists in generating entire functions, supporting multiple programming languages and frameworks.

Amazon CodeWhisperer: An AI coding companion by Amazon Web Services (AWS) that provides real-time code suggestions, helps in writing code faster, and ensures best practices, supporting various programming languages and AWS services.

Salesforce Einstein GPT: A generative AI for CRM by Salesforce, delivering AI-powered content across sales, service, marketing, and IT interactions, enhancing customer relationship management.

Google Duet AI: An AI collaborator integrated into Google Workspace, assisting users in drafting emails, generating documents, creating presentations, and analyzing data within Google’s suite of applications.

Tabnine: Delivers AI-driven code completions, supporting numerous languages and IDEs, including VS Code, IntelliJ, and PyCharm. It emphasizes privacy with options for local model deployment. Tabnine offers both free and paid plans, catering to individual developers and teams.

Replit’s Ghostwriter: Integrated into Replit’s online IDE, Ghostwriter assists with code generation and debugging across various languages. It’s particularly beneficial for collaborative coding and is accessible through Replit’s subscription services.

Cursor AI: Focuses on enhancing developer productivity through intelligent code suggestions and refactoring capabilities. It integrates with popular code editors and supports multiple programming languages.

v0 by Vercel: v0 is an AI tool for code generation and problem-solving, capable of understanding project context and generating meaningful code contributions. It works across multiple files in a project, providing comprehensive assistance.

Windsurf: An AI-first integrated development environment (IDE) from Codeium. Unlike traditional copilots that function as extensions to existing IDEs, Windsurf is a standalone, purpose-built development environment with AI built in. It offers context-aware code generation, intelligent refactoring, and debugging tools, which makes it well suited to large and complex projects. The aim is to speed up coding, reduce context-switching, and give developers deeper insight into the codebase.
