Vlm

The Vlm Node represents an integration with a Visual Language Model (VLM) provider. This type of node is typically used for advanced AI tasks that involve both visual and language understanding, such as analyzing images, generating text from images, creating images from text, or performing multimodal reasoning.

What the Vlm Node Is Best Used For

  • Image Analysis: Extracting information, objects, or text from images.

  • Image Generation: Creating new images based on text prompts or modifying existing images.

  • Document Understanding: Reading and summarizing visual documents (e.g., PDFs, scanned pages).

  • Multimodal Tasks: Combining text and image inputs for richer AI outputs (e.g., answering questions about a chart or diagram).

  • File Management: Uploading, downloading, and managing files for use in AI workflows.

Establishing a Connection

The Vlm Node requires establishing a new connection using an API key before use.

Available Actions (Exhaustive List)

Here are all the actions available for the Vlm provider:

  1. Get Files – Get a list of files.

  2. Upload File – Upload a file.

  3. Get File (by ID) – Get a file by its ID.

  4. Get File (by Hash) – Get a file by its hash.

  5. Generate Presigned URL – Generate a presigned URL for file upload.

  6. Verify File Upload – Verify a file upload.

  7. Check Health – Check the health of the OpenAI VLM service.

  8. chat_completions_v1_openai_chat_completions_post – Generate chat completions (multimodal).

  9. models_v1_openai_models_get – List available models.

  10. model_info_v1_openai_models__model__get – Get information about a specific model.

  11. info_v1_hub_info_get – Get hub information.

  12. list_domains_v1_hub_domains_get – List available domains.

  13. get_domain_schema_v1_hub_schema_post – Get the schema for a domain.

  14. image_generate_v1_image_generate_post – Generate an image from a prompt.

  15. schema_generate_image_v1_image_schema_post – Get the schema for image generation.

  16. document_generate_v1_document_generate_post – Generate a document from a prompt.

  17. document_execute_v1_document_execute_post – Execute a document generation task.

  18. schema_generate_document_v1_document_schema_post – Get the schema for document generation.

  19. video_generate_v1_video_generate_post – Generate a video from a prompt.

  20. audio_generate_v1_audio_generate_post – Generate audio from a prompt.

  21. agent_execute_v1_agent_execute_post – Execute an agent task (multimodal agent).

  22. get_predictions_v1_predictions_get – List predictions.

  23. get_prediction_v1_predictions__id__get – Get a specific prediction.

  24. get_prediction_domain_v1_predictions__id__domain_get – Get the domain of a prediction.

  25. health_v1_health_get – General health check.

  26. get_models_v1_models_get – List all models.

  27. get_domains_v1_domains_get – List all domains.

  28. get_schema_v1_schema_post – Get a schema for a task.

Note: Each action is designed for a specific type of multimodal or file-related task. Some are for file management, others for generating or analyzing content, and some for managing or querying models and domains.

Last updated

Was this helpful?