What are Multimodal and Task Interfaces?
Multimodal and task interfaces let you generate images, create music, produce videos, and perform other complex tasks through Model Router. Unlike regular chat completions, these tasks are asynchronous: you submit a task and check back later for the result.
This is an advanced feature. Make sure you understand basic API usage first; see Direct API Requests for the basics.
How Task Interfaces Work
Task interfaces follow a three-step pattern:
- Submit task: Send a request to create a task (e.g., generate an image)
- Poll status: Query the task status using the task ID
- Get results: Retrieve the final output (image, audio, video, etc.)
Tasks are asynchronous - they don’t return results immediately. You need to poll for status and retrieve results when the task is complete.
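The submit, poll, and retrieve pattern can be sketched as a small polling loop. This is a minimal illustration, not Model Router's actual client: `poll_task` and `fetch_status` are hypothetical names, the response is assumed to be a dictionary with a `status` field, and the terminal values `completed` and `failed` are assumed to match what the task endpoint returns. The status lookup is injected as a function so the loop stays independent of any particular endpoint.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal status values; check the values your task endpoint returns.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_task(
    fetch_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the task is terminal or the timeout elapses.

    fetch_status is a hypothetical callable that takes a task ID and returns
    the task as a dict containing a "status" key.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            # The caller still has to inspect the status: "failed" is terminal too.
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        sleep(interval)
```

In practice `fetch_status` would wrap the HTTP request that queries a task by ID. The returned dictionary should be checked for a failed status before any output is read from it.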
Supported Task Types
Midjourney: Image Generation
Generate high-quality images from text prompts. Capabilities:
- Imagine: Generate images from text
- Describe: Describe existing images
- Blend: Combine multiple images
- Change: Modify existing images
- Shorten: Optimize prompts
- Notify: Get notifications when tasks complete
Suno: Music and Voice Generation
Generate music or voice-overs from text prompts.
RecraftAI: Image Processing
Process and enhance images with various tools. Capabilities:
- Vectorization: Convert bitmaps to vector graphics
- Background Removal: Remove image backgrounds
- Clarity Enhancement: Enlarge images without quality loss
- Style Management: Apply styles to images
Kling: Video Generation
Generate videos from text or images. Capabilities:
- Text-to-Video: Generate videos from text prompts
- Image-to-Video: Generate videos from images with text prompts
Important Notes
- Channel Requirements: Task interfaces require corresponding channels to be enabled. If you get errors, check with your administrator.
- Asynchronous Nature: Tasks don’t return results immediately. Always implement polling logic to check status.
- Quota Consumption: All task interfaces consume quota just like regular API calls. See View Usage and Billing for details.
- Large Data Volumes: Multimodal tasks can have large data volumes. Make sure you have sufficient quota and consider rate limiting.
- Task Visibility: Task submission and query must use the same API key to ensure task visibility.
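One way to satisfy the task visibility requirement is to build both the submit and the query request from a single object that holds one API key. The sketch below only constructs the requests and does not send them. `TaskClient`, the base URL, the paths, and the Bearer authorization scheme are all assumptions for illustration; use the endpoints and authentication your deployment documents.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class TaskClient:
    """Hypothetical helper that binds one API key to every task request."""

    base_url: str  # assumed deployment URL, e.g. "https://router.example.com"
    api_key: str

    def _headers(self) -> Dict[str, str]:
        # Bearer authentication is an assumption, not confirmed by this page.
        return {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }

    def submit_request(self, path: str, payload: Dict[str, Any]) -> urllib.request.Request:
        """Build (but do not send) the POST request that creates a task."""
        return urllib.request.Request(
            self.base_url.rstrip("/") + path,
            data=json.dumps(payload).encode("utf-8"),
            headers=self._headers(),
            method="POST",
        )

    def query_request(self, path: str) -> urllib.request.Request:
        """Build (but do not send) the GET request that checks a task."""
        return urllib.request.Request(
            self.base_url.rstrip("/") + path,
            headers=self._headers(),
            method="GET",
        )
```

Because the key lives on the client object, a task submitted through one `TaskClient` instance is always queried with the same key.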
Best Practices
- Implement Polling: Don’t assume tasks complete immediately. Poll regularly until the status is “completed” or “failed”.
- Handle Errors: Tasks can fail. Always check the status and handle error cases gracefully.
- Monitor Quota: Large tasks consume more quota. Monitor your usage regularly.
- Rate Limiting: Don’t submit too many tasks at once. Stagger submissions to avoid rate limits.
- Check Logs: View task details and billing in the console logs.
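The advice on staggering submissions and handling errors could look like the following sketch. `submit_staggered` and the `submit` callable are hypothetical names, and the one-second default delay is an arbitrary placeholder rather than a documented rate limit. The function sends payloads one at a time, pauses between them, and records failures instead of aborting the whole batch.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple


def submit_staggered(
    submit: Callable[[Dict[str, Any]], str],
    payloads: Iterable[Dict[str, Any]],
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[Tuple[Dict[str, Any], Exception]]]:
    """Submit payloads sequentially with a pause between submissions.

    submit is a hypothetical callable that sends one task and returns its ID.
    Returns the task IDs that were accepted and the (payload, error) pairs
    for submissions that raised.
    """
    task_ids: List[str] = []
    failures: List[Tuple[Dict[str, Any], Exception]] = []
    for index, payload in enumerate(payloads):
        if index > 0:
            sleep(delay)  # pause before every submission except the first
        try:
            task_ids.append(submit(payload))
        except Exception as exc:  # keep going so one bad task does not stop the batch
            failures.append((payload, exc))
    return task_ids, failures
```

The failed payloads can then be logged or retried later, and the accepted task IDs handed to whatever polling logic you use.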