Multimodal and Task Interfaces

What are Multimodal and Task Interfaces?

Multimodal and task interfaces let you generate images, create music, produce videos, and perform other complex tasks through Model Router. These tasks work differently from regular chat completions: they are asynchronous, meaning you submit a task and then check back later for results.

This is an advanced feature. Make sure you understand basic API usage first. See Direct API Requests for the basics.

How Task Interfaces Work

Task interfaces follow a three-step pattern:

Submit task: Send a request to create a task (e.g., generate an image).
Poll status: Check the task status using the task ID you received on submission.
Get results: Retrieve the final output (image, audio, video, etc.) once the task is complete.

Because tasks are asynchronous, a submission does not return results immediately. Poll for status, then retrieve the results once the task is complete.
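The three-step pattern above can be sketched as a small polling helper. This is a minimal sketch, not the official client: `pollTask` and `checkStatus` are hypothetical names, and the `status` field on the returned object is an assumption. The "completed" and "failed" values follow the wording used under Best Practices; check the actual response of your endpoint before relying on them.

```javascript
// Minimal polling loop (illustrative sketch).
// `checkStatus` is a caller-supplied async function that returns the current
// task object. The `status` field name is an assumption, not a documented schema.
async function pollTask(checkStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const task = await checkStatus();

    // Stop as soon as the task reaches a terminal state.
    if (task.status === "completed") return task;
    if (task.status === "failed") throw new Error("Task failed");

    // Otherwise wait before asking again (skip the wait after the last attempt).
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Task did not finish after ${maxAttempts} attempts`);
}
```

Passing the status check in as a function keeps the loop independent of any one task type, so the same helper can wrap an image, music, or video query. The attempt cap prevents a stuck task from being polled forever.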

Supported Task Types

Midjourney: Image Generation

Generate high-quality images from text prompts.

Capabilities:

Imagine: Generate images from text
Describe: Describe existing images
Blend: Combine multiple images
Change: Modify existing images
Shorten: Optimize prompts
Notify: Get notifications when tasks complete

Example: Submit Imagine Task

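// Submit an Imagine task and keep the response so the task ID can be read from it
const response = await fetch("https://app.memorylake.ai/midjourney/imagine", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-demo123", // replace with your own API key
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    prompt: "a cyberpunk city at night, neon lights, 4k",
    notify: true // ask for a notification when the task completes
  })
});
if (!response.ok) {
  throw new Error(`Task submission failed: HTTP ${response.status}`);
}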

Query Task Status

After submitting, you’ll get a task ID. Use it to check status:

curl https://app.memorylake.ai/tasks/tsk_12345 \
  -H "Authorization: Bearer sk-demo123"
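For code that already uses `fetch`, the same status query can be written in JavaScript. This is a hedged sketch: the URL and header mirror the curl command above, while the helper names (`buildStatusRequest`, `getTaskStatus`) are hypothetical and the response is assumed to be JSON, which this page does not document.

```javascript
const TASK_BASE_URL = "https://app.memorylake.ai";

// Build the URL and fetch options for a status query (mirrors the curl command).
function buildStatusRequest(taskId, apiKey) {
  return {
    url: `${TASK_BASE_URL}/tasks/${encodeURIComponent(taskId)}`,
    options: {
      method: "GET",
      headers: { Authorization: `Bearer ${apiKey}` },
    },
  };
}

// Send the query. Assumes the endpoint answers with a JSON body.
async function getTaskStatus(taskId, apiKey) {
  const { url, options } = buildStatusRequest(taskId, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Status query failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

Calling `getTaskStatus("tsk_12345", "sk-demo123")` issues the same request as the curl example.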

Suno: Music and Voice Generation

Generate music or voice-overs from text prompts.

Example: Submit Music Task

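// Submit a music task and keep the response so the task ID can be read from it
const response = await fetch("https://app.memorylake.ai/suno/jobs", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-demo123", // replace with your own API key
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    prompt: "soft lo-fi background music for coding",
    mode: "music"
  })
});
if (!response.ok) {
  throw new Error(`Task submission failed: HTTP ${response.status}`);
}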

After successful submission, query progress and audio results (playback links, download links, etc.) by task ID.

RecraftAI: Image Processing

Process and enhance images with various tools.

Capabilities:

Vectorization: Convert bitmaps to vector graphics
Background Removal: Remove image backgrounds
Clarity Enhancement: Enlarge images without quality loss
Style Management: Apply styles to images

These tools follow the same pattern: submit a task, poll its status, then get the processed image link.

Kling: Video Generation

Generate videos from text or images.

Capabilities:

Text-to-Video: Generate videos from text prompts
Image-to-Video: Generate videos from images with text prompts

Submit a task, then query it by task ID. Once the task completes, retrieve the video links from the results.

Important Notes

Channel Requirements: Task interfaces require corresponding channels to be enabled. If you get errors, check with your administrator.
Asynchronous Nature: Tasks don’t return results immediately. Always implement polling logic to check status.
Quota Consumption: All task interfaces consume quota just like regular API calls. See View Usage and Billing for details.
Large Data Volumes: Multimodal tasks can involve large amounts of data. Make sure you have sufficient quota and consider rate limiting your submissions.
Task Visibility: Task submission and query must use the same API key to ensure task visibility.
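One way to satisfy the task visibility note is to bind a single API key to a small client object and route both submission and query through it. The sketch below is illustrative only: `createTaskClient` and its `submit`/`query` methods are hypothetical names, and the injectable `fetchImpl` parameter exists purely to make the helper testable.

```javascript
// Hypothetical helper: one client, one API key, used for submit and query alike.
function createTaskClient(apiKey, fetchImpl = globalThis.fetch) {
  const baseUrl = "https://app.memorylake.ai";
  const authHeader = { Authorization: `Bearer ${apiKey}` };

  return {
    // POST a task payload to a submission path such as "/midjourney/imagine".
    submit(path, payload) {
      return fetchImpl(`${baseUrl}${path}`, {
        method: "POST",
        headers: { ...authHeader, "Content-Type": "application/json" },
        body: JSON.stringify(payload),
      });
    },
    // Query a task by ID with the same key that submitted it.
    query(taskId) {
      return fetchImpl(`${baseUrl}/tasks/${encodeURIComponent(taskId)}`, {
        method: "GET",
        headers: authHeader,
      });
    },
  };
}
```

Because the key is captured once when the client is created, a task cannot accidentally be queried with a different key than the one that submitted it.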

Best Practices

Implement Polling: Don’t assume tasks complete immediately. Poll regularly until status is “completed” or “failed”.
Handle Errors: Tasks can fail. Always check the status and handle error cases gracefully.
Monitor Quota: Large tasks consume more quota. Monitor your usage regularly.
Rate Limiting: Don’t submit too many tasks at once. Stagger submissions to avoid rate limits.
Check Logs: View task details and billing in the console logs.
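The staggering and error-handling advice above can be combined in one helper. This is a minimal sketch under stated assumptions: `submitStaggered` is a hypothetical name, each entry in `submitters` is a caller-supplied async function that performs one submission, and the one-second default delay is an arbitrary placeholder rather than a documented rate limit.

```javascript
// Run submissions one at a time with a pause between them,
// recording failures instead of aborting the whole batch.
async function submitStaggered(submitters, delayMs = 1000) {
  const outcomes = [];
  for (let i = 0; i < submitters.length; i++) {
    try {
      outcomes.push({ ok: true, value: await submitters[i]() });
    } catch (error) {
      // Keep going: one failed submission should not block the rest.
      outcomes.push({ ok: false, error });
    }
    // Pause between submissions, but not after the last one.
    if (i < submitters.length - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return outcomes;
}
```

The returned array keeps one entry per submission in order, so callers can retry or log only the entries where `ok` is false.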
