AiShortsGenerator is an API that generates content for short videos: video content, audio, captions, and images. The backend delegates each generation task to an external AI service.
This API uses:
- Gemini API for generating video content.
- Google Cloud Text-to-Speech API for synthesizing speech audio from text.
- AssemblyAI for generating captions for audio/video files.
- Cloudflare Workers AI for generating images from text prompts.
- Cloudinary for storing and serving media files.
The API is built on ASP.NET Core and uses PostgreSQL for data persistence.
- Content Generation: Create video content based on user input using Google's Gemini API.
- Audio Synthesis: Convert text input to speech using Google Cloud Text-to-Speech API.
- Caption Generation: Generate captions for audio or video files using AssemblyAI.
- Image Generation: Generate images from text prompts using Cloudflare's Workers AI API.
- Video Storage: Save and retrieve video data, including associated content and captions.
- Backend Framework: ASP.NET Core 7
- Database: PostgreSQL
- AI Services:
  - Google Gemini API
  - Google Cloud Text-to-Speech
  - AssemblyAI
  - Cloudflare Workers AI
- Cloud Storage: Cloudinary
- ORM: Entity Framework Core
Ensure you have the following set up:
- .NET 9 SDK.
- PostgreSQL for the database (I used Neon Serverless Postgres).
- Google Cloud Text-to-Speech API key.
- Gemini API key.
- Cloudinary account and API key.
- Cloudflare Workers AI account and API key (I used the flux-1-schnell model).
- AssemblyAI API key.
- Clone the repository:
  git clone https://github.com/RianNegreiros/AiShortsVideosGenerator.git
  cd AiShortsVideosGenerator/backend
- Install dependencies:
  dotnet restore
- Configure your settings: create an appsettings.Development.json file with the following settings:
  {
    "Logging": {
      "LogLevel": {
        "Default": "Information",
        "Microsoft.AspNetCore": "Warning"
      }
    },
    "BaseUrl": "http://localhost:5211",
    "GoogleApi": {
      "GeminiKey": "your-gemini-api-key",
      "TextToSpeechKey": "your-text-to-speech-api-key"
    },
    "AssemblyAi": {
      "ApiKey": "your-assemblyai-api-key"
    },
    "CloudinaryUrl": "your-cloudinary-url",
    "Cloudflare": {
      "ApiKey": "your-cloudflare-api-key",
      "AccountId": "your-cloudflare-account-id"
    },
    "ConnectionStrings": {
      "DefaultConnection": "your-postgresql-connection-string"
    }
  }
- Create the database:
  dotnet ef database update
- Run the application:
  dotnet run
  The backend will be accessible at http://localhost:5211.
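Before the first run, it can help to confirm that every setting from the configuration step has been filled in. The following is a minimal Python sketch and not part of the project: the helper names `missing_settings` and `check_file` are hypothetical, and treating any value that still starts with `your-` as unfilled is an assumption made for illustration. The key paths are copied from the sample settings file.

```python
import json

# Key paths copied from the sample appsettings.Development.json.
REQUIRED_KEYS = [
    ("BaseUrl",),
    ("GoogleApi", "GeminiKey"),
    ("GoogleApi", "TextToSpeechKey"),
    ("AssemblyAi", "ApiKey"),
    ("CloudinaryUrl",),
    ("Cloudflare", "ApiKey"),
    ("Cloudflare", "AccountId"),
    ("ConnectionStrings", "DefaultConnection"),
]


def missing_settings(config: dict) -> list[str]:
    """Return dotted names of settings that are absent, empty, or placeholders."""
    missing = []
    for path in REQUIRED_KEYS:
        node = config
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
        # Assumption: sample values such as "your-gemini-api-key" mean "not filled in".
        if not isinstance(node, str) or not node or node.startswith("your-"):
            missing.append(".".join(path))
    return missing


def check_file(path: str = "appsettings.Development.json") -> list[str]:
    """Load the settings file and report which required settings need attention."""
    with open(path, encoding="utf-8") as handle:
        return missing_settings(json.load(handle))
```

Calling `check_file()` from the backend directory returns an empty list when nothing is missing.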
POST /generate-content
- Description: Generate video content from user input using Google's Gemini API.
- Request Body:
  { "input": "Your text input for content generation." }
- Response: List of generated video content items.
- Example:
  [ { "id": 1, "content": "Generated content", "otherProperty": "value" } ]
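As a rough illustration of calling this endpoint from a client, here is a minimal Python sketch using only the standard library. It assumes the backend is running at the local address from the setup steps and accepts a JSON body with a `Content-Type: application/json` header; the function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5211"  # local address from the setup steps


def build_generate_content_request(text: str) -> urllib.request.Request:
    """Build (but do not send) the POST /generate-content request."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/generate-content",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


def generate_content(text: str) -> list:
    """Send the request and return the parsed JSON list of content items."""
    with urllib.request.urlopen(build_generate_content_request(text)) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect without a running server.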
POST /generate-audio
- Description: Convert text input to audio (MP3) using Google Cloud Text-to-Speech API.
- Request Body:
  { "input": "Text to convert to speech." }
- Response: URL of the uploaded audio file.
- Example:
  { "url": "https://your-cloudinary-url.com/audio.mp3" }
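A client would typically post the text and then pull the `url` field out of the response. The sketch below shows one way to do that in Python, assuming the response has exactly the shape shown in the example; the function names and the scheme check are illustrative additions, not part of the API.

```python
import json
import urllib.request
from urllib.parse import urlparse

BASE_URL = "http://localhost:5211"  # local address from the setup steps


def audio_request(text: str) -> urllib.request.Request:
    """Build the POST /generate-audio request."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/generate-audio",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


def audio_url_from_response(raw: bytes) -> str:
    """Extract the "url" field, rejecting anything that is not an HTTP(S) URL."""
    url = json.loads(raw)["url"]
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError(f"unexpected audio URL: {url!r}")
    return url


def synthesize(text: str) -> str:
    """Send the text and return the URL of the uploaded audio file."""
    with urllib.request.urlopen(audio_request(text)) as response:
        return audio_url_from_response(response.read())
```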
POST /generate-captions
- Description: Generate captions for an audio or video file using AssemblyAI.
- Request Body:
  { "fileUrl": "URL of the audio/video file." }
- Response: Transcription and captions in JSON format.
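The exact response shape is not documented here, so the following Python sketch simply returns whatever JSON the endpoint sends back. The client-side check that `fileUrl` is an absolute HTTP(S) URL is an assumption added for illustration, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlparse

BASE_URL = "http://localhost:5211"  # local address from the setup steps


def captions_payload(file_url: str) -> bytes:
    """Serialize the request body after a basic sanity check on the URL."""
    parts = urlparse(file_url)
    # Assumption: the service needs a publicly reachable absolute URL.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"fileUrl must be an absolute http(s) URL, got {file_url!r}")
    return json.dumps({"fileUrl": file_url}).encode("utf-8")


def generate_captions(file_url: str):
    """POST the file URL and return the parsed JSON response unchanged."""
    request = urllib.request.Request(
        BASE_URL + "/generate-captions",
        data=captions_payload(file_url),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```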
POST /generate-image
- Description: Generate an image from a text prompt using Cloudflare's AI API.
- Request Body:
  { "Prompt": "Text prompt for image generation." }
- Response: URL of the uploaded image.
- Example:
  { "url": "https://your-cloudinary-url.com/image.jpg" }
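A video usually needs more than one image, so a client may want to call this endpoint once per prompt. The Python sketch below does that sequentially and collects the returned URLs. It assumes the response shape from the example; the function names are hypothetical, and note that the documented body key is `Prompt` with a capital P.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5211"  # local address from the setup steps


def image_request(prompt: str) -> urllib.request.Request:
    """Build the POST /generate-image request for a single prompt."""
    # The documented request body uses a capitalized "Prompt" key.
    body = json.dumps({"Prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/generate-image",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


def generate_images(prompts: list[str]) -> list[str]:
    """Request one image per prompt, in order, and collect the returned URLs."""
    urls = []
    for prompt in prompts:
        with urllib.request.urlopen(image_request(prompt)) as response:
            urls.append(json.load(response)["url"])
    return urls
```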
POST /save-video
- Description: Save a video record to the database.
- Request Body:
  { "VideoContent": "Generated video content", "Captions": "Generated captions" }
- Response: Success message with video ID.
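The example body shows placeholder strings, and it is not stated whether the API expects nested JSON or JSON-encoded strings for these fields. The Python sketch below therefore passes both values through unchanged; the function name is hypothetical and the JSON content type is an assumption.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5211"  # local address from the setup steps


def save_video_request(video_content, captions) -> urllib.request.Request:
    """Build the POST /save-video request from previously generated data."""
    # Values are forwarded as-is; their expected inner format is not documented.
    body = json.dumps({"VideoContent": video_content, "Captions": captions})
    return urllib.request.Request(
        BASE_URL + "/save-video",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(save_video_request(content, captions))` should return the success message containing the video ID.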
PUT /videos/{id}
- Description: Update a video by its ID.
- Request Body:
  { "OutputFile": "https://your-aws-s3-bucket-url/out.mp4", "RenderId": "8hfxlw" }
- Response: Success message confirming the update.
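Once a render has finished, a client can attach the output URL and render ID to the stored video. This is a minimal Python sketch of that call; it assumes the video ID is an integer, which is not stated here, and the function name is hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5211"  # local address from the setup steps


def update_video_request(
    video_id: int, output_file: str, render_id: str
) -> urllib.request.Request:
    """Build the PUT /videos/{id} request with the documented body fields."""
    body = json.dumps({"OutputFile": output_file, "RenderId": render_id})
    return urllib.request.Request(
        f"{BASE_URL}/videos/{video_id}",  # assumption: integer IDs
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="PUT",
    )
```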
GET /videos
- Description: Retrieve a list of all videos stored in the database.
- Response: List of videos.
DELETE /videos/{id}
- Description: Delete a video by its ID.
- Response: No content.
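Listing and deleting need no request body, which makes them the simplest calls to script. The Python sketch below covers both; it assumes integer video IDs and that a successful delete returns an empty body, so only the status code is reported. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5211"  # local address from the setup steps


def list_videos() -> list:
    """GET /videos and return the parsed JSON list."""
    with urllib.request.urlopen(BASE_URL + "/videos") as response:
        return json.load(response)


def delete_video_request(video_id: int) -> urllib.request.Request:
    """Build the DELETE /videos/{id} request (no body)."""
    return urllib.request.Request(f"{BASE_URL}/videos/{video_id}", method="DELETE")


def delete_video(video_id: int) -> int:
    """Send the DELETE and return the HTTP status code; no body is expected."""
    with urllib.request.urlopen(delete_video_request(video_id)) as response:
        return response.status
```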
The database uses PostgreSQL with a Videos table. Each video contains:
- VideoContent: Video content in JSON format.
- AudioFileUrl: Audio file URL in string format.
- Captions: Captions in JSON format.
- Images: Images as an array of strings.
- OutputFile: Rendered video URL in string format.
- RenderId: Rendered video ID in string format.
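For client code it can be convenient to mirror these fields in a small type. The following Python dataclass is a sketch only: the snake_case names, the choice to make `output_file` and `render_id` optional, and the `is_rendered` helper are assumptions for illustration, not the project's actual entity definition.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Video:
    """Client-side mirror of one row in the Videos table."""

    video_content: Any  # VideoContent: JSON
    audio_file_url: str  # AudioFileUrl: string
    captions: Any  # Captions: JSON
    images: list[str]  # Images: array of strings
    # Assumption: these stay empty until the video has been rendered.
    output_file: Optional[str] = None  # OutputFile: string
    render_id: Optional[str] = None  # RenderId: string

    @property
    def is_rendered(self) -> bool:
        """True once both render fields have been filled in."""
        return self.output_file is not None and self.render_id is not None
```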
- Google.Cloud.TextToSpeech.V1: Google Cloud Text-to-Speech client.
- AssemblyAI: Client library for AssemblyAI services.
- CloudinaryDotNet: Cloudinary SDK for uploading media.
- Npgsql.EntityFrameworkCore.PostgreSQL: PostgreSQL support for Entity Framework Core.