Clips AI Documentation
Clips AI is an open-source Python library that automatically converts longform video into clips. With just a few lines of code, you can segment a video into multiple clips and resize its aspect ratio from 16:9 to 9:16.
Quickstart
Clips AI is designed for audio-centric, narrative-based videos such as podcasts, interviews, speeches, and sermons. Our clipping algorithm analyzes a video's transcript to identify and create clips. Our resizing algorithm dynamically reframes videos to focus on the current speaker, converting the video into various aspect ratios.
Installation
- Install Python dependencies. We highly suggest using a virtual environment (such as venv) to avoid dependency conflicts.
  pip install clipsai
  pip install whisperx@git+https://github.com/m-bain/whisperx.git
- Install libmagic
- Install ffmpeg
Creating clips
Since clips are found using the video's transcript, the video must first be transcribed. Transcription is done with WhisperX, an open-source wrapper around Whisper that adds word-level start and stop timestamps. For trimming the original video into a chosen clip, refer to the clipping reference; a rough sketch of that step follows the example below.
from clipsai import ClipFinder, Transcriber
transcriber = Transcriber()
transcription = transcriber.transcribe(audio_file_path="/abs/path/to/video.mp4")
clipfinder = ClipFinder()
clips = clipfinder.find_clips(transcription=transcription)
print("StartTime: ", clips[0].start_time)
print("EndTime: ", clips[0].end_time)
Resizing a video
A Hugging Face access token is required to resize a video because Pyannote is used for speaker diarization. Pyannote is free to use; instructions for obtaining a token are on the Pyannote Hugging Face page. For resizing the original video to the desired aspect ratio, refer to the resizing reference; a rough sketch of that step follows the example below.
from clipsai import resize
crops = resize(
    video_file_path="/abs/path/to/video.mp4",
    pyannote_auth_token="pyannote_token",
    aspect_ratio=(9, 16)
)
print("Crops: ", crops.segments)