Clip
The clipping feature uses the TextTiling algorithm to segment long-form audio content into coherent clips based on its transcript. TextTiling, introduced by Marti A. Hearst in the 1990s, detects topic shifts in a piece of content by analyzing word usage and distribution patterns. Thanks to recent advances in NLP, TextTiling with BERT embeddings improves significantly on the original formulation and can be readily applied to state-of-the-art transcriptions produced with Whisper. The algorithm segments the text at sentence granularity and focuses on detecting topic shifts rather than identifying the topics themselves. This makes it effective at identifying distinct sections within a narrative and, consequently, at producing clips of varying lengths, from short segments to extended ones.
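To make the idea of detecting topic shifts concrete, here is a minimal sketch of TextTiling-style segmentation. It is not the library's implementation: it compares plain bag-of-words counts rather than BERT embeddings, and the function names (`gap_similarities`, `depth_scores`, `segment`), the window size, and the fixed `min_depth` threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(sentences: list[str]) -> Counter:
    """Count lowercase word occurrences across a block of sentences."""
    return Counter(w for s in sentences for w in re.findall(r"[a-z']+", s.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def gap_similarities(sentences: list[str], window: int = 2) -> list[float]:
    """Similarity of the text just before and just after each sentence gap."""
    sims = []
    for gap in range(1, len(sentences)):
        left = bag_of_words(sentences[max(0, gap - window):gap])
        right = bag_of_words(sentences[gap:gap + window])
        sims.append(cosine(left, right))
    return sims


def depth_scores(sims: list[float]) -> list[float]:
    """How deep each similarity valley is relative to the peaks on both sides."""
    depths = []
    for i, sim in enumerate(sims):
        left_peak = sim
        for j in range(i - 1, -1, -1):  # climb left until similarity drops
            if sims[j] < left_peak:
                break
            left_peak = sims[j]
        right_peak = sim
        for j in range(i + 1, len(sims)):  # climb right until similarity drops
            if sims[j] < right_peak:
                break
            right_peak = sims[j]
        depths.append((left_peak - sim) + (right_peak - sim))
    return depths


def segment(sentences: list[str], window: int = 2, min_depth: float = 0.5) -> list[list[str]]:
    """Split sentences wherever the depth score reaches min_depth (an assumed cutoff)."""
    depths = depth_scores(gap_similarities(sentences, window))
    # Gap i sits between sentences[i] and sentences[i + 1].
    cut_points = [i + 1 for i, depth in enumerate(depths) if depth >= min_depth]
    segments, start = [], 0
    for cut in cut_points + [len(sentences)]:
        segments.append(sentences[start:cut])
        start = cut
    return segments
```

A deep valley in the similarity curve means the vocabulary before and after a gap differs sharply, which is treated as a topic shift. Swapping `bag_of_words` and `cosine` for sentence embeddings and their cosine similarity would move the sketch closer to the embedding-based variant described above.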
Usage
The following transcribes a media file, finds clips in the transcription, and prints the start and end time of the first clip.
from clipsai import ClipFinder, Transcriber
transcriber = Transcriber()
transcription = transcriber.transcribe(audio_file_path="/abs/path/to/video.mp4")
clipfinder = ClipFinder()
clips = clipfinder.find_clips(transcription=transcription)
print("StartTime: ", clips[0].start_time)
print("EndTime: ", clips[0].end_time)
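Since `find_clips` returns clips of varying lengths, a common next step is to keep only those within a target duration. The sketch below is illustrative rather than part of the library: `StandInClip`, `clip_duration`, and `clips_between` are hypothetical names, and the stand-in class only mirrors the documented `start_time` and `end_time` properties so the example runs without the library installed. The `float()` conversion keeps it working whether the times are numbers or numeric strings.

```python
from dataclasses import dataclass


@dataclass
class StandInClip:
    """Hypothetical stand-in exposing the documented start_time and end_time."""
    start_time: float
    end_time: float


def clip_duration(clip) -> float:
    """Length of a clip in seconds."""
    return float(clip.end_time) - float(clip.start_time)


def clips_between(clips, min_seconds: float, max_seconds: float) -> list:
    """Keep clips whose duration lies in [min_seconds, max_seconds]."""
    return [c for c in clips if min_seconds <= clip_duration(c) <= max_seconds]


# Example with made-up times; real clips would come from find_clips().
example_clips = [
    StandInClip(0.0, 12.5),
    StandInClip(12.5, 95.0),
    StandInClip(95.0, 400.0),
]
short_form = clips_between(example_clips, min_seconds=30, max_seconds=120)
```

Because the helper only reads `start_time` and `end_time`, it should work unchanged on the objects returned by `find_clips`, assuming they expose those properties as documented.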
To trim the video using the returned clips, run the following code.
import clipsai

media_editor = clipsai.MediaEditor()

# Pick ONE of the two media_file lines below, depending on your file.
# Use this if the file contains an audio stream only:
media_file = clipsai.AudioFile("/abs/path/to/audio_only_file.mp4")
# Use this if the file contains both audio and video streams:
media_file = clipsai.AudioVideoFile("/abs/path/to/video.mp4")

clip = clips[0]  # select the clip you'd like to trim
clip_media_file = media_editor.trim(
    media_file=media_file,
    start_time=clip.start_time,
    end_time=clip.end_time,
    trimmed_media_file_path="/abs/path/to/clip.mp4",  # output path; doesn't exist yet
)
ClipFinder Class
A class for finding engaging clips based on the input transcript.
Methods
- Name
find_clips
- Type
- -> list[Clip]
- Description
Finds clips in the transcription of an audio or video file using the TextTiling algorithm.
Required Parameters
- Name
transcription
- Type
- Transcription
- Description
The transcription of an audio or video file to find clips from.
Clip Class
Represents a clip of a video or audio file.
Properties
- Name
start_time
- Type
- float
- Description
The start time of the clip in seconds.
- Name
end_time
- Type
- float
- Description
The end time of the clip in seconds.
- Name
start_char
- Type
- int
- Description
The index of the clip's first character in the transcription text.
- Name
end_char
- Type
- int
- Description
The index of the clip's last character in the transcription text.
Methods
- Name
copy
- Type
- -> Clip
- Description
Returns a copy of the Clip instance.
- Name
to_dict
- Type
- -> dict
- Description
Returns a dictionary representation of the clip.
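The character offsets and `to_dict` are easiest to understand with a small example. The sketch below is an illustration under stated assumptions, not the library's code: `StandInClip` and `clip_text` are hypothetical, the offsets are assumed to be integer indices into the full transcript text, and `end_char` is assumed to be exclusive, which this reference does not state.

```python
import json
from dataclasses import asdict, dataclass, replace


@dataclass
class StandInClip:
    """Hypothetical stand-in mirroring the documented Clip properties."""
    start_time: float
    end_time: float
    start_char: int
    end_char: int

    def copy(self) -> "StandInClip":
        """Return an independent copy, like the documented copy()."""
        return replace(self)

    def to_dict(self) -> dict:
        """Return a plain dictionary, like the documented to_dict()."""
        return asdict(self)


def clip_text(full_text: str, clip) -> str:
    """Slice the transcript text covered by a clip (end_char assumed exclusive)."""
    return full_text[int(clip.start_char):int(clip.end_char)]


# Made-up transcript and offsets for illustration only.
transcript_text = "Cats nap. Stocks fell."
first = StandInClip(start_time=0.0, end_time=1.5, start_char=0, end_char=9)

excerpt = clip_text(transcript_text, first)
as_json = json.dumps(first.to_dict())  # a dict of plain values serializes directly
```

If `end_char` turns out to be inclusive in the library, the slice in `clip_text` would need `+ 1` on its upper bound.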