
chat-about-video

Chat about a video clip using the powerful OpenAI GPT-4 Vision or GPT-4o.


chat-about-video is an open-source NPM package designed to accelerate the development of conversation applications about video content. Harnessing the capabilities of OpenAI GPT-4 Vision or GPT-4o services from Microsoft Azure or OpenAI, this package opens up a range of usage scenarios with minimal effort.

Key features:

  • ChatGPT models hosted in both Azure and OpenAI are supported.
  • Frame images are extracted from the input video, and uploaded for ChatGPT to consume.
  • It can automatically retry on receiving throttling (HTTP status code 429) responses from the API.
  • Options supported by the underlying API are exposed for customisation.

Usage scenarios

There are two approaches for feeding video content into GPT-4 Vision. chat-about-video supports both of them.

Frame image extraction:

  • Integrate GPT-4 Vision or GPT-4o from Microsoft Azure or OpenAI effortlessly.
  • Utilize ffmpeg integration provided by this package for frame image extraction or opt for a DIY approach.
  • Store frame images with ease, supporting Azure Blob Storage and AWS S3.
  • GPT-4 Vision hosted in Azure allows analysis of up to 10 frame images.
  • GPT-4 Vision or GPT-4o hosted in OpenAI allows analysis of more than 10 frame images.

Video indexing with Microsoft Azure:

  • Exclusively supported by GPT-4 Vision from Microsoft Azure.
  • Ingest videos seamlessly into Microsoft Azure's Video Retrieval Index.
  • Automatic extraction of up to 20 frame images using Video Retrieval Indexer.
  • Default integration of speech transcription for enhanced comprehension.
  • Flexible storage options with support for Azure Blob Storage and AWS S3.

Usage

Installation

Add chat-about-video as a dependency to your Node.js application using the following command:

npm i chat-about-video

Dependencies

If you intend to utilize ffmpeg for extracting video frame images, ensure it is installed on your system. You can install it using either a system package manager or a helper NPM package:

sudo apt install ffmpeg
# or
npm i @ffmpeg-installer/ffmpeg

If you plan to use Azure Blob Storage, include the following dependency:

npm i @azure/storage-blob

For using AWS S3, install the following dependencies:

npm i @handy-common-utils/aws-utils @aws-sdk/s3-request-presigner @aws-sdk/client-s3

Usage in code

To integrate chat-about-video into your Node.js application, follow these simple steps:

  1. Create an instance of the ChatAboutVideo class. The constructor allows you to pass in configuration options.
  • Most configuration options come with sensible default values, but you can specify your own for further customization.
  • The second constructor argument is a logger. If not specified, a default logger writing to the console will be created. If logging is not needed, you can pass in undefined.
  2. Use the startConversation(videoFilePath) function to initiate a conversation about a video clip. This function returns a Conversation object. The video file or its frame images are uploaded to Azure Blob Storage or AWS S3 during this step.
  3. Interact with GPT by using the say(question, { maxTokens: 2000 }) function within the conversation. Pass in a question, and you will receive an answer.
  • Message history is automatically kept during the conversation, providing context for a more coherent dialogue.
  • The second parameter of the say(...) function allows you to specify options for further customization.
  4. Wrap up the conversation using the end() function. This ensures proper clean-up and resource management.

Examples

Below is an example chat application, which

  • uses GPT deployment (in this example, it is named 'gpt4vision') hosted in Microsoft Azure;
  • uses ffmpeg to extract video frame images;
  • stores video frame images in Azure Blob Storage;
    • container name: 'vision-experiment-input'
    • object path prefix: 'video-frames/'
  • reads credentials from environment variables
  • reads input video file path from environment variable 'DEMO_VIDEO'

import readline from 'node:readline';
import { ChatAboutVideo } from 'chat-about-video';

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
const prompt = (question: string) => new Promise<string>((resolve) => rl.question(question, resolve));

async function demo() {
  const chat = new ChatAboutVideo({
    openAiEndpoint: process.env.AZURE_OPENAI_API_ENDPOINT!, // This line is not needed if you are using GPT provided by OpenAI rather than by Microsoft Azure.
    openAiApiKey: process.env.OPENAI_API_KEY!, // This is the API key.
    azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!, // This line is not needed if you'd like to use AWS S3.
    openAiDeploymentName: 'gpt4vision', // For GPT provided by OpenAI, this is the model name. For GPT provided by Microsoft Azure, this is the deployment name.
    storageContainerName: 'vision-experiment-input', // Blob container name in Azure or S3 bucket name in AWS
    storagePathPrefix: 'video-frames/',
  });

  const conversation = await chat.startConversation(process.env.DEMO_VIDEO!);
  
  while (true) {
    const question = await prompt('\nUser: ');
    if (!question) {
      continue;
    }
    if (['exit', 'quit'].includes(question.toLowerCase().trim())) {
      break;
    }
    const answer = await conversation.say(question, { maxTokens: 2000 });
    console.log('\nAI: ' + answer);
  }
}

demo().catch((error) => console.error(error));

Below is an example showing how to create an instance of ChatAboutVideo that

  • uses GPT provided by OpenAI;
  • uses ffmpeg to extract video frame images;
  • stores video frame images in AWS S3;
    • bucket name: 'my-s3-bucket'
    • object path prefix: 'video-frames/'
  • reads API key from environment variable 'OPENAI_API_KEY'
  const chat = new ChatAboutVideo({
    openAiApiKey: process.env.OPENAI_API_KEY!,
    openAiDeploymentName: 'gpt-4-vision-preview', // or 'gpt-4o'
    storageContainerName: 'my-s3-bucket',
    storagePathPrefix: 'video-frames/',
    extractVideoFrames: {
      limit: 30,    // override default value 10
      interval: 2,  // override default value 5
    },
  });

Below is an example showing how to create an instance of ChatAboutVideo that

  • uses GPT deployment (in this example, it is named 'gpt4vision') hosted in Microsoft Azure;
  • uses Microsoft Video Retrieval Index to extract frames and analyse the video
    • A randomly named index is created automatically.
    • The index is also deleted automatically when the conversation ends.
  • stores video file in Azure Blob Storage;
    • container name: 'vision-experiment-input'
    • object path prefix: 'videos/'
  • reads credentials from environment variables
  const chat = new ChatAboutVideo({
    openAiEndpoint: process.env.AZURE_OPENAI_API_ENDPOINT!,
    openAiApiKey: process.env.AZURE_OPENAI_API_KEY!,
    azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!,
    openAiDeploymentName: 'gpt4vision',
    storageContainerName: 'vision-experiment-input',
    storagePathPrefix: 'videos/',
    videoRetrievalIndex: {
      endpoint: process.env.AZURE_CV_API_ENDPOINT!,
      apiKey: process.env.AZURE_CV_API_KEY!,
      createIndexIfNotExists: true,
      deleteIndexWhenConversationEnds: true,
    },
  });

API

chat-about-video

Classes

Class: VideoRetrievalApiClient

azure/video-retrieval-api-client.VideoRetrievalApiClient

Constructors

constructor

new VideoRetrievalApiClient(endpointBaseUrl, apiKey, apiVersion?)

Parameters
Name Type Default value
endpointBaseUrl string undefined
apiKey string undefined
apiVersion string '2023-05-01-preview'

Methods

createIndex

createIndex(indexName, indexOptions?): Promise<void>

Parameters
Name Type
indexName string
indexOptions CreateIndexOptions
Returns

Promise<void>


createIndexIfNotExist

createIndexIfNotExist(indexName, indexOptions?): Promise<void>

Parameters
Name Type
indexName string
indexOptions? CreateIndexOptions
Returns

Promise<void>


createIngestion

createIngestion(indexName, ingestionName, ingestion): Promise<void>

Parameters
Name Type
indexName string
ingestionName string
ingestion IngestionRequest
Returns

Promise<void>


deleteDocument

deleteDocument(indexName, documentUrl): Promise<void>

Parameters
Name Type
indexName string
documentUrl string
Returns

Promise<void>


deleteIndex

deleteIndex(indexName): Promise<void>

Parameters
Name Type
indexName string
Returns

Promise<void>


getIndex

getIndex(indexName): Promise<undefined | IndexSummary>

Parameters
Name Type
indexName string
Returns

Promise<undefined | IndexSummary>


getIngestion

getIngestion(indexName, ingestionName): Promise<IngestionSummary>

Parameters
Name Type
indexName string
ingestionName string
Returns

Promise<IngestionSummary>


ingest

ingest(indexName, ingestionName, ingestion, backoff?): Promise<void>

Parameters
Name Type
indexName string
ingestionName string
ingestion IngestionRequest
backoff number[]
Returns

Promise<void>


listDocuments

listDocuments(indexName): Promise<DocumentSummary[]>

Parameters
Name Type
indexName string
Returns

Promise<DocumentSummary[]>


listIndexes

listIndexes(): Promise<IndexSummary[]>

Returns

Promise<IndexSummary[]>

Class: ChatAboutVideo

chat.ChatAboutVideo

Constructors

constructor

new ChatAboutVideo(options, log?)

Parameters
Name Type
options ChatAboutVideoConstructorOptions
log undefined | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void>

Properties

Property Description
Protected client: OpenAIClient
Protected log: undefined | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void>
Protected options: ChatAboutVideoOptions

Methods

prepareVideoFrames

Protected prepareVideoFrames(conversationId, videoFile, extractVideoFramesOptions?): Promise<PreparationResult>

Parameters
Name Type
conversationId string
videoFile string
extractVideoFramesOptions? Partial<{ extractor: VideoFramesExtractor ; height: undefined | number ; interval: number ; limit: number ; width: undefined | number }>
Returns

Promise<PreparationResult>


prepareVideoRetrievalIndex

Protected prepareVideoRetrievalIndex(conversationId, videoFile, videoRetrievalIndexOptions?): Promise<PreparationResult>

Parameters
Name Type
conversationId string
videoFile string
videoRetrievalIndexOptions? Partial<{ apiKey: string ; createIndexIfNotExists?: boolean ; deleteDocumentWhenConversationEnds?: boolean ; deleteIndexWhenConversationEnds?: boolean ; endpoint: string ; indexName?: string }>
Returns

Promise<PreparationResult>


startConversation

startConversation(videoFile, options?): Promise<Conversation>

Start a conversation about a video.

Parameters
Name Type Description
videoFile string Path to a video file in local file system.
options? Object overriding options for this conversation
options.chatCompletions? Partial<ChatOptions> -
options.extractVideoFrames? Partial<{ extractor: VideoFramesExtractor ; height: undefined | number ; interval: number ; limit: number ; width: undefined | number }> -
options.videoRetrievalIndex? Partial<{ apiKey: string ; createIndexIfNotExists?: boolean ; deleteDocumentWhenConversationEnds?: boolean ; deleteIndexWhenConversationEnds?: boolean ; endpoint: string ; indexName?: string }> -
Returns

Promise<Conversation>

The conversation.

Class: Conversation

chat.Conversation

Constructors

constructor

new Conversation(client, deploymentName, conversationId, messages, options?, cleanup?, log?)

Parameters
Name Type
client OpenAIClient
deploymentName string
conversationId string
messages ChatRequestMessage[]
options? GetChatCompletionsOptions
cleanup? () => Promise<void>
log undefined | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void>

Properties

Property Description
Protected Optional cleanup: () => Promise<void>
Protected client: OpenAIClient
Protected conversationId: string
Protected deploymentName: string
Protected log: undefined | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void>
Protected messages: ChatRequestMessage[]
Protected Optional options: GetChatCompletionsOptions

Methods

end

end(): Promise<void>

Returns

Promise<void>


say

say(message, options?): Promise<undefined | string>

Say something in the conversation and get the response from the AI.

Parameters
Name Type Description
message string The message to say in the conversation.
options? ChatOptions Options for fine control.
Returns

Promise<undefined | string>

The response/completion

Interfaces

Interface: CreateIndexOptions

azure/video-retrieval-api-client.CreateIndexOptions

Properties

Property Description
Optional features: IndexFeature[]
Optional metadataSchema: IndexMetadataSchema
Optional userData: object

Interface: DocumentSummary

azure/video-retrieval-api-client.DocumentSummary

Properties

Property Description
createdDateTime: string
documentId: string
Optional documentUrl: string
lastModifiedDateTime: string
Optional metadata: object
Optional userData: object

Interface: IndexFeature

azure/video-retrieval-api-client.IndexFeature

Properties

Property Description
Optional domain: "surveillance" | "generic"
Optional modelVersion: string
name: "vision" | "speech"

Interface: IndexMetadataSchema

azure/video-retrieval-api-client.IndexMetadataSchema

Properties

Property Description
fields: IndexMetadataSchemaField[]
Optional language: string

Interface: IndexMetadataSchemaField

azure/video-retrieval-api-client.IndexMetadataSchemaField

Properties

Property Description
filterable: boolean
name: string
searchable: boolean
type: "string" | "datetime"

Interface: IndexSummary

azure/video-retrieval-api-client.IndexSummary

Properties

Property Description
createdDateTime: string
eTag: string
Optional features: IndexFeature[]
lastModifiedDateTime: string
name: string
Optional userData: object

Interface: IngestionRequest

azure/video-retrieval-api-client.IngestionRequest

Properties

Property Description
Optional filterDefectedFrames: boolean
Optional generateInsightIntervals: boolean
Optional includeSpeechTranscript: boolean
Optional moderation: boolean
videos: VideoIngestion[]

Interface: IngestionStatusDetail

azure/video-retrieval-api-client.IngestionStatusDetail

Properties

Property Description
documentId: string
documentUrl: string
lastUpdatedTime: string
succeeded: boolean

Interface: IngestionSummary

azure/video-retrieval-api-client.IngestionSummary

Properties

Property Description
Optional batchName: string
createdDateTime: string
Optional fileStatusDetails: IngestionStatusDetail[]
lastModifiedDateTime: string
name: string
state: "NotStarted" | "Running" | "Completed" | "Failed" | "PartiallySucceeded"

Interface: VideoIngestion

azure/video-retrieval-api-client.VideoIngestion

Properties

Property Description
Optional documentId: string
documentUrl: string
Optional metadata: object
mode: "update" | "remove" | "add"
Optional userData: object

Interface: ChatAboutVideoOptions

chat.ChatAboutVideoOptions

Option settings for ChatAboutVideo

Properties

Property Description
Optional extractVideoFrames: Object Type declaration

fileBatchUploader: FileBatchUploader Function for uploading files
Optional initialPrompts: ChatRequestMessage[] Initial prompts to be added to the chat history before frame images.
openAiDeploymentName: string Name/ID of the deployment
Optional startPrompts: ChatRequestMessage[] Prompts to be added to the chat history right after frame images.
storageContainerName: string Storage container for storing frame images of the video.
storagePathPrefix: string Path prefix to be prepended for storing frame images of the video.
tmpDir: string Temporary directory for storing temporary files.
If not specified, the temporary directory of the OS will be used.
Optional videoRetrievalIndex: Object Type declaration

Module: aws

Functions

createAwsS3FileBatchUploader

createAwsS3FileBatchUploader(s3Client, expirationSeconds, parallelism?): FileBatchUploader

Parameters
Name Type Default value
s3Client S3Client undefined
expirationSeconds number undefined
parallelism number 3
Returns

FileBatchUploader
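
The `parallelism` parameter caps how many files are uploaded concurrently. A self-contained sketch of that pattern, assuming a worker-pool approach (the helper `uploadWithParallelism` and the `uploadOne` callback are illustrative, not part of the package's API; a real `uploadOne` would issue an S3 PutObject call):

```typescript
// Illustrative sketch: run at most `parallelism` uploads at a time.
// `uploadOne` stands in for an actual S3 upload of a single file.
async function uploadWithParallelism(
  fileNames: string[],
  uploadOne: (name: string) => Promise<string>,
  parallelism = 3,
): Promise<string[]> {
  const results: string[] = new Array(fileNames.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed file name.
  const worker = async (): Promise<void> => {
    while (next < fileNames.length) {
      const i = next++;
      results[i] = await uploadOne(fileNames[i]);
    }
  };
  await Promise.all(Array.from({ length: Math.min(parallelism, fileNames.length) }, worker));
  return results;
}
```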

Module: azure

References

CreateIndexOptions

Re-exports CreateIndexOptions


DocumentSummary

Re-exports DocumentSummary


IndexFeature

Re-exports IndexFeature


IndexMetadataSchema

Re-exports IndexMetadataSchema


IndexMetadataSchemaField

Re-exports IndexMetadataSchemaField


IndexSummary

Re-exports IndexSummary


IngestionRequest

Re-exports IngestionRequest


IngestionStatusDetail

Re-exports IngestionStatusDetail


IngestionSummary

Re-exports IngestionSummary


PaginatedWithNextLink

Re-exports PaginatedWithNextLink


VideoIngestion

Re-exports VideoIngestion


VideoRetrievalApiClient

Re-exports VideoRetrievalApiClient

Functions

createAzureBlobStorageFileBatchUploader

createAzureBlobStorageFileBatchUploader(blobServiceClient, expirationSeconds, parallelism?): FileBatchUploader

Parameters
Name Type Default value
blobServiceClient BlobServiceClient undefined
expirationSeconds number undefined
parallelism number 3
Returns

FileBatchUploader

Module: azure/video-retrieval-api-client

Type Aliases

PaginatedWithNextLink

Ƭ PaginatedWithNextLink<T>: Object

Type parameters
Name
T
Type declaration
Name Type
nextLink? string
value T[]

Module: chat

Type Aliases

ChatAboutVideoConstructorOptions

Ƭ ChatAboutVideoConstructorOptions: Partial<Omit<ChatAboutVideoOptions, "videoRetrievalIndex" | "extractVideoFrames">> & Required<Pick<ChatAboutVideoOptions, "openAiDeploymentName" | "storageContainerName">> & { extractVideoFrames?: Partial<Exclude<ChatAboutVideoOptions["extractVideoFrames"], undefined>> ; videoRetrievalIndex?: Partial<ChatAboutVideoOptions["videoRetrievalIndex"]> & Pick<Exclude<ChatAboutVideoOptions["videoRetrievalIndex"], undefined>, "endpoint" | "apiKey"> } & { azureStorageConnectionString?: string ; downloadUrlExpirationSeconds?: number ; openAiApiKey: string ; openAiEndpoint?: string }


ChatOptions

Ƭ ChatOptions: GetChatCompletionsOptions & { throttleBackoff?: number[] }
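
`throttleBackoff` is a list of wait durations used when the API responds with HTTP 429. A self-contained sketch of the general retry pattern, assuming errors carry a `statusCode` property (the `retryOnThrottle` helper is illustrative; the package performs this retrying internally):

```typescript
// Illustrative retry helper: retry a call once per entry in `backoff`,
// waiting that many milliseconds, whenever it fails with a throttling error.
async function retryOnThrottle<T>(
  fn: () => Promise<T>,
  backoff: number[] = [1000, 2000, 4000],
  isThrottle = (e: any) => e?.statusCode === 429,
): Promise<T> {
  for (const waitMs of backoff) {
    try {
      return await fn();
    } catch (e) {
      if (!isThrottle(e)) throw e; // only retry throttling errors
      await new Promise((resolve) => setTimeout(resolve, waitMs));
    }
  }
  return fn(); // final attempt; any error propagates to the caller
}
```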


ExtractVideoFramesOptions

Ƭ ExtractVideoFramesOptions: Exclude<ChatAboutVideoOptions["extractVideoFrames"], undefined>


VideoRetrievalIndexOptions

Ƭ VideoRetrievalIndexOptions: Exclude<ChatAboutVideoOptions["videoRetrievalIndex"], undefined>

Module: client-hack

Functions

fixClient

fixClient(openAIClient): void

Parameters
Name Type
openAIClient any
Returns

void

Module: index

References

ChatAboutVideo

Re-exports ChatAboutVideo


ChatAboutVideoConstructorOptions

Re-exports ChatAboutVideoConstructorOptions


ChatAboutVideoOptions

Re-exports ChatAboutVideoOptions


ChatOptions

Re-exports ChatOptions


Conversation

Re-exports Conversation


ExtractVideoFramesOptions

Re-exports ExtractVideoFramesOptions


FileBatchUploader

Re-exports FileBatchUploader


VideoFramesExtractor

Re-exports VideoFramesExtractor


VideoRetrievalIndexOptions

Re-exports VideoRetrievalIndexOptions


extractVideoFramesWithFfmpeg

Re-exports extractVideoFramesWithFfmpeg


lazyCreatedFileBatchUploader

Re-exports lazyCreatedFileBatchUploader


lazyCreatedVideoFramesExtractor

Re-exports lazyCreatedVideoFramesExtractor

Module: storage

References

FileBatchUploader

Re-exports FileBatchUploader

Functions

lazyCreatedFileBatchUploader

lazyCreatedFileBatchUploader(creator): FileBatchUploader

Parameters
Name Type
creator Promise<FileBatchUploader>
Returns

FileBatchUploader
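
A plausible sketch of what such a lazy wrapper does: the returned function awaits the creator promise on first use, so the real uploader is only constructed when it is actually needed (the generic `lazyUploader` helper below is illustrative, not the package's implementation):

```typescript
// Type alias copied from the storage/types module documented below.
type FileBatchUploader = (
  dir: string,
  fileNames: string[],
  containerName: string,
  blobPathPrefix: string,
) => Promise<string[]>;

// Illustrative: defer creation of the uploader until it is first called.
function lazyUploader(creator: Promise<FileBatchUploader>): FileBatchUploader {
  return async (dir, fileNames, containerName, blobPathPrefix) =>
    (await creator)(dir, fileNames, containerName, blobPathPrefix);
}
```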

Module: storage/types

Type Aliases

FileBatchUploader

Ƭ FileBatchUploader: (dir: string, fileNames: string[], containerName: string, blobPathPrefix: string) => Promise<string[]>

Type declaration

▸ (dir, fileNames, containerName, blobPathPrefix): Promise<string[]>

Parameters

Name Type
dir string
fileNames string[]
containerName string
blobPathPrefix string

Returns

Promise<string[]>
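
To plug in your own storage backend, implement a function matching this signature. A minimal dry-run sketch that only derives the would-be destination URLs instead of uploading (the example.com URL scheme is made up for illustration):

```typescript
// Type alias copied from the declaration above.
type FileBatchUploader = (
  dir: string,
  fileNames: string[],
  containerName: string,
  blobPathPrefix: string,
) => Promise<string[]>;

// Illustrative dry-run implementation: a real uploader would read each file
// from `dir` and upload it, then return the (possibly pre-signed) URLs.
const dryRunUploader: FileBatchUploader = async (_dir, fileNames, containerName, blobPathPrefix) =>
  fileNames.map((name) => `https://example.com/${containerName}/${blobPathPrefix}${name}`);
```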

Module: video

References

VideoFramesExtractor

Re-exports VideoFramesExtractor


extractVideoFramesWithFfmpeg

Re-exports extractVideoFramesWithFfmpeg

Functions

lazyCreatedVideoFramesExtractor

lazyCreatedVideoFramesExtractor(creator): VideoFramesExtractor

Parameters
Name Type
creator Promise<VideoFramesExtractor>
Returns

VideoFramesExtractor

Module: video/ffmpeg

Functions

extractVideoFramesWithFfmpeg

extractVideoFramesWithFfmpeg(inputFile, outputDir, intervalSec, format?, width?, height?, startSec?, endSec?): Promise<string[]>

Parameters
Name Type
inputFile string
outputDir string
intervalSec number
format? string
width? number
height? number
startSec? number
endSec? number
Returns

Promise<string[]>
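
Under the hood, an extractor like this invokes ffmpeg with an `fps` filter to sample one frame per `intervalSec`. A hedged sketch of the kind of argument list involved (the exact arguments and output naming used by the package may differ; `buildFfmpegArgs` is illustrative only):

```typescript
// Illustrative: assemble ffmpeg CLI arguments for sampling frames at a fixed
// interval. A real extractor would spawn `ffmpeg` with these arguments.
function buildFfmpegArgs(
  inputFile: string,
  outputDir: string,
  intervalSec: number,
  format = 'jpg',
  width?: number,
  height?: number,
  startSec?: number,
  endSec?: number,
): string[] {
  const filters = [`fps=1/${intervalSec}`]; // one frame every intervalSec seconds
  if (width !== undefined || height !== undefined) {
    // -1 keeps the aspect ratio for the unspecified dimension
    filters.push(`scale=${width ?? -1}:${height ?? -1}`);
  }
  const args: string[] = [];
  if (startSec !== undefined) args.push('-ss', String(startSec));
  if (endSec !== undefined) args.push('-to', String(endSec));
  args.push('-i', inputFile, '-vf', filters.join(','), `${outputDir}/frame-%d.${format}`);
  return args;
}
```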

Module: video/types

Type Aliases

VideoFramesExtractor

Ƭ VideoFramesExtractor: (inputFile: string, outputDir: string, intervalSec: number, format?: string, width?: number, height?: number, startSec?: number, endSec?: number) => Promise<string[]>

Type declaration

▸ (inputFile, outputDir, intervalSec, format?, width?, height?, startSec?, endSec?): Promise<string[]>

Parameters

Name Type
inputFile string
outputDir string
intervalSec number
format? string
width? number
height? number
startSec? number
endSec? number

Returns

Promise<string[]>
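
Any function with this signature can replace the ffmpeg-based extractor. A dry-run sketch that only computes the file names a real extractor would produce (the naming scheme and the 10-second fallback duration are made up; a real extractor would probe the video's actual length):

```typescript
// Type alias copied from the declaration above.
type VideoFramesExtractor = (
  inputFile: string,
  outputDir: string,
  intervalSec: number,
  format?: string,
  width?: number,
  height?: number,
  startSec?: number,
  endSec?: number,
) => Promise<string[]>;

// Illustrative dry-run implementation: list the frame files that sampling
// one frame per `intervalSec` between startSec and endSec would yield.
const planFrames: VideoFramesExtractor = async (
  _inputFile, outputDir, intervalSec, format = 'jpg', _width, _height, startSec = 0, endSec = 10,
) => {
  const files: string[] = [];
  for (let t = startSec, i = 0; t < endSec; t += intervalSec, i++) {
    files.push(`${outputDir}/frame-${i}.${format}`);
  }
  return files;
};
```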

Package information

  • Install: npm i chat-about-video
  • Version: 2.7.0
  • License: Apache-2.0
  • Weekly downloads: 230
  • Unpacked size: 94.6 kB
  • Total files: 36
  • Collaborators: james-hu