Last updated about 2 months ago
This guide shows how to deploy a video transcription service. We will use the Google Cloud Video Intelligence API to convert the speech in videos to text (speech-to-text) and the Koyeb Serverless Engine to handle your video files and orchestrate the processing.
Once you have completed this tutorial, you will be able to upload your videos via the Koyeb S3-compatible API, which will trigger a function that generates a video speech transcription file each time a new video is uploaded.
You can then use the speech transcription file for many different use cases.
In this guide, we use a Koyeb Managed Store to store our videos and the generated video speech transcription file. You can also connect your own Cloud Storage Provider to integrate with your existing infrastructure and data with minimal effort. Learn how to connect your Cloud Storage Provider here.
To build a video transcription service using Koyeb and the GCP Video Intelligence API there are four steps:
The first step is to create a Koyeb Store to hold our videos and the generated transcription files. Koyeb Stores provide an S3-compatible API, allowing you to manage your data programmatically using any S3-compatible SDKs and tools.
To create a new Koyeb Managed Store with the CLI, in your terminal, type:
    koyeb create store -f new-store.yaml
Where the content of new-store.yaml is:
    name: my-store-01
    type: koyeb
You now have a Koyeb Store up and running. You can interact with your Store using any S3-compatible SDKs and tools.
The next step is to configure S3cmd to interact with the Store, so we can upload videos and retrieve video audio transcription files from there.
In the Koyeb Control Panel, click API in the left side menu and click New in the S3 credentials section. A modal appears to create a new S3 credential. Enter the name and a description (optional) to identify and remember what this credential is used for.
Click the Submit button. Save the generated access_key and secret_key in a secure place. Once the modal is closed, you will not be able to see them again.
Create an S3cmd configuration file in your home directory and replace the REPLACE_ME values with the credentials you previously generated. The check further down reads this file from ~/.s3cfg-gcp.
    [default]
    access_key = REPLACE_ME
    secret_key = REPLACE_ME
    bucket_location = US
    check_ssl_certificate = True
    check_ssl_hostname = True
    default_mime_type = binary/octet-stream
    delay_updates = False
    delete_after = False
    delete_after_fetch = False
    delete_removed = False
    dry_run = False
    enable_multipart = True
    encoding = UTF-8
    encrypt = False
    follow_symlinks = False
    force = False
    get_continue = False
    guess_mime_type = True
    host_base = s3.eu-west-1.prod.koyeb.com
    host_bucket = %(bucket)s.s3.eu-west-1.prod.koyeb.com
    human_readable_sizes = False
    invalidate_default_index_on_cf = False
    invalidate_default_index_root_on_cf = True
    invalidate_on_cf = False
    limit = -1
    limitrate = 0
    list_md5 = False
    long_listing = False
    max_delete = -1
    multipart_chunk_size_mb = 15
    multipart_max_chunks = 10000
    preserve_attrs = True
    progress_meter = True
    put_continue = False
    recursive = False
    recv_chunk = 65536
    reduced_redundancy = False
    requester_pays = False
    restore_days = 1
    restore_priority = Standard
    send_chunk = 65536
    server_side_encryption = False
    signature_v2 = False
    signurl_use_https = False
    skip_existing = False
    socket_timeout = 300
    stats = False
    stop_on_error = False
    throttle_max = 100
    urlencoding_mode = normal
    use_https = True
    use_mime_magic = True
    verbosity = WARNING
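Before running S3cmd, it can help to confirm that no REPLACE_ME placeholder is left in the file. The following is a minimal sketch using only the Python standard library; the function name check_s3cfg and the set of keys it checks are our own choices for illustration, not part of S3cmd or Koyeb.

```python
import configparser

PLACEHOLDER = "REPLACE_ME"


def check_s3cfg(text: str) -> list[str]:
    """Return a list of problems found in S3cmd configuration text."""
    # interpolation=None stops configparser from trying to expand the
    # literal "%(bucket)s" template used in host_bucket.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)

    if not parser.has_section("default"):
        return ["missing [default] section"]

    section = parser["default"]
    problems = []
    # Credentials must be present and must not be the placeholder.
    for key in ("access_key", "secret_key"):
        value = section.get(key, "")
        if not value or value == PLACEHOLDER:
            problems.append(f"{key} is not set")
    # Without host_base, S3cmd would not target the Koyeb endpoint.
    if not section.get("host_base"):
        problems.append("host_base is not set")
    return problems
```

To use it, read your configuration file into a string (for example with pathlib.Path.read_text) and pass it to check_s3cfg. An empty list means the checked keys are filled in.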
To check the configuration is working fine, in the terminal type:
    s3cmd -c ~/.s3cfg-gcp ls
    2020-10-28 09:15  s3://my-store-01
You should see the Store you previously created.
Create a Koyeb Secret to securely store your GCP Service Account configuration. Koyeb Secrets allow you to access API credentials, tokens, etc. securely in your configuration and functions without having to expose them.
Create a secret.yaml file and replace the value with your GCP Service Account configuration.
    name: gcp-sa-vi
    value: |
      {...}

Then create the secret by running:

    koyeb create secrets -f secret.yaml
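Pasting a multi-line service account key under value: | by hand is easy to get wrong, because every line of the block must be indented. As a rough sketch, the secret.yaml content could be generated from the key file instead. The helper name build_secret_yaml is hypothetical, and the output simply mirrors the layout of the file shown above.

```python
import json
import textwrap


def build_secret_yaml(name: str, service_account_json: str) -> str:
    """Render secret.yaml content with the JSON stored as a YAML block scalar."""
    # Fail early if the text is not valid JSON (raises json.JSONDecodeError).
    parsed = json.loads(service_account_json)
    # Re-serialise so the value has predictable line breaks.
    pretty = json.dumps(parsed, indent=2)
    # Every line of the block scalar must be indented under "value: |".
    body = textwrap.indent(pretty, "  ")
    return f"name: {name}\nvalue: |\n{body}\n"
```

Write the returned string to secret.yaml, then run the koyeb create secrets command as above.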
Our Store is configured and ready to use. The next step is to deploy our processing function to perform the video speech transcription. We will use the Koyeb Catalog App for this, as it lets you run the operation without writing a single line of code.
In the terminal, start by creating a new Stack. Stacks are processing environments containing code and containers.
    koyeb create stack -n video-transcription
With our Stack created, we can configure and deploy the video speech transcription app. Create a file named video-transcription.yaml containing our function configuration:
    functions:
      - name: gcp-video-intelligence
        use: gcp-video-intelligence@1.0.1
        with:
          STORE: my-store-01 # The Store to watch to trigger the function and save the GCP Video Intelligence result. This parameter is required.
          GCP_KEY: gcp-sa-vi # The name of the secret in which the GCP service account is stored. This parameter is required.
          VIDEO_INTELLIGENCE_FEATURE: SPEECH_TRANSCRIPTION
Deploy the function by running:
    koyeb create revision video-transcription -f video-transcription.yaml
This deploys the function into our Stack. Now, each time a video is uploaded to the Store my-store-01, the function will be triggered and a video speech transcription file will be generated.
With our processing stack ready, we can now check everything is running fine and that for each video uploaded, a video speech transcription file is generated.
To upload a video using S3cmd, in the terminal type:
    s3cmd put /path/to/video.mp4 s3://my-store-01
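Since the Store exposes an S3-compatible API, the same upload could in principle be done from an SDK. Below is an untested sketch using boto3, a third-party package. The endpoint is taken from host_base in the S3cmd configuration of this guide; the helper names are hypothetical, and whether the endpoint needs extra settings such as a region or a specific addressing style is an assumption we have not verified.

```python
from pathlib import Path

# Endpoint taken from host_base in the S3cmd configuration of this guide.
KOYEB_S3_ENDPOINT = "https://s3.eu-west-1.prod.koyeb.com"


def s3_client_kwargs(access_key: str, secret_key: str) -> dict:
    """Build keyword arguments for boto3.client("s3", ...)."""
    return {
        "endpoint_url": KOYEB_S3_ENDPOINT,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def upload_video(path: str, store: str, access_key: str, secret_key: str) -> str:
    """Upload a local video to the Store and return the object key used."""
    import boto3  # third-party dependency: pip install boto3

    client = boto3.client("s3", **s3_client_kwargs(access_key, secret_key))
    # Use the file name as the object key, like the s3cmd put call above.
    key = Path(path).name
    client.upload_file(path, store, key)
    return key
```

For example, upload_video("/path/to/video.mp4", "my-store-01", access_key, secret_key) would store the file under the key video.mp4, using the S3 credentials created earlier.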
Now, if you type koyeb logs stack-events video-transcription, you will see the event that triggered your function. This event is then used in your function to retrieve the video file and perform the speech transcription. You can follow the function execution by running: koyeb logs functions video-transcription gcp-video-intelligence.
Once the execution is done, you can retrieve the speech transcription file by running:
    s3cmd get s3://my-store-01/gcp-video-intelligence-SPEECH_TRANSCRIPTION-[...].json
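The middle part of the file name is elided above, so in practice you first list the Store (for example with s3cmd ls s3://my-store-01) and pick out the result files. As a small sketch, assuming result files always start with the prefix and end with the extension shown in the command above, the filtering could look like this. The function name transcription_keys is our own.

```python
# Prefix and extension as they appear in the s3cmd get command above.
RESULT_PREFIX = "gcp-video-intelligence-SPEECH_TRANSCRIPTION-"
RESULT_SUFFIX = ".json"


def transcription_keys(keys: list[str]) -> list[str]:
    """Keep only object keys that look like speech transcription results."""
    return sorted(
        key
        for key in keys
        if key.startswith(RESULT_PREFIX) and key.endswith(RESULT_SUFFIX)
    )
```

Pass it the object keys from your Store listing. Anything that is not a transcription result, such as the uploaded videos themselves, is dropped.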
This file contains the result of the processing function with the detected text in the video:
    "results": [
      {
        "alternatives": [
          {
            "transcript": "Hey, I'm John...",
            "confidence": 0.7477226853370667,
            "words": [
              {
                "startTime": {
                  "nanos": 500000000
                },
                "endTime": {
                  "nanos": 700000000
                },
                "word": "Hey,"
              },
              {
                "startTime": {
                  "nanos": 700000000
                },
                "endTime": {
                  "nanos": 900000000
                },
                "word": "I'm"
              },
              ...
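To give an idea of how this output can be consumed, here is a minimal sketch that pulls the transcripts and per-word timings out of the results list. It relies only on the fields visible in the excerpt above. The function names are hypothetical, and two things are assumptions on our side: that an offset may also carry a seconds field next to nanos, and that the first alternative is the one you want.

```python
def offset_seconds(offset: dict) -> float:
    """Convert a {"seconds": ..., "nanos": ...} offset to float seconds."""
    # Assumption: "seconds" may be absent (as in the excerpt) or a string.
    return int(offset.get("seconds", 0)) + int(offset.get("nanos", 0)) / 1e9


def best_transcripts(results: list[dict]) -> list[str]:
    """Return the first alternative's transcript for each result."""
    return [
        result["alternatives"][0].get("transcript", "")
        for result in results
        if result.get("alternatives")
    ]


def word_timings(results: list[dict]) -> list[tuple[str, float, float]]:
    """Return (word, start, end) tuples in seconds from first alternatives."""
    timings = []
    for result in results:
        # [:1] keeps only the first alternative and tolerates an empty list.
        for alternative in result.get("alternatives", [])[:1]:
            for word in alternative.get("words", []):
                timings.append((
                    word["word"],
                    offset_seconds(word.get("startTime", {})),
                    offset_seconds(word.get("endTime", {})),
                ))
    return timings
```

Load the downloaded file with json.load, locate its "results" list, and pass that list to these functions. The excerpt does not show the top-level structure of the file, so where exactly the list sits is left for you to check.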
In this guide, we saw how to deploy a video transcription service using the Google Cloud Video Intelligence API and the Koyeb Serverless Engine. We used S3cmd to upload videos and retrieve transcription files, but you can also use any S3-compatible SDKs and tools.
The catalog integrations code used in this guide is available on GitHub.
If you have any questions about this tutorial, feel free to reach out to us on the Koyeb Slack Community.