---
comments: true
---
# Multilingual Speech Recognition Production Line User Guide
## 1. Introduction to Multilingual Speech Recognition Production Line
Speech recognition automatically converts spoken language into the corresponding text or commands. The technology plays an important role in fields such as intelligent customer service, voice assistants, and meeting records. Multilingual speech recognition additionally supports automatic language detection and can recognize speech in multiple languages.

| Method | Description | Parameter | Parameter Type | Parameter Description | Default Value |
|---|---|---|---|---|---|
| `print()` | Print the result to the terminal | `format_json` | `bool` | Whether to format the output content using JSON indentation | `True` |
| | | `indent` | `int` | Indentation level used to beautify the output JSON data and make it more readable. Effective only when `format_json` is `True` | `4` |
| | | `ensure_ascii` | `bool` | Whether to escape non-ASCII characters to Unicode. When set to `True`, all non-ASCII characters are escaped; `False` retains the original characters. Effective only when `format_json` is `True` | `False` |
| `save_to_json()` | Save the result as a JSON file | `save_path` | `str` | Path to save the file. When it is a directory, the saved file is named after the input file | `None` |
| | | `indent` | `int` | Indentation level used to beautify the output JSON data and make it more readable. Effective only when `format_json` is `True` | `4` |
| | | `ensure_ascii` | `bool` | Whether to escape non-ASCII characters to Unicode. When set to `True`, all non-ASCII characters are escaped; `False` retains the original characters. Effective only when `format_json` is `True` | `False` |

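
The effect of `format_json`, `indent`, and `ensure_ascii` can be illustrated with Python's standard `json` module. This is only a sketch, not the pipeline's implementation: `render_result` is a hypothetical helper, the sample dictionary is made up, and the sketch assumes the parameters behave like the `json.dumps` arguments of the same name and that disabling `format_json` falls back to the plain Python representation.

```python
import json


def render_result(data, format_json=True, indent=4, ensure_ascii=False):
    """Hypothetical helper mimicking the documented print() parameters."""
    if not format_json:
        # Assumption: without JSON formatting, use the plain Python repr.
        return str(data)
    return json.dumps(data, indent=indent, ensure_ascii=ensure_ascii)


# Made-up sample, not real pipeline output.
sample = {"text": "你好", "language": "zh"}

print(render_result(sample))                     # keeps the original characters
print(render_result(sample, ensure_ascii=True))  # escapes them as \uXXXX sequences
print(render_result(sample, format_json=False))  # indent and ensure_ascii are ignored
```

Keeping `ensure_ascii=False` is usually the more readable choice for multilingual transcripts, since the recognized text stays human-readable in the output.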
- Calling the `print()` method prints the result to the terminal. The printed content is explained as follows:
    - `input_path`: The path where the input audio is stored
    - `result`: Recognition result
        - `text`: The text result of speech recognition
        - `segments`: The result text with timestamps
            * `id`: Segment ID
            * `seek`: Audio segment pointer
            * `start`: Segment start time
            * `end`: Segment end time
            * `text`: Text recognized in the segment
            * `tokens`: Token IDs of the segment text
            * `temperature`: Sampling temperature used when decoding the segment
            * `avg_logprob`: Average log probability
            * `compression_ratio`: Compression ratio
            * `no_speech_prob`: Non-speech probability
        - `language`: Recognized language
- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If specified as a directory, the saved path will be `save_path/{your_audio_basename}.json`; if specified as a file, it will be saved directly to that file. Since JSON files do not support saving numpy arrays, the `numpy.array` types will be converted to lists.
* Visualized images and prediction results can also be obtained through attributes.
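
The result layout and the `save_to_json()` path rule described in the list above can be sketched with the standard library alone. Everything here is illustrative: the sample values are made up, only a subset of the segment fields is shown, and `format_segments`, `resolve_json_path`, and `to_jsonable` are hypothetical helpers rather than pipeline APIs. The sketch also assumes a directory is detected with `os.path.isdir`, which may differ from what the pipeline actually does.

```python
import json
import os
import tempfile

# Hypothetical result following the documented layout; every value is made up
# and only a subset of the segment fields is included.
sample = {
    "input_path": "./audio/demo.wav",
    "result": {
        "text": "hello world",
        "segments": [
            {"id": 0, "start": 0.0, "end": 1.2, "text": "hello", "no_speech_prob": 0.01},
            {"id": 1, "start": 1.2, "end": 2.5, "text": "world", "no_speech_prob": 0.02},
            {"id": 2, "start": 2.5, "end": 4.0, "text": "", "no_speech_prob": 0.93},
        ],
        "language": "en",
    },
}


def format_segments(res, max_no_speech_prob=0.5):
    """Return one timestamped line per segment, skipping likely non-speech."""
    lines = []
    for seg in res["result"]["segments"]:
        if seg["no_speech_prob"] > max_no_speech_prob:
            continue
        lines.append(f"[{seg['start']:.2f}s - {seg['end']:.2f}s] {seg['text']}")
    return lines


def resolve_json_path(save_path, input_path):
    """Directory: save_path/{audio_basename}.json. Otherwise: save_path itself."""
    if os.path.isdir(save_path):
        basename = os.path.splitext(os.path.basename(input_path))[0]
        return os.path.join(save_path, basename + ".json")
    return save_path


def to_jsonable(obj):
    """Recursively turn array-like values (anything with tolist()) into lists."""
    if hasattr(obj, "tolist"):
        return obj.tolist()
    if isinstance(obj, dict):
        return {key: to_jsonable(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(value) for value in obj]
    return obj


print("\n".join(format_segments(sample)))
with tempfile.TemporaryDirectory() as out_dir:
    target = resolve_json_path(out_dir, sample["input_path"])
    with open(target, "w", encoding="utf-8") as f:
        json.dump(to_jsonable(sample), f, indent=4, ensure_ascii=False)
    print(os.path.basename(target))  # demo.json
```

Checking for a `tolist()` method keeps the sketch free of third-party imports while still covering `numpy.array` values, which expose that method.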
Multilingual Service Call Examples

Python

```python
import base64
import requests

API_URL = "http://localhost:8080/video-classification"  # Service URL
video_path = "./demo.mp4"
output_video_path = "./out.mp4"

# Encode local video to Base64
with open(video_path, "rb") as file:
    video_bytes = file.read()
    video_data = base64.b64encode(video_bytes).decode("ascii")

payload = {"video": video_data}  # Base64 encoded file content or video URL

# Call API
response = requests.post(API_URL, json=payload)

# Process API response
assert response.status_code == 200
result = response.json()["result"]
with open(output_video_path, "wb") as file:
    file.write(base64.b64decode(result["video"]))
print(f"Output video saved at {output_video_path}")
print("\nCategories:")
print(result["categories"])
```

C++

```cpp
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
#include "base64.hpp" // https://github.com/tobiaslocker/base64

int main() {
    httplib::Client client("localhost:8080");
    const std::string videoPath = "./demo.mp4";
    const std::string outputImagePath = "./out.mp4";

    httplib::Headers headers = {
        {"Content-Type", "application/json"}
    };

    // Encode local video to Base64
    std::ifstream file(videoPath, std::ios::binary | std::ios::ate);
    if (!file) {
        std::cerr << "Unable to open file: " << videoPath << std::endl;
        return 1;
    }
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);

    std::vector<char> buffer(size);
    if (!file.read(buffer.data(), size)) {
        std::cerr << "Error reading file." << std::endl;
        return 1;
    }
    std::string bufferStr(buffer.data(), buffer.size());
    std::string encodedImage = base64::to_base64(bufferStr);

    nlohmann::json jsonObj;
    jsonObj["video"] = encodedImage;
    std::string body = jsonObj.dump();

    // Call API
    auto response = client.Post("/video-classification", headers, body, "application/json");

    // Process API response
    if (response && response->status == 200) {
        nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
        auto result = jsonResponse["result"];

        encodedImage = result["video"].get<std::string>();
        std::string decodedString = base64::from_base64(encodedImage);
        std::vector<unsigned char> decodedImage(decodedString.begin(), decodedString.end());
        std::ofstream outputImage(outputImagePath, std::ios::binary | std::ios::out);
        if (outputImage.is_open()) {
            outputImage.write(reinterpret_cast<char*>(decodedImage.data()), decodedImage.size());
            outputImage.close();
            std::cout << "Output video saved at " << outputImagePath << std::endl;
        } else {
            std::cerr << "Unable to open file for writing: " << outputImagePath << std::endl;
        }

        auto categories = result["categories"];
        std::cout << "\nCategories:" << std::endl;
        for (const auto& category : categories) {
            std::cout << category << std::endl;
        }
    } else {
        std::cout << "Failed to send HTTP request." << std::endl;
        return 1;
    }

    return 0;
}
```

Java

```java
import okhttp3.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;

public class Main {
    public static void main(String[] args) throws IOException {
        String API_URL = "http://localhost:8080/video-classification"; // Service URL
        String videoPath = "./demo.mp4"; // Local video
        String outputImagePath = "./out.mp4"; // Output video

        // Encode local video to Base64
        File file = new File(videoPath);
        byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
        String videoData = Base64.getEncoder().encodeToString(fileContent);

        ObjectMapper objectMapper = new ObjectMapper();
        ObjectNode params = objectMapper.createObjectNode();
        params.put("video", videoData); // Base64 encoded file content or video URL

        // Create OkHttpClient instance
        OkHttpClient client = new OkHttpClient();
        MediaType JSON = MediaType.Companion.get("application/json; charset=utf-8");
        RequestBody body = RequestBody.Companion.create(params.toString(), JSON);
        Request request = new Request.Builder()
                .url(API_URL)
                .post(body)
                .build();

        // Call API and process API response
        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful()) {
                String responseBody = response.body().string();
                JsonNode resultNode = objectMapper.readTree(responseBody);
                JsonNode result = resultNode.get("result");
                String base64Image = result.get("video").asText();
                JsonNode categories = result.get("categories");

                byte[] videoBytes = Base64.getDecoder().decode(base64Image);
                try (FileOutputStream fos = new FileOutputStream(outputImagePath)) {
                    fos.write(videoBytes);
                }
                System.out.println("Output video saved at " + outputImagePath);
                System.out.println("\nCategories: " + categories.toString());
            } else {
                System.err.println("Request failed with code: " + response.code());
            }
        }
    }
}
```

Go

```go
package main

import (
    "bytes"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
)

func main() {
    API_URL := "http://localhost:8080/video-classification"
    videoPath := "./demo.mp4"
    outputImagePath := "./out.mp4"

    // Base64 encode the local video
    videoBytes, err := os.ReadFile(videoPath)
    if err != nil {
        fmt.Println("Error reading video file:", err)
        return
    }
    videoData := base64.StdEncoding.EncodeToString(videoBytes)

    payload := map[string]string{"video": videoData} // Base64 encoded file content or video URL
    payloadBytes, err := json.Marshal(payload)
    if err != nil {
        fmt.Println("Error marshaling payload:", err)
        return
    }

    // Call the API
    client := &http.Client{}
    req, err := http.NewRequest("POST", API_URL, bytes.NewBuffer(payloadBytes))
    if err != nil {
        fmt.Println("Error creating request:", err)
        return
    }
    req.Header.Set("Content-Type", "application/json")

    res, err := client.Do(req)
    if err != nil {
        fmt.Println("Error sending request:", err)
        return
    }
    defer res.Body.Close()

    // Handle the API response
    body, err := io.ReadAll(res.Body)
    if err != nil {
        fmt.Println("Error reading response body:", err)
        return
    }
    type Response struct {
        Result struct {
            Image      string                   `json:"video"`
            Categories []map[string]interface{} `json:"categories"`
        } `json:"result"`
    }
    var respData Response
    err = json.Unmarshal(body, &respData)
    if err != nil {
        fmt.Println("Error unmarshaling response body:", err)
        return
    }

    outputImageData, err := base64.StdEncoding.DecodeString(respData.Result.Image)
    if err != nil {
        fmt.Println("Error decoding base64 video data:", err)
        return
    }
    err = os.WriteFile(outputImagePath, outputImageData, 0644)
    if err != nil {
        fmt.Println("Error writing video to file:", err)
        return
    }
    fmt.Printf("Output video saved at %s\n", outputImagePath)
    fmt.Println("\nCategories:")
    for _, category := range respData.Result.Categories {
        fmt.Println(category)
    }
}
```

C#

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class Program
{
    static readonly string API_URL = "http://localhost:8080/video-classification";
    static readonly string videoPath = "./demo.mp4";
    static readonly string outputImagePath = "./out.mp4";

    static async Task Main(string[] args)
    {
        var httpClient = new HttpClient();

        // Base64 encode the local video
        byte[] videoBytes = File.ReadAllBytes(videoPath);
        string video_data = Convert.ToBase64String(videoBytes);

        var payload = new JObject{ { "video", video_data } }; // Base64 encoded file content or video URL
        var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");

        // Call the API
        HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
        response.EnsureSuccessStatusCode();

        // Handle the API response
        string responseBody = await response.Content.ReadAsStringAsync();
        JObject jsonResponse = JObject.Parse(responseBody);

        string base64Image = jsonResponse["result"]["video"].ToString();
        byte[] outputImageBytes = Convert.FromBase64String(base64Image);

        File.WriteAllBytes(outputImagePath, outputImageBytes);
        Console.WriteLine($"Output video saved at {outputImagePath}");
        Console.WriteLine("\nCategories:");
        Console.WriteLine(jsonResponse["result"]["categories"].ToString());
    }
}
```

Node.js

```js
const axios = require('axios');
const fs = require('fs');

const API_URL = 'http://localhost:8080/video-classification';
const videoPath = './demo.mp4';
const outputImagePath = './out.mp4';

// Base64 encode the local video
function encodeImageToBase64(filePath) {
  const bitmap = fs.readFileSync(filePath);
  return Buffer.from(bitmap).toString('base64');
}

const config = {
  method: 'POST',
  maxBodyLength: Infinity,
  url: API_URL,
  // The body is a pre-serialized JSON string, so declare its type explicitly
  headers: { 'Content-Type': 'application/json' },
  data: JSON.stringify({
    'video': encodeImageToBase64(videoPath) // Base64 encoded file content or video URL
  })
};

// Call the API
axios.request(config)
  .then((response) => {
    // Process the API response
    const result = response.data["result"];
    const videoBuffer = Buffer.from(result["video"], 'base64');
    fs.writeFile(outputImagePath, videoBuffer, (err) => {
      if (err) throw err;
      console.log(`Output video saved at ${outputImagePath}`);
    });
    console.log("\nCategories:");
    console.log(result["categories"]);
  })
  .catch((error) => {
    console.log(error);
  });
```

PHP

```php
<?php

$API_URL = "http://localhost:8080/video-classification"; // Service URL
$video_path = "./demo.mp4";
$output_video_path = "./out.mp4";

// Base64 encode the local video
$video_data = base64_encode(file_get_contents($video_path));
$payload = array("video" => $video_data); // Base64 encoded file content or video URL

// Call the API
$ch = curl_init($API_URL);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: application/json"));
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($payload));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

// Process the API response
$result = json_decode($response, true)["result"];
file_put_contents($output_video_path, base64_decode($result["video"]));
echo "Output video saved at " . $output_video_path . "\n";
echo "\nCategories:\n";
print_r($result["categories"]);

?>
```
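
All of the client examples above repeat the same two steps around the HTTP call: Base64-encoding the input file into a JSON body, and Base64-decoding the file returned in the response. As a minimal sketch, those steps can be isolated into two small functions that can be checked without a running service. The helper names `build_payload` and `extract_output` are hypothetical; the field names `video`, `result`, and `categories` are taken from the examples above, and the fake response is made up.

```python
import base64


def build_payload(file_bytes):
    """Wrap raw file bytes in the JSON body used by the examples above."""
    return {"video": base64.b64encode(file_bytes).decode("ascii")}


def extract_output(response_json):
    """Return (decoded output bytes, categories) from a parsed response body."""
    result = response_json["result"]
    return base64.b64decode(result["video"]), result["categories"]


# Round trip with a made-up response; no server is contacted.
payload = build_payload(b"\x00\x01demo")
fake_response = {"result": {"video": payload["video"], "categories": [{"id": 0}]}}
output_bytes, categories = extract_output(fake_response)
print(len(output_bytes), categories)
```

Separating the encoding and decoding from the network call makes it straightforward to unit-test the client logic and to swap in a different HTTP library.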