How to Connect to the Real-Time TTS API

Overview

This API provides a RESTful interface for converting text to speech in real-time. It's designed to work with IoT devices like ESP32, Raspberry Pi, Arduino, and any device capable of making HTTP requests.

The API is modular and supports multiple TTS engines:

System TTS - Uses your operating system's built-in text-to-speech (Windows, macOS, Linux)
Google TTS - Uses Google's Text-to-Speech service (requires internet)
AWS Polly - Uses Amazon Polly (requires AWS credentials)

💡 Quick Start: The API runs on port 3000 by default and responds with audio files (MP3/WAV) that can be played directly or saved to storage on your device. The server binds to host 0.0.0.0, so it listens on all network interfaces and can accept connections from ESP32 and other IoT devices, provided your network and firewall allow them.

⚠️ Important Note: System TTS (which uses the 'say' package) has limited cross-platform support for audio file generation. For best results, especially on Windows and Linux, use Google TTS (npm install gtts), which provides consistent MP3 output across all platforms. AWS Polly is also recommended for production use.

✅ Production API URL:

The TTS API is available at:

https://www.forbixindia.com/software/tts/api/speak

This URL is ready to use from ESP32 and any external devices with internet access.

📡 API Server Deployment:

The API server needs to be deployed as a Node.js application. Configuration:

Server runs on port 3000 (or configured port)
Host binding: 0.0.0.0 (accepts external connections)
CORS enabled for cross-origin requests
Rate limiting: 60 requests per minute per IP
HTTPS recommended for production (SSL/TLS)

🚀 Server Setup

Prerequisites

Node.js 14.0.0 or higher
npm (Node Package Manager)
Internet connection (for initial setup)

Installation Steps

Install dependencies:

Terminal
npm install

Install TTS engine (choose one):

System TTS (Recommended for local use)
npm install say

Google TTS (Requires internet)
npm install gtts

Configure the engine (optional):
Edit api/config.js to change the TTS engine:

api/config.js
tts: {
    engine: 'system', // Options: 'system', 'google', 'polly'
    // ... other settings
}

Start the server:

Terminal
npm start

✅ Success! The server should now be running at http://localhost:3000. Test it by visiting http://localhost:3000/api/health in your browser.

📡 API Endpoints

GET /api/health

Check if the API server is running and get basic status information.

Response Example

{
    "status": "ok",
    "timestamp": "2024-01-15T10:30:00.000Z",
    "engine": "system",
    "available": true,
    "version": "1.0.0"
}
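
A client can read this response before sending synthesis requests. The sketch below assumes only the field names shown in the example response above; the helper name `is_api_ready` is hypothetical and not part of the API.

```python
import json


def is_api_ready(health_json: str) -> bool:
    """Return True if a /api/health response body reports a usable engine."""
    try:
        payload = json.loads(health_json)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    # Both fields appear in the example response; treat missing ones as "not ready".
    return payload.get("status") == "ok" and payload.get("available") is True


sample = '{"status": "ok", "engine": "system", "available": true, "version": "1.0.0"}'
print(is_api_ready(sample))
```

Treating a missing or malformed field as "not ready" keeps a device from sending text to a server whose engine failed to load.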

GET /api/voices

Get a list of available voices for the current TTS engine.

Response Example

{
    "success": true,
    "engine": "system",
    "voices": [
        {
            "name": "Alex",
            "lang": "en-US",
            "gender": "Male"
        }
    ],
    "count": 1
}
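
To pick a voice for a given language, a client can filter the returned list. This is a minimal sketch that assumes the response shape shown above; `voices_for_language` is a hypothetical helper name.

```python
def voices_for_language(response: dict, lang: str) -> list:
    """Return the names of voices whose "lang" matches, ignoring case."""
    if not response.get("success"):
        return []
    return [
        voice["name"]
        for voice in response.get("voices", [])
        if "name" in voice and voice.get("lang", "").lower() == lang.lower()
    ]


example = {
    "success": True,
    "engine": "system",
    "voices": [{"name": "Alex", "lang": "en-US", "gender": "Male"}],
    "count": 1,
}
print(voices_for_language(example, "en-us"))
```

The returned name can then be passed as the voice parameter of /api/speak.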

GET /api/speak?text=Hello%20World

Synthesize speech using query parameters. Returns an audio file (MP3/WAV).

Parameters:

text (required) - Text to convert to speech
lang (optional) - Language code (e.g., 'en-US', 'es-ES')
voice (optional) - Voice name
rate (optional) - Speech rate 0.5-2.0 (default: 1.0)
pitch (optional) - Speech pitch 0.5-2.0 (default: 1.0)
volume (optional) - Volume 0.0-1.0 (default: 1.0)
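
The parameters above can be checked on the client before a request is sent. The following sketch builds the query string and rejects values outside the documented ranges; the function name `build_speak_url` and the choice to omit unset optional parameters are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode

# Local development URL from this guide; swap in the production URL as needed.
BASE_URL = "http://localhost:3000/api/speak"

# Documented ranges for the numeric parameters.
RANGES = {"rate": (0.5, 2.0), "pitch": (0.5, 2.0), "volume": (0.0, 1.0)}


def build_speak_url(text, lang=None, voice=None, rate=None, pitch=None, volume=None):
    """Build a GET /api/speak URL, omitting optional parameters left as None."""
    if not text:
        raise ValueError("text is required")
    numeric = {"rate": rate, "pitch": pitch, "volume": volume}
    for name, value in numeric.items():
        low, high = RANGES[name]
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    params = {"text": text, "lang": lang, "voice": voice, **numeric}
    # quote_via=quote encodes spaces as %20, matching the examples in this guide.
    query = urlencode({k: v for k, v in params.items() if v is not None}, quote_via=quote)
    return f"{BASE_URL}?{query}"


print(build_speak_url("Hello World", lang="en-US", rate=1.2))
```

Validating locally avoids spending a rate-limited request on a call the server would likely reject.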

Example Request (Production)

GET https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World&lang=en-US&rate=1.2

Local Development: http://localhost:3000/api/speak?text=Hello%20World

Response: Audio file (Content-Type: audio/mpeg or audio/wav)
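
Because the response may be either MP3 or WAV, a client that saves the audio can choose the file extension from the Content-Type header. A minimal sketch, assuming the server sends exactly one of the two media types listed above (the helper name `audio_extension` is hypothetical):

```python
def audio_extension(content_type: str) -> str:
    """Map a Content-Type header value to a file extension."""
    # Drop any parameters such as "; charset=..." and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    extensions = {"audio/mpeg": ".mp3", "audio/wav": ".wav"}
    if media_type not in extensions:
        raise ValueError(f"unexpected content type: {content_type!r}")
    return extensions[media_type]


print(audio_extension("audio/mpeg"))
```

Raising on anything else helps catch the case where the server returned an error message instead of audio.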

POST /api/speak

Synthesize speech using a JSON body. Recommended for longer texts. Returns an audio file.

Request Body (JSON)

{
    "text": "Hello, this is a test message.",
    "lang": "en-US",
    "voice": "Alex",
    "rate": 1.2,
    "pitch": 1.0,
    "volume": 1.0
}

Response: Audio file (Content-Type: audio/mpeg or audio/wav)
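
The POST form can be sketched with Python's standard library as follows. The `Content-Type: application/json` header is an assumption about what the server expects, and `build_speak_request` is a hypothetical helper name; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_speak_request(text, url="http://localhost:3000/api/speak", **options):
    """Build a POST /api/speak request with the text and options as a JSON body."""
    body = {"text": text, **options}
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


request = build_speak_request("Hello, this is a test message.", lang="en-US", rate=1.2)
print(request.get_method(), request.full_url)
```

To send it, pass the object to `urllib.request.urlopen(request, timeout=30)` and read the audio bytes from the response.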

🔌 ESP32 Example with MAX98357A I2S Module

Here's a simple, ready-to-use example for ESP32 with MAX98357A I2S audio amplifier module.

✅ Production Ready: This code is configured to work with the TTS API at https://www.forbixindia.com/software/tts/api/speak. Just update your WiFi credentials and upload!

Hardware Connections (MAX98357A)

VIN → ESP32 5V (or 3.3V)
GND → ESP32 GND
LRCLK (Word Select) → ESP32 GPIO 25
BCLK (Bit Clock) → ESP32 GPIO 26
DIN (Data) → ESP32 GPIO 22
Speaker → Connect to MAX98357A speaker outputs

Required Arduino Library

Install the Audio library by schreibfaul1:

Open Arduino IDE
Go to Tools → Manage Libraries
Search for Audio by schreibfaul1
Click Install

Complete ESP32 Code

ESP32_MAX98357A_TTS.ino

/*
 * ESP32 TTS with MAX98357A I2S Module
 * Simple example to generate text, send to TTS API, and play audio
 * 
 * Hardware: ESP32 + MAX98357A I2S Audio Amplifier
 * API: https://www.forbixindia.com/software/tts/api/speak
 */

#include "WiFi.h"
#include "HTTPClient.h"
#include "Audio.h"

// ===== CONFIGURATION - UPDATE THESE =====
const char* WIFI_SSID = "YOUR_WIFI_SSID";        // Your WiFi network name
const char* WIFI_PASSWORD = "YOUR_WIFI_PASSWORD"; // Your WiFi password

// TTS API endpoint - Production URL
// Note: For HTTPS, ensure your ESP32 Arduino core has up-to-date SSL certificates
const char* TTS_API_URL = "https://www.forbixindia.com/software/tts/api/speak";

// Alternative: Use HTTP if HTTPS gives SSL certificate errors (for testing only)
// const char* TTS_API_URL = "http://www.forbixindia.com/software/tts/api/speak";

// For local testing, uncomment below:
// const char* TTS_API_URL = "http://YOUR_LOCAL_IP:3000/api/speak";

// ===== HARDWARE CONFIGURATION =====
// MAX98357A I2S pins (standard connections)
#define I2S_BCLK 26   // Bit Clock
#define I2S_LRC 25    // Word Select (Left/Right Clock)
#define I2S_DOUT 22   // Data Output

// ===== AUDIO SETUP =====
Audio audio;

void setup() {
    // Initialize Serial for debugging
    Serial.begin(115200);
    delay(1000);
    Serial.println("\n\n=== ESP32 TTS with MAX98357A ===");
    
    // Connect to WiFi
    connectToWiFi();
    
    // Initialize I2S audio output for MAX98357A
    audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
    audio.setVolume(15); // Volume: 0-21 (15 = ~70%)
    
    Serial.println("Setup complete! Ready to play TTS audio.");
    Serial.println();
    
    // Example: Generate and play TTS
    playTextToSpeech("Hello! This is a test message from ESP32.");
}

void loop() {
    // Must call audio.loop() regularly to keep audio playing
    audio.loop();
    
    // Example: Play TTS every 10 seconds (remove this in production)
    // Uncomment below for testing:
    /*
    static unsigned long lastPlayTime = 0;
    if (millis() - lastPlayTime > 10000) {
        lastPlayTime = millis();
        playTextToSpeech("Current time: " + String(millis() / 1000) + " seconds.");
    }
    */
}

// Function to connect to WiFi
void connectToWiFi() {
    Serial.print("Connecting to WiFi: ");
    Serial.println(WIFI_SSID);
    
    WiFi.mode(WIFI_STA);
    WiFi.begin(WIFI_SSID, WIFI_PASSWORD);
    
    int attempts = 0;
    while (WiFi.status() != WL_CONNECTED && attempts < 20) {
        delay(500);
        Serial.print(".");
        attempts++;
    }
    
    if (WiFi.status() == WL_CONNECTED) {
        Serial.println("\nWiFi connected!");
        Serial.print("IP address: ");
        Serial.println(WiFi.localIP());
        Serial.print("Signal strength (RSSI): ");
        Serial.print(WiFi.RSSI());
        Serial.println(" dBm");
    } else {
        Serial.println("\nWiFi connection failed!");
        Serial.println("Please check your credentials and try again.");
        // In production, you might want to retry or go into deep sleep
    }
}

// Main function: Send text to TTS API and play the audio
void playTextToSpeech(String text) {
    if (WiFi.status() != WL_CONNECTED) {
        Serial.println("Error: WiFi not connected!");
        return;
    }
    
    if (text.length() == 0) {
        Serial.println("Error: Text is empty!");
        return;
    }
    
    Serial.println("--- Generating TTS Audio ---");
    Serial.print("Text: ");
    Serial.println(text);
    
    // Build the API URL with parameters
    String url = String(TTS_API_URL) + "?text=" + urlEncode(text) + "&lang=en-US&rate=1.0";
    
    Serial.print("Requesting: ");
    Serial.println(url);
    
    HTTPClient http;
    http.begin(url);
    
    // Set timeout
    http.setTimeout(10000); // 10 seconds
    
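    // Pre-flight GET request to confirm the API responds before streaming.
    // Note: audio.connecttohost() below sends its own request, so each
    // utterance counts as two requests against the API rate limit.
    int httpCode = http.GET();
    int audioSize = http.getSize(); // -1 if the server sent no Content-Length
    String errorBody = "";
    if (httpCode > 0 && httpCode != 200) {
        errorBody = http.getString();
    }
    
    // Close the pre-flight connection first so its buffers are freed
    // before the Audio library opens the streaming connection
    http.end();
    
    if (httpCode == 200) {
        Serial.print("✓ Audio available! Size: ");
        Serial.print(audioSize);
        Serial.println(" bytes");
        Serial.println("Playing audio...");
        
        // Stream audio directly from URL to I2S output
        // The Audio library handles the streaming automatically
        if (!audio.connecttohost(url.c_str())) {
            Serial.println("✗ Could not start audio stream");
        }
    } else if (httpCode > 0) {
        Serial.print("✗ HTTP Error: ");
        Serial.println(httpCode);
        Serial.print("Response: ");
        Serial.println(errorBody);
    } else {
        Serial.print("✗ Connection failed: ");
        Serial.println(http.errorToString(httpCode));
    }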
    Serial.println("--- Request completed ---\n");
}

// URL encoding function for text parameter
String urlEncode(String str) {
    String encodedString = "";
    char c;
    char code0;
    char code1;
    
    for (int i = 0; i < str.length(); i++) {
        c = str.charAt(i);
        if (c == ' ') {
            encodedString += '+';
        } else if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            // Safe characters - no encoding needed
            encodedString += c;
        } else {
            // Encode special characters
            code1 = (c & 0xf) + '0';
            if ((c & 0xf) > 9) {
                code1 = (c & 0xf) - 10 + 'A';
            }
            c = (c >> 4) & 0xf;
            code0 = c + '0';
            if (c > 9) {
                code0 = c - 10 + 'A';
            }
            encodedString += '%';
            encodedString += code0;
            encodedString += code1;
        }
    }
    
    return encodedString;
}

// Optional: Audio event callbacks for monitoring
void audio_info(const char *info) {
    Serial.print("Audio info: ");
    Serial.println(info);
}

void audio_eof_mp3(const char *info) {
    Serial.print("Audio EOF: ");
    Serial.println(info);
}

📝 Quick Setup Instructions:

Install Library: Install "Audio" library by schreibfaul1 from Arduino Library Manager
Update WiFi: Replace YOUR_WIFI_SSID and YOUR_WIFI_PASSWORD with your credentials
Connect Hardware: Wire MAX98357A as shown in the Hardware Connections section above
Upload Code: Select ESP32 board in Arduino IDE and upload
Test: Open Serial Monitor (115200 baud) to see debug messages

⚠️ Important Notes:

API URL: Pre-configured for production: https://www.forbixindia.com/software/tts/api/speak
HTTPS vs HTTP: For HTTPS, update your ESP32 Arduino core to get the latest SSL certificates. If you still get SSL certificate errors, change the URL to http:// for testing only.
Local Testing: Use HTTP with your local IP: http://192.168.1.100:3000/api/speak
MAX98357A Power: Works with 3.3V, but 5V provides better audio quality and volume
Audio Library: Make sure to install the correct "Audio" library by schreibfaul1

Customization Options

Example: Playing Different Texts

// In your loop() or button handler:

// Simple message
playTextToSpeech("Welcome to the TTS system.");

// With variable content
String sensorReading = String(25.5);
playTextToSpeech("Temperature is " + sensorReading + " degrees Celsius.");

// Multiple sentences
playTextToSpeech("System started. All sensors are online. Ready for operation.");

// Different language (if supported by API)
String url = String(TTS_API_URL) + "?text=" + urlEncode("Hola mundo") + "&lang=es-ES&rate=1.0";
audio.connecttohost(url.c_str());

🌐 Other Devices Examples

Python Example (Raspberry Pi)

tts_client.py

import requests
import pygame
from io import BytesIO

# Production API URL
API_URL = "https://www.forbixindia.com/software/tts/api/speak"

# For local testing, use:
# API_URL = "http://localhost:3000/api/speak"

def text_to_speech(text, lang='en-US', rate=1.0):
    params = {
        'text': text,
        'lang': lang,
        'rate': rate
    }
    
    try:
        response = requests.get(API_URL, params=params, timeout=30)
        
        if response.status_code == 200:
            # Play audio using pygame
            pygame.mixer.init()
            audio_file = BytesIO(response.content)
            pygame.mixer.music.load(audio_file)
            pygame.mixer.music.play()
            
            while pygame.mixer.music.get_busy():
                pygame.time.wait(100)
            print("Audio playback completed!")
        else:
            print(f"Error: HTTP {response.status_code}")
            print(f"Response: {response.text}")
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")

# Usage
if __name__ == "__main__":
    text_to_speech("Hello from Raspberry Pi!")

cURL Example

Terminal (Production)

# Save audio to file (Production)
curl "https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World" \
     -o output.mp3

# Play directly with the SoX "play" command (requires SoX with MP3 support)
curl "https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World" | \
     play -t mp3 -

# For local testing:
# curl "http://localhost:3000/api/speak?text=Hello%20World" -o output.mp3

JavaScript/Node.js Example

tts_client.js

const axios = require('axios');
const fs = require('fs');
const { exec } = require('child_process');

// Production API URL
const API_URL = 'https://www.forbixindia.com/software/tts/api/speak';

// For local testing, use:
// const API_URL = 'http://localhost:3000/api/speak';

async function textToSpeech(text, options = {}) {
    try {
        const params = new URLSearchParams({
            text: text,
            lang: options.lang || 'en-US',
            rate: options.rate || 1.0,
            ...options
        });
        
        console.log(`Requesting TTS for: "${text}"`);
        
        const response = await axios({
            method: 'GET',
            url: `${API_URL}?${params.toString()}`,
            responseType: 'arraybuffer',
            timeout: 30000 // 30 second timeout
        });
        
        // Save to file
        fs.writeFileSync('output.mp3', response.data);
        console.log(`✓ Audio saved to output.mp3 (${response.data.length} bytes)`);
        
        // Play on macOS
        exec('afplay output.mp3', (error) => {
            if (error) {
                console.error('Playback error:', error.message);
            } else {
                console.log('Audio playback finished');
            }
        });
        
    } catch (error) {
        if (error.response) {
            console.error(`HTTP Error: ${error.response.status}`);
            console.error(`Response: ${error.response.data.toString()}`);
        } else {
            console.error('Error:', error.message);
        }
    }
}

// Usage
textToSpeech('Hello from Node.js!');

⚙️ Configuration

The API can be configured by editing api/config.js. Here are the main settings:

Server Configuration

api/config.js

server: {
    port: process.env.PORT || 3000,
    host: process.env.HOST || '0.0.0.0',  // '0.0.0.0' allows external access
    cors: {
        enabled: true,
        origin: '*'  // In production, specify allowed origins
    }
}

TTS Engine Configuration

api/config.js

tts: {
    engine: 'system',  // 'system', 'google', or 'polly'
    defaultVoice: {
        lang: 'en-US',
        voice: null,  // null = system default
        rate: 1.0,
        pitch: 1.0,
        volume: 1.0
    }
}

Rate Limiting

api/config.js

api: {
    rateLimit: {
        enabled: true,
        windowMs: 60000,  // 1 minute
        max: 60  // 60 requests per minute
    },
    limits: {
        maxTextLength: 5000,  // Maximum characters
        maxConcurrentRequests: 10
    }
}
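
With maxTextLength set to 5000, longer texts have to be split on the client and sent as several requests. The sketch below is one way to do that, preferring sentence boundaries; the function name `split_text` and the splitting rule are illustrative assumptions, not part of the API.

```python
import re

MAX_TEXT_LENGTH = 5000  # mirrors maxTextLength in the config above


def split_text(text, limit=MAX_TEXT_LENGTH):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_text("One. Two. Three.", limit=10))
```

Each chunk can then be sent to /api/speak in order and the resulting audio played back to back.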

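⚠️ Security Note: For production use, set origin to specific allowed domains instead of *. This restricts which web pages can call the API from a browser; it does not block non-browser clients such as ESP32 or curl, so keep rate limiting enabled as well.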

🔧 Troubleshooting

Common Issues

❌ "TTS engine not available"

Solution: Install the required TTS package:

npm install say # for system TTS
npm install gtts # for Google TTS

❌ "Cannot connect from ESP32"

Solution: Check the following:

WiFi Connection: Ensure ESP32 is connected to WiFi and can access the internet
API URL: Verify the URL is correct: https://www.forbixindia.com/software/tts/api/speak
HTTPS/SSL: For HTTPS, ESP32 needs valid SSL certificates (update Arduino ESP32 core or use HTTP for testing)
Serial Monitor: Check Serial Monitor (115200 baud) for error messages
Local Testing: If testing locally, use HTTP and your local IP: http://192.168.1.100:3000/api/speak
Firewall: Ensure firewall allows connections on port 3000 (for local) or port 443/80 (for production)

❌ "Audio file is corrupted"

Solution:

Check TTS engine is properly installed
Verify text encoding (UTF-8)
Check audio format compatibility with your device
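
One quick way to tell a corrupted download from an error message saved as audio is to inspect the first bytes of the payload. The sketch below checks for the standard WAV header and the common MP3 openings (an ID3v2 tag or an MPEG frame sync). It is a heuristic only, and `detect_audio_format` is a hypothetical helper name.

```python
def detect_audio_format(data: bytes):
    """Return "wav", "mp3", or None based on the first bytes of the payload."""
    # WAV files are RIFF containers with the form type "WAVE" at offset 8.
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # Many MP3 files start with an ID3v2 metadata tag.
    if data[:3] == b"ID3":
        return "mp3"
    # Otherwise look for an MPEG audio frame sync: eleven set bits.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    return None


print(detect_audio_format(b"RIFF\x24\x00\x00\x00WAVEfmt "))
```

A result of None on a response body usually means the server returned text (for example a JSON error) rather than audio.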

❌ "Rate limit exceeded"

Solution: Adjust rate limits in api/config.js or disable rate limiting for development.
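
When the server settings cannot be changed, the client can wait and retry instead. The sketch below retries with exponential backoff. It assumes the server signals an exceeded limit with HTTP status 429, which this guide does not confirm, and `fetch_with_backoff` is a hypothetical helper that takes any callable returning a (status, body) pair.

```python
import time


def fetch_with_backoff(send_request, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send_request() until it returns a status other than 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send_request()
        if status != 429:  # assumed "rate limit exceeded" status
            return status, body
        if attempt < max_attempts - 1:
            # Wait 1s, 2s, 4s, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return status, body


# Demonstration with canned responses and a recorded (not real) sleep.
responses = iter([(429, b""), (429, b""), (200, b"audio")])
delays = []
result = fetch_with_backoff(lambda: next(responses), sleep=delays.append)
print(result[0], delays)
```

In a real client, `send_request` would wrap the HTTP call, for example a `requests.get` that returns `(response.status_code, response.content)`.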

Debug Mode

Enable debug logging by setting the log level:

api/config.js

logging: {
    level: 'debug'  // Options: 'error', 'warn', 'info', 'debug'
}
