🔊 Real-Time TTS API

Complete Guide to Connect ESP32 and External Devices

REST API · Real-Time · ESP32 Compatible

Overview

This API provides a RESTful interface for converting text to speech in real-time. It's designed to work with IoT devices like ESP32, Raspberry Pi, Arduino, and any device capable of making HTTP requests.

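The API is modular and supports multiple TTS engines:

  • System TTS ('system'), using the say package
  • Google TTS ('google'), using the gtts package
  • AWS Polly ('polly')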

💡 Quick Start: The API runs on port 3000 by default and responds with an audio file (MP3 or WAV) that a device can play directly or save to storage. The server binds to host 0.0.0.0, so it listens on all network interfaces and can accept connections from ESP32 boards and other external devices.
⚠️ Important Note: System TTS (the 'say' package) has limited cross-platform support for generating audio files. For best results, especially on Windows and Linux, use Google TTS (npm install gtts), which produces consistent MP3 output on all platforms. AWS Polly is also recommended for production use.
✅ Production API URL:

The TTS API is available at:

https://www.forbixindia.com/software/tts/api/speak

This URL is ready to use from ESP32 and any external devices with internet access.

📡 API Server Deployment:

The API server needs to be deployed as a Node.js application. Configuration:

  • Server runs on port 3000 (or configured port)
  • Host binding: 0.0.0.0 (accepts external connections)
  • CORS enabled for cross-origin requests
  • Rate limiting: 60 requests per minute per IP
  • HTTPS recommended for production (SSL/TLS)

🚀 Server Setup

Prerequisites

  • Node.js with npm (the server is a Node.js application)

Installation Steps

  1. Install dependencies:
    Terminal
    npm install
  2. Install TTS engine (choose one):
    System TTS (Recommended for local use)
    npm install say
    Google TTS (Requires internet)
    npm install gtts
  3. Configure the engine (optional):

    Edit api/config.js to change the TTS engine:

    api/config.js
    tts: {
      engine: 'system', // Options: 'system', 'google', 'polly'
      // ... other settings
    }
  4. Start the server:
    Terminal
    npm start
✅ Success! The server should now be running at http://localhost:3000. Test it by visiting http://localhost:3000/api/health in your browser.

📡 API Endpoints

GET /api/health

Check if the API server is running and get basic status information.

Response Example
{
  "status": "ok",
  "timestamp": "2024-01-15T10:30:00.000Z",
  "engine": "system",
  "available": true,
  "version": "1.0.0"
}
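
A client can use the health response to decide whether it is worth sending a synthesis request. The sketch below assumes only the JSON fields shown in the example response; the helper name `is_ready` is hypothetical and not part of the API.

```python
import json


def is_ready(health_json):
    """Return True when the health payload reports a usable TTS engine."""
    try:
        payload = json.loads(health_json)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    # Both fields come from the example response in this guide.
    return payload.get("status") == "ok" and payload.get("available") is True


sample = '{"status": "ok", "engine": "system", "available": true, "version": "1.0.0"}'
print(is_ready(sample))  # prints True
```

Treating a malformed or non-JSON body as "not ready" keeps the caller's logic to a single boolean check.
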
GET /api/voices

Get a list of available voices for the current TTS engine.

Response Example
{
  "success": true,
  "engine": "system",
  "voices": [
    { "name": "Alex", "lang": "en-US", "gender": "Male" }
  ],
  "count": 1
}
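
To pick a voice for a given language, a client can filter the returned list. This is a minimal sketch that assumes the response shape shown above; the function name and the second voice in the sample data are hypothetical.

```python
def voices_for_language(payload, lang):
    """Return the names of voices whose 'lang' matches, ignoring case."""
    wanted = lang.lower()
    return [
        voice["name"]
        for voice in payload.get("voices", [])
        if "name" in voice and voice.get("lang", "").lower() == wanted
    ]


sample = {
    "success": True,
    "engine": "system",
    "voices": [
        {"name": "Alex", "lang": "en-US", "gender": "Male"},
        {"name": "Monica", "lang": "es-ES", "gender": "Female"},  # hypothetical entry
    ],
}
print(voices_for_language(sample, "en-us"))  # prints ['Alex']
```
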
GET /api/speak?text=Hello%20World

Synthesize speech using query parameters. Returns an audio file (MP3 or WAV).

Parameters:

  • text (required) - Text to convert to speech
  • lang (optional) - Language code (e.g., 'en-US', 'es-ES')
  • voice (optional) - Voice name
  • rate (optional) - Speech rate 0.5-2.0 (default: 1.0)
  • pitch (optional) - Speech pitch 0.5-2.0 (default: 1.0)
  • volume (optional) - Volume 0.0-1.0 (default: 1.0)
Example Request (Production)
GET https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World&lang=en-US&rate=1.2

Local Development: http://localhost:3000/api/speak?text=Hello%20World

Response: Audio file (Content-Type: audio/mpeg or audio/wav)
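
The parameter list above can be turned into a small URL builder that rejects out-of-range values before any request is sent. This is a sketch, not the server's own validation: the ranges are copied from the list above, and the names `build_speak_url` and `RANGES` are hypothetical.

```python
from urllib.parse import quote, urlencode

API_URL = "https://www.forbixindia.com/software/tts/api/speak"

# Allowed ranges copied from the parameter list in this guide.
RANGES = {"rate": (0.5, 2.0), "pitch": (0.5, 2.0), "volume": (0.0, 1.0)}


def build_speak_url(text, lang=None, voice=None, **numeric):
    """Build a GET URL for /api/speak, checking the numeric parameters."""
    if not text:
        raise ValueError("text is required")
    params = {"text": text}
    if lang:
        params["lang"] = lang
    if voice:
        params["voice"] = voice
    for name, value in numeric.items():
        if name not in RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        params[name] = value
    # quote_via=quote encodes spaces as %20, matching the example request.
    return f"{API_URL}?{urlencode(params, quote_via=quote)}"


print(build_speak_url("Hello World", lang="en-US", rate=1.2))
```

With these arguments the printed URL is the same as the production example request above.
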

POST /api/speak

Synthesize speech using a JSON body. Recommended for longer texts. Returns an audio file.

Request Body (JSON)
{
  "text": "Hello, this is a test message.",
  "lang": "en-US",
  "voice": "Alex",
  "rate": 1.2,
  "pitch": 1.0,
  "volume": 1.0
}

Response: Audio file (Content-Type: audio/mpeg or audio/wav)
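
A POST request with this body could be sent from Python using only the standard library, as sketched below. It assumes the server expects a `Content-Type: application/json` header and that the audio can be saved as-is; the function names and the `output.mp3` default are hypothetical.

```python
import json
import urllib.request

# Local development URL from this guide; swap in the production URL as needed.
API_URL = "http://localhost:3000/api/speak"


def build_speak_request(text, **options):
    """Build a POST request whose JSON body carries the text and any options."""
    body = {"text": text, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed to be required
        method="POST",
    )


def synthesize_to_file(text, path="output.mp3", **options):
    """Send the request and write the returned audio bytes to disk."""
    request = build_speak_request(text, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

Calling `synthesize_to_file("Hello, this is a test message.", lang="en-US", rate=1.2)` would write the response to `output.mp3` and return its size in bytes.
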

🔌 ESP32 Example with MAX98357A I2S Module

Here's a simple, ready-to-use example for ESP32 with MAX98357A I2S audio amplifier module.

✅ Production Ready: This code is configured to work with the TTS API at https://www.forbixindia.com/software/tts/api/speak. Just update your WiFi credentials and upload!

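Hardware Connections (MAX98357A)

  • BCLK → ESP32 GPIO 26 (bit clock)
  • LRC → ESP32 GPIO 25 (word select / left-right clock)
  • DIN → ESP32 GPIO 22 (I2S data out from the ESP32)
  • VIN → 3.3V or 5V
  • GND → GND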

Required Arduino Library

Install the Audio library by schreibfaul1:

  1. Open Arduino IDE
  2. Go to Tools → Manage Libraries
  3. Search for Audio by schreibfaul1
  4. Click Install

Complete ESP32 Code

ESP32_MAX98357A_TTS.ino
/*
 * ESP32 TTS with MAX98357A I2S Module
 * Simple example to generate text, send to TTS API, and play audio
 *
 * Hardware: ESP32 + MAX98357A I2S Audio Amplifier
 * API: https://www.forbixindia.com/software/tts/api/speak
 */

#include "WiFi.h"
#include "HTTPClient.h"
#include "Audio.h"

// ===== CONFIGURATION - UPDATE THESE =====
const char* WIFI_SSID = "YOUR_WIFI_SSID";         // Your WiFi network name
const char* WIFI_PASSWORD = "YOUR_WIFI_PASSWORD"; // Your WiFi password

// TTS API endpoint - Production URL
// Note: For HTTPS, ensure your ESP32 has updated SSL certificates
const char* TTS_API_URL = "https://www.forbixindia.com/software/tts/api/speak";

// Alternative: Use HTTP if HTTPS gives SSL certificate errors (for testing)
// const char* TTS_API_URL = "http://www.forbixindia.com/software/tts/api/speak";

// For local testing, uncomment below:
// const char* TTS_API_URL = "http://YOUR_LOCAL_IP:3000/api/speak";

// ===== HARDWARE CONFIGURATION =====
// MAX98357A I2S pins (standard connections)
#define I2S_BCLK 26 // Bit Clock
#define I2S_LRC  25 // Word Select (Left/Right Clock)
#define I2S_DOUT 22 // Data Output

// ===== AUDIO SETUP =====
Audio audio;

void setup() {
  // Initialize Serial for debugging
  Serial.begin(115200);
  delay(1000);
  Serial.println("\n\n=== ESP32 TTS with MAX98357A ===");

  // Connect to WiFi
  connectToWiFi();

  // Initialize I2S audio output for MAX98357A
  audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
  audio.setVolume(15); // Volume: 0-21 (15 = ~70%)

  Serial.println("Setup complete! Ready to play TTS audio.");
  Serial.println();

  // Example: Generate and play TTS
  playTextToSpeech("Hello! This is a test message from ESP32.");
}

void loop() {
  // Must call audio.loop() regularly to keep audio playing
  audio.loop();

  // Example: Play TTS every 10 seconds (remove this in production)
  // Uncomment below for testing:
  /*
  static unsigned long lastPlayTime = 0;
  if (millis() - lastPlayTime > 10000) {
    lastPlayTime = millis();
    playTextToSpeech("Current time: " + String(millis() / 1000) + " seconds.");
  }
  */
}

// Function to connect to WiFi
void connectToWiFi() {
  Serial.print("Connecting to WiFi: ");
  Serial.println(WIFI_SSID);

  WiFi.mode(WIFI_STA);
  WiFi.begin(WIFI_SSID, WIFI_PASSWORD);

  int attempts = 0;
  while (WiFi.status() != WL_CONNECTED && attempts < 20) {
    delay(500);
    Serial.print(".");
    attempts++;
  }

  if (WiFi.status() == WL_CONNECTED) {
    Serial.println("\nWiFi connected!");
    Serial.print("IP address: ");
    Serial.println(WiFi.localIP());
    Serial.print("Signal strength (RSSI): ");
    Serial.print(WiFi.RSSI());
    Serial.println(" dBm");
  } else {
    Serial.println("\nWiFi connection failed!");
    Serial.println("Please check your credentials and try again.");
    // In production, you might want to retry or go into deep sleep
  }
}

// Main function: Send text to TTS API and play the audio
void playTextToSpeech(String text) {
  if (WiFi.status() != WL_CONNECTED) {
    Serial.println("Error: WiFi not connected!");
    return;
  }

  if (text.length() == 0) {
    Serial.println("Error: Text is empty!");
    return;
  }

  Serial.println("--- Generating TTS Audio ---");
  Serial.print("Text: ");
  Serial.println(text);

  // Build the API URL with parameters
  String url = String(TTS_API_URL) + "?text=" + urlEncode(text) + "&lang=en-US&rate=1.0";
  Serial.print("Requesting: ");
  Serial.println(url);

  // Pre-check: confirm the API answers with HTTP 200 before streaming
  HTTPClient http;
  http.begin(url);
  http.setTimeout(10000); // 10 seconds

  int httpCode = http.GET();

  if (httpCode == 200) {
    Serial.print("✓ API responded OK. Size: ");
    Serial.print(http.getSize());
    Serial.println(" bytes");
    Serial.println("Playing audio...");

    // Stream audio from the URL to the I2S output.
    // Note: connecttohost() opens its own connection, so each call to this
    // function sends two requests to the API (this check plus the stream).
    audio.connecttohost(url.c_str());
  } else if (httpCode > 0) {
    Serial.print("✗ HTTP Error: ");
    Serial.println(httpCode);
    Serial.print("Response: ");
    Serial.println(http.getString());
  } else {
    Serial.print("✗ Connection failed: ");
    Serial.println(http.errorToString(httpCode));
  }

  http.end();
  Serial.println("--- Request completed ---\n");
}

// URL encoding function for text parameter
String urlEncode(String str) {
  String encodedString = "";
  char c;
  char code0;
  char code1;

  for (unsigned int i = 0; i < str.length(); i++) {
    c = str.charAt(i);
    if (c == ' ') {
      encodedString += '+';
    } else if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
      // Safe characters - no encoding needed
      encodedString += c;
    } else {
      // Encode special characters as %XX (uppercase hex)
      code1 = (c & 0xf) + '0';
      if ((c & 0xf) > 9) {
        code1 = (c & 0xf) - 10 + 'A';
      }
      c = (c >> 4) & 0xf;
      code0 = c + '0';
      if (c > 9) {
        code0 = c - 10 + 'A';
      }
      encodedString += '%';
      encodedString += code0;
      encodedString += code1;
    }
  }
  return encodedString;
}

// Optional: Audio event callbacks for monitoring
void audio_info(const char *info) {
  Serial.print("Audio info: ");
  Serial.println(info);
}

void audio_eof_mp3(const char *info) {
  Serial.print("Audio EOF: ");
  Serial.println(info);
}
📝 Quick Setup Instructions:
  1. Install Library: Install "Audio" library by schreibfaul1 from Arduino Library Manager
  2. Update WiFi: Replace YOUR_WIFI_SSID and YOUR_WIFI_PASSWORD with your credentials
  3. Connect Hardware: Wire MAX98357A as shown in the Hardware Connections section above
  4. Upload Code: Select ESP32 board in Arduino IDE and upload
  5. Test: Open Serial Monitor (115200 baud) to see debug messages
⚠️ Important Notes:
  • API URL: Pre-configured for production: https://www.forbixindia.com/software/tts/api/speak
  • HTTPS vs HTTP: If you get SSL certificate errors, change the URL to use http:// instead of https://
  • SSL Certificates: For HTTPS, update your ESP32 Arduino core to get latest SSL certificates, or use HTTP
  • Local Testing: Use HTTP with your local IP: http://192.168.1.100:3000/api/speak
  • MAX98357A Power: Works with 3.3V, but 5V provides better audio quality and volume
  • Audio Library: Make sure to install the correct "Audio" library by schreibfaul1
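
When debugging requests from the ESP32, it can help to compare the URL printed in the Serial Monitor with a reference encoding produced on a PC. The sketch below mirrors the byte-by-byte logic of the urlEncode() helper in the ESP32 code and checks it against Python's `urllib.parse.quote_plus`. The name `esp32_url_encode` is hypothetical, and the comparison assumes the sketch's source file is saved as UTF-8.

```python
from urllib.parse import quote_plus

SAFE = b"-_.~"  # characters the ESP32 helper passes through unchanged


def esp32_url_encode(text):
    """Python mirror of the ESP32 urlEncode() helper, working on UTF-8 bytes."""
    out = []
    for byte in text.encode("utf-8"):
        if byte == 0x20:
            out.append("+")  # space becomes '+'
        elif (byte < 0x80 and chr(byte).isalnum()) or byte in SAFE:
            out.append(chr(byte))
        else:
            out.append(f"%{byte:02X}")  # everything else becomes %XX
    return "".join(out)


message = "Temperature is 25.5 degrees & rising!"
print(esp32_url_encode(message))
print(esp32_url_encode(message) == quote_plus(message))
```

If the two encodings differ for a given message, the text parameter sent by the device is the first thing to inspect.
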

Customization Options

Example: Playing Different Texts
// In your loop() or button handler:

// Simple message
playTextToSpeech("Welcome to the TTS system.");

// With variable content
String sensorReading = String(25.5);
playTextToSpeech("Temperature is " + sensorReading + " degrees Celsius.");

// Multiple sentences
playTextToSpeech("System started. All sensors are online. Ready for operation.");

// Different language (if supported by API)
String url = String(TTS_API_URL) + "?text=" + urlEncode("Hola mundo") + "&lang=es-ES&rate=1.0";
audio.connecttohost(url.c_str());

🌐 Other Devices Examples

Python Example (Raspberry Pi)

tts_client.py
import requests
import pygame
from io import BytesIO

# Production API URL
API_URL = "https://www.forbixindia.com/software/tts/api/speak"

# For local testing, use:
# API_URL = "http://localhost:3000/api/speak"


def text_to_speech(text, lang='en-US', rate=1.0):
    params = {
        'text': text,
        'lang': lang,
        'rate': rate
    }

    try:
        response = requests.get(API_URL, params=params, timeout=30)

        if response.status_code == 200:
            # Play audio using pygame
            pygame.mixer.init()
            audio_file = BytesIO(response.content)
            pygame.mixer.music.load(audio_file)
            pygame.mixer.music.play()

            # Block until playback has finished
            while pygame.mixer.music.get_busy():
                pygame.time.wait(100)

            print("Audio playback completed!")
        else:
            print(f"Error: HTTP {response.status_code}")
            print(f"Response: {response.text}")

    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")


# Usage
if __name__ == "__main__":
    text_to_speech("Hello from Raspberry Pi!")

cURL Example

Terminal (Production)
# Save audio to file (Production)
curl "https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World" \
  -o output.mp3

# Play directly through the play command (part of SoX)
curl "https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World" | \
  play -t mp3 -

# For local testing:
# curl "http://localhost:3000/api/speak?text=Hello%20World" -o output.mp3

JavaScript/Node.js Example

tts_client.js
const axios = require('axios');
const fs = require('fs');
const { exec } = require('child_process');

// Production API URL
const API_URL = 'https://www.forbixindia.com/software/tts/api/speak';

// For local testing, use:
// const API_URL = 'http://localhost:3000/api/speak';

async function textToSpeech(text, options = {}) {
  try {
    const params = new URLSearchParams({
      text: text,
      lang: options.lang || 'en-US',
      rate: options.rate || 1.0,
      ...options
    });

    console.log(`Requesting TTS for: "${text}"`);

    const response = await axios({
      method: 'GET',
      url: `${API_URL}?${params.toString()}`,
      responseType: 'arraybuffer',
      timeout: 30000 // 30 second timeout
    });

    // Save to file
    fs.writeFileSync('output.mp3', response.data);
    console.log(`✓ Audio saved to output.mp3 (${response.data.length} bytes)`);

    // Play on macOS; the callback runs once afplay has exited
    exec('afplay output.mp3', (error) => {
      if (error) {
        console.error('Playback error:', error.message);
      } else {
        console.log('Audio playback finished');
      }
    });
  } catch (error) {
    if (error.response) {
      console.error(`HTTP Error: ${error.response.status}`);
      console.error(`Response: ${error.response.data.toString()}`);
    } else {
      console.error('Error:', error.message);
    }
  }
}

// Usage
textToSpeech('Hello from Node.js!');

⚙️ Configuration

The API can be configured by editing api/config.js. Here are the main settings:

Server Configuration

api/config.js
server: {
  port: process.env.PORT || 3000,
  host: process.env.HOST || '0.0.0.0', // '0.0.0.0' allows external access
  cors: {
    enabled: true,
    origin: '*' // In production, specify allowed origins
  }
}

TTS Engine Configuration

api/config.js
tts: {
  engine: 'system', // 'system', 'google', or 'polly'
  defaultVoice: {
    lang: 'en-US',
    voice: null, // null = system default
    rate: 1.0,
    pitch: 1.0,
    volume: 1.0
  }
}

Rate Limiting

api/config.js
api: {
  rateLimit: {
    enabled: true,
    windowMs: 60000, // 1 minute
    max: 60 // 60 requests per minute
  },
  limits: {
    maxTextLength: 5000, // Maximum characters
    maxConcurrentRequests: 10
  }
}
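⚠️ Security Note: For production use, set origin to specific allowed domains instead of *. Note that CORS only restricts cross-origin requests made from web browsers; it does not block direct requests from devices or scripts, which are limited only by the rate limiter.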
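
Because maxTextLength caps each request at 5000 characters, a client that needs to speak longer text has to split it first. The sketch below is one possible approach, splitting at sentence ends where it can. The function name is hypothetical, and it counts Python characters, which may differ slightly from how the server counts for text outside the Basic Multilingual Plane.

```python
import re

MAX_TEXT_LENGTH = 5000  # mirrors limits.maxTextLength in api/config.js


def split_text(text, limit=MAX_TEXT_LENGTH):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_text("One. Two. Three.", limit=10))  # prints ['One. Two.', 'Three.']
```

Each chunk can then be sent as its own request and the resulting audio played in order; every chunk counts as one request against the rate limit.
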

🔧 Troubleshooting

Common Issues

❌ "TTS engine not available"

Solution: Install the required TTS package:

npm install say # for system TTS
npm install gtts # for Google TTS

❌ "Cannot connect from ESP32"

Solution: Check the following:

  • WiFi Connection: Ensure ESP32 is connected to WiFi and can access the internet
  • API URL: Verify the URL is correct: https://www.forbixindia.com/software/tts/api/speak
  • HTTPS/SSL: For HTTPS, ESP32 needs valid SSL certificates (update Arduino ESP32 core or use HTTP for testing)
  • Serial Monitor: Check Serial Monitor (115200 baud) for error messages
  • Local Testing: If testing locally, use HTTP and your local IP: http://192.168.1.100:3000/api/speak
  • Firewall: Ensure firewall allows connections on port 3000 (for local) or port 443/80 (for production)

❌ "Audio file is corrupted"

Solution:

  • Check TTS engine is properly installed
  • Verify text encoding (UTF-8)
  • Check audio format compatibility with your device
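
A quick way to check whether a saved response is really audio is to look at its first bytes. The sketch below is a heuristic, not a full validator: WAV files start with a RIFF/WAVE header, and MP3 data starts either with an ID3 tag or with an MPEG frame sync. The function name is hypothetical. An "unknown" result usually means the body is an error message rather than audio.

```python
def detect_audio_format(data):
    """Guess the audio format from the leading bytes: 'wav', 'mp3', or 'unknown'."""
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:3] == b"ID3":
        return "mp3"  # MP3 data that begins with an ID3v2 tag
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"  # MPEG audio frame sync (11 set bits)
    return "unknown"


print(detect_audio_format(b"RIFF\x24\x00\x00\x00WAVEfmt "))  # prints wav
print(detect_audio_format(b'{"error": "TTS engine not available"}'))  # prints unknown
```
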

❌ "Rate limit exceeded"

Solution: Adjust rate limits in api/config.js or disable rate limiting for development.
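
A client can also avoid hitting the limit by pacing its own requests. The sketch below keeps a sliding window of recent request times, with defaults that match the documented 60 requests per minute. The class and method names are hypothetical, and the server's limiter may count its window differently, so treat this as a conservative client-side guard.

```python
import time
from collections import deque


class RequestPacer:
    """Client-side limiter: at most `max_requests` per `window_s` seconds."""

    def __init__(self, max_requests=60, window_s=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_s = window_s
        self.clock = clock
        self.sent = deque()  # timestamps of requests inside the window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window_s:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window_s - (now - self.sent[0])

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())

    def acquire(self, sleep=time.sleep):
        """Block until a request is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self.record()
```

Calling `pacer.acquire()` immediately before each HTTP request keeps a single client under the limit. Keep in mind that the ESP32 example sends two requests per utterance, so it uses up the allowance twice as fast.
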

Debug Mode

Enable debug logging by setting the log level:

api/config.js
logging: {
  level: 'debug' // Options: 'error', 'warn', 'info', 'debug'
}

📚 Additional Resources