Complete Guide to Connecting ESP32 and External Devices
This API provides a RESTful interface for converting text to speech in real time. It is designed to work with IoT devices such as ESP32, Raspberry Pi, and Arduino, and with any other device capable of making HTTP requests.
The API is modular and supports multiple TTS engines: system TTS (the say package), Google TTS (the gtts package), and AWS Polly.
Google TTS (npm install gtts) provides consistent MP3 output across all platforms. AWS Polly is also recommended for production use.
The TTS API is available at:
https://www.forbixindia.com/software/tts/api/speak
This URL is ready to use from ESP32 and any external devices with internet access.
The API server needs to be deployed as a Node.js application. Configuration:
Host: 0.0.0.0 (accepts external connections)
npm install
npm install say
npm install gtts
Edit api/config.js to change the TTS engine:
tts: {
  engine: 'system', // Options: 'system', 'google', 'polly'
  // ... other settings
}
npm start
The server starts at http://localhost:3000.
Test it by visiting http://localhost:3000/api/health in your browser.
Check if the API server is running and get basic status information.
{
  "status": "ok",
  "timestamp": "2024-01-15T10:30:00.000Z",
  "engine": "system",
  "available": true,
  "version": "1.0.0"
}
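A client can use this response to decide whether the server is ready before it requests audio. The following is a minimal Python sketch, assuming the payload has the shape shown above; the helper name is_tts_ready is hypothetical and not part of the API.

```python
import json

def is_tts_ready(health_json: str) -> bool:
    """Return True when the health payload reports a usable TTS engine."""
    try:
        payload = json.loads(health_json)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    # Both fields come from the documented sample response.
    return payload.get("status") == "ok" and payload.get("available") is True

sample = (
    '{"status": "ok", "timestamp": "2024-01-15T10:30:00.000Z", '
    '"engine": "system", "available": true, "version": "1.0.0"}'
)
print(is_tts_ready(sample))
```

The function only inspects an already-fetched body, so it can be paired with whichever HTTP client the device uses.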
Get a list of available voices for the current TTS engine.
{
  "success": true,
  "engine": "system",
  "voices": [
    {
      "name": "Alex",
      "lang": "en-US",
      "gender": "Male"
    }
  ],
  "count": 4
}
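To pick a voice for a given language, a client can filter the voices array from this response. This is a small Python sketch under the assumption that every entry carries the name and lang fields shown above; voices_for_lang is a hypothetical helper name. It iterates over the voices array rather than relying on count.

```python
def voices_for_lang(payload: dict, lang: str) -> list:
    """Return the names of voices whose 'lang' matches, ignoring case."""
    if not payload.get("success"):
        return []
    wanted = lang.lower()
    return [
        voice["name"]
        for voice in payload.get("voices", [])
        if "name" in voice and voice.get("lang", "").lower() == wanted
    ]

sample_payload = {
    "success": True,
    "engine": "system",
    "voices": [{"name": "Alex", "lang": "en-US", "gender": "Male"}],
    "count": 4,
}
print(voices_for_lang(sample_payload, "en-us"))
```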
Synthesize speech using query parameters. Returns an audio file (MP3 or WAV).
Parameters:
text (required) - Text to convert to speech
lang (optional) - Language code (e.g., 'en-US', 'es-ES')
voice (optional) - Voice name
rate (optional) - Speech rate 0.5-2.0 (default: 1.0)
pitch (optional) - Speech pitch 0.5-2.0 (default: 1.0)
volume (optional) - Volume 0.0-1.0 (default: 1.0)
GET https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World&lang=en-US&rate=1.2
Local Development: http://localhost:3000/api/speak?text=Hello%20World
Response: Audio file (Content-Type: audio/mpeg or audio/wav)
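The query string can be assembled and range-checked before the request is sent. The sketch below is one way to do this in Python using only the standard library; build_speak_url and RANGES are hypothetical names, and the ranges are copied from the parameter list above.

```python
from urllib.parse import urlencode, quote

BASE_URL = "https://www.forbixindia.com/software/tts/api/speak"

# Documented ranges for the optional numeric parameters.
RANGES = {"rate": (0.5, 2.0), "pitch": (0.5, 2.0), "volume": (0.0, 1.0)}

def build_speak_url(text, base_url=BASE_URL, **options):
    """Build a GET URL for the speak endpoint, validating documented ranges."""
    if not text:
        raise ValueError("text is required")
    params = {"text": text}
    for name, value in options.items():
        if value is None:
            continue  # leave the parameter out so the server default applies
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        elif name not in ("lang", "voice"):
            raise ValueError(f"unknown parameter: {name}")
        params[name] = value
    # quote_via=quote encodes spaces as %20, matching the example URL above.
    return base_url + "?" + urlencode(params, quote_via=quote)

print(build_speak_url("Hello World", lang="en-US", rate=1.2))
```

Rejecting out-of-range values on the client avoids spending a request on input the server would likely refuse.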
Synthesize speech using a JSON body. Recommended for longer texts. Returns an audio file.
{
  "text": "Hello, this is a test message.",
  "lang": "en-US",
  "voice": "Alex",
  "rate": 1.2,
  "pitch": 1.0,
  "volume": 1.0
}
Response: Audio file (Content-Type: audio/mpeg or audio/wav)
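A JSON request of this kind could be prepared in Python as sketched below. The sketch assumes the JSON variant is a POST to the same speak URL, which this guide does not state explicitly; build_speak_request and extension_for are hypothetical helper names. The second helper maps the two documented response content types to file extensions.

```python
import json
from urllib.request import Request

# Assumption: the JSON endpoint shares the URL of the GET endpoint.
API_URL = "https://www.forbixindia.com/software/tts/api/speak"

EXTENSIONS = {"audio/mpeg": ".mp3", "audio/wav": ".wav"}

def build_speak_request(text, **options):
    """Prepare (but do not send) a POST request with a JSON body."""
    body = {"text": text, **options}
    return Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def extension_for(content_type):
    """Map a response Content-Type to a file extension ('' if unrecognized)."""
    media_type = content_type.split(";")[0].strip().lower()
    return EXTENSIONS.get(media_type, "")

request = build_speak_request("Hello, this is a test message.", lang="en-US", rate=1.2)
print(request.get_method(), request.full_url)
```

The prepared request can be sent with urllib.request.urlopen(request, timeout=30), and the Content-Type header of the response passed to extension_for to choose the output file name.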
Here's a simple, ready-to-use example for ESP32 with MAX98357A I2S audio amplifier module.
The example is preconfigured to use https://www.forbixindia.com/software/tts/api/speak. Just update your WiFi credentials and upload!
Install the Audio library by schreibfaul1:
Audio by schreibfaul1
/*
* ESP32 TTS with MAX98357A I2S Module
* Simple example to generate text, send to TTS API, and play audio
*
* Hardware: ESP32 + MAX98357A I2S Audio Amplifier
* API: https://www.forbixindia.com/software/tts/api/speak
*/
#include "WiFi.h"
#include "HTTPClient.h"
#include "Audio.h"
// ===== CONFIGURATION - UPDATE THESE =====
const char* WIFI_SSID = "YOUR_WIFI_SSID"; // Your WiFi network name
const char* WIFI_PASSWORD = "YOUR_WIFI_PASSWORD"; // Your WiFi password
// TTS API endpoint - Production URL
// Note: For HTTPS, ensure your ESP32 has updated SSL certificates
const char* TTS_API_URL = "https://www.forbixindia.com/software/tts/api/speak";
// Alternative: Use HTTP if HTTPS gives SSL certificate errors (for testing only)
// const char* TTS_API_URL = "http://www.forbixindia.com/software/tts/api/speak";
// For local testing, uncomment below:
// const char* TTS_API_URL = "http://YOUR_LOCAL_IP:3000/api/speak";
// ===== HARDWARE CONFIGURATION =====
// MAX98357A I2S pins (standard connections)
#define I2S_BCLK 26 // Bit Clock
#define I2S_LRC 25 // Word Select (Left/Right Clock)
#define I2S_DOUT 22 // Data Output
// ===== AUDIO SETUP =====
Audio audio;
void setup() {
  // Initialize Serial for debugging
  Serial.begin(115200);
  delay(1000);
  Serial.println("\n\n=== ESP32 TTS with MAX98357A ===");
  // Connect to WiFi
  connectToWiFi();
  // Initialize I2S audio output for MAX98357A
  audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
  audio.setVolume(15); // Volume: 0-21 (15 = ~70%)
  Serial.println("Setup complete! Ready to play TTS audio.");
  Serial.println();
  // Example: Generate and play TTS
  playTextToSpeech("Hello! This is a test message from ESP32.");
}
void loop() {
  // Must call audio.loop() regularly to keep audio playing
  audio.loop();
  // Example: Play TTS every 10 seconds (remove this in production)
  // Uncomment below for testing:
  /*
  static unsigned long lastPlayTime = 0;
  if (millis() - lastPlayTime > 10000) {
    lastPlayTime = millis();
    playTextToSpeech("Current time: " + String(millis() / 1000) + " seconds.");
  }
  */
}
// Function to connect to WiFi
void connectToWiFi() {
  Serial.print("Connecting to WiFi: ");
  Serial.println(WIFI_SSID);
  WiFi.mode(WIFI_STA);
  WiFi.begin(WIFI_SSID, WIFI_PASSWORD);
  int attempts = 0;
  while (WiFi.status() != WL_CONNECTED && attempts < 20) {
    delay(500);
    Serial.print(".");
    attempts++;
  }
  if (WiFi.status() == WL_CONNECTED) {
    Serial.println("\nWiFi connected!");
    Serial.print("IP address: ");
    Serial.println(WiFi.localIP());
    Serial.print("Signal strength (RSSI): ");
    Serial.print(WiFi.RSSI());
    Serial.println(" dBm");
  } else {
    Serial.println("\nWiFi connection failed!");
    Serial.println("Please check your credentials and try again.");
    // In production, you might want to retry or go into deep sleep
  }
}
// Main function: Send text to TTS API and play the audio
void playTextToSpeech(String text) {
  if (WiFi.status() != WL_CONNECTED) {
    Serial.println("Error: WiFi not connected!");
    return;
  }
  if (text.length() == 0) {
    Serial.println("Error: Text is empty!");
    return;
  }
  Serial.println("--- Generating TTS Audio ---");
  Serial.print("Text: ");
  Serial.println(text);
  // Build the API URL with parameters
  String url = String(TTS_API_URL) + "?text=" + urlEncode(text) + "&lang=en-US&rate=1.0";
  Serial.print("Requesting: ");
  Serial.println(url);
  HTTPClient http;
  http.begin(url);
  // Set timeout
  http.setTimeout(10000); // 10 seconds
  // Make GET request (used here only as a pre-check of the response status)
  int httpCode = http.GET();
  if (httpCode == 200) {
    Serial.print("✓ Audio received! Size: ");
    Serial.print(http.getSize());
    Serial.println(" bytes");
    Serial.println("Playing audio...");
    // Stream audio directly from URL to I2S output
    // The Audio library handles the streaming automatically
    // Note: connecttohost() opens its own connection and requests the same URL
    // again, so each call to this function counts as two API requests
    audio.connecttohost(url.c_str());
  } else if (httpCode > 0) {
    Serial.print("✗ HTTP Error: ");
    Serial.println(httpCode);
    Serial.print("Response: ");
    Serial.println(http.getString());
  } else {
    Serial.print("✗ Connection failed: ");
    Serial.println(http.errorToString(httpCode));
  }
  http.end();
  Serial.println("--- Request completed ---\n");
}
// URL encoding function for text parameter
String urlEncode(String str) {
  String encodedString = "";
  char c;
  char code0;
  char code1;
  for (unsigned int i = 0; i < str.length(); i++) {
    c = str.charAt(i);
    if (c == ' ') {
      // Spaces become '+' (form-style query encoding)
      encodedString += '+';
    } else if (isalnum((unsigned char)c) || c == '-' || c == '_' || c == '.' || c == '~') {
      // Safe characters - no encoding needed
      // (the cast keeps isalnum() well-defined for non-ASCII bytes)
      encodedString += c;
    } else {
      // Encode special characters as %XX (uppercase hex, low nibble first)
      code1 = (c & 0xf) + '0';
      if ((c & 0xf) > 9) {
        code1 = (c & 0xf) - 10 + 'A';
      }
      c = (c >> 4) & 0xf;
      code0 = c + '0';
      if (c > 9) {
        code0 = c - 10 + 'A';
      }
      encodedString += '%';
      encodedString += code0;
      encodedString += code1;
    }
  }
  return encodedString;
}
// Optional: Audio event callbacks for monitoring
void audio_info(const char *info) {
  Serial.print("Audio info: ");
  Serial.println(info);
}
void audio_eof_mp3(const char *info) {
  Serial.print("Audio EOF: ");
  Serial.println(info);
}
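When debugging request URLs it can help to compute the expected encoding on a desktop and compare it with the "Requesting:" line printed by the sketch. The urlEncode function above appears to follow the same rules as Python's quote_plus with an empty safe set: spaces become +, letters, digits, and - _ . ~ pass through, and every other byte becomes %XX in uppercase hex. The helper name expected_encoding below is hypothetical.

```python
from urllib.parse import quote_plus

def expected_encoding(text: str) -> str:
    """Encode text the way the sketch's urlEncode() is expected to."""
    # safe="" means only letters, digits and - _ . ~ are left unescaped.
    return quote_plus(text, safe="")

print(expected_encoding("Hello! This is a test message from ESP32."))
print(expected_encoding("Temperature is 25.5 degrees Celsius."))
```

Non-ASCII characters are encoded byte by byte from their UTF-8 form, which matches the sketch as long as the Arduino source file is saved as UTF-8.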
Replace YOUR_WIFI_SSID and YOUR_WIFI_PASSWORD with your credentials
Production URL: https://www.forbixindia.com/software/tts/api/speak
If HTTPS fails with SSL errors, use http:// instead of https://
Local testing URL: http://192.168.1.100:3000/api/speak
// In your loop() or button handler:
// Simple message
playTextToSpeech("Welcome to the TTS system.");
// With variable content
String sensorReading = String(25.5);
playTextToSpeech("Temperature is " + sensorReading + " degrees Celsius.");
// Multiple sentences
playTextToSpeech("System started. All sensors are online. Ready for operation.");
// Different language (if supported by API)
String url = String(TTS_API_URL) + "?text=" + urlEncode("Hola mundo") + "&lang=es-ES&rate=1.0";
audio.connecttohost(url.c_str());
import requests
import pygame
from io import BytesIO

# Production API URL
API_URL = "https://www.forbixindia.com/software/tts/api/speak"
# For local testing, use:
# API_URL = "http://localhost:3000/api/speak"

def text_to_speech(text, lang='en-US', rate=1.0):
    params = {
        'text': text,
        'lang': lang,
        'rate': rate
    }
    try:
        response = requests.get(API_URL, params=params, timeout=30)
        if response.status_code == 200:
            # Play audio using pygame
            pygame.mixer.init()
            audio_file = BytesIO(response.content)
            pygame.mixer.music.load(audio_file)
            pygame.mixer.music.play()
            # Block until playback has finished
            while pygame.mixer.music.get_busy():
                pygame.time.wait(100)
            print("Audio playback completed!")
        else:
            print(f"Error: HTTP {response.status_code}")
            print(f"Response: {response.text}")
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")

# Usage
if __name__ == "__main__":
    text_to_speech("Hello from Raspberry Pi!")
# Save audio to file (Production)
curl "https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World" \
-o output.mp3
# Play directly with the SoX "play" command (requires SoX with MP3 support)
curl "https://www.forbixindia.com/software/tts/api/speak?text=Hello%20World" | \
play -t mp3 -
# For local testing:
# curl "http://localhost:3000/api/speak?text=Hello%20World" -o output.mp3
const axios = require('axios');
const fs = require('fs');
const { exec } = require('child_process');
// Production API URL
const API_URL = 'https://www.forbixindia.com/software/tts/api/speak';
// For local testing, use:
// const API_URL = 'http://localhost:3000/api/speak';
async function textToSpeech(text, options = {}) {
  try {
    const params = new URLSearchParams({
      text: text,
      lang: options.lang || 'en-US',
      rate: options.rate || 1.0,
      ...options
    });
    console.log(`Requesting TTS for: "${text}"`);
    const response = await axios({
      method: 'GET',
      url: `${API_URL}?${params.toString()}`,
      responseType: 'arraybuffer',
      timeout: 30000 // 30 second timeout
    });
    // Save to file
    fs.writeFileSync('output.mp3', response.data);
    console.log(`✓ Audio saved to output.mp3 (${response.data.length} bytes)`);
    // Play on macOS (afplay is macOS-only)
    exec('afplay output.mp3', (error) => {
      // This callback runs after afplay exits, i.e. when playback has ended
      if (error) {
        console.error('Playback error:', error.message);
      } else {
        console.log('Audio playback finished');
      }
    });
  } catch (error) {
    if (error.response) {
      console.error(`HTTP Error: ${error.response.status}`);
      console.error(`Response: ${error.response.data.toString()}`);
    } else {
      console.error('Error:', error.message);
    }
  }
}
// Usage
textToSpeech('Hello from Node.js!');
The API can be configured by editing api/config.js. Here are the main settings:
server: {
  port: process.env.PORT || 3000,
  host: process.env.HOST || '0.0.0.0', // '0.0.0.0' allows external access
  cors: {
    enabled: true,
    origin: '*' // In production, specify allowed origins
  }
}
tts: {
  engine: 'system', // 'system', 'google', or 'polly'
  defaultVoice: {
    lang: 'en-US',
    voice: null, // null = system default
    rate: 1.0,
    pitch: 1.0,
    volume: 1.0
  }
}
api: {
  rateLimit: {
    enabled: true,
    windowMs: 60000, // 1 minute
    max: 60 // 60 requests per minute
  },
  limits: {
    maxTextLength: 5000, // Maximum characters
    maxConcurrentRequests: 10
  }
}
In production, set origin to specific allowed domains instead of * to prevent unauthorized access.
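The rateLimit and limits settings above also shape how a client should behave: with windowMs set to 60000 and max set to 60, requests spaced at least one second apart stay within the window, and any text longer than maxTextLength has to be sent in pieces. The Python sketch below illustrates both under those default values; min_interval_seconds and split_text are hypothetical helpers, and whether the server counts requests per client or globally is not stated in this guide.

```python
import re

MAX_TEXT_LENGTH = 5000  # api.limits.maxTextLength
WINDOW_MS = 60000       # api.rateLimit.windowMs
MAX_REQUESTS = 60       # api.rateLimit.max

def min_interval_seconds(window_ms=WINDOW_MS, max_requests=MAX_REQUESTS):
    """Spacing between requests that keeps a steady sender within the limit."""
    return window_ms / 1000 / max_requests

def split_text(text, limit=MAX_TEXT_LENGTH):
    """Split text into chunks of at most `limit` characters,
    preferring to break after sentence-ending punctuation."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

print(min_interval_seconds())
print(split_text("One. Two. Three.", limit=10))
```

Each chunk can then be sent as its own request, waiting at least min_interval_seconds() between requests.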
Solution: Install the required TTS package:
npm install say # for system TTS
npm install gtts # for Google TTS
Solution: Check the following:
Production URL: https://www.forbixindia.com/software/tts/api/speak
Local testing URL: http://192.168.1.100:3000/api/speak
Solution:
Solution: Adjust rate limits in api/config.js or disable rate limiting for development.
Enable debug logging by setting the log level:
logging: {
  level: 'debug' // Options: 'error', 'warn', 'info', 'debug'
}