NakamuraYuichi/tts-mcp
Server Summary
Convert text to audio
Customize voice options
Support for multiple audio formats (MP3, WAV, OPUS, AAC)
Adjust speech speed
Provide command-line interface for direct conversions
A Model Context Protocol (MCP) server and command-line tool for high-quality text-to-speech generation using the OpenAI TTS API.
# Clone the repository
git clone https://github.com/nakamurau1/tts-mcp.git
cd tts-mcp
# Install dependencies
npm install
# Optional: Install globally
npm install -g .
# Start the MCP server directly
npx -p tts-mcp tts-mcp-server --voice nova --model tts-1-hd
# Use the CLI tool directly
npx tts-mcp -t "Hello, world" -o hello.mp3
The MCP server allows you to integrate text-to-speech functionality with Model Context Protocol (MCP) compatible clients like Claude Desktop.
# Start with default settings
npm run server
# Start with custom settings
npm run server -- --voice nova --model tts-1-hd
# Or directly with API key
node bin/tts-mcp-server.js --voice echo --api-key your-openai-api-key
Options:
-V, --version Display version information
-m, --model TTS model to use (default: "gpt-4o-mini-tts")
-v, --voice Voice character (default: "alloy")
-f, --format Audio format (default: "mp3")
--api-key OpenAI API key (can also be set via environment variable)
-h, --help Display help information
The MCP server can be used with Claude Desktop and other MCP-compatible clients. For Claude Desktop integration:
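Open the Claude Desktop configuration file (typically at ~/Library/Application Support/Claude/claude_desktop_config.json)
Add the following configuration, including your OpenAI API key:
{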
"mcpServers": {
"tts-mcp": {
"command": "node",
"args": ["full/path/to/bin/tts-mcp-server.js", "--voice", "nova", "--api-key", "your-openai-api-key"],
"env": {
"OPENAI_API_KEY": "your-openai-api-key"
}
}
}
}
Alternatively, you can use npx for easier setup:
{
"mcpServers": {
"tts-mcp": {
"command": "npx",
"args": ["-p", "tts-mcp", "tts-mcp-server", "--voice", "nova", "--model", "gpt-4o-mini-tts"],
"env": {
"OPENAI_API_KEY": "your-openai-api-key"
}
}
}
}
You can provide the API key in two ways:
In the args array, using the --api-key parameter
In the env object, as shown above
Security Note: Make sure to secure your configuration file when including API keys.
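One possible way to secure the configuration file is to restrict it to its owner. The sketch below assumes a Unix-like system; `secure_config` is a hypothetical helper name, not something shipped with tts-mcp.

```shell
# Restrict a config file so that only its owner can read or write it.
# secure_config is a hypothetical helper, not part of tts-mcp.
secure_config() {
  local config_path="$1"
  if [ ! -f "$config_path" ]; then
    echo "not found: $config_path" >&2
    return 1
  fi
  chmod 600 "$config_path"
}

# Usage sketch (path taken from the Claude Desktop instructions):
# secure_config "$HOME/Library/Application Support/Claude/claude_desktop_config.json"
```

This does not encrypt the key. It only keeps other local accounts from reading the file.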
You can also use tts-mcp as a standalone command-line tool:
# Convert text directly
tts-mcp -t "Hello, world" -o hello.mp3
# Convert from a text file
tts-mcp -f speech.txt -o speech.mp3
# Specify custom voice
tts-mcp -t "Welcome to the future" -o welcome.mp3 -v nova
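The file-based form can be wrapped in a loop to convert several text files at once. This is a minimal sketch that uses only the `-f` and `-o` flags from the examples. `convert_dir` and the `TTS_CMD` override are hypothetical names introduced here so the loop can be tried with a stub command.

```shell
# Convert every .txt file in a directory to an .mp3 with the same base name.
# TTS_CMD is a hypothetical override (not part of tts-mcp); by default the
# loop calls the tts-mcp CLI.
convert_dir() {
  local src_dir="$1"
  local out_dir="$2"
  local tts_cmd="${TTS_CMD:-tts-mcp}"
  local txt base
  mkdir -p "$out_dir" || return 1
  for txt in "$src_dir"/*.txt; do
    [ -e "$txt" ] || continue   # skip the literal pattern when nothing matches
    base="$(basename "$txt" .txt)"
    "$tts_cmd" -f "$txt" -o "$out_dir/$base.mp3" || return 1
  done
}

# Usage sketch:
# convert_dir ./scripts ./audio
```

The function stops at the first failed conversion, so a bad API key does not trigger one failing request per file.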
Options:
-V, --version Display version information
-t, --text Text to convert
-f, --file Path to input text file
-o, --output Path to output audio file (required)
-m, --model Model to use (default: "gpt-4o-mini-tts")
-v, --voice Voice character (default: "alloy")
-s, --speed Speech speed (0.25-4.0) (default: 1)
--format Output format (default: "mp3")
-i, --instructions Additional instructions for speech generation
--api-key OpenAI API key (can also be set via environment variable)
-h, --help Display help information
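Since `--speed` is documented as accepting 0.25 to 4.0, a wrapper script could check the value before calling the CLI. The sketch below assumes `awk` is available; `valid_speed` is a hypothetical helper, and how tts-mcp itself reacts to out-of-range values is not shown here.

```shell
# Return 0 when the argument is a plain decimal number in the documented
# range 0.25-4.0, and 1 otherwise. valid_speed is a hypothetical helper.
valid_speed() {
  awk -v s="$1" 'BEGIN {
    ok = (s ~ /^[0-9]+(\.[0-9]+)?$/) && ((s + 0) >= 0.25) && ((s + 0) <= 4.0)
    exit !ok
  }'
}

# Usage sketch: only call the CLI when the speed is acceptable.
# speed="1.5"
# if valid_speed "$speed"; then
#   tts-mcp -t "Hello, world" -o hello.mp3 -s "$speed"
# fi
```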
Supported voice characters include alloy (the default), echo, and nova, all of which appear in the examples above.
Supported output formats are MP3 (the default), WAV, OPUS, and AAC.
You can also configure the tool using system environment variables:
OPENAI_API_KEY=your-api-key-here
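When relying on the environment variable, a launch script might confirm that it is set before starting the server or CLI. This is a small sketch; `require_api_key` is a hypothetical helper name.

```shell
# Fail early with a readable message when OPENAI_API_KEY is missing or empty.
# require_api_key is a hypothetical helper, not part of tts-mcp.
require_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set; export it or pass --api-key" >&2
    return 1
  fi
}

# Usage sketch:
# require_api_key && npm run server
```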
License: MIT