Google Translate Vs Google Cloud
This article explains the difference between the google_translate and google_cloud platforms in Home Assistant and their proper integration.
Google Translate: Configuration
The Google Translate Text-To-Speech integration uses the unofficial Google Translate TTS engine to read input text in natural sounding voices.
- language string (optional, default: en)
The default speech language to use. For full list click HERE
- cache boolean (optional, default: true)
Allow TTS to cache voice file to local storage.
- cache_dir string (optional, default: tts)
Folder name or path to a folder for caching files.
- time_memory integer (optional, default: 300)
Time to hold the voice data inside memory for fast play on a media player. Minimum is 60 s and the maximum 57600 s (16 hours).
- base_url string (optional, default: value of internal URL)
A base URL to use instead of the one set in the Home Assistant configuration. It is used as-is by the tts component. In particular, you need to include the protocol scheme http:// or https:// and the correct port number. They will not be automatically added for you.
- service_name string (optional)
Define the service name.
Default: <platform>_say. For example, for the google_translate platform, the default service name is google_translate_say.
IMPORTANT: If you access your Home Assistant server with an SSL certificate, you must provide a base URL for the Google Translate service to work with Google Cast devices. Cast devices reject self-signed certificates, so simply providing the internal IP while using SSL makes them refuse the connection. Use your host name instead (e.g. my-hostname.duckdns.org).
If you are not using SSL, simply provide the internal IP, because the cast device will then not have to resolve the host name. In both cases, set the value under base_url.
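For the non-SSL case described above, a minimal sketch might look like the following. The IP address and port are placeholders assumed for illustration (they do not come from this article); substitute the internal address of your own server.

```yaml
# configuration.yaml (sketch for a setup without SSL)
tts:
  - platform: google_translate
    # Hypothetical internal address. The scheme and port must be written out,
    # because they are not added automatically.
    base_url: http://192.168.1.10:8123
```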
Example of full configuration
# Add this to your configuration.yaml
tts:
  - platform: google_translate
    language: "en"
    service_name: google_say
    cache: true
    cache_dir: /tmp/tts
    time_memory: 300
    base_url: https://yourdomain.duckdns.org
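With service_name set to google_say as in the example, the platform should be callable as tts.google_say. Below is a minimal sketch of such a call, usable as an automation action or from the Developer Tools. The media player entity ID is a hypothetical placeholder, not something defined in this article.

```yaml
# Sketch of a service call for the configuration above
service: tts.google_say
data:
  # Hypothetical entity; replace with one of your own media players
  entity_id: media_player.living_room_speaker
  message: "The front door is open."
```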
Google Cloud: Configuration
The google_cloud platform allows you to use the Google Cloud Platform APIs and integrate them into Home Assistant. Before we can configure the google_cloud tts service, we need to obtain an API key from google’s resource manager.
IMPORTANT: Google requires a billing account to be set up before you can use their APIs. This does not mean you will be charged for using the API, as long as you do not exceed the free character quota.
Google Cloud: Setting up a billing account
- Visit Google’s Billing Account site
- Click Add a Billing Account
- Add your phone number and verify it via SMS
- In Account Type, select Individual
- Set up your Payment Method (country dependent) and click Start my Free Trial
Google Cloud: Obtaining an API key
1. Visit Google’s Cloud Resource Manager site
2. Create New Project, specify Name and click Create
3. Visit Google’s APIs Library
4. Search for “text-to-speech” and click on the Cloud Text-To-Speech API
4.1 Or follow THIS direct link to enable TTS
5. Select your project from the dropdown at the top and click Enable
5.1 If you get a “Billing required” message, you need to set up a billing account first
6. Set up authentication
6.1 Visit Google’s Cloud Resource Manager site
6.2 Click the hamburger menu on the left and choose IAM & Admin
6.3 Click Service Accounts in the left menu, then click Create new service account at the top
6.4 Name your account (the ID field is populated automatically), then click Create and continue
6.5 Click Continue again without selecting a role (not needed for TTS), then click Done
6.6 Open the account by clicking its name under the email tab
6.7 Go to the KEYS tab and press ADD KEY, then Create New Key
6.8 Select JSON, which will download the JSON file containing your key to your PC
7. Upload the file to the config folder of your Home Assistant server
Google Cloud: Configuration Variables
- key_file string (optional)
The API key file to use with Google Cloud Platform. If not specified, the path in os.environ['GOOGLE_APPLICATION_CREDENTIALS'] will be used.
- language string (optional, default: en-US)
Default language of the voice, e.g., en-US. Supported languages, genders and voices are listed here. There are also extra languages that are supported but not documented (see dropdown here).
- gender string (optional, default: neutral)
Default gender of the voice, e.g., male. Supported languages, genders and voices are listed here.
- voice string (optional)
Default voice name, e.g., en-US-Wavenet-F. Supported languages, genders and voices are listed here. Important! If set, this parameter overrides the language and gender parameters.
- encoding string (optional, default: mp3)
Default audio encoder. Supported encodings are ogg_opus, mp3 and linear16.
- speed float (optional, default: 1.0)
Default rate/speed of the voice, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice. 2.0 is twice as fast, and 0.5 is half as fast. If unset(0.0), defaults to the native 1.0 speed.
- pitch float (optional, default: 0.0)
Default pitch of the voice, in the range [-20.0, 20.0]. 20 means increase of 20 semitones from the original pitch. -20 means decrease of 20 semitones from the original pitch.
- gain float (optional, default: 0.0)
Default volume gain (in dB) of the voice, in the range [-96.0, 16.0]. If unset, or set to a value of 0.0 (dB), will play at normal native signal amplitude. A value of -6.0 (dB) will play at approximately half the amplitude of the normal native signal amplitude. A value of +6.0 (dB) will play at approximately twice the amplitude of the normal native signal amplitude. Strongly recommend not to exceed +10 (dB) as there’s usually no effective increase in loudness for any value greater than that.
- profiles list (optional, default: )
An identifier which selects ‘audio effects’ profiles that are applied on (post synthesized) text to speech. Effects are applied on top of each other in the order they are given. Supported profile ids listed here.
- text_type string (optional, default: text)
Default text type. Supported text types are text and ssml. Read more about what SSML is and how to use it here.
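To illustrate how some of these variables interact, here is a small sketch rather than a recommended setup. It relies on voice taking precedence over language and gender, so both are left out. The key file name is borrowed from the full example in this article, and the profile ID is one that Google listed for Cloud Text-to-Speech at the time of writing; treat both as assumptions to verify against your own setup.

```yaml
# configuration.yaml (sketch)
tts:
  - platform: google_cloud
    key_file: my-apikey-cloud.json
    # voice overrides language and gender, so they are omitted here
    voice: en-US-Wavenet-F
    # Treat every message as SSML by default
    text_type: ssml
    # Audio effects profiles are applied in the order given
    profiles:
      - small-bluetooth-speaker-class-device
```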
Example of full configuration
# Add this to your configuration.yaml
tts:
  - platform: google_cloud
    key_file: my-apikey-cloud.json
    service_name: google_cloud
    language: en-US
    gender: male
    voice: en-GB-Wavenet-D
    speed: 1.1
    pitch: -2
    gain: 0.0
    text_type: text
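With service_name set to google_cloud as in the example, the service should be reachable as tts.google_cloud. The sketch below sends a single SSML message. It assumes that the integration accepts per-call overrides such as text_type under options, and the media player entity ID is a hypothetical placeholder.

```yaml
# Sketch of a one-off SSML announcement
service: tts.google_cloud
data:
  # Hypothetical entity; replace with one of your own media players
  entity_id: media_player.kitchen_speaker
  message: "<speak>Dinner is ready.<break time='500ms'/>Please come to the table.</speak>"
  options:
    # Assumed per-call override of the configured text_type
    text_type: ssml
```

If per-call options are not available in your version, setting text_type: ssml in configuration.yaml achieves the same effect for every message.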
Pricing: Standard vs Wavenet voices
Text-to-Speech is priced based on the number of characters sent to the service to be synthesized into audio each month. The total number of characters in the input string are counted for billing purposes, including spaces.
The full list of available voices can be found HERE
Standard voices
- Free Tier: 0-4 million characters (roughly 600,000-800,000 words)
- Price after exceeding the limit: $0.000004 per character ($4 USD per 1 million characters)
WaveNet voices
- Free Tier: 0-1 million characters (roughly 150,000-200,000 words)
- Price after exceeding the limit: $0.000016 per character ($16 USD per 1 million characters)
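As a rough worked example, assume that only the characters above the free allowance are billed. With n characters sent in a month, a free allowance of n_free and a per-character price p, the cost can be sketched as follows. The figure of 1.5 million WaveNet characters is a made-up usage number for illustration.

```latex
\text{cost} = \max(0,\; n - n_{\text{free}}) \times p

% Hypothetical month with 1{,}500{,}000 WaveNet characters:
\text{cost} = (1{,}500{,}000 - 1{,}000{,}000) \times \$0.000016 = \$8.00
```

The same usage with standard voices would stay inside the 4 million character free tier and cost nothing.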
The google_translate and google_cloud integrations in Home Assistant serve the same purpose: converting text to speech via their respective services. We list the pros and cons of each so you can decide which one is for you.
Google Translate: Pros
- Easier to set up
- No billing account required
- No Google Cloud Platform required
- More languages supported (100+)
- No character limit
Google Translate: Cons
- Robotic sounding voice
- Inability to change the voice in Home Assistant
- Static speed, pitch and gain
- Need to include base_url in the configuration to work around Google Cast devices rejecting self-signed certificates
Google Cloud: Pros
- More natural sounding voices
- Higher quality speech synthesis
- SSML support for customization of the audio response
- Audio profiles can be applied to post-synthesized TTS
Google Cloud: Cons
- More difficult to set up
- Google Cloud Platform required
- Billing account required
- Character limit
- Fewer languages available
Both integrations work seamlessly in Home Assistant if set up correctly. We recommend trying them both to see which one is for you. If you are a power user with a lot of audible notifications and announcements going through your smart speaker all the time, and cost is your concern, then Google Translate may be for you: it is easier to set up and has no character limit. That said, the free tier of Google Cloud TTS lets you use a large number of characters free of charge (4 million for standard voices and 1 million for WaveNet voices). This is around 600,000 and 150,000 words respectively, which is a lot!
You can use both integrations simultaneously, and call on them via their services.
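A minimal sketch of running both platforms side by side, reusing the service names and key file name from the examples in this article (adjust them to your own setup):

```yaml
# configuration.yaml (sketch with both platforms enabled)
tts:
  - platform: google_translate
    service_name: google_say      # called as tts.google_say
  - platform: google_cloud
    key_file: my-apikey-cloud.json
    service_name: google_cloud    # called as tts.google_cloud
```

Each automation can then pick whichever service suits the message, for example the free Google Translate voice for frequent routine notifications and a WaveNet voice for occasional announcements.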