Implement time markers for TTS.

The service can inform the framework at which audio frame a given part
of the input text is spoken, and that information is then relayed to the
client.

This can be used to highlight the currently spoken word/sentence or to
resume an interrupted synthesis request from the start of the last
word/sentence that was spoken.
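
A minimal client-side sketch of how such markers could be consumed. It
is illustrative only: the class TimeMarkerSketch, its Marker type and the
methods addMarker, currentMarker and resumeOffset are hypothetical names,
not the API added by this change. It assumes each marker carries an audio
frame plus a [start, end) range into the input text, and that markers
arrive in increasing frame order.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeMarkerSketch {
    /** Hypothetical marker: text[start, end) begins playing at audio frame `frame`. */
    static final class Marker {
        final int frame;
        final int start;
        final int end;

        Marker(int frame, int start, int end) {
            this.frame = frame;
            this.start = start;
            this.end = end;
        }
    }

    private final List<Marker> markers = new ArrayList<>();

    /** Records a marker; assumed to be called in increasing frame order. */
    void addMarker(int frame, int start, int end) {
        markers.add(new Marker(frame, start, end));
    }

    /** Returns the last marker whose frame has been reached, or null if none has. */
    Marker currentMarker(int playedFrames) {
        Marker current = null;
        for (Marker m : markers) {
            if (m.frame > playedFrames) {
                break;
            }
            current = m;
        }
        return current;
    }

    /** Text offset from which an interrupted request could be resumed. */
    int resumeOffset(int playedFrames) {
        Marker m = currentMarker(playedFrames);
        return m == null ? 0 : m.start;
    }

    public static void main(String[] args) {
        String text = "Hello brave world";
        TimeMarkerSketch tracker = new TimeMarkerSketch();
        // Made-up frame positions for the three words.
        tracker.addMarker(0, 0, 5);
        tracker.addMarker(8000, 6, 11);
        tracker.addMarker(16000, 12, 17);

        // Playback stopped after 9000 frames, i.e. inside the second word.
        Marker m = tracker.currentMarker(9000);
        System.out.println(text.substring(m.start, m.end));
        System.out.println(text.substring(tracker.resumeOffset(9000)));
    }
}
```

The same lookup serves both uses: the returned range is what a client
would highlight, and its start offset is where a resumed request would
begin.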

Test: manual
Change-Id: Ie20a6764a8788cc3539cb058425e55eb6fde07db
11 files changed