Reducing AI Inference Latency with Speculative Decoding