Drop in an audio file or call recording. Detect synthetic speech from ElevenLabs, Resemble, PlayHT, OpenAI, and other voice-cloning engines.
Synthetic speech is statistically different from human speech. Our model triangulates across six independent signals. A file has to fail several before we flag it.
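A minimal sketch of the "fail several before we flag" rule, assuming each signal reduces to a pass/fail result. The `SignalResult` type, the `should_flag` helper, and the default threshold of three failures are hypothetical names and values chosen for illustration, not the production logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalResult:
    name: str     # illustrative label, e.g. "pitch" or "breath"
    failed: bool  # True when this signal looks synthetic


def should_flag(results: list[SignalResult], min_failures: int = 3) -> bool:
    """Flag a file only when at least `min_failures` signals fail.

    The threshold of 3 is an assumption made for this example.
    """
    failures = sum(1 for r in results if r.failed)
    return failures >= min_failures
```

Requiring agreement across signals means one noisy measurement, such as a quiet room mistaken for a studio, cannot flag a file on its own.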
Real speakers vary pitch involuntarily, with micro-fluctuations every 80–150 ms. TTS systems smooth them out. We measure the variance.
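One simple way to put a number on pitch micro-variation is relative jitter over a pitch (F0) contour. The sketch below assumes the contour has already been extracted, one value in Hz per frame, with 0 marking unvoiced frames. The `pitch_jitter` name and that input convention are assumptions for illustration; the actual model's measure may differ.

```python
import statistics


def pitch_jitter(f0_hz: list[float]) -> float:
    """Mean absolute frame-to-frame F0 change, relative to the mean F0.

    Only pairs of adjacent voiced frames (F0 > 0) are compared, so a
    pause between words is not counted as a pitch jump.
    """
    pairs = [(a, b) for a, b in zip(f0_hz, f0_hz[1:]) if a > 0 and b > 0]
    if not pairs:
        raise ValueError("need at least two adjacent voiced frames")
    voiced = [f for f in f0_hz if f > 0]
    mean_change = statistics.fmean([abs(b - a) for a, b in pairs])
    return mean_change / statistics.fmean(voiced)
```

A perfectly flat contour scores 0.0, and a contour that wobbles by a few hertz from frame to frame scores higher.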
Humans breathe. Cloned voices often skip the inter-clause inhale or insert a synthetic one of suspiciously consistent duration.
Vocoder upsampling leaves a high-frequency quantization signature, visible above 8 kHz, that most TTS engines can't fully suppress.
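As a rough illustration of looking above 8 kHz, the sketch below measures what share of a short frame's spectral energy sits at or above a cutoff. It uses a naive DFT to stay dependency-free, so it only suits short frames. The `high_band_energy_ratio` name is hypothetical, and an energy ratio is a stand-in for illustration, not the quantization signature itself.

```python
import cmath
import math


def high_band_energy_ratio(
    samples: list[float], sample_rate: int, cutoff_hz: float = 8000.0
) -> float:
    """Fraction of non-DC spectral energy at or above `cutoff_hz`."""
    n = len(samples)
    if n == 0:
        raise ValueError("empty frame")
    if cutoff_hz >= sample_rate / 2:
        raise ValueError("cutoff must be below the Nyquist frequency")
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin, stop at Nyquist
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n >= cutoff_hz:
            high += power
    return high / total if total > 0 else 0.0
```

The recording must be sampled above 16 kHz for an 8 kHz cutoff to be meaningful, which is why the function rejects cutoffs at or beyond Nyquist.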
Diffusion-based voice clones produce unnaturally smooth phoneme transitions. We compare against a corpus of 200k human speech samples.
Phone calls have background noise. A "phone call" with a studio-clean noise floor is one of the strongest deepfake tells.
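A noise floor can be approximated by looking at the quietest stretches of a recording. The sketch below assumes samples are floats in the range -1.0 to 1.0, splits them into fixed-length frames, and reports the RMS level of a low-percentile frame in dBFS. The `noise_floor_dbfs` name, the frame length, and the 10th-percentile choice are all assumptions for illustration.

```python
import math


def noise_floor_dbfs(
    samples: list[float], frame_len: int = 400, percentile: float = 0.1
) -> float:
    """Estimate the noise floor as the RMS of a quiet frame, in dBFS."""
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    if not frames:
        raise ValueError("recording is shorter than one frame")
    rms = sorted(math.sqrt(sum(x * x for x in f) / len(f)) for f in frames)
    quiet = rms[int(percentile * (len(rms) - 1))]
    # Clamp so that digital silence maps to -200 dBFS instead of -infinity.
    return 20 * math.log10(max(quiet, 1e-10))
```

A real phone line rarely drops to digital silence between words, so an extremely low estimate on audio presented as a call is worth a second look.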
Synthetic speech carries vocoder and upsampling artifacts across the frequency band that microphone-captured human speech does not. We score the whole recording rather than guessing which tool produced it.
We detect output from current TTS and voice-cloning systems, and update our models as new engines appear.
Documented voice-clone fraud cases from the last 24 months. Every case had a clear acoustic signal we'd have flagged inside a second.
A woman in Hyderabad lost Rs 1.4 lakh to a scammer using AI to mimic her nephew's voice. A clone like this typically shows a flat pitch contour and missing breath gaps, the kind of thing Signals 01 (pitch variance) and 02 (breath gaps) catch.
A finance worker was tricked into sending $25.6M after a deepfake video call with cloned voices of the CFO and colleagues. A studio-clean noise floor on a "Zoom call" is exactly the kind of tell Signal 05 (noise floor) flags.
A mother received a fake-kidnapping ransom call using a clone of her daughter's voice. A cloned plea tends to stay tonally flat across 90 seconds, where genuine distress would vary.
An AI voice detector analyzes an audio recording and estimates whether the speech is a real human voice or AI-generated, such as a voice clone or other synthetic speech. DeepfakeDetector.ai returns a clear verdict (Authentic, Likely Synthetic, or Inconclusive) paired with a TrustScore from 0 to 100.
It detects AI-generated and cloned speech with high accuracy and pairs every result with a confidence score, because detection is probabilistic rather than absolute. Accuracy can vary with audio quality, compression, and background noise, so treat a verdict as strong evidence to weigh alongside the source and context.
Yes. It detects synthetic speech from ElevenLabs, Resemble, PlayHT, OpenAI, and other major voice-cloning tools. The result is a whole-file verdict with a confidence score; it does not name which specific tool produced the audio.
Upload MP3, WAV, OGG, or M4A files. The free plan analyzes clips up to 2 minutes per detection, and paid plans handle up to 10 minutes per detection.
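Before uploading, a client could pre-check a file against these limits. The sketch below is a hypothetical local check, not part of any DeepfakeDetector.ai API: the `can_upload` name and the `"free"` / `"paid"` plan keys are assumptions, and the clip duration is assumed to be known already.

```python
from pathlib import Path

# Formats and per-detection limits as stated above; the keys are illustrative.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".m4a"}
MAX_SECONDS = {"free": 120, "paid": 600}


def can_upload(filename: str, duration_seconds: float, plan: str = "free") -> bool:
    """Return True when the file type and clip length fit the plan."""
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        return False
    return duration_seconds <= MAX_SECONDS[plan]
```

For example, `can_upload("voicemail.m4a", 95)` passes on the free plan, while a five-minute recording needs a paid plan.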
Yes. A free account includes 50 detections a month across voice, image, and video, with no card required. Paid plans add higher quotas, longer clips, API access, and exports.
Not in real time. The detector analyzes uploaded recordings rather than monitoring live calls. If a call feels suspicious, record it where lawful or save the voicemail, then upload the audio for a verdict.
Listen for flat or mismatched emotion, missing breaths, unnaturally even pacing, and a sterile, room-free background. No single sign is proof, so stack a few cues and then confirm with the detector.
Files are deleted from primary storage within 60 seconds of analysis unless you opt into retention. A SOC 2 audit is in progress as part of our security program.
The free plan includes 50 detections a month. Starter, at $49/mo, includes 1,000 detections a month. Enterprise scales to 20,000.