# Check what a file really is, beyond the extension

Drop in any file (PDF, executable, archive, image, video, audio, source code) and we inspect the actual bytes (plus an AI pass for text files) to check whether they match the extension. The verdict is MATCH, MISMATCH, AMBIGUOUS, or UNKNOWN. Everything runs in your browser, with no upload.

## Quick facts

- **Where it runs:** Both the byte-level check and the AI pass run in your browser, on your device. Nothing is uploaded to our servers.
- **Formats supported:** Byte-level check covers two dozen well-known formats (PDF, PNG, JPEG, ZIP, RAR, MP4, Linux and macOS executables, Windows EXE / DLL, DOCX, and more). The AI pass recognises 200+ file types, including text-based ones (Python, Go, JSON, YAML, and more).
- **Verdict vocabulary:** MATCH (extension and bytes agree), MISMATCH (extension and bytes disagree), AMBIGUOUS (more than one format fits, or the AI is unsure), UNKNOWN (no recognised format, often a text-based file on mobile).
- **When the AI pass runs:** The AI pass runs only when the byte-level check returns AMBIGUOUS or UNKNOWN, the device has at least 4 GB of memory, and the file is under 20 MB. Confident byte-level matches skip it so the 7 MB AI download is never wasted.
- **Free limit:** Ten verifications per IP per UTC day, with spam protection. The paid batch flow verifies up to 50 files in one job at 1 credit per file.
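
The three conditions for the AI pass can be read as a single gate. The sketch below is illustrative only and is not the tool's own code: the names `Verdict`, `should_run_ai_pass`, and the two constants are hypothetical, and it assumes 1 MB means 1024 * 1024 bytes.

```python
from enum import Enum


class Verdict(Enum):
    MATCH = "MATCH"
    MISMATCH = "MISMATCH"
    AMBIGUOUS = "AMBIGUOUS"
    UNKNOWN = "UNKNOWN"


MIN_DEVICE_MEMORY_GB = 4               # threshold stated in the quick facts
MAX_AI_FILE_BYTES = 20 * 1024 * 1024   # assumption: 1 MB = 1024 * 1024 bytes


def should_run_ai_pass(byte_verdict: Verdict,
                       device_memory_gb: float,
                       file_size_bytes: int) -> bool:
    """Return True only when all three stated conditions hold."""
    inconclusive = byte_verdict in (Verdict.AMBIGUOUS, Verdict.UNKNOWN)
    enough_memory = device_memory_gb >= MIN_DEVICE_MEMORY_GB
    small_enough = file_size_bytes < MAX_AI_FILE_BYTES
    return inconclusive and enough_memory and small_enough
```

Under these assumptions a confident MATCH or MISMATCH never triggers the model download, whatever the device or file size.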

## FAQ

### Which file types can this identify?

The instant byte-level check covers the well-known formats: PDF, PNG, JPEG, GIF, WebP, AVIF, HEIC, TIFF, MP3, MP4, MOV, ZIP, RAR, 7z, GZIP, TAR, ISO, DOCX, XLSX, PPTX, Linux and macOS executables, Windows EXE and DLL, APK, JAR, and several dozen more. The AI pass extends that to more than 200 file types, including text-based files that no byte-level check can spot, like CSV, JSON, Markdown, YAML, TOML, and most popular programming languages (Python, Go, Rust, JavaScript, TypeScript, Ruby, Java, C, C++, shell scripts).

### Is my file uploaded?

No. The byte-level check and the AI pass both run entirely in your browser. The first few kilobytes of the file are read into memory for the byte-level check; on desktop computers with at least 4 GB of RAM, an inconclusive result (AMBIGUOUS or UNKNOWN) also sends the file through the AI pass locally. The file's contents never cross the network.

### How is this different from a virus scanner?

It identifies the file's format, not whether its contents are safe. A MATCH verdict tells you the bytes really do look like a PDF; it does NOT tell you the PDF is harmless. We never claim malware detection. For an actual virus or malware verdict, use an antivirus product. The file type checker is a complementary tool that catches the simpler trick of a renamed extension.

### What does the AMBIGUOUS verdict mean?

Some files are valid as more than one format at once (for example, a PDF that is also a valid ZIP). We list every candidate so you know exactly which formats fit. We also return AMBIGUOUS when the AI cannot decide between its top two guesses, which usually means the file is short or unusual.
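
The byte-level part of that logic can be sketched as a small decision function. This is a minimal illustration, not the tool's implementation: `decide_verdict` is a hypothetical name, candidate formats are assumed to be given as lowercase extension-style strings, and extension aliases (such as `jpg` and `jpeg`) and the AI tie-break case are left out.

```python
from typing import List


def decide_verdict(extension: str, candidates: List[str]) -> str:
    """Map an extension plus the formats that fit the bytes to a verdict.

    `candidates` is assumed to hold every format the bytes are consistent
    with, written like extensions ("pdf", "zip").
    """
    ext = extension.lower().lstrip(".")
    if not candidates:
        return "UNKNOWN"        # no recognised format
    if len(candidates) > 1:
        return "AMBIGUOUS"      # more than one format fits
    return "MATCH" if candidates[0] == ext else "MISMATCH"
```

For example, `decide_verdict(".pdf", ["pdf", "zip"])` yields AMBIGUOUS, and the caller can list both candidates alongside the verdict.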

### What is the file size limit?

50 MB for the byte-level check and 20 MB for the AI pass (the smaller limit is a mobile safety clamp so phones with low memory do not crash). Files over 20 MB get the byte-level verdict only. The paid batch flow handles bulk checking with no per-file limit beyond the standard ceiling of 50 files per job.

### Why does the AI pass only run on desktop?

The AI model is roughly 7 MB to download and uses around 100 MB of memory while it runs. On phones with less than 4 GB of memory we skip the AI pass to avoid the browser tab being killed, and the page shows a small notice explaining why only the byte-level check ran.

## File-type glossary

### Magic number

A short, fixed byte sequence at the start of a file that identifies its format independent of the extension. Examples: `25 50 44 46` for PDF, `89 50 4E 47` for PNG, `50 4B 03 04` for ZIP, `4D 5A` for a Windows PE, `7F 45 4C 46` for a Linux ELF binary.
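
A prefix lookup over the signatures listed above might look like the sketch below. It is illustrative only: `MAGIC_NUMBERS` and `sniff` are hypothetical names, and the table holds just the five examples from this entry rather than the tool's full signature set.

```python
from typing import Optional

# The five example signatures from the glossary entry, keyed by raw bytes.
MAGIC_NUMBERS = {
    bytes.fromhex("25504446"): "PDF",
    bytes.fromhex("89504E47"): "PNG",
    bytes.fromhex("504B0304"): "ZIP",
    bytes.fromhex("4D5A"): "Windows PE",
    bytes.fromhex("7F454C46"): "Linux ELF",
}


def sniff(head: bytes) -> Optional[str]:
    """Return the format whose magic number prefixes `head`, or None."""
    for magic, name in MAGIC_NUMBERS.items():
        if head.startswith(magic):
            return name
    return None
```

Calling `sniff(b"%PDF-1.7\n")` returns `"PDF"` regardless of what the filename claims, while plain text such as `b"hello"` returns `None`.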

### AI content classifier

A trained model that recognises file formats from the byte stream alone, without relying on a fixed magic-byte signature. The classifier is the second pass that closes the gap left by magic bytes on textual formats (source code, configuration, markup); it covers more than 200 content types and runs locally in the browser.

### Polyglot file

A file that is valid as two or more formats at the same time (a PDF that is also a ZIP, a JPEG that is also an HTML document). Polyglots return AMBIGUOUS in the verdict so you can see exactly which formats fit.
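
The PDF-plus-ZIP case works because a PDF is recognised from its start while a ZIP is located from its end. The sketch below builds a toy polyglot and lists the formats that fit. It is an assumption-laden illustration: `build_pdf_zip_polyglot` and `candidate_formats` are hypothetical helpers, the PDF part carries only a header and trailer marker (it would not render), and only two formats are checked.

```python
import io
import zipfile
from typing import List


def build_pdf_zip_polyglot() -> bytes:
    """Toy polyglot: a PDF header followed by a real ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("note.txt", "hidden payload")
    return b"%PDF-1.4\n%%EOF\n" + buf.getvalue()


def is_zip(data: bytes) -> bool:
    """True if the standard library can open `data` as a ZIP archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            zf.namelist()
        return True
    except zipfile.BadZipFile:
        return False


def candidate_formats(data: bytes) -> List[str]:
    """List every checked format that the bytes are consistent with."""
    candidates = []
    if data.startswith(b"%PDF"):
        candidates.append("pdf")
    if is_zip(data):
        candidates.append("zip")
    return candidates
```

For the toy polyglot, `candidate_formats` reports both `pdf` and `zip`, which is the situation that would surface as AMBIGUOUS.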

### PE (Portable Executable)

The native executable format on Windows. Identified by the leading `4D 5A` magic (`MZ`, named for Mark Zbikowski, the original DOS executable header designer). Files renamed from `.exe` to `.pdf` keep the `4D 5A` header and surface as MISMATCH.
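
The two-byte `MZ` prefix is short, so a stricter check can follow the header to the `PE\0\0` signature, whose offset is stored as a little-endian 32-bit value at byte `0x3C`. The sketch below shows that check; whether this tool performs it is not stated here, and `looks_like_pe` and `build_minimal_stub` are hypothetical names.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check the MZ magic, then the PE signature that the header points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def build_minimal_stub() -> bytes:
    """Smallest byte string that passes the check (not a runnable program)."""
    header = bytearray(0x40)
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)
    return bytes(header) + b"PE\x00\x00"
```

A file that starts with `MZ` but has no PE signature at the recorded offset fails this check, which helps separate real Windows executables from files that merely begin with those two letters.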

### ELF

Executable and Linkable Format. The native executable / shared-library format on Linux, BSD, and several other Unix systems. Magic `7F 45 4C 46` (`\x7FELF`).
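
The two bytes after the magic carry extra detail: byte 4 gives the word size and byte 5 the byte order. A minimal sketch of reading them follows; `describe_elf` is a hypothetical helper and is not taken from the tool.

```python
from typing import Optional

ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}               # e_ident[4]
ELF_BYTE_ORDERS = {1: "little-endian", 2: "big-endian"}  # e_ident[5]


def describe_elf(head: bytes) -> Optional[str]:
    """Describe an ELF header, or return None if the magic is absent."""
    if len(head) < 6 or not head.startswith(ELF_MAGIC):
        return None
    word_size = ELF_CLASSES.get(head[4], "unknown class")
    byte_order = ELF_BYTE_ORDERS.get(head[5], "unknown byte order")
    return f"ELF {word_size} {byte_order}"
```

A typical x86-64 Linux binary begins `7F 45 4C 46 02 01`, which this helper describes as a 64-bit little-endian ELF.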

### Mach-O

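The native executable format on macOS and iOS. Magic varies by word size and byte order: `FE ED FA CE` (32-bit) and `FE ED FA CF` (64-bit), which appear on disk as `CE FA ED FE` and `CF FA ED FE` in little-endian builds, plus `CA FE BA BE` for fat universal binaries. Useful for catching macOS executables relabelled as documents.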
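
Because the magic varies, a lookup needs several entries. The sketch below is an illustration with hypothetical names (`MACH_O_MAGICS`, `classify_mach_o`). It goes slightly beyond the list above in two ways: it includes the byte-swapped forms found at the start of little-endian Mach-O files, and it notes that `CA FE BA BE` is also the magic of Java class files, so that prefix alone cannot settle the format.

```python
from typing import Optional

MACH_O_MAGICS = {
    bytes.fromhex("FEEDFACE"): "Mach-O 32-bit (big-endian on disk)",
    bytes.fromhex("FEEDFACF"): "Mach-O 64-bit (big-endian on disk)",
    bytes.fromhex("CEFAEDFE"): "Mach-O 32-bit (little-endian on disk)",
    bytes.fromhex("CFFAEDFE"): "Mach-O 64-bit (little-endian on disk)",
    # Shared with Java class files, so the magic alone is not decisive.
    bytes.fromhex("CAFEBABE"): "fat universal binary or Java class file",
}


def classify_mach_o(head: bytes) -> Optional[str]:
    """Return a description for a known Mach-O magic, or None."""
    return MACH_O_MAGICS.get(head[:4])
```

The shared `CA FE BA BE` prefix is an example of a signature where more than one format fits until further bytes are examined.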

## Related reading

- [What is a magic number?](/what-is-a-magic-number) - Plain-English explainer of magic-byte signatures, how they identify files, and the canonical examples (PDF, PNG, ZIP, PE, ELF, Mach-O).
- [File extension vs file type](/file-extension-vs-file-type) - The extension is a label; the file type is what the bytes actually say. Here is exactly how the two diverge and what to do when they disagree.
- [Is my file safe? Magic-byte vs malware scanners](/is-my-file-safe) - Honest scope-setting for what this tool can and cannot tell you. A magic-byte check identifies the format; only an antivirus tells you about malicious payloads.
- [Is this really a PDF?](/is-this-really-a-pdf) - Per-format guide focused on PDFs, with the exact magic-byte signature and the most common PDF-as-something-else attack patterns.

---

Canonical URL: https://bousemutton.com/file-type-checker
Last updated: 2026-04-28
Please cite as: BouseMutton (2026). File type checker, identify any file beyond its extension [Web application]. https://bousemutton.com/file-type-checker
