# File extension vs file type

> A file extension is a label after the last dot. A file type is what the bytes inside actually represent. Most of the time they agree; when they do not, the bytes are the truth.

Source: <https://bousemutton.com/file-extension-vs-file-type>

### Key facts

- **Extension:** The trailing label after the last dot in a filename. Trivial to change. A hint, not a contract.
- **File type:** The format identified by the bytes inside the file. Detected by reading the first few bytes (the magic-byte signature).
- **When they diverge:** When someone renames the file, when an extension was never set, or when the file is a polyglot (valid as more than one format at once).
- **How we report it:** Four verdicts: MATCH, MISMATCH, AMBIGUOUS, UNKNOWN. Each is a deterministic answer about format identity.
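
The comparison described in the key facts can be sketched in a few lines of Python. This is a minimal illustration, not the checker's actual implementation: the signature table is deliberately tiny, the names `SIGNATURES`, `extension_of`, `verdict`, and `check_file` are hypothetical, and the rule used for AMBIGUOUS (a container signature such as ZIP paired with an extension that is built on that container) is an assumption, since the text does not define how each verdict is assigned.

```python
from pathlib import Path

# Illustrative table only: magic bytes -> (extensions the header confirms,
# extensions the header is merely consistent with). Real tools know far more formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": ({"png"}, set()),
    b"\xff\xd8\xff": ({"jpg", "jpeg"}, set()),
    b"%PDF-": ({"pdf"}, set()),
    b"PK\x03\x04": ({"zip"}, {"docx", "xlsx", "pptx"}),
}


def extension_of(filename: str) -> str:
    """Return the lowercased label after the last dot, or '' if there is none."""
    return Path(filename).suffix.lstrip(".").lower()


def verdict(filename: str, head: bytes) -> str:
    """Compare the extension against the leading bytes of the file."""
    ext = extension_of(filename)
    for magic, (confirmed, consistent) in SIGNATURES.items():
        if head.startswith(magic):
            if ext in confirmed:
                return "MATCH"
            if ext in consistent:
                # Assumption: the header alone cannot confirm a ZIP-based format.
                return "AMBIGUOUS"
            # In this sketch a missing extension also lands here.
            return "MISMATCH"
    return "UNKNOWN"  # no known signature at the start of the file


def check_file(path: str) -> str:
    """Read only the first few bytes of a file on disk and report a verdict."""
    with open(path, "rb") as handle:
        return verdict(path, handle.read(16))
```

For example, `verdict("photo.jpg", b"\x89PNG\r\n\x1a\n")` returns `"MISMATCH"` in this sketch, because the bytes start with the PNG signature while the name claims JPEG.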

### Frequently asked questions

#### Why does a DOCX file have the same magic bytes as a ZIP?

Because DOCX is a ZIP archive. Microsoft Office's Open XML formats (DOCX, XLSX, PPTX) are all ZIP containers carrying XML and embedded media. The magic-byte check sees only the ZIP envelope; the OOXML manifest inside the archive (`[Content_Types].xml`) and the parts it lists distinguish the document type.
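
One way to look past the ZIP envelope is to list the archive's entries. The sketch below uses Python's standard `zipfile` module and assumes the conventional OOXML package layout (a `[Content_Types].xml` entry at the root plus a main part such as `word/document.xml`). The function name `classify_zip` and the `MAIN_PARTS` table are hypothetical, and checking entry names is a heuristic rather than full validation.

```python
import zipfile

# Main part conventionally present in each OOXML document type (assumed layout).
MAIN_PARTS = {
    "word/document.xml": "docx",
    "xl/workbook.xml": "xlsx",
    "ppt/presentation.xml": "pptx",
}


def classify_zip(path_or_file) -> str:
    """Classify a ZIP-based file by the entry names found inside the archive."""
    if not zipfile.is_zipfile(path_or_file):
        return "not a zip"
    with zipfile.ZipFile(path_or_file) as archive:
        names = set(archive.namelist())
    if "[Content_Types].xml" not in names:
        return "zip"  # plain archive, no OOXML manifest
    for part, kind in MAIN_PARTS.items():
        if part in names:
            return kind
    return "ooxml (unrecognized)"
```

`classify_zip` accepts either a path or an open binary file object, so it can run on an upload held in memory without writing it to disk.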

#### Can two different formats share the same extension?

Yes. `.dat`, `.bin`, `.tmp`, and many vendor-specific extensions are reused freely. The bytes are still the source of truth; the extension is just an unreliable hint.

#### Is changing the extension ever the right move?

Sometimes. If a file has the bytes of a JPEG but no extension, naming it `.jpg` is correct. The error is in the other direction: changing the extension to disguise what the file is.
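
The legitimate case can be expressed as a small guard: add `.jpg` only when the name has no extension and the bytes begin with the JPEG signature (`FF D8 FF`). This is a sketch under those assumptions; `suggested_name` is a hypothetical helper that only proposes a name and never renames anything on disk.

```python
from pathlib import Path

JPEG_MAGIC = b"\xff\xd8\xff"  # leading bytes of a JPEG file


def suggested_name(filename: str, head: bytes) -> str:
    """Propose adding .jpg only for an extensionless name whose bytes start like a JPEG."""
    if Path(filename).suffix == "" and head.startswith(JPEG_MAGIC):
        return filename + ".jpg"
    # Existing extensions are left alone; this helper never overrides a label.
    return filename
```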

#### What if my operating system disagrees with your verdict?

Operating systems decide how to handle a file in different ways: some inspect the magic bytes, while others rely mainly on the extension and on cached file-type associations. Our checker reports what the bytes actually say, regardless of any cached association. If the two disagree, trust the bytes.

> Drop the file into our File Type Checker. Four verdicts cover every case: MATCH, MISMATCH, AMBIGUOUS, UNKNOWN.
