BOM (Byte Order Mark)

A byte sequence at the start of a file that identifies the encoding: EF BB BF for UTF-8, and FF FE (little-endian) or FE FF (big-endian) for UTF-16.

A BOM (Byte Order Mark) is a special byte sequence placed at the beginning of a text file to indicate the encoding type and byte order. It is the encoded form of Unicode character U+FEFF.
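As a quick check of that relationship, the following minimal Python sketch encodes U+FEFF in each encoding and prints the resulting bytes in hex. The variable names are illustrative only. The endian-specific codec names are used because Python's plain "utf-16" codec picks the platform's byte order when encoding.

```python
# Encode the single character U+FEFF and show the bytes each encoding produces.
BOM_CHAR = "\ufeff"

encoded = {
    name: BOM_CHAR.encode(name)
    for name in ("utf-8", "utf-16-le", "utf-16-be")
}

for name, raw in encoded.items():
    # bytes.hex(sep) requires Python 3.8 or later.
    print(name, raw.hex(" ").upper())
```

The printed byte sequences should match the ones listed in the definition above.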

In UTF-16, the BOM tells a reader which byte order the file uses when that is not declared some other way: FE FF means big-endian and FF FE means little-endian. In UTF-8, the BOM (EF BB BF) serves only as an encoding identifier, since UTF-8 has no byte order concept.
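A reader can apply this by comparing the first bytes of a file against the known BOM values. The sketch below is one possible approach, not a complete encoding detector. The function name detect_bom is hypothetical, and it covers only the UTF-8 and UTF-16 marks discussed here.

```python
import codecs

# Known BOMs and the encoding each one signals.
# Caveat: the UTF-32 little-endian BOM (FF FE 00 00) also begins with FF FE;
# this sketch does not try to distinguish that case.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no known BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

A caller would typically read the first few bytes of the file, pass them to detect_bom, and decode the remainder starting at the returned offset. A result of (None, 0) means only that no BOM was found, not that the encoding is known.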

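The UTF-8 BOM can cause problems in programs that do not expect it and treat the three bytes as ordinary content. In a shell script, the BOM sits in front of the #! line, so the interpreter line is not recognized. In a PHP file, the BOM is sent as output ahead of the opening <?php tag, which can trigger "headers already sent" errors. Shell scripts and PHP files should therefore be saved as UTF-8 without BOM.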
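If a file has already been saved with a UTF-8 BOM, the three bytes can be removed before the file is used. The following is a minimal sketch. The helper name strip_utf8_bom is hypothetical, and it works on bytes already read into memory rather than rewriting a file in place.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# When decoding to text, Python's "utf-8-sig" codec skips a leading BOM
# and otherwise behaves like "utf-8".
text = b"\xef\xbb\xbf#!/bin/sh\n".decode("utf-8-sig")
```

Input without a BOM is returned unchanged, so the helper can be applied to any UTF-8 file.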

Windows Notepad used to add a BOM by default when saving as UTF-8, but recent versions (since Windows 10 version 1903) default to UTF-8 without BOM.