XML

Extensible Markup Language: a markup language that describes data structure using tags.

XML (Extensible Markup Language) is a general-purpose markup language that describes data structure using tags. Standardized by the W3C in 1998, it derives from SGML, as HTML does, but lets authors define their own tags freely. Because it expresses the meaning and structure of data explicitly, it has been widely adopted as a data exchange format between different systems.

XML has a rich ecosystem of related technologies. XSD (XML Schema Definition) handles schema validation, XSLT (XSL Transformations) handles transformation, XPath selects elements, and namespaces prevent vocabulary conflicts. Together they provide comprehensive tooling for large, complex data structures. In practice, XML is used in sitemap.xml (search engine sitemaps), RSS/Atom feeds (news syndication), SOAP (web service protocol), SVG (vector graphics), Office documents (OOXML), and many other domains.
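As a minimal sketch of how namespaces and path-based selection work together, the following reads a small sitemap with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. The sitemap content, the example.com URLs, the `sm` prefix, and the `list_locations` helper are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap content; the URLs are placeholders.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-01-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

# Map a local prefix (our choice) to the sitemap namespace URI.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_locations(xml_text):
    """Return the text of every <loc> element nested under a <url> element."""
    root = ET.fromstring(xml_text)
    # The path must be namespace-qualified: a bare "url/loc" would match nothing,
    # because every element here lives in the sitemap namespace.
    return [loc.text for loc in root.findall("sm:url/sm:loc", NS)]


print(list_locations(SITEMAP))
```

The prefix used in the query does not have to match any prefix in the document; only the namespace URI is compared.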

XML syntax rules are stricter than HTML's. Every tag must be closed (an empty element is written <br />), attribute values must be quoted, names are case-sensitive, and a document has exactly one root element. Special characters must be escaped with entity references: < and & always (&lt;, &amp;), and >, ", and ' wherever they would otherwise be ambiguous, such as quotes inside attribute values (&gt;, &quot;, &apos;). This strictness simplifies parser implementation and improves data reliability.
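The rules above can be checked with any conforming parser. The sketch below uses Python's standard library; the `is_well_formed` helper and the sample strings are hypothetical, chosen so that each one breaks a single rule.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample inputs, one per rule.
samples = {
    "closed empty element": "<p><br /></p>",
    "unclosed tag": "<p><br></p>",
    "unquoted attribute": "<p id=1></p>",
    "case mismatch": "<Name></name>",
    "two root elements": "<a></a><b></b>",
    "raw ampersand": "<p>Q&A</p>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))

# escape() replaces &, < and > with entity references,
# so the escaped text is safe to place inside an element.
safe = escape("Q&A: 1 < 2")
print("<p>" + safe + "</p>", "->", is_well_formed("<p>" + safe + "</p>"))
```

Only the first sample and the escaped paragraph parse; the parser rejects every other input outright instead of guessing at a repair, as HTML parsers typically do.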

Although JSON is replacing XML as the response format for REST APIs, XML is still preferred in certain scenarios: when rich document structure must be expressed (mixed content, attributes coexisting with text), when strict schema validation is required, when vocabularies must be managed through namespaces, or when compatibility with existing systems is demanded. XML is also established as an industry standard in fields such as finance (for example FIXML, the XML encoding of the FIX protocol) and healthcare (for example HL7 version 3 and CDA).
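Mixed content is perhaps the easiest of these cases to see in code. In the sketch below, a hypothetical paragraph element carries an attribute, plain text, and a child element at the same time. ElementTree exposes the text before a child as `text` and the text after it as `tail`.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: text and a child element interleaved inside <p>.
doc = ET.fromstring('<p lang="en">XML is <em>strict</em> about syntax.</p>')
em = doc.find("em")

print(doc.attrib)   # attributes of <p>
print(doc.text)     # text before <em>
print(em.text)      # text inside <em>
print(em.tail)      # text after </em>, still inside <p>

# itertext() walks the text in document order, ignoring the markup.
plain = "".join(doc.itertext())
print(plain)
```

JSON has no native equivalent for text interleaved with child nodes; representing it requires an ad hoc convention such as an array that alternates strings and objects.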

Compared with JSON, XML tends to be verbose because of its tags and attributes, but it supports schema validation and comments. JSON is simpler and lighter, but it has no comment syntax and relies on JSON Schema, a separate specification, for validation.

From a character-count perspective, XML's rich syntactic elements (tags, attributes) produce higher character counts than JSON or YAML for the same data. For example, <name>Taro</name> is 17 characters, while JSON's "name": "Taro" is 14 characters. This overhead should be considered when optimizing API response sizes or reducing data transfer volumes.
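The counts for a single field can be checked directly. The sketch below measures the two snippets as plain strings; the variable names are arbitrary, and the figures cover only this one field, not a complete document.

```python
# The same field written in each format, as plain strings.
xml_field = "<name>Taro</name>"
json_field = '"name": "Taro"'

xml_len = len(xml_field)
json_len = len(json_field)

print("XML :", xml_len, "characters")
print("JSON:", json_len, "characters")

# The XML overhead comes from repeating the element name in the closing tag.
print("difference:", xml_len - json_len)
```

The gap grows with longer element names, since XML writes each name twice while JSON writes the key once.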