
Simple and small iconv implementation
=====================================

SPDX-FileType: DOCUMENTATION
SPDX-FileCopyrightText: NONE
SPDX-License-Identifier: CC0-1.0


General
-------
This library provides a simple and small iconv implementation for conversion
from common 8-bit codepages (including US-ASCII) to Unicode (including its
subsets ISO-8859-1 and US-ASCII).

The development goals of this implementation are:
- C90 conformance
- Simplicity
- No memory (re)allocation
- Support for reproducible builds

Metadata for pkg-config is provided.


API
---
This library occupies the namespaces with the "ssic0_" and "SSIC0_" prefixes.
The namespaces with the "ssic0_i_" and "SSIC0_I_" prefixes are reserved for
internal use. Never use them outside of the library (the shared library
may not even export such symbols).

The API with stripped namespace prefixes is intended to be compatible with
the iconvstr() function of Solaris 11:
<https://docs.oracle.com/cd/E88353_01/html/E37843/iconvstr-3c.html>

No global context is used, therefore no initialization or shutdown
functions are present.


Error handling
--------------
Some fatal errors may be handled internally via assert(). Such errors,
like internal data corruption, indicate bugs in the library.
For maximum performance, these internal checks can be disabled with:

   CPPFLAGS=-DNDEBUG

The build system of the library enables this option by default.


Versioning scheme
-----------------
The release version contains 3 numbers "x.y.z":

- Major (x)
  The major number is incremented for every API/ABI change that is not
  backward compatible (with the exception of major version 0).

- Minor (y)
  The minor number is incremented for API/ABI extensions that are
  backward compatible.

- Patch (z)
  The patch number is incremented for changes that don't change the
  API/ABI.

In other words:

- Releases with the same major and minor numbers are drop-in replacements.
  Up- and downgrades between such versions are possible without touching
  programs that use the library.

- Releases with the same major but different minor numbers are backward
  compatible, but not forward compatible. Upgrades are possible; downgrades
  can break programs that use the library.

- Releases with different major numbers require changes in all programs that
  use the library.

Versions with different major numbers can be installed in parallel
(including header files).
Binaries can be linked against multiple instances of the library with
different major versions (e.g. if dependencies are using different
versions).
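
The compatibility rules of this section can be written down as a small
check. The following is only a sketch: the function name is hypothetical and
not part of the library, and the handling of major version 0 (exact minor
match required) is a conservative assumption, because this document only
states that version 0 is an exception.

```c
/*
 * Hypothetical helper, not part of the library.
 * Return 1 if a program built against version built_major.built_minor
 * is expected to work with an installed version inst_major.inst_minor,
 * otherwise return 0.
 * The patch number is not needed: patch releases do not change the API/ABI.
 */
int example_version_compatible(unsigned int built_major,
                               unsigned int built_minor,
                               unsigned int inst_major,
                               unsigned int inst_minor)
{
    /* Different major numbers are never compatible */
    if (built_major != inst_major)
        return 0;

    /* Assumption: for major version 0 require the same minor number */
    if (built_major == 0U)
        return built_minor == inst_minor;

    /* Same major: upgrades of the minor number are backward compatible */
    return inst_minor >= built_minor;
}
```

With these rules a program built against 1.2.z is accepted with an installed
1.3.z, but rejected with 1.1.z or 2.0.z.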


Thread safety
-------------
All API functions are thread-safe if "assert()" is thread-safe.
Otherwise "NDEBUG" must be defined to compile a thread-safe library.


Unicode normalization
---------------------
No normalization form is guaranteed for Unicode output data.
If normalized data is needed, an external normalization step is required
after conversion.


Supported encodings
-------------------
As defined by IANA, encoding names are matched case-insensitively:
<https://www.iana.org/assignments/character-sets/character-sets.xhtml>

Optionally, the registered alias names are accepted too (only for source
encodings).
The library enables this option by default.

Optionally, for better error tolerance, some nonstandard variants of the
names are accepted too (only for source encodings).
The library enables this option by default.
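
Case-insensitive matching of names, aliases and nonstandard variants could
look roughly like the following C90 sketch. All identifiers are hypothetical
and this is not the library's actual lookup code. The table holds only a few
entries copied from the lists in this document, and an ASCII execution
character set is assumed.

```c
#include <stddef.h>

/* Fold an ASCII uppercase letter to lowercase, pass everything else */
int example_ascii_lower(int c)
{
    if (c >= 'A' && c <= 'Z')
        return c - 'A' + 'a';
    return c;
}

/* Return 1 if both names are equal ignoring ASCII case, otherwise 0 */
int example_name_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (example_ascii_lower((unsigned char)*a) !=
            example_ascii_lower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    /* Equal only if both strings ended at the same position */
    return *a == *b;
}

struct example_alias
{
    const char *accepted;   /* Name, alias or nonstandard variant */
    const char *canonical;  /* Name as listed in this document */
};

/* Small excerpt for illustration, not the complete list */
static const struct example_alias example_table[] =
{
    { "US-ASCII",     "US-ASCII"     },
    { "ISO646-US",    "US-ASCII"     },
    { "ISO-8859-1",   "ISO-8859-1"   },
    { "LATIN1",       "ISO-8859-1"   },
    { "ISO8859-1",    "ISO-8859-1"   },
    { "Windows-1252", "Windows-1252" },
    { "CP1252",       "Windows-1252" }
};

/* Return the canonical name for an accepted name, or NULL if unknown */
const char *example_canonical_name(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof example_table / sizeof example_table[0]; ++i)
    {
        if (example_name_equal(name, example_table[i].accepted))
            return example_table[i].canonical;
    }
    return NULL;
}
```

A linear search over a constant table needs no memory allocation, which fits
the development goals listed at the top of this document.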


Supported target encodings
--------------------------
- Unicode (with UTF-8 transformation format)
  Name: "UTF-8"

- ISO 8859-1 with ISO 6429 control characters (a subset of Unicode)
  Name: "ISO-8859-1"

- ANSI X3.4 (a subset of Unicode)
  Name: "US-ASCII"
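
To illustrate what a conversion from an 8-bit source encoding to the "UTF-8"
target involves, here is a minimal C90 sketch for the simplest case,
ISO 8859-1, where every byte value equals its Unicode code point. The
function name and its parameters are hypothetical and do not belong to the
library's API. The caller provides the output buffer, so no memory is
allocated.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the library.
 * Convert inlen bytes of ISO 8859-1 data to UTF-8.
 * Returns the number of bytes written to out, or (size_t)-1 if the
 * output buffer with size outlen is too small.
 */
size_t example_latin1_to_utf8(const unsigned char *in, size_t inlen,
                              unsigned char *out, size_t outlen)
{
    size_t i;
    size_t o = 0;

    for (i = 0; i < inlen; ++i)
    {
        unsigned int cp = in[i];  /* Byte value equals code point */

        if (cp < 0x80U)
        {
            /* US-ASCII range is encoded as a single byte */
            if (outlen - o < 1)
                return (size_t)-1;
            out[o++] = (unsigned char)cp;
        }
        else
        {
            /* U+0080..U+00FF is encoded as a two byte sequence */
            if (outlen - o < 2)
                return (size_t)-1;
            out[o++] = (unsigned char)(0xC0U | (cp >> 6));
            out[o++] = (unsigned char)(0x80U | (cp & 0x3FU));
        }
    }
    return o;
}
```

In the worst case the output is twice as long as the input, so a buffer of
2 * inlen bytes is always sufficient for this particular source encoding.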


Supported source encodings
--------------------------
- ANSI X3.4
  Name       : "US-ASCII"
  Aliases    : "ANSI_X3.4-1968", "ANSI_X3.4-1986", "ISO-IR-6",
               "ISO_646.IRV:1991", "ISO646-US", "IBM367", "CP367", "CSASCII",
               "US"
  Nonstandard: "US_ASCII"

- ISO 8859-1 with ISO 6429 control characters
  Name       : "ISO-8859-1"
  Aliases    : "ISO_8859-1:1987", "ISO_8859-1", "ISO-IR-100", "IBM819", "CP819",
               "CSISOLATIN1", "LATIN1", "L1"
  Nonstandard: "ISO8859-1"

- ISO 8859-2 with ISO 6429 control characters
  Name       : "ISO-8859-2"
  Aliases    : "ISO_8859-2:1987", "ISO_8859-2", "ISO-IR-101", "CSISOLATIN2",
               "LATIN2", "L2"
  Nonstandard: "ISO8859-2"

- ISO 8859-3 with ISO 6429 control characters
  Name       : "ISO-8859-3"
  Aliases    : "ISO_8859-3:1988", "ISO_8859-3", "ISO-IR-109", "CSISOLATIN3",
               "LATIN3", "L3"
  Nonstandard: "ISO8859-3"

- ISO 8859-4 with ISO 6429 control characters
  Name       : "ISO-8859-4"
  Aliases    : "ISO_8859-4:1988", "ISO_8859-4", "ISO-IR-110", "CSISOLATIN4",
               "LATIN4", "L4"
  Nonstandard: "ISO8859-4"

- ISO 8859-5 with ISO 6429 control characters
  Name       : "ISO-8859-5"
  Aliases    : "ISO_8859-5:1988", "ISO_8859-5", "ISO-IR-144",
               "CSISOLATINCYRILLIC", "CYRILLIC"
  Nonstandard: "ISO8859-5"

- ISO 8859-6 with ISO 6429 control characters
  Name       : "ISO-8859-6"
  Aliases    : "ISO_8859-6:1987", "ISO_8859-6", "ISO-IR-127", "ECMA-114",
               "ASMO-708", "CSISOLATINARABIC", "ARABIC"
  Nonstandard: "ISO8859-6"

- ISO 8859-7 with ISO 6429 control characters
  Name       : "ISO-8859-7"
  Aliases    : "ISO_8859-7:1987", "ISO_8859-7", "ISO-IR-126", "ECMA-118",
               "ELOT_928", "CSISOLATINGREEK", "GREEK8", "GREEK"
  Nonstandard: "ISO8859-7"

- ISO 8859-8 with ISO 6429 control characters
  Name       : "ISO-8859-8"
  Aliases    : "ISO_8859-8:1988", "ISO_8859-8", "ISO-IR-138", "HEBREW",
               "CSISOLATINHEBREW"
  Nonstandard: "ISO8859-8"

- ISO 8859-9 with ISO 6429 control characters
  Name       : "ISO-8859-9"
  Aliases    : "ISO_8859-9:1989", "ISO_8859-9", "ISO-IR-148", "CSISOLATIN5",
               "LATIN5", "L5"
  Nonstandard: "ISO8859-9"

- ISO 8859-10 with ISO 6429 control characters
  Name       : "ISO-8859-10"
  Aliases    : "ISO_8859-10:1992", "ISO-IR-157", "CSISOLATIN6", "LATIN6", "L6"
  Nonstandard: "ISO8859-10"

- ISO 8859-11 with ISO 6429 control characters
  Name       : "TIS-620"
  Aliases    : "ISO-8859-11", "CSTIS620"
  Nonstandard: "ISO_8859-11", "ISO8859-11"

- ISO 8859-13 with ISO 6429 control characters
  Name       : "ISO-8859-13"
  Aliases    : "CSISO885913"
  Nonstandard: "ISO8859-13"

- ISO 8859-14 with ISO 6429 control characters
  Name       : "ISO-8859-14"
  Aliases    : "ISO_8859-14:1998", "ISO_8859-14", "ISO-IR-199", "CSISO885914",
               "ISO-CELTIC", "LATIN8", "L8"
  Nonstandard: "ISO8859-14"

- ISO 8859-15 with ISO 6429 control characters
  Name       : "ISO-8859-15"
  Aliases    : "ISO_8859-15", "CSISO885915", "LATIN9"
  Nonstandard: "ISO8859-15"

- ISO 8859-16 with ISO 6429 control characters
  Name       : "ISO-8859-16"
  Aliases    : "ISO_8859-16:2001", "ISO_8859-16", "ISO-IR-226", "CSISO885916",
               "LATIN10", "L10"
  Nonstandard: "ISO8859-16"

- KOI8-R
  Name   : "KOI8-R"
  Aliases: "CSKOI8R"

- KOI8-U
  Name   : "KOI8-U"
  Aliases: "CSKOI8U"

- Windows codepage 1250
  Name       : "Windows-1250"
  Aliases    : "CSWINDOWS1250"
  Nonstandard: "CP1250"

- Windows codepage 1251
  Name       : "Windows-1251"
  Aliases    : "CSWINDOWS1251"
  Nonstandard: "CP1251"

- Windows codepage 1252
  Name       : "Windows-1252"
  Aliases    : "CSWINDOWS1252"
  Nonstandard: "CP1252"

- Windows codepage 1253
  Name       : "Windows-1253"
  Aliases    : "CSWINDOWS1253"
  Nonstandard: "CP1253"

- Windows codepage 1254
  Name       : "Windows-1254"
  Aliases    : "CSWINDOWS1254"
  Nonstandard: "CP1254"

- Windows codepage 1255
  Name       : "Windows-1255"
  Aliases    : "CSWINDOWS1255"
  Nonstandard: "CP1255"

- Windows codepage 1256
  Name       : "Windows-1256"
  Aliases    : "CSWINDOWS1256"
  Nonstandard: "CP1256"

- Windows codepage 1257
  Name       : "Windows-1257"
  Aliases    : "CSWINDOWS1257"
  Nonstandard: "CP1257"

- Windows codepage 1258
  Name       : "Windows-1258"
  Aliases    : "CSWINDOWS1258"
  Nonstandard: "CP1258"

- Macintosh Roman codepage
  Name       : "Macintosh"
  Aliases    : "CSMACINTOSH", "MAC"
  Nonstandard: "MACROMAN", "X-MAC-ROMAN"

- IBM codepage 437 (with lower half interpreted as US-ASCII)
  Name       : "IBM437"
  Aliases    : "CSPC8CODEPAGE437", "CP437", "437"
  Nonstandard: "CP-437"

- IBM codepage 775 (with lower half interpreted as US-ASCII)
  Name       : "IBM775"
  Aliases    : "CSPC775BALTIC", "CP775"
  Nonstandard: "CP-775"

- IBM codepage 850 (with lower half interpreted as US-ASCII)
  Name       : "IBM850"
  Aliases    : "CSPC850MULTILINGUAL", "CP850", "850"
  Nonstandard: "CP-850"

- IBM codepage 852 (with lower half interpreted as US-ASCII)
  Name       : "IBM852"
  Aliases    : "CSPCP852", "CP852", "852"
  Nonstandard: "CP-852"

- IBM codepage 858 (with lower half interpreted as US-ASCII)
  Name       : "IBM00858"
  Aliases    : "PC-MULTILINGUAL-850+EURO", "CSIBM00858", "CCSID00858",
               "CP00858"
  Nonstandard: "CP858", "CP-858", "IBM858"
