detect-charset

0.1.0 • Public • Published

Detect character set of file

Install

$ npm install --save detect-charset

Usage

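var fs = require('fs');
var detectCharset = require('detect-charset');

// Read the file as a raw Buffer (no encoding argument) so the bytes can be inspected.
var fileBuffer = fs.readFileSync('myfile.txt');
detectCharset(fileBuffer); // "latin1"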

Supported Features

latin1

Any file without a byte order mark or Unicode characters will return "latin1".

This includes valid UTF-8 files that contain no Unicode characters.

utf-8

Any file that contains Unicode characters but does not have a byte order mark will return "utf-8".

In the future, more advanced character set guessing techniques may be employed, but at the moment every file that contains Unicode characters and has no byte order mark is assumed to be UTF-8.
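The rule for files without a byte order mark can be sketched roughly as follows. This is a minimal illustration, not the package's actual source: the function name guessWithoutBom is hypothetical, and it assumes that "Unicode characters" means any byte outside the 7-bit ASCII range.

```javascript
// Hypothetical helper, not part of detect-charset's API.
// Assumption: "Unicode characters" means any byte above 0x7F (non-ASCII).
function guessWithoutBom(buffer) {
  for (var i = 0; i < buffer.length; i++) {
    if (buffer[i] > 0x7F) {
      // A non-ASCII byte is present, so the file is assumed to be UTF-8.
      return 'utf-8';
    }
  }
  // Pure ASCII content (including an empty file) falls back to latin1.
  return 'latin1';
}

guessWithoutBom(Buffer.from('plain ascii'));      // "latin1"
guessWithoutBom(Buffer.from('caf\u00e9', 'utf8')); // "utf-8"
```

Under this assumption, a genuine Latin-1 file containing accented characters would also be reported as "utf-8", which matches the stated limitation that no further guessing is attempted.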

utf-8-bom

Any file with a UTF-8 byte order mark ('\xEF\xBB\xBF') will return "utf-8-bom".

utf-16be

Any file with a UTF-16 big-endian byte order mark ('\xFE\xFF') will return "utf-16be".

utf-16le

Any file with a UTF-16 little-endian byte order mark ('\xFF\xFE') will return "utf-16le".

utf-32be

Any file with a UTF-32 big-endian byte order mark ('\x00\x00\xFE\xFF') will return "utf-32be".

utf-32le

Any file with a UTF-32 little-endian byte order mark ('\xFF\xFE\x00\x00') will return "utf-32le".
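The byte order mark checks described above can be sketched as a small lookup table. This is an illustrative sketch only, not the package's implementation: the names BOMS and sniffBom are hypothetical, and the byte sequences are the standard Unicode marks (FF FE 00 00 for UTF-32 little-endian). The table order is an assumption that matters, because the UTF-32 little-endian mark begins with the same two bytes as the UTF-16 little-endian mark.

```javascript
// Hypothetical sketch of byte order mark sniffing; not detect-charset's source.
// Longer marks come first: the UTF-32 LE mark starts with the UTF-16 LE mark.
var BOMS = [
  { name: 'utf-32be', bytes: [0x00, 0x00, 0xFE, 0xFF] },
  { name: 'utf-32le', bytes: [0xFF, 0xFE, 0x00, 0x00] },
  { name: 'utf-8-bom', bytes: [0xEF, 0xBB, 0xBF] },
  { name: 'utf-16be', bytes: [0xFE, 0xFF] },
  { name: 'utf-16le', bytes: [0xFF, 0xFE] }
];

// Returns the charset name for a leading byte order mark, or null if none matches.
function sniffBom(buffer) {
  for (var i = 0; i < BOMS.length; i++) {
    var bom = BOMS[i];
    if (buffer.length < bom.bytes.length) {
      continue; // Buffer is too short to hold this mark.
    }
    var matches = bom.bytes.every(function (byte, index) {
      return buffer[index] === byte;
    });
    if (matches) {
      return bom.name;
    }
  }
  return null;
}

sniffBom(Buffer.from([0xEF, 0xBB, 0xBF, 0x61])); // "utf-8-bom"
sniffBom(Buffer.from([0xFF, 0xFE, 0x61, 0x00])); // "utf-16le"
sniffBom(Buffer.from('abc'));                    // null
```

A null result here corresponds to the no-byte-order-mark case, where the "latin1" or "utf-8" rules apply.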

Contributing

In lieu of a formal style guide, take care to maintain the existing coding style. Add unit tests for any new or changed functionality. Lint and test your code using gulp.

License

Copyright (c) 2015 Trey Hunner. Licensed under the MIT license.
