Blob with conversion to 8-bit cp1251 or cp1252


I need a way to convert UTF text to 8-bit cp1251 or cp1252 and save it using a Blob.

I managed to modify https://github.com/b4stien/js-csv-encoding to include windows-1251, but I ran into problems I could not solve:

Unfortunately, NoScript does not allow loading external JavaScript on a page where scripts are turned off through it.

Therefore, js-csv-encoding cannot be used in a bookmarklet, and jQuery cannot be loaded either! Disabling NoScript, especially after Meltdown and Spectre, is simply not secure.

That leaves only a small script written in native JavaScript. If you know a way to run jQuery while NoScript is blocking scripts, finding a solution would be easier, although I doubt that is possible.

https://www.npmjs.com/package/windows-1251 or https://www.npmjs.com/package/windows-1252 would be a good solution. However, I have not managed to transcode two-byte text into single-byte text with these scripts. For example:

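<script src="windows-1251.js"></script>
<script type="text/javascript">
// Saves `text` as a file through the <a id="a"> element on the page.
function download(text, name, type) {
  var a = document.getElementById("a");
  // A string passed to Blob is always written out as UTF-8.
  var file = new Blob([text], {type: type});
  a.href = URL.createObjectURL(file);
  a.download = name;
}
</script>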
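
A likely reason the text stays two-byte in the download function above: the Blob constructor serializes string parts as UTF-8, whatever charset the type names. The sketch below uses the standard TextEncoder, which also produces UTF-8 only, to show the resulting byte counts. The name utf8ByteLength is a hypothetical helper, not part of any library.

```javascript
// Counts the bytes a string occupies once it is encoded as UTF-8,
// the encoding that Blob applies to string parts.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Five Cyrillic letters take two bytes each in UTF-8.
console.log(utf8ByteLength('текст'));
// ASCII letters take one byte each.
console.log(utf8ByteLength('text'));
```

To get single-byte output, the Blob has to be built from bytes (for example a Uint8Array) rather than from a string.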

I made many attempts to use windows1251, for example:

<script type="text/javascript">
function exportToCsv() {
  window.open(windows1251.encode('data:text/csv;charset=windows-1251,' + 'текст'));
}
var button = document.getElementById('b');
button.addEventListener('click', exportToCsv);
</script>

<script type="text/javascript">
function exportToCsv() {
  window.open('data:text/csv;charset=windows-1251,' + windows1251.encode('текст'));
}

var button = document.getElementById('b');
button.addEventListener('click', exportToCsv);
</script>
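
One thing that may be going wrong in these attempts, assuming the installed version of windows-1251 returns a "byte string" from encode (a string in which every char code is a byte value; check the README of your version, this is an assumption): a byte string is still a JavaScript string, so anything that serializes it as text re-encodes it. A minimal sketch of copying such a string into real bytes first, with byteStringToBytes as a hypothetical helper name:

```javascript
// Hypothetical helper: copies a "byte string" (every char code is 0..255)
// into a Uint8Array, one byte per character.
function byteStringToBytes(byteString) {
  const bytes = new Uint8Array(byteString.length);
  for (let i = 0; i < byteString.length; i++) {
    const code = byteString.charCodeAt(i);
    if (code > 0xFF) {
      throw new RangeError('Not a byte string: char code ' + code + ' at index ' + i);
    }
    bytes[i] = code;
  }
  return bytes;
}

// 'текст' in windows-1251, written out by hand as a byte string.
const sampleBytes = byteStringToBytes('\xF2\xE5\xEA\xF1\xF2');
console.log(sampleBytes.length); // one byte per letter
```

If encode already returns a typed array in your version, this step is unnecessary and the array can go into the Blob directly.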

Calling encode or decode from windows-1251 does not convert the text into an 8-bit format. js-csv-encoding does its transcoding through csvContentEncoded, but my attempts to use something similar have failed.

Perhaps some kind of hack is needed. Just specifying windows-1251 is not enough: JavaScript holds strings as Unicode (UTF-16 internally), so the conversion to 1251 most likely has to happen at the very end. Part of the code from js-csv-encoding:

var csvContent = 'текст',
    textEncoder = new CustomTextEncoder('windows-1251', {NONSTANDARD_allowLegacyEncoding: true}),
    fileName = 'some-data.csv';

var a = document.getElementById('download-csv');
a.addEventListener('click', function(e) {
  var csvContentEncoded = textEncoder.encode([csvContent]);
  var blob = new Blob([csvContentEncoded], {type: 'text/csv;charset=windows-1251;'});
  saveAs(blob, fileName);
  e.preventDefault();
});

I also tried conversions based on char codes. The file is saved to the local computer rather than to a server, so URL-encoding is not the right solution, because the saved text has to be readable.

Of course, it is hard to fit a solution into the 4000-5000 characters available to a bookmarklet, and my knowledge is not enough. A solution built on other scripts, for example recoding through a lookup table, would also be acceptable.
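
The lookup-table idea can be sketched in a few lines of native JavaScript, small enough for a bookmarklet. This is a partial encoder written under stated assumptions: it covers ASCII, the Russian letters А..я, and a handful of extra characters, and it replaces everything else with "?". The names EXTRA_WIN1251 and encodeWin1251 are hypothetical, and the table would need extending for the rest of the windows-1251 repertoire.

```javascript
// Characters outside the contiguous А..я range that this sketch supports.
const EXTRA_WIN1251 = new Map([
  [0x0401, 0xA8], // Ё
  [0x0451, 0xB8], // ё
  [0x2116, 0xB9], // №
  [0x00A0, 0xA0], // no-break space
  [0x00AB, 0xAB], // «
  [0x00BB, 0xBB], // »
]);

// Partial windows-1251 encoder: returns one byte per input character.
function encodeWin1251(text) {
  const out = [];
  for (const ch of text) {               // iterates by code point
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      out.push(cp);                       // ASCII maps to itself
    } else if (cp >= 0x0410 && cp <= 0x044F) {
      out.push(cp - 0x350);               // А..я -> 0xC0..0xFF
    } else if (EXTRA_WIN1251.has(cp)) {
      out.push(EXTRA_WIN1251.get(cp));
    } else {
      out.push(0x3F);                     // '?' for anything not in the table
    }
  }
  return new Uint8Array(out);
}

console.log(encodeWin1251('текст').length); // one byte per letter
```

The returned Uint8Array can be passed to new Blob([...]) as is; because it is bytes rather than a string, the Blob does not re-encode it.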

There is 1 solution below

I spent half a day trying to save an XML file with Cyrillic characters in windows-1251 encoding. It turned out to be pretty easy: you just need to create an appropriate byte array. See the example below (the full repo with this example):

import iconv from 'pika-iconv-lite';
import saveAs from 'save-as';

const byteArrayWin1251 = iconv.encode(
  `<?xml version="1.0" encoding="windows-1251"?>
  <note>
    <to>Михаил</to>
    <from>Андрей</from>
    <heading>Reminder</heading>
    <body>Вот такая вот xml! И сохранюсь я как win-1251</body>
  </note>`,
  'win1251'
);
const blob = new Blob([byteArrayWin1251], { type: 'application/xml;charset=windows-1251' });
saveAs(blob, 'myxml.xml');
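
For a bookmarklet that cannot load external scripts, the same last step can be done without save-as. The sketch below is an assumption-laden alternative, not the method from the repo: bytesToBlob and downloadBlob are hypothetical helper names, and downloadBlob relies on the browser honoring the download attribute of a temporary link.

```javascript
// Hypothetical helper: wraps already-encoded bytes in a Blob.
// Because the part is a Uint8Array, the bytes are stored unchanged.
function bytesToBlob(bytes, mimeType) {
  return new Blob([new Uint8Array(bytes)], { type: mimeType });
}

// Hypothetical helper (browser only): offers the Blob as a file download
// through a temporary <a download> element.
function downloadBlob(blob, fileName) {
  const url = URL.createObjectURL(blob);
  const a = document.createElement('a');
  a.href = url;
  a.download = fileName;
  document.body.appendChild(a);
  a.click();
  a.remove();
  // Release the object URL once the click has been dispatched.
  setTimeout(function () { URL.revokeObjectURL(url); }, 0);
}

// Usage in a browser, with `bytes` holding windows-1251 encoded data:
//   downloadBlob(bytesToBlob(bytes, 'application/xml;charset=windows-1251'), 'myxml.xml');
```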