Dumping html source using w3m gives unexpected characters/symbols

1.4k Views Asked by user1311034 At 07 June 2025 at 09:37

As a new user of w3m I am trying to do something basic like:

w3m -dump_source nytimes.com > nytimes.html

The output contains crazy characters and symbols. However, when I browse with w3m nytimes.com, the page loads properly, and I can even view the HTML by pressing v.

Furthermore, when I tried:

w3m -dump_extra nytimes.com > nytimes.html

I got all the extra info associated with the site perfectly, except for the HTML source.

Any help would be appreciated.


There is 1 best solution below

Ruslan Osmanov On 22 January 2017 at 07:48

By default, w3m requests compressed output from the server by sending the following HTTP header:

Accept-Encoding: gzip, compress, bzip, bzip2, deflate

The exact value may vary with the version of w3m, but recent versions request compressed output from the host via the Accept-Encoding header. With -dump_source the response body is apparently written out as received, which is why the file contains compressed bytes instead of readable HTML. You can find out the exact headers with the following command:

w3m -dump_source -reqlog nytimes.com > /dev/null

The request and response headers will be logged to the ~/.w3m/request.log file.
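To check quickly whether compression is involved, the log can be filtered for encoding-related headers. This is only a minimal sketch: show_encoding_headers is a hypothetical helper name, and it assumes the log stores headers as plain text lines. It matches any line mentioning an encoding, so Transfer-Encoding lines would show up as well.

```shell
#!/bin/sh
# Print only the encoding-related header lines from a w3m request log.
# show_encoding_headers is a hypothetical helper, not part of w3m.
# The default path is the log location mentioned above; pass another
# path as the first argument to inspect a different file.
show_encoding_headers() {
    log=${1:-"$HOME/.w3m/request.log"}
    # Case-insensitive match catches Accept-Encoding and Content-Encoding.
    grep -i 'encoding' "$log"
}
```

After running the -reqlog command, calling show_encoding_headers with no arguments should list the Accept-Encoding value that was sent and, if the server compressed the body, a Content-Encoding line.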

You can request an uncompressed version by overriding the header as follows:

w3m -dump_source nytimes.com -o accept_encoding='identity;q=0'

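Or even

w3m -dump_source nytimes.com -o accept_encoding='*;q=0'

Strictly speaking, q=0 marks a coding as not acceptable, so the spec-conforming way to ask for an uncompressed body is accept_encoding='identity'. The two forms above tend to work in practice because servers usually fall back to an uncompressed response when no compression coding is listed as acceptable.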

Alternatively, decompress the output through a pipe:

w3m -dump_source nytimes.com | gunzip -f

The -f option makes gunzip copy the input unchanged to standard output when the input is not in a format it recognizes, so the pipe should also be harmless when the server sends an uncompressed response. According to the documentation, this pass-through also requires the --stdout (-c) option. The piped command should print the result to standard output even without it, but gunzip -cf makes the intent explicit.

Note that the server may respond with bzip2-compressed content. In that case, pipe the output through bunzip2 -f instead.
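If you already have a dumped file and do not know which compression the server used, you can look at its leading magic bytes and pick the decompressor accordingly. The following is a rough sketch, not part of w3m: decode_dump is a hypothetical helper name, and it assumes only gzip (bytes 1f 8b) and bzip2 (the ASCII prefix "BZh") need handling. Anything else is passed through unchanged.

```shell
#!/bin/sh
# Decode a w3m -dump_source capture by inspecting its first bytes.
# decode_dump is a hypothetical helper name used for illustration.
decode_dump() {
    file=$1
    # First three bytes as lowercase hex without spaces, e.g. "1f8b08".
    magic=$(od -An -tx1 -N3 "$file" | tr -d ' \n')
    case $magic in
        1f8b*)   gzip -dc "$file" ;;    # gzip stream
        425a68*) bzip2 -dc "$file" ;;   # bzip2 stream ("BZh")
        *)       cat "$file" ;;         # assume it is already plain
    esac
}
```

For example, decode_dump nytimes.html > nytimes.decoded.html would leave the original dump untouched and write the readable HTML to a second file.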
