Python UrlLib3 - Can't download file due to SSL Error even when ssl verification is disabled


I am unable to download a file using this piece of code:

import requests

response = requests.get('https://download.inep.gov.br/informacoes_estatisticas/indicadores_educacionais/taxa_transicao/tx_transicao_municipios_2019_2020.zip', stream=True, verify=False)
with open('tx_transicao_municipios_2019_2020.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=1024): 
        if chunk: 
            f.write(chunk)

I keep getting this error even when verify=False is set:

urllib3.exceptions.SSLError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)

When using Chrome, I am able to download the file.

Using verify=certifi.where() doesn't work either.

Environment

  • Windows 10 Enterprise 22H2 (19045.3448)
  • Python v3.11.5
  • OpenSSL v3.0.9
  • urllib3 v2.0.6
  • Requests v2.31.0
  • certifi v2023.7.22

I also tried on macOS Catalina (10.15) and macOS Big Sur (11.x), with no success.

What am I doing wrong here?

There are 2 answers below.

Best answer

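The server appears to need legacy TLS settings that OpenSSL 3 disables by default. Try mounting a custom adapter with a relaxed SSL context: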

import ssl
import warnings

import requests
import requests.packages.urllib3.exceptions as urllib3_exceptions

warnings.simplefilter("ignore", urllib3_exceptions.InsecureRequestWarning)


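class TLSAdapter(requests.adapters.HTTPAdapter):
    """Transport adapter that relaxes TLS settings for legacy servers."""

    def init_poolmanager(self, *args, **kwargs):
        ctx = ssl.create_default_context()
        # Hostname checking must be off before verification can be disabled (verify=False).
        ctx.check_hostname = False
        # Lower the OpenSSL security level to 1 so older ciphers and key sizes are accepted.
        ctx.set_ciphers("DEFAULT@SECLEVEL=1")
        # 0x4 is OP_LEGACY_SERVER_CONNECT: allow servers without secure renegotiation.
        ctx.options |= 0x4
        kwargs["ssl_context"] = ctx
        return super().init_poolmanager(*args, **kwargs)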


url = "https://download.inep.gov.br/informacoes_estatisticas/indicadores_educacionais/taxa_transicao/tx_transicao_municipios_2019_2020.zip"

with requests.session() as s:
    s.mount("https://", TLSAdapter())

    response = s.get(url, verify=False, stream=True)

    with open("tx_transicao_municipios_2019_2020.zip", "wb") as f:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)

This downloads tx_transicao_municipios_2019_2020.zip:

-rw-r--r-- 1 root root 17298416 okt 16 20:04 tx_transicao_municipios_2019_2020.zip
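
The magic number 0x4 in the adapter above can be given a name. The following is only a sketch: make_legacy_context is a hypothetical helper, not part of requests or urllib3, and it assumes the interpreter is linked against OpenSSL 1.1.0 or newer (needed for the @SECLEVEL syntax). The ssl module only exposes OP_LEGACY_SERVER_CONNECT as a constant from Python 3.12, so the sketch falls back to the raw value on older versions such as 3.11.

```python
import ssl

# Named constant for OpenSSL's SSL_OP_LEGACY_SERVER_CONNECT flag (value 0x4).
# ssl.OP_LEGACY_SERVER_CONNECT exists only on Python 3.12+, hence the fallback.
OP_LEGACY_SERVER_CONNECT = getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)


def make_legacy_context(verify=True):
    """Build an SSLContext that tolerates legacy servers (hypothetical helper)."""
    ctx = ssl.create_default_context()
    if not verify:
        # check_hostname has to be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    # Lower the security level so older ciphers and key sizes are accepted.
    ctx.set_ciphers("DEFAULT@SECLEVEL=1")
    # Allow connecting to servers that lack secure renegotiation support.
    ctx.options |= OP_LEGACY_SERVER_CONNECT
    return ctx
```

Inside init_poolmanager, the context could then be built with kwargs["ssl_context"] = make_legacy_context(verify=False). Relaxing these settings weakens the connection, so it is probably best limited to the one host that needs it.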

Second answer

My final code using a specific certificate. Thanks to Andrej Kesely's answer.

import requests
import ssl
import os

# This is required to make OpenSSL 3.0.0 behave like 1.1.1
# https://stackoverflow.com/questions/76907660/sslerror-when-accessing-sidra-ibge-api-using-python-ssl-unsafe-legacy-rene
# https://stackoverflow.com/questions/71603314/ssl-error-unsafe-legacy-renegotiation-disabled
class TLSAdapter(requests.adapters.HTTPAdapter):
    def init_poolmanager(self, *args, **kwargs):
        ctx = ssl.create_default_context()
        ctx.set_ciphers("DEFAULT@SECLEVEL=1")
        ctx.options |= 0x4   # <-- the key part here, OP_LEGACY_SERVER_CONNECT
        kwargs["ssl_context"] = ctx
        return super(TLSAdapter, self).init_poolmanager(*args, **kwargs)

# The Brazilian government uses a specific certificate chain, available at:
# https://ajuda.rnp.br/icpedu/cc/manual-do-usuario/instalando-o-certificado
# https://www.ccuec.unicamp.br/ccuec/material_apoio/certificados-digitais-acs

with requests.Session() as session:
    session.mount("https://", TLSAdapter())
    response = session.get('https://download.inep.gov.br/informacoes_estatisticas/indicadores_educacionais/taxa_transicao/tx_transicao_municipios_2019_2020.zip', stream=True, verify="gs_root_and_intermediate.pem")
    with open('tx_transicao_municipios_2019_2020.zip', 'wb') as f:
        for chunk in response.iter_content(chunk_size=1024): 
            if chunk: 
                f.write(chunk)
                print(f"Progress: {os.path.getsize(f.name)} bytes", end = '\r')
                
print('\n')
print('Download completed.')
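
One possible refinement of the progress reporting above: os.path.getsize() reads the size from disk, so it can lag behind what has been written while data is still sitting in the file's write buffer. A running byte counter avoids that. This is a minimal sketch; save_stream and its report callback are hypothetical names, not part of requests.

```python
def save_stream(chunks, path, report=None):
    """Write an iterable of byte chunks to path and return the total bytes written."""
    written = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if not chunk:
                # Skip empty chunks, as the original loop does.
                continue
            f.write(chunk)
            written += len(chunk)
            if report is not None:
                # Hand the running total to the caller, e.g. to print progress.
                report(written)
    return written
```

With the session from the answer above, it could be called as save_stream(response.iter_content(chunk_size=1024), 'tx_transicao_municipios_2019_2020.zip', report=lambda n: print(f"Progress: {n} bytes", end='\r')).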