Problem
I'm trying to scrape a page using Python's requests library, but I keep getting errors (such as Bad Request or Method Not Allowed).
The page has two forms: one that uses GET and another that uses POST (the one I want). I passed values to the text fields using the data argument of requests.
I don't want to submit an image through the form, just a text field.
The form has six buttons, each with a different value.
HTML code
<form enctype="multipart/form-data" action="/page1" method="GET"> ... </form>
...
<form enctype="multipart/form-data" action="/page2" method="POST">
<input type="file" name="smiles_file">
<input type="text" name="smiles_str">
...
<button name="pred_type" type="submit" value="adme"> BT1 </button>
<button name="pred_type" type="submit" value="toxicity"> BT2 </button>
</form>
Python3 code
#imports
import requests
from bs4 import BeautifulSoup as bs
#common vars
url = 'www.exampleurl.com/site'
hd = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36"
}
dt = {
    'smiles_str': 'CC(=O)OC1=CC=CC=C1C(=O)O',
    'pred_type': 'adme'
}
#scraping
with requests.Session() as rs:
    result = rs.get(url, data=dt, headers=hd)
    print("Code: %s\nHTML\n%s" % (result.status_code, result.text))
EDIT
Using get: status_code 405 (Method ...)
Using post: status_code 400 (Bad request)
I don't see a reference to /page1 nor /page2 in your example, but the rs.get call should probably be using the named parameter params instead of data, and should correspond to the first form URL. For the second form URL you'd need to use the rs.post method, where using data is okay.
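The suggestion above can be sketched roughly as follows. This is a minimal illustration rather than a confirmed fix: the base URL (with an explicit https:// scheme), the helper names, and the query field "q" for the first form are hypothetical, since the first form's fields are not shown in the question. The requests are only prepared, not sent, so the resulting URL and headers can be inspected. The third helper assumes the server insists on the multipart/form-data encoding declared by the form; passing (None, value) tuples through files is how requests encodes plain fields as multipart without uploading a file.

```python
import requests
from urllib.parse import urljoin

# Hypothetical base URL; requests needs an explicit scheme (http:// or https://).
BASE_URL = "https://www.exampleurl.com/site"


def build_get(base_url, query):
    """Prepare a GET for the first form: values go in the query string via params."""
    return requests.Request("GET", urljoin(base_url, "/page1"), params=query).prepare()


def build_post(base_url, fields):
    """Prepare a POST for the second form: values go in the body via data."""
    return requests.Request("POST", urljoin(base_url, "/page2"), data=fields).prepare()


def build_multipart_post(base_url, fields):
    """Prepare the same POST as multipart/form-data, without attaching a file.

    Assumption: the server rejects URL-encoded bodies. A (None, value) tuple
    sends a plain form field with no filename.
    """
    parts = {name: (None, value) for name, value in fields.items()}
    return requests.Request("POST", urljoin(base_url, "/page2"), files=parts).prepare()


form_fields = {"smiles_str": "CC(=O)OC1=CC=CC=C1C(=O)O", "pred_type": "adme"}

get_req = build_get(BASE_URL, {"q": "example"})  # "q" is a made-up field name
post_req = build_post(BASE_URL, form_fields)
multipart_req = build_multipart_post(BASE_URL, form_fields)

print(get_req.method, get_req.url)
print(post_req.method, post_req.url, post_req.headers["Content-Type"])
print(multipart_req.method, multipart_req.url, multipart_req.headers["Content-Type"])

# To actually send a request, the usual call inside the session would be, e.g.:
# with requests.Session() as rs:
#     result = rs.post(urljoin(BASE_URL, "/page2"), data=form_fields, headers=hd)
```

Note that the form's action is /page2, so the POST goes to that path on the host rather than to the page URL itself; urljoin resolves it the way a browser would.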