Module grab.document
The Document class represents the result of a network request made with a Grab instance.
class grab.document.Document(grab=None)
A document (in most cases a network response, i.e. the result of a network request).
detect_charset()
Detect the charset of the response.
Try the following methods:
* meta[name="Http-Equiv"]
* XML declaration
* HTTP Content-Type header
Ignore unknown charsets.
Use utf-8 as the fallback charset.
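The detection order above can be illustrated with a minimal sketch. It is not the library's implementation: the function name `sketch_detect_charset`, the regular expressions, and the "first known charset wins" precedence are all assumptions made for illustration. The source only lists the three methods, the rule about unknown charsets, and the utf-8 fallback.

```python
import codecs
import re

# Hypothetical patterns; the real library may match these differently.
META_RE = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?([\w-]+)', re.I)
XML_RE = re.compile(
    rb'^\s*<\?xml[^>]+encoding\s*=\s*["\']([\w-]+)["\']', re.I)
HEADER_RE = re.compile(r'charset\s*=\s*["\']?([\w-]+)', re.I)


def is_known_charset(name):
    """Return True if Python has a codec registered under this name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def sketch_detect_charset(body, content_type=None):
    """Guess the charset of a response body given as bytes.

    Candidates are collected in the order listed in the text:
    meta tag, XML declaration, HTTP Content-Type header.
    Assumption: the first candidate that names a known codec wins.
    """
    candidates = []
    match = META_RE.search(body)
    if match:
        candidates.append(match.group(1).decode('ascii'))
    match = XML_RE.search(body)
    if match:
        candidates.append(match.group(1).decode('ascii'))
    if content_type:
        match = HEADER_RE.search(content_type)
        if match:
            candidates.append(match.group(1))
    for name in candidates:
        if is_known_charset(name):  # unknown charsets are ignored
            return name.lower()
    return 'utf-8'  # fallback charset
```

Using `codecs.lookup` as the test for "unknown" is a design choice of this sketch: it means a charset is accepted only if the body could actually be decoded with it later.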
json
Return the response body deserialized from JSON into a Python object.
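As a rough illustration of what such a property does, the sketch below decodes a byte body and passes it to the standard `json` module. The class `ResponseSketch` and its `body` and `charset` attributes are hypothetical stand-ins, not the Document class itself.

```python
import json


class ResponseSketch:
    """Hypothetical stand-in for a response object; not the library's class."""

    def __init__(self, body, charset='utf-8'):
        self.body = body          # raw response body as bytes
        self.charset = charset    # charset used to decode the body

    @property
    def json(self):
        # Decode the bytes first, then deserialize the JSON text.
        return json.loads(self.body.decode(self.charset))


doc = ResponseSketch(b'{"ok": true, "items": [1, 2]}')
parsed = doc.json
```

Because `json` is a property, it is read as an attribute (`doc.json`), not called as a method.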
parse(charset=None, headers=None)
Parse headers.
This method is called after the Grab instance performs a network request.
save_hash(location, basedir, ext=None)
Save the response body to a file whose path is built from a hash. This lowers the number of files per directory.
Parameters:
- location – URL of the file, or any other string. It is used to build the SHA1 hash.
- basedir – base directory in which to save the file. Note that the file is not saved directly into this directory but into a sub-directory of basedir.
- ext – extension to append to the file name. The dot between the file name and the extension is inserted automatically.
Returns: path to the saved file, relative to basedir.
Example:
>>> from grab import Grab
>>> g = Grab()
>>> url = 'http://yandex.ru/logo.png'
>>> g.go(url)
>>> g.response.save_hash(url, 'some_dir', ext='png')
'e8/dc/f2918108788296df1facadc975d32b361a6a.png'
# the file was saved to $PWD/some_dir/e8/dc/...
TODO: replace basedir with two options: root and save_to. And return save_to + path.
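The path layout seen in the example (two 2-character directory levels followed by the rest of the 40-character SHA1 hex digest) can be sketched as follows. This is an assumption inferred from the example output, not the library's code: the helper names `sketch_hash_path` and `sketch_save_hash`, the UTF-8 encoding of `location`, and the placeholder URL are all hypothetical.

```python
import hashlib
import os
import tempfile


def sketch_hash_path(location, ext=None):
    """Build a relative path of the form 'ab/cd/<remaining 36 hex chars>[.ext]'."""
    # Assumption: the location string is hashed as UTF-8 bytes.
    digest = hashlib.sha1(location.encode('utf-8')).hexdigest()
    rel_path = '/'.join([digest[:2], digest[2:4], digest[4:]])
    if ext is not None:
        rel_path = rel_path + '.' + ext  # the dot is inserted automatically
    return rel_path


def sketch_save_hash(body, location, basedir, ext=None):
    """Write body (bytes) under basedir and return the path relative to it."""
    rel_path = sketch_hash_path(location, ext)
    full_path = os.path.join(basedir, rel_path)
    # Create the two intermediate sub-directories if they do not exist yet.
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    with open(full_path, 'wb') as out:
        out.write(body)
    return rel_path


# Usage with a temporary directory and a placeholder URL.
with tempfile.TemporaryDirectory() as basedir:
    rel = sketch_save_hash(b'data', 'http://example.com/logo.png',
                           basedir, ext='png')
    saved = os.path.exists(os.path.join(basedir, rel))
```

With two hex characters per level, each directory level holds at most 256 sub-directories, which is how this layout keeps the number of entries per directory low.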