_aux documentation
The functions of the auxiliary script.
- _aux.create_url_id_file(start, stop, out_path, number_of_requests=100)
This function creates CSV files whose columns are either the id part of the URL, the year of data acquisition, and the tile number, or, for key error URLs, only the URL. To achieve this, the framework function is first called to get the head content dispositions or, in case of a key error, the URL. The information is then filtered and the CSV files are generated. Finally, the contents of all non-key-error CSV files are merged into one file.
- Parameters:
start (int) – The number of the id part of the first URL to be checked.
stop (int) – The number of the id part of the last URL to be checked.
out_path (str) – Path to the folder where the output should be stored.
number_of_requests (int) – Maximum number of concurrent requests. Keep the default value unless performance is important.
- Returns:
path_name – The path to the URL id file.
- Return type:
str
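The final merge step described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper name `merge_csv_files` is hypothetical, and it assumes every partial CSV file starts with the same header row.

```python
import csv
from pathlib import Path


def merge_csv_files(csv_paths, merged_path):
    """Concatenate CSV files that share a header row into one file.

    Hypothetical helper; the real _aux module may merge differently.
    """
    header_written = False
    with open(merged_path, "w", newline="") as merged_file:
        writer = csv.writer(merged_file)
        for csv_path in csv_paths:
            with open(csv_path, newline="") as part_file:
                reader = csv.reader(part_file)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    # Write the header once, taken from the first non-empty file.
                    writer.writerow(header)
                    header_written = True
                # Copy the remaining data rows unchanged.
                writer.writerows(reader)
    return str(Path(merged_path))
```

Like create_url_id_file, the sketch returns the path of the merged file as a string.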
- async _aux.framework_requests(start=0, stop=0, list_of_ids=None)
This function first sets up the framework for the request function. It then calls the request function repeatedly to get the head content dispositions of the URLs.
- Parameters:
start (int) – The number of the id part of the first URL to be checked.
stop (int) – The number of the id part of the last URL to be checked.
list_of_ids (list of str or None) – This parameter should not be changed. It is only used for the additional checking of the key error URLs.
- Returns:
hcd__url_list (list of str) – A list with the head content dispositions plus the corresponding URLs.
url_key_error_list (list of str) – A list with all URLs where a key error occurred.
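One way such a request framework could limit concurrency is sketched below. It is an assumption-laden illustration, not the module's implementation: `fake_request` is a hypothetical stand-in for the real request function, the `example.org` URLs and header values are invented, `stop` is assumed to be inclusive, and results are separated by checking for the "__" separator.

```python
import asyncio


async def fake_request(url_id):
    """Hypothetical stand-in for a real HEAD request."""
    await asyncio.sleep(0)  # yield control, as a network call would
    url = f"https://example.org/{url_id}"
    if url_id % 3 == 0:
        return url  # simulated key error: only the URL comes back
    return f"attachment; filename=tile_{url_id}.zip__{url}"


async def run_requests(start, stop, number_of_requests=100):
    """Query ids start..stop (inclusive, by assumption) with bounded concurrency."""
    semaphore = asyncio.Semaphore(number_of_requests)

    async def limited(url_id):
        # At most number_of_requests coroutines pass this point at once.
        async with semaphore:
            return await fake_request(url_id)

    # gather returns the results in the order the coroutines were passed in.
    results = await asyncio.gather(*(limited(i) for i in range(start, stop + 1)))
    hcd_url_list = [r for r in results if "__" in r]
    url_key_error_list = [r for r in results if "__" not in r]
    return hcd_url_list, url_key_error_list


if __name__ == "__main__":
    hits, key_errors = asyncio.run(run_requests(1, 6, number_of_requests=2))
    print(len(hits), key_errors)
```

A semaphore keeps the number of in-flight requests bounded while still scheduling all ids at once, which is a common pattern for this kind of bulk header check.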
- async _aux.get_hcd(session, url)
The function requests the head content disposition from a URL and returns it together with the URL. If a key error occurs, only the URL is returned.
- Parameters:
session (aiohttp.client.ClientSession) – The client session used for the request.
url (str) – The URL from which the header information should be read.
- Returns:
head_content_disposition + “__” + url (str) – The head content disposition plus the corresponding URL as one string.
url (str) – The URL alone, returned if a key error occurs.
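The behaviour described for get_hcd can be sketched as below. This is a hedged illustration rather than the module's source: it assumes the "head content disposition" is the `Content-Disposition` header of a HEAD response, and it uses hypothetical stub classes (`StubResponse`, `StubSession`) in place of a real `aiohttp.ClientSession` so the example runs without network access. The function name `get_hcd_sketch` and the `example.org` URLs are likewise invented.

```python
import asyncio


async def get_hcd_sketch(session, url):
    """Return '<Content-Disposition>__<url>', or only the URL on a KeyError."""
    # session.head(url) is expected to be usable with `async with`,
    # as it is for aiohttp.ClientSession.
    async with session.head(url) as response:
        try:
            return response.headers["Content-Disposition"] + "__" + url
        except KeyError:
            # The header is missing: report only the URL.
            return url


class StubResponse:
    """Hypothetical response object exposing only a headers mapping."""

    def __init__(self, headers):
        self.headers = headers

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        return False


class StubSession:
    """Hypothetical session that serves canned headers per URL."""

    def __init__(self, headers_by_url):
        self._headers_by_url = headers_by_url

    def head(self, url):
        return StubResponse(self._headers_by_url.get(url, {}))


if __name__ == "__main__":
    stub = StubSession(
        {"https://example.org/1": {"Content-Disposition": "attachment; filename=a.zip"}}
    )
    print(asyncio.run(get_hcd_sketch(stub, "https://example.org/1")))
    print(asyncio.run(get_hcd_sketch(stub, "https://example.org/2")))
```

A caller can later separate the two parts of a successful result with `result.rpartition("__")`, which splits on the last occurrence of the separator.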