Main documentation

The functions of the main script.

class Geo_419b.GeoFileHandler(path, folder_name, geo_file_list)[source]

Bases: object

class to represent a folder with geo-files

path

path of the directory which contains the folder

Type:

str

folder_name

name of the folder

Type:

str

geo_file_list

a list of dict of {str: array[int]}, each mapping the absolute path of a geo-file to its extent [minX, minY, maxX, maxY]

Type:

list of dict

file_list

list of the absolute paths of all files, built from geo_file_list because gdal.BuildVRT() in the method create_vrt needs it

Type:

list of str

extension

array with the extent [minX, minY, maxX, maxY] covering all files

Type:

array of int
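As a rough illustration of how file_list and extension could be derived from geo_file_list, the sketch below flattens the per-file dictionaries and reduces their extents to one bounding box. The helper name combined_extent and the sample paths are hypothetical, and the class itself may compute these attributes differently.

```python
def combined_extent(geo_file_list):
    """Return [minX, minY, maxX, maxY] covering every file in geo_file_list.

    Each entry is assumed to be a dict {path: [minX, minY, maxX, maxY]}.
    """
    extents = [ext for entry in geo_file_list for ext in entry.values()]
    if not extents:
        raise ValueError("geo_file_list is empty")
    return [
        min(e[0] for e in extents),
        min(e[1] for e in extents),
        max(e[2] for e in extents),
        max(e[3] for e in extents),
    ]


# Hypothetical sample data: two neighbouring 1 km tiles.
sample = [
    {"/data/dgm/tile_a.tif": [650000, 5640000, 651000, 5641000]},
    {"/data/dgm/tile_b.tif": [651000, 5640000, 652000, 5641000]},
]

# The plain list of paths, as a function like gdal.BuildVRT would need it.
file_list = [path for entry in sample for path in entry]

print(combined_extent(sample))  # [650000, 5640000, 652000, 5641000]
```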

create_vrt(name, epsg='EPSG: 25832')[source]

creates a raster mosaic using gdal.BuildVRT and exports it as a GeoTIFF in a selectable CRS

Parameters:
  • name (str) – name of the GeoTiff

  • epsg (str, default=EPSG: 25832) – EPSG code of the chosen CRS

Geo_419b.auto_download(working_dir, path_shp, start_year_elev=None, month_start_year=1, end_year_elev=None, month_end_year=12, start_year_ortho=None, end_year_ortho=None, dgm=True, dom=True, las=True, ortho=True, file_cor_dgm=None, epsg_mosaic='EPSG: 25832', merge_dgm=True, merge_dom=True, merge_ortho=True, delete=True)[source]

The main function of the script. Its parameters control the download of the elevation data and orthophotos as well as their further processing (height correction and merging). Depending on the parameters, the other functions of the script are called within this function to download the data and perform the processing.

Parameters:
  • working_dir (str) – Path to the directory where the output is to be stored.

  • path_shp (str) – Path to the shapefile of the area of interest.

  • start_year_elev (int or None, default=None) – first year of interest for the elevation data

  • month_start_year (int, default=1) – first month of interest for the elevation data

  • end_year_elev (int or None, default=None) – last year of interest for the elevation data

  • month_end_year (int, default=12) – last month of interest for the elevation data

  • start_year_ortho (int or None, default=None) – first year of interest for the orthophotos

  • end_year_ortho (int or None, default=None) – last year of interest for the orthophotos

  • dgm (bool, default=True) – Whether digital terrain models are to be downloaded.

  • dom (bool, default=True) – Whether digital surface models are to be downloaded.

  • las (bool, default=True) – Whether laser scanner data is to be downloaded.

  • ortho (bool, default=True) – Whether orthophotos are to be downloaded.

  • file_cor_dgm (str or None, default=None) – Path to the height correction file.

  • epsg_mosaic (str, default=EPSG: 25832) – EPSG-code of the merged mosaics.

  • merge_dgm (bool, default=True) – Whether the digital terrain models are to be merged.

  • merge_dom (bool, default=True) – Whether the digital surface models are to be merged.

  • merge_ortho (bool, default=True) – Whether the orthophotos are to be merged.

  • delete (bool, default=True) – Whether the ZIP files are to be deleted.

Geo_419b.c_tile_number_df(geodf)[source]

Creates and returns a dataframe that contains the tile numbers of a geodataframe.

Parameters:

geodf (geopandas.geodataframe.GeoDataFrame) – The intersected geodataframe of the area of interest and the tile number geodataframe.

Returns:

df – Dataframe that contains the tile numbers.

Return type:

pandas.core.frame.DataFrame

Geo_419b.create_and_unzip(folder_path, zip_files)[source]

Creates a folder (if it does not already exist) and unzips a list of ZIP files into it. Before the function tries to unpack a file, it checks whether this file actually exists in the working directory.

Parameters:
  • folder_path (str) – Path to / name of the folder to create.

  • zip_files (list of str) – A list containing the names of the ZIP files to be unzipped.
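The behaviour described above can be approximated with the standard library alone. The following is a minimal sketch, not the script's actual implementation; the function name create_and_unzip_sketch and its return value (the list of archives that were extracted) are additions made for illustration.

```python
import os
import zipfile


def create_and_unzip_sketch(folder_path, zip_files):
    """Create folder_path if needed and extract every existing ZIP into it."""
    os.makedirs(folder_path, exist_ok=True)
    extracted = []
    for zip_name in zip_files:
        # Skip names that do not exist instead of raising an error.
        if not os.path.isfile(zip_name):
            continue
        with zipfile.ZipFile(zip_name) as archive:
            archive.extractall(folder_path)
        extracted.append(zip_name)
    # Returning the extracted names is an illustrative addition.
    return extracted
```

Checking for the file first means a list that mixes downloaded and skipped archives can be passed in unchanged.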

Geo_419b.create_elev_download_list(elev_aoi, year, start_year, end_year, month_start_year, month_end_year, additional_check)[source]

Creates and returns a list that contains, for every data tile to be downloaded, the part of the URL that differs between tiles. If there is no data for the specified year, “stop” is returned.

Parameters:
  • elev_aoi (geopandas.geodataframe.GeoDataFrame) – The intersected geodataframe of the area of interest and the metadata geodataframe.

  • year (int) – The year of interest or one of the years of interest.

  • start_year (int) – First year of interest.

  • end_year (int) – Last year of interest.

  • month_start_year (int) – First month of interest.

  • month_end_year (int) – Last month of interest.

  • additional_check (str) – Whether this is an additional check.

Returns:

elev_download_list – A list that contains, for every data tile to be downloaded, the part of the URL that differs between tiles.

Return type:

list of str
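The documentation does not spell out how the year and month parameters are combined. One plausible reading, shown here purely as an assumption, is that a tile is kept when its capture date lies between the first month of the start year and the last month of the end year. The helper in_period is hypothetical.

```python
def in_period(year, month, start_year, end_year,
              month_start_year=1, month_end_year=12):
    """Return True if (year, month) lies inside the period of interest.

    Assumption: the period runs from month_start_year of start_year
    to month_end_year of end_year, both inclusive.
    """
    # Tuples compare element by element, so this orders by year, then month.
    return (start_year, month_start_year) <= (year, month) <= (end_year, month_end_year)


# Period of interest: April 2014 to September 2016.
print(in_period(2014, 3, 2014, 2016, 4, 9))  # False
print(in_period(2015, 1, 2014, 2016, 4, 9))  # True
```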

Geo_419b.create_geo_file_dic(dir, file)[source]

calculates the geometric extent of a raster

Parameters:
  • dir (str) – directory

  • file (str) – name of the file

Returns:

dict of {str: array[int]} with the path as str and an array with the geometric extent given as [minX, minY, maxX, maxY]

Return type:

dict
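For a north-up raster, the extent follows directly from the GDAL geotransform and the raster size. The sketch below shows that arithmetic without opening a file; the function name extent_from_geotransform is hypothetical, and the way the script itself reads the raster is not shown in this documentation.

```python
def extent_from_geotransform(geotransform, x_size, y_size):
    """Return [minX, minY, maxX, maxY] for a north-up raster.

    geotransform follows the GDAL order:
    (origin_x, pixel_width, row_rotation, origin_y, column_rotation, pixel_height)
    where pixel_height is negative for north-up images.
    Rotation terms are ignored, so this only holds for unrotated rasters.
    """
    origin_x, pixel_width, _, origin_y, _, pixel_height = geotransform
    min_x = origin_x
    max_y = origin_y
    max_x = origin_x + x_size * pixel_width
    min_y = origin_y + y_size * pixel_height
    return [min_x, min_y, max_x, max_y]


# A 1000 x 1000 pixel tile with 1 m resolution (made-up coordinates).
gt = (650000.0, 1.0, 0.0, 5641000.0, 0.0, -1.0)
print(extent_from_geotransform(gt, 1000, 1000))
```

With GDAL available, the inputs would typically come from GetGeoTransform(), RasterXSize and RasterYSize of the opened dataset.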

Geo_419b.data_download(type_to_download, data_list_to_download, url_year='', year=0, dem_n='', year_list=None, tile_number_list=None, additional_check_2019=False)[source]

Loops through a list of data to download, puts the URL(s) together and downloads the ZIP file(s). A list with the name(s) of the downloaded file(s) is returned; if no files were downloaded, “no_new_data” is returned. Files are only downloaded if the file or its content is not already in the working directory.

Parameters:
  • type_to_download (str) – The type of the data to be downloaded.

  • data_list_to_download (list of str) – A list that contains, for every data tile to be downloaded, the part of the URL that differs between tiles.

  • url_year (str) – Part of the URL for the download of the elevation data

  • year (int) – The year of interest or one of the years of interest.

  • dem_n (str) – Part of the URL for the download of the elevation data

  • year_list (list of int or None, default=None) – A list which contains the year of capture of each orthophoto to be downloaded.

  • tile_number_list (list of str or None, default=None) – A list which contains the tile number of each orthophoto to be downloaded.

  • additional_check_2019 (bool, default=False) – Information on if this is an additional check for 2019 or not.

Returns:

zip_data_list – A list with the name(s) of the downloaded file(s).

Return type:

list of str
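The skip-if-present logic and the “no_new_data” return value can be sketched as follows. This is a simplified illustration under stated assumptions: the function name download_missing, the base_url parameter, the ".zip" file naming and the injected fetch callable are all hypothetical, and the real function assembles its URLs from more parts (url_year, dem_n and so on).

```python
import os


def download_missing(parts, base_url, target_dir, fetch):
    """Download one ZIP per URL part unless the file is already present.

    fetch(url, destination) is an injected callable that performs the
    transfer, for example urllib.request.urlretrieve.
    """
    downloaded = []
    for part in parts:
        file_name = part + ".zip"  # assumed naming scheme
        destination = os.path.join(target_dir, file_name)
        if os.path.exists(destination):
            continue  # already in the working directory
        fetch(base_url + file_name, destination)
        downloaded.append(file_name)
    # Mirror the documented behaviour: a marker string if nothing was new.
    return downloaded if downloaded else "no_new_data"
```

Passing the transfer function in as an argument keeps the loop testable without network access.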

Geo_419b.delete_zip_files(zip_files)[source]

Deletes one or more ZIP files. Before the function tries to delete a file, it checks whether this file actually exists in the working directory.

Parameters:

zip_files (list of str) – A list containing the names of the ZIP files to be deleted.
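A minimal sketch of the existence check before deletion, using only the standard library. The name delete_existing and the returned list of removed files are illustrative additions, not part of the documented function.

```python
import os


def delete_existing(zip_files):
    """Remove every listed file that exists and return the names removed."""
    removed = []
    for zip_name in zip_files:
        # Only touch regular files that are actually there.
        if os.path.isfile(zip_name):
            os.remove(zip_name)
            removed.append(zip_name)
    return removed
```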

Geo_419b.get_relevant_url_ids(url_id_df, tile_number_df, start_year, end_year)[source]

Creates and returns three lists that are needed for the download of the orthophotos and one list containing the years where orthophotos are available only for a part of the area of interest. To accomplish this, several dataframe operations are performed.

Parameters:
  • url_id_df (pandas.core.frame.DataFrame) – Dataframe with the ID part of all URLs, the years and the tile numbers as columns.

  • tile_number_df (pandas.core.frame.DataFrame) – Dataframe containing all relevant tile numbers.

  • start_year (int) – First year of interest.

  • end_year (int) – Last year of interest.

Returns:

  • url_id_list (list of str) – A list which contains the ID part of the URL for each orthophoto to be downloaded.

  • year_list (list of int) – A list which contains the year of capture of each orthophoto to be downloaded.

  • tile_number_list (list of str) – A list which contains the tile number of each orthophoto to be downloaded.

  • partly_data_list (list of int) – A list containing the years where orthophotos are available only for a part of the area of interest.

Geo_419b.go_through_all_raster(dir, ending, file_cor=None)[source]

goes through all rasters in the directory, including subfolders, and calls raster_correction (if file_cor is given) or create_geo_file_dic (if no file_cor is given) to get a dictionary with file and extent. For each subfolder an instance of the class GeoFileHandler is created. All GeoFileHandler objects are returned as a list.

Parameters:
  • dir (str) – directory with subfolders which contain all raster datasets

  • ending (str) – file extension of the raster dataset (e.g. .tif)

  • file_cor (str or None, default=None) – path of a file for raster correction

Returns:

geo_file_handler_list – list with instances of GeoFileHandler for every subfolder

Return type:

list of GeoFileHandler
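The directory traversal part of this function can be pictured with os.walk. The sketch below only groups matching file names by folder; it does not create GeoFileHandler objects or call the correction functions, and the name rasters_by_subfolder is hypothetical.

```python
import os


def rasters_by_subfolder(root, ending):
    """Map each directory under root (root included) to its raster files."""
    found = {}
    for current_dir, _subdirs, files in os.walk(root):
        # Keep only files with the requested ending, in a stable order.
        matches = sorted(f for f in files if f.endswith(ending))
        if matches:
            found[current_dir] = matches
    return found
```

Each key of the result corresponds to one folder for which a handler object could then be built from the listed files.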

Geo_419b.intersect_geodfs(geodf_1, geodf_2)[source]

Intersects two geodataframes and returns the intersected geodataframe. If the coordinate reference system (crs) of the geodataframes is different the first geodataframe is re-projected to the crs of the second geodataframe.

Parameters:
  • geodf_1 (geopandas.geodataframe.GeoDataFrame) – geodataframe 1

  • geodf_2 (geopandas.geodataframe.GeoDataFrame) – geodataframe 2

Returns:

intersected_geodf – intersected geodataframe

Return type:

geopandas.geodataframe.GeoDataFrame

Geo_419b.raster_correction(dir, file_raster, file_cor, ending, epsg='EPSG: 25832')[source]

corrects every raster value by adding a second raster (the correction file) and writes the result as a new GeoTIFF, replacing the original file extension with _UTM_cor.tif

Parameters:
  • dir (str) – directory

  • file_raster (str) – name of the input raster

  • file_cor (str) – path of the correction-raster-file

  • ending (str) – file extension of the input raster

  • epsg (str, optional, default=EPSG: 25832) – EPSG code of the input raster; only necessary if it is not EPSG: 25832

Returns:

dict of {str: array[int]} with the path as str and an array with the geometric extent given as [minX, minY, maxX, maxY] for the calculated raster

Return type:

dict
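Two small pieces of this function can be shown without GDAL: how the output name is derived from the input name, and what the cell-wise addition looks like. Both helpers below are hypothetical sketches; they use plain nested lists and assume that the raster and the correction grid have the same shape, whereas the script works on real raster files.

```python
import os


def corrected_path(directory, file_raster, ending):
    """Build the output path by swapping the ending for '_UTM_cor.tif'."""
    if not ending or not file_raster.endswith(ending):
        raise ValueError(f"{file_raster!r} does not end with {ending!r}")
    stem = file_raster[: -len(ending)]
    return os.path.join(directory, stem + "_UTM_cor.tif")


def add_correction(raster, correction):
    """Add the correction grid to the raster, cell by cell."""
    return [
        [value + offset for value, offset in zip(raster_row, cor_row)]
        for raster_row, cor_row in zip(raster, correction)
    ]


print(corrected_path("out", "tile_1.tif", ".tif"))
print(add_correction([[100, 101], [102, 103]], [[1, 1], [2, 2]]))
```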

Geo_419b.set_elev_variables(year)[source]

Sets some variables that change depending on the specified year and returns them. However, if it is certain that there is no data for the specified year “stop” is returned.

Parameters:

year (int) – The year of interest or one of the years of interest.

Returns:

  • url_year (str) – Part of the URL for the download of the elevation data.

  • dem_n (str) – Part of the URL for the download of the elevation data.

  • elev_meta_file (str) – Name of the metadata shapefile for the elevation data.

Geo_419b.split_df(df)[source]

Splits a dataframe into two (based on the year) and returns a list that contains the two new dataframes.

Parameters:

df (pandas.core.frame.DataFrame) – The dataframe to be split.

Returns:

list_ – A list containing the new dataframes.

Return type:

list of pandas.core.frame.DataFrame